User Guide

Installation

Ragged requires Python 3.10 or later and is available on PyPI:

pip install ragged

The only runtime dependencies are awkward >= 2.6.7 and numpy >= 1.24. GPU support (CUDA) requires cupy; see Device.

Quickstart

import ragged
import numpy as np

# Create a ragged array from a nested Python list
a = ragged.array([[1.0, 2.0, 3.0], [4.0], [5.0, 6.0]])

print(a.shape)   # (3, None)
print(a.dtype)   # float64

# Elementwise operations preserve shape
b = ragged.sqrt(a)

# Boolean indexing
c = a[a > 2.0]

# Reduction along an axis
row_sums = ragged.sum(a, axis=1)   # shape (3,)

# Functional update (JAX-style, copy semantics)
d = a.at[0].set(ragged.array([10.0, 20.0, 30.0]))

The Array Model

Shape

A ragged.array has a shape tuple where each element is either an int (fixed-size dimension) or None (variable-size / ragged dimension).

  • shape[0] is always an int — the number of rows at the outermost level.

  • Any later dimension can be None when row lengths are non-uniform.

ragged.array([1.0, 2.0, 3.0]).shape          # (3,)
ragged.array([[1.0, 2.0], [3.0, 4.0]]).shape  # (2, None)  ← uniform rows
ragged.array([[1.0, 2.0], [3.0]]).shape        # (2, None)  ← ragged rows

Note

Even when all rows happen to have the same length, dimensions past the first report None if the array was built from Python lists. Pass a NumPy array to the constructor to obtain a fully regular shape:

ragged.array(np.ones((3, 4))).shape   # (3, 4)  ← regular from numpy
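
The rule for list input can be sketched in plain Python. The helper below, shape_from_lists, is hypothetical and not part of ragged; it only mirrors the rule stated above, under the assumption that every level of list nesting past the first is reported as None.

```python
def shape_from_lists(obj):
    """Return the shape a nested Python list would report under the rule above.

    Hypothetical helper for illustration only; not part of the ragged API.
    """
    if not isinstance(obj, list):
        return ()                     # a scalar has an empty shape
    shape = [len(obj)]                # the outermost dimension is always an int
    level = obj
    # Each further level of list nesting adds one variable-length dimension.
    while level and isinstance(level[0], list):
        shape.append(None)
        # Flatten one level so an empty first row does not end the walk early.
        level = [item for row in level for item in row]
    return tuple(shape)


print(shape_from_lists([1.0, 2.0, 3.0]))            # (3,)
print(shape_from_lists([[1.0, 2.0], [3.0, 4.0]]))   # (2, None)
print(shape_from_lists([[1.0, 2.0], [3.0]]))        # (2, None)
```

The uniform and the ragged two-level lists give the same answer because the helper never compares row lengths, which matches the behaviour described in the note.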

Dtype

ragged.array always stores a single numpy.dtype for all elements. Mixed-type input is upcast following NumPy’s type promotion rules.

a = ragged.array([[1, 2], [3]], dtype=np.float32)
a.dtype   # dtype('float32')

Supported dtypes mirror the Array API dtype table: bool, int8/16/32/64, uint8/16/32/64, float32/64, complex64/128.

Device

ragged.array tracks whether data live on CPU or GPU:

a = ragged.array([1.0, 2.0], device="cpu")   # backed by NumPy (default)
# a = ragged.array([1.0, 2.0], device="cuda")  # backed by CuPy (needs GPU)

Creating Arrays

From Python objects

ragged.array([[1, 2, 3], [4, 5]])       # from nested list
ragged.array(np.arange(6).reshape(2,3)) # from numpy array
ragged.asarray([1.0, 2.0, 3.0])         # alias for array()

From factory functions

ragged.zeros((3,))           # 1-D zeros
ragged.ones((2, 4))          # 2-D ones
ragged.full((3,), 7.0)       # filled with scalar
ragged.arange(10)            # 0..9
ragged.linspace(0, 1, 5)     # five evenly-spaced values
ragged.eye(3)                # 3×3 identity
ragged.empty_like(a)         # same shape/dtype, uninitialised
ragged.zeros_like(a)
ragged.ones_like(a)

Meshgrid

xs, ys = ragged.meshgrid(ragged.arange(3), ragged.arange(4))

Indexing

Integer and slice indexing

a = ragged.array([[1.0, 2.0, 3.0], [4.0, 5.0]])

a[0]        # first row  → ragged.array([1., 2., 3.])
a[-1]       # last row   → ragged.array([4., 5.])
a[0:2]      # slice      → ragged.array([[1., 2., 3.], [4., 5.]])
a[0, 1]     # element    → ragged.array(2.)   (for uniform rows)

Boolean masking

mask = ragged.array([True, False])
a[mask]    # → ragged.array([[1., 2., 3.]])

# Computed mask
a[a > 2.0]

In-place mutation (__setitem__)

ragged.array supports in-place assignment, enabling compatibility with array_api_extra.at:

a = ragged.array([1.0, 2.0, 3.0, 4.0])
a[1:3] = 0.0                             # slice ← scalar
a[ragged.array([True, False, True, False])] = 99.0  # boolean mask

For ragged arrays, in-place assignment supports integer and slice keys on the outermost axis; boolean-mask keys raise TypeError on ragged layouts:

r = ragged.array([[1.0, 2.0], [3.0]])
r[0] = [10.0, 20.0, 30.0]    # replace row with a different-length row

Functional updates with .at

The .at interface provides JAX-style updates with copy semantics. Each call returns a new array; the original is never mutated:

a = ragged.array([1.0, 2.0, 3.0])

a.at[1].set(99.0)            # → [1., 99., 3.]
a.at[0:2].add(10.0)          # → [11., 12., 3.]
a.at[2].multiply(2.0)        # → [1., 2., 6.]
a.at[1].subtract(0.5)
a.at[1].divide(2.0)
a.at[2].power(2.0)
a.at[1].min(3.0)             # x[1] = min(x[1], 3.0)
a.at[1].max(0.0)             # x[1] = max(x[1], 0.0)

print(a)   # [1., 2., 3.]  ← unchanged
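
The copy semantics can be pictured with a small stand-in written in plain Python. The Box and _AtIndexer classes below are hypothetical, handle only flat lists with integer or slice keys, and show the pattern (copy, modify the copy, return it) rather than how ragged implements .at.

```python
import operator


class _AtIndexer:
    """Holds the source values and a key; every update works on a copy."""

    def __init__(self, values, key=None):
        self._values = values
        self._key = key

    def __getitem__(self, key):
        return _AtIndexer(self._values, key)

    def _apply(self, func, operand):
        out = list(self._values)                 # copy; the source is untouched
        positions = range(len(out))[self._key]   # int key -> int, slice -> range
        if isinstance(positions, int):
            positions = [positions]
        for i in positions:
            out[i] = func(out[i], operand)
        return Box(out)

    def set(self, value):
        return self._apply(lambda old, new: new, value)

    def add(self, value):
        return self._apply(operator.add, value)

    def multiply(self, value):
        return self._apply(operator.mul, value)


class Box:
    """Hypothetical flat-array stand-in exposing an .at property."""

    def __init__(self, values):
        self.values = list(values)

    @property
    def at(self):
        return _AtIndexer(self.values)


a = Box([1.0, 2.0, 3.0])
print(a.at[1].set(99.0).values)      # [1.0, 99.0, 3.0]
print(a.at[0:2].add(10.0).values)    # [11.0, 12.0, 3.0]
print(a.values)                      # [1.0, 2.0, 3.0]
```

Because each update returns a fresh object, calls can be chained, and any intermediate result can be kept or discarded without affecting the source.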

Elementwise Operations

All Array API elementwise functions are available:

ragged.sqrt(a)
ragged.abs(a)
ragged.exp(a)
ragged.log(a)
ragged.sin(a); ragged.cos(a); ragged.tan(a)
ragged.add(a, b)          # or a + b
ragged.multiply(a, 2.0)   # or a * 2.0
ragged.equal(a, b)        # or a == b

Operator overloads

All standard Python operators are supported: +, -, *, /, //, **, %, &, |, ^, ~, <<, >>, ==, !=, <, <=, >, >=, @ (matrix multiply).

NumPy interoperability (NEP-13 / NEP-18)

ragged.array implements __array_ufunc__ (NEP-13) and __array_function__ (NEP-18), so NumPy functions work transparently:

import numpy as np

np.sqrt(a)                   # delegates to ragged.sqrt
np.add(a, b)                 # delegates through __array_ufunc__
np.concatenate([a, b])       # delegates to ragged.concat
np.stack([a, b])             # delegates to ragged.stack

Array Manipulation

Shape operations

ragged.reshape(a, (6,))         # reshape (uniform arrays only)
ragged.squeeze(a, axis=0)       # remove size-1 axis
ragged.expand_dims(a, axis=0)   # insert new axis
ragged.permute_dims(a, (1, 0))  # transpose / axis permutation

Reordering

ragged.flip(a)                  # reverse all elements
ragged.flip(a, axis=0)          # reverse along axis 0
ragged.roll(a, shift=2)         # roll all elements by 2
ragged.roll(a, shift=1, axis=0) # roll rows

Joining

ragged.concat([a, b])           # concatenate along axis 0
ragged.concat([a, b], axis=1)   # concatenate along axis 1
ragged.stack([a, b])            # stack into new axis 0
ragged.stack([a, b], axis=1)

Broadcasting

ragged.broadcast_to(a, (4, 3))
x, y = ragged.broadcast_arrays(a, b)

Statistical and Searching Functions

ragged.sum(a)                  # sum of all elements
ragged.sum(a, axis=1)          # sum per row
ragged.mean(a, axis=0)
ragged.max(a); ragged.min(a)
ragged.std(a); ragged.var(a)
ragged.prod(a)

ragged.argmax(a); ragged.argmin(a)
ragged.nonzero(a)
ragged.where(condition, x, y)

ragged.sort(a)
ragged.argsort(a)

Linear Algebra

ragged.matmul(a, b)            # or a @ b
ragged.tensordot(a, b, axes=1)
ragged.vecdot(a, b)            # inner product over last axis
ragged.matrix_transpose(a)     # swap last two axes

Data Type Functions

ragged.astype(a, np.float32)
ragged.can_cast(np.float32, np.float64)
ragged.result_type(a, b)
ragged.isdtype(np.float32, "real floating")

I/O: CF Conventions

The ragged.io sub-package provides serialisation helpers for the Climate and Forecast (CF) Conventions ragged-array encodings.

Contiguous ragged array (CF DSG H.3.1)

A contiguous encoding stores all values in a flat content array and row lengths in a counts array:

import ragged.io

a = ragged.array([[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]])

content, counts = ragged.io.to_cf_contiguous(a)
# content: ragged.array([1., 2., 3., 4., 5., 6.])
# counts:  ragged.array([2, 1, 3])

restored = ragged.io.from_cf_contiguous(content, counts)
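
To make the encoding concrete, here is a minimal pure-Python sketch of the same round trip on nested lists. The functions encode_contiguous and decode_contiguous are hypothetical names used for illustration; they are not the ragged.io functions and say nothing about how those are implemented.

```python
def encode_contiguous(rows):
    """Flatten rows into (content, counts)."""
    content = [value for row in rows for value in row]
    counts = [len(row) for row in rows]
    return content, counts


def decode_contiguous(content, counts):
    """Rebuild rows by slicing content at the running total of counts."""
    if sum(counts) != len(content):
        raise ValueError("counts must sum to len(content)")
    rows, start = [], 0
    for n in counts:
        rows.append(content[start:start + n])
        start += n
    return rows


rows = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
flat, lengths = encode_contiguous(rows)
print(flat)                                      # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(lengths)                                   # [2, 1, 3]
print(decode_contiguous(flat, lengths) == rows)  # True
```

Empty rows survive the round trip because they appear as a zero in counts.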

Indexed ragged array (CF DSG H.4.1)

An indexed encoding stores the row index for every element:

content, index = ragged.io.to_cf_indexed(a)
# content: ragged.array([1., 2., 3., 4., 5., 6.])
# index:   ragged.array([0, 0, 1, 2, 2, 2])

restored = ragged.io.from_cf_indexed(content, index)
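
The indexed form can be sketched the same way in plain Python. encode_indexed and decode_indexed are hypothetical helpers, not the ragged.io functions, and the n_rows parameter is an assumption added here because an index array alone cannot record trailing empty rows.

```python
def encode_indexed(rows):
    """Flatten rows into (content, index), where index[i] is the row of content[i]."""
    content, index = [], []
    for row_number, row in enumerate(rows):
        content.extend(row)
        index.extend([row_number] * len(row))
    return content, index


def decode_indexed(content, index, n_rows=None):
    """Group content by row number; elements may arrive in any order."""
    if len(content) != len(index):
        raise ValueError("content and index must have the same length")
    if n_rows is None:
        n_rows = max(index) + 1 if index else 0
    rows = [[] for _ in range(n_rows)]
    for value, row_number in zip(content, index):
        rows[row_number].append(value)
    return rows


rows = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
flat, row_index = encode_indexed(rows)
print(row_index)                                   # [0, 0, 1, 2, 2, 2]
print(decode_indexed(flat, row_index) == rows)     # True

# Interleaved input: elements of row 0 need not be adjacent.
print(decode_indexed([1.0, 3.0, 2.0], [0, 1, 0]))  # [[1.0, 2.0], [3.0]]
```

Unlike the contiguous form, the indexed form does not require the elements of a row to be stored next to each other, which suits data that is appended as it arrives.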

Constants

ragged.e          # Euler's number
ragged.pi         # π
ragged.inf        # IEEE 754 infinity
ragged.nan        # IEEE 754 NaN
ragged.newaxis    # alias for None, for use in indexing

Array API Compliance

ragged targets the Python Array API Standard (2022.12 and later). The namespace is discoverable via:

xp = a.__array_namespace__()   # returns the ragged module
xp.sqrt(a)

This allows Array-API-consuming libraries (e.g. array_api_extra, SciPy, scikit-learn) to use ragged.array transparently wherever they accept an Array API input.