User Guide¶
Installation¶
Ragged requires Python 3.10 or later and is available on PyPI:
pip install ragged
The only runtime dependencies are awkward >= 2.6.7 and numpy >= 1.24.
GPU support (CUDA) requires cupy; see Device.
Quickstart¶
import ragged
import numpy as np
# Create a ragged array from a nested Python list
a = ragged.array([[1.0, 2.0, 3.0], [4.0], [5.0, 6.0]])
print(a.shape) # (3, None)
print(a.dtype) # float64
# Elementwise operations preserve shape
b = ragged.sqrt(a)
# Boolean indexing
c = a[a > 2.0]
# Reduction along an axis
row_sums = ragged.sum(a, axis=1) # shape (3,)
# Functional update (JAX-style, copy semantics)
d = a.at[0].set(ragged.array([10.0, 20.0, 30.0]))
The Array Model¶
Shape¶
A ragged.array has a shape tuple where each element is either an
int (fixed-size dimension) or None (variable-size / ragged dimension).
shape[0] is always an int: the number of rows at the outermost level. Any later dimension can be
None when row lengths are non-uniform.
ragged.array([1.0, 2.0, 3.0]).shape # (3,)
ragged.array([[1.0, 2.0], [3.0, 4.0]]).shape # (2, None) ← uniform rows
ragged.array([[1.0, 2.0], [3.0]]).shape # (2, None) ← ragged rows
Note
Even when all rows happen to have the same length, dimensions past the first
report None if the array was built from Python lists. Pass a NumPy
array to the constructor to obtain a fully regular shape:
ragged.array(np.ones((3, 4))).shape # (3, 4) ← regular from numpy
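To make the shape rule concrete, the following pure-Python sketch derives the same kind of shape tuple from a nested list. The helper name list_shape is hypothetical and is not part of ragged; it only mirrors the behaviour described above, where list input reports an int for the first dimension and None for every nested level.

```python
def list_shape(data):
    """Return a shape tuple for a nested list: an int for the outermost
    dimension and None for every nested list level below it."""
    def depth(obj):
        # Depth counts list nesting levels; scalars have depth 0.
        if not isinstance(obj, list):
            return 0
        # An empty list still contributes one level.
        return 1 + max((depth(item) for item in obj), default=0)

    if not isinstance(data, list):
        return ()  # a scalar has an empty shape
    return (len(data),) + (None,) * (depth(data) - 1)


print(list_shape([1.0, 2.0, 3.0]))           # (3,)
print(list_shape([[1.0, 2.0], [3.0, 4.0]]))  # (2, None)
print(list_shape([[1.0, 2.0], [3.0]]))       # (2, None)
```

The sketch never inspects row lengths, which is why uniform and ragged list input produce the same tuple.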
Dtype¶
ragged.array always stores a single numpy.dtype for all elements.
Mixed-type input is upcast following NumPy’s type promotion rules.
a = ragged.array([[1, 2], [3]], dtype=np.float32)
a.dtype # dtype('float32')
Supported dtypes mirror the Array API dtype table: bool, int8/16/32/64, uint8/16/32/64, float32/64, complex64/128.
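The promotion rules referred to above can be inspected with NumPy directly. This short sketch uses only np.result_type and does not call ragged; it shows which dtype NumPy picks when two dtypes meet.

```python
import numpy as np

# A small integer combined with a 32-bit float fits in float32.
small_int_and_float = np.result_type(np.int8, np.float32)

# A 32-bit integer needs float64 to be represented without loss.
wide_int_and_float = np.result_type(np.int32, np.float32)

# Booleans promote to whatever integer type they are combined with.
bool_and_int = np.result_type(np.bool_, np.int16)

print(small_int_and_float)  # float32
print(wide_int_and_float)   # float64
print(bool_and_int)         # int16
```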
Device¶
ragged.array tracks whether data live on CPU or GPU:
a = ragged.array([1.0, 2.0], device="cpu") # backed by NumPy (default)
# a = ragged.array([1.0, 2.0], device="cuda") # backed by CuPy (needs GPU)
Creating Arrays¶
From Python objects¶
ragged.array([[1, 2, 3], [4, 5]]) # from nested list
ragged.array(np.arange(6).reshape(2,3)) # from numpy array
ragged.asarray([1.0, 2.0, 3.0]) # alias for array()
From factory functions¶
ragged.zeros((3,)) # 1-D zeros
ragged.ones((2, 4)) # 2-D ones
ragged.full((3,), 7.0) # filled with scalar
ragged.arange(10) # 0..9
ragged.linspace(0, 1, 5) # five evenly-spaced values
ragged.eye(3) # 3×3 identity
ragged.empty_like(a) # same shape/dtype, uninitialised
ragged.zeros_like(a)
ragged.ones_like(a)
Meshgrid¶
xs, ys = ragged.meshgrid(ragged.arange(3), ragged.arange(4))
Indexing¶
Integer and slice indexing¶
a = ragged.array([[1.0, 2.0, 3.0], [4.0, 5.0]])
a[0] # first row → ragged.array([1., 2., 3.])
a[-1] # last row → ragged.array([4., 5.])
a[0:2] # slice → ragged.array([[1., 2., 3.], [4., 5.]])
a[0, 1] # element → ragged.array(2.) (for uniform rows)
Boolean masking¶
mask = ragged.array([True, False])
a[mask] # → ragged.array([[1., 2., 3.]])
# Computed mask
a[a > 2.0]
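The two kinds of mask can be pictured with plain nested lists. This is a sketch only: it assumes that an element-wise mask keeps the row structure, which is how Awkward Array behaves; the guide does not state the result of a[a > 2.0] for ragged itself.

```python
rows = [[1.0, 2.0, 3.0], [4.0, 5.0]]

# Row mask: one boolean per row selects whole rows.
row_mask = [True, False]
selected_rows = [row for row, keep in zip(rows, row_mask) if keep]

# Element mask: one boolean per element, with the same nesting as the data.
elem_mask = [[value > 2.0 for value in row] for row in rows]
selected = [
    [value for value, keep in zip(row, mask_row) if keep]
    for row, mask_row in zip(rows, elem_mask)
]

print(selected_rows)  # [[1.0, 2.0, 3.0]]
print(selected)       # [[3.0], [4.0, 5.0]]
```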
In-place mutation (__setitem__)¶
ragged.array supports in-place assignment, enabling compatibility with
array_api_extra.at:
a = ragged.array([1.0, 2.0, 3.0, 4.0])
a[1:3] = 0.0 # slice ← scalar
a[ragged.array([True, False, True, False])] = 99.0 # boolean mask
For ragged arrays, in-place assignment supports integer and slice keys on the
outermost axis (boolean-mask keys raise TypeError on ragged layouts):
r = ragged.array([[1.0, 2.0], [3.0]])
r[0] = [10.0, 20.0, 30.0] # replace row with a different-length row
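Replacing a row with one of a different length changes where every later row starts. The sketch below illustrates this with a flat content list plus row offsets, one common way to store ragged data. That layout and the helper name replace_row are assumptions made for illustration, not a description of ragged's internals.

```python
def replace_row(content, offsets, row, new_values):
    """Return new (content, offsets) with one row replaced.

    `offsets` has one more entry than there are rows; row i spans
    content[offsets[i]:offsets[i + 1]].
    """
    start, stop = offsets[row], offsets[row + 1]
    new_content = content[:start] + list(new_values) + content[stop:]
    # Rows after the replaced one shift by the change in length.
    delta = len(new_values) - (stop - start)
    new_offsets = offsets[: row + 1] + [o + delta for o in offsets[row + 1:]]
    return new_content, new_offsets


# [[1.0, 2.0], [3.0]] stored as flat content plus offsets
content, offsets = [1.0, 2.0, 3.0], [0, 2, 3]
new_content, new_offsets = replace_row(content, offsets, 0, [10.0, 20.0, 30.0])
print(new_content)  # [10.0, 20.0, 30.0, 3.0]
print(new_offsets)  # [0, 3, 4]
```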
Functional updates with .at¶
The .at interface provides JAX-style copy-semantics updates — the original
array is never mutated:
a = ragged.array([1.0, 2.0, 3.0])
a.at[1].set(99.0) # → [1., 99., 3.]
a.at[0:2].add(10.0) # → [11., 12., 3.]
a.at[2].multiply(2.0) # → [1., 2., 6.]
a.at[1].subtract(0.5)
a.at[1].divide(2.0)
a.at[2].power(2.0)
a.at[1].min(3.0) # result[1] = min(a[1], 3.0)
a.at[1].max(0.0) # result[1] = max(a[1], 0.0)
print(a) # [1., 2., 3.] ← unchanged
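The copy semantics described above can be sketched in pure Python. The classes ToyArray, _At and _AtIndex are hypothetical stand-ins written for this example and cover only set, add and multiply; they are not how ragged implements .at.

```python
import operator


class _AtIndex:
    """Returned by ToyArray.at[key]; every method returns a new ToyArray."""

    def __init__(self, values, key):
        self._values = values
        self._key = key

    def _apply(self, func, operand):
        out = list(self._values)  # copy, so the source list is untouched
        if isinstance(self._key, slice):
            indices = range(len(out))[self._key]
        else:
            indices = [self._key]
        for i in indices:
            out[i] = func(out[i], operand)
        return ToyArray(out)

    def set(self, value):
        return self._apply(lambda old, new: new, value)

    def add(self, value):
        return self._apply(operator.add, value)

    def multiply(self, value):
        return self._apply(operator.mul, value)


class _At:
    def __init__(self, values):
        self._values = values

    def __getitem__(self, key):
        return _AtIndex(self._values, key)


class ToyArray:
    def __init__(self, values):
        self.values = list(values)

    @property
    def at(self):
        return _At(self.values)


a = ToyArray([1.0, 2.0, 3.0])
print(a.at[1].set(99.0).values)    # [1.0, 99.0, 3.0]
print(a.at[0:2].add(10.0).values)  # [11.0, 12.0, 3.0]
print(a.values)                    # [1.0, 2.0, 3.0]
```

Because each update works on a fresh copy, chained calls such as a.at[0].set(0.0).at[1].add(1.0) compose without touching a.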
Elementwise Operations¶
All Array API elementwise functions are available:
ragged.sqrt(a)
ragged.abs(a)
ragged.exp(a)
ragged.log(a)
ragged.sin(a); ragged.cos(a); ragged.tan(a)
ragged.add(a, b) # or a + b
ragged.multiply(a, 2.0) # or a * 2.0
ragged.equal(a, b) # or a == b
Operator overloads¶
All standard Python operators are supported:
+, -, *, /, //, **, %,
&, |, ^, ~, <<, >>,
==, !=, <, <=, >, >=,
@ (matrix multiply).
NumPy interoperability (NEP-13 / NEP-18)¶
ragged.array implements __array_ufunc__ (NEP-13) and
__array_function__ (NEP-18), so NumPy functions work transparently:
import numpy as np
np.sqrt(a) # delegates to ragged.sqrt
np.add(a, b) # delegates through __array_ufunc__
np.concatenate([a, b]) # delegates to ragged.concat
np.stack([a, b]) # delegates to ragged.stack
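For readers unfamiliar with NEP-13, this minimal sketch shows the mechanism by which a NumPy ufunc hands control to another array type. The class Wrapped is a hypothetical example type, not ragged.array, and it handles only plain ufunc calls.

```python
import numpy as np


class Wrapped:
    """Toy container that intercepts NumPy ufunc calls via __array_ufunc__."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method != "__call__":
            # reduce, accumulate, etc. are out of scope for this sketch
            return NotImplemented
        # Unwrap our own type, apply the ufunc, and wrap the result again.
        raw = [x.data if isinstance(x, Wrapped) else x for x in inputs]
        return Wrapped(ufunc(*raw, **kwargs))


w = Wrapped([1.0, 4.0, 9.0])
roots = np.sqrt(w)       # NumPy calls Wrapped.__array_ufunc__
shifted = np.add(w, 1.0)

print(type(roots).__name__)  # Wrapped
print(roots.data)            # [1. 2. 3.]
print(shifted.data)          # [ 2.  5. 10.]
```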
Array Manipulation¶
Shape operations¶
ragged.reshape(a, (6,)) # reshape (uniform arrays only)
ragged.squeeze(a, axis=0) # remove size-1 axis
ragged.expand_dims(a, axis=0) # insert new axis
ragged.permute_dims(a, (1, 0)) # transpose / axis permutation
Reordering¶
ragged.flip(a) # reverse all elements
ragged.flip(a, axis=0) # reverse along axis 0
ragged.roll(a, shift=2) # roll all elements by 2
ragged.roll(a, shift=1, axis=0) # roll rows
Joining¶
ragged.concat([a, b]) # concatenate along axis 0
ragged.concat([a, b], axis=1) # concatenate along axis 1
ragged.stack([a, b]) # stack into new axis 0
ragged.stack([a, b], axis=1)
Broadcasting¶
ragged.broadcast_to(a, (4, 3))
x, y = ragged.broadcast_arrays(a, b)
Statistical and Searching Functions¶
ragged.sum(a) # sum of all elements
ragged.sum(a, axis=1) # sum per row
ragged.mean(a, axis=0)
ragged.max(a); ragged.min(a)
ragged.std(a); ragged.var(a)
ragged.prod(a)
ragged.argmax(a); ragged.argmin(a)
ragged.nonzero(a)
ragged.where(condition, x, y)
ragged.sort(a)
ragged.argsort(a)
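What a reduction over a ragged axis computes can be pictured with plain nested lists. This sketch does not call ragged; it reproduces by hand the per-row and whole-array sums for the array used in the Quickstart.

```python
rows = [[1.0, 2.0, 3.0], [4.0], [5.0, 6.0]]

# axis=1: one result per row, so the ragged dimension disappears.
row_sums = [sum(row) for row in rows]

# No axis: reduce over every element.
total = sum(sum(row) for row in rows)

print(row_sums)  # [6.0, 4.0, 11.0]
print(total)     # 21.0
```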
Linear Algebra¶
ragged.matmul(a, b) # or a @ b
ragged.tensordot(a, b, axes=1)
ragged.vecdot(a, b) # inner product over last axis
ragged.matrix_transpose(a) # swap last two axes
Data Type Functions¶
ragged.astype(a, np.float32)
ragged.can_cast(np.float32, np.float64)
ragged.result_type(a, b)
ragged.isdtype(np.float32, "real floating")
I/O: CF Conventions¶
The ragged.io sub-package provides serialisation helpers for the
Climate and Forecast (CF) Conventions ragged-array encodings.
Contiguous ragged array (CF DSG H.3.1)¶
A contiguous encoding stores all values in a flat content array and
row lengths in a counts array:
import ragged.io
a = ragged.array([[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]])
content, counts = ragged.io.to_cf_contiguous(a)
# content: ragged.array([1., 2., 3., 4., 5., 6.])
# counts: ragged.array([2, 1, 3])
restored = ragged.io.from_cf_contiguous(content, counts)
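The contiguous encoding itself is simple enough to sketch with plain lists. The helper names encode_contiguous and decode_contiguous are hypothetical and are not the ragged.io functions; the sketch only shows the relationship between rows, content and counts described above.

```python
def encode_contiguous(rows):
    """Flatten rows into one content list and record each row's length."""
    content = [value for row in rows for value in row]
    counts = [len(row) for row in rows]
    return content, counts


def decode_contiguous(content, counts):
    """Rebuild rows by slicing content into consecutive runs."""
    rows, start = [], 0
    for n in counts:
        rows.append(content[start:start + n])
        start += n
    return rows


rows = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
content, counts = encode_contiguous(rows)
print(content)  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(counts)   # [2, 1, 3]
print(decode_contiguous(content, counts) == rows)  # True
```

Empty rows survive the round trip because they are recorded as a count of 0.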
Indexed ragged array (CF DSG H.4.1)¶
An indexed encoding stores the row index for every element:
content, index = ragged.io.to_cf_indexed(a)
# content: ragged.array([1., 2., 3., 4., 5., 6.])
# index: ragged.array([0, 0, 1, 2, 2, 2])
restored = ragged.io.from_cf_indexed(content, index)
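The indexed encoding can be sketched the same way. The helper names encode_indexed and decode_indexed and the n_rows parameter are hypothetical and are not the ragged.io functions.

```python
def encode_indexed(rows):
    """Flatten rows and record, for every element, the row it came from."""
    content, index = [], []
    for i, row in enumerate(rows):
        content.extend(row)
        index.extend([i] * len(row))
    return content, index


def decode_indexed(content, index, n_rows=None):
    """Rebuild rows by appending each element to the row its index names."""
    if n_rows is None:
        n_rows = max(index) + 1 if index else 0
    rows = [[] for _ in range(n_rows)]
    for value, i in zip(content, index):
        rows[i].append(value)
    return rows


rows = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
content, index = encode_indexed(rows)
print(index)  # [0, 0, 1, 2, 2, 2]
print(decode_indexed(content, index) == rows)  # True

# Elements of one row do not have to be adjacent in the content list.
print(decode_indexed([1.0, 3.0, 2.0], [0, 1, 0]))  # [[1.0, 2.0], [3.0]]
```

Unlike the contiguous form, the index alone cannot represent trailing empty rows, which is why this sketch accepts an explicit n_rows.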
Constants¶
ragged.e # Euler's number
ragged.pi # π
ragged.inf # IEEE 754 infinity
ragged.nan # IEEE 754 NaN
ragged.newaxis # alias for None, for use in indexing
Array API Compliance¶
ragged targets the Python Array API Standard (2022.12 and later).
The namespace is discoverable via:
xp = a.__array_namespace__() # returns the ragged module
xp.sqrt(a)
This allows Array-API-consuming libraries (e.g. array_api_extra,
scipy, sklearn) to use ragged.array transparently wherever they
accept an Array API input.
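The namespace lookup is what lets a consuming library stay independent of any one array implementation. This sketch uses a toy array type to show the pattern; DemoArray, demo_namespace and sum_of_roots are hypothetical names, and the toy namespace provides only sqrt and sum.

```python
import math
from types import SimpleNamespace


class DemoArray:
    """Stand-in array type exposing the __array_namespace__ hook."""

    def __init__(self, values):
        self.values = list(values)

    def __array_namespace__(self, api_version=None):
        return demo_namespace


def _demo_sqrt(x):
    return DemoArray(math.sqrt(v) for v in x.values)


def _demo_sum(x):
    return sum(x.values)


# A real namespace is a module; a SimpleNamespace is enough for the sketch.
demo_namespace = SimpleNamespace(sqrt=_demo_sqrt, sum=_demo_sum)


def sum_of_roots(x):
    # Library code asks the array for its namespace instead of importing one.
    xp = x.__array_namespace__()
    return xp.sum(xp.sqrt(x))


print(sum_of_roots(DemoArray([1.0, 4.0, 9.0])))  # 6.0
```

The function sum_of_roots never names the array library, so it would work unchanged with any type whose namespace provides sqrt and sum.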