Marrow

Apache Arrow in Mojo

Marrow is an implementation of Apache Arrow written in Mojo. It provides a columnar in-memory array format with null handling, SIMD-accelerated compute kernels, GPU support, and a Python binding layer that outperforms PyArrow on numeric array construction.

Build

git clone https://github.com/kszucs/marrow
cd marrow
pixi run build_python    # compiles python/marrow.so

Then, with the python/ directory on the import path, in Python:

import marrow as ma

5-minute example

import sys
sys.path.insert(0, "../python")
import marrow as ma

Create arrays

# Primitive arrays — type is inferred from the Python data
a = ma.array([1, 2, 3, None, 5])
b = ma.array([10, 20, 30, 40, 50])

print(a)
print(b)
PrimitiveArray[int64]([1, 2, 3, NULL, 5])
PrimitiveArray[int64]([10, 20, 30, 40, 50])
# Strings and floats work too (nested structures are covered on the Arrays page)
names  = ma.array(["Alice", "Bob", None, "Dana"])
scores = ma.array([9.5, 8.2, 7.8, None])
print(names)
print(scores)
StringArray([Alice, Bob, NULL, Dana])
PrimitiveArray[float64]([9.5, 8.2, 7.8, NULL])

Arithmetic

Null values propagate through element-wise operations: if either input is null at a given index, the output at that index is null.

result = ma.add(a, b, None)
print(result)   # index 3 is null because a[3] is null
PrimitiveArray[int64]([11, 22, 33, NULL, 55])
print(ma.mul(a, b, None))
PrimitiveArray[int64]([10, 40, 90, NULL, 250])
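
The propagation rule above can be modelled in plain Python, with None standing in for a null slot. This is only a sketch of the semantics, not Marrow's kernel; the helper name add_with_nulls is hypothetical and is not part of the marrow module.

```python
from typing import List, Optional


def add_with_nulls(
    xs: List[Optional[int]], ys: List[Optional[int]]
) -> List[Optional[int]]:
    """Element-wise add where a null (None) in either input yields a null."""
    if len(xs) != len(ys):
        raise ValueError("inputs must have the same length")
    return [
        None if x is None or y is None else x + y
        for x, y in zip(xs, ys)
    ]


a = [1, 2, 3, None, 5]
b = [10, 20, 30, 40, 50]
print(add_with_nulls(a, b))  # [11, 22, 33, None, 55]
```

The result has a null at index 3 for the same reason as the ma.add output: one of the two operands at that index is null.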

Aggregates

Aggregates skip null values.

print("sum:    ", ma.sum_(a, None))    # 1+2+3+5 = 11, null skipped
print("min:    ", ma.min_(a, None))
print("max:    ", ma.max_(a, None))
print("product:", ma.product(a, None))
sum:     11
min:     1
max:     5
product: 30
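
As a rough mental model, skipping nulls is equivalent to aggregating over only the valid values. The sketch below uses plain Python lists with None as the null marker; the helper valid_values is hypothetical, and how Marrow treats an array with no valid values at all is not covered here.

```python
import math
from typing import List, Optional


def valid_values(values: List[Optional[int]]) -> List[int]:
    """Return only the non-null entries, in their original order."""
    return [v for v in values if v is not None]


a = [1, 2, 3, None, 5]
valid = valid_values(a)

print("sum:    ", sum(valid))        # 1+2+3+5 = 11
print("min:    ", min(valid))        # 1
print("max:    ", max(valid))        # 5
print("product:", math.prod(valid))  # 1*2*3*5 = 30
```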

Filter and drop nulls

mask = ma.array([True, False, True, False, True])
print("filter:", ma.filter_(a, mask))
print("drop nulls:", ma.drop_nulls(a))
filter: PrimitiveArray[int64]([1, 3, 5])
drop nulls: PrimitiveArray[int64]([1, 2, 3, 5])
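
The two operations can be sketched in plain Python to show how they differ: filtering selects by a boolean mask, while dropping nulls selects by validity. The function names below are hypothetical stand-ins, and treating a null mask entry as "not selected" is an assumption of this sketch, not documented Marrow behaviour.

```python
from typing import List, Optional


def filter_with_mask(
    values: List[Optional[int]], mask: List[Optional[bool]]
) -> List[Optional[int]]:
    """Keep values whose mask entry is True; selected nulls are kept as nulls."""
    if len(values) != len(mask):
        raise ValueError("values and mask must have the same length")
    # Assumption: a null (None) mask entry does not select its value.
    return [v for v, keep in zip(values, mask) if keep is True]


def drop_null_values(values: List[Optional[int]]) -> List[int]:
    """Keep only the non-null values."""
    return [v for v in values if v is not None]


a = [1, 2, 3, None, 5]
mask = [True, False, True, False, True]
print("filter:    ", filter_with_mask(a, mask))  # [1, 3, 5]
print("drop nulls:", drop_null_values(a))        # [1, 2, 3, 5]
```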

Inspect arrays

print("length:    ", a.__len__())
print("null count:", a.null_count())
print("type:      ", a.type())
print("is_valid(3):", a.is_valid(3))   # False — index 3 is null
length:     5
null count: 1
type:       int64
is_valid(3): False
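
In the Arrow columnar format, nulls are tracked by a validity bitmap: one bit per element, least-significant bit first within each byte, where a set bit means the value is valid. The sketch below builds such a bitmap in plain Python to show how a null count and a per-index validity check can be derived from it. The function names are hypothetical, and this illustrates the Arrow layout rather than Marrow's internal code.

```python
from typing import List, Optional


def build_validity_bitmap(values: List[Optional[int]]) -> bytes:
    """Pack one validity bit per element, LSB first (1 = valid, 0 = null)."""
    bitmap = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v is not None:
            bitmap[i // 8] |= 1 << (i % 8)
    return bytes(bitmap)


def bit_is_valid(bitmap: bytes, i: int) -> bool:
    """Read the validity bit for element i."""
    return bool((bitmap[i // 8] >> (i % 8)) & 1)


def count_nulls(bitmap: bytes, length: int) -> int:
    """Number of elements whose validity bit is unset."""
    return sum(1 for i in range(length) if not bit_is_valid(bitmap, i))


values = [1, 2, 3, None, 5]
bitmap = build_validity_bitmap(values)
print(f"{bitmap[0]:08b}")               # 00010111 (bit 3 is unset)
print(count_nulls(bitmap, len(values)))  # 1
print(bit_is_valid(bitmap, 3))           # False
```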

What’s next

Page             What you’ll learn
Arrays           All array types, nulls, slicing, nested structures
Compute          Arithmetic, aggregates, filter, drop_nulls
Type System      Type inference, explicit schemas, struct types
PyArrow Interop  Benchmarks vs PyArrow, zero-copy C Data Interface