Arrays

Arrays are the core data structure in Marrow. They are immutable and columnar — values are stored contiguously in memory with a separate validity bitmap that tracks null positions.

Primitive arrays

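Primitive arrays hold fixed-size scalar values: booleans, integers, and floats. When no type is given, Python ints are inferred as int64 and Python floats as float64. Boolean arrays are displayed as BoolArray.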

# Integer arrays
i64 = ma.array([1, 2, 3, 4, 5])
i32 = ma.array([1, 2, 3, 4, 5], type=ma.int32())
u8  = ma.array([0, 127, 255],   type=ma.uint8())

print(i64)
print(i32)
print(u8)
PrimitiveArray[int64]([1, 2, 3, 4, 5])
PrimitiveArray[int32]([1, 2, 3, 4, 5])
PrimitiveArray[uint8]([0, 127, 255])

# Floating point
f32 = ma.array([1.0, 2.5, 3.14], type=ma.float32())
f64 = ma.array([1.0, 2.5, 3.14])   # float64 inferred
print(f32)
print(f64)
PrimitiveArray[float32]([1.0, 2.5, 3.14])
PrimitiveArray[float64]([1.0, 2.5, 3.14])

# Boolean
b = ma.array([True, False, True, False])
print(b)
BoolArray([True, False, True, False])
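
To make "fixed-size" concrete, here is a minimal sketch that uses only Python's standard struct module, not Marrow itself. The FORMATS table and the helpers pack_values and item_size are hypothetical names chosen for illustration; the point is only that every value of a given type occupies the same number of bytes, so element i can be located by arithmetic.

```python
import struct

# Standard-size struct codes ("<" means little-endian with no padding).
# The type names mirror the ones used above; the mapping is illustrative only.
FORMATS = {"uint8": "B", "int32": "i", "int64": "q", "float32": "f", "float64": "d"}

def item_size(type_name):
    """Number of bytes one value of this type occupies."""
    return struct.calcsize("<" + FORMATS[type_name])

def pack_values(type_name, values):
    """Pack values into one contiguous bytes object of fixed-width slots."""
    code = FORMATS[type_name]
    return struct.pack(f"<{len(values)}{code}", *values)

u8_buf = pack_values("uint8", [0, 127, 255])
i32_buf = pack_values("int32", [1, 2, 3, 4, 5])
i64_buf = pack_values("int64", [1, 2, 3, 4, 5])

print(len(u8_buf), len(i32_buf), len(i64_buf))   # 3 20 40

# Element i starts at byte offset i * item_size, so lookup needs no scanning.
third = struct.unpack_from("<i", i32_buf, 2 * item_size("int32"))[0]
print(third)   # 3
```

Choosing a narrower type such as uint8 or int32 shrinks the buffer accordingly, at the cost of a smaller representable range.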

Null values

Pass None to mark a position as null. The validity bitmap tracks which positions are valid.

arr = ma.array([1, None, 3, None, 5])
print(arr)
print("null count:", arr.null_count())
print("is_valid(0):", arr.is_valid(0))   # True
print("is_valid(1):", arr.is_valid(1))   # False — null
PrimitiveArray[int64]([1, NULL, 3, NULL, 5])
null count: 2
is_valid(0): True
is_valid(1): False

Note

Null is not zero. is_valid(1) is False — the value at index 1 is absent entirely, not 0 or empty.
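
The distinction between a null and a zero is easier to see in a small model of a validity bitmap. The sketch below is plain Python and does not show Marrow's internals: the functions build, is_valid, and null_count are hypothetical, and the bit order (least significant bit first, one bit per position) is an assumption borrowed from the Apache Arrow format.

```python
def build(values, fill=0):
    """Split a list into a dense values list and a bit-packed validity bitmap."""
    bitmap = bytearray((len(values) + 7) // 8)
    data = []
    for i, v in enumerate(values):
        if v is None:
            data.append(fill)                # placeholder slot, never read as a value
        else:
            data.append(v)
            bitmap[i // 8] |= 1 << (i % 8)   # set bit i: this position is valid
    return data, bytes(bitmap)

def is_valid(bitmap, i):
    """True if bit i of the bitmap is set."""
    return ((bitmap[i // 8] >> (i % 8)) & 1) == 1

def null_count(bitmap, length):
    """Count positions whose validity bit is clear."""
    return length - sum(is_valid(bitmap, i) for i in range(length))

data, bitmap = build([1, None, 3, None, 5])
print(data)                        # [1, 0, 3, 0, 5]
print(format(bitmap[0], "08b"))    # 00010101
print(null_count(bitmap, 5))       # 2
print(is_valid(bitmap, 1))         # False
```

In this model the slot at index 1 happens to hold 0, but the cleared validity bit is what marks it as null; a genuine 0 would have its bit set.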

Inspecting arrays

Use len(), null_count(), and type() to read an array's length, number of nulls, and element type.

arr = ma.array([10, 20, 30, None, 50], type=ma.int32())

print("length:    ", len(arr))
print("null count:", arr.null_count())
print("type:      ", arr.type())
length:     5
null count: 1
type:       int32

String arrays

String arrays store variable-length UTF-8 strings.

s = ma.array(["hello", None, "world", "mañana"])
print(s)
print("null count:", s.null_count())
print("is_valid(1):", s.is_valid(1))
StringArray([hello, NULL, world, mañana])
null count: 1
is_valid(1): False
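
One common way to store variable-length strings in a columnar format is a single byte buffer plus an offsets list. The following pure-Python sketch assumes that layout for illustration; it is not taken from Marrow's source, and encode_strings and get are hypothetical helpers.

```python
def encode_strings(strings):
    """Encode strings as (offsets, data, validity)."""
    offsets = [0]
    data = bytearray()
    validity = []
    for s in strings:
        if s is not None:
            data += s.encode("utf-8")
        validity.append(s is not None)
        offsets.append(len(data))   # a null contributes a zero-length entry
    return offsets, bytes(data), validity

def get(offsets, data, validity, i):
    """Decode element i, or return None if it is null."""
    if not validity[i]:
        return None
    return data[offsets[i]:offsets[i + 1]].decode("utf-8")

# "ma\u00f1ana" is "mañana" written with an escape so the byte count is unambiguous.
offsets, data, validity = encode_strings(["hello", None, "world", "ma\u00f1ana"])
print(offsets)                           # [0, 5, 5, 10, 17]
print(get(offsets, data, validity, 1))   # None
print(get(offsets, data, validity, 3))   # mañana
```

Offsets count bytes, not characters: the last string has six characters but takes seven bytes because ñ encodes to two bytes in UTF-8.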

List arrays

List arrays store variable-length sequences. Each element is itself a list.

nested = ma.array([[1, 2], [3, 4, 5], [6]])
print(nested)
print("type:", nested.type())
ListArray([PrimitiveArray[int64]([1, 2]), PrimitiveArray[int64]([3, 4, 5]), PrimitiveArray[int64]([6])])
type: list<int64>

Nulls in list arrays

with_null = ma.array([[1, 2], None, [3, 4, 5]])
print(with_null)
print("null count:", with_null.null_count())
ListArray([PrimitiveArray[int64]([1, 2]), NULL, PrimitiveArray[int64]([3, 4, 5])])
null count: 1
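
A null list is different from an empty list. The sketch below models a list array as a flat child list, an offsets list, and a validity list. This layout is an assumption made for illustration, and encode_lists and get_list are hypothetical names, not Marrow functions.

```python
def encode_lists(lists):
    """Flatten nested lists into (offsets, child, validity)."""
    offsets = [0]
    child = []
    validity = []
    for item in lists:
        if item is not None:
            child.extend(item)
        validity.append(item is not None)
        offsets.append(len(child))
    return offsets, child, validity

def get_list(offsets, child, validity, i):
    """Return element i as a Python list, or None if it is null."""
    if not validity[i]:
        return None
    return child[offsets[i]:offsets[i + 1]]

offsets, child, validity = encode_lists([[1, 2], None, [3, 4, 5], []])
print(child)      # [1, 2, 3, 4, 5]
print(offsets)    # [0, 2, 2, 5, 5]
print(get_list(offsets, child, validity, 1))   # None
print(get_list(offsets, child, validity, 3))   # []
```

Both the null at index 1 and the empty list at index 3 span zero child values; only the validity entry tells them apart.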

Struct arrays

Struct arrays store rows of named fields — like a table with one row per element.

people = ma.array([
    {"name": "Alice", "age": 30},
    {"name": "Bob",   "age": 25},
    {"name": None,    "age": 40},   # null name
])
print(people)
print("type:", people.type())
StructArray({'name': StringArray([Alice, Bob, NULL]), 'age': PrimitiveArray[int64]([30, 25, 40])})
type: struct<name: string, age: int64>
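
The printed StructArray holds one child array per field rather than one object per row. As a rough model of that arrangement in plain Python, with hypothetical helpers rows_to_columns and row_at, the row dictionaries can be transposed into one list per field:

```python
def rows_to_columns(rows):
    """Transpose dict rows into one list per field (field order from the first row)."""
    fields = list(rows[0].keys())
    return {name: [row.get(name) for row in rows] for name in fields}

def row_at(columns, i):
    """Reassemble row i from the per-field columns."""
    return {name: col[i] for name, col in columns.items()}

people = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": None, "age": 40},   # null name
]
columns = rows_to_columns(people)
print(columns)             # {'name': ['Alice', 'Bob', None], 'age': [30, 25, 40]}
print(row_at(columns, 1))  # {'name': 'Bob', 'age': 25}
print(max(columns["age"])) # 40
```

Keeping each field in its own column means an operation over one field, such as the maximum age here, touches only that column.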

Slicing

slice(offset, length) returns a zero-copy view of length elements starting at offset. The view shares the original array's data rather than copying it.

arr = ma.array([10, 20, 30, None, 50, 60])
print("original: ", arr)
print("slice(2, 3):", arr.slice(2, 3))   # [30, NULL, 50]
print("slice(0, 4):", arr.slice(0, 4))   # [10, 20, 30, NULL]
original:  PrimitiveArray[int64]([10, 20, 30, NULL, 50, 60])
slice(2, 3): PrimitiveArray[int64]([30, NULL, 50])
slice(0, 4): PrimitiveArray[int64]([10, 20, 30, NULL])

Slicing works on all array types:

s = ma.array(["a", "b", "c", "d", "e"])
print(s.slice(1, 3))   # ["b", "c", "d"]
StringArray([b, c, d])
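
A zero-copy slice can be modelled as a window, given by an offset and a length, onto a shared buffer. The ArrayView class below is a hypothetical pure-Python sketch of that idea, not Marrow's implementation. Raising IndexError for an out-of-range slice is also an assumption, since the text above does not say how Marrow handles that case.

```python
class ArrayView:
    """A window (offset, length) onto a shared, immutable buffer."""

    def __init__(self, buffer, offset=0, length=None):
        self.buffer = buffer
        self.offset = offset
        self.length = len(buffer) - offset if length is None else length

    def slice(self, offset, length):
        if offset < 0 or length < 0 or offset + length > self.length:
            raise IndexError("slice out of range")
        # Offsets compose, so a slice of a slice still points at the original buffer.
        return ArrayView(self.buffer, self.offset + offset, length)

    def to_list(self):
        return [self.buffer[self.offset + i] for i in range(self.length)]

arr = ArrayView((10, 20, 30, None, 50, 60))
part = arr.slice(2, 3)
print(part.to_list())               # [30, None, 50]
print(part.slice(1, 2).to_list())   # [None, 50]
print(part.buffer is arr.buffer)    # True
```

Because the underlying data is immutable, sharing it between the original and any number of views is safe.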

FixedSizeList arrays

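FixedSizeList arrays store sequences in which every element has the same length, which is useful for embedding vectors or coordinates. Note that the example below passes no explicit type, so ma.array infers a variable-length list<float64>, as its output shows. Nested Python lists alone do not produce a fixed-size list type.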

# 2D coordinates: each element has exactly 2 values
coords = ma.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(coords)
print("type:", coords.type())
ListArray([PrimitiveArray[float64]([1.0, 2.0]), PrimitiveArray[float64]([3.0, 4.0]), PrimitiveArray[float64]([5.0, 6.0])])
type: list<float64>
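
What separates a fixed-size list from a variable-length one is that no offsets are needed: with a known size, element i occupies positions i * size up to (i + 1) * size of a flat child list. The sketch below illustrates that arithmetic in plain Python. encode_fixed and get_fixed are hypothetical helpers and do not show how Marrow constructs a FixedSizeList.

```python
def encode_fixed(rows, size):
    """Flatten rows into one list, rejecting any row whose length is not `size`."""
    flat = []
    for row in rows:
        if len(row) != size:
            raise ValueError(f"expected {size} values, got {len(row)}")
        flat.extend(row)
    return flat

def get_fixed(flat, size, i):
    """Return element i by index arithmetic alone; no offsets list is consulted."""
    return flat[i * size:(i + 1) * size]

flat = encode_fixed([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], size=2)
print(flat)                    # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(get_fixed(flat, 2, 1))   # [3.0, 4.0]
```

A ragged input such as [[1.0, 2.0], [3.0]] is rejected in this model, which is the guarantee a fixed-size type adds over a plain list type.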