etha.comm.ir#

Intermediate Representation for tensor transfer operations.

Attributes#

logger

Classes#

`Bucket`	A transfer unit: one buffer + one wire op for its chunks.
`Chunk`	A shape-dependent transfer descriptor: a tensor region plus its role.
`Endpoint`	A chunk location: a rank and its cell (multi-index) in the transfer grid.
`M2MMap`	Mesh-to-mesh topology (shape-independent, reusable across batches).
`Route`	One source cell's delivery: `src` endpoint to a set of dst endpoints.

Module Contents#

class etha.comm.ir.Bucket#

A transfer unit: one buffer + one wire op for its chunks.

Identity (transport/role/ranks/key) is uniform across a bucket’s chunks (they share a bucket_key), so it is read from the first chunk. Byte offsets (the prefix sum of chunk.nbytes) are computed once at construction since a bucket is reused across transfers.
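The prefix-sum layout described above can be illustrated with a minimal sketch. It assumes the offsets are an exclusive prefix sum (chunk i starts at the sum of the sizes before it); the helper name `byte_offsets` and the sizes are hypothetical and not part of the module.

```python
from itertools import accumulate


def byte_offsets(chunk_nbytes: list[int]) -> list[int]:
    """Exclusive prefix sum: offsets[i] is where chunk i starts in the buffer."""
    # accumulate(..., initial=0) yields 0, s0, s0+s1, ..., total; drop the total.
    return list(accumulate(chunk_nbytes, initial=0))[:-1]


sizes = [1024, 256, 4096]      # hypothetical chunk.nbytes values
offsets = byte_offsets(sizes)  # start offset of each chunk in the bucket buffer
total_bytes = sum(sizes)       # size of the whole bucket buffer

print(offsets, total_bytes)
```

Because the chunk sizes do not change while a bucket is reused, computing the offsets once at construction avoids repeating this work on every transfer.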

finalize() → None#

Finalize communication and cleanup.

is_complete() → bool#

Check if communication is complete.

Returns: True if complete, False otherwise.

launch() → bool#

Issue the wire op once the assembled buffer is ready.

Returns False if the buffer-assembly event hasn’t fired yet; otherwise issues the transport (no-op for LOCAL/NONE) and returns True.
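Since launch() reports "not ready" by returning False rather than blocking, a caller can poll several buckets in turn. The sketch below shows one possible driver loop. The call order (prepare, launch, is_complete, finalize) is an assumption inferred from the method descriptions, and `FakeBucket` and `drive` are hypothetical stand-ins, not part of etha.

```python
class FakeBucket:
    """Hypothetical stand-in exposing the four Bucket methods; not the real class."""

    def __init__(self, polls_until_ready: int) -> None:
        self._polls_left = polls_until_ready  # simulates the buffer-ready event
        self.log: list[str] = []

    def prepare(self) -> None:
        self.log.append("prepare")

    def launch(self) -> bool:
        if self._polls_left > 0:  # buffer-assembly event has not fired yet
            self._polls_left -= 1
            return False
        self.log.append("launch")
        return True

    def is_complete(self) -> bool:
        return "launch" in self.log  # the stub completes as soon as it launches

    def finalize(self) -> None:
        self.log.append("finalize")


def drive(buckets: list[FakeBucket]) -> None:
    """Assumed lifecycle: prepare all, poll launch round-robin, then finalize."""
    for b in buckets:
        b.prepare()
    pending = list(buckets)
    while pending:  # retry only the buckets whose launch() returned False
        pending = [b for b in pending if not b.launch()]
    for b in buckets:
        while not b.is_complete():
            pass  # a real driver would yield or do other work here
        b.finalize()


a, b = FakeBucket(0), FakeBucket(2)
drive([a, b])
print(a.log, b.log)
```

Polling round-robin lets a bucket whose buffer is already assembled go on the wire without waiting for slower ones.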

prepare() → None#

Assemble the bucket buffer from its chunks.

Source-side Partial reduction and the dtype cast both live inside Chunk.prepare. A producing chunk’s data is copied into the bucket; its per-chunk buffer is kept only when the chunk is also a target (is_target, i.e. a self-copy), so that finalize can write it to the target. A consume-only recv chunk instead points its buffer at the bucket slice, so the received data lands there directly.
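The copy-in versus view-in distinction can be sketched with plain byte buffers. This is an analogy only: the real class works on torch tensors, and `assemble`, `recv_views`, and the payloads are hypothetical names chosen for illustration.

```python
from itertools import accumulate


def assemble(payloads: list[bytes]) -> bytearray:
    """Source side: copy each producing chunk's bytes into one bucket buffer."""
    offsets = list(accumulate((len(p) for p in payloads), initial=0))[:-1]
    buf = bytearray(sum(len(p) for p in payloads))
    view = memoryview(buf)
    for off, p in zip(offsets, payloads):
        view[off:off + len(p)] = p  # copy into the chunk's slot
    return buf


def recv_views(buf: bytearray, sizes: list[int]) -> list[memoryview]:
    """Receive side: each consume-only chunk gets a zero-copy view of its slot."""
    offsets = list(accumulate(sizes, initial=0))[:-1]
    view = memoryview(buf)
    return [view[off:off + n] for off, n in zip(offsets, sizes)]


payloads = [b"aaaa", b"bb", b"cccccc"]
send_buf = assemble(payloads)

recv_buf = bytearray(len(send_buf))
views = recv_views(recv_buf, [len(p) for p in payloads])

# Simulate the wire op filling the receive buffer in one shot.
memoryview(recv_buf)[:] = bytes(send_buf)

# The per-chunk views see the data without any further copy.
print([bytes(v) for v in views])
```

On the receive side nothing is copied per chunk: once the single wire op has filled the bucket buffer, every chunk's view already holds its region.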

buffer: torch.Tensor | None = None#

buffer_ready_event: torch.cuda.Event | None = None#

chunks: list[Chunk]#

device: torch.device | None = None#

property dst_ranks: tuple[int, ...]#

property is_source: bool#

property is_target: bool#

property key: tuple#

offsets: list[int]#

property src_rank: int#

property total_bytes: int#

property transport: etha.comm.transfer.Transport#

work: torch.distributed.Work | None = None#

class etha.comm.ir.Chunk#

A shape-dependent transfer descriptor: a tensor region plus its role.

Not a transfer unit — a Bucket runs the wire op; a chunk only describes one tensor region and how to prepare/finalize it.

finalize() → None#

Write the produced/received buffer back to the target tensor.

prepare(contiguous: bool = True) → None#

Prepare the buffer.

When is_source is set, prepare reads src_slice and then performs, in order: an in-place all-reduce over the Partial sub-groups (in the source dtype), then a cast to transfer_dtype. Reducing before the cast matches DTensor Partial → Replicate semantics; running the all-reduce in the (possibly lower-precision) wire dtype would change numerical results. A consume-only chunk (recv) instead views dst_slice so the wire op lands directly in the target.
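The effect of the ordering can be seen in a small numerical sketch that uses only the standard library. It emulates a half-precision wire dtype by rounding through the `struct` format `"e"` (IEEE 754 binary16). The partial values and the sequential accumulation order are illustrative assumptions, not what a real all-reduce does internally.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical per-rank partial values that should sum to 2050.
partials = [2048.0, 0.5, 0.5, 0.5, 0.5]

# Reduce in the source precision, then cast once to the wire dtype.
reduce_then_cast = to_half(sum(partials))

# Cast first, then accumulate in the wire dtype (rounding after every add).
acc = 0.0
for p in partials:
    acc = to_half(acc + to_half(p))
cast_then_reduce = acc

print(reduce_then_cast, cast_then_reduce)
```

Near 2048 the spacing between adjacent binary16 values is 2, so each 0.5 contribution is rounded away when the accumulation happens in half precision, while the full-precision sum keeps all four of them.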

property bucket_key: tuple#

Return bucket grouping key.

transport is in the key so a local (self-copy) chunk never bundles with a co-located broadcast source chunk: they share src_rank and dst_ranks but must run different ops and produce buffers of different sizes across the broadcast group.

cell_key is added for reduce-only Partial chunks only — they all share dst_ranks=() and would otherwise bundle across cells, making the bucket’s all_reduce sequence per-rank-specific and out of sync with peer ranks whose matching cells live in separate buckets. Shipping chunks already differ by dst_ranks across cells.
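The grouping rules above can be sketched as follows. The field order inside the key, the use of `src_idx` as the cell key, and the `Transport` and `ChunkStub` stand-ins are all assumptions made for illustration; the real key layout may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum, auto


class Transport(Enum):
    """Hypothetical stand-in for etha.comm.transfer.Transport."""
    NONE = auto()
    LOCAL = auto()
    P2P = auto()
    BROADCAST = auto()


@dataclass(frozen=True)
class ChunkStub:
    """Hypothetical stand-in carrying only the fields used for grouping."""
    transport: Transport
    src_rank: int
    dst_ranks: tuple[int, ...]
    src_idx: tuple[int, ...]

    @property
    def bucket_key(self) -> tuple:
        key = (self.transport, self.src_rank, self.dst_ranks)
        if self.transport is Transport.NONE:
            # Reduce-only chunks all have dst_ranks == (), so add the cell
            # (assumed here to be src_idx) to keep one bucket per cell.
            key += (self.src_idx,)
        return key


chunks = [
    ChunkStub(Transport.LOCAL, 0, (0, 1), (0,)),      # self-copy
    ChunkStub(Transport.BROADCAST, 0, (0, 1), (0,)),  # same ranks, different op
    ChunkStub(Transport.BROADCAST, 0, (0, 1), (0,)),  # e.g. a second tensor's region
    ChunkStub(Transport.NONE, 0, (), (0,)),           # reduce-only, cell (0,)
    ChunkStub(Transport.NONE, 0, (), (1,)),           # reduce-only, cell (1,)
]

buckets: dict[tuple, list[ChunkStub]] = defaultdict(list)
for c in chunks:
    buckets[c.bucket_key].append(c)

bucket_sizes = sorted(len(v) for v in buckets.values())
print(len(buckets), bucket_sizes)
```

In this sketch the local and broadcast chunks land in different buckets even though their ranks match, and the two reduce-only chunks stay apart because their cells differ.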

buffer: torch.Tensor | None = None#

chunk_shape: tuple[int, ...]#

dst_idx: tuple | None = None#

dst_ranks: tuple[int, ...]#

dst_slice: tuple[slice, ...] = ()#

is_source: bool#

is_target: bool#

property nbytes: int#

Byte size on the wire (transfer dtype).

source_partial_groups: list[tuple[torch.distributed.ProcessGroup, str]] | None = None#

src_idx: tuple#

src_rank: int#

src_slice: tuple[slice, ...] = ()#

tensor: torch.Tensor | None = None#

transfer_dtype: torch.dtype | None = None#

transport: etha.comm.transfer.Transport#

class etha.comm.ir.Endpoint#

Bases: msgspec.Struct

A chunk location: a rank and its cell (multi-index) in the transfer grid.

cell: tuple[int, ...]#

rank: int#

class etha.comm.ir.M2MMap#

Bases: msgspec.Struct

Mesh-to-mesh topology (shape-independent, reusable across batches).

routes is a flat list of per-cell delivery plans (see Route); source_num_slicers/target_num_slicers describe how each side partitions the tensor; source_partial_reductions lists (mesh_dim, reduce_op) per source Partial dim (empty when none).

routes: list[Route] | None = None#

source_num_slicers: list[int] | None = None#

source_partial_reductions: list[tuple[int, str]] = []#

target_num_slicers: list[int] | None = None#

class etha.comm.ir.Route#

Bases: msgspec.Struct

One source cell’s delivery: src endpoint to a set of dst endpoints.

kind is fixed at construction: empty dsts -> NONE (reduce-only), else BROADCAST (>1) or P2P (1). LOCAL is not a route kind; a dst that lands on the source rank is refined to a local copy when chunks are built.
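The kind rule stated above can be written as a small function. The `Transport` enum, the `Endpoint` tuple, and the `route_kind` helper are hypothetical stand-ins that mirror the names in this page, not the real msgspec structs.

```python
from enum import Enum, auto
from typing import NamedTuple


class Transport(Enum):
    """Hypothetical stand-in for etha.comm.transfer.Transport."""
    NONE = auto()
    LOCAL = auto()
    P2P = auto()
    BROADCAST = auto()


class Endpoint(NamedTuple):
    """Hypothetical stand-in: a rank and its cell in the transfer grid."""
    rank: int
    cell: tuple[int, ...]


def route_kind(dsts: tuple[Endpoint, ...]) -> Transport:
    """Derive the route kind from the number of destinations."""
    if not dsts:
        return Transport.NONE  # reduce-only: nothing is shipped
    return Transport.BROADCAST if len(dsts) > 1 else Transport.P2P


one = (Endpoint(1, (0,)),)
many = (Endpoint(1, (0,)), Endpoint(2, (1,)))
print(route_kind(()), route_kind(one), route_kind(many))
```

The function never returns LOCAL, matching the note that a destination on the source rank is only refined to a local copy later, when chunks are built.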

dsts: tuple[Endpoint, ...]#

kind: etha.comm.transfer.Transport#

src: Endpoint#

etha.comm.ir.logger#