API Reference#
Complete API documentation for Helion.
Kernel Creation and Control#
Everything you need to create and configure Helion kernels using the helion.kernel() decorator.
Language Constructs#
The helion.language module contains DSL constructs for authoring kernels.
Debugging and Utilities#
- Exceptions
  - Overview
  - Exception Hierarchy
  - Kernel Context Errors
  - Loop and Control Flow Errors
  - Tile and Indexing Errors
  - Assignment and Variable Errors
  - Type and Inference Errors
  - Configuration Errors
  - Tunable Parameter Errors
  - Language and Syntax Errors
  - Grid and Execution Errors
  - Compilation and Runtime Errors
  - Warning Classes
  - See Also
Advanced Topics#
Quick Reference#
Main Functions#
Language Functions#
- Break up an iteration space defined by a size or sequence of sizes into tiles.
- Iterate over individual indices of the given iteration space.
- Create a range that gets unrolled at compile time by iterating over constant integer values.
- Load a value from a tensor using a list of indices.
- Store a value to a tensor using a list of indices.
- Atomically add a value to a target tensor.
- Atomically apply bitwise AND with
- Atomically apply bitwise OR with
- Atomically apply bitwise XOR with
- Atomically exchange (set) a value at
- Atomically update
- Atomically update
- Atomically compare-and-swap a value at
- Print values from device code.
- Set global memory barriers.
- Wait for global memory barriers.
- Create a StackTensor from a tensor of data pointers (dev_ptrs) that point to similar tensors residing at different memory locations.
- Return a device-tensor filled with zeros.
- Create a device-tensor filled with a specified value.
- Same as torch.arange(), but defaults to the same device as the current kernel.
- Equivalent to tensor[index] where tensor is a kernel-tensor (not a host-tensor).
- Split the last dimension of a tensor with size two into two separate tensors.
- Join two tensors along a new minor dimension.
- Apply a reduction operation along a specified dimension or all dimensions.
- Apply an associative scan operation along a specified dimension.
- Compute the cumulative sum along a specified dimension.
- Compute the cumulative product along a specified dimension.
- Perform a matrix multiplication of tensors with support for multiple dtypes.
- Execute inline assembly over a tensor.
- Inline a raw Triton snippet inside a Helion kernel.
- Explicitly register a block size that should be autotuned and can be used for allocations and inside hl.tile(..., block_size=...).
- Register a tunable parameter for autotuning.
- alias of
- Turn dynamic shapes into compile-time constants.
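The reduction, associative scan, cumulative sum, and cumulative product entries above are closely related. The following is a minimal plain-Python sketch of that relationship, not Helion code: the helper names `associative_scan` and `reduce_all` are hypothetical, and the scan is assumed to be inclusive, as in torch.cumsum.

```python
from itertools import accumulate
import operator


def associative_scan(combine, values):
    """Inclusive scan: element i combines values[0] through values[i]."""
    return list(accumulate(values, combine))


def reduce_all(combine, values):
    """A reduction yields the same result as the last element of the scan."""
    return associative_scan(combine, values)[-1]


data = [1, 2, 3, 4]

# A cumulative sum is an associative scan with addition as the combine step.
cumsum = associative_scan(operator.add, data)

# A cumulative product is the same scan with multiplication.
cumprod = associative_scan(operator.mul, data)

# Reducing with addition collapses the whole sequence to one value.
total = reduce_all(operator.add, data)

print(cumsum, cumprod, total)
```

Because the combine function only needs to be associative, the same pattern covers other operators such as `max` or `min`; associativity is also what allows a device implementation to evaluate the scan in parallel rather than left to right.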
Language Classes#
- This class should not be instantiated directly; it is the result of hl.tile(...) and represents a single tile of the iteration space.
- This class should not be instantiated directly.
Tile Helpers#
- Retrieve the index (a 1D tensor containing offsets) of the given tile.
- Retrieve the start offset of the given tile.
- Retrieve the end offset of the given tile.
- Retrieve the block size of a given tile, usually set by the autotuner.
- Retrieve the tile_id of a given tile or list of tiles.
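To make the quantities behind the tile helpers concrete, here is a small plain-Python model of tiling a one-dimensional iteration space. It is only a sketch and does not use Helion: `TileModel` and `tile_1d` are hypothetical names, the block size is passed in by hand rather than chosen by the autotuner, and the final partial tile is assumed to be clamped to the size of the iteration space (a real kernel may instead keep full-width tiles and mask out-of-range offsets).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileModel:
    """Hypothetical stand-in for one tile; not Helion's Tile class."""

    begin: int       # start offset of the tile (inclusive)
    end: int         # end offset of the tile (exclusive), clamped to the size
    block_size: int  # nominal tile width used for the loop
    tile_id: int     # position of the tile in iteration order

    @property
    def index(self):
        """Offsets covered by this tile, modeled as a plain list."""
        return list(range(self.begin, self.end))


def tile_1d(size, block_size):
    """Break the range [0, size) into consecutive tiles of block_size."""
    for tile_id, begin in enumerate(range(0, size, block_size)):
        end = min(begin + block_size, size)
        yield TileModel(begin, end, block_size, tile_id)


tiles = list(tile_1d(size=10, block_size=4))
for t in tiles:
    print(t.tile_id, t.begin, t.end, t.index)
```

With a size of 10 and a block size of 4, the last tile covers only two offsets, which is the case where the begin, end, and block size of a tile stop being interchangeable.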