Internals overview
Cheetah implements a two-level IR pipeline and a small solver for index algebra.
Data flow
- User code in `cheetah.api` builds a typed core IR inside a global `Context`.
- The function decorator wraps the Python function to capture argument/local names and feed statements into the IR (via operator overloading and context managers).

`core_to_codegen.render`:
- Gathers transitive dependencies of selected entry points
- Assigns export and internal names
- Analyses inlining and side-effects
- Lowers core IR → codegen IR and pretty-prints to CUDA/C++
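
The capture step in the data flow above (a decorator feeding statements into a context through operator overloading and context managers) can be illustrated with a minimal sketch. This is not Cheetah's code: `Context`, `Expr`, `assign`, `trace`, and the tuple-based statement format are hypothetical names and shapes chosen only to show the general technique.

```python
import inspect
from contextlib import contextmanager


class Context:
    """Hypothetical stand-in for a global context that collects statements."""

    def __init__(self):
        self.statements = []
        self._blocks = [self.statements]  # stack of open statement lists

    def emit(self, stmt):
        # Append to whichever block is currently open.
        self._blocks[-1].append(stmt)

    @contextmanager
    def block(self, header):
        # Statements emitted inside the `with` body land in a nested list.
        body = []
        self.emit((header, body))
        self._blocks.append(body)
        try:
            yield
        finally:
            self._blocks.pop()


CTX = Context()


class Expr:
    """Symbolic value: arithmetic builds expression text instead of computing."""

    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        return Expr(f"({self.text} + {as_expr(other).text})")

    def __mul__(self, other):
        return Expr(f"({self.text} * {as_expr(other).text})")


def as_expr(value):
    return value if isinstance(value, Expr) else Expr(repr(value))


def assign(name, value):
    # Record an assignment and return a symbol that refers to the new local.
    CTX.emit(("assign", name, as_expr(value).text))
    return Expr(name)


def trace(fn):
    """Hypothetical decorator body: run fn once on symbols named after its parameters."""
    names = list(inspect.signature(fn).parameters)
    fn(*(Expr(n) for n in names))
    return names


def kernel(a, b):
    t = assign("t", a * 2 + b)
    with CTX.block("if t"):
        assign("u", t + 1)


arg_names = trace(kernel)
```

Running the function once on symbolic arguments is what lets ordinary Python syntax describe a program: each overloaded operator returns a new symbol, and each `with` block redirects where emitted statements are stored.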
IR layers
- Core IR (`cheetah.internal.core_ir`): typed nodes, control blocks, validation
- Codegen IR (`cheetah.internal.codegen_ir`): target-oriented expressions and statements with precedence-aware printing
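
As an illustration of what precedence-aware printing means for the codegen layer, the sketch below prints a small expression tree and adds parentheses only where operator precedence or left-associativity requires them. The `Var` and `BinOp` classes and the `render` function are hypothetical and are not taken from `cheetah.internal.codegen_ir`.

```python
from dataclasses import dataclass

# Higher number binds tighter; atoms bind tightest of all.
PRECEDENCE = {"*": 2, "/": 2, "+": 1, "-": 1}
ATOM = 3


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


def prec(node):
    return PRECEDENCE[node.op] if isinstance(node, BinOp) else ATOM


def render(node):
    if isinstance(node, Var):
        return node.name
    p = prec(node)
    left = render(node.left)
    right = render(node.right)
    # A looser-binding left operand needs parentheses.
    if prec(node.left) < p:
        left = f"({left})"
    # Operators here are left-associative, so an equal-precedence
    # right operand must also be parenthesized: a - (b - c).
    if prec(node.right) <= p:
        right = f"({right})"
    return f"{left} {node.op} {right}"


a, b, c = Var("a"), Var("b"), Var("c")
print(render(BinOp("*", BinOp("+", a, b), c)))
print(render(BinOp("-", a, BinOp("-", b, c))))
```

Keeping precedence in a table means the printer never has to emit defensive parentheses around every subexpression, which keeps generated code readable.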
Index tools
`cheetah.index_tools` models dimension sizes, scoped equations, and runtime indices. A small algebraic solver ensures equations are well-formed and can be turned into linear index computations.
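
To make "well-formed equations" and "linear index computations" concrete, here is a minimal sketch under simplifying assumptions: dimensions are laid out row-major, and the only equation form is `total == unknown * product(known sizes)`. The helpers `strides`, `solve_size`, `linearize`, and `delinearize` are hypothetical and do not reflect the actual `cheetah.index_tools` API.

```python
from math import prod


def strides(sizes):
    """Row-major strides: the last dimension varies fastest."""
    out = []
    running = 1
    for size in reversed(sizes):
        out.append(running)
        running *= size
    return list(reversed(out))


def solve_size(total, known_sizes):
    """Solve total == unknown * prod(known_sizes); reject ill-formed equations."""
    known = prod(known_sizes)
    if known <= 0 or total % known != 0:
        raise ValueError(f"{total} is not a multiple of {known}")
    return total // known


def linearize(indices, sizes):
    """Combine per-dimension indices into a single linear offset."""
    for index, size in zip(indices, sizes):
        if not 0 <= index < size:
            raise IndexError(f"index {index} out of range for size {size}")
    return sum(i * s for i, s in zip(indices, strides(sizes)))


def delinearize(flat, sizes):
    """Recover per-dimension indices from a linear offset."""
    return [flat // stride % size for stride, size in zip(strides(sizes), sizes)]


sizes = [solve_size(24, [3, 4]), 3, 4]  # unknown leading dimension is solved first
flat = linearize([1, 2, 3], sizes)
print(sizes, flat, delinearize(flat, sizes))
```

The divisibility check plays the role of a well-formedness test: an equation such as `25 == n * 12` has no integer solution and is rejected before any index arithmetic is generated.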
Inline PTX
- Inline assembly supports positional and named operands with type-checked constraints, a volatile flag, and memory clobbers.
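
The sketch below shows one way such a feature could render a CUDA `asm` statement from named operands. It is an assumption-laden illustration, not Cheetah's implementation: `render_asm`, the `(name, c_type, c_expr)` operand tuples, and the `{name}` placeholder syntax are hypothetical. The constraint letters (`h`, `r`, `l`, `f`, `d`) are the ones CUDA documents for inline PTX.

```python
# CUDA inline PTX register constraints, keyed by C type.
CONSTRAINT_FOR_TYPE = {
    "unsigned short": "h",      # 16-bit register
    "int": "r",                 # 32-bit register
    "unsigned int": "r",
    "long long": "l",           # 64-bit register
    "unsigned long long": "l",
    "float": "f",
    "double": "d",
}


def render_asm(template, outputs, inputs, volatile=False, memory=False):
    """Render an asm statement; operands are (name, c_type, c_expr) tuples."""
    operands = outputs + inputs
    # Named placeholders map to positional %N slots, outputs first.
    numbering = {name: f"%{i}" for i, (name, _, _) in enumerate(operands)}
    body = template.format(**numbering)  # KeyError on an unknown operand name

    def fmt(items, prefix):
        parts = []
        for _, c_type, c_expr in items:
            if c_type not in CONSTRAINT_FOR_TYPE:
                raise TypeError(f"no PTX constraint for type {c_type!r}")
            parts.append(f'"{prefix}{CONSTRAINT_FOR_TYPE[c_type]}"({c_expr})')
        return ", ".join(parts)

    text = "asm volatile" if volatile else "asm"
    text += f'("{body}" : {fmt(outputs, "=")} : {fmt(inputs, "")}'
    if memory:
        text += ' : "memory"'
    return text + ");"


stmt = render_asm(
    "add.s32 {d}, {a}, {b};",
    outputs=[("d", "int", "result")],
    inputs=[("a", "int", "x"), ("b", "int", "y")],
    volatile=True,
    memory=True,
)
print(stmt)
```

Deriving the constraint letter from the operand's type, rather than letting the caller write it, is what makes the constraint type-checked: a type with no matching register class fails at render time instead of producing invalid PTX.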