UOp Graph Traversal and Rewriting¶
tinygrad provides multiple layers of APIs for parsing and modifying UOp graphs, ranging from simple manual iteration to powerful pattern-matching-based rewrites. This document catalogs all available methods.
Table of Contents¶
- Read-Only Graph Traversal
UOp.toposort(gate=None)UOp.topovisit(visitor, cache)UOp.backward_sliceUOp.backward_slice_with_selfUOp.op_in_backward_slice_with_self(*ops)UOp.get_consumer_map()UOp._ranges,UOp.ranges- Single Node Operations
UOp.replace(**kwargs)UOp.f(op, **kwargs)UOp.detach()UOp.cast(dtype)UOp.bitcast(dtype)- Pattern Matching
UPat- The Pattern LanguagePatternMatcher.rewrite(uop, ctx)PatternMatcher.__add__- Full Graph Rewriting
graph_rewrite(sink, pm, ctx, bottom_up, bpm)RewriteContext.unified_rewrite(root)- Specialized Rewrites
UOp.substitute(dvars, name, extra_pm)UOp.simplify(tracked)UOp.ssimplify()UOp.sintify()UOp.render(simplify, pm)- Specialized Traversal Functions
_deepwalk(root, targets)full_rewrite_to_sink(sink, ren, optimize)- Context and Tracking
TrackedPatternMatchertrack_rewritesdecorator- Utility Functions
consumer_map_from_toposort(lst)all_metadata
1. Read-Only Graph Traversal¶
UOp.toposort(gate=None) -> dict[UOp, None]¶
Location: tinygrad/uop/ops.py:167
Returns all nodes reachable from self in topological order (children before parents). Uses an iterative stack-based algorithm to avoid recursion depth issues.
# Iterate all nodes reachable from sink
for node in sink.toposort():
if node.op == Ops.CONST:
print(f"Found constant: {node.arg}")
# With gating - control which branches to descend into
for node in sink.toposort(lambda n: n.op != Ops.DETACH):
# Skip DETACH ops and their descendants
...
Parameters:
- gate: Optional callback Callable[[UOp], bool]. If provided, only descends into nodes where gate(node) returns True.
Returns: dict[UOp, None] - ordered dict (insertion order = topological order)
Use when: - You need to iterate over all nodes in a graph for analysis - Building maps/dictionaries based on graph structure - Simple transformations where you manually manage replacement
UOp.topovisit(visitor, cache) -> T¶
Location: tinygrad/uop/ops.py:180
Similar to toposort but applies a visitor function to each node. Returns the result for the root node. Useful when you need to compute a value for each node.
# Count all UOps of a certain type
def count_ops(uop):
return 1 + sum(count_ops(child) for child in uop.src)
# More efficient with topovisit
cache = {}
def count_visitor(node):
return 1 + sum(cache.get(child, 0) for child in node.src)
root.topovisit(count_visitor, cache)
Parameters:
- visitor: Callable[[UOp], T] - applied to each node in topological order
- cache: dict[UOp, T] - must be provided, used to store visitor results
Returns: The visitor result for the root node (T)
Use when: - Computing aggregated values over the graph - When you need to avoid creating intermediate structures
UOp.backward_slice¶
Location: tinygrad/uop/ops.py:156
Returns all nodes that are ancestors of the current node (all nodes that can reach this node via src pointers). This is the "backward" direction in the dataflow graph.
# Get all nodes that feed into this computation
ancestors = uop.backward_slice
# Returns dict[UOp, None] - self is NOT included
Returns: dict[UOp, None] - all nodes that are parents (in the dataflow sense)
Use when: - Finding all inputs that affect a computation - Dependency analysis
UOp.backward_slice_with_self¶
Location: tinygrad/uop/ops.py:162
Same as backward_slice but includes self in the result.
# Get all nodes that feed into this computation, including itself
ancestors_with_self = uop.backward_slice_with_self
UOp.op_in_backward_slice_with_self(*ops) -> bool¶
Location: tinygrad/uop/ops.py:163
Efficiently checks if any node in the backward slice (including self) has one of the specified ops. Avoids creating intermediate dict.
# Check if any ancestor (or self) is a CONST
if uop.op_in_backward_slice_with_self(Ops.CONST, Ops.VCONST):
...
UOp.get_consumer_map() -> dict[UOp, dict[UOp, None]]¶
Location: tinygrad/uop/ops.py:193
Returns a map from each UOp to all its consumers (nodes that have it as a source). This is the reverse of the dataflow direction.
consumer_map = uop.get_consumer_map()
for node, consumers in consumer_map.items():
print(f"Node {node} is used by {len(consumers)} consumers")
Note: Internally uses toposort() and then builds the reverse map.
UOp._ranges, UOp.ranges¶
Location: tinygrad/uop/ops.py:319, 333
Computes the "ranges" of a UOp - the iteration space if the UOp represents a loop. These are cached properties.
# Get iteration ranges for a UOp
ranges = uop.ranges
# Returns dict[UOp, None] mapping range UOps to None
2. Single Node Operations¶
These methods create new UOps with modified properties. They don't traverse the graph - they're building blocks for manual graph modification.
UOp.replace(**kwargs) -> UOp¶
Location: tinygrad/uop/ops.py:137
Creates a new UOp with modified properties. The most common way to create modified nodes.
# Change the dtype
new_uop = uop.replace(dtype=dtypes.float32)
# Change sources
new_uop = uop.replace(src=(new_src1, new_src2))
# Change arg
new_uop = uop.replace(arg=new_arg)
# Remove tag
new_uop = uop.replace(tag=None)
Key property: If no kwargs differ from the original, returns the same object (self). This enables structural sharing and efficient comparison.
UOp.f(op, **kwargs) -> UOp¶
Location: tinygrad/uop/ops.py:153
Shorthand for creating a new UOp with self as the only source.
UOp.detach() -> UOp¶
Location: tinygrad/uop/ops.py:381
Wraps the UOp in a DETACH op, which prevents gradient flow in backward passes.
UOp.cast(dtype) -> UOp¶
Location: tinygrad/uop/ops.py:403
Creates a CAST op to change the dtype.
UOp.bitcast(dtype) -> UOp¶
Location: tinygrad/uop/ops.py:404
Creates a BITCAST op (no data conversion, just reinterpretation).
3. Pattern Matching¶
UPat - The Pattern Language¶
Location: tinygrad/uop/ops.py:895
UPat is a pattern that matches UOps. It's the core of tinygrad's pattern matching system.
# Match specific ops
UPat(Ops.ADD) # Match ADD op
UPat((Ops.ADD, Ops.SUB)) # Match ADD or SUB
# Match by dtype
UPat(dtype=dtypes.float32) # Match float32 dtype
UPat(dtype=(dtypes.float32, dtypes.float64)) # Match multiple dtypes
# Match by arg
UPat(arg=0) # Match arg equals 0
UPat(arg=None) # Match any arg
# Match sources (children)
UPat(src=(UPat.var("x"), UPat.cvar("c"))) # Two sources: any x and a const
UPat(src=(UPat(), UPat())) # Exactly two sources
# Named captures - available in callback
UPat(Ops.ADD, name="add") # Captures matched UOp as "add"
# Variable matching
UPat.var("x") # Matches anything, captures as "x"
UPat.cvar("c") # Matches CONST or VCONST, captures as "c"
# Commutative matching (list = try all permutations)
UPat(Ops.ADD, src=[UPat.var("x"), UPat.var("y")]) # x+y matches both x+y and y+x
# Repeating pattern (matches any number)
UPat(Ops.VECTORIZE, src=UPat(Ops.CONST)) # Matches VECTORIZE with any number of CONST sources
# Custom early reject - optimization
UPat(Ops.ADD, custom_early_reject={Ops.CONST}) # Skip if no CONST in sources
Key methods:
- UPat.var(name) - Match anything, capture as name
- UPat.cvar(name) - Match CONST or VCONST
- UPat.const(dtype, b) - Match specific constant value
- UPat.any(*pats) - Match if any sub-pattern matches
- UPat.named(name) - Add a name to existing pattern
- UPat.or_casted(name) - Match self or CAST of self
- UPat.or_after(name) - Match self or AFTER of self
PatternMatcher.rewrite(uop, ctx) -> UOp | None¶
Location: tinygrad/uop/ops.py:1038
Rewrites a single UOp using the pattern matcher. This is the core pattern matching function.
pm = PatternMatcher([
# (pattern, callback)
(UPat(Ops.ADD, src=(UPat.var("x"), UPat.cvar("c", arg=0)), lambda x: x), # x + 0 -> x
(UPat(Ops.MUL, src=(UPat.var("x"), UPat.cvar("c", arg=1)), lambda x: x), # x * 1 -> x
])
result = pm.rewrite(some_uop) # Returns rewritten node or None
How it works:
1. Look up patterns by op in pdict
2. For each pattern, check early_reject (optimization)
3. Run the pattern match
4. If match succeeds, call the callback with captured variables
5. Return first non-None result, or None if no match
Parameters:
- uop: The UOp to rewrite
- ctx: Optional context passed to callbacks
Returns: Rewritten UOp if pattern matched, None otherwise
Note: Does NOT recursively rewrite - only matches the single node.
PatternMatcher.__add__¶
Location: tinygrad/uop/ops.py:1035
Combines two PatternMatchers.
pm1 = PatternMatcher([...])
pm2 = PatternMatcher([...])
combined = pm1 + pm2 # Returns new PatternMatcher with all rules
4. Full Graph Rewriting¶
graph_rewrite(sink, pm, ctx, bottom_up, bpm) -> UOp¶
Location: tinygrad/uop/ops.py:1266
Applies a PatternMatcher across an entire UOp graph, handling topological ordering, source reconstruction, and fixed-point iteration.
new_sink = graph_rewrite(sink, pm)
new_sink = graph_rewrite(sink, pm, ctx=some_context)
new_sink = graph_rewrite(sink, pm, bottom_up=True) # Bottom-up mode
new_sink = graph_rewrite(sink, pm, bpm=bpm) # With bottom-up pre-pass
Parameters:
| Parameter | Type | Description |
|---|---|---|
sink |
UOp |
Root UOp of the graph to rewrite |
pm |
PatternMatcher |
Main pattern matcher (top-down) |
ctx |
Any | Optional context passed to all callbacks |
bottom_up |
bool |
If True, pm is applied bottom-up instead of top-down |
bpm |
PatternMatcher |
Optional bottom-up pre-pass (only used when bottom_up=False) |
name |
str |
Optional name for profiling/visualization |
Bottom-up mode (bottom_up=True):
- Applies pm bottom-up (children before parents)
- Uses fixed-point iteration - keeps applying until no more matches
With bpm (bottom-up pre-pass):
- First applies bpm bottom-up with fixed-point
- Then applies pm top-down
- This is useful when you want both bottom-up simplification and top-down lowering
RewriteContext.unified_rewrite(root)¶
Location: tinygrad/uop/ops.py:1205
The internal implementation of graph_rewrite. Uses a 3-stage stack-based traversal:
Stage 0 (descend):
- If bpm is set, apply it in a fixed-point loop
- Push all sources onto the stack
Stage 1 (rebuild):
- Once all sources are rewritten, rebuild the node
- If top-down, apply pm.rewrite()
- If rewrite produces new node, push for full re-processing
Stage 2 (link): - After rewritten node finishes, link original to final result
Key properties:
- Stack-based: No recursion, prevents stack overflow
- Fixed-point for bpm: Bottom-up rules applied repeatedly
- Single-pass for pm: Top-down rules fire once per node
- Deduplication: Each original node processed at most once
graph_rewrite_map¶
Current tinygrad no longer exposes graph_rewrite_map(...).
If you need the old behavior, the usual replacement is:
new_sink = graph_rewrite(sink, pm, name="my rewrite")
# compare roots if you only care about top-level outputs
for old_uop, new_uop in zip(sink.src, new_sink.src):
if old_uop is not new_uop:
print("top-level output changed")
# or apply a known explicit remap with substitute
updated = sink.substitute({old_uop: new_uop}, name="apply remap")
In practice most call sites now either:
- call
graph_rewrite(...)and compare the rewritten root(s), or - keep an explicit old→new map produced by some other pass such as
transform_to_call(...), then apply it withUOp.substitute(...).
5. Specialized Rewrites¶
UOp.substitute(dvars, name, extra_pm)¶
Location: tinygrad/uop/ops.py:356
Substitutes variables in the graph - replaces specified UOps with others throughout the graph.
# Simple substitution
new_uop = uop.substitute({old_uop: new_uop})
# With extra pattern matcher
new_uop = uop.substitute({a: b}, extra_pm=some_pm)
# With name for profiling
new_uop = uop.substitute({a: b}, name="my_substitution")
How it works:
- Internally uses graph_rewrite with a special _substitute PatternMatcher
- Applied bottom-up (bottom_up=True)
- The extra_pm allows additional rewrites during substitution
Use cases: - Variable elimination - Common subexpression elimination - Gradient computation (substituting gradients for parameters)
UOp.simplify(tracked=False)¶
Location: tinygrad/uop/ops.py:339
Simplifies the UOp using symbolic rewrites.
How it works:
- Applies the full symbolic PatternMatcher via graph_rewrite
- Performs algebraic simplification, constant folding, etc.
Parameters:
- tracked: If True, tracks rewrite statistics
UOp.ssimplify() -> UOp|ConstType¶
Location: tinygrad/uop/ops.py:345
Like simplify() but returns the constant value directly if the result is a CONST.
UOp.sintify() -> sint¶
Location: tinygrad/uop/ops.py:346
Converts to symbolic integer - returns the arg if CONST, otherwise returns self.
UOp.render(simplify=True, pm=None) -> str¶
Location: tinygrad/uop/ops.py:785
Renders the UOp to a string representation.
# Simple render (with simplification)
print(uop.render())
# Without simplification
print(uop.render(simplify=False))
# With custom pattern matcher
print(uop.render(pm=custom_renderer))
How it works: - Toposorts the graph - Applies the renderer PatternMatcher to each node in order - Accumulates string representations in a context dict
6. Specialized Traversal Functions¶
_deepwalk(root, targets)¶
Location: tinygrad/gradient.py
A specialized function that computes the "target path" - nodes on the path from root to specified target nodes.
# Find all nodes between root and targets
path = _deepwalk(root, {target1, target2})
# Returns list[UOp] in topological order
How it works:
1. First computes in_target_path by traversing backward from targets
2. Then toposorts with a gate that:
- Skips DETACH and ASSIGN ops
- Only includes nodes in the target path
Use when:
- Computing gradients (this is used in gradient.py)
- Finding specific subgraphs between two points
full_rewrite_to_sink(sink, ren=None, optimize=True)¶
Location: tinygrad/codegen/__init__.py:26
The main entry point for full graph optimization in tinygrad's codegen pipeline. Applies multiple optimization passes.
What it does (in order):
1. pm_mops + pm_syntactic_sugar - early movement ops (bottom-up)
2. pm_load_collapse - collapse loads
3. pm_split_ranges + pm_flatten_range - split ranges
4. sym + pm_flatten_range - symbolic simplification
5. Many more optimization passes...
This is what gets called when you create a schedule or execute a tensor operation.
7. Context and Tracking¶
TrackedPatternMatcher¶
Location: tinygrad/uop/ops.py:1125
A subclass of PatternMatcher that tracks match statistics.
# Automatically used when TRACK_MATCH_STATS >= 2
pm = TrackedPatternMatcher([...])
result = pm.rewrite(uop)
# Tracks: number of matches, time spent, etc.
track_rewrites decorator¶
Location: tinygrad/uop/ops.py:1091
Decorator to track rewrite operations for visualization/profiling.
@track_rewrites()
def my_rewrite_function(uop):
return graph_rewrite(uop, pm)
# Records all rewrites for analysis
8. Utility Functions¶
consumer_map_from_toposort(lst)¶
Location: tinygrad/uop/ops.py:67
Builds a consumer map from a list of UOps in topological order.
consumer_map = consumer_map_from_toposort(sink.toposort())
# Returns: dict[UOp, dict[UOp, None]] - node -> set of consumers
all_metadata¶
Location: tinygrad/uop/ops.py:105
A WeakKeyDictionary that stores metadata for UOps.
# Store metadata
all_metadata[uop] = (Metadata(...),)
# Retrieve metadata
meta = all_metadata.get(uop)
Used for tracking additional information about UOps (line numbers, source locations, etc.)
Summary Table¶
| Method | Traversal | Pattern Matching | Fixed-point | Graph Rebuild | Use Case |
|---|---|---|---|---|---|
toposort/topovisit |
Manual iteration | No (manual if) |
No | Manual | Analysis, simple maps |
backward_slice |
Backward | No | No | No | Dependency analysis |
PatternMatcher.rewrite |
None (single node) | Yes (UPat) |
No | No | Building block, spot checks |
graph_rewrite |
Full graph | Yes (via PM) | Yes (for bpm) |
Automatic | Optimization, lowering |
substitute |
Full graph | Yes (+ extra_pm) | Yes (bottom-up) | Automatic | Variable substitution |
simplify |
Full graph | Yes (symbolic) | Yes | Automatic | Algebraic simplification |
render |
Full graph | Yes (renderer PM) | No | Automatic | String representation |
Additional Notes¶
Graph Specification¶
Most of these methods assume:
- UOp graph: A DAG where uop.src are the inputs (children in dataflow)
- Rooted at sink: There's a single "sink" node that all others feed into
- Immutable nodes: UOps are immutable; modifications create new nodes
Performance Considerations¶
- toposort: O(V+E) where V=vertices, E=edges
- PatternMatcher: Uses
pdictfor O(1) op lookup,early_rejectfor optimization - graph_rewrite: Stack-based to avoid recursion; limits stack size to prevent infinite loops
- Caching: Many methods use
cached_propertyor explicit caches
Common Patterns¶
# Manual bottom-up transformation
replace_map = {}
for u in sink.toposort()[::-1]: # Reverse = bottom-up
new_src = tuple(replace_map.get(s, s) for s in u.src)
replace_map[u] = transform(u, new_src)
result = replace_map[sink]
# Using graph_rewrite for the same
result = graph_rewrite(sink, PatternMatcher([(UPat(), transform_with_pm)]))