UOp Graph Traversal and Rewriting¶

tinygrad provides multiple layers of APIs for parsing and modifying UOp graphs, ranging from simple manual iteration to powerful pattern-matching-based rewrites. This document catalogs all available methods.

Table of Contents¶

Read-Only Graph Traversal
UOp.toposort(gate=None)
UOp.topovisit(visitor, cache)
UOp.backward_slice
UOp.backward_slice_with_self
UOp.op_in_backward_slice_with_self(*ops)
UOp.get_consumer_map()
UOp._ranges, UOp.ranges
Single Node Operations
UOp.replace(**kwargs)
UOp.f(op, **kwargs)
UOp.detach()
UOp.cast(dtype)
UOp.bitcast(dtype)
Pattern Matching
UPat - The Pattern Language
PatternMatcher.rewrite(uop, ctx)
PatternMatcher.__add__
Full Graph Rewriting
graph_rewrite(sink, pm, ctx, bottom_up, bpm)
RewriteContext.unified_rewrite(root)
Specialized Rewrites
UOp.substitute(dvars, name, extra_pm)
UOp.simplify(tracked)
UOp.ssimplify()
UOp.sintify()
UOp.render(simplify, pm)
Specialized Traversal Functions
_deepwalk(root, targets)
full_rewrite_to_sink(sink, ren, optimize)
Context and Tracking
TrackedPatternMatcher
track_rewrites decorator
Utility Functions
consumer_map_from_toposort(lst)
all_metadata

1. Read-Only Graph Traversal¶

`UOp.toposort(gate=None) -> dict[UOp, None]`¶

Location: tinygrad/uop/ops.py:167

Returns all nodes reachable from self in topological order (children before parents). Uses an iterative stack-based algorithm to avoid recursion depth issues.

# Iterate all nodes reachable from sink
for node in sink.toposort():
    if node.op == Ops.CONST:
        print(f"Found constant: {node.arg}")

# With gating - control which branches to descend into
for node in sink.toposort(lambda n: n.op != Ops.DETACH):
    # Skip DETACH ops and their descendants
    ...

Parameters: - gate: Optional callback Callable[[UOp], bool]. If provided, only descends into nodes where gate(node) returns True.

Returns: dict[UOp, None] - ordered dict (insertion order = topological order)

Use when: - You need to iterate over all nodes in a graph for analysis - Building maps/dictionaries based on graph structure - Simple transformations where you manually manage replacement

`UOp.topovisit(visitor, cache) -> T`¶

Location: tinygrad/uop/ops.py:180

Similar to toposort but applies a visitor function to each node. Returns the result for the root node. Useful when you need to compute a value for each node.

# Count all UOps of a certain type
def count_ops(uop):
    return 1 + sum(count_ops(child) for child in uop.src)

# More efficient with topovisit
cache = {}
def count_visitor(node):
    return 1 + sum(cache.get(child, 0) for child in node.src)

root.topovisit(count_visitor, cache)

Parameters: - visitor: Callable[[UOp], T] - applied to each node in topological order - cache: dict[UOp, T] - must be provided, used to store visitor results

Returns: The visitor result for the root node (T)

Use when: - Computing aggregated values over the graph - When you need to avoid creating intermediate structures

`UOp.backward_slice`¶

Location: tinygrad/uop/ops.py:156

Returns all nodes that are ancestors of the current node (all nodes that can reach this node via src pointers). This is the "backward" direction in the dataflow graph.

# Get all nodes that feed into this computation
ancestors = uop.backward_slice
# Returns dict[UOp, None] - self is NOT included

Returns: dict[UOp, None] - all nodes that are parents (in the dataflow sense)

Use when: - Finding all inputs that affect a computation - Dependency analysis

`UOp.backward_slice_with_self`¶

Location: tinygrad/uop/ops.py:162

Same as backward_slice but includes self in the result.

# Get all nodes that feed into this computation, including itself
ancestors_with_self = uop.backward_slice_with_self

`UOp.op_in_backward_slice_with_self(*ops) -> bool`¶

Location: tinygrad/uop/ops.py:163

Efficiently checks if any node in the backward slice (including self) has one of the specified ops. Avoids creating intermediate dict.

# Check if any ancestor (or self) is a CONST
if uop.op_in_backward_slice_with_self(Ops.CONST, Ops.VCONST):
    ...

`UOp.get_consumer_map() -> dict[UOp, dict[UOp, None]]`¶

Location: tinygrad/uop/ops.py:193

Returns a map from each UOp to all its consumers (nodes that have it as a source). This is the reverse of the dataflow direction.

consumer_map = uop.get_consumer_map()
for node, consumers in consumer_map.items():
    print(f"Node {node} is used by {len(consumers)} consumers")

Note: Internally uses toposort() and then builds the reverse map.

`UOp._ranges`, `UOp.ranges`¶

Location: tinygrad/uop/ops.py:319, 333

Computes the "ranges" of a UOp - the iteration space if the UOp represents a loop. These are cached properties.

# Get iteration ranges for a UOp
ranges = uop.ranges
# Returns dict[UOp, None] mapping range UOps to None

2. Single Node Operations¶

These methods create new UOps with modified properties. They don't traverse the graph - they're building blocks for manual graph modification.

`UOp.replace(**kwargs) -> UOp`¶

Location: tinygrad/uop/ops.py:137

Creates a new UOp with modified properties. The most common way to create modified nodes.

# Change the dtype
new_uop = uop.replace(dtype=dtypes.float32)

# Change sources
new_uop = uop.replace(src=(new_src1, new_src2))

# Change arg
new_uop = uop.replace(arg=new_arg)

# Remove tag
new_uop = uop.replace(tag=None)

Key property: If no kwargs differ from the original, returns the same object (self). This enables structural sharing and efficient comparison.

`UOp.f(op, **kwargs) -> UOp`¶

Location: tinygrad/uop/ops.py:153

Shorthand for creating a new UOp with self as the only source.

# These are equivalent:
uop.f(Ops.NEG) == UOp(Ops.NEG, uop.dtype, (uop,))

`UOp.detach() -> UOp`¶

Location: tinygrad/uop/ops.py:381

Wraps the UOp in a DETACH op, which prevents gradient flow in backward passes.

detached = uop.detach()
# Creates: UOp(Ops.DETACH, uop.dtype, (uop,))

`UOp.cast(dtype) -> UOp`¶

Location: tinygrad/uop/ops.py:403

Creates a CAST op to change the dtype.

casted = uop.cast(dtypes.float32)
# Creates: UOp(Ops.CAST, dtypes.float32, (uop,))

`UOp.bitcast(dtype) -> UOp`¶

Location: tinygrad/uop/ops.py:404

Creates a BITCAST op (no data conversion, just reinterpretation).

bitcast = uop.bitcast(dtypes.int32)
# Creates: UOp(Ops.BITCAST, dtypes.int32, (uop,))

3. Pattern Matching¶

`UPat` - The Pattern Language¶

Location: tinygrad/uop/ops.py:895

UPat is a pattern that matches UOps. It's the core of tinygrad's pattern matching system.

# Match specific ops
UPat(Ops.ADD)                    # Match ADD op
UPat((Ops.ADD, Ops.SUB))         # Match ADD or SUB

# Match by dtype
UPat(dtype=dtypes.float32)       # Match float32 dtype
UPat(dtype=(dtypes.float32, dtypes.float64))  # Match multiple dtypes

# Match by arg
UPat(arg=0)                      # Match arg equals 0
UPat(arg=None)                   # Match any arg

# Match sources (children)
UPat(src=(UPat.var("x"), UPat.cvar("c")))  # Two sources: any x and a const
UPat(src=(UPat(), UPat()))                 # Exactly two sources

# Named captures - available in callback
UPat(Ops.ADD, name="add")         # Captures matched UOp as "add"

# Variable matching
UPat.var("x")                    # Matches anything, captures as "x"
UPat.cvar("c")                   # Matches CONST or VCONST, captures as "c"

# Commutative matching (list = try all permutations)
UPat(Ops.ADD, src=[UPat.var("x"), UPat.var("y")])  # x+y matches both x+y and y+x

# Repeating pattern (matches any number)
UPat(Ops.VECTORIZE, src=UPat(Ops.CONST))  # Matches VECTORIZE with any number of CONST sources

# Custom early reject - optimization
UPat(Ops.ADD, custom_early_reject={Ops.CONST})  # Skip if no CONST in sources

Key methods: - UPat.var(name) - Match anything, capture as name - UPat.cvar(name) - Match CONST or VCONST - UPat.const(dtype, b) - Match specific constant value - UPat.any(*pats) - Match if any sub-pattern matches - UPat.named(name) - Add a name to existing pattern - UPat.or_casted(name) - Match self or CAST of self - UPat.or_after(name) - Match self or AFTER of self

`PatternMatcher.rewrite(uop, ctx) -> UOp | None`¶

Location: tinygrad/uop/ops.py:1038

Rewrites a single UOp using the pattern matcher. This is the core pattern matching function.

pm = PatternMatcher([
    # (pattern, callback)
    (UPat(Ops.ADD, src=(UPat.var("x"), UPat.cvar("c", arg=0)), lambda x: x),  # x + 0 -> x
    (UPat(Ops.MUL, src=(UPat.var("x"), UPat.cvar("c", arg=1)), lambda x: x),  # x * 1 -> x
])

result = pm.rewrite(some_uop)  # Returns rewritten node or None

How it works: 1. Look up patterns by op in pdict 2. For each pattern, check early_reject (optimization) 3. Run the pattern match 4. If match succeeds, call the callback with captured variables 5. Return first non-None result, or None if no match

Parameters: - uop: The UOp to rewrite - ctx: Optional context passed to callbacks

Returns: Rewritten UOp if pattern matched, None otherwise

Note: Does NOT recursively rewrite - only matches the single node.

`PatternMatcher.add`¶

Location: tinygrad/uop/ops.py:1035

Combines two PatternMatchers.

pm1 = PatternMatcher([...])
pm2 = PatternMatcher([...])
combined = pm1 + pm2  # Returns new PatternMatcher with all rules

4. Full Graph Rewriting¶

`graph_rewrite(sink, pm, ctx, bottom_up, bpm) -> UOp`¶

Location: tinygrad/uop/ops.py:1266

Applies a PatternMatcher across an entire UOp graph, handling topological ordering, source reconstruction, and fixed-point iteration.

new_sink = graph_rewrite(sink, pm)
new_sink = graph_rewrite(sink, pm, ctx=some_context)
new_sink = graph_rewrite(sink, pm, bottom_up=True)  # Bottom-up mode
new_sink = graph_rewrite(sink, pm, bpm=bpm)        # With bottom-up pre-pass

Parameters:

Parameter	Type	Description
`sink`	`UOp`	Root UOp of the graph to rewrite
`pm`	`PatternMatcher`	Main pattern matcher (top-down)
`ctx`	Any	Optional context passed to all callbacks
`bottom_up`	`bool`	If True, `pm` is applied bottom-up instead of top-down
`bpm`	`PatternMatcher`	Optional bottom-up pre-pass (only used when `bottom_up=False`)
`name`	`str`	Optional name for profiling/visualization

Bottom-up mode (bottom_up=True): - Applies pm bottom-up (children before parents) - Uses fixed-point iteration - keeps applying until no more matches

With bpm (bottom-up pre-pass): - First applies bpm bottom-up with fixed-point - Then applies pm top-down - This is useful when you want both bottom-up simplification and top-down lowering

`RewriteContext.unified_rewrite(root)`¶

Location: tinygrad/uop/ops.py:1205

The internal implementation of graph_rewrite. Uses a 3-stage stack-based traversal:

Stage 0 (descend): - If bpm is set, apply it in a fixed-point loop - Push all sources onto the stack

Stage 1 (rebuild): - Once all sources are rewritten, rebuild the node - If top-down, apply pm.rewrite() - If rewrite produces new node, push for full re-processing

Stage 2 (link): - After rewritten node finishes, link original to final result

Key properties: - Stack-based: No recursion, prevents stack overflow - Fixed-point for bpm: Bottom-up rules applied repeatedly - Single-pass for pm: Top-down rules fire once per node - Deduplication: Each original node processed at most once

`graph_rewrite_map`¶

Current tinygrad no longer exposes graph_rewrite_map(...).

If you need the old behavior, the usual replacement is:

new_sink = graph_rewrite(sink, pm, name="my rewrite")

# compare roots if you only care about top-level outputs
for old_uop, new_uop in zip(sink.src, new_sink.src):
    if old_uop is not new_uop:
        print("top-level output changed")

# or apply a known explicit remap with substitute
updated = sink.substitute({old_uop: new_uop}, name="apply remap")

In practice most call sites now either:

call graph_rewrite(...) and compare the rewritten root(s), or
keep an explicit old→new map produced by some other pass such as transform_to_call(...), then apply it with UOp.substitute(...).

5. Specialized Rewrites¶

`UOp.substitute(dvars, name, extra_pm)`¶

Location: tinygrad/uop/ops.py:356

Substitutes variables in the graph - replaces specified UOps with others throughout the graph.

# Simple substitution
new_uop = uop.substitute({old_uop: new_uop})

# With extra pattern matcher
new_uop = uop.substitute({a: b}, extra_pm=some_pm)

# With name for profiling
new_uop = uop.substitute({a: b}, name="my_substitution")

How it works: - Internally uses graph_rewrite with a special _substitute PatternMatcher - Applied bottom-up (bottom_up=True) - The extra_pm allows additional rewrites during substitution

Use cases: - Variable elimination - Common subexpression elimination - Gradient computation (substituting gradients for parameters)

`UOp.simplify(tracked=False)`¶

Location: tinygrad/uop/ops.py:339

Simplifies the UOp using symbolic rewrites.

simplified = uop.simplify()
# Uses the `symbolic` PatternMatcher from tinygrad.uop.symbolic

How it works: - Applies the full symbolic PatternMatcher via graph_rewrite - Performs algebraic simplification, constant folding, etc.

Parameters: - tracked: If True, tracks rewrite statistics

`UOp.ssimplify() -> UOp|ConstType`¶

Location: tinygrad/uop/ops.py:345

Like simplify() but returns the constant value directly if the result is a CONST.

value = uop.ssimplify()
# Returns: UOp if not constant, or the constant value (int, float, etc.)

`UOp.sintify() -> sint`¶

Location: tinygrad/uop/ops.py:346

Converts to symbolic integer - returns the arg if CONST, otherwise returns self.

value = uop.sintify()
# If CONST: returns the integer value
# Otherwise: returns the UOp itself

`UOp.render(simplify=True, pm=None) -> str`¶

Location: tinygrad/uop/ops.py:785

Renders the UOp to a string representation.

# Simple render (with simplification)
print(uop.render())

# Without simplification
print(uop.render(simplify=False))

# With custom pattern matcher
print(uop.render(pm=custom_renderer))

How it works: - Toposorts the graph - Applies the renderer PatternMatcher to each node in order - Accumulates string representations in a context dict

6. Specialized Traversal Functions¶

`_deepwalk(root, targets)`¶

Location: tinygrad/gradient.py

A specialized function that computes the "target path" - nodes on the path from root to specified target nodes.

# Find all nodes between root and targets
path = _deepwalk(root, {target1, target2})
# Returns list[UOp] in topological order

How it works: 1. First computes in_target_path by traversing backward from targets 2. Then toposorts with a gate that: - Skips DETACH and ASSIGN ops - Only includes nodes in the target path

Use when: - Computing gradients (this is used in gradient.py) - Finding specific subgraphs between two points

`full_rewrite_to_sink(sink, ren=None, optimize=True)`¶

Location: tinygrad/codegen/__init__.py:26

The main entry point for full graph optimization in tinygrad's codegen pipeline. Applies multiple optimization passes.

optimized_sink = full_rewrite_to_sink(sink)

What it does (in order): 1. pm_mops + pm_syntactic_sugar - early movement ops (bottom-up) 2. pm_load_collapse - collapse loads 3. pm_split_ranges + pm_flatten_range - split ranges 4. sym + pm_flatten_range - symbolic simplification 5. Many more optimization passes...

This is what gets called when you create a schedule or execute a tensor operation.

7. Context and Tracking¶

`TrackedPatternMatcher`¶

Location: tinygrad/uop/ops.py:1125

A subclass of PatternMatcher that tracks match statistics.

# Automatically used when TRACK_MATCH_STATS >= 2
pm = TrackedPatternMatcher([...])
result = pm.rewrite(uop)
# Tracks: number of matches, time spent, etc.

`track_rewrites` decorator¶

Location: tinygrad/uop/ops.py:1091

Decorator to track rewrite operations for visualization/profiling.

@track_rewrites()
def my_rewrite_function(uop):
    return graph_rewrite(uop, pm)
# Records all rewrites for analysis

8. Utility Functions¶

`consumer_map_from_toposort(lst)`¶

Location: tinygrad/uop/ops.py:67

Builds a consumer map from a list of UOps in topological order.

consumer_map = consumer_map_from_toposort(sink.toposort())
# Returns: dict[UOp, dict[UOp, None]] - node -> set of consumers

`all_metadata`¶

Location: tinygrad/uop/ops.py:105

A WeakKeyDictionary that stores metadata for UOps.

# Store metadata
all_metadata[uop] = (Metadata(...),)

# Retrieve metadata
meta = all_metadata.get(uop)

Used for tracking additional information about UOps (line numbers, source locations, etc.)

Summary Table¶

Method	Traversal	Pattern Matching	Fixed-point	Graph Rebuild	Use Case
`toposort`/`topovisit`	Manual iteration	No (manual `if`)	No	Manual	Analysis, simple maps
`backward_slice`	Backward	No	No	No	Dependency analysis
`PatternMatcher.rewrite`	None (single node)	Yes (`UPat`)	No	No	Building block, spot checks
`graph_rewrite`	Full graph	Yes (via PM)	Yes (for `bpm`)	Automatic	Optimization, lowering
`substitute`	Full graph	Yes (+ extra_pm)	Yes (bottom-up)	Automatic	Variable substitution
`simplify`	Full graph	Yes (symbolic)	Yes	Automatic	Algebraic simplification
`render`	Full graph	Yes (renderer PM)	No	Automatic	String representation

Additional Notes¶

Graph Specification¶

Most of these methods assume: - UOp graph: A DAG where uop.src are the inputs (children in dataflow) - Rooted at sink: There's a single "sink" node that all others feed into - Immutable nodes: UOps are immutable; modifications create new nodes

Performance Considerations¶

toposort: O(V+E) where V=vertices, E=edges
PatternMatcher: Uses pdict for O(1) op lookup, early_reject for optimization
graph_rewrite: Stack-based to avoid recursion; limits stack size to prevent infinite loops
Caching: Many methods use cached_property or explicit caches

Common Patterns¶

# Manual bottom-up transformation
replace_map = {}
for u in sink.toposort()[::-1]:  # Reverse = bottom-up
    new_src = tuple(replace_map.get(s, s) for s in u.src)
    replace_map[u] = transform(u, new_src)
result = replace_map[sink]

# Using graph_rewrite for the same
result = graph_rewrite(sink, PatternMatcher([(UPat(), transform_with_pm)]))

UOp Graph Traversal and Rewriting¶

Table of Contents¶

1. Read-Only Graph Traversal¶

UOp.toposort(gate=None) -> dict[UOp, None]¶

UOp.topovisit(visitor, cache) -> T¶

UOp.backward_slice¶

UOp.backward_slice_with_self¶

UOp.op_in_backward_slice_with_self(*ops) -> bool¶

UOp.get_consumer_map() -> dict[UOp, dict[UOp, None]]¶

UOp._ranges, UOp.ranges¶

2. Single Node Operations¶

UOp.replace(**kwargs) -> UOp¶

UOp.f(op, **kwargs) -> UOp¶

UOp.detach() -> UOp¶

UOp.cast(dtype) -> UOp¶

UOp.bitcast(dtype) -> UOp¶