UOp Graph Traversal and Rewriting

tinygrad provides multiple layers of APIs for parsing and modifying UOp graphs, ranging from simple manual iteration to powerful pattern-matching-based rewrites. This document catalogs all available methods.


Table of Contents

  1. Read-Only Graph Traversal
     - UOp.toposort(gate=None)
     - UOp.topovisit(visitor, cache)
     - UOp.backward_slice
     - UOp.backward_slice_with_self
     - UOp.op_in_backward_slice_with_self(*ops)
     - UOp.get_consumer_map()
     - UOp._ranges, UOp.ranges
  2. Single Node Operations
     - UOp.replace(**kwargs)
     - UOp.f(op, **kwargs)
     - UOp.detach()
     - UOp.cast(dtype)
     - UOp.bitcast(dtype)
  3. Pattern Matching
     - UPat - The Pattern Language
     - PatternMatcher.rewrite(uop, ctx)
     - PatternMatcher.__add__
  4. Full Graph Rewriting
     - graph_rewrite(sink, pm, ctx, bottom_up, bpm)
     - RewriteContext.unified_rewrite(root)
     - graph_rewrite_map (removed)
  5. Specialized Rewrites
     - UOp.substitute(dvars, name, extra_pm)
     - UOp.simplify(tracked)
     - UOp.ssimplify()
     - UOp.sintify()
     - UOp.render(simplify, pm)
  6. Specialized Traversal Functions
     - _deepwalk(root, targets)
     - full_rewrite_to_sink(sink, ren, optimize)
  7. Context and Tracking
     - TrackedPatternMatcher
     - track_rewrites decorator
  8. Utility Functions
     - consumer_map_from_toposort(lst)
     - all_metadata

1. Read-Only Graph Traversal

UOp.toposort(gate=None) -> dict[UOp, None]

Location: tinygrad/uop/ops.py:167

Returns all nodes reachable from self in topological order (children before parents). Uses an iterative stack-based algorithm to avoid recursion depth issues.

# Iterate all nodes reachable from sink
for node in sink.toposort():
    if node.op == Ops.CONST:
        print(f"Found constant: {node.arg}")

# With gating - control which branches to descend into
for node in sink.toposort(lambda n: n.op != Ops.DETACH):
    # Skip DETACH ops and their descendants
    ...

Parameters: - gate: Optional callback Callable[[UOp], bool]. If provided, only descends into nodes where gate(node) returns True.

Returns: dict[UOp, None] - ordered dict (insertion order = topological order)

Use when: - You need to iterate over all nodes in a graph for analysis - Building maps/dictionaries based on graph structure - Simple transformations where you manually manage replacement
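The iterative, stack-based traversal described above can be sketched as follows, using a minimal Node stand-in with a .src tuple (an illustration of the algorithm only, not tinygrad's real implementation):

```python
# Stand-in for a UOp: just an op-less node with sources.
class Node:
    def __init__(self, name, *src): self.name, self.src = name, src

def toposort(root, gate=None):
    out, stack = {}, [(root, False)]  # (node, children_done) pairs
    while stack:
        node, visited = stack.pop()
        if node in out: continue
        if visited:
            out[node] = None                        # emit after all sources
        else:
            if gate is not None and not gate(node): continue  # gated-out branch skipped
            stack.append((node, True))              # revisit once children are done
            for s in node.src: stack.append((s, False))
    return out

a = Node("a"); b = Node("b", a); c = Node("c", a, b)
order = list(toposort(c))                           # a and b come before c
gated = toposort(c, gate=lambda n: n is not b)      # b (and its branch) excluded
```

The `(node, visited)` two-phase stack replaces recursion, so arbitrarily deep graphs cannot overflow the Python call stack.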


UOp.topovisit(visitor, cache) -> T

Location: tinygrad/uop/ops.py:180

Similar to toposort but applies a visitor function to each node. Returns the result for the root node. Useful when you need to compute a value for each node.

# Count all UOps of a certain type
def count_ops(uop):
    return 1 + sum(count_ops(child) for child in uop.src)

# More efficient with topovisit
cache = {}
def count_visitor(node):
    return 1 + sum(cache.get(child, 0) for child in node.src)

root.topovisit(count_visitor, cache)

Parameters: - visitor: Callable[[UOp], T] - applied to each node in topological order - cache: dict[UOp, T] - must be provided, used to store visitor results

Returns: The visitor result for the root node (T)

Use when: - Computing aggregated values over the graph - When you need to avoid creating intermediate structures
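A runnable sketch of the visit order (children cached before parents), again over a minimal Node stand-in rather than tinygrad's UOp:

```python
class Node:
    def __init__(self, *src): self.src = src

def _topo(node, seen):
    # post-order DFS: sources first, then the node itself
    if node in seen: return seen
    for s in node.src: _topo(s, seen)
    seen[node] = None
    return seen

def topovisit(root, visitor, cache):
    # every child's result is in `cache` before its parent's visitor runs
    for node in _topo(root, {}): cache[node] = visitor(node)
    return cache[root]

a = Node(); b = Node(a); root = Node(a, b)
cache = {}
# tree-size of the expression: the shared node `a` is counted once per use
size = topovisit(root, lambda n: 1 + sum(cache[s] for s in n.src), cache)
```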


UOp.backward_slice

Location: tinygrad/uop/ops.py:156

Returns all ancestors of the current node, i.e. every node reachable from self by following src pointers (everything this computation depends on). This is the "backward" direction in the dataflow graph.

# Get all nodes that feed into this computation
ancestors = uop.backward_slice
# Returns dict[UOp, None] - self is NOT included

Returns: dict[UOp, None] - all nodes that are parents (in the dataflow sense)

Use when: - Finding all inputs that affect a computation - Dependency analysis


UOp.backward_slice_with_self

Location: tinygrad/uop/ops.py:162

Same as backward_slice but includes self in the result.

# Get all nodes that feed into this computation, including itself
ancestors_with_self = uop.backward_slice_with_self

UOp.op_in_backward_slice_with_self(*ops) -> bool

Location: tinygrad/uop/ops.py:163

Efficiently checks if any node in the backward slice (including self) has one of the specified ops. Avoids creating intermediate dict.

# Check if any ancestor (or self) is a CONST
if uop.op_in_backward_slice_with_self(Ops.CONST, Ops.VCONST):
    ...
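The early-exit behavior can be sketched as a DFS that stops at the first hit, without materializing the full backward-slice dict (stand-in code, not tinygrad's implementation):

```python
class Node:
    def __init__(self, op, *src): self.op, self.src = op, src

def op_in_backward_slice_with_self(root, *ops):
    stack, seen = [root], set()
    while stack:
        n = stack.pop()
        if n in seen: continue
        seen.add(n)
        if n.op in ops: return True   # early exit: no intermediate dict built
        stack.extend(n.src)
    return False

c = Node("CONST"); v = Node("VAR"); add = Node("ADD", c, v)
```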

UOp.get_consumer_map() -> dict[UOp, dict[UOp, None]]

Location: tinygrad/uop/ops.py:193

Returns a map from each UOp to all its consumers (nodes that have it as a source). This is the reverse of the dataflow direction.

consumer_map = uop.get_consumer_map()
for node, consumers in consumer_map.items():
    print(f"Node {node} is used by {len(consumers)} consumers")

Note: Internally uses toposort() and then builds the reverse map.
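Building the reverse map from a topological order is a single pass, sketched here over a minimal Node stand-in:

```python
class Node:
    def __init__(self, name, *src): self.name, self.src = name, src

def get_consumer_map(topo_nodes):
    cmap = {n: {} for n in topo_nodes}     # node -> {consumer: None}
    for n in cmap:
        for s in n.src: cmap[s][n] = None  # n consumes each of its sources
    return cmap

a = Node("a"); b = Node("b", a); c = Node("c", a, b)
cmap = get_consumer_map([a, b, c])
```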


UOp._ranges, UOp.ranges

Location: tinygrad/uop/ops.py:319, 333

Computes the "ranges" of a UOp - the iteration space if the UOp represents a loop. These are cached properties.

# Get iteration ranges for a UOp
ranges = uop.ranges
# Returns dict[UOp, None] mapping range UOps to None

2. Single Node Operations

These methods create new UOps with modified properties. They don't traverse the graph - they're building blocks for manual graph modification.

UOp.replace(**kwargs) -> UOp

Location: tinygrad/uop/ops.py:137

Creates a new UOp with modified properties. The most common way to create modified nodes.

# Change the dtype
new_uop = uop.replace(dtype=dtypes.float32)

# Change sources
new_uop = uop.replace(src=(new_src1, new_src2))

# Change arg
new_uop = uop.replace(arg=new_arg)

# Remove tag
new_uop = uop.replace(tag=None)

Key property: If no kwargs differ from the original, returns the same object (self). This enables structural sharing and efficient comparison.
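The structural-sharing behavior can be demonstrated with a minimal immutable stand-in (MiniUOp is hypothetical; tinygrad's real replace lives on UOp):

```python
class MiniUOp:
    __slots__ = ("op", "dtype", "src", "arg")
    def __init__(self, op, dtype=None, src=(), arg=None):
        self.op, self.dtype, self.src, self.arg = op, dtype, src, arg
    def replace(self, **kwargs):
        new = {k: kwargs.get(k, getattr(self, k)) for k in self.__slots__}
        # structural sharing: if nothing actually changed, return self
        if all(new[k] == getattr(self, k) for k in self.__slots__): return self
        return MiniUOp(**new)

u = MiniUOp("ADD", dtype="float32")
```

Returning `self` for no-op replacements means identity comparison (`is`) can detect "nothing changed" cheaply, which graph rewriters rely on to skip rebuilding parents.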


UOp.f(op, **kwargs) -> UOp

Location: tinygrad/uop/ops.py:153

Shorthand for creating a new UOp with self as the only source.

# These are equivalent:
uop.f(Ops.NEG) == UOp(Ops.NEG, uop.dtype, (uop,))

UOp.detach() -> UOp

Location: tinygrad/uop/ops.py:381

Wraps the UOp in a DETACH op, which prevents gradient flow in backward passes.

detached = uop.detach()
# Creates: UOp(Ops.DETACH, uop.dtype, (uop,))

UOp.cast(dtype) -> UOp

Location: tinygrad/uop/ops.py:403

Creates a CAST op to change the dtype.

casted = uop.cast(dtypes.float32)
# Creates: UOp(Ops.CAST, dtypes.float32, (uop,))

UOp.bitcast(dtype) -> UOp

Location: tinygrad/uop/ops.py:404

Creates a BITCAST op (no data conversion, just reinterpretation).

bitcast = uop.bitcast(dtypes.int32)
# Creates: UOp(Ops.BITCAST, dtypes.int32, (uop,))

3. Pattern Matching

UPat - The Pattern Language

Location: tinygrad/uop/ops.py:895

UPat is a pattern that matches UOps. It's the core of tinygrad's pattern matching system.

# Match specific ops
UPat(Ops.ADD)                    # Match ADD op
UPat((Ops.ADD, Ops.SUB))         # Match ADD or SUB

# Match by dtype
UPat(dtype=dtypes.float32)       # Match float32 dtype
UPat(dtype=(dtypes.float32, dtypes.float64))  # Match multiple dtypes

# Match by arg
UPat(arg=0)                      # Match arg equals 0
UPat(arg=None)                   # Match any arg

# Match sources (children)
UPat(src=(UPat.var("x"), UPat.cvar("c")))  # Two sources: any x and a const
UPat(src=(UPat(), UPat()))                 # Exactly two sources

# Named captures - available in callback
UPat(Ops.ADD, name="add")         # Captures matched UOp as "add"

# Variable matching
UPat.var("x")                    # Matches anything, captures as "x"
UPat.cvar("c")                   # Matches CONST or VCONST, captures as "c"

# Commutative matching (list = try all permutations)
UPat(Ops.ADD, src=[UPat.var("x"), UPat.var("y")])  # x+y matches both x+y and y+x

# Repeating pattern (matches any number)
UPat(Ops.VECTORIZE, src=UPat(Ops.CONST))  # Matches VECTORIZE with any number of CONST sources

# Custom early reject - optimization
UPat(Ops.ADD, custom_early_reject={Ops.CONST})  # Skip if no CONST in sources

Key methods:

  - UPat.var(name) - Match anything, capture as name
  - UPat.cvar(name) - Match CONST or VCONST
  - UPat.const(dtype, b) - Match specific constant value
  - UPat.any(*pats) - Match if any sub-pattern matches
  - UPat.named(name) - Add a name to existing pattern
  - UPat.or_casted(name) - Match self or CAST of self
  - UPat.or_after(name) - Match self or AFTER of self


PatternMatcher.rewrite(uop, ctx) -> UOp | None

Location: tinygrad/uop/ops.py:1038

Rewrites a single UOp using the pattern matcher. This is the core pattern matching function.

pm = PatternMatcher([
    # (pattern, callback)
    (UPat(Ops.ADD, src=(UPat.var("x"), UPat.const(None, 0))), lambda x: x),  # x + 0 -> x
    (UPat(Ops.MUL, src=(UPat.var("x"), UPat.const(None, 1))), lambda x: x),  # x * 1 -> x
])

result = pm.rewrite(some_uop)  # Returns rewritten node or None

How it works:

  1. Look up patterns by op in pdict
  2. For each pattern, check early_reject (optimization)
  3. Run the pattern match
  4. If the match succeeds, call the callback with the captured variables
  5. Return the first non-None result, or None if nothing matched

Parameters: - uop: The UOp to rewrite - ctx: Optional context passed to callbacks

Returns: Rewritten UOp if pattern matched, None otherwise

Note: Does NOT recursively rewrite - only matches the single node.
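The op-keyed dispatch can be sketched with a hypothetical MiniMatcher (not tinygrad's classes): rules here are (op, predicate, callback) triples, and rewrite returns the first non-None result.

```python
class Node:
    def __init__(self, op, src=(), arg=None): self.op, self.src, self.arg = op, src, arg

class MiniMatcher:
    def __init__(self, rules):
        self.pdict = {}
        for op, pred, fn in rules: self.pdict.setdefault(op, []).append((pred, fn))
    def rewrite(self, node, ctx=None):
        for pred, fn in self.pdict.get(node.op, []):   # O(1) dispatch on op
            if pred(node):
                if (out := fn(node)) is not None: return out
        return None   # no pattern matched this single node

is_zero = lambda n: n.op == "CONST" and n.arg == 0
# x + 0 -> x: match an ADD with a zero-const source, return the other source
pm = MiniMatcher([("ADD",
                   lambda n: any(is_zero(s) for s in n.src),
                   lambda n: next(s for s in n.src if not is_zero(s)))])
```

Like the real PatternMatcher.rewrite, this operates on a single node only; walking the graph is the caller's job.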


PatternMatcher.__add__

Location: tinygrad/uop/ops.py:1035

Combines two PatternMatchers.

pm1 = PatternMatcher([...])
pm2 = PatternMatcher([...])
combined = pm1 + pm2  # Returns new PatternMatcher with all rules

4. Full Graph Rewriting

graph_rewrite(sink, pm, ctx, bottom_up, bpm) -> UOp

Location: tinygrad/uop/ops.py:1266

Applies a PatternMatcher across an entire UOp graph, handling topological ordering, source reconstruction, and fixed-point iteration.

new_sink = graph_rewrite(sink, pm)
new_sink = graph_rewrite(sink, pm, ctx=some_context)
new_sink = graph_rewrite(sink, pm, bottom_up=True)  # Bottom-up mode
new_sink = graph_rewrite(sink, pm, bpm=bpm)        # With bottom-up pre-pass

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| sink | UOp | Root UOp of the graph to rewrite |
| pm | PatternMatcher | Main pattern matcher (top-down) |
| ctx | Any | Optional context passed to all callbacks |
| bottom_up | bool | If True, pm is applied bottom-up instead of top-down |
| bpm | PatternMatcher | Optional bottom-up pre-pass (only used when bottom_up=False) |
| name | str | Optional name for profiling/visualization |

Bottom-up mode (bottom_up=True):

  - Applies pm bottom-up (children before parents)
  - Uses fixed-point iteration: keeps applying until no more matches

With bpm (bottom-up pre-pass):

  - First applies bpm bottom-up with fixed-point iteration
  - Then applies pm top-down
  - Useful when you want both bottom-up simplification and top-down lowering


RewriteContext.unified_rewrite(root)

Location: tinygrad/uop/ops.py:1205

The internal implementation of graph_rewrite. Uses a 3-stage stack-based traversal:

Stage 0 (descend):
  - If bpm is set, apply it in a fixed-point loop
  - Push all sources onto the stack

Stage 1 (rebuild):
  - Once all sources are rewritten, rebuild the node
  - If top-down, apply pm.rewrite()
  - If the rewrite produces a new node, push it for full re-processing

Stage 2 (link):
  - After the rewritten node finishes, link the original to the final result

Key properties:
  - Stack-based: no recursion, prevents stack overflow
  - Fixed-point for bpm: bottom-up rules are applied repeatedly
  - Single-pass for pm: top-down rules fire once per node
  - Deduplication: each original node is processed at most once
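A recursive sketch of a bottom-up rewrite with per-node fixed-point iteration (memoized; tinygrad's real unified_rewrite is stack-based and more careful about re-processing freshly produced nodes):

```python
class Node:
    def __init__(self, op, src=(), arg=None): self.op, self.src, self.arg = op, src, arg

def rewrite_graph(root, rewrite, cache=None):
    cache = {} if cache is None else cache
    if root in cache: return cache[root]
    # rebuild from rewritten sources; keep the original object if untouched
    new_src = tuple(rewrite_graph(s, rewrite, cache) for s in root.src)
    node = root if new_src == root.src else Node(root.op, new_src, root.arg)
    while (nxt := rewrite(node)) is not None:   # fixed point at this node
        node = nxt
    cache[root] = node
    return node

# rule: x + 0 -> x (rewrite callbacks must return None when nothing applies)
def rule(n):
    if n.op == "ADD" and n.src[1].op == "CONST" and n.src[1].arg == 0: return n.src[0]
    return None

x = Node("VAR"); zero = Node("CONST", arg=0)
expr = Node("ADD", (Node("ADD", (x, zero)), zero))   # (x + 0) + 0
result = rewrite_graph(expr, rule)
```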


graph_rewrite_map

Current tinygrad no longer exposes graph_rewrite_map(...).

If you need the old behavior, the usual replacement is:

new_sink = graph_rewrite(sink, pm, name="my rewrite")

# compare roots if you only care about top-level outputs
for old_uop, new_uop in zip(sink.src, new_sink.src):
    if old_uop is not new_uop:
        print("top-level output changed")

# or apply a known explicit remap with substitute
updated = sink.substitute({old_uop: new_uop}, name="apply remap")

In practice most call sites now either:

  • call graph_rewrite(...) and compare the rewritten root(s), or
  • keep an explicit old→new map produced by some other pass such as transform_to_call(...), then apply it with UOp.substitute(...).

5. Specialized Rewrites

UOp.substitute(dvars, name, extra_pm)

Location: tinygrad/uop/ops.py:356

Substitutes variables in the graph - replaces specified UOps with others throughout the graph.

# Simple substitution
result = uop.substitute({old_uop: new_uop})

# With extra pattern matcher
new_uop = uop.substitute({a: b}, extra_pm=some_pm)

# With name for profiling
new_uop = uop.substitute({a: b}, name="my_substitution")

How it works: - Internally uses graph_rewrite with a special _substitute PatternMatcher - Applied bottom-up (bottom_up=True) - The extra_pm allows additional rewrites during substitution

Use cases: - Variable elimination - Common subexpression elimination - Gradient computation (substituting gradients for parameters)
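Substitution can be sketched as a memoized bottom-up rebuild: nodes found in dvars are swapped out, and parents are rebuilt only when a source changed (stand-in code, not tinygrad's implementation):

```python
class Node:
    def __init__(self, op, src=(), arg=None): self.op, self.src, self.arg = op, src, arg

def substitute(root, dvars, cache=None):
    cache = {} if cache is None else cache
    if root in cache: return cache[root]
    if root in dvars:
        out = dvars[root]                    # direct replacement
    else:
        new_src = tuple(substitute(s, dvars, cache) for s in root.src)
        out = root if new_src == root.src else Node(root.op, new_src, root.arg)
    cache[root] = out
    return out

x = Node("VAR"); y = Node("VAR")
expr = Node("ADD", (x, Node("CONST", arg=1)))
replaced = substitute(expr, {x: y})          # x + 1 becomes y + 1
```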


UOp.simplify(tracked=False)

Location: tinygrad/uop/ops.py:339

Simplifies the UOp using symbolic rewrites.

simplified = uop.simplify()
# Uses the `symbolic` PatternMatcher from tinygrad.uop.symbolic

How it works: - Applies the full symbolic PatternMatcher via graph_rewrite - Performs algebraic simplification, constant folding, etc.

Parameters: - tracked: If True, tracks rewrite statistics


UOp.ssimplify() -> UOp|ConstType

Location: tinygrad/uop/ops.py:345

Like simplify() but returns the constant value directly if the result is a CONST.

value = uop.ssimplify()
# Returns: UOp if not constant, or the constant value (int, float, etc.)

UOp.sintify() -> sint

Location: tinygrad/uop/ops.py:346

Converts to symbolic integer - returns the arg if CONST, otherwise returns self.

value = uop.sintify()
# If CONST: returns the integer value
# Otherwise: returns the UOp itself

UOp.render(simplify=True, pm=None) -> str

Location: tinygrad/uop/ops.py:785

Renders the UOp to a string representation.

# Simple render (with simplification)
print(uop.render())

# Without simplification
print(uop.render(simplify=False))

# With custom pattern matcher
print(uop.render(pm=custom_renderer))

How it works: - Toposorts the graph - Applies the renderer PatternMatcher to each node in order - Accumulates string representations in a context dict


6. Specialized Traversal Functions

_deepwalk(root, targets)

Location: tinygrad/gradient.py

A specialized function that computes the "target path" - nodes on the path from root to specified target nodes.

# Find all nodes between root and targets
path = _deepwalk(root, {target1, target2})
# Returns list[UOp] in topological order

How it works:

  1. First computes in_target_path by traversing backward from targets
  2. Then toposorts with a gate that:
     - Skips DETACH and ASSIGN ops
     - Only includes nodes in the target path

Use when: - Computing gradients (this is used in gradient.py) - Finding specific subgraphs between two points


full_rewrite_to_sink(sink, ren=None, optimize=True)

Location: tinygrad/codegen/__init__.py:26

The main entry point for full graph optimization in tinygrad's codegen pipeline. Applies multiple optimization passes.

optimized_sink = full_rewrite_to_sink(sink)

What it does (in order):

  1. pm_mops + pm_syntactic_sugar - early movement ops (bottom-up)
  2. pm_load_collapse - collapse loads
  3. pm_split_ranges + pm_flatten_range - split ranges
  4. sym + pm_flatten_range - symbolic simplification
  5. Many more optimization passes...

This is what gets called when you create a schedule or execute a tensor operation.


7. Context and Tracking

TrackedPatternMatcher

Location: tinygrad/uop/ops.py:1125

A subclass of PatternMatcher that tracks match statistics.

# Automatically used when TRACK_MATCH_STATS >= 2
pm = TrackedPatternMatcher([...])
result = pm.rewrite(uop)
# Tracks: number of matches, time spent, etc.
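The tracking idea can be sketched as a thin wrapper that counts hits and misses and accumulates time (hypothetical names; tinygrad's TrackedPatternMatcher records richer per-pattern statistics):

```python
import time

class TrackedRewriter:
    def __init__(self, rewrite_fn):
        self.rewrite_fn, self.matches, self.misses, self.seconds = rewrite_fn, 0, 0, 0.0
    def rewrite(self, node):
        st = time.perf_counter()
        out = self.rewrite_fn(node)          # delegate to the wrapped rewrite
        self.seconds += time.perf_counter() - st
        if out is None: self.misses += 1
        else: self.matches += 1
        return out

tr = TrackedRewriter(lambda n: "rewritten" if n == "match-me" else None)
```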

track_rewrites decorator

Location: tinygrad/uop/ops.py:1091

Decorator to track rewrite operations for visualization/profiling.

@track_rewrites()
def my_rewrite_function(uop):
    return graph_rewrite(uop, pm)
# Records all rewrites for analysis

8. Utility Functions

consumer_map_from_toposort(lst)

Location: tinygrad/uop/ops.py:67

Builds a consumer map from a list of UOps in topological order.

consumer_map = consumer_map_from_toposort(sink.toposort())
# Returns: dict[UOp, dict[UOp, None]] - node -> set of consumers

all_metadata

Location: tinygrad/uop/ops.py:105

A WeakKeyDictionary that stores metadata for UOps.

# Store metadata
all_metadata[uop] = (Metadata(...),)

# Retrieve metadata
meta = all_metadata.get(uop)

Used for tracking additional information about UOps (line numbers, source locations, etc.)
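The point of a WeakKeyDictionary is that metadata disappears automatically once the UOp it annotates is garbage collected, so the side table never keeps dead nodes alive. A minimal demonstration with a stand-in class:

```python
import gc
import weakref

class Obj: pass   # stand-in for a UOp; plain instances are weak-referenceable

all_metadata = weakref.WeakKeyDictionary()
o = Obj()
all_metadata[o] = ("some metadata",)
assert all_metadata.get(o) == ("some metadata",)

del o          # drop the only strong reference to the key
gc.collect()   # the entry vanishes along with its key
```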


Summary Table

| Method | Traversal | Pattern Matching | Fixed-point | Graph Rebuild | Use Case |
| --- | --- | --- | --- | --- | --- |
| toposort/topovisit | Manual iteration | No (manual if) | No | Manual | Analysis, simple maps |
| backward_slice | Backward | No | No | No | Dependency analysis |
| PatternMatcher.rewrite | None (single node) | Yes (UPat) | No | No | Building block, spot checks |
| graph_rewrite | Full graph | Yes (via PM) | Yes (for bpm) | Automatic | Optimization, lowering |
| substitute | Full graph | Yes (+ extra_pm) | Yes (bottom-up) | Automatic | Variable substitution |
| simplify | Full graph | Yes (symbolic) | Yes | Automatic | Algebraic simplification |
| render | Full graph | Yes (renderer PM) | No | Automatic | String representation |

Additional Notes

Graph Specification

Most of these methods assume:

  - UOp graph: A DAG where uop.src are the inputs (children in dataflow)
  - Rooted at sink: There's a single "sink" node that all others feed into
  - Immutable nodes: UOps are immutable; modifications create new nodes

Performance Considerations

  1. toposort: O(V+E) where V=vertices, E=edges
  2. PatternMatcher: Uses pdict for O(1) op lookup, early_reject for optimization
  3. graph_rewrite: Stack-based to avoid recursion; limits stack size to prevent infinite loops
  4. Caching: Many methods use cached_property or explicit caches

Common Patterns

# Manual bottom-up transformation
replace_map = {}
for u in sink.toposort():  # toposort already yields sources before consumers
    new_src = tuple(replace_map.get(s, s) for s in u.src)
    replace_map[u] = transform(u, new_src)
result = replace_map[sink]

# Using graph_rewrite for the same
# (the callback must return None once nothing is left to change,
# otherwise the rewrite never reaches a fixed point)
result = graph_rewrite(sink, PatternMatcher([(UPat(name="u"), transform_with_pm)]))