Skip to content

Conversation

@neel-patel-1
Copy link
Contributor

@neel-patel-1 neel-patel-1 commented May 13, 2025

closes #511

Copy link
Owner

@sampsyo sampsyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, everybody—thanks for the detailed writeup; this made the adjusted goal and the concrete approach clear. It's too bad that the original plan to use SmoothE did not work out. But it was definitely interesting to see how well ILP-based extraction worked for this workload.

One big question I had was: while I understand how an ILP-based extractor could be an interesting intermediate step, what was your motivation for adding support for good_lp? Were there specific limitations in egg's existing ILP solver support that you needed to work around? It might be nice to add some additional motivation for this in the body of the post.


## Introduction
### E-graphs and Equality Saturation
Equality-graphs, or more simply, e-graphs, efficiently represent equivalence classes of expressions. This makes them useful for superoptimization, where, given an input program, we seek to find the sequence of optimizations that emits the *best* program, minimizing some user-provided cost function. E-graphs are directed graphs with edges from expressions (*e-nodes* in an e-graph), to a group of e-nodes (an *e-class* in an e-graph, where all the e-nodes contained are equivalent expressions). To efficiently maintain and merge these equivalence classes, e-graphs rely on a union-find data structure. The example below shows an arithmetic expression `a * 2 / 2` represented as an e-graph.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be nice to link to the Wikipedia page, or some other definition? https://en.wikipedia.org/wiki/E-graph

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added


<img src="canonical-example-before.svg" alt="" width="33%">

In *equality saturation*, we apply pattern-based *rewrites* to repeatedly produce equivalent expressions. Each rewrite grows the e-graph, without losing the previous versions of the expression. So if two expressions always evaluate to the same result, they belong in the same e-class. Upon saturation, an e-graph will have undergone enough rewrites to reach a fixed point where it encodes all possible programs simultaneously.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Equality saturation probably also deserves a hyperlink.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


<img src="canonical-example-before.svg" alt="" width="33%">

In *equality saturation*, we apply pattern-based *rewrites* to repeatedly produce equivalent expressions. Each rewrite grows the e-graph, without losing the previous versions of the expression. So if two expressions always evaluate to the same result, they belong in the same e-class. Upon saturation, an e-graph will have undergone enough rewrites to reach a fixed point where it encodes all possible programs simultaneously.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"all possible programs" perhaps could use a little more precision here… maybe you mean all the possible results of applying all the rewrite rules?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified that "all possible programs resulting from all rewrite rules are encoded in the e-graph"

### E-graphs and Equality Saturation
Equality-graphs, or more simply, e-graphs, efficiently represent equivalence classes of expressions. This makes them useful for superoptimization, where, given an input program, we seek to find the sequence of optimizations that emits the *best* program, minimizing some user-provided cost function. E-graphs are directed graphs with edges from expressions (*e-nodes* in an e-graph), to a group of e-nodes (an *e-class* in an e-graph, where all the e-nodes contained are equivalent expressions). To efficiently maintain and merge these equivalence classes, e-graphs rely on a union-find data structure. The example below shows an arithmetic expression `a * 2 / 2` represented as an e-graph.

<img src="canonical-example-before.svg" alt="" width="33%">
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like a nice idea to use alt text, if you can think of anything…

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Descriptions added for each img

The second is an [exact extractor](https://github.com/egraphs-good/egg/blob/v0.10.0/src/lp_extract.rs) which formulates extraction as a mixed integer linear programming problem and solves it using the [COIN-OR](https://github.com/coin-or/Cbc) Branch-and-cut solver.

### E-graphs for Technology Mapping
The task of technology mapping in logic synthesis is to express a given Boolean function as a network of gates from a standard cell library (for ASICs) or programmable LUTs (for FPGAs) so that an objective function, such as total area, is optimized.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Technology mapping" is another good place to use a hyperlink to some kind of place where people can learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added


## Results

We compare the number of LUTs in the designs extracted using greedy and ILP extraction on the [ISCAS85](https://github.com/matth2k/synth-benchmarks/tree/main/verilog/iscas85) design benchmarks. Since finding an exact solution quickly becomes prohibitive in terms of extraction time, (the size of the e-graph and corresponding linear programming problem gets too large), we incrementally increase the number of "rewrite iterations" until reaching 10 . This limits the size of the e-graph by restricting the number of times rewrites are applied. Some benchmarks/solvers were not able to complete even a single iteration within the 30 minute timeout threshold.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time, (the size of the e-graph and corresponding linear programming problem gets too large), we

Use commas or parentheses here, but not both.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

| c7552 | 335 | 325 | DNF | 315 (7) | DNF |
| c880 | DNF | DNF | DNF | DNF | DNF |

The times to solution are reported in the table below
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What units?

Copy link
Contributor Author

@neel-patel-1 neel-patel-1 May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

number of LUTs in the synthesized design (specified in column headers)


| Benchmark | Greedy Time | Microlp Time | HiGHS Time | CBC Time |
|-----------|-------------------|-------------------|-------------------|--------------------|
| c1355 | 0.046134086 | DNF | 640.92384104 | 230.738058229 |
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please reduce the number of significant figures. For example, it's enough to say that this benchmark using greedy extraction takes 0.05 units (seconds?); we do not need all these decimal digits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

| c6288 | 0.136245309 | DNF | 608.710588015 | DNF |
| c7552 | 0.045072024 | DNF | 600.26766583 | DNF |
| c880 | DNF | DNF | DNF | DNF |

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all these results, is there some kind of context you can give to convey whether this is "good" or "bad"? The ideal thing to have done here would be to compare against ABC or similar (on both QoR and running time). But given that it's probably too late to do that, is there any way you could just give some kind of very vague context for how well this system is doing? Are the LUT count reductions anywhere close to the results of traditional tools? How about the running times—is 600 units of time (seconds?) in the ballpark of a traditional tool?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. There is still time to rerun with ABC to provide an optimized baseline

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will report when this is complete

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are using yosys synth_xilinx to synthesize a design targetting an FPGA in all configurations, an ABC tech mapping is always performed. This means that any further LUT decreases in the greedy and exact configurations are in addition to the improvements from ABC. This has been clarified in the text and configuration names.


We also compare the area (µm²) of the designs extracted using greedy and exact extraction (this time only using the top-performing - HiGHs - solver) on the ISCAS85 verilog design benchmarks using a standard cell library. Synthesizing an ASIC design using exact extraction becomes prohibitive faster than synthesis targetting an FPGA due to the larger design search space. There is no "No Optimization" column in this chart. To convert to a standard cell library, a set of rewrite rules must be applied required to convert digital logic to standard cells. We also compare against the Synopsys commercial design compiler to show the design quality a tuned EDA tool can achieve.

| Bench | Greedy Area | Exact Area (Node Limit) (msynth) | Synopsys |
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "msynth" mean here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is actually another tool. similar to lvv, but uses rewrite rules that convert to a standard cell library to target ASICs. clarified in the text: "We use another e-graph-based logic synthesis tool, called msynth, which operates similarly to lvv, but can synthesize ASIC designs using a standard cell library."

@sampsyo sampsyo added the 2025sp label May 16, 2025
@neel-patel-1
Copy link
Contributor Author

One big question I had was: while I understand how an ILP-based extractor could be an interesting intermediate step, what was your motivation for adding support for good_lp? Were there specific limitations in egg's existing ILP solver support that you needed to work around? It might be nice to add some additional motivation for this in the body of the post.

This is a reasonable question which deserved further explanation. We added more information to motivate our implementation decision in Design Extraction in the post.

Briefly, the Coin-OR Branch-and-Cut solver used by egg emits errors for some of the large problems constructed for ASIC synthesis. We hypothesized that the issue had to do with the solver itself, since previous users have also triggered unexpected behavior with their problem formulations. We figured we would encounter similar issues with other solvers as well, so being able to quickly switch between them to see if any of them could produce a result would be helpful. In addition, many solvers actually implement different algorithms, so some might be faster than others depending on the problem. This motivated the integration of a good_lp based extractor, which would allow us to quickly switch between solvers at runtime. Also, good_lp provides a clean API for expressing problem variables, constraints and objective function, making it easy to understand the code. Our implementation has drawbacks (performance) as shown in the Results section, though.

@sampsyo
Copy link
Owner

sampsyo commented May 17, 2025

Got it; that absolutely makes sense!

Thanks for all the changes overall. Can you let me know when I should take another look (and publish) by requesting a review with the widget on the right-hand side of the GitHub PR page?

@neel-patel-1 neel-patel-1 requested a review from sampsyo May 17, 2025 20:12
@neel-patel-1
Copy link
Contributor Author

I have finished reviewing. Thanks for the feedback on this!

@sampsyo
Copy link
Owner

sampsyo commented May 18, 2025

Looks great; nice work on this! The expanded detail in the evaluation is especially helpful to see how things worked out.

@sampsyo sampsyo merged commit 02295d9 into sampsyo:2025sp May 18, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Project Proposal: Testing Extraction Techniques for E-graph Based Digital Logic Rewriting

3 participants