@aw578
Contributor

@aw578 aw578 commented May 13, 2025

No description provided.

Owner

@sampsyo sampsyo left a comment

Good work here, Allen—it sounds like you got the algorithm working in the way that you wanted to. I just have a few writing suggestions within. I especially think it would be interesting to hear any stories you might have about what was hard to get right, correctness-wise.

Do you think it might be worthwhile to contribute this implementation of GVN to the Bril monorepo?

title = "Welcome to CS 6120!"
[extra]
bio = """
Allen Wang is a CS M.Eng student at Cornell University. He's pretty tired right now.
Owner

😅


### Global Value Numbering

Global value numbering is a set of techniques which perform value numbering at the level of a function, rather than a single block. [This paper](https://www.cs.tufts.edu/~nr/cs257/archive/keith-cooper/value-numbering.pdf) goes over hash-based and partitioning implementations of global value numbering. There's already a hash-based implementation for Bril and it's very conceptually similar to local value numbering, so I decided to implement value partitioning instead.
Owner

> There's already a hash-based implementation for Bril

Maybe consider linking to that earlier blog post?


### Value partitioning

Instead of hashing expressions to values like local value numbering, value partitioning works by directly computing congruence classes of expressions, where two expressions are congruent if they have the same opcode all their arguments are congruent with each other. To perform value partitioning, we first put a program into SSA to ensure that each value has a unique variable associated with it. We assume that all operations of a type are in the same congruence class, then repeatedly partition congruence classes where this cannot be true until we obtain a maximum fixed point.
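
The refinement loop described above can be illustrated with a minimal Python sketch. It is not the implementation from this PR: it uses a naive repeat-until-stable loop rather than the paper's worklist algorithm, and the `instrs` dictionary format, the `partition_values` name, and the example program are all hypothetical. It also ignores details such as commutativity and which block a `phi` lives in.

```python
from collections import defaultdict


def partition_values(instrs):
    """Optimistic value partitioning over SSA instructions (illustrative only).

    instrs maps each SSA destination name to (opcode, args). For "const",
    args is assumed to be a (type, literal) pair; otherwise args is a tuple
    of SSA names. Returns a dict mapping each name to a congruence class id.
    """
    # Collect every name: defined values plus undefined inputs (e.g. parameters).
    names = set(instrs)
    for op, args in instrs.values():
        if op != "const":
            names.update(args)

    def number(key_of):
        # Group names by key and hand out one class id per group.
        groups = defaultdict(list)
        for name in sorted(names):
            groups[key_of(name)].append(name)
        classes = {n: i for i, members in enumerate(groups.values()) for n in members}
        return classes, len(groups)

    def initial_key(name):
        if name not in instrs:
            return ("input", name)  # each undefined input stays in its own class
        op, args = instrs[name]
        # Optimistic start: same opcode (and same literal for constants).
        return (op, args) if op == "const" else (op, len(args))

    classes, count = number(initial_key)
    while True:
        def refined_key(name):
            if name not in instrs or instrs[name][0] == "const":
                return (classes[name],)
            # Split a class when corresponding arguments fall in different classes.
            return (classes[name],) + tuple(classes[a] for a in instrs[name][1])

        new_classes, new_count = number(refined_key)
        if new_count == count:  # classes only ever split, so equal count means stable
            return new_classes
        classes, count = new_classes, new_count


# Hypothetical SSA program: two loop counters that always hold the same value.
example = {
    "i0": ("const", ("int", 0)),
    "j0": ("const", ("int", 0)),
    "one": ("const", ("int", 1)),
    "i1": ("phi", ("i0", "i2")),
    "j1": ("phi", ("j0", "j2")),
    "i2": ("add", ("i1", "one")),
    "j2": ("add", ("j1", "one")),
    "k": ("add", ("i1", "i0")),
}
classes = partition_values(example)
```

Because the starting assumption is optimistic, the cyclic definitions of `i1`/`i2` and `j1`/`j2` end up in the same classes, which a purely hash-based pass that numbers values in program order would not necessarily discover. `k` is split away from the other `add` instructions because its second argument is in a different class.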
Owner

> if they have the same opcode all their arguments are congruent with each other

Maybe missing an "and"?


#### Implementation Notes

Getting GVN right was very finicky and required reading the text very carefully. My biggest struggles in the end were first understanding the processing algorithm, then figuring out and debugging all the edge cases that arose from not reading the paper carefully enough.
Owner

Can you expand on this a little bit? What are one or two examples of edge cases that you didn't anticipate? How about one debugging "war story" where you describe a problem that was hard to track down?


### Evaluation

For correctness, I ran my optimizations on the core benchmarks with different inputs to test whether they would cause problems. I also wrote a series of test cases for various edge cases and optimizations GVN should be able to identify.
Owner

What were the results of this experiment? Out of N total core benchmarks, how many produced the same answer after transformation?


For performance, I tested against the core benchmarks, using the same inputs as the correctness tests. I found that using only the AVAIL-based removal resulted in a median reduction of 1.5% in instructions executed relative to base SSA, and a maximum reduction across runs of around 58%. Most of the benchmarks were written directly in Bril, so they were already relatively optimized and there were few opportunities to identify congruence classes across blocks.
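
As a rough illustration of how figures like these can be derived, here is a small Python sketch that computes the percentage reduction in dynamic instruction count per benchmark and then summarizes it. The benchmark names and counts are made up for the example and are not results from this PR. The `total_dyn_inst: N` line format is an assumption about what the Bril interpreter's profiling output looks like; the helper names are hypothetical.

```python
import re
from statistics import median


def dyn_inst(profile_text):
    """Pull the count out of a line such as 'total_dyn_inst: 123' (assumed format)."""
    match = re.search(r"total_dyn_inst:\s*(\d+)", profile_text)
    if match is None:
        raise ValueError("no total_dyn_inst line found")
    return int(match.group(1))


def percent_reduction(baseline, optimized):
    """Positive means fewer instructions executed; negative means a regression."""
    return 100.0 * (baseline - optimized) / baseline


# Made-up profiling output: (base SSA run, run after GVN) per benchmark.
runs = {
    "bench_a": ("total_dyn_inst: 200", "total_dyn_inst: 190"),
    "bench_b": ("total_dyn_inst: 1000", "total_dyn_inst: 1000"),
    "bench_c": ("total_dyn_inst: 400", "total_dyn_inst: 408"),
}

reductions = {
    name: percent_reduction(dyn_inst(before), dyn_inst(after))
    for name, (before, after) in runs.items()
}
summary = {
    "median": median(reductions.values()),
    "best": max(reductions.values()),
    "worst": min(reductions.values()),  # below zero would flag an instruction-count increase
}
```

Reporting the worst case alongside the median and maximum makes it visible whether any benchmark executed more instructions after the transformation.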
Owner

Did the number of instructions ever increase, or was the worst case a mere 0% improvement? It would be interesting to know whether any "artifacts" are possible here.

@sampsyo sampsyo added the 2025sp label May 16, 2025
@aw578
Contributor Author

aw578 commented May 17, 2025

I think it would be worthwhile to at least contribute this implementation, although on the current benchmarks it rarely does anything that local value numbering couldn't. I've updated the blog post with the changes you suggested.

@sampsyo
Owner

sampsyo commented May 18, 2025

The revised version looks great; I'll publish it now!

If you think it might be fun to contribute, let's give it a shot. There's a chance that, even if the current version is pretty limited, it might be a useful foundation for future work.

@sampsyo sampsyo merged commit e5fa875 into sampsyo:2025sp May 18, 2025
2 checks passed