@gabizon103
Contributor

Closes #502

Owner

@sampsyo sampsyo left a comment

Really nice work, y'all! This is a fantastic set of outcomes, and it seems like you've set things up for even more cool results in this area. Would any of you be interested in continuing to work in this vein and possibly to come up with something publishable?

## Mixed
During testing and early evaluations, we found that some benchmarks are too small to benefit from parallelization. This is likely because running the worklist algorithm on them takes less time than spawning and joining threads. We attempted to find a heuristic, based on the number of basic blocks in a function, for switching between the sequential and parallel versions of the algorithm.

This version of the algorithm takes an integer threshold as an additional input: if the function has fewer basic blocks than the threshold, it uses the sequential algorithm; otherwise, it uses the parallel algorithm.
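
As a rough illustration of the threshold rule described above, here is a minimal Python sketch. It is not the project's code: the `run_mixed` name, the dictionary-based CFG representation, and the two solver callables are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List, TypeVar

# Assumption: a CFG is modeled as a map from basic-block label to successor labels.
CFG = Dict[str, List[str]]
R = TypeVar("R")


def run_mixed(
    cfg: CFG,
    threshold: int,
    sequential: Callable[[CFG], R],
    parallel: Callable[[CFG], R],
) -> R:
    """Run `sequential` on small functions and `parallel` on the rest.

    "Size" is the number of basic blocks. A function whose size is strictly
    below `threshold` takes the sequential path; everything else is parallel.
    """
    num_blocks = len(cfg)
    solver = sequential if num_blocks < threshold else parallel
    return solver(cfg)
```

With this shape, a threshold of 0 always selects the parallel solver, and a very large threshold always selects the sequential one, which makes it easy to compare the mixed variant against both baselines.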
Owner

Is the “size” here the number of basic blocks?

Contributor Author

Yep! Just made it clearer


For generating random Bril programs, we used [Bear](https://stephenverderame.github.io/blog/bear/), an existing fuzzer for Bril.

## Parallelizing a Single CFG
Owner

Can you say a little more about your experimental setup? The most relevant details are the core/thread count.

Contributor Author

Added a short experimental setup section here

Runtime includes the amount of time it takes to build a CFG and run the worklist algorithm on it. It does not include the time to parse a Bril program.
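
One way to measure runtime as defined above is sketched below in Python. This is an assumption about how such a harness could look, not the harness used in the project: `time_analysis`, `build_cfg`, and `run_worklist` are hypothetical names, and the program is expected to be parsed before the call so that parsing stays outside the timed region.

```python
import time
from typing import Any, Callable, Tuple


def time_analysis(
    parsed_program: Any,
    build_cfg: Callable[[Any], Any],
    run_worklist: Callable[[Any], Any],
) -> Tuple[Any, float]:
    """Return (analysis result, elapsed seconds).

    The timed region covers CFG construction and the worklist run only.
    Parsing is done by the caller beforehand, so it is not counted.
    """
    start = time.perf_counter()
    cfg = build_cfg(parsed_program)
    result = run_worklist(cfg)
    elapsed = time.perf_counter() - start
    return result, elapsed
```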

![alt text](./averages_runtime.png)
In general, our parallel algorithm appears to provide at least some speedup over the sequential one. The heuristics for all of our hybrid algorithms, however, performed poorly: at best, the hybrids are on par with the fully parallel implementation. For the reaching definitions analysis, all of the hybrid algorithms are actually slower than the sequential one, so our heuristics were probably wrong more often than they were right. It is also possible that different types of analyses require different heuristics, which we did not explore.
Owner

Here is where it seems especially important to know how many threads we have to "spend" to get this speedup.

Contributor Author

Added a short paragraph discussing speedup vs thread count here

@gabizon103 gabizon103 requested a review from sampsyo May 15, 2025 13:33
@sampsyo
Owner

sampsyo commented May 15, 2025

Looks great!! I'll hit the green button now, but do let me know if any of y'all would be interested in chatting about taking this project to its natural conclusion…

@sampsyo sampsyo merged commit af4678f into sampsyo:2025sp May 15, 2025
2 checks passed
@sampsyo sampsyo added the 2025sp label May 16, 2025

Successfully merging this pull request may close these issues.

Project Proposal: Parallelizing Dataflow Analyses
