@mariasoroka
Contributor

Closes #514

Owner

@sampsyo sampsyo left a comment

It's too bad that the original plan did not work out! This was, however, a clear explanation of the small change that you did manage to apply. There are a few places where some additional detail would be very useful.

Comment on lines 1 to 12
+++
title = "Welcome to CS 6120!"
[extra]
bio = """
Grace Hopper made the first compiler. [Adrian Sampson](https://www.cs.cornell.edu/~asampson/) is an associate professor of computer science, so that's pretty cool too I guess.
"""
[[extra.authors]]
name = "Adrian Sampson"
link = "https://www.cs.cornell.edu/~asampson/" # Links are optional.
[[extra.authors]]
name = "Grace Hopper"
+++

Please include your own title, author name, and bio.

name = "Grace Hopper"
+++

My project was based on the Dr.Jit codebase. Here is the [paper](https://dl.acm.org/doi/10.1145/3528223.3530099) that describes the compiler. In short, Dr.Jit traces the program to compute an AST, performs some optimizations on this representation, then manually assembles either LLVM IR or PTX code depending on the backend in use, and finally compiles it into a kernel.

What does "manually" mean in this context?
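
To make the question concrete, here is a minimal sketch of the trace-then-emit pipeline the quoted paragraph describes. It assumes one possible reading of "manually": the backend code is produced as text by string formatting rather than through a compiler library's builder API. The `Node`, `lift`, and `emit` names are hypothetical and are not Dr.Jit's actual data structures, and the emitted lines are LLVM-like pseudo-IR rather than IR that is guaranteed to assemble.

```python
import itertools


class Node:
    """One traced operation. Hypothetical; not Dr.Jit's actual node type."""

    _ids = itertools.count()

    def __init__(self, op, args=(), value=None):
        self.op = op
        self.args = tuple(args)
        self.value = value
        self.id = next(Node._ids)

    # Operator overloading records the computation instead of running it.
    def __add__(self, other):
        return Node("fadd", (self, lift(other)))

    def __mul__(self, other):
        return Node("fmul", (self, lift(other)))


def lift(x):
    """Wrap a Python number in a constant node; pass nodes through."""
    return x if isinstance(x, Node) else Node("const", value=float(x))


def emit(root):
    """Visit the graph in dependency order and format one text line per node."""
    lines, seen = [], set()

    def visit(n):
        if n.id in seen:
            return  # shared subexpressions are emitted only once
        seen.add(n.id)
        for a in n.args:
            visit(a)
        if n.op == "input":
            lines.append(f"%r{n.id} = load float, ptr %arg{n.value}")
        elif n.op == "const":
            lines.append(f"%r{n.id} = fadd float 0.0, {n.value!r}")
        else:
            a, b = n.args
            lines.append(f"%r{n.id} = {n.op} float %r{a.id}, %r{b.id}")

    visit(root)
    return lines


x = Node("input", value=0)
ir = emit(x * x + 2.0)
print("\n".join(ir))
```

If the post means something else by "manually" (for example, hand-written templates per operation), a sentence saying so would resolve the ambiguity.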

<img src="./2025-05-13-Final_Project/graph_old.png" alt="drawing" height="300"/>
<img src="./2025-05-13-Final_Project/graph_new.png" alt="drawing" height="300"/>

To better test the optimization and evaluate the performance, I planned to render the three scenes shown in Fig. 6 of the Dr.Jit paper. However, I noticed that during rendering, the optimization was never invoked. To address this, I modified the renderer code to make it less efficient, ensuring that there will be nodes to which the optimization can be applied. I rendered all the scenes with and without my optimization five times to get average times. The results are reported below.

Can you say a tiny bit more about the change you applied? Did it involve manually going the "opposite direction" from your transformation (i.e., replacing a cos expression with a sin expression)?


<img src="./2025-05-13-Final_Project/evaluation.png" alt="drawing" width="300"/>

Before showing your results, can you briefly say something about your experimental setup (hardware/OS, Dr.Jit version, etc.) and your approach to measurement (how did you measure the execution time, how many replicas did you use, etc.)?
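
As an illustration of the kind of measurement detail being asked for, here is a minimal timing-harness sketch. The five replicas match the count stated in the post; the warm-up call, the use of `time.perf_counter`, and the `time_runs` and `summarize` names are assumptions, not a description of how the reported numbers were actually collected. It also assumes the timed function returns only after the work has finished, which for a JIT-compiled GPU kernel may require an explicit synchronization step.

```python
import statistics
import time


def time_runs(fn, replicas=5, warmup=1):
    """Call fn() `warmup` times untimed, then time `replicas` calls."""
    for _ in range(warmup):
        fn()  # assumed to absorb one-time costs such as JIT compilation
    samples = []
    for _ in range(replicas):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Report the mean alongside spread, so small differences can be judged."""
    return {
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        "min": min(samples),
    }


# Placeholder workload standing in for one render call.
samples = time_runs(lambda: sum(range(100_000)), replicas=5)
print(summarize(samples))
```

Reporting the standard deviation next to the mean makes it easier to tell whether "did not improve" means "no measurable difference" or "slightly slower".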


Well, my optimization did not improve the performance, but at least I know that the trace modification was correct, since the produced images were identical.

The second part of the project was much less straightforward, and I was not able to figure it out. As described in the project proposal, the idea was to trace functions that lack hardware support and cannot be represented by a single node into a separate trace, and then redirect the main AST to that newly created trace. I was unable to find a way to achieve this using the existing tools in the codebase. Implementing this optimization would require introducing a new type of node (e.g., `call`) and writing the corresponding PTX or LLVM IR code to support it.

"As described in the project proposal"

Maybe it would be a good idea to link to the proposal, so people can read it if they want.
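
For readers without the proposal at hand, the `call` node idea in the quoted paragraph can be pictured with a rough sketch like the one below. Everything in it is hypothetical: the `Trace` class, the `slow_fn` placeholder, and the pseudo-code output are illustrations only, not Dr.Jit's representation and not the design from the proposal. The point it shows is that the helper's body is recorded in its own trace and emitted once, while the main trace only refers to it.

```python
from dataclasses import dataclass, field


@dataclass
class Trace:
    """A flat list of (op, args) pairs. Hypothetical representation."""

    name: str
    ops: list = field(default_factory=list)

    def add(self, op, *args):
        self.ops.append((op, args))
        return f"v{len(self.ops) - 1}"  # name of the value just defined


def emit(trace):
    """Format one trace as pseudo-code, one line per recorded op."""
    body = [
        f"  v{i} = {op}({', '.join(args)})"
        for i, (op, args) in enumerate(trace.ops)
    ]
    return [f"func {trace.name}:"] + body


# Separate trace for a function assumed to lack a single-node form.
# The name "slow_fn" and its body are placeholders.
helper = Trace("slow_fn")
a = helper.add("param", "0")
b = helper.add("mul", a, a)
helper.add("ret", b)

# The main trace redirects to the helper through `call` nodes.
main = Trace("kernel")
x = main.add("load", "in0")
y = main.add("call", helper.name, x)
z = main.add("call", helper.name, y)
main.add("store", "out0", z)

program = emit(helper) + emit(main)
print("\n".join(program))
```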

sampsyo commented May 15, 2025

Looks good; I'll publish this now!

@sampsyo sampsyo merged commit c74bc79 into sampsyo:2025sp May 15, 2025
2 checks passed
@sampsyo sampsyo added the 2025sp label May 16, 2025

Development

Successfully merging this pull request may close these issues.

Project Proposal: Optimizing code with trigonometric functions

2 participants