[offload] Redesign the ELF format for device-side binaries. #139037

Closed
StevenYangCC opened this issue May 8, 2025 · 10 comments
Labels
offload, question (A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead!)

Comments

@StevenYangCC

Redesign the ELF format for device-side binaries.
The original binary format is not suitable for device-side binaries.
For example:
Provide a callgraph section to enable binary optimizations at link time, as well as support for lazy loading.
Provide a prototype section to support function pointers.
Provide a section for recording metadata for each function, such as register kind, register number, stack size, etc. This metadata should be extensible, as different architectures may require different metadata, and the attributes of kernel functions and regular functions may also vary.
Moderately reference the CUDA cubin format for inspiration.
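
One way to picture the "extensible per-function metadata" item above is a tag-length-value record per function, where a reader skips any tag it does not recognize. The following is only a minimal sketch under stated assumptions: the tag names and numbers, the record layout, and the helper functions are all hypothetical and do not come from LLVM, the AMDGPU code object format, or cubin.

```python
import struct

# Hypothetical attribute tags; none of these come from an existing ELF spec.
TAG_REG_COUNT = 1
TAG_STACK_SIZE = 2
TAG_IS_KERNEL = 3


def encode_function_record(symbol_index, attrs):
    """Pack one function's attributes as little-endian tag-length-value entries."""
    payload = b""
    for tag, value in sorted(attrs.items()):
        data = struct.pack("<I", value)
        payload += struct.pack("<HH", tag, len(data)) + data
    # Header: symbol table index of the function, then payload size in bytes.
    return struct.pack("<II", symbol_index, len(payload)) + payload


def decode_function_records(section, known_tags):
    """Parse the section bytes, skipping any tag the reader does not know."""
    records = {}
    offset = 0
    while offset < len(section):
        symbol_index, size = struct.unpack_from("<II", section, offset)
        offset += 8
        end = offset + size
        attrs = {}
        while offset < end:
            tag, length = struct.unpack_from("<HH", section, offset)
            offset += 4
            if tag in known_tags and length == 4:
                (attrs[tag],) = struct.unpack_from("<I", section, offset)
            offset += length  # unknown tags are skipped using their length
        records[symbol_index] = attrs
    return records


# A kernel with three attributes and a regular function with only one.
section = (
    encode_function_record(
        7, {TAG_REG_COUNT: 32, TAG_STACK_SIZE: 256, TAG_IS_KERNEL: 1}
    )
    + encode_function_record(9, {TAG_REG_COUNT: 16})
)

full = decode_function_records(
    section, {TAG_REG_COUNT, TAG_STACK_SIZE, TAG_IS_KERNEL}
)
older = decode_function_records(section, {TAG_REG_COUNT})
```

In this sketch a function pays only for the attributes it actually carries, and a reader that knows fewer tags (`older`) still parses the section. Whether such a layout would suit any real device runtime is an open question that the thread does not settle.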

@llvmbot
Member

llvmbot commented May 8, 2025

@llvm/issue-subscribers-offload

@jhuber6
Contributor

jhuber6 commented May 8, 2025

I also do not understand this issue. Most of what you're describing is handled by the GPU runtime's loader, which is not something we have access to from offload/. We already have metadata for things like this in AMDGPU at least, unless you're talking about using that information with a call graph to generate the launch packet metadata?

@Artem-B
Member

Artem-B commented May 8, 2025

@StevenYangCC if you could elaborate on the issue(s) that prompt this request, it would be very helpful to figure out how those issues should be addressed.

@StevenYangCC
Author

The ELF format that LLVM generates is very rigid: it cannot be handled flexibly across different architectures, and it does not meet the needs of heterogeneous computing architectures.

@jhuber6
Contributor

jhuber6 commented May 9, 2025

The ELF format that LLVM generates is very rigid: it cannot be handled flexibly across different architectures, and it does not meet the needs of heterogeneous computing architectures.

This is completely vague so I'm just going to close this.

jhuber6 closed this as completed May 9, 2025

@StevenYangCC
Author

StevenYangCC commented May 9, 2025

@jhuber6 What I mean is that the device-side ELF format should be redesigned rather than reusing the host-side ELF format, since the two have different needs. For example, the metadata in the device-side ELF for AMDGPU is currently one large, all-encompassing structure, and many unneeded attributes take up space. We could design a binary ELF format that applies to the device side of all heterogeneous architectures, to which each architecture can flexibly add sections, symbols, or attributes.

@jhuber6
Contributor

jhuber6 commented May 9, 2025

@jhuber6 What I mean is that the device-side ELF format should be redesigned rather than reusing the host-side ELF format, since the two have different needs. For example, the metadata in the device-side ELF for AMDGPU is currently one large, all-encompassing structure, and many unneeded attributes take up space. We could design a binary ELF format that applies to the device side of all heterogeneous architectures, to which each architecture can flexibly add sections, symbols, or attributes.

I do not know what this means. Feel free to write up a design document and contribute patches.

@StevenYangCC
Author

StevenYangCC commented May 9, 2025

@jhuber6 If you describe your doubts in detail, I will do my best to address them clearly.

EugeneZelenko added the "question" label May 9, 2025

@Artem-B
Member

Artem-B commented May 9, 2025

Moderately reference the CUDA cubin format for inspiration.

That request and responses read like a chat with an LLM. I wonder if we're being curl'ed here? https://arstechnica.com/gadgets/2025/05/open-source-project-curl-is-sick-of-users-submitting-ai-slop-vulnerabilities/

@jhuber6
Contributor

jhuber6 commented May 10, 2025

Moderately reference the CUDA cubin format for inspiration.

That request and responses read like a chat with an LLM. I wonder if we're being curl'ed here? https://arstechnica.com/gadgets/2025/05/open-source-project-curl-is-sick-of-users-submitting-ai-slop-vulnerabilities/

I said as much in #139039 which is why I closed the issue.
