Single source for both CPU and GPU code possible? #49
Comments
I'd actually asked about this on the […]. I haven't had the time to check whether what was proposed there would work, though, as I've still been working on the design of the library.
Couldn't you also check the […]
This is a hard problem. I think it's kind of impossible to solve without either a rustc fork or some weird hacks, primarily because `cfg` is a thing: we would need to either compile everything twice, or use the CPU target for GPU codegen, which causes its own problems. It's just a hard issue overall.
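To illustrate why `cfg` gets in the way, here is a minimal sketch. It assumes the GPU build uses the `nvptx64-nvidia-cuda` target, for which `target_os` is `"cuda"`; the function names are made up for illustration. Each compilation keeps only one of the two bodies, so a single rustc invocation cannot produce both the CPU and the GPU variant of a crate.

```rust
// Hypothetical helper: reports which variant of the code was compiled in.
// cfg is resolved once per compilation, so any given build contains exactly
// one of the two definitions below.
#[cfg(target_os = "cuda")]
pub fn backend_name() -> &'static str {
    "gpu"
}

#[cfg(not(target_os = "cuda"))]
pub fn backend_name() -> &'static str {
    "cpu"
}

// cfg! is evaluated for the same single target, so it always agrees with
// whichever definition survived.
pub fn compiled_for_gpu() -> bool {
    cfg!(target_os = "cuda")
}

fn main() {
    println!("compiled for: {}", backend_name());
}
```

Built for an ordinary host target, only the `"cpu"` definition exists; getting the `"gpu"` one requires a second compilation of the same source for the CUDA target.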
Thanks for your reply. I just came across an abandoned project […]
StupidQ: would declaring the function as […]
Technically yes, but that will still abide by the CPU target's `cfg` settings, which causes problems. You would also need another custom codegen that delegates to cg_llvm for some things and to nvvm for others. On top of that, it would probably require forking cg_llvm to make some things public... it's a mess.
Seems like a good place for a function macro then |
Hello. Thanks for this awesome project. I can now compile CUDA kernels written in Rust into PTX in one Cargo package and use them in another package. Now I wonder whether it is possible to write both the kernels and the CPU code within one package, or even one Rust source file. For example, it might look like this:
Unfortunately, this does not seem possible at the moment. However, I think it is purely a matter of some convenience macros. For example, we could define the `kernel` macro so that, instead of directly marking the function as a kernel, it launches a separate cargo build with cuda_builder and replaces the CPU code with a lazy_static PTX module import. This would make it easier to manage dependencies and reuse code between the CPU and GPU sides, and would save some boilerplate. I don't know if you are interested. It might be harder than I think, though :(
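A rough sketch of what the CPU half of this idea could expand to. This is not something Rust-CUDA provides: `kernel_stub!` is a hypothetical declarative macro, the PTX string is a placeholder rather than real compiler output, and `std::sync::OnceLock` stands in for lazy_static to keep the example dependency-free. Launching the separate cuda_builder build is left out entirely; that part would have to live in a build script or a proc macro.

```rust
// Hypothetical CPU-side expansion: for each kernel, generate a module that
// exposes the kernel's name and a lazily initialised handle to its PTX.
// The second argument is a placeholder; a real setup would more likely embed
// the file emitted by the separate GPU build.
macro_rules! kernel_stub {
    ($name:ident, $ptx:expr) => {
        pub mod $name {
            use std::sync::OnceLock;

            // Name the host code would look the kernel up by.
            pub const NAME: &str = stringify!($name);

            // Initialised on first call, similar to what lazy_static offers.
            pub fn ptx() -> &'static str {
                static PTX: OnceLock<&'static str> = OnceLock::new();
                *PTX.get_or_init(|| $ptx)
            }
        }
    };
}

// Stands in for a function annotated with the proposed kernel macro.
kernel_stub!(add, "// placeholder PTX for add");

fn main() {
    println!("kernel `{}` carries {} bytes of PTX", add::NAME, add::ptx().len());
}
```

The host code would then hand `add::ptx()` to whatever module-loading API it uses, while the kernel body itself is compiled only in the GPU build.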