Description
With 8 GB of memory (which can be emulated with ulimit -v 8000000), not all examples and benchmarks currently compile, and not all unit tests run. I'm not sure what limit is reasonable, but if it is more than 8 GB (or whatever minimum we think we can expect everyone to have), we should document the requirement. We should also check in CI that we stay within whichever limit we choose, by setting ulimit there.
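For anyone who wants to reproduce the limit from inside a test harness rather than from the shell, here is a minimal sketch of what ulimit -v does underneath, assuming a Linux/POSIX system. The helper names (kib_to_bytes, set_address_space_limit_kib, current_address_space_limit_bytes) are hypothetical and not part of this repository. Note that bash's ulimit -v takes its argument in units of 1024 bytes, so 8000000 is roughly 8.19 GB rather than exactly 8 GB.

```cpp
#include <sys/resource.h>  // getrlimit, setrlimit, RLIMIT_AS (POSIX)
#include <cstdint>

// Hypothetical helper: bash's `ulimit -v` argument is in KiB (1024-byte units).
constexpr std::uint64_t kib_to_bytes(std::uint64_t kib) { return kib * 1024ULL; }

// Hypothetical helper: lower the soft virtual-memory limit of the current
// process (and of any children it starts later). Returns false if the limit
// cannot be applied, e.g. because it exceeds the hard limit.
inline bool set_address_space_limit_kib(std::uint64_t kib) {
  struct rlimit lim {};
  if (getrlimit(RLIMIT_AS, &lim) != 0) return false;
  const rlim_t wanted = static_cast<rlim_t>(kib_to_bytes(kib));
  if (lim.rlim_max != RLIM_INFINITY && wanted > lim.rlim_max) return false;
  lim.rlim_cur = wanted;  // only the soft limit changes; the hard limit is kept
  return setrlimit(RLIMIT_AS, &lim) == 0;
}

// Hypothetical helper: read back the current soft limit in bytes.
// Returns 0 if the query itself fails.
inline std::uint64_t current_address_space_limit_bytes() {
  struct rlimit lim {};
  if (getrlimit(RLIMIT_AS, &lim) != 0) return 0;
  return static_cast<std::uint64_t>(lim.rlim_cur);
}
```

Calling set_address_space_limit_kib(8000000) at the start of a test runner should behave like running it under ulimit -v 8000000, but I have not tried this against our CI setup.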
06_pvc_prefill_attention.cpp, 10_pvc_prefill_attention_cachedKV.cpp, and benchmarks/main.cpp fail to compile. We instantiate many templates in the same source file. The easiest solution might be to split each source file into multiple ones so that each translation unit instantiates fewer templates. We could also try to understand why each template instantiation requires so much memory.
cutlass_test_unit_gemm_device_tensorop_xe_group_gemm fails to run. It appears to me that we over-allocate the buffers, but I might have misunderstood something. If this test actually requires more memory than our documented minimum, we should automatically disable it when insufficient memory is available, rather than reporting it as failing.
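As a rough illustration of the "disable when memory is insufficient" idea, the sketch below estimates how much host memory the process may use and compares it with what a test needs. It assumes Linux (sysconf(_SC_PHYS_PAGES) is a glibc extension, not strict POSIX), and the helper names host_memory_budget_bytes and has_enough_memory are hypothetical. It only looks at host memory and the ulimit -v limit; device memory would need a separate query.

```cpp
#include <sys/resource.h>  // getrlimit, RLIMIT_AS
#include <unistd.h>        // sysconf
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical helper: upper bound (in bytes) on the memory this process can
// use, taken as the smaller of the soft RLIMIT_AS limit and physical RAM.
inline std::uint64_t host_memory_budget_bytes() {
  std::uint64_t budget = std::numeric_limits<std::uint64_t>::max();

  // Respect a limit set via `ulimit -v`, if there is one.
  struct rlimit lim {};
  if (getrlimit(RLIMIT_AS, &lim) == 0 && lim.rlim_cur != RLIM_INFINITY) {
    budget = static_cast<std::uint64_t>(lim.rlim_cur);
  }

  // Also cap by installed physical memory (assumption: Linux/glibc).
  const long pages = sysconf(_SC_PHYS_PAGES);
  const long page_size = sysconf(_SC_PAGESIZE);
  if (pages > 0 && page_size > 0) {
    const std::uint64_t physical =
        static_cast<std::uint64_t>(pages) * static_cast<std::uint64_t>(page_size);
    budget = std::min(budget, physical);
  }
  return budget;
}

// Hypothetical helper: true if a test needing `required_bytes` can be expected
// to fit. In a GoogleTest test body one could call GTEST_SKIP() when this
// returns false, so the test shows up as skipped instead of failed.
inline bool has_enough_memory(std::uint64_t required_bytes) {
  return required_bytes <= host_memory_budget_bytes();
}
```

The required size would have to come from the test itself, for example the sum of the buffer sizes it is about to allocate. I have not measured what that is for the group GEMM test.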