[SYCL] Overcoming workaround for mmap() allocation on Windows and remove useless wait #13482
+19
−40
This PR stops using, on Windows, a workaround for an mmap bug that affects some Intel GPUs on Linux. The bug is not present on Windows, so there is no reason to keep the workaround in place there.
This introduces a small OS-dependent split in the codebase, but it yields good performance improvements.
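As a rough illustration of what such an OS-dependent split can look like, here is a minimal sketch in plain C++. It assumes the workaround consists of staging mmap-backed data through an ordinary host buffer before the device copy; the PR description does not spell this out. The names `device_copy` and `upload_tensor_data` are hypothetical, and `std::memcpy` merely stands in for a copy enqueued on a SYCL queue.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a device copy. A real backend would enqueue a
// memcpy on a SYCL queue here instead of calling std::memcpy.
inline void device_copy(void *dst, const void *src, std::size_t size) {
    std::memcpy(dst, src, size);
}

// Hypothetical helper: upload `size` bytes whose source may be mmap()-backed.
inline void upload_tensor_data(void *dst, const void *src, std::size_t size) {
    if (size == 0) {
        return;  // nothing to copy
    }
#ifndef _WIN32
    // Assumed workaround path (non-Windows): copy into a regular host buffer
    // first, then copy from that buffer to the destination.
    std::vector<unsigned char> staging(size);
    std::memcpy(staging.data(), src, size);
    device_copy(dst, staging.data(), size);
#else
    // Windows path: the bug is not present, so copy directly and skip the
    // extra allocation and host-side memcpy.
    device_copy(dst, src, size);
#endif
}
```

Both branches produce the same bytes at the destination; the direct path simply avoids one allocation and one extra copy per upload, which is where a speed-up would be expected to come from.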
Moreover, it also removes some `wait()` calls on copies that are not necessary in the SYCL backend, because it uses in-order queues. The work introduced here is based on #13109.
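To see why an intermediate wait after a copy is redundant when commands run in submission order, consider the toy model below. It is not the SYCL API: `ToyInOrderQueue` and `copy_then_double` are hypothetical names, and the queue simply defers commands and runs them first-in, first-out. The point it sketches is that a later command always observes the result of an earlier copy, so only the final synchronization before the host reads the data is needed.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Toy model of an in-order queue (not the SYCL API): commands are deferred
// and executed strictly in submission order when wait() is called.
class ToyInOrderQueue {
public:
    void submit(std::function<void()> cmd) { pending_.push(std::move(cmd)); }

    // Drain every pending command in FIFO order.
    void wait() {
        while (!pending_.empty()) {
            std::function<void()> cmd = std::move(pending_.front());
            pending_.pop();
            cmd();
        }
    }

private:
    std::queue<std::function<void()>> pending_;
};

// Copy `host` into a pretend device buffer, then double every element.
// There is no wait() between the copy and the compute step: ordering alone
// guarantees the compute step sees the copied data.
inline std::vector<int> copy_then_double(const std::vector<int> &host) {
    ToyInOrderQueue q;
    std::vector<int> device;
    q.submit([&] { device = host; });                     // "copy"
    q.submit([&] { for (int &v : device) { v *= 2; } });  // "kernel"
    q.wait();  // single synchronization before the host reads the result
    return device;
}
```

In this model, adding a `wait()` between the two `submit` calls would not change the result; it would only force the host to synchronize earlier than necessary.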
N.B. All numbers were measured with `GGML_SYCL_DISABLE_OPT=0`.
Lunar Lake's performance (this PR)
build: 0e1009f (5334)
Lunar Lake's performance (#13109)
build: f7e7d2a (5331)
Battlemage (B580) performance (this PR)
build: 0e1009f (5334)
Battlemage (B580) performance (#13109)
build: f7e7d2a (5331)