Chunked Prefill VLM #3188
Merged
Commits (31)
92909f3 add logic (mht-sharma)
44ed5ef working (mht-sharma)
526a878 add encoder cache free (mht-sharma)
b86919a fixes (mht-sharma)
52e4186 fix idefics (mht-sharma)
7237e8e update pixel_values (mht-sharma)
be8e60a add improvements (mht-sharma)
6ed540b add improvements (mht-sharma)
46ff016 improve (mht-sharma)
f34b06c nit (mht-sharma)
26212b9 fix inputs_embeds (mht-sharma)
2f67c53 nit (mht-sharma)
6545cdd optimizations (mht-sharma)
136b989 add prometheus port (mht-sharma)
63ddba2 rename vars (mht-sharma)
f1da19d rename vars (mht-sharma)
dd91b60 nit (mht-sharma)
1592621 disable chunking for qwen (mht-sharma)
d58ec38 review comments (mht-sharma)
8015f5f Merge branch 'main' into add_vlm_chunking_optimized (mht-sharma)
b86a73d remove port (mht-sharma)
36c5ec2 improve headdim (mht-sharma)
3bb514d remove kwargs and redundant args (mht-sharma)
419ecd0 fix qwen2_5 (mht-sharma)
60b8cb0 fix config image_token_id error (mht-sharma)
534a16d fix test (mht-sharma)
61ccbf6 update paligemma (mht-sharma)
5cfd4b1 fix paligemma text (mht-sharma)
d1cf64a minor fix (mht-sharma)
9964731 fix qwen test (mht-sharma)
6a5955a fix qwen test (mht-sharma)
fix config image_token_id error
commit 60b8cb0e46838ca8681a3ff1c4e438744ba7b86b
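The changed files are not shown on this page, so the actual diff is unknown. Purely as a hedged illustration of the kind of guard a "fix config image_token_id error" commit usually involves (the helper name and the `image_token_index` fallback are assumptions, not taken from this PR):

```python
def get_image_token_id(config):
    # Hypothetical guard, not the actual diff: some model configs expose
    # `image_token_id`, others `image_token_index`, so fall back instead of
    # raising an AttributeError when reading the field off the config.
    token_id = getattr(config, "image_token_id", None)
    if token_id is None:
        token_id = getattr(config, "image_token_index", None)
    if token_id is None:
        raise ValueError("config defines neither image_token_id nor image_token_index")
    return token_id
```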
Can we not have both, or either, input_ids and input_embeds? It seems like an antipattern to accept both, since we don't know which one is valid. It's totally fine to handle the embeddings before this step and only accept input_embeds, imho.
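For illustration only, a minimal sketch of the pattern the comment suggests, with a hypothetical helper name not taken from this PR: resolve the embeddings from input_ids (and any vision-encoder features) before the model call, so the language model's forward only ever accepts inputs_embeds.

```python
import torch

def prepare_inputs_embeds(input_ids, embed_tokens, image_features=None, image_token_id=None):
    # Resolve text-token embeddings up front so the model forward only ever
    # receives inputs_embeds, never a mix of input_ids and inputs_embeds.
    inputs_embeds = embed_tokens(input_ids)
    if image_features is not None:
        # Scatter the vision-encoder outputs into the image placeholder
        # positions; assumes one feature row per placeholder token.
        mask = input_ids == image_token_id
        inputs_embeds[mask] = image_features.to(inputs_embeds.dtype)
    return inputs_embeds

# Usage (names hypothetical): the caller resolves embeddings first, then the
# model call takes only inputs_embeds.
# inputs_embeds = prepare_inputs_embeds(input_ids, model.embed_tokens,
#                                        image_features, image_token_id)
# logits = model(inputs_embeds=inputs_embeds, position_ids=position_ids)
```

This keeps the ambiguity out of the forward signature: the caller decides once how to build the embeddings, and the model never has to guess which of two overlapping inputs is authoritative.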