Skip to content

Conversation

@ufownl
Copy link
Contributor

@ufownl ufownl commented Mar 13, 2024

Just add a compiler flag like GEMMA_MAX_SEQLEN.

Copy link
Member

@jan-wassenberg jan-wassenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@jan-wassenberg jan-wassenberg added the copybara-import Trigger Copybara for merging pull requests label Mar 13, 2024
@austinvhuang
Copy link
Contributor

austinvhuang commented Mar 14, 2024

This is alright for now - but full disclosure - kSeqLen was a bit of a hack (in a bad way) that I'd like to get rid of sooner rather than later. It's very tempting to couple code to it as a global variable.

I think the general direction in the future is to move most configuration (except for a very select few items) from comptime constants to runtime variables for more flexibility and not having to run cmake to change something about the inference and to more flexibly combine different model capabilities in a single application.

One mitigation is to try to write code to still take top-k parametrically in functions (especially low-level ones) and avoid the temptation to reference kTopK (or any of these comptime constants) directly except at the very high-level application code, where it can be more easily replaced by a runtime value.

@copybara-service copybara-service bot merged commit 8fb44ed into google:dev Mar 14, 2024
@ufownl
Copy link
Contributor Author

ufownl commented Mar 14, 2024

This is alright for now - but full disclosure - kSeqLen was a bit of a hack (in a bad way) that I'd like to get rid of sooner rather than later. It's very tempting to couple code to it as a global variable.

I think the general direction in the future is to move most configuration (except for a very select few items) from comptime constants to runtime variables for more flexibility and not having to run cmake to change something about the inference and to more flexibly combine different model capabilities in a single application.

One mitigation is to try to write code to still take top-k parametrically in functions (especially low-level ones) and avoid the temptation to reference kTopK (or any of these comptime constants) directly except at the very high-level application code, where it can be more easily replaced by a runtime value.

Agreed, when I read this part of the code I was wondering why TopK uses compile-time parameters instead of in InferenceArgs (and RuntimeConfig added later). It turns out you guys thought so too. 😂

@austinvhuang
Copy link
Contributor

Yes, one distinction b/w InferenceArgs and RuntimeConfig - in general command line arguments are specific to an application/frontend so we are trying to keep those concerns (argument information, data validations, error handling) decoupled from the core library.

One can think of run.cc + app.h as a sort of "default app" (that uses InferenceArgs as one of its sets of CLI args) and distinct from the library concerns in gemma.cc / gemma.h.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

copybara-import Trigger Copybara for merging pull requests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants