
Is it correct to keep using adapter_kv_cache during training in litgpt/adapter.py? #1287

Open
@fleetfootwork

Description

Is it correct to keep using adapter_kv_cache during training in litgpt/adapter.py? Since self.adapter_wte and self.attn are updated by the optimizer during training, ak and av read from a stale cache would no longer match the current weights, so I would expect the cache to be bypassed while training. However, it seems that self.adapter_kv_cache is also used during training. Thank you very much!
[Screenshot: the branch in litgpt/adapter.py where self.adapter_kv_cache is read instead of recomputing ak and av]
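For context, here is a minimal sketch of the caching pattern being asked about. The attribute names (adapter_wte, attn, adapter_kv_cache) follow adapter.py, but the module structure, shapes, and the adapter_kv helper are simplified assumptions for illustration, not the actual litgpt implementation:

```python
# Minimal sketch of the adapter KV caching pattern in question.
# NOT the actual litgpt code: shapes and the adapter_kv helper are
# simplified assumptions; only the attribute names follow adapter.py.
import torch
import torch.nn as nn


class AdapterAttentionSketch(nn.Module):
    def __init__(self, n_embd: int = 64, adapter_prompt_length: int = 10) -> None:
        super().__init__()
        # learnable adapter prompt embeddings -- updated by the optimizer
        self.adapter_wte = nn.Embedding(adapter_prompt_length, n_embd)
        # q/k/v projection -- also updated by the optimizer
        self.attn = nn.Linear(n_embd, 3 * n_embd)
        # cache for the adapter's key/value projections
        self.adapter_kv_cache: tuple[torch.Tensor, torch.Tensor] | None = None

    def adapter_kv(self) -> tuple[torch.Tensor, torch.Tensor]:
        if self.adapter_kv_cache is not None:
            # Cached path: ak and av were computed from the weights as they
            # were on the first forward pass. If adapter_wte or attn have
            # since been updated by an optimizer step, these tensors are
            # stale -- this is the behaviour the issue asks about.
            return self.adapter_kv_cache
        prefix = self.adapter_wte.weight.unsqueeze(0)   # (1, aT, n_embd)
        _, ak, av = self.attn(prefix).chunk(3, dim=-1)  # split q/k/v; keep k, v
        self.adapter_kv_cache = (ak, av)
        return ak, av
```

Under this sketch, reusing the cache is safe at inference time, when the weights are frozen, but during training ak and av would go stale after every optimizer step unless the cache is reset each iteration.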
