[Feature]: Add process_weights_after_loading to AttentionImpl #26817

### 🚀 The feature, motivation and pitch

Currently, the `Attention` layer checks whether `process_weights_after_loading` exists on the impl, calls it conditionally, and then applies FlashInfer-specific logic.

Instead, we should add a `process_weights_after_loading` method to `AttentionImpl` that is a no-op by default, call it unconditionally from `Attention.process_weights_after_loading`, and override it in `FlashInferAttentionImpl`.
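A rough sketch of the proposed shape, using simplified stand-in classes rather than the real vLLM ones. The `act_dtype` parameter, `PlainAttentionImpl`, and the `post_load_done` flag are assumptions made for illustration only; the actual signatures and the FlashInfer-specific logic may differ.

```python
from abc import ABC, abstractmethod


class AttentionImpl(ABC):
    """Simplified stand-in for the backend implementation base class."""

    @abstractmethod
    def forward(self, x):
        ...

    def process_weights_after_loading(self, act_dtype):
        # No-op by default; backends override this only if they need
        # post-load work.
        pass


class PlainAttentionImpl(AttentionImpl):
    """Hypothetical backend that needs no post-load processing."""

    def forward(self, x):
        return x


class FlashInferAttentionImpl(AttentionImpl):
    def __init__(self):
        self.post_load_done = False  # hypothetical flag, for illustration

    def forward(self, x):
        return x

    def process_weights_after_loading(self, act_dtype):
        # The FlashInfer-specific logic would move here from Attention.
        self.post_load_done = True


class Attention:
    def __init__(self, impl):
        self.impl = impl

    def process_weights_after_loading(self, act_dtype):
        # Unconditional call: no existence check, no backend-specific branch.
        self.impl.process_weights_after_loading(act_dtype)
```

With this layout, `Attention` no longer needs to know which backend it wraps: backends without post-load work inherit the no-op, and FlashInfer keeps its logic next to the rest of its implementation.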

### Alternatives

_No response_

### Additional context

https://github.com/vllm-project/vllm/pull/23016#discussion_r2414787224

### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.
