
Commit 9c4b960

Feature(MInference): update FAQ
1 parent: 3581688

2 files changed: 4 additions, 2 deletions


README.md

Lines changed: 2 additions & 1 deletion
@@ -91,8 +91,9 @@ Firstly, attention is dynamically sparse, a characteristic inherent to the mecha
 
 **Q3: Does this dynamic sparse attention pattern only exist in Auto-regressive LMs or RoPE based LLMs?**
 
-Similar vertical and slash line sparse patterns were discovered during the BERT era [1]. Our analysis of T5's attention patterns, shown in the figure, reveals these patterns persist across different heads, even in bidirectional attention.<br/>
+Similar vertical and slash line sparse patterns have been discovered in BERT[1] and multi-modal LLMs[2]. Our analysis of T5's attention patterns, shown in the figure, reveals these patterns persist across different heads, even in bidirectional attention.<br/>
 [1] SparseBERT: Rethinking the Importance Analysis in Self-Attention, ICML 2021.<br/>
+[2] LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference, 2024.<br/>
 <div style="text-align: center;">
 <img src="images/t5_sparse_pattern.png" width="600px" style="margin:auto;border-radius: 5px;display: inline-block;padding: 0 0 0 10px;" alt=''>
 </div>
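For context on the terminology in this FAQ answer: a "vertical" line is a key column that most queries attend to (e.g. an attention sink), and a "slash" line is a diagonal, i.e. a fixed query-key offset. The sketch below shows one way such patterns could be scored in a single head's attention map; it is a minimal illustration assuming NumPy, with invented function names and a simple attention-mass heuristic, not MInference's actual detection code.

```python
# Minimal sketch (illustrative only, not MInference's implementation):
# score "vertical" key columns and "slash" diagonals in an attention map.
import numpy as np

def vertical_and_slash_scores(attn: np.ndarray, top_k: int = 4):
    """attn: (n, n) row-normalized attention map (rows = queries, cols = keys)."""
    n = attn.shape[0]
    # Vertical pattern: total attention mass each key column receives.
    vertical_idx = np.argsort(attn.sum(axis=0))[-top_k:][::-1]
    # Slash pattern: total attention mass along each diagonal offset
    # (offset 0 is the main diagonal; both signs matter for bidirectional models).
    offsets = np.arange(-(n - 1), n)
    diag_mass = np.array([np.diagonal(attn, offset=o).sum() for o in offsets])
    slash_offsets = offsets[np.argsort(diag_mass)[-top_k:][::-1]]
    return vertical_idx, slash_offsets

# Toy check: plant a vertical line at key column 0 (an "attention sink")
# and a slash line on the main diagonal, then recover both.
rng = np.random.default_rng(0)
A = rng.random((64, 64)) + 5.0 * np.eye(64)
A[:, 0] += 5.0
A /= A.sum(axis=1, keepdims=True)
cols, diags = vertical_and_slash_scores(A)
print(cols[0], diags[0])  # -> 0 0: column 0 and offset 0 dominate
```

Summing attention mass per column and per diagonal is enough to recover the planted patterns here; any comparable heuristic (e.g. averaging over only the last few queries) would serve the same illustrative purpose.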

Transparency_FAQ.md

Lines changed: 2 additions & 1 deletion
@@ -38,8 +38,9 @@ Firstly, attention is dynamically sparse, a characteristic inherent to the mecha
 
 ## Does this dynamic sparse attention pattern only exist in Auto-regressive LMs or RoPE based LLMs?
 
-Similar vertical and slash line sparse patterns were discovered during the BERT era [1]. Our analysis of T5's attention patterns, shown in the figure, reveals these patterns persist across different heads, even in bidirectional attention.<br/>
+Similar vertical and slash line sparse patterns have been discovered in BERT[1] and multi-modal LLMs[2]. Our analysis of T5's attention patterns, shown in the figure, reveals these patterns persist across different heads, even in bidirectional attention.<br/>
 [1] SparseBERT: Rethinking the Importance Analysis in Self-Attention, ICML 2021.<br/>
+[2] LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference, 2024.<br/>
 <div style="text-align: center;">
 <img src="images/t5_sparse_pattern.png" width="600px" style="margin:auto;border-radius: 5px;display: inline-block;padding: 0 0 0 10px;" alt=''>
 </div>
