Commit 052069b

JackWeiw authored and yao-fengchen committed
ascend: align attention mask to 32bytes (#7)
1 parent: 2d654fd

File tree: 1 file changed (+1, -1)

lmdeploy/pytorch/engine/devices/ascend.py

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ def update_step_context(cls, step_context):
             single_attention_mask = torch.logical_not(
                 torch.tril(
                     torch.ones(step_context.q_seq_length[i],
-                               step_context.kv_seq_length[i],
+                               (step_context.kv_seq_length[i] + 31) & (~31),
                                dtype=torch.bool).cuda(),
                     diagonal=step_context.kv_seq_length[i] -
                     step_context.q_seq_length[i],
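
The one-line change rounds the mask's key-length dimension up to the next multiple of 32 with the standard align-up bit trick (n + 31) & ~31; since the mask is dtype=torch.bool (one byte per element in PyTorch), each row then occupies a multiple of 32 bytes, matching the commit title. A minimal, self-contained sketch of the trick (align_up_32 is an illustrative helper, not part of the patch):

# Align-up: adding 31 then clearing the low five bits rounds n up
# to the nearest multiple of 32 (a no-op when n is already aligned).
def align_up_32(n: int) -> int:
    return (n + 31) & ~31

assert align_up_32(1) == 32    # 1  -> 32
assert align_up_32(32) == 32   # already aligned, unchanged
assert align_up_32(33) == 64   # 33 -> 64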
