ZhenbinChan
Stars

Speculative Decoding

6 repositories

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Python · 274 stars · 17 forks · Updated Aug 31, 2024

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Python · 1,310 stars · 78 forks · Updated Mar 6, 2025

Repository of the paper "Accelerating Transformer Inference for Translation via Parallel Decoding"

Python · 121 stars · 8 forks · Updated Mar 15, 2024

Fast inference from large language models via speculative decoding

Python · 872 stars · 93 forks · Updated Aug 22, 2024

Code for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

Python · 46 stars · 1 fork · Updated Dec 9, 2023

📰 Must-read papers and blogs on Speculative Decoding ⚡️

1,064 stars · 55 forks · Updated Dec 22, 2025