Zhenbin Chen ZhenbinChan

🏠

Working

I am Zhenbin Chen, a PhD student at ShanghaiTech University. My research interests are LLMs and RL.

Stars

6 repositories

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Python 274 17 Updated Aug 31, 2024

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Python 1,310 78 Updated Mar 6, 2025

Repository of the paper "Accelerating Transformer Inference for Translation via Parallel Decoding"

Python 121 8 Updated Mar 15, 2024

Fast inference from large language models via speculative decoding

Python 872 93 Updated Aug 22, 2024

Codes for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

Python 46 1 Updated Dec 9, 2023

📰 Must-read papers and blogs on Speculative Decoding ⚡️