ShawnLee0910
  • westlake university
  • Hangzhou

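Automatically scrapes and maintains download links for historical versions of Cursor on every platform (Windows, macOS, Linux), so users can install or downgrade to a specific version as needed.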

Python 195 14 Updated Dec 25, 2025

[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$

Python 50 5 Updated Oct 23, 2024

Official implementation of Bootstrapping Language Models via DPO Implicit Rewards

Python 46 3 Updated Apr 15, 2025

MiniCPM4 & MiniCPM4.1: Ultra-Efficient LLMs on End Devices, achieving a 3x+ generation speedup on reasoning tasks

Jupyter Notebook 8,479 527 Updated Oct 8, 2025

A Framework of Small-scale Large Multimodal Models

Python 940 95 Updated Apr 26, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 64,468 7,820 Updated Dec 24, 2025

Code release for the paper "The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains"

9 Updated Jul 8, 2025

[EMNLP'25] A novel alignment framework that leverages image retrieval to mitigate hallucinations in Vision Language Models.

Python 50 3 Updated Aug 21, 2025

Reinforcement Learning of Vision Language Models with Self Visual Perception Reward

Python 154 17 Updated Sep 23, 2025

Jax Codebase for Evolutionary Strategies at the Hyperscale

Python 203 16 Updated Dec 25, 2025

[NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct

Python 191 10 Updated Jan 16, 2025

Official repo for ICLR 2025: Your Weak LLM is Secretly a Strong Teacher for Alignment

Python 7 Updated Feb 11, 2025

Official implementation of paper "Learning to Optimize Multi-objective Alignment Through Dynamic Reward Weighting"

Jupyter Notebook 18 Updated Sep 29, 2025
Python 26 4 Updated Jul 16, 2024

[ICML 2025] Official code of "DAMA: Data- and Model-aware Alignment of Multi-modal LLMs"

Python 15 1 Updated May 24, 2025

Align Anything: Training All-modality Model with Feedback

Python 4,609 507 Updated Nov 27, 2025
Python 6 Updated Oct 13, 2025

[AAAI 2026 Oral] Implementation for "W2S-AlignTree: Weak-to-Strong Inference-Time Alignment for Large Language Models via Monte Carlo Tree Search".

Python 21 Updated Nov 17, 2025

"LightReasoner: Can Small Language Models Teach Large Language Models Reasoning?"

Python 578 27 Updated Nov 1, 2025

Public code repo for paper "Alice: Proactive Learning with Teacher's Demonstrations for Weak-to-Strong Generalization"

Python 10 Updated May 28, 2025

[ICLR 2025 Spotlight] Weak-to-strong preference optimization: stealing reward from weak aligned model

Python 16 Updated Feb 24, 2025
Python 38 2 Updated Feb 8, 2024

[ACM MM 2025] MCM-DPO: Multifaceted Cross-Modal Direct Preference Optimization for Alt-text Generation

Python 1 Updated Oct 2, 2025

The collection of awesome papers on alignment of diffusion models.

382 16 Updated Oct 27, 2025

[CVPR 2024] A benchmark for evaluating Multimodal LLMs using multiple-choice questions.

Python 356 13 Updated Jan 14, 2025

Train transformer language models with reinforcement learning.

Python 16,780 2,375 Updated Dec 24, 2025
Python 100 2 Updated Dec 22, 2023

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …

Python 11,843 1,086 Updated Dec 25, 2025

A RLHF Infrastructure for Vision-Language Models

Python 189 8 Updated Nov 15, 2024

[NeurIPS 2025] Mitigating Hallucination Through Theory-Consistent Symmetric Multimodal Preference Optimization

Python 3 1 Updated Nov 19, 2025