Tools for merging pretrained large language models.

Python 6,474 636 Updated Oct 31, 2025

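《代码随想录》 LeetCode problem-solving guide: a recommended order for working through 200 classic problems, with 600,000 characters of detailed illustrated explanations, video walkthroughs of the hard parts, more than 50 mind maps, and solutions in C++, Java, Python, Go, JavaScript, and other languages. No more feeling lost when learning algorithms! 🔥🔥 Take a look; you'll wish you had found it sooner! 🚀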

Shell 59,261 12,263 Updated Nov 7, 2025

[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.

Python 5,346 453 Updated May 21, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 62,900 7,612 Updated Nov 19, 2025

This repo collects resources on role-playing abilities in LLMs, including datasets, papers, and applications.

131 6 Updated Sep 23, 2024

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,849 371 Updated Oct 17, 2025

A bibliography and survey of the papers surrounding o1

TeX 1,212 51 Updated Nov 16, 2024

[ICLR2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation

Python 240 16 Updated Dec 16, 2024

A framework for detecting, highlighting and correcting grammatical errors on natural language text. Created by Prithiviraj Damodaran. Open to pull requests and other forms of collaboration.

Python 1,562 175 Updated Feb 15, 2023

A quick guide (especially) for trending instruction finetuning datasets

3,310 222 Updated Nov 28, 2023

DataComp for Language Models

HTML 1,390 129 Updated Sep 9, 2025

Tools to download and cleanup Common Crawl data

Python 1,033 152 Updated Apr 25, 2023

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,730 272 Updated Jul 18, 2025

BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.

Jupyter Notebook 217 21 Updated Sep 2, 2025

Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...

2,161 177 Updated Apr 30, 2025

RAGChecker: A Fine-grained Framework For Diagnosing RAG

Python 1,019 86 Updated Dec 13, 2024

List of papers on hallucination detection in LLMs.

990 77 Updated Nov 14, 2025

This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?

Python 1,371 114 Updated Nov 13, 2025

Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)

Python 2,693 291 Updated Aug 14, 2024

[EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs

Python 256 21 Updated Dec 16, 2024

Implementation of the paper "LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens"

Python 152 14 Updated Jul 20, 2024

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Python 16,624 1,308 Updated Oct 6, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,289 625 Updated Nov 21, 2025

ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models

Python 191 5 Updated Oct 8, 2024

Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context"

Python 478 30 Updated Mar 19, 2024

Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718

Python 356 29 Updated Sep 25, 2024

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Python 4,856 366 Updated Dec 7, 2024

A repository sharing the literature on long-context large language models, including methodologies and evaluation benchmarks

Jupyter Notebook 269 11 Updated Jul 30, 2024

AllenAI's post-training codebase

Python 3,329 459 Updated Nov 23, 2025