
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

Python 57,659 5,841 Updated Dec 25, 2025

Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and search embeddings and metadata.

TypeScript 4,476 240 Updated Dec 23, 2025

RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

Python 6,098 603 Updated Dec 25, 2025

This repository contains the expert evaluation interface and data evaluation script for the OpenScholar project.

HTML 29 2 Updated Nov 19, 2024

Everything you need to know to build your own RAG application

Jupyter Notebook 3,822 439 Updated Nov 22, 2025

Efficient Triton Kernels for LLM Training

Python 5,980 455 Updated Dec 25, 2025

A curated list of papers and resources based on "Large Language Models on Graphs: A Comprehensive Survey" (TKDE)

974 67 Updated Mar 2, 2025

Fast and memory-efficient exact attention

Python 21,292 2,248 Updated Dec 25, 2025

A framework for serving and evaluating LLM routers - save LLM costs without compromising quality

Python 4,486 355 Updated Aug 10, 2024

Implementation for MatMul-free LM.

Python 3,042 199 Updated Dec 2, 2025

Evaluate your LLM's response with Prometheus and GPT4 💯

Python 1,023 66 Updated Apr 25, 2025

[SIGIR 2024] The official repo for paper "Planning Ahead in Generative Retrieval: Guiding Autoregressive Generation through Simultaneous Decoding"

Python 31 5 Updated Apr 24, 2024

A collection of publicly available Korean instruction datasets for training language models.

Python 445 31 Updated Apr 13, 2025

Perplexica is an AI-powered answering engine. It is an open-source alternative to Perplexity AI.

TypeScript 27,804 2,912 Updated Dec 25, 2025

A list of multi-vector retrieval resources

15 Updated May 29, 2024

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Python 13,051 1,386 Updated Dec 18, 2025

Longformer: The Long-Document Transformer

Python 2,178 288 Updated Feb 8, 2023

[WWW 2024] The official repo for paper "Scalable and Effective Generative Information Retrieval".

Python 63 7 Updated May 7, 2024

An open science effort to benchmark legal reasoning in foundation models

Python 521 79 Updated Aug 25, 2024
5 Updated Aug 16, 2024

LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

Python 230 42 Updated Jul 23, 2025

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Python 1,997 475 Updated Dec 20, 2025

A series of large language models trained from scratch by developers @01-ai

Jupyter Notebook 7,849 490 Updated Nov 27, 2024

A curated list of awesome LLM agents frameworks.

Python 1,233 131 Updated Dec 21, 2025

LLM finetuned for medical question answering

Python 547 70 Updated Sep 7, 2023
Python 3,633 433 Updated May 17, 2024

A reading list on hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"

1,068 54 Updated Sep 27, 2025

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

Python 6,093 527 Updated Jul 1, 2025