AI Native Data App Development framework with AWEL(Agentic Workflow Expression Language) and Agents

Python 17,594 2,457 Updated Nov 6, 2025

A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.

Python 1,909 218 Updated Nov 10, 2025

JioNLP: a Chinese NLP preprocessing and parsing toolkit that is accurate, efficient, and easy to use. www.jionlp.com

Python 3,753 441 Updated Oct 30, 2025

🤗 Evaluate: A library for easily evaluating machine learning models and datasets.

Python 2,360 299 Updated Sep 25, 2025

[NeurIPS2025] "AI-Researcher: Autonomous Scientific Innovation" -- A production-ready version: https://novix.science/chat

Python 3,536 407 Updated Oct 16, 2025

A Multi-Agent Trading System Based on Internal Contest Mechanism

Python 457 124 Updated Oct 13, 2025

Jina VDR is a multilingual, multi-domain benchmark for visual document retrieval

Python 32 1 Updated Aug 4, 2025

Industrial-first evaluation benchmark for LLMs in the DevOps/AIOps domain.

Python 644 46 Updated Jul 10, 2024

Label Studio is a multi-type data labeling and annotation tool with standardized output format

JavaScript 25,371 3,175 Updated Nov 10, 2025

A curated collection of open-source data labeling tools

928 178 Updated Oct 9, 2019

Microsoft.Recognizers.Text provides recognition and resolution of numbers, units, date/time, etc. in multiple languages (ZH, EN, FR, ES, PT, DE, IT, TR, HI, NL. Partial support for JA, KO, AR, SV).…

C# 1,755 434 Updated Feb 19, 2025

GraalPy – A high-performance embeddable Python 3 runtime for Java

Python 1,503 140 Updated Nov 7, 2025

Ongoing research training transformer models at scale

Python 14,146 3,258 Updated Nov 7, 2025

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

TypeScript 18,123 1,735 Updated Nov 9, 2025

A curated, but incomplete, list of data-centric AI resources.

1,135 80 Updated Jun 26, 2024

A powerful tool for creating fine-tuning datasets for LLM

JavaScript 11,724 1,133 Updated Nov 8, 2025
TypeScript 1,612 249 Updated Apr 22, 2025

Infisical is the open-source platform for secrets, certificates, and privileged access management.

TypeScript 23,584 1,571 Updated Nov 10, 2025

🔥 MaxKB is a powerful, easy-to-use open-source platform for building enterprise-grade agents.

Python 19,245 2,494 Updated Nov 10, 2025

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 5,489 287 Updated Nov 7, 2025

OCR & Document Extraction using vision models

TypeScript 11,927 814 Updated May 20, 2025

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Python 63,211 9,286 Updated Nov 6, 2025

Model Context Protocol Servers

TypeScript 72,236 8,684 Updated Nov 9, 2025

[ACL 2024 Findings] Code implementation of Paper "Rethinking Negative Instances for Generative Named Entity Recognition"

Python 59 2 Updated Mar 20, 2024

Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)

1,022 61 Updated Nov 18, 2024

This hands-on lab aims to alleviate some of that headache by demonstrating how to create/augment a QnA dataset from complex unstructured data, assuming a real-world scenario. The sample aims to be …

Jupyter Notebook 59 15 Updated Apr 29, 2025

The Open Source Feature Store for AI/ML

Python 6,464 1,174 Updated Nov 6, 2025

Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper.

Python 247 33 Updated Aug 4, 2025