Open Source Mac Large Language Models (LLM) - Page 4

Large Language Models (LLM) for Mac

View 119 business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1
    PandasAI

    PandasAI

    PandasAI is a Python library that integrates generative AI

    PandasAI is a Python library that adds Generative AI capabilities to pandas, the popular data analysis and manipulation tool. It is designed to be used in conjunction with pandas, and is not a replacement for it. PandasAI makes pandas (and all the most used data analyst libraries) conversational, allowing you to ask questions to your data in natural language. For example, you can ask PandasAI to find all the rows in a DataFrame where the value of a column is greater than 5, and it will return a DataFrame containing only those rows.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    Qwen3-Omni

    Qwen3-Omni

    Qwen3-omni is a natively end-to-end, omni-modal LLM

    Qwen3-Omni is a natively end-to-end multilingual omni-modal foundation model that processes text, images, audio, and video and delivers real-time streaming responses in text and natural speech. It uses a Thinker-Talker architecture with a Mixture-of-Experts (MoE) design, early text-first pretraining, and mixed multimodal training to support strong performance across all modalities without sacrificing text or image quality. The model supports 119 text languages, 19 speech input languages, and 10 speech output languages. It achieves state-of-the-art results: across 36 audio and audio-visual benchmarks, it hits open-source SOTA on 32 and overall SOTA on 22, outperforming or matching strong closed-source models such as Gemini-2.5 Pro and GPT-4o. To reduce latency, especially in audio/video streaming, Talker predicts discrete speech codecs via a multi-codebook scheme and replaces heavier diffusion approaches.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 3
    Tongyi DeepResearch

    Tongyi DeepResearch

    Tongyi Deep Research, the Leading Open-source Deep Research Agent

    DeepResearch (Tongyi DeepResearch) is an open-source “deep research agent” developed by Alibaba’s Tongyi Lab designed for long-horizon, information-seeking tasks. It’s built to act like a research agent: synthesizing, reasoning, retrieving information via the web and documents, and backing its outputs with evidence. The model is about 30.5 billion parameters in size, though at any given token only ~3.3B parameters are active. It uses a mix of synthetic data generation, fine-tuning and reinforcement learning; supports benchmarks like web search, document understanding, question answering, “agentic” tasks; provides inference tools, evaluation scripts, and “web agent” style interfaces. The aim is to enable more autonomous, agentic models that can perform sustained knowledge gathering, reasoning, and synthesis across multiple modalities (web, files, etc.).
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    gptee

    gptee

    LLMs done the UNIX-y way

    Output from a language model using standard input as the prompt. Now supporting GPT3.5 chat completions! gptee was designed for use within shell scripts and other programs and also works in interactive shells. You can compose commands and execute them in a script. Proceed with caution before running arbitrary shell scripts. Using a chat completion model (like gpt-3.5-turbo), you can then inject a system message with -s or --system messages. For davinci and other non-chat models, the output is prefixed to the prompt. Compose shell commands like you would in a script. Try with a custom model. By default gptee uses gpt-3.5-turbo.
    Downloads: 3 This Week
    Last Update:
    See Project
  • Cloud-based help desk software with ServoDesk Icon
    Cloud-based help desk software with ServoDesk

    Full access to Enterprise features. No credit card required.

    What if You Could Automate 90% of Your Repetitive Tasks in Under 30 Days? At ServoDesk, we help businesses like yours automate operations with AI, allowing you to cut service times in half and increase productivity by 25% - without hiring more staff.
    Try ServoDesk for free
  • 5
    towhee

    towhee

    Framework that is dedicated to making neural data processing

    Towhee is an open-source machine-learning pipeline that helps you encode your unstructured data into embeddings. You can use our Python API to build a prototype of your pipeline and use Towhee to automatically optimize it for production-ready environments. From images to text to 3D molecular structures, Towhee supports data transformation for nearly 20 different unstructured data modalities. We provide end-to-end pipeline optimizations, covering everything from data decoding/encoding, to model inference, making your pipeline execution 10x faster. Towhee provides out-of-the-box integration with your favorite libraries, tools, and frameworks, making development quick and easy. Towhee includes a pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, making processing unstructured data as easy as handling tabular data.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 6
    Autolabel

    Autolabel

    Label, clean and enrich text datasets with LLMs

    Autolabel is a Python library to label, clean and enrich datasets with Large Language Models (LLMs). Autolabel data for NLP tasks such as classification, question-answering and named entity recognition, entity matching and more. Seamlessly use commercial and open-source LLMs from providers such as OpenAI, Anthropic, HuggingFace, Google and more.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    BertViz

    BertViz

    BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

    BertViz is an interactive tool for visualizing attention in Transformer language models such as BERT, GPT2, or T5. It can be run inside a Jupyter or Colab notebook through a simple Python API that supports most Huggingface models. BertViz extends the Tensor2Tensor visualization tool by Llion Jones, providing multiple views that each offer a unique lens into the attention mechanism. The head view visualizes attention for one or more attention heads in the same layer. It is based on the excellent Tensor2Tensor visualization tool. The model view shows a bird's-eye view of attention across all layers and heads. The neuron view visualizes individual neurons in the query and key vectors and shows how they are used to compute attention.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    CodeGeeX2

    CodeGeeX2

    CodeGeeX2: A More Powerful Multilingual Code Generation Model

    CodeGeeX2 is the second-generation multilingual code generation model from ZhipuAI, built upon the ChatGLM2-6B architecture and trained on 600B code tokens. Compared to the first generation, it delivers a significant boost in programming ability across multiple languages, outperforming even larger models like StarCoder-15B in some benchmarks despite having only 6B parameters. The model excels at code generation, translation, summarization, debugging, and comment generation, and it supports over 100 programming languages. With improved inference efficiency, quantization options, and multi-query/flash attention, CodeGeeX2 achieves faster generation speeds and lightweight deployment, requiring as little as 6GB GPU memory at INT4 precision. Its backend powers the CodeGeeX IDE plugins for VS Code, JetBrains, and other editors, offering developers interactive AI assistance with features like infilling and cross-file completion.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 9
    Emscripten

    Emscripten

    Emscripten: An LLVM-to-WebAssembly Compiler

    Emscripten is a complete open-source compiler toolchain that transforms C, C++, and other LLVM-based source code into WebAssembly (and JavaScript), enabling native‑like applications to run in web browsers, Node.js, and other Wasm environments. While Emscripten mostly focuses on compiling C and C++ using Clang, it can be integrated with other LLVM-using compilers (for example, Rust has Emscripten integration, with the wasm32-unknown-emscripten and asmjs-unknown-emscripten targets). Emscripten provides Web support for popular portable APIs such as OpenGL and SDL2, allowing complex graphical native applications to be ported, such as the Unity game engine and Google Earth. It can probably port your codebase, too.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Smart Business Texting that Generates Pipeline Icon
    Smart Business Texting that Generates Pipeline

    Create and convert pipeline at scale through industry leading SMS campaigns, automation, and conversation management.

    TextUs is the leading text messaging service provider for businesses that want to engage in real-time conversations with customers, leads, employees and candidates. Text messaging is one of the most engaging ways to communicate with customers, candidates, employees and leads. 1:1, two-way messaging encourages response and engagement. Text messages help teams get 10x the response rate over phone and email. Business text messaging has become a more viable form of communication than traditional mediums. The TextUs user experience is intentionally designed to resemble the familiar SMS inbox, allowing users to easily manage contacts, conversations, and campaigns. Work right from your desktop with the TextUs web app or use the Chrome extension alongside your ATS or CRM. Leverage the mobile app for on-the-go sending and responding.
    Learn More
  • 10
    GLM-130B

    GLM-130B

    GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

    GLM-130B is an open bilingual (English and Chinese) dense language model with 130 billion parameters, released by the Tsinghua KEG Lab and collaborators as part of the General Language Model (GLM) series. It is designed for large-scale inference and supports both left-to-right generation and blank filling, making it versatile across NLP tasks. Trained on over 400 billion tokens (200B English, 200B Chinese), it achieves performance surpassing GPT-3 175B, OPT-175B, and BLOOM-176B on multiple benchmarks, while also showing significant improvements on Chinese datasets compared to other large models. The model supports efficient inference via INT8 and INT4 quantization, reducing hardware requirements from 8× A100 GPUs to as little as a single server with 4× RTX 3090s. Built on the SwissArmyTransformer (SAT) framework and compatible with DeepSpeed and FasterTransformer, it supports high-speed inference (up to 2.5× faster) and reproducible evaluation across 30+ benchmark tasks.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    Gorilla

    Gorilla

    Gorilla: An API store for LLMs

    Gorilla is Apache 2.0 With Gorilla being fine-tuned on MPT, and Falcon, you can use Gorilla commercially with no obligations. Gorilla enables LLMs to use tools by invoking APIs. Given a natural language query, Gorilla comes up with the semantically- and syntactically- correct API to invoke. With Gorilla, we are the first to demonstrate how to use LLMs to invoke 1,600+ (and growing) API calls accurately while reducing hallucination. We also release APIBench, the largest collection of APIs, curated and easy to be trained on! Join us, as we try to expand the largest API store and teach LLMs how to write them! Hop on our Discord, or open a PR, or email us if you would like to have your API incorporated as well.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    Granite 3.0 Language Models

    Granite 3.0 Language Models

    New set of lightweight state-of-the-art, open foundation models

    This repository introduces Granite 3.0 language models as lightweight, state-of-the-art open foundation models built to natively support multilinguality, coding, reasoning, and tool usage. A central goal is efficient deployment, including the potential to run on constrained compute resources while remaining useful for a broad span of enterprise tasks. The repo positions the models for both research and commercial use under an Apache-2.0 license, signaling permissive adoption paths. Documentation highlights the capability mix (reasoning, tool use, code) and points to model artifacts and guidance for evaluation. Activity on the project shows an evolving codebase with open pull requests and standard GitHub project structure for issues and security visibility. In practice, this is a hub for acquiring Granite 3.0 variants and understanding how to integrate them into applications.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    Guardrails

    Guardrails

    Adding guardrails to large language models

    Guardrails is a Python package that lets a user add structure, type and quality guarantees to the outputs of large language models (LLMs). At the heart of Guardrails is the rail spec. rail is intended to be a language-agnostic, human-readable format for specifying structure and type information, validators and corrective actions over LLM outputs. We create a RAIL spec to describe the expected structure and types of the LLM output, the quality criteria for the output to be considered valid, and corrective actions to be taken if the output is invalid.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 14
    Khoj

    Khoj

    An AI personal assistant for your digital brain

    Get more done with your open-source AI personal assistant. Khoj is a desktop application to search and chat with your notes, documents, and images. It is an offline-first, open-source AI personal assistant that is accessible from Emacs, Obsidian or your Web browser. Khoj is a thinking tool that is transparent, fun, and easy to engage with. You can build faster and better by using Khoj to search and reason across all your data sources. Khoj learns from your notes and documents to function as an extension of your brain. So that you can stay focused on doing what matters. Khoj started with the founding principle that a personal assistant be understandable, accessible and hackable. This means you can always customize and self-host your Khoj on your own machines.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 15
    LangChain-Chatchat

    LangChain-Chatchat

    Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge

    LangChain-Chatchat (formerly Langchain-ChatGLM): A local knowledge base question answering application implementation based on large language models such as Langchain and ChatGLM. The knowledge base information of the current project is stored in the database, please initialize the database before running the project officially (we strongly recommend that you back up your knowledge files before performing operations). Relying on the open-source LLM and Embedding models supported by this project, this project can realize offline private deployment using all open-source models. At the same time, this project also supports the call of OpenAI GPT API, and will continue to expand the access to various models and model APIs in the future.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 16
    MetaGPT

    MetaGPT

    The Multi-Agent Framework

    The Multi-Agent Framework: Given one line Requirement, return PRD, Design, Tasks, Repo. Assign different roles to GPTs to form a collaborative software entity for complex tasks. MetaGPT takes a one-line requirement as input and outputs user stories / competitive analysis/requirements/data structures / APIs / documents, etc. Internally, MetaGPT includes product managers/architects/project managers/engineers. It provides the entire process of a software company along with carefully orchestrated SOPs.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 17
    Petals

    Petals

    Run 100B+ language models at home, BitTorrent-style

    Run 100B+ language models at home, BitTorrent‑style. Run large language models like BLOOM-176B collaboratively — you load a small part of the model, then team up with people serving the other parts to run inference or fine-tuning. Single-batch inference runs at ≈ 1 sec per step (token) — up to 10x faster than offloading, enough for chatbots and other interactive apps. Parallel inference reaches hundreds of tokens/sec. Beyond classic language model APIs — you can employ any fine-tuning and sampling methods, execute custom paths through the model, or see its hidden states. You get the comforts of an API with the flexibility of PyTorch. You can also host BLOOMZ, a version of BLOOM fine-tuned to follow human instructions in the zero-shot regime — just replace bloom-petals with bloomz-petals. Petals runs large language models like BLOOM-176B collaboratively — you load a small part of the model, then team up with people serving the other parts to run inference or fine-tuning.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    Qwen2.5-Math

    Qwen2.5-Math

    A series of math-specific large language models of our Qwen2 series

    Qwen2.5-Math is a series of mathematics-specialized large language models in the Qwen2 family, released by Alibaba’s QwenLM. It includes base models (1.5B / 7B / 72B parameters), instruction-tuned versions, and a reward model (RM) to improve alignment. Unlike its predecessor Qwen2-Math, Qwen2.5-Math supports both Chain-of-Thought (CoT) reasoning and Tool-Integrated Reasoning (TIR) for solving math problems, and works in both Chinese and English. It is optimized for solving mathematical benchmarks and exams; the 72B-Instruct model achieves state-of-the-art results among open source models on many English and Chinese math tasks.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    STORM

    STORM

    An LLM-powered knowledge curation system that researches topics

    STORM is an open-source virtual assistant framework developed by Stanford's OVAL lab. It is designed for creating natural language interfaces and assistants that can interact with APIs, databases, and services in a modular way.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 20
    Scikit-LLM

    Scikit-LLM

    Seamlessly integrate LLMs into scikit-learn

    Seamlessly integrate powerful language models like ChatGPT into sci-kit-learn for enhanced text analysis tasks. At the moment the majority of the Scikit-LLM estimators are only compatible with some of the OpenAI models. Hence, a user-provided OpenAI API key is required. Additionally, Scikit-LLM will ensure that the obtained response contains a valid label. If this is not the case, a label will be selected randomly (label probabilities are proportional to label occurrences in the training set). Note: unlike in a typical supervised setting, the performance of a zero-shot classifier greatly depends on how the label itself is structured. It has to be expressed in natural language, descriptive, and self-explanatory.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 21
    Unstructured.IO

    Unstructured.IO

    Open source libraries and APIs to build custom preprocessing pipelines

    The unstructured library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. The use cases of unstructured revolve around streamlining and optimizing the data processing workflow for LLMs. unstructured modular bricks and connectors form a cohesive system that simplifies data ingestion and pre-processing, making it adaptable to different platforms and is efficient in transforming unstructured data into structured outputs.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 22
    embedchain

    embedchain

    Framework to easily create LLM powered bots over any dataset

    Embedchain is a framework to easily create LLM-powered bots over any dataset. If you want a javascript version, check out embedchain-js. Embedchain empowers you to create chatbot models similar to ChatGPT, using your own evolving dataset. Start building LLM powered bots under 30 seconds.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 23
    llm

    llm

    An ecosystem of Rust libraries for working with large language models

    llm is an ecosystem of Rust libraries for working with large language models - it's built on top of the fast, efficient GGML library for machine learning. The primary entry point for developers is the llm crate, which wraps the llm-base and the supported model crates. Documentation for the released version is available on Docs.rs. For end-users, there is a CLI application, llm-cli, which provides a convenient interface for interacting with supported models. Text generation can be done as a one-off based on a prompt, or interactively, through REPL or chat modes. The CLI can also be used to serialize (print) decoded models, quantize GGML files, or compute the perplexity of a model. It can be downloaded from the latest GitHub release or by installing it from crates.io.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    Qwen2.5-Coder

    Qwen2.5-Coder

    Qwen2.5-Coder is the code version of Qwen2.5, the large language model

    Qwen2.5-Coder, developed by QwenLM, is an advanced open-source code generation model designed for developers seeking powerful and diverse coding capabilities. It includes multiple model sizes—ranging from 0.5B to 32B parameters—providing solutions for a wide array of coding needs. The model supports over 92 programming languages and offers exceptional performance in generating code, debugging, and mathematical problem-solving. Qwen2.5-Coder, with its long context length of 128K tokens, is ideal for a variety of use cases, from simple code assistants to complex programming scenarios, matching the capabilities of models like GPT-4o.
    Downloads: 22 This Week
    Last Update:
    See Project
  • 25
    Qwen2.5

    Qwen2.5

    Open source large language model by Alibaba

    Qwen2.5 is a series of large language models developed by the Qwen team at Alibaba Cloud, designed to enhance natural language understanding and generation across multiple languages. The models are available in various sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B parameters, catering to diverse computational requirements. Trained on a comprehensive dataset of up to 18 trillion tokens, Qwen2.5 models exhibit significant improvements in instruction following, long-text generation (exceeding 8,000 tokens), and structured data comprehension, such as tables and JSON formats. They support context lengths up to 128,000 tokens and offer multilingual capabilities in over 29 languages, including Chinese, English, French, Spanish, and more. The models are open-source under the Apache 2.0 license, with resources and documentation available on platforms like Hugging Face and ModelScope. This is a full ZIP snapshot of the Qwen2.5 code.
    Downloads: 19 This Week
    Last Update:
    See Project