Insights: SciSharp/LLamaSharp
Overview
20 Pull requests merged by 8 people
- Removed Tensor Overrides Native Memory Allocations (#1185, merged May 12, 2025)
- Feat/tensor override (#1180, merged May 11, 2025)
- May Binary Update (#1179, merged May 11, 2025)
- Update to M.E.AI 9.4.3-preview.1.25230.7 (#1182, merged May 6, 2025)
- Bump Whisper.net from 1.7.4 to 1.8.1 (#1143, merged May 3, 2025)
- Bump Spectre.Console from 0.49.1 to 0.50.0 (#1175, merged May 3, 2025)
- Bump Whisper.net.Runtime from 1.7.4 to 1.8.1 (#1145, merged May 3, 2025)
- Bump Microsoft.SemanticKernel.Abstractions from 1.44.0 to 1.48.0 (#1176, merged May 3, 2025)
- Bump Microsoft.AspNetCore.Mvc.Razor.RuntimeCompilation from 8.0.12 to 8.0.15 (#1177, merged May 3, 2025)
- Update LLamaEmbedder, Examples packages, and KernelMemory examples (#1170, merged May 3, 2025)
- docs: add deep-wiki link to readme (#1172, merged Apr 29, 2025)
- Create stale_issues.yml (#1171, merged Apr 25, 2025)
- Removed Android x86 (#1167, merged Apr 22, 2025)
- Removed Android x86 (#1166, merged Apr 21, 2025)
- Update Apr 2025 (#1165, merged Apr 21, 2025)
- Fixed CUDA Build Conditional (#1164, merged Apr 21, 2025)
- Using ubuntu-22.04 because ubuntu-24.04 fails (#1163, merged Apr 20, 2025)
- CI Ubuntu-24.04 (#1162, merged Apr 20, 2025)
- DLLAMA_CURL=OFF (#1161, merged Apr 20, 2025)
- Fixed LLama.Web error starting session (#1158, merged Apr 17, 2025)
2 Pull requests opened by 2 people
- MTMD - Remove Llava (#1178, opened May 1, 2025)
- Memory efficient context handling (#1183, opened May 10, 2025)
36 Issues closed by 8 people
- [BUG]: GPU doesn't get into the process (#1060, closed May 12, 2025)
- Refactoring for StatelessExecutor (#1084, closed May 12, 2025)
- [Feature]: RAG from PDF File (#1087, closed May 12, 2025)
- [BUG]: InferenceParams doesn't contain a definition for Temperature (#1096, closed May 12, 2025)
- [BUG]: CUDA errors with two GPUs (multiple parallel requests) (#1091, closed May 7, 2025)
- [BUG]: Tokenization in 0.14.0 adds spaces (#856, closed May 2, 2025)
- Setting up a non-development runtime environment as a published .NET application (#939, closed Apr 28, 2025)
- [BUG]: log levels 0 and 1 from llama.cpp don't seem to be supported (#995, closed Apr 27, 2025)
- Closing inactive issues (#1157, closed Apr 26, 2025)
- [BUG]: System.AccessViolationException on KernelMemory example in LLama.Examples (#1058, closed Apr 21, 2025)
- [BUG]: There are memory errors in versions 0.22.0 and 0.23.0 (NuGet) (#1160, closed Apr 21, 2025)
- When using StatelessExecutor, llama_new_context logs to console every time InferAsync is called (#363, closed Apr 19, 2025)
- Issue running LLama.Web in a Linux container - OSX M1 (#342, closed Apr 19, 2025)
- Prompt format produced by DefaultHistoryTransform is non-standard (#290, closed Apr 19, 2025)
- Feature Request: Switch backends dynamically at runtime? (#264, closed Apr 19, 2025)
- Create HTTP API server and provide an OpenAI-style API (#269, closed Apr 19, 2025)
- Any plans to bring llava support? (#340, closed Apr 19, 2025)
- Not able to read fields from appsettings.json (#598, closed Apr 19, 2025)
- Build CUDA with AVX (#605, closed Apr 19, 2025)
- CentOS x86_64 failed loading 'libllama.so' (#685, closed Apr 19, 2025)
- Segmentation fault on Docker (#616, closed Apr 19, 2025)
- About the NVIDIA GPU use example (#611, closed Apr 19, 2025)
- [llava] How to clear imagepaths? (#643, closed Apr 19, 2025)
- Unable to run example project (#323, closed Apr 19, 2025)
- Is it possible that a model is also able to use online search during a chat? (#291, closed Apr 19, 2025)
- Kernel Memory is broken with latest NuGet packages (#305, closed Apr 19, 2025)
- Unexpected behavior in ChatSession.ChatAsync methods (#261, closed Apr 19, 2025)
- If llama.dll is built in Debug mode, gibberish is produced when called from LLamaSharp (#188, closed Apr 19, 2025)
- Unable to get a simple example to work with LLamaSharp and Semantic Kernel (#186, closed Apr 19, 2025)
- [BUG]: deepseek-r1 (#1097, closed Apr 19, 2025)
- [Feature]: DeepSeek-R1-Distill-Qwen or similar distilled DeepSeek gguf support (#1059, closed Apr 19, 2025)
- DeepSeek-R1 reasoning process (#1090, closed Apr 19, 2025)
- OpenCL (#1024, closed Apr 19, 2025)
- 'The type initializer for 'LLama.Native.NativeApi' threw an exception' in MAUI App (#180, closed Apr 19, 2025)
- Is it possible to train or fine-tune a model with LLamaSharp? (#20, closed Apr 19, 2025)
- [BUG]: LLamaSharp Web stopped working (#1080, closed Apr 17, 2025)
4 Issues opened by 4 people
- [Feature]: Request for NPU Support on New Surface Devices to Enhance AI Model Execution (#1184, opened May 10, 2025)
- [Feature]: Using xcframework instead of dylib (#1181, opened May 6, 2025)
- [BUG]: problems that LLamaSharp can match llama.cpp (#1174, opened Apr 29, 2025)
- Unknown model architecture: qwen3 (#1173, opened Apr 29, 2025)
126 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- [BUG]: Application crashes on LlamaContext.GetState() (#1152, commented Apr 12, 2025)
- Add more unit tests (#237, commented Apr 19, 2025)
- Token Healing (#446, commented Apr 19, 2025)
- Android Backend (#695, commented Apr 19, 2025)
- [BUG]: Exception Info: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt. (#1094, commented Apr 20, 2025)
- Always getting a "Ċ" before EOS when using Qwen (#865, commented Apr 23, 2025)
- [BUG]: LLamaSharp 0.20.0 cannot be used with Visual Studio 2019 (#1076, commented Apr 24, 2025)
- [BUG]: Get GGML_ASSERT when running KernelMemorySaveAndLoad.cs (#1151, commented Apr 24, 2025)
- [BUG]: Outputting Chinese characters may result in incomplete UTF8 encoding, causing garbled text (#1048, commented Apr 26, 2025)
- [BUG]: Failed to load ./runtimes/win-x64/native/cuda12/llama.dll (#1014, commented Apr 26, 2025)
- Using LLMSharpEmbeddings to set up a RAG from a local LLM file (#1013, commented Apr 27, 2025)
- Can LlamaSharp run on Tensor GPUs? (#1012, commented Apr 27, 2025)
- [Feature]: Request GuardRails support for LlamaSharp (#1010, commented Apr 27, 2025)
- [BUG]: Unhandled exception. System.Text.Json.JsonException: '0x00' is an invalid start of a value (#1008, commented Apr 27, 2025)
- [BUG]: RAG with KernelMemory - (#996, commented Apr 27, 2025)
- [BUG]: Run LLama.Examples => KernelMemory.cs: System.AccessViolationException: "Attempted to read or write protected memory. This is often an indication that other memory is corrupt." (#980, commented Apr 27, 2025)
- [BUG]: Not loading CUDA backend on laptop (#990, commented Apr 27, 2025)
- How to publish a self-contained, single runtime, with multiple backends? (#977, commented Apr 28, 2025)
- Embeddings: batch size vs context length (#963, commented Apr 28, 2025)
- Predicting memory usage - Memory Access Violation (#952, commented Apr 28, 2025)
- [BUG]: llama_get_logits_ith(353) returned null (#929, commented Apr 28, 2025)
- Question about prompt templates (#927, commented Apr 28, 2025)
- LLamaSharp v0.15.0 broke CUDA backend (#909, commented Apr 28, 2025)
- [Feature]: NativeLibraryConfig.WithLibrary availability in .NET Standard 2.1 (for compatibility with Unity3D) (#960, commented Apr 28, 2025)
- How do I use RAG via Kernel Memory and the Semantic Kernel Handlebars Planner with llama3? (#899, commented Apr 29, 2025)
- [Feature]: Add development support for Dev Containers (#898, commented Apr 29, 2025)
- Application not using GPU despite installing LlamaSharp.Backend.Cuda12 (#896, commented Apr 29, 2025)
- [BUG]: "The type or namespace 'Common' does not exist in the namespace 'LLama'" (#895, commented Apr 29, 2025)
- [BUG]: Different continuation after restoring state (#888, commented Apr 29, 2025)
- [BUG]: RTX 4080 - crash while loading Vulkan (#886, commented Apr 29, 2025)
- [BUG]: Error loading Llama 3.1: llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 292, got 291 (#875, commented Apr 29, 2025)
- [BUG]: Null reference in AskAsync() (#870, commented Apr 29, 2025)
- [BUG]: Object Reference Error in ApplyPenalty when setting nl_logit (#866, commented Apr 30, 2025)
- [BUG]: App crashes with CUDA error in ggml-cuda.cu:1503 (#860, commented Apr 30, 2025)
- [BUG]: LLamaSharp.Backend not added as reference (#849, commented Apr 30, 2025)
- [BUG]: When GpuLayerCount is more than 5, no data is returned or the speed is very slow (#835, commented Apr 30, 2025)
- Method not found: 'Double Microsoft.KernelMemory.AI.TextGenerationOptions.get_TopP()' (#832, commented Apr 30, 2025)
- How to handle `CUDA error: out of memory`? (#831, commented Apr 30, 2025)
- [BUG]: ChatSession unnecessarily prevents arbitrary conversation interleaving (#857, commented Apr 30, 2025)
- [Feature]: AuthorRole: can custom role labels be supported? (#790, commented May 1, 2025)
- [BUG]: Cannot load the backend on macOS (#785, commented May 1, 2025)
- [Feature]: Expose implementation details of the KV Cache (#776, commented May 1, 2025)
- [BUG]: Offset and length were out of bounds (#766, commented May 1, 2025)
- Llava DLL issue in Unity (#760, commented May 1, 2025)
- [BUG]: Error loading the LLava model (#1136, commented May 1, 2025)
- [BUG]: When using large models with the GPU, the code crashes with "cannot allocate kvcache" (#759, commented May 2, 2025)
- [Feature]: SemanticKernel FunctionCall (#758, commented May 2, 2025)
- Split the main package (#754, commented May 2, 2025)
- Unable to load SYCL-compiled backend (#746, commented May 2, 2025)
- Asking a question at the start of a session can sometimes trigger an endless reply of blank lines (#745, commented May 2, 2025)
- [BUG]: Fail to load model with Chinese model path (#744, commented May 2, 2025)
- [Feature]: How should code for different LLM models be integrated into the project? (#739, commented May 2, 2025)
- Add a debug mode to LLamaSharp (#732, commented May 3, 2025)
- Add a unit test for long context (#731, commented May 3, 2025)
- [BUG]: WSL2 has problems running LLamaSharp with cuda11 (#727, commented May 3, 2025)
- [BUG]: Answer stops abruptly after context size, even when limiting prompt size (#722, commented May 3, 2025)
- [BUG]: Linux CUDA version detection could be incorrect (#724, commented May 3, 2025)
- Take multiple chat templates into account (#705, commented May 3, 2025)
- [CI] Add more unit tests to ensure the outputs are reasonable (#704, commented May 3, 2025)
- Namespaces should be consistent (#693, commented May 3, 2025)
- How do I continuously print the answer word for word when using document ingestion with Kernel Memory? (#687, commented May 4, 2025)
- System.TypeInitializationException: 'The type initializer for 'LLama.Native.NativeApi' threw an exception.' (#686, commented May 4, 2025)
- [Proposal] Backend-free support (#670, commented May 4, 2025)
- Debian 12 x LLamaSharp 0.11.2 crashed silently (#668, commented May 4, 2025)
- IndexOutOfRangeException when calling IKernelMemory.AskAsync() (#661, commented May 4, 2025)
- [Proposal] Refactor the mid-level and high-level implementations of LLamaSharp (#684, commented May 4, 2025)
- Add a best-practice example for RAG (#648, commented May 5, 2025)
- AccessViolationException (#654, commented May 5, 2025)
- [Native Lib] Support specifying LLaVA native library path (#644, commented May 5, 2025)
- Unable to use LoRA in LLamaSharp but can use it in llama.cpp (#618, commented May 5, 2025)
- SemanticKernel ChatCompletion is stateless (#614, commented May 5, 2025)
- Godot game engine example (#608, commented May 5, 2025)
- [Feature] Support GritLM to get embeddings (#646, commented May 5, 2025)
- LLama.Web app published to IIS on a 64-bit Windows server; after deployment, model values are not loaded from appsettings (#597, commented May 6, 2025)
- Cannot add a user message after another user message (Parameter message (#585, commented May 6, 2025)
- Separating and streamlining llama/llava binaries suggestion (#583, commented May 6, 2025)
- [Kernel Memory] Integrate TextGenerationOptions into LLamaSharp.kernel-memory (#580, commented May 6, 2025)
- Thread safety in llama.cpp (#596, commented May 6, 2025)
- Examples don't run with CUDA12 (#599, commented May 6, 2025)
- Stateless executor doesn't work with LLamaSharp 0.10 & .NET 8.0 (#578, commented May 7, 2025)
- How to accelerate running speed in a CPU environment? (#562, commented May 7, 2025)
- How to use embeddings correctly (#547, commented May 7, 2025)
- Possibly useful for documentation: article by us on Medium about building a console app with .NET 8.0 (#543, commented May 7, 2025)
- ZLUDA support (#537, commented May 7, 2025)
- NativeApi: `TryLoadLibrary()` can fail for some systems (#524, commented May 7, 2025)
- RuntimeError for .NET Framework 4.7.2 (#508, commented May 7, 2025)
- [BUG]: Wrong behavior on InferenceParams.AntiPrompts (#1056, commented May 7, 2025)
- LLamaSharp.Backend.Cpu is missing NuGet package README file (#506, commented May 8, 2025)
- Avoid declaring constructors with parameters if the properties of the type can be obtained from configuration settings (#498, commented May 8, 2025)
- Use NerdBank.GitVersioning for versioning (#490, commented May 8, 2025)
- System.Runtime.InteropServices.MarshalDirectiveException: 'Method's type signature is not PInvoke compatible.' (#484, commented May 8, 2025)
- Using this repo in Unity 3D (#482, commented May 8, 2025)
- Wrong result when changing to another model (#481, commented May 8, 2025)
- Saving state after GetState (#480, commented May 9, 2025)
- Enable OpenCL/ROCm (#464, commented May 9, 2025)
- Using CUDA when both CPU and Cuda12 backends are present (#456, commented May 9, 2025)
- Cannot add a user message after another user message (#435, commented May 9, 2025)
- LoadState() not restoring context when using CUDA backend? (#426, commented May 9, 2025)
- Failed to eval - potential bug but hard to diagnose (#422, commented May 9, 2025)
- Crash on KernelMemory "Gathering Information / References" with LLamaSharp embedder (#407, commented May 9, 2025)
- Impossible Invalid GBNF Grammar Parsed (#394, commented May 10, 2025)
- Handle different histories in same session / use custom history (#392, commented May 10, 2025)
- ChatSessionStripRoleName example not working reliably (#375, commented May 10, 2025)
- CUDA error 700: an illegal memory access was encountered (#343, commented May 10, 2025)
- Allow user to define the string that concatenates the role name and prompt in DefaultHistoryTransform (#322, commented May 10, 2025)
- Feature Request: gbnfgen port (#309, commented May 10, 2025)
- Support cuBLAS computation without requiring CUDA to be installed (#350, commented May 10, 2025)
- Write an API to allow setting a stop sequence to use with Kernel Memory (#289, commented May 11, 2025)
- Roadmap to v1.0.0 (#287, commented May 11, 2025)
- WebAPI project isn't using GPU, even though CUDA backend gets loaded by LLamaSharp.dll (#278, commented May 11, 2025)
- Improve support for text-completion and text-embedding APIs (#239, commented May 11, 2025)
- Consider adding a Windows-on-ARM build of llama.dll to LLamaSharp.Backend.Cpu (#600, commented May 11, 2025)
- [Feature]: Support JSON Schema from llama.cpp (#798, commented May 11, 2025)
- [BUG]: BoolQ example throws error (#1120, commented May 12, 2025)
- System.TypeLoadException: 'Could not load type 'LLama.Native.NativeApi' from assembly 'LLamaSharp, (#1119, commented May 12, 2025)
- System.AccessViolationException on llama_backend_init() (#1062, commented May 12, 2025)
- Argument out of range exception when running any prompt through DeepSeek-R1-Distill-Llama-8B-Q8_0 (#1053, commented May 12, 2025)
- [Feature]: Embedding LLamaSharp inside .NET MAUI (#1063, commented May 12, 2025)
- Use NBGV for versioning (#491, commented May 8, 2025)
- Introduce ChatHistory interface (#669, commented May 4, 2025)
- Automatic Solution Generator - work in progress (#676, commented May 4, 2025)
- feat: support dynamic native library loading in .NET Standard 2.0 (#738, commented May 2, 2025)
- refactor: the directory structure (#763, commented May 1, 2025)
- Model File Manager (#789, commented May 1, 2025)
- Unit tests that aim to verify the behavior and correctness of the sampling pipeline under various conditions (#1107, commented May 10, 2025)
- add LLamaReranker and tests (#1150, commented May 12, 2025)