Supercharge Your LLM with the Fastest KV Cache Layer

Python 2,392 284 Updated Jul 1, 2025

libcapture is a multiplatform C++ library for capturing the display and microphone audio.

C++ 16 4 Updated Dec 27, 2022

A library for secure aggregation based on Hint-RLWE

C++ 2 Updated Jan 16, 2025

LLM KV cache compression made easy

Python 523 44 Updated Jul 2, 2025

Structured Outputs

Python 11,990 608 Updated Jul 1, 2025

A library for mechanistic interpretability of GPT-style language models

Python 2,309 407 Updated Jun 19, 2025

SΩI: Score-based O-INFORMATION Estimation

Python 9 Updated Aug 1, 2024

Karras et al. (2022) diffusion models for PyTorch

Python 2,477 385 Updated Jan 7, 2025

Official Repository of the paper "Let Them Drop: Scalable and Efficient Secure Federated Learning Solutions Agnostic to Client Stragglers"

Python 5 Updated Jul 30, 2024

llm.c, but in SYCL/Intel oneAPI!

C++ 6 Updated Aug 5, 2024

Simple autodiff library built on NumPy, inspired by micrograd

Python 1 Updated Feb 22, 2025

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 9,665 688 Updated Jul 1, 2025

The official Meta Llama 3 GitHub site

Python 28,803 3,405 Updated Jan 26, 2025

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Python 2,434 191 Updated Jun 25, 2025

Inference Llama 2 in one file of pure C

C 18,509 2,289 Updated Aug 6, 2024

Official implementation of MINDE: Mutual Information Neural Diffusion Estimation

Python 14 2 Updated Apr 17, 2025
Python 5 Updated Mar 6, 2024

Hydra is a framework for elegantly configuring complex applications

Python 9,447 689 Updated May 15, 2025

[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

Python 693 46 Updated Aug 13, 2024

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 37,798 6,547 Updated Jul 2, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 51,230 8,450 Updated Jul 2, 2025

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 6,921 386 Updated Jul 11, 2024

One-Line-of-Code Data Mollification Improves Optimization of Likelihood-based Generative Models (NeurIPS 2023)

Jupyter Notebook 2 Updated Dec 21, 2023

BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

Python 7,504 829 Updated Jun 1, 2025

Extension of Dipy with Privacy Enhancing Technologies

Python 2 Updated Jun 9, 2024
Python 518 29 Updated Dec 21, 2024

Large Context Attention

Python 716 53 Updated Jan 24, 2025

A blazing fast inference solution for text embeddings models

Rust 3,749 282 Updated Jul 2, 2025

Large Language Model Text Generation Inference

Python 10,275 1,206 Updated Jul 1, 2025

Python bindings for the Transformer models implemented in C/C++ using GGML library.

C 1,866 141 Updated Jan 28, 2024