
"Efficient and scalable solutions for PyTorch, enabling large language model quantization with k-bit precision for enhanced accessibility.

ved1beta/Quanta


Quanta 🚀

A lightweight PyTorch library for efficient model quantization and memory optimization. Perfect for running large language models on consumer hardware.

Key Features

  • 🎯 8-bit & 4-bit quantization primitives
  • 💾 Memory-efficient optimizers
  • 🚀 LLM.int8() inference support
  • 🔄 QLoRA-style fine-tuning
  • 🖥️ Cross-platform hardware support
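To illustrate the idea behind the memory-efficient optimizer feature, here is a minimal pure-PyTorch sketch of blockwise 8-bit quantization, the primitive typically used to compress optimizer state. This is not Quanta's API; all function names here are illustrative.

```python
import torch
import torch.nn.functional as F

def blockwise_quantize(t, block_size=64):
    # Illustrative sketch: quantize a tensor in fixed-size blocks,
    # storing one fp32 scale per block (absmax / symmetric scheme).
    flat = t.flatten()
    pad = (-flat.numel()) % block_size          # pad to a multiple of block_size
    flat = F.pad(flat, (0, pad))
    blocks = flat.view(-1, block_size)
    scales = blocks.abs().max(dim=1, keepdim=True).values.clamp(min=1e-8)
    q = (blocks / scales * 127).round().clamp(-127, 127).to(torch.int8)
    return q, scales.squeeze(1), t.shape, pad

def blockwise_dequantize(q, scales, shape, pad):
    flat = (q.to(torch.float32) * scales.unsqueeze(1) / 127).flatten()
    if pad:
        flat = flat[:-pad]
    return flat.view(shape)

state = torch.randn(1000)                        # stand-in for optimizer state
q, s, shape, pad = blockwise_quantize(state)
state_hat = blockwise_dequantize(q, s, shape, pad)
```

Storing int8 values plus one fp32 scale per 64-element block cuts the memory of fp32 optimizer state by roughly 4x, at the cost of a small per-block rounding error.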

Quick Start

import torch
from bytesandbits.functional.quantization import quantize_8bit, dequantize_8bit

# Example weights (any float tensor works)
model_weights = torch.randn(256, 256)

# Quantize your model weights to 8-bit
q_tensor, scale, zero_point = quantize_8bit(model_weights)

# Recover an approximate float tensor when needed
weights_hat = dequantize_8bit(q_tensor, scale, zero_point)
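Since the library is in early development, here is a self-contained pure-PyTorch sketch of what an affine 8-bit quantize/dequantize round trip looks like. These are stand-in functions, not Quanta's implementation.

```python
import torch

def quantize_8bit(t):
    # Affine (asymmetric) quantization: map [t_min, t_max] onto uint8 [0, 255].
    qmin, qmax = 0, 255
    t_min, t_max = t.min(), t.max()
    scale = ((t_max - t_min) / (qmax - qmin)).clamp(min=1e-8)  # avoid div by zero
    zero_point = (qmin - t_min / scale).round().clamp(qmin, qmax)
    q = (t / scale + zero_point).round().clamp(qmin, qmax).to(torch.uint8)
    return q, scale, zero_point

def dequantize_8bit(q, scale, zero_point):
    # Invert the affine mapping; the result approximates the original tensor.
    return (q.to(torch.float32) - zero_point) * scale

w = torch.randn(4, 4)
q, s, zp = quantize_8bit(w)
w_hat = dequantize_8bit(q, s, zp)
max_err = (w - w_hat).abs().max()   # bounded by ~scale / 2
```

The round-trip error is at most half a quantization step (`scale / 2`), which is the basic accuracy/memory trade-off 8-bit quantization makes.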

Status

🚧 Early Development - Currently implementing core quantization features.

License

MIT License

Inspired by bitsandbytes
