- Zurich, Switzerland
- https://nestler.sh/
- RamTorch Public
  Forked from lodestone-rock/RamTorch
  RAM is all you need
  Python Apache License 2.0 Updated Nov 5, 2025
- stable-baselines3 Public
  Forked from DLR-RM/stable-baselines3
  PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
- streaming-drl Public
  Forked from mohmdelsayed/streaming-drl
  Deep reinforcement learning without experience replay, target networks, or batch updates.
- ZClip Public
  Forked from bluorion-com/ZClip
  Official implementation of the paper: "ZClip: Adaptive Spike Mitigation for LLM Pre-Training".
  Python Apache License 2.0 Updated May 15, 2025
- Hyper-gradient-descent Public
  Forked from opooladz/Hyper-gradient-descent
  Python Updated Mar 26, 2025
- psgd_torch Public
  Forked from lixilinx/psgd_torch
  Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation preconditioner and more)
  Python Updated Mar 25, 2025
- claude-code-openai Public
  Forked from 1rgs/claude-code-proxy
  Run Claude Code on OpenAI models
- pokegym Public
  Forked from PufferAI/pokegym
  Gymnasium environment for Pokemon Red
  Python MIT License Updated Mar 4, 2025
- jax-torch-comparison Public
  Forked from evanatyourservice/jax-torch-comparison
  Python Updated Feb 5, 2025
- pytorch-image-models Public
  Forked from huggingface/pytorch-image-models
  PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
  Python Apache License 2.0 Updated Feb 1, 2025
- llmdifftracker Public
  Forked from fal-ai-community/llmdifftracker
  Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models)
- kron_torch Public
  Forked from evanatyourservice/kron_torch
  An implementation of PSGD Kron second-order optimizer for PyTorch
  Python Creative Commons Attribution 4.0 International Updated Jan 19, 2025
- adaptive-muon Public
  Forked from leloykun/adaptive-muon
  A version of @KellerJordan's Muon that adapts to the scale of the gradients
  Jupyter Notebook Updated Jan 2, 2025
- PufferLib Public
  Forked from PufferAI/PufferLib
  Simplifying reinforcement learning for complex game environments
- entropix Public
  Forked from xjdr-alt/entropix
  Entropy Based Sampling and Parallel CoT Decoding
  Python Apache License 2.0 Updated Dec 14, 2024
- fsdp_optimizers Public
  Forked from ethansmith2000/fsdp_optimizers
  supporting pytorch FSDP for optimizers
  Python Apache License 2.0 Updated Dec 8, 2024
- stochastic_round_cuda Public
  Forked from ethansmith2000/stochastic_round_cuda
- pytorch Public
  Forked from pytorch/pytorch
  Tensors and Dynamic neural networks in Python with strong GPU acceleration
  Python Other Updated Nov 7, 2024
- schedule_free Public
  Forked from facebookresearch/schedule_free
  Schedule-Free Optimization in PyTorch
- oracle-head-gpt Public
  Forked from SonicCodes/oracle-head-gpt
  probe for predicting future hiddenstates on gpt-2 vibes....
  Python Updated Oct 28, 2024
- modded-nanogpt Public
  Forked from KellerJordan/modded-nanogpt
  NanoGPT (124M) quality in 3.25B tokens
  Python Updated Oct 13, 2024
- flux-fp8-api Public
  Forked from aredden/flux-fp8-api
  Flux diffusion model implementation using quantized fp8 matmul & remaining layers use faster half precision accumulate, which is ~2x faster on consumer devices.
  Python Updated Sep 4, 2024
- guided-diffusion Public
  Forked from kostarion/guided-diffusion
  Python MIT License Updated Jul 23, 2024
- FABRAG Public
  FABRIC.. but RAG!
- memory-transformer-pt4 Public
  Forked from Avelina9X/memory-transformer-pt4
  Jupyter Notebook Updated Feb 24, 2024



