I'm a software engineer and researcher focused on AI reliability, distributed systems, and functional programming. I build infrastructure for LLM research on the Elixir/BEAM platform.
I'm the creator of the Crucible Framework, a platform for conducting reproducible experiments on large language model reliability, built on Elixir/OTP.
Key Goal: Building towards 99%+ LLM reliability through ensemble voting and request hedging, with comprehensive statistical testing and transparent causal reasoning chains.
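The hedging idea above is simple to sketch on the BEAM: start the primary request, and if it hasn't answered within a hedge delay, fire a backup and take whichever replies first. This is an illustrative sketch only (the `HedgeSketch` module and `call_model/1` stand-in are hypothetical, not crucible_hedging's actual API):

```elixir
defmodule HedgeSketch do
  @hedge_delay_ms 50

  # Stand-in for a real model call: sleep for the given latency, then answer.
  def call_model(latency_ms) do
    Process.sleep(latency_ms)
    {:ok, latency_ms}
  end

  def hedged_call(primary_ms, backup_ms) do
    primary = Task.async(fn -> call_model(primary_ms) end)

    case Task.yield(primary, @hedge_delay_ms) do
      {:ok, result} ->
        # Primary answered within the hedge delay; no backup needed.
        result

      nil ->
        # Primary is slow: fire a backup and take the first reply.
        backup = Task.async(fn -> call_model(backup_ms) end)
        pref = primary.ref
        bref = backup.ref

        receive do
          {ref, result} when ref == pref or ref == bref ->
            Process.demonitor(ref, [:flush])
            # (A production version would also shut down the slower task.)
            result
        end
    end
  end
end
```

Because `Task.async` delivers results as mailbox messages, the first reply wins regardless of which task it came from; this is what trims tail latency at the cost of occasional duplicate work.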
All published under the @North-Shore-AI organization:
| Library | Description |
|---|---|
| crucible_framework | Documentation hub & research framework |
| crucible_bench | Statistical testing & analysis (15+ tests, effect sizes, power analysis) |
| crucible_ensemble | Multi-model voting strategies for improved reliability |
| crucible_hedging | Request hedging for latency reduction |
| crucible_trace | Causal reasoning chain logging for LLM transparency |
| crucible_datasets | Unified interface to benchmark datasets (MMLU, HumanEval, GSM8K) |
| crucible_telemetry | Research-grade instrumentation & metrics collection |
| crucible_harness | Automated experiment orchestration & reporting |
| crucible_examples | Interactive Phoenix LiveView demos showcasing all framework components |
| crucible_adversary | Adversarial testing & robustness evaluation framework |
| crucible_xai | Explainable AI tools (LIME, SHAP, feature attribution) |
| ExDataCheck | Data validation & quality library for ML pipelines |
| ExFairness | Fairness & bias detection library for AI/ML systems |
| LLMGuard | AI firewall & guardrails for LLM-based applications |
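At its core, the multi-model voting that crucible_ensemble provides reduces to counting agreement across model outputs and returning the plurality answer. A minimal sketch (the `VoteSketch` module is illustrative, not the library's actual API; ties resolve arbitrarily here):

```elixir
defmodule VoteSketch do
  # Return the answer given by the most models. With an odd number of
  # models and a true majority, this is the majority vote; on a tie the
  # winner is whichever maximal entry Enum.max_by encounters first.
  def majority([_ | _] = answers) do
    answers
    |> Enum.frequencies()
    |> Enum.max_by(fn {_answer, count} -> count end)
    |> elem(0)
  end
end
```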
Tech Stack: Elixir, OTP, BEAM VM, Telemetry
Research Areas: LLM reliability, ensemble methods, tail latency optimization, statistical testing
Status: Active development, v0.1.0 released
- synapse ⭐ 23 - Synapse: Elixir-powered AI agent orchestration, built on the battle-teste…
- ds_ex ⭐ 14 - DSPEx - Declarative Self-improving Elixir | A BEAM-Native AI Program O…
- DSPex ⭐ 8 - Declarative Self Improving Elixir - DSPy Orchestration in Elixir
- mabeam ⭐ 4 - Multi Agent BEAM
- AutoElixir ⭐ 3 - AI Multi Agent Swarms in Elixir
- ALTAR ⭐ 4 - The Agent & Tool Arbitration Protocol
- gemini_ex ⭐ 18 - Elixir Interface / Adapter for Google Gemini LLM, for both AI Studio a…
- claude_agent_sdk ⭐ 7 - Elixir SDK for Claude AI Agent API - Renamed from claude_code_sdk_elix…
- codex_sdk ⭐ 1 - OpenAI Codex SDK written in Elixir
- jules_ex ⭐ 1 - Elixir client SDK for the Jules API - orchestrate AI coding sessions
- pipeline_ex ⭐ 6 - Claude Code + Gemini AI collaboration orchestration tools
- gepa_ex ⭐ 0 - GEPA (Genetic-Pareto) optimizer combining LLM-powered reflection with Pareto search to evolve text-based system components
- snakebridge ⭐ 0 - Configuration-driven Python library integration for Elixir - auto-generate type-safe wrappers for any Python library
- weaviate_ex ⭐ 0 - Modern Elixir client for Weaviate vector database with health checks…
- json_remedy ⭐ 20 - A practical, multi-layered JSON repair library for Elixir that intelli…
- snakepit ⭐ 8 - High-performance, generalized process pooler and session manager for e…
- duckdb_ex ⭐ 0 - DuckDB driver client in Elixir
- sinter ⭐ 9 - Unified schema definition, validation, and JSON generation for Elixir
- exdantic ⭐ 8 - A powerful, flexible schema definition and validation library for Elix…
- perimeter ⭐ 6 - Elixir Typing Mechanism
- ex_dbg ⭐ 9 - State-of-the-Art Introspection and Debugging System for Elixir/Phoenix…
- elixir_scope ⭐ 4 - Revolutionary AST-based debugging and code intelligence platform for E…
- ElixirScope ⭐ 3 - AI-Powered Execution Cinema Debugger for Elixir/BEAM
- elixir_dashboard ⭐ 0 - A Phoenix LiveView performance monitoring dashboard for tracking slow endpoints and database queries
- elixir_tracer ⭐ 0 - Local-first observability for Elixir with New Relic API parity
- superlearner ⭐ 6 - OTP Supervisor Educational Platform
- apex ⭐ 3 - Core Apex framework for OTP supervision and monitoring
- apex_ui ⭐ 3 - Web UI for Apex OTP supervision and monitoring tools
- arsenal ⭐ 3 - Metaprogramming framework for automatic REST API generation from OTP o…
- arsenal_plug ⭐ 2 - Phoenix/Plug adapter for Apex Arsenal framework
- supertester ⭐ 3 - A battle-hardened testing toolkit for building robust and resilient El…
- sandbox ⭐ 3 - Isolated OTP application management system for Elixir/Erlang
- cluster_test ⭐ 3 - Distributed Erlang/Elixir test cluster management via Mix tasks
- foundation ⭐ 10 - Elixir infrastructure and Observability Library
- AITrace ⭐ 0 - The unified observability layer for the AI Control Plane
- Assessor ⭐ 0 - The definitive CI/CD platform for AI Quality.
- Citadel ⭐ 0 - The command and control layer for the AI-powered enterprise
- cf_ex ⭐ 3 - Elixir libraries for Cloudflare edge computing services. Battle-tested…
- ex_cloudflare_phoenix ⭐ 0 - Cloudflare Durable Objects and Calls for Phoenix Framework
- playwriter ⭐ 6 - Elixir WSL-to-Windows browser integration
- youtube_audio_dl ⭐ 0 - Download high-quality audio from YouTube as MP3 files using Elixir. Fe…
- tools ⭐ 0 - Elixir repository
Languages: Elixir, Erlang, Python, JavaScript/TypeScript, Rust
Frameworks: Phoenix, OTP, FastAPI, React
Specialties:
- Distributed systems & fault tolerance
- AI/LLM infrastructure & reliability
- Functional programming & metaprogramming
- Statistical analysis & experimental design
- Developer tools & productivity
Platforms: BEAM VM, AWS, GCP, Cloudflare Workers, Edge Computing
Research: LLM reliability through ensemble methods and statistical testing
Building: AI infrastructure on Elixir/OTP
Learning: Advanced OTP patterns, distributed systems optimization
Growing: The Crucible framework ecosystem
- GitHub: @nshkrdotcom
- Organization: @North-Shore-AI
"Build infrastructure that researchers and engineers actually want to use. Make reliability measurable. Make experiments reproducible. Make the BEAM shine for AI workloads."
- Collaboration on Elixir AI tooling
- Consulting for distributed systems & AI infrastructure
- Speaking about LLM reliability, Elixir/OTP, or functional programming
- Research partnerships in AI reliability & distributed systems
- Open source contributions - PRs welcome on any project!
Period: September 28 - October 27, 2025
| Metric | Value |
|---|---|
| Repositories Analyzed | 44 active repositories |
| Total Commits | 978 commits |
| Lines Added | 3,163,289 |
| Lines Removed | 2,476,036 |
| Total Lines Changed (added + removed) | 5,639,325 |
| Average per Repo | 22 commits, 128k lines |
| Average per Commit | 5,766 lines |
| Daily Velocity | 187,977 lines/day |
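The averages in the table follow directly from the raw counts; a quick check with integer division (figures match the table after rounding):

```elixir
added = 3_163_289
removed = 2_476_036
total_changed = added + removed                  # 5_639_325
commits = 978
repos = 44
days = 30

lines_per_commit = div(total_changed, commits)   # 5_766
lines_per_repo = div(total_changed, repos)       # 128_166 (~128k)
lines_per_day = div(total_changed, days)         # 187_977
```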
- DSPex - 4.8M lines (18 commits) - Declarative AI orchestration framework
- pump_fun_web - 200K lines (202 commits) - Solana memecoin platform
- snakepit - 140K lines (127 commits) - Python-Elixir bridge & process pooling
- claude_agent_sdk - 82K lines (65 commits) - Claude AI agent integration
- Spectra - 70K lines (8 commits) - Web application project
- code_agent - 38K lines (38 commits) - AI coding agent system
- weaviate_ex - 29K lines (28 commits) - Vector database client
- nshkrdotcom.github.io - 28K lines (219 commits) - Portfolio & documentation site
- gemini_ex - 26K lines (23 commits) - Google Gemini LLM client
- pipeline_ex - 24K lines (7 commits) - Multi-LLM orchestration tools
| Category | Repositories | Commits | Lines Changed | % of Total |
|---|---|---|---|---|
| AI Agent Orchestration & SDKs | 10 | 234 | ~5,030,237 | 89.2% |
| Web Applications | 4 | 435 | ~307,209 | 5.4% |
| Infrastructure & Data | 8 | 99 | ~208,412 | 3.7% |
| Developer Tools | 6 | 43 | ~51,130 | 0.9% |
| Other Projects | 16 | 167 | ~42,337 | 0.8% |
Period: September 28 - October 27, 2025
| Metric | Value |
|---|---|
| Total Cost | $3,347.39 |
| Total Tokens | 6.1B tokens |
| Input Tokens | 6.0M |
| Output Tokens | 12.6M |
| Cache Creates | 304.5M |
| Cache Reads | 5.8B |
| Average Daily Cost | $111.58 |
| Peak Usage Day | Oct 11 ($256.12) |
Models Used: Sonnet 4.5, Sonnet 4, Haiku 4.5, Haiku 3.5, Opus 4.1
| Metric | Value | Analysis |
|---|---|---|
| Cost per Line Added | $0.00106 | Extremely cost-efficient at ~$1 per 1,000 lines added |
| Cost per Commit | $3.42 | High-value commits averaging 5,766 lines each |
| Cache Hit Rate | 95% | Exceptional prompt reuse & context efficiency |
| Lines per Dollar | 943 | Strong ROI on AI-assisted development |
| Tokens per Commit | 6.2M | Complex, context-rich development sessions |
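As a sanity check, the derived metrics follow from the raw figures in the usage table (cost per line is taken against lines added; lines-per-dollar comes out at ~945 with these inputs, so the table's 943 presumably reflects slightly different rounding or inputs):

```elixir
total_cost = 3_347.39
lines_added = 3_163_289
commits = 978
cache_reads = 5.8e9
total_tokens = 6.1e9

cost_per_line = Float.round(total_cost / lines_added, 5)      # 0.00106
cost_per_commit = Float.round(total_cost / commits, 2)        # 3.42
lines_per_dollar = round(lines_added / total_cost)            # 945
cache_hit_rate = Float.round(cache_reads / total_tokens, 3)   # 0.951 (~95%)
```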
High-Impact Work:
- 89% of changes focused on AI/LLM infrastructure and SDK development
- Major refactoring/enhancement of DSPex declarative AI framework
- Production web application (pump_fun_web) with 202 commits shows sustained delivery
- Parallel development across 44 repositories demonstrates broad ecosystem growth
AI-Assisted Development ROI:
- Total Investment: $3,347 over 30 days
- Output: 5.6M lines of code across 978 commits
- Efficiency: ~$1 per 1,000 lines of thoughtfully architected code
- Time Multiplier: 95% cache hit rate indicates highly efficient iterative development
- Context Leverage: Average 6.2M tokens/commit shows deep architectural work
Peak Activity Correlation:
- Oct 7-11: Infrastructure automation & AI framework development ($391-$256/day)
- Oct 14: Repository organization & documentation systems ($110/day)
- Oct 17-20: Multi-repo AI SDK development ($253-$182/day)
- Higher costs correlate with complex architectural decisions & framework design
Notable Achievements:
- Built/enhanced 10 AI orchestration & SDK projects
- Delivered production Solana platform with 200K+ LOC
- Created comprehensive Python-Elixir bridge (snakepit)
- Developed multiple vector DB & graph DB clients
- Maintained documentation hub with 219 commits
- Automated GitHub star tracking & README generation
Primary Activities:
- 89% AI/LLM infrastructure & agent orchestration
- 5% Production web applications
- 4% Data infrastructure & database clients
- 2% Developer tooling & automation
Development Velocity:
- Daily Average: 187,977 lines/day (32.6 commits/day)
- Per Session: 5,766 lines/commit
- With AI Assistance: 943 lines per dollar spent
Quality Indicators:
- High commit-to-LOC ratio (5,766 lines/commit) suggests substantial, well-planned changes
- 95% cache hit rate indicates consistent, iterative refinement
- Distributed across 44 repos shows ecosystem thinking vs. siloed development
- 6.2M tokens/commit demonstrates thorough architectural consideration
Last updated: 2025-11-06


