I'm a software engineer and researcher focused on AI reliability, distributed systems, and functional programming. I build infrastructure for LLM research on the Elixir/BEAM platform.
I'm the creator of the Crucible Framework, a platform for conducting reproducible experiments on large language model reliability, built on Elixir/OTP.
Key Goal: Building towards 99%+ LLM reliability through ensemble voting and request hedging, with comprehensive statistical testing and transparent causal reasoning chains.
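The hedging idea above is simple to sketch on the BEAM: start the primary request, and if it hasn't answered within a hedge delay, fire a backup and take whichever replies first. This is an illustrative sketch only (the `HedgeSketch` module and `call_model/1` stand-in are hypothetical, not crucible_hedging's actual API):

```elixir
defmodule HedgeSketch do
  @hedge_delay_ms 50

  # Stand-in for a real model call: sleep for the given latency, then answer.
  def call_model(latency_ms) do
    Process.sleep(latency_ms)
    {:ok, latency_ms}
  end

  def hedged_call(primary_ms, backup_ms) do
    primary = Task.async(fn -> call_model(primary_ms) end)

    case Task.yield(primary, @hedge_delay_ms) do
      {:ok, result} ->
        # Primary answered within the hedge delay; no backup needed.
        result

      nil ->
        # Primary is slow: fire a backup and take the first reply.
        backup = Task.async(fn -> call_model(backup_ms) end)
        pref = primary.ref
        bref = backup.ref

        receive do
          {ref, result} when ref == pref or ref == bref ->
            Process.demonitor(ref, [:flush])
            # (A production version would also shut down the slower task.)
            result
        end
    end
  end
end
```

Because `Task.async` delivers results as mailbox messages, the first reply wins regardless of which task it came from; this is what trims tail latency at the cost of occasional duplicate work.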
All published under the @North-Shore-AI organization:
| Library | Description |
|---|---|
| crucible_framework | Documentation hub & research framework |
| crucible_bench | Statistical testing & analysis (15+ tests, effect sizes, power analysis) |
| crucible_ensemble | Multi-model voting strategies for improved reliability |
| crucible_hedging | Request hedging for latency reduction |
| crucible_trace | Causal reasoning chain logging for LLM transparency |
| crucible_datasets | Unified interface to benchmark datasets (MMLU, HumanEval, GSM8K) |
| crucible_telemetry | Research-grade instrumentation & metrics collection |
| crucible_harness | Automated experiment orchestration & reporting |
| crucible_examples | Interactive Phoenix LiveView demos showcasing all framework components |
| crucible_adversary | Adversarial testing & robustness evaluation framework |
| crucible_xai | Explainable AI tools (LIME, SHAP, feature attribution) |
| ExDataCheck | Data validation & quality library for ML pipelines |
| ExFairness | Fairness & bias detection library for AI/ML systems |
| LLMGuard | AI firewall & guardrails for LLM-based applications |
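At its core, the multi-model voting that crucible_ensemble provides reduces to counting agreement across model outputs and returning the plurality answer. A minimal sketch (the `VoteSketch` module is illustrative, not the library's actual API; ties resolve arbitrarily here):

```elixir
defmodule VoteSketch do
  # Return the answer given by the most models. With an odd number of
  # models and a true majority, this is the majority vote; on a tie the
  # winner is whichever maximal entry Enum.max_by encounters first.
  def majority([_ | _] = answers) do
    answers
    |> Enum.frequencies()
    |> Enum.max_by(fn {_answer, count} -> count end)
    |> elem(0)
  end
end
```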
Tech Stack: Elixir, OTP, BEAM VM, Telemetry
Research Areas: LLM reliability, ensemble methods, tail latency optimization, statistical testing
Status: Active development, v0.1.0 released
- synapse ⭐ 23 - Synapse: Elixir-powered AI agent orchestration, built on the battle-teste…
- ds_ex ⭐ 14 - DSPEx - Declarative Self-improving Elixir | A BEAM-Native AI Program O…
- DSPex ⭐ 8 - Declarative Self Improving Elixir - DSPy Orchestration in Elixir
- mabeam ⭐ 4 - Multi Agent BEAM
- AutoElixir ⭐ 3 - AI Multi Agent Swarms in Elixir
- ALTAR ⭐ 4 - The Agent & Tool Arbitration Protocol
- gemini_ex ⭐ 18 - Elixir Interface / Adapter for Google Gemini LLM, for both AI Studio a…
- claude_agent_sdk ⭐ 7 - Elixir SDK for Claude AI Agent API - Renamed from claude_code_sdk_elix…
- codex_sdk ⭐ 1 - OpenAI Codex SDK written in Elixir
- jules_ex ⭐ 1 - Elixir client SDK for the Jules API - orchestrate AI coding sessions
- pipeline_ex ⭐ 6 - Claude Code + Gemini AI collaboration orchestration tools
- gepa_ex ⭐ 0 - GEPA (Genetic-Pareto) optimizer combining LLM-powered reflection with Pareto search to evolve text-based system components
- snakebridge ⭐ 0 - Configuration-driven Python library integration for Elixir - auto-generate type-safe wrappers for any Python library
- weaviate_ex ⭐ 0 - Modern Elixir client for Weaviate vector database with health checks…
- json_remedy ⭐ 20 - A practical, multi-layered JSON repair library for Elixir that intelli…
- snakepit ⭐ 8 - High-performance, generalized process pooler and session manager for e…
- duckdb_ex ⭐ 0 - DuckDB driver client in Elixir
- sinter ⭐ 9 - Unified schema definition, validation, and JSON generation for Elixir
- exdantic ⭐ 8 - A powerful, flexible schema definition and validation library for Elix…
- perimeter ⭐ 6 - Elixir Typing Mechanism
- ex_dbg ⭐ 9 - State-of-the-Art Introspection and Debugging System for Elixir/Phoenix…
- elixir_scope ⭐ 4 - Revolutionary AST-based debugging and code intelligence platform for E…
- ElixirScope ⭐ 3 - AI-Powered Execution Cinema Debugger for Elixir/BEAM
- elixir_dashboard ⭐ 0 - A Phoenix LiveView performance monitoring dashboard for tracking slow endpoints and database queries
- elixir_tracer ⭐ 0 - Local-first observability for Elixir with New Relic API parity
- superlearner ⭐ 6 - OTP Supervisor Educational Platform
- apex ⭐ 3 - Core Apex framework for OTP supervision and monitoring
- apex_ui ⭐ 3 - Web UI for Apex OTP supervision and monitoring tools
- arsenal ⭐ 3 - Metaprogramming framework for automatic REST API generation from OTP o…
- arsenal_plug ⭐ 2 - Phoenix/Plug adapter for Apex Arsenal framework
- supertester ⭐ 3 - A battle-hardened testing toolkit for building robust and resilient El…
- sandbox ⭐ 3 - Isolated OTP application management system for Elixir/Erlang
- cluster_test ⭐ 3 - Distributed Erlang/Elixir test cluster management via Mix tasks
- foundation ⭐ 10 - Elixir infrastructure and Observability Library
- AITrace ⭐ 0 - The unified observability layer for the AI Control Plane
- Assessor ⭐ 0 - The definitive CI/CD platform for AI Quality.
- Citadel ⭐ 0 - The command and control layer for the AI-powered enterprise
- cf_ex ⭐ 3 - Elixir libraries for Cloudflare edge computing services. Battle-tested…
- ex_cloudflare_phoenix ⭐ 0 - Cloudflare Durable Objects and Calls for Phoenix Framework
- playwriter ⭐ 6 - Elixir WSL-to-Windows browser integration
- youtube_audio_dl ⭐ 0 - Download high-quality audio from YouTube as MP3 files using Elixir. Fe…
- tools ⭐ 0 - Elixir repository
Languages: Elixir, Erlang, Python, JavaScript/TypeScript, Rust
Frameworks: Phoenix, OTP, FastAPI, React
Specialties:
- Distributed systems & fault tolerance
- AI/LLM infrastructure & reliability
- Functional programming & metaprogramming
- Statistical analysis & experimental design
- Developer tools & productivity
Platforms: BEAM VM, AWS, GCP, Cloudflare Workers, Edge Computing
Research: LLM reliability through ensemble methods and statistical testing
Building: AI infrastructure on Elixir/OTP
Learning: Advanced OTP patterns, distributed systems optimization
Growing: The Crucible framework ecosystem
- GitHub: @nshkrdotcom
- Organization: @North-Shore-AI
"Build infrastructure that researchers and engineers actually want to use. Make reliability measurable. Make experiments reproducible. Make the BEAM shine for AI workloads."
- Collaboration on Elixir AI tooling
- Consulting for distributed systems & AI infrastructure
- Speaking about LLM reliability, Elixir/OTP, or functional programming
- Research partnerships in AI reliability & distributed systems
- Open source contributions - PRs welcome on any project!
Period: September 28 - October 27, 2025
| Metric | Value |
|---|---|
| Repositories Analyzed | 44 active repositories |
| Total Commits | 978 commits |
| Lines Added | 3,163,289 |
| Lines Removed | 2,476,036 |
| Total Lines Changed (added + removed) | 5,639,325 |
| Average per Repo | 22 commits, 128k lines |
| Average per Commit | 5,766 lines |
| Daily Velocity | 187,977 lines/day |
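The averages in the table follow directly from the raw counts; a quick check with integer division (figures match the table after rounding):

```elixir
added = 3_163_289
removed = 2_476_036
total_changed = added + removed                  # 5_639_325
commits = 978
repos = 44
days = 30

lines_per_commit = div(total_changed, commits)   # 5_766
lines_per_repo = div(total_changed, repos)       # 128_166 (~128k)
lines_per_day = div(total_changed, days)         # 187_977
```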
- DSPex - 4.8M lines (18 commits) - Declarative AI orchestration framework
- pump_fun_web - 200K lines (202 commits) - Solana memecoin platform
- snakepit - 140K lines (127 commits) - Python-Elixir bridge & process pooling
- claude_agent_sdk - 82K lines (65 commits) - Claude AI agent integration
- Spectra - 70K lines (8 commits) - Web application project
- code_agent - 38K lines (38 commits) - AI coding agent system
- weaviate_ex - 29K lines (28 commits) - Vector database client
- nshkrdotcom.github.io - 28K lines (219 commits) - Portfolio & documentation site
- gemini_ex - 26K lines (23 commits) - Google Gemini LLM client
- pipeline_ex - 24K lines (7 commits) - Multi-LLM orchestration tools
| Category | Repositories | Commits | Lines Changed | % of Total |
|---|---|---|---|---|
| AI Agent Orchestration & SDKs | 10 | 234 | ~5,030,237 | 89.2% |
| Web Applications | 4 | 435 | ~307,209 | 5.4% |
| Infrastructure & Data | 8 | 99 | ~208,412 | 3.7% |
| Developer Tools | 6 | 43 | ~51,130 | 0.9% |
| Other Projects | 16 | 167 | ~42,337 | 0.8% |
Period: September 28 - October 27, 2025
| Metric | Value |
|---|---|
| Total Cost | $3,347.39 |
| Total Tokens | 6.1B tokens |
| Input Tokens | 6.0M |
| Output Tokens | 12.6M |
| Cache Creates | 304.5M |
| Cache Reads | 5.8B |
| Average Daily Cost | $111.58 |
| Peak Usage Day | Oct 11 ($256.12) |
Models Used: Sonnet 4.5, Sonnet 4, Haiku 4.5, Haiku 3.5, Opus 4.1
| Metric | Value | Analysis |
|---|---|---|
| Cost per Line Added | $0.00106 | Extremely cost-efficient at ~$1 per 1,000 lines added |
| Cost per Commit | $3.42 | High-value commits averaging 5,766 lines each |
| Cache Hit Rate | 95% | Exceptional prompt reuse & context efficiency |
| Lines per Dollar | 943 | Strong ROI on AI-assisted development |
| Tokens per Commit | 6.2M | Complex, context-rich development sessions |
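As a sanity check, the derived metrics follow from the raw figures in the usage table (cost per line is taken against lines added; lines-per-dollar comes out at ~945 with these inputs, so the table's 943 presumably reflects slightly different rounding or inputs):

```elixir
total_cost = 3_347.39
lines_added = 3_163_289
commits = 978
cache_reads = 5.8e9
total_tokens = 6.1e9

cost_per_line = Float.round(total_cost / lines_added, 5)      # 0.00106
cost_per_commit = Float.round(total_cost / commits, 2)        # 3.42
lines_per_dollar = round(lines_added / total_cost)            # 945
cache_hit_rate = Float.round(cache_reads / total_tokens, 3)   # 0.951 (~95%)
```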
High-Impact Work:
- 89% of changes focused on AI/LLM infrastructure and SDK development
- Major refactoring/enhancement of DSPex declarative AI framework
- Production web application (pump_fun_web) with 202 commits shows sustained delivery
- Parallel development across 44 repositories demonstrates broad ecosystem growth
AI-Assisted Development ROI:
- Total Investment: $3,347 over 30 days
- Output: 5.6M lines of code across 978 commits
- Efficiency: ~$1 per 1,000 lines of thoughtfully architected code
- Time Multiplier: 95% cache hit rate indicates highly efficient iterative development
- Context Leverage: Average 6.2M tokens/commit shows deep architectural work
Peak Activity Correlation:
- Oct 7-11: Infrastructure automation & AI framework development ($391-$256/day)
- Oct 14: Repository organization & documentation systems ($110/day)
- Oct 17-20: Multi-repo AI SDK development ($253-$182/day)
- Higher costs correlate with complex architectural decisions & framework design
Notable Achievements:
- Built/enhanced 10 AI orchestration & SDK projects
- Delivered production Solana platform with 200K+ LOC
- Created comprehensive Python-Elixir bridge (snakepit)
- Developed multiple vector DB & graph DB clients
- Maintained documentation hub with 219 commits
- Automated GitHub star tracking & README generation
Primary Activities:
- 89% AI/LLM infrastructure & agent orchestration
- 5% Production web applications
- 4% Data infrastructure & database clients
- 2% Developer tooling & automation
Development Velocity:
- Daily Average: 187,977 lines/day (32.6 commits/day)
- Per Session: 5,766 lines/commit
- With AI Assistance: 943 lines per dollar spent
Quality Indicators:
- High commit-to-LOC ratio (5,766 lines/commit) suggests substantial, well-planned changes
- 95% cache hit rate indicates consistent, iterative refinement
- Distributed across 44 repos shows ecosystem thinking vs. siloed development
- 6.2M tokens/commit demonstrates thorough architectural consideration
Last updated: 2025-11-06


