
💻 Mohd Ibrahim Afridi (Afridi)

AI/ML Engineer • Independent Researcher • Entrepreneur • AI Safety & Trust 🌐 Portfolio · 📧 Email · 💼 LinkedIn · 🐙 GitHub


🚀 About Me

Hi, I’m Afridi — an AI/ML engineer and independent researcher obsessed with building verifiable, trustworthy, and safe AI systems.

  • 🧠 Founder & CTO at XCL3NT, an AI-first commerce brand
  • 🤖 Researcher behind Dynamic Chain-of-Thought Reward Models (D-CoT) · Read D-CoT
  • 🛡️ Focus: AI Safety & Multilingual Familiarity — designing systems with evidence, evaluation, and safeguards by default
  • 🌍 Remote-ready and open to relocation (US/EU/NZ/SEA)

Mission: Make AI safer, more transparent, and actually helpful to humanity. 🌱


🧪 My Research Focus

  • AI Safety & Trust: principled safeguards, red-teaming, abstain/route policies, and post-hoc verification
  • Verifiable QA systems (answers backed by evidence)
  • Model behavior analysis & safety alignment
  • Evaluation frameworks (metrics, gates, nightly reports)
  • Human-data pipelines (collection → curation → evals)
  • Cost/freshness routing and retrieval-generation hybrids
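The cost/freshness routing idea above can be sketched as a tiny policy. This is a minimal illustration with hypothetical names and thresholds (`Query`, `route`, `conf_threshold`), not the actual gating logic from my projects:

```python
from dataclasses import dataclass

@dataclass
class Query:
    text: str
    needs_fresh_data: bool   # e.g. the question is about current events
    model_confidence: float  # parametric self-estimate in [0, 1]

def route(q: Query, conf_threshold: float = 0.8) -> str:
    """Pick the cheapest path that still meets freshness/confidence needs."""
    if q.needs_fresh_data:
        return "retrieve"    # parametric memory may be stale
    if q.model_confidence >= conf_threshold:
        return "parametric"  # answer from model weights, no retrieval cost
    return "retrieve"        # low confidence: ground the answer in evidence

print(route(Query("Who won the latest election?", True, 0.95)))  # retrieve
print(route(Query("What is 2 + 2?", False, 0.99)))               # parametric
```

A real router would also weigh retrieval latency and per-call cost, but the decision shape — freshness first, then confidence — stays the same.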

🛠️ Selected Projects

  • Evidence‑Bound Answering System: evidence‑bound answer engine (retriever → answer → verifier). Stack: FastAPI, Next.js, Docker, Helm
  • Prompt Contracts + Fuzzing CI for Answer Engines: prompt contracts + stress packs + CI gates. Stack: Python, YAML DSL
  • Proof‑Answers: proof‑carrying answers with minimal evidence graphs. Stack: Python, Graph APIs
  • UIRE: Universal Intent Resolution Engine (handles ambiguity). Stack: FastAPI, Docker, Helm
  • Human‑Guided Parametric‑vs‑Retrieval Gating: freshness/cost‑aware routing and gating. Stack: Python, Policy Engine
  • TruthLens: claim → evidence fact‑checking engine. Stack: HF Spaces, Transformers

Plus: DataLoaderSpeedrun, BreezeMind‑Pro, Career Vision AI, Human‑Feedback‑Safety‑Simulator, and more on my GitHub.
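To make the prompt-contracts idea concrete, here is a minimal sketch of what a contract check in CI might look like. The contract fields and the `check_contract` helper are hypothetical illustrations, not the project's real YAML DSL:

```python
# A prompt "contract" pins expected model behaviors; CI fails on violations.
CONTRACT = {
    "name": "refuse-medical-dosage",
    "prompt": "What dose of drug X should I take?",
    "must_contain": ["consult"],                  # must point to a professional
    "must_not_contain": ["mg", "take exactly"],   # must not give a dosage
}

def check_contract(contract: dict, output: str) -> list[str]:
    """Return a list of contract violations (empty list = pass)."""
    out = output.lower()
    violations = []
    for phrase in contract["must_contain"]:
        if phrase not in out:
            violations.append(f"missing required phrase: {phrase!r}")
    for phrase in contract["must_not_contain"]:
        if phrase in out:
            violations.append(f"forbidden phrase present: {phrase!r}")
    return violations

print(check_contract(CONTRACT, "Please consult a doctor."))  # []
```

In practice the same check runs over fuzzed variants of the prompt ("stress packs"), and any non-empty violation list gates the merge.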


📈 Impact Highlights

  • 📊 Reduced hallucinations by 38%, latency by 23%, and cost by 44% across 5+ pipelines
  • 🏆 Boosted factual F1 by +7–12pp and alignment quality on ArenaHard by +3.4pp with D‑CoT RMs
  • 📚 Published research including Grok‑3 and Grok‑3+
  • 🧩 Designed prompt contracts, nightly eval dashboards, and safety gates that scale

🧠 Tech Stack

Languages: Python, C++, TypeScript, JS
Frameworks: PyTorch, TensorFlow, JAX, FastAPI, Next.js
Infra: Docker, Kubernetes, Helm, Prometheus, Grafana
Concepts: MoE, FP8, RLHF, KV caching, LoRA, DQN
Other: Z3, Lean4, CI/CD, Retrieval, Eval pipelines


💌 Let’s Collaborate

If you’re building frontier models, eval frameworks, or safety tooling — I’d love to collaborate. Let’s make AI safer, smarter, and actually trustworthy. 🛡️

“AI safety isn’t a checkbox — it’s a responsibility.” – Me, probably during a caffeine high ☕😄


🔒 AI Safety & Trust

  • Build evidence‑bound systems (claims must cite sources)
  • Add prompt contracts + CI gates for regressions
  • Use nightly evals with safety and calibration metrics
  • Prefer abstain/route over confident nonsense
  • Ship receipts: versions, seeds, costs, and checks for replayability
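The first and fourth points above — evidence-bound answers plus preferring abstention — can be sketched together. Everything here is a toy illustration under stated assumptions: `overlaps` is a crude word-overlap support check standing in for a real verifier, and all names are hypothetical:

```python
def overlaps(claim: str, snippet: str, min_shared: int = 3) -> bool:
    """Crude support check: claim and snippet share >= min_shared content words."""
    claim_words = {w.strip(".,!?") for w in claim.lower().split() if len(w) > 3}
    snippet_words = {w.strip(".,!?") for w in snippet.lower().split() if len(w) > 3}
    return len(claim_words & snippet_words) >= min_shared

def evidence_bound_answer(claim: str, evidence: list[str]) -> dict:
    """Emit a claim only if some evidence snippet supports it; else abstain."""
    support = [e for e in evidence if overlaps(claim, e)]
    if not support:
        # Prefer an explicit abstain over confident nonsense.
        return {"answer": None, "status": "abstain", "citations": []}
    return {"answer": claim, "status": "answered", "citations": support}
```

A production verifier would use entailment models rather than word overlap, and the returned `citations` are the "receipts" that make the answer replayable and auditable.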

📊 GitHub Stats

Afridi's GitHub stats


🧩 Fun Fact

I treat debugging like detective work… except the culprit is me from 3 AM last night. 🕵️‍♂️

📌 Pinned Repositories

  1. Evidence-Bound-Answering-System (Public · Python)
     Evidence-bound answer engine: retrieve, cite, verify — sentence by sentence.

  2. Prompt-Contracts-Fuzzing-CI-for-Answer-Engines (Public · Python)
     Prompt-contracts + fuzzing CI for LLMs: YAML DSL, stress packs, diff gates.

  3. Human-Guided-Parametric-vs-Retrieval-Gating (Public · Python)
     Freshness-aware router: decide when to retrieve, compute, clarify, or rely on the model.

  4. Model-behavior-writing-pack (Public · Python)
     Curated writing pack + proof-pack for LLM Model Designer: before/after rewrites, multilingual transcripts, experiment briefs (CI), empathy & accessibility.

  5. Proof-Answers- (Public · Python)
     Proof-carrying answers: evidence graphs + a deterministic verifier per claim.

  6. UIRE (Public · Python)
     Universal Intent Resolution Engine: ambiguity → micro-clarify → policy → prompt.