Martin Hirzel (hirzel)

7 followers · 0 following

http://hirzels.com/martin/

Stars

cuga-project / cuga-agent

CUGA is an open-source generalist agent for the enterprise, supporting complex task execution on web and APIs, OpenAPI/MCP integrations, composable architecture, reasoning modes, and policy-aware f…

Python 182 18 Updated Nov 6, 2025

gbdrt / mu-ppl

A micro Python based probabilistic programming language

Jupyter Notebook 5 Updated Nov 3, 2025

codetlingua / codetlingua

Python 18 5 Updated Apr 15, 2024

eth-sri / type-constrained-code-generation

Reproduction Package for the paper "Type-Constrained Code Generation with Language Models" [PLDI 2025]

Python 77 3 Updated Jun 11, 2025

Ingkarat / PoTo

PoTo: A Hybrid Andersen's Points-to Analysis for Python

Python 3 Updated Jun 29, 2025

IBM / Issue-Test-Localizer

Issue-Test-Localizer: an approach for localizing tests from issue descriptions

Python 2 1 Updated Sep 18, 2025

Asaf-Yehudai / LLM-Agent-Evaluation-Survey

Top papers related to LLM-based agent evaluation

86 12 Updated Oct 21, 2025

nuprl / MultiPL-E

A multi-programming language benchmark for LLMs

Python 279 51 Updated Aug 9, 2025

ibm-granite / granite-io

A Python framework for transforming how a user calls or runs inference on an IBM Granite model and how the model's output is returned to the user.

Python 48 27 Updated Nov 6, 2025

SWE-bench / experiments

Open sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.

Shell 219 266 Updated Oct 21, 2025

IBM / TDD-Bench-Verified

TDD-Bench-Verified is a new benchmark for generating test cases for test-driven development (TDD)

Python 25 3 Updated Sep 18, 2025

plasma-umass / ChatDBG

ChatDBG - AI-assisted debugging. Uses AI to answer 'why'

Python 1,044 79 Updated Nov 5, 2025

rjust / defects4j

A Database of Real Faults and an Experimental Infrastructure to Enable Controlled Experiments in Software Engineering Research

Perl 893 350 Updated Oct 11, 2025

BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

Python 30,760 4,613 Updated Nov 7, 2025