Organizations

@IBM


GraphGen4Code: a toolkit for creating code knowledge graphs based on WALA code analysis and extraction of documentation and forum content.

Jupyter Notebook 321 42 Updated Nov 19, 2024
Java 28 18 Updated Nov 4, 2025

CUGA is an open-source generalist agent for the enterprise, supporting complex task execution on web and APIs, OpenAPI/MCP integrations, composable architecture, reasoning modes, and policy-aware f…

Python 182 18 Updated Nov 7, 2025

A micro Python based probabilistic programming language

Jupyter Notebook 5 Updated Nov 3, 2025
Python 18 5 Updated Apr 15, 2024

Reproduction Package for the paper "Type-Constrained Code Generation with Language Models" [PLDI 2025]

Python 77 3 Updated Jun 11, 2025

PoTo: A Hybrid Andersen's Points-to Analysis for Python

Python 3 Updated Jun 29, 2025

Issue-Test-Localizer: an approach for localizing tests from issue descriptions

Python 2 1 Updated Sep 18, 2025

Top papers related to LLM-based agent evaluation

86 12 Updated Oct 21, 2025

A multi-programming language benchmark for LLMs

Python 279 51 Updated Aug 9, 2025

Python framework which enables you to transform how a user calls or infers an IBM Granite model and how the output from the model is returned to the user.

Python 48 27 Updated Nov 7, 2025

Open-sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.

Shell 219 266 Updated Oct 21, 2025

TDD-Bench-Verified is a new benchmark for generating test cases for test-driven development (TDD)

Python 25 3 Updated Sep 18, 2025

ChatDBG - AI-assisted debugging. Uses AI to answer 'why'

Python 1,045 79 Updated Nov 5, 2025

A Database of Real Faults and an Experimental Infrastructure to Enable Controlled Experiments in Software Engineering Research

Perl 892 350 Updated Oct 11, 2025

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

Python 30,796 4,625 Updated Nov 8, 2025

Agentless🐱: an agentless approach to automatically solve software development problems

Python 1,949 213 Updated Dec 22, 2024

Prompt Declaration Language (PDL) is a declarative prompt programming language.

Python 245 43 Updated Nov 7, 2025

Static Python call graph generator

Python 358 72 Updated Nov 26, 2023

Build production-ready AI agents in both Python and Typescript.

Python 2,941 387 Updated Nov 7, 2025

The official Python SDK for Codellm-Devkit

Python 116 28 Updated Nov 7, 2025

SWE-bench: Can Language Models Resolve Real-world GitHub Issues?

Python 3,767 679 Updated Oct 11, 2025

KubeStellar - a flexible solution for multi-cluster configuration management for edge, multi-cloud, and hybrid cloud

Go 581 203 Updated Nov 7, 2025

A language for constraint-guided and efficient LLM programming.

Python 4,078 214 Updated May 22, 2025

tempeh is a framework to TEst Machine learning PErformance exHaustively which includes tracking memory usage and run time.

Python 18 5 Updated Jan 3, 2022

A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning

Python 7,056 1,321 Updated Aug 26, 2025

A library of sklearn compatible categorical variable encoders

Python 2,467 404 Updated Nov 2, 2025

Home of CodeT5: Open Code LLMs for Code Understanding and Generation

Python 3,080 485 Updated Jan 20, 2024

AutoML debugging and remediation tool called MARO: ML Automated Remediation Oracle

Python 2 Updated Jun 21, 2022