- Singapore
- https://26hzhang.github.io/
- https://orcid.org/0000-0002-2725-6458
- @hzhang26
- in/hzhang26
Stars
TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools
MartinLwx / pdb-tutorial
Forked from spiside/pdb-tutorial
A simple tutorial about effectively using pdb
Official Repo for SwS: A Weakness-driven Problem Synthesis Framework in RL for LLM Reasoning
Leverage the WorldQuant API to generate alpha signals and mine promising alpha expressions.
Get started with building Fullstack Agents using Gemini 2.5 and LangGraph
Statsmodels: statistical modeling and econometrics in Python
Python wrapper for TA-Lib (http://ta-lib.org/).
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, i…
📈 Get real-time stocks from TradingView
Trading Framework and Bot based on Moomoo/Futu
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for the paper "Group-in-Group Policy Optimization for LLM Agent Training"
verl: Volcano Engine Reinforcement Learning for LLMs
Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1 🍓
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
Awesome RL Reasoning Recipes ("Triple R")
What Makes a Reward Model a Good Teacher? An Optimization Perspective
Model merging is a highly efficient approach for long-to-short reasoning.
This is the first paper to explore how to effectively use RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reasoning ca…
Understanding R1-Zero-Like Training: A Critical Perspective