CS Knowledge Base

Hi, I'm Jiaxin Zhang 👋

I am an AI researcher working on reliable long-horizon AI agents, agentic reinforcement learning, and calibrated post-training.

My research asks a simple question:

How can AI agents know what they don’t know, act under uncertainty, and improve from their own prediction–reality gaps?

I build methods, environments, and evaluation frameworks that turn uncertainty, confidence, and consistency into first-class training signals for reliable and self-improving AI systems.

Homepage · Google Scholar · LinkedIn · X/Twitter · Email

Research Focus

Agentic RL & Post-training
Calibration-aware on-policy distillation, GRPO/RL training, self-evolving environments, synthetic feedback, and reward/evaluator design for long-horizon agents.
Alignment, Calibration & Honesty
Uncertainty-aware supervision, confidence calibration, hallucination detection, factuality, scalable oversight, and reliable model behavior.
Long-horizon Agents & Evaluation
Tool use, planning, trajectory-level evaluation, deep research agents, evidence grounding, failure attribution, and enterprise-scale agent benchmarks.

Selected Work

Prospective Hindsight
Self-calibrating reinforcement learning via prediction–reality gaps, aligning an agent’s action-time self-belief with verifier outcomes.
CaOPD: Calibration-aware On-policy Distillation
Decouples capability learning from honest confidence calibration in LLM post-training.
Agentic Uncertainty Quantification
Turns verbalized uncertainty into active control signals for memory, reflection, and long-horizon execution.
[ICML2026] Agentic Confidence Calibration
A trajectory-level calibration framework for diagnosing and improving the reliability of long-horizon agents.
[ACL2026] The Evolving Role of Uncertainty Quantification in Large Language Models
Traces the evolution of uncertainty from a passive diagnostic metric to an active control signal that guides real-time model behavior.

For the full list of publications, please see my Google Scholar or homepage.

Contact

I am interested in reliable AI agents, agentic RL, post-training, calibration, uncertainty, scalable evaluation, and self-improving AI systems. Feel free to reach out via email or visit my homepage.

Jiaxin Zhang jxzhangjhu

Achievements

Achievements

Block or report jxzhangjhu

Hi, I'm Jiaxin Zhang 👋

Research Focus

Selected Work

Contact

Pinned Loading

Uh oh!