Publications

First-author and co-first-author works are listed first, followed by collaborations.

First-author

Evolving Agents in the Dark: Retrospective Harness Optimization via Self-Preference

Wenbo Pan, Shujie Liu, Chin-Yew Lin, Jingying Zeng, Xianfeng Tang, Xiangyang Zhou, Yan Lu, Xiaohua Jia
arXiv preprint, 2026

RHO improves LLM agents without any ground-truth labels: the agent retrospectively compares its own past trajectories via self-preference, then rewrites its harness (prompts, tools, control flow) to prefer the behaviors it judges better. Joint work with Microsoft Research Asia.

M*: Every Task Deserves Its Own Memory Harness

Wenbo Pan, Shujie Liu, Xiangyang Zhou, Shiwei Zhang, Wanlu Shi, Mirror Xu, Xiaohua Jia
arXiv preprint, 2026

Instead of one fixed memory design for all tasks, M* searches for a task-specific memory architecture expressed as executable Python code, optimizing how an agent stores, retrieves, and consolidates information for each workload.

Towards Long-Horizon Interpretability: Efficient and Faithful Multi-Token Attribution for Reasoning LLMs

Wenbo Pan, Zhichao Liu, Xianlong Wang, Haining Yu, Xiaohua Jia
ICML 2026 (Oral)

FlashTrace attributes multi-token spans in long reasoning chains to their input causes, faithfully and at a fraction of the cost of token-by-token methods. Selected for an oral presentation at ICML 2026 (168 of 23,918 submissions). Ships as a Python package with CLI and interactive HTML traces.

Can LLMs Refuse Questions They Do Not Know? Measuring Knowledge-Aware Refusal in Factual Tasks

Wenbo Pan, Jie Xu, Qiguang Chen, Junhao Dong, Libo Qin, Xinfeng Li, Haining Yu, Xiaohua Jia
ICLR 2026

Proposes knowledge-aware refusal metrics that separate “refusing because the model does not know” from blanket refusal behavior, and measures this ability across model families on factual tasks.

The Hidden Dimensions of LLM Alignment: A Multi-Dimensional Analysis of Orthogonal Safety Directions

Wenbo Pan, Zhichao Liu, Qiguang Chen, Xiangyang Zhou, Haining Yu, Xiaohua Jia
ICML 2025

Shows that safety alignment writes not one but many orthogonal directions into activation space. Extracts this safety residual space, identifies directions predictive of refusal, and shows how individual directions can be ablated or steered.

Breaking Language Barriers: Cross-Lingual Continual Pre-Training at Scale

Wenzhen Zheng*, Wenbo Pan*, Xu Xu*, Libo Qin, Li Yue, Ming Zhou (* co-first authors)
EMNLP 2024

A scaling law for continual pre-training (CPT): given a compute budget, it predicts the loss reachable when adapting an existing checkpoint to a new data distribution (e.g., a new language), and is validated on large-scale cross-lingual CPT runs.

A Preliminary Evaluation of ChatGPT for Zero-shot Dialogue Understanding

Wenbo Pan, Qiguang Chen, Xiao Xu, Wanxiang Che, Libo Qin
arXiv preprint, 2023

One of the earliest systematic evaluations of ChatGPT on dialogue understanding (slot filling, intent detection, and dialogue state tracking), documenting failure modes that shaped later instruction-tuning work.

Collaborations

WebTrap: Stealthy Mid-Task Hijacking of Browser Agents During Navigation

Zhichao Liu, Wenbo Pan, Haining Yu, Ge Gao, Tianqing Zhu, Xiaohua Jia
arXiv preprint, 2026

Image-to-Video Diffusion: From Foundations to Open Frontiers

Xianlong Wang, Wenbo Pan, Shijia Zhou, Ke Li, Yuqi Wang, Zeyu Ye, Hangtao Zhang, Leo Yu Zhang, Xiaohua Jia
arXiv preprint, 2026

Dual-branch Robust Unlearnable Examples

Xianlong Wang, Hangtao Zhang, Wenbo Pan, Ziqi Zhou, Changsong Jiang, Li Zeng, Xiaohua Jia
arXiv preprint, 2026

Improve Fluency of Neural Machine Translation Using Large Language Models

Jianfei He, Wenbo Pan, Jijia Yang, Sen Peng, Xiaohua Jia
MT Summit 2025

End-to-end Task-oriented Dialogue: A Survey of Tasks, Methods, and Future Directions

Libo Qin, Wenbo Pan, Qiguang Chen, Lizi Liao, Zhou Yu, Yue Zhang, Wanxiang Che, Min Li
arXiv preprint, 2023

BibTeX

@article{pan2026rho,
  title={Evolving Agents in the Dark: Retrospective Harness Optimization via Self-Preference},
  author={Pan, Wenbo and Liu, Shujie and Lin, Chin-Yew and Zeng, Jingying and Tang, Xianfeng and Zhou, Xiangyang and Lu, Yan and Jia, Xiaohua},
  journal={arXiv preprint arXiv:2606.05922},
  year={2026}
}

@article{pan2026mstar,
  title={M$^\star$: Every Task Deserves Its Own Memory Harness},
  author={Pan, Wenbo and Liu, Shujie and Zhou, Xiangyang and Zhang, Shiwei and Shi, Wanlu and Xu, Mirror and Jia, Xiaohua},
  journal={arXiv preprint arXiv:2604.11811},
  year={2026}
}

@inproceedings{pan2026flashtrace,
  title={Towards Long-Horizon Interpretability: Efficient and Faithful Multi-Token Attribution for Reasoning {LLMs}},
  author={Pan, Wenbo and Liu, Zhichao and Wang, Xianlong and Yu, Haining and Jia, Xiaohua},
  booktitle={International Conference on Machine Learning (ICML)},
  note={Oral presentation},
  year={2026}
}

@inproceedings{pan2026refusal,
  title={Can {LLMs} Refuse Questions They Do Not Know? Measuring Knowledge-Aware Refusal in Factual Tasks},
  author={Pan, Wenbo and Xu, Jie and Chen, Qiguang and Dong, Junhao and Qin, Libo and Li, Xinfeng and Yu, Haining and Jia, Xiaohua},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2026}
}

@inproceedings{pan2025hidden,
  title={The Hidden Dimensions of {LLM} Alignment: A Multi-Dimensional Analysis of Orthogonal Safety Directions},
  author={Pan, Wenbo and Liu, Zhichao and Chen, Qiguang and Zhou, Xiangyang and Yu, Haining and Jia, Xiaohua},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2025}
}

@inproceedings{zheng2024breaking,
  title={Breaking Language Barriers: Cross-Lingual Continual Pre-Training at Scale},
  author={Zheng, Wenzhen and Pan, Wenbo and Xu, Xu and Qin, Libo and Yue, Li and Zhou, Ming},
  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2024}
}

@article{pan2023preliminary,
  title={A Preliminary Evaluation of {ChatGPT} for Zero-shot Dialogue Understanding},
  author={Pan, Wenbo and Chen, Qiguang and Xu, Xiao and Che, Wanxiang and Qin, Libo},
  journal={arXiv preprint arXiv:2304.04256},
  year={2023}
}