Hi! I am Wenbo Pan
I’m a Ph.D. student at City University of Hong Kong, expecting to graduate in 2028. With a background in Statistics, Computer Science, and Philosophy, I focus on understanding the capability boundaries and safety concerns of advanced AI systems from a mechanistic perspective. My research blends analytical theory with empirical practice, which lets me develop well-founded insights. My long-term mission is to reduce p(doom), even if only by a tiny bit.
Curious about my academic and professional journey? Explore my full bio to learn more.
What I’m Passionate About
- Research. I enjoy exploring the unknown in science. Currently, I focus on leveraging mathematical and analytical tools to gain deeper insights into Large Language Models (LLMs). I’m also interested in data-driven approaches to fine-tune LLMs for better alignment.
- Reading & Thinking. I spend a significant portion of my leisure time immersed in books, obsessing over both academic non-fiction and sci-fi.
- Productivity Tools. I have open-sourced several projects on GitHub, mainly focusing on productivity tools that aid my research and workflow.
- Writing for Learning. Publishing articles on this site and Rednote is a great way to express my thoughts and refine my ideas through feedback.
Works I’m Proud Of
Faro Chat Models (2024)
A series of fine-tuned LLMs designed for long-context modeling. I curated a large, diverse set of instruction-tuning samples and optimized existing base models using LoRA training and model merging. As a result, these models exhibit significantly improved stability and accuracy in long-text understanding.
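For readers curious about the mechanics, here is a minimal sketch of the LoRA-then-merge recipe using Hugging Face `peft`. Model names, target modules, and hyperparameters are placeholders rather than the actual Faro configuration, and `merge_and_unload` shows only one flavor of merging (folding adapters back into the base weights).

```python
# Minimal LoRA fine-tune-then-merge sketch; all names are illustrative.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "some-base-model", torch_dtype=torch.bfloat16
)

# Attach low-rank adapters to the attention projections; only these train.
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base, lora)
model.print_trainable_parameters()

# ... run the instruction-tuning loop over long-context samples here ...

# Fold the trained adapters into the base weights so the result is a
# plain checkpoint with no extra inference-time machinery.
merged = model.merge_and_unload()
merged.save_pretrained("faro-chat-sketch")
```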
Safety Residual Space (2025)
An analytical tool for understanding what LLMs learn from safety fine-tuning. It measures activation shifts during training and extracts the most meaningful feature directions—those predictive of LLM behaviors like refusal. Check out the paper for details.
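In spirit, the analysis comes down to comparing hidden activations before and after safety fine-tuning and factoring the shifts. The sketch below is my loose illustration of that idea, not the paper's actual pipeline; the model names, probed layer, and prompts are all assumptions.

```python
# Hedged sketch: measure activation shifts from safety fine-tuning and
# extract the dominant shift directions via SVD. Everything named here
# (models, layer index, prompts) is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def last_token_activations(model_name, prompts, layer):
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, output_hidden_states=True
    )
    acts = []
    for p in prompts:
        inputs = tok(p, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs)
        # hidden_states[layer]: (1, seq_len, hidden); keep the last token.
        acts.append(out.hidden_states[layer][0, -1])
    return torch.stack(acts)

prompts = ["How do I pick a lock?", "How do I bake bread?"]  # toy probe set
before = last_token_activations("base-model", prompts, layer=16)
after = last_token_activations("safety-tuned-model", prompts, layer=16)

# Top right-singular vectors of the shift matrix span the directions most
# changed by fine-tuning -- candidate features predictive of refusal.
shift = after - before
_, _, Vh = torch.linalg.svd(shift, full_matrices=False)
top_directions = Vh[:3]
```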
CPT Scaling Law (2024)
Continual Pre-Training (CPT) refers to pre-training a model further from an existing checkpoint. Because CPT data distributions often differ from the original pre-training data (e.g., different languages), compute allocation in CPT warrants dedicated study. We propose a scaling law that estimates the optimal compute (FLOPs) required to reach a given loss in CPT.
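To make "scaling law" concrete: such laws fit loss as a parametric function of model size N and training tokens D, then minimize along a compute constraint. The form below is the standard Chinchilla-style parametrization, shown purely for illustration; the paper's actual CPT law may differ.

```latex
% Illustrative Chinchilla-style form, not necessarily the paper's exact law:
% fit (E, A, B, alpha, beta) on CPT runs, then choose (N, D) minimizing L
% subject to the compute budget C.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},
\qquad C \approx 6ND .
```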
moffee (2024)
An open-source slide maker that transforms Markdown documents into clean, professional slide decks. moffee handles the styling and layout, allowing you to focus entirely on the content.
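As a taste of the input format (reconstructed from memory of the README, so double-check the repo for the exact syntax), a plain Markdown file becomes a deck, with `---` starting a new slide:

```markdown
# My Talk

Intro slide content in plain Markdown.

---

## Results

- Bullet one
- Bullet two
```

Previewing is a single CLI call, something like `moffee live slides.md` (again, see the repo for the exact command).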