Hi! I am Wenbo Pan

I’m a Ph.D. student at City University of Hong Kong, expecting to graduate in 2028. With a background in Statistics, Computer Science, and Philosophy, I focus on understanding the capability boundaries and safety concerns of advanced AI systems from a mechanistic perspective. My research blends analytical theory with empirical practice, which helps me develop well-founded insights. My long-term mission is to reduce p(doom), even if only by a tiny bit.

Curious about my academic and professional journey? Explore my full bio to learn more.

What I’m Passionate About

  • Research. I enjoy exploring the unknown in science. Currently, I focus on leveraging mathematical and analytical tools to gain deeper insights into Large Language Models (LLMs). I’m also interested in data-driven approaches to fine-tune LLMs for better alignment.
  • Reading & Thinking. I spend a significant portion of my leisure time immersed in books, obsessing over both academic non-fiction and sci-fi.
  • Productivity Tools. I have open-sourced several projects on GitHub, mainly focusing on productivity tools that aid my research and workflow.
  • Writing for Learning. Publishing articles on this site and Rednote is a great way to express my thoughts and refine my ideas through feedback.

Works I’m Proud Of

Faro Chat Models (2024)

A series of fine-tuned LLMs designed for long-context modeling. I curated a large, diverse set of instruction-tuning samples and optimized existing base models using LoRA training and model merging. As a result, these models exhibit significantly improved stability and accuracy in long-text understanding.
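For a rough sketch of the LoRA-then-merge recipe, here is a minimal example using the Hugging Face peft library; the model name and hyperparameters are placeholders for illustration, not Faro's actual settings:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder base model and hyperparameters, for illustration only.
base = AutoModelForCausalLM.from_pretrained("some-base-model")
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(base, lora)

# ... fine-tune `model` on the curated instruction-tuning samples ...

# Fold the learned low-rank updates back into the base weights, producing a
# standalone checkpoint that can then be merged with other fine-tunes.
merged = model.merge_and_unload()
merged.save_pretrained("faro-merged")
```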

Safety Residual Space (2025)

An analytical tool for understanding what LLMs learn from safety fine-tuning. It measures activation shifts during training and extracts the most meaningful feature directions—those predictive of LLM behaviors like refusal. Check out the paper for details.
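For intuition, here is a toy PyTorch sketch of the core idea: collect hidden states for the same prompts before and after safety fine-tuning, then take the top singular directions of the shift. The names are placeholders, and the paper's actual pipeline is more involved:

```python
import torch

def shift_directions(acts_base: torch.Tensor,
                     acts_tuned: torch.Tensor,
                     k: int = 5) -> torch.Tensor:
    """acts_*: (n_prompts, d_model) hidden states for the same prompts,
    from the base model and the safety-fine-tuned model respectively."""
    delta = acts_tuned - acts_base      # per-prompt activation shift
    delta = delta - delta.mean(dim=0)   # center before the SVD
    # The top-k right singular vectors span the most prominent shift directions.
    _, _, Vh = torch.linalg.svd(delta, full_matrices=False)
    return Vh[:k]                       # (k, d_model)
```

Directions predictive of behaviors like refusal can then be singled out by probing, e.g. correlating each prompt's projection onto a direction with refusal labels.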

CPT Scaling Law (2024)

Continual Pre-Training (CPT) refers to pre-training a model further from an existing checkpoint. Because CPT data distributions often differ from the original pre-training data (e.g., a different language), it warrants a dedicated study. We propose a scaling law that estimates the optimal compute (FLOPs) required to reach a given loss under CPT.
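To give a sense of what such a law looks like, below is a generic Chinchilla-style power law in compute; this illustrates the general shape only, not the paper's fitted equation or constants:

```latex
% Generic form for illustration; the paper's functional form differs.
L(C) = E + \frac{A}{C^{\alpha}}
\qquad\Longrightarrow\qquad
C^{*}(L_{\text{target}}) = \left(\frac{A}{L_{\text{target}} - E}\right)^{1/\alpha}
```

Here $C$ is training compute in FLOPs, $E$ the irreducible loss, and $A$, $\alpha$ fitted constants; inverting the law gives the compute needed to hit a target loss.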

moffee (2024)

An open-source slide maker that transforms Markdown documents into clean, professional slide decks. moffee handles the styling and layout, allowing you to focus entirely on the content.