Axel Højmark
Research Scientist, Apollo Research · London
About
I work on empirical and conceptual AI safety research.
I'm a research scientist at Apollo Research, studying ways to detect and mitigate deception in frontier models.
Previously:
- Was a MATS scholar (Summer 2024) with Marius Hobbhahn and Jérémy Scheurer, researching scaling laws for LLM agents.
- Contributed several long-horizon agent tasks to METR's benchmark of human-calibrated software tasks, through an open bounty.
- Completed a BSc in Machine Learning and Data Science at the University of Copenhagen. Started an MSc but left to work on AI safety full-time.
Outside of work, I'm very interested in fitness, longevity science, and economics.
Highlighted research
- Stress Testing Deliberative Alignment for Anti-Scheming Training
We stress-test an anti-scheming training intervention across 26 out-of-distribution evaluations. Covert behavior drops sharply but doesn't disappear, and we find causal evidence that rising evaluation awareness partly drives the reduction.
- Forecasting Frontier Language Model Agent Capabilities
We forecast frontier agent performance on SWE-Bench Verified, Cybench, and RE-Bench through 2026 by predicting benchmark scores from model release date via an intermediate capability metric.
- Analyzing Probabilistic Methods for Evaluating Agent Capabilities
We show that two methods used to evaluate the safety of Gemini 1.0 models are biased Monte Carlo estimators of task success rates. Both cut variance but systematically underestimate true capability.