open to LLM research engineer internships
Kristian Schwethelm

PhD Student · TU Munich · Chair for AI in Medicine
large language models · looped transformers · trustworthy AI

I'm a PhD student at the Chair for AI in Medicine at TU Munich, supervised by Prof. Daniel Rückert and Prof. Georgios Kaissis. I work on pretraining and fine-tuning large language models, with a current focus on looped (depth-recurrent) transformers: how recurrence trades off against parameters, how to retrofit it into pretrained models, and how to induce latent reasoning via RL.

In parallel, I work on LLM agents for clinical decision workflows and on LLM-based text extraction from clinical reports. I have previously published on differentially private learning and hyperbolic neural networks. I received the Best Graduate Award (M.Sc. Applied Computer Science, 2024) and was a Top Reviewer at NeurIPS 2025.

Latest · 2026 Preprint: Iso-Depth Scaling Laws for Looped Language Models
Selected work
Preprint '26
Schwethelm, Rückert, Kaissis

We measure the parameter-sharing cost of looped LMs with a scaling-law exponent φ and use Δφ as a diagnostic: the commonly used truncated backpropagation lowers it (a weaker loop), while hyperconnections raise it (a real gain), even though both improve validation loss.

TMLR '25
Schwethelm, Kaiser, Knolle, Lockfisch, Rückert, Ziller

Diffusion models, used as image priors, can reconstruct training data from differentially private models, exposing a gap between DP's theoretical bounds and what actually leaks in practice.

SaTML '25
Schwethelm, Kaiser, Kuntzer, Yiğitsoy, Rückert, Kaissis

The first framework combining active learning with DP-SGD in standard supervised settings. We introduce step amplification to eliminate privacy-budget waste and show which acquisition functions still work under DP and which fall apart.

ICLR '24
Schwethelm*, Bdeir*, Landwehr

HCNN: a CNN that operates entirely in hyperbolic space. We generalize the convolutional layer, batch normalization, and the classifier head to the Lorentz model, so the network leverages hyperbolic geometry at every layer rather than only in the head.