Felix Koole

Thinking about AI Security

Exploring the intersection of artificial intelligence and security—adversarial machine learning, prompt injection, model safety, and the systems we build to keep AI honest.

Read the Blog

Latest Writing

5-min Paper: Should you use floats or ints as confidence scores?

When you ask an LLM to rate its confidence, does the format matter? We tested four SOTA models with decimal (0.00–1.00) and integer (0–100) confidence scores across true, dubious, and nonsense labels. Decimal format produced more conservative estimates on ambiguous inputs and dramatically better cross-model agreement. Integer format caused surprising failures — GPT-5.2 alternated between 0 and 100 on obvious nonsense. The culprit? Tokenization. The 0. prefix appears to anchor models into calibrated probability-reasoning mode that integers simply don't activate.

5 minute paper
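The comparison in the post can be sketched as two prompt variants plus a normalizer that maps both reply formats onto the same scale. Everything below (prompt wording, function names) is an illustrative assumption, not the experiment's actual code:

```python
# Hypothetical sketch of the two confidence-elicitation formats.
# Prompt wording and function names are illustrative, not the
# post's actual code.

DECIMAL_PROMPT = (
    "Rate your confidence that this label is correct as a decimal "
    "between 0.00 and 1.00. Reply with the number only."
)
INTEGER_PROMPT = (
    "Rate your confidence that this label is correct as an integer "
    "between 0 and 100. Reply with the number only."
)

def normalize_confidence(raw: str, fmt: str) -> float:
    """Parse a model reply and map it onto [0.0, 1.0] so the two
    formats can be compared directly."""
    value = float(raw.strip())
    if fmt == "decimal":
        scaled = value
    elif fmt == "integer":
        scaled = value / 100.0
    else:
        raise ValueError(f"unknown format: {fmt}")
    # Clamp out-of-range replies rather than discarding them.
    return min(1.0, max(0.0, scaled))
```

Normalizing both formats to the same interval is what lets cross-model agreement be measured on a common scale, regardless of which prompt a given model saw.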