Thinking about AI Security
Exploring the intersection of artificial intelligence and security—adversarial machine learning, prompt injection, model safety, and the systems we build to keep AI honest.
Latest Writing
Terminal programs like hledger store everything in plain text — making them lightweight, fully yours, and perfect for AI agents. Here's why the command line is making a comeback.
When you ask an LLM to rate its confidence, does the format matter? We tested four SOTA models with decimal (0.00–1.00) and integer (0–100) confidence scores across true, dubious, and nonsense labels. Decimal format produced more conservative estimates on ambiguous inputs and dramatically better cross-model agreement. Integer format caused surprising failures — GPT-5.2 alternated between 0 and 100 on obvious nonsense. The culprit? Tokenization. The 0. prefix appears to anchor models into calibrated probability-reasoning mode that integers simply don't activate.
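The two elicitation formats described above can be sketched in a few lines. This is a minimal illustration, not the experiment's actual harness: the prompt wording and the `build_prompt` / `parse_confidence` helpers are hypothetical, showing only how the decimal and integer scales differ and how both replies normalize to the same probability range.

```python
# Illustrative sketch of decimal vs. integer confidence elicitation.
# Prompt wording and helper names are assumptions, not the study's code.

def build_prompt(label: str, fmt: str) -> str:
    """Build a confidence-elicitation prompt in one of two formats."""
    if fmt == "decimal":
        scale = "a decimal between 0.00 and 1.00"
    elif fmt == "integer":
        scale = "an integer between 0 and 100"
    else:
        raise ValueError(f"unknown format: {fmt}")
    return (
        f"How confident are you that the label '{label}' is correct? "
        f"Answer with {scale} only."
    )

def parse_confidence(raw: str, fmt: str) -> float:
    """Normalize a model's reply to a probability in [0, 1]."""
    value = float(raw.strip())
    if fmt == "integer":
        value /= 100.0  # rescale 0-100 replies onto the same axis
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"confidence out of range: {raw!r}")
    return value

# Both formats land on the same normalized scale for comparison:
print(parse_confidence("0.85", "decimal"))  # -> 0.85
print(parse_confidence("85", "integer"))    # -> 0.85
```

Normalizing both reply styles onto [0, 1] is what makes the cross-format comparison (and the cross-model agreement measurement) possible in the first place.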