When you ask an LLM to rate its confidence, does the format matter? We tested four SOTA models with decimal (0.00–1.00) and integer (0–100) confidence scores across true, dubious, and nonsense labels. Decimal format produced more conservative estimates on ambiguous inputs and dramatically better cross-model agreement. Integer format caused surprising failures — GPT-5.2 alternated between 0 and 100 on obvious nonsense. The culprit? Tokenization. The 0. prefix appears to anchor models into calibrated probability-reasoning mode that integers simply don't activate.
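The experiment reduces to changing only the numeric scale in the elicitation prompt and normalizing replies to a common range for comparison. A minimal sketch of that setup (prompt wording and function names are illustrative, not the exact prompts used in the study):

```python
import re

def decimal_prompt(claim: str) -> str:
    """Elicit confidence on an explicit 0.00-1.00 decimal scale."""
    return (f'Claim: "{claim}"\n'
            "Rate your confidence that this claim is true as a decimal "
            "between 0.00 and 1.00. Reply with the number only.")

def integer_prompt(claim: str) -> str:
    """Elicit confidence on a 0-100 integer scale."""
    return (f'Claim: "{claim}"\n'
            "Rate your confidence that this claim is true as an integer "
            "between 0 and 100. Reply with the number only.")

def parse_confidence(reply: str, scale: float) -> float:
    """Extract the first number in the reply and normalize it to [0, 1]
    so decimal- and integer-format answers are directly comparable."""
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        raise ValueError(f"no number found in reply: {reply!r}")
    return float(match.group()) / scale

print(parse_confidence("0.85", scale=1.0))   # decimal reply -> 0.85
print(parse_confidence("85", scale=100.0))   # integer reply -> 0.85
```

Passing the scale explicitly matters: a bare reply of "1" is a near-certain answer on the decimal scale but a near-zero one on the integer scale, so the two formats cannot be disambiguated from the number alone.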
Writing
Essays on AI security, adversarial machine learning, prompt injection, and the systems we build to keep AI honest.
If you're building chatbots or content moderation systems, you've probably wrestled with a frustrating tradeoff: regex is blazingly fast but brittle, while LLMs are flexible but painfully slow. What if there were a middle ground that gave you the best of both worlds? Enter VectorSmell, a hybrid approach that sits between probabilistic and deterministic text classification, delivering sub-50ms response times with a flexibility regex can only dream of.
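VectorSmell's internals aren't detailed in this teaser, but the layered pattern it describes can be sketched. Below is a hypothetical illustration (the exemplar data, blocklist, and threshold are invented, and a pure-Python bag-of-words cosine stands in for real learned embeddings): exact rules run first, vector similarity catches fuzzy matches, and only the ambiguous leftovers escalate to the slow LLM.

```python
import math
import re
from collections import Counter

def bow(text: str) -> Counter:
    """Bag-of-words vector; a cheap stand-in for an embedding model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Toy labeled exemplars the vector tier matches against.
EXEMPLARS = {
    "spam": bow("buy cheap pills now limited offer click here"),
    "ok": bow("thanks for the update see you at the meeting"),
}

# Deterministic tier: exact patterns that never need a model.
BLOCKLIST = re.compile(r"\bfree\s+crypto\b", re.IGNORECASE)

def classify(text: str, threshold: float = 0.3) -> str:
    """Regex first, vector similarity second; defer to the LLM
    only when neither fast tier is confident."""
    if BLOCKLIST.search(text):
        return "spam"                     # exact match: microseconds
    vec = bow(text)
    label, score = max(
        ((lbl, cosine(vec, ex)) for lbl, ex in EXEMPLARS.items()),
        key=lambda p: p[1],
    )
    if score >= threshold:
        return label                      # fuzzy match: still fast
    return "needs_llm"                    # ambiguous: escalate to the LLM
```

The design point is that the expensive model only ever sees the residue the fast tiers couldn't decide, which is how a hybrid like this keeps typical latency low without regex's brittleness.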