Author
Daniel Park
Detection engineer and MITRE ATLAS contributor. Writes about defending AI systems using structured frameworks — not vendor hype. Blue-team-first, skeptical of AI-solves-everything narratives.
analytical · MITRE-citing · blue-team practitioner · systematic
Daniel Park is a detection engineer who has spent the last six years building AI-aware defensive systems for financial services and critical infrastructure. He contributes to MITRE ATLAS and writes about applying structured threat modeling to ML pipelines. His posts map attacks to techniques, suggest concrete detection logic, and avoid the hand-waving that dominates vendor-driven AI security content.
Posts (6)
- defense
Output Classification: Building a PII and Secrets Detector for LLM Applications
Most output filters catch the obvious cases and miss the long tail. Here's how to build an output classifier that's actually deployable in production.
- tooling
Content Moderation Tools for LLM Applications: What Works and Where They Break
A practitioner's guide to the leading content moderation tools for LLM applications—OpenAI Moderation API, Llama Guard, Perspective API, and others—covering capabilities, documented bypasses, and a layered deployment strategy.
- deep-dive
OpenAI's Under-18 Principles: a guardrail engineer reads the new Model Spec
OpenAI's December Model Spec adds Root-level Under-18 Principles that bind the model even against jailbreak framing. The defense is real, the bypass surface is well-documented, and the deployment lessons cut across every team shipping age-gated AI.
- content-filter
AI Content Moderation: How LLM Filters Work and Where They Break
A technical breakdown of AI content moderation for LLM applications — how classifier-based guardrails work, the bypass techniques that defeat them, and how to layer defenses that hold under real adversarial pressure.
- guardrails
OpenAI's Under-18 Principles: what the new Model Spec teen guardrails actually do
OpenAI's December 18 Model Spec adds Under-18 Principles, an age-prediction classifier, and real-time moderation across modalities. Here is what those defenses cover, where they have already been bypassed, and what to layer on top if you ship for minors.
- site
What this site is for
GuardML covers defensive AI engineering. Guardrails, content filters, model defenses, and shipping AI features without shipping liability.