Tag
#defense-in-depth
2 posts tagged defense-in-depth.
- tooling
Content Moderation Tools for LLM Applications: What Works and Where They Break
A practitioner's guide to the leading content moderation tools for LLM applications—OpenAI Moderation API, Llama Guard, Perspective API, and others—covering capabilities, documented bypasses, and a layered deployment strategy.
- content-filter
AI Content Moderation: How LLM Filters Work and Where They Break
A technical breakdown of AI content moderation for LLM applications — how classifier-based guardrails work, the bypass techniques that defeat them, and how to layer defenses that hold under real adversarial pressure.