Tag
#llm-security
4 posts tagged llm-security.
- tooling
AI Moderation Tools for LLMs: What Works and What Gets Bypassed
A practitioner's comparison of AI moderation tools — AWS Bedrock Guardrails, Azure AI Content Safety, Lakera Guard, NeMo Guardrails, and Llama Guard —
- tooling
AI Safety Tools: A Guide to Guardrails, Filters, and Defenses
A practitioner's breakdown of the leading AI safety tools — NeMo Guardrails, LLM Guard, Llama Guard, and managed platforms — with benchmark data, known
- alignment
Model Alignment: What It Is, How It Works, and Where It Fails
Model alignment trains AI systems to follow human intent rather than optimize for proxy metrics. Here's what the main techniques actually do, how they're
- defense
Output Classification: A PII and Secrets Detector for LLM Apps
Most output filters catch the obvious cases and miss the long tail. Here's how to build an output classifier that's actually deployable in production.