#AI Safety

3 articles tagged with "AI Safety"

news 5 min read

Anthropic Expands Claude Mythos Access to 150 Organizations Despite Transparency Concerns in 2026

Anthropic is scaling Project Glasswing to 150 new organizations, providing access to Claude Mythos Preview for vulnerability detection. The expansion comes amid concerns about validation transparency and the race to secure critical infrastructure before offensive AI capabilities proliferate.

Alex Chen Jun 2, 2026

research 7 min read

Every Major AI Model Fails Multi-Turn Attacks: What Cisco's 2026 Research Means for Enterprise Safety

Single-turn safety benchmarks don't predict real-world vulnerability. Cisco's testing of 15 frontier models reveals that iterative attacks succeed up to 88% of the time—even against models that look secure in standard evaluations.

Dr. Sana Okafor Jun 1, 2026

analysis 9 min read

OpenAI's Codex Safety Framework Is a Blueprint for Model Deployment in 2026

OpenAI's approach to running Codex safely isn't just about one code-generation model; it's a template for how AI labs must deploy increasingly capable systems. OpenAI's three-layer framework combines technical safeguards, operational controls, and external oversight in ways that scale beyond code generation.

Maya Patel May 8, 2026