10 · Other · Dec 22

Continuously hardening ChatGPT Atlas against prompt injection

OpenAI is hardening ChatGPT Atlas against prompt injection attacks using an automated red teamer trained with reinforcement learning. This proactive discover-and-patch loop surfaces novel exploits early, so the browser agent’s defenses can be strengthened as AI systems become more agentic.

Covered by 1 source

Related stories

Other · AI literacy resources for teens and parents · Dec 18
Other · Tokenization in Transformers v5: Simpler, Clearer, and More Modular · Dec 18
Other · Project Vend: Phase two · Dec 18
Other · Falcon LLM - Cybernews · Dec 18