From shortcuts to sabotage: natural emergent misalignment from reward hacking - Anthropic

Policy: Anthropic partners with Rwandan Government and ALX to bring AI education to hundreds of thousands of learners across Africa (Anthropic, Nov 18)

Policy: Getty Images v Stability AI: A landmark judgment reinforcing the need for the UK government to amend its copyright laws (Wolters Kluwer, Nov 20)

Policy: Strengthening our safety ecosystem with external testing (Nov 19)

Policy: Mitigating the risk of prompt injections in browser use (Anthropic, Nov 24; 2 sources)
