Anthropic's Claude vulnerable to 'emotional manipulation'

Saturday, October 12, 2024, 12:30 PM, from The Register

AI model safety only goes so far
Anthropic's Claude 3.5 Sonnet, despite its reputation as one of the better-behaved generative AI models, can still be convinced to emit racist hate speech and malware.…

Read more at TheRegister

https://go.theregister.com/feed/www.theregister.com/2024/10/12/anthropics_claude_vulnerable_to_emoti...
