Meta's AI safety system defeated by the space bar

Monday, July 29, 2024, 11:01 PM, from TheRegister

'Ignore previous instructions' thwarts Prompt-Guard model if you just add some good ol' ASCII code 32
Meta's machine-learning model for detecting prompt injection attacks – special prompts to make neural networks behave inappropriately – is itself vulnerable to, you guessed it, prompt injection attacks.…
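
The teaser does not spell out the exact transformation, only that adding spaces (ASCII code 32) is enough. As a minimal sketch, assuming the bypass amounts to inserting a space between the characters of the prompt, the idea might look like the following. The helper name `space_out` is hypothetical and not taken from the article or from Meta's code.

```python
def space_out(text: str) -> str:
    """Insert one space (ASCII code 32) between every character of text.

    Assumption: this per-character spacing is only an illustration of
    "adding spaces"; the teaser does not give the exact method used.
    """
    return chr(32).join(text)


if __name__ == "__main__":
    # The phrase quoted in the article's subhead.
    prompt = "Ignore previous instructions"
    print(space_out(prompt))
```

A human reader still recognizes the spaced-out phrase, which is presumably why such a small change matters: the text's meaning is preserved while its character sequence differs from the original phrase.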

Read more at TheRegister

https://go.theregister.com/feed/www.theregister.com/2024/07/29/meta_ai_safety/
