New research suggests that writing malicious or illicit prompts as poetry can cause many leading large language models (LLMs) to abandon their guardrails altogether.
The researchers tested 25 LLMs, both proprietary and open-weight (models whose trained parameters, or “weights”, are publicly available), from major providers including Google, OpenAI, Anthropic, Mistral AI, Meta, and others. Their threat model was minimal: a single-turn text prompt, with no back-and-forth conversation and no code execution.
In one branch of the experiment, the authors manually crafted 20 “adversarial poems”, each embedding a harmful request (e.g., instructions for cyber offense, chemical or biological weapon creation, social engineering, or privacy invasion) expressed through metaphor, imagery, and poetic rhythm rather than direct prose.