Different ways to try to get prompts injected into AI tools
When an AI tool is asked to perform a task on a piece of data (e.g. to summarize it), it should not treat any of the content as instructions. But similar to old-style SQL injections, there are ways to insert instructions into the data.
Here are some of my experiments.
Because LLMs are non-deterministic, these examples may not produce identical results each time, but they still illustrate the possibilities. Additionally, AI models continue to evolve, meaning these injection techniques may become ineffective over time.
- Make the summarize page function in FF do funny things (Gemini)
A page that contains some instructions that Gemini will not treat as content but execute (extra emojis, reasoning about the use of AI, etc.).
- Make the summarize page function in FF do funny things (ChatGPT)
A page that contains some instructions that ChatGPT will not treat as content but execute (extra emojis, reasoning about the use of AI, etc.).
- Make the summarize page function in FF access external sites (ChatGPT)
A page that contains instructions to make ChatGPT do a web-request to another site and interpret that fetched content as further instructions.
- Make the summarize page function in FF trick ChatGPT to remember stuff about you
A page that contains instructions to make ChatGPT to remember things about you and use that in later conversations. For this to work, the Memory function in ChatGPT needs to be enabled.