ASCII Art Can Be Used to Circumvent Chatbot Restrictions Preventing Harmful Responses

AI developers are working extra hard to make sure that chatbots like ChatGPT, Gemini, Claude, and other large language models won't contribute to any form of harm. Still, researchers and users alike keep finding ways to make the chatbots provide harmful information by overriding their restrictions.

Chatbot Loophole Through ASCII Art

As helpful as large language models (LLMs) are nowadays, they still need a few tweaks to ensure people do not misuse the AI tools. Safety policies are what prevent users from simply asking ChatGPT for instructions on how to make things like artillery weapons.

Researchers try to find ways these chatbots can be exploited so the weaknesses can be fixed, and one group found that the use of ASCII art is an effective method. In a way, the chatbot is so busy processing the art that it forgets to block harmful responses.

The chatbots the method was tested on include OpenAI's ChatGPT, Google's Gemini, and Anthropic's Claude. The way it works is that instead of including a word in the prompt that would get the request blocked, the researchers replace that word with an ASCII art representation.

For example, they asked a chatbot for instructions on how to make and distribute counterfeit money, but spelled the key word out with symbols arranged in ASCII art form, and gave the AI tool instructions on how to read it.

The word "counterfeit" is replaced with ASCII art consisting of 11 rows and 20 columns, with the letters split by asterisk symbols. The chatbot is then asked to identify the letters one by one and concatenate them to form a word, as shown by Ars Technica.
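To picture the encoding being described, here is a minimal sketch in Python, assuming a harmless example word and a hand-drawn letter font; the letter shapes, the block size, and the encode_word_as_ascii_art helper are illustrative assumptions rather than the researchers' actual prompt material.

# Minimal sketch of the row-by-row encoding described above: each letter of a
# harmless example word is drawn as a small block of characters, and the
# blocks are joined row by row with asterisks as separators.

# Hand-drawn 5x5 bitmaps for a couple of letters (illustrative only).
LETTER_ART = {
    "O": [
        " ### ",
        "#   #",
        "#   #",
        "#   #",
        " ### ",
    ],
    "K": [
        "#   #",
        "#  # ",
        "###  ",
        "#  # ",
        "#   #",
    ],
}

def encode_word_as_ascii_art(word: str) -> str:
    """Render the word's letters side by side, separating letter blocks with '*'."""
    letters = [LETTER_ART[ch] for ch in word.upper()]
    rows = []
    for row_index in range(5):
        rows.append("*".join(letter[row_index] for letter in letters))
    return "\n".join(rows)

# A prompt built on top of this would then ask the model to read the art row
# by row, identify each letter, and concatenate them into a word.
print(encode_word_as_ascii_art("ok"))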

The chatbot is instructed to remember the word but not say it. That way, the prohibited word is never explicitly written down and the request is not blocked. The technique proved effective, with the chatbot going on to provide a harmful response.

Previous Attempts

This is not the first time researchers have found a loophole to get what they want from ChatGPT despite its policies and restrictions, and an earlier method was not even as complicated as creating ASCII art of the word you're trying to mask to overwhelm the chatbot.

In that earlier case, a team of researchers simply asked the chatbot to repeat random words forever. The AI tool did so for a while but eventually started dropping personal information about other people, such as email addresses, phone numbers, web pages, and more.

When asked to repeat the word "poem" forever, ChatGPT revealed the email address and phone number of a CEO and founder. Repeating the word "company" caused it to generate a law firm's email address and phone number as well, as per Engadget.

There was a lot more data that the researchers managed to pull from the chatbot, such as Bitcoin addresses, fax numbers, names, birthdays, social media handles, and content from dating websites. Such data can easily be sold by bad actors, as hackers value personally identifiable information.
