You Can’t Ask ChatGPT to Repeat Words ‘Forever’ Anymore

ChatGPT's reputation is currently both bad and good, not just because of how people use it but because of the practices of the company itself. A group of researchers recently found a way to extract private information from the chatbot, which somehow proves that the AI model is trained using user data as well.

OpenAI Fixed the Flaw

A group of researchers from universities and Google DeepMind found a flaw in ChatGPT that allowed anyone to fish out private data from the chatbot. All one has to do is ask it to repeat a word "forever," and it will eventually start generating real user information.

The company has apparently heard of the incident and fixed the situation as soon as they can, as you can no longer recreate the experiment done by the researchers. Even if you do, ChatGPT will simply show you an error.

It states that the content "may violate our content policy or terms of use." When you remain persistent and try the prompt again, it will respond by stating: "If you believe this to be in error, please submit your feedback," as reported by Engadget.

This method may fall under the rule that users cannot "use any automated or programmatic method to extract data or output from the Services," although the repetition of a specific word is not exactly automated or programmatic.

Regardless, the guardrail that OpenAI put in place is the right call, especially if it reveals private data that threat actors can use to conduct fraudulent activities. Some of the information revealed included names, phone numbers, email addresses, dates of birth, and more.

The researchers expressed that it's wild to them that their attack works and "should've, would've, could've been found earlier." They managed to acquire over 10,000 memorized training examples (these are the data ChatGPT was trained on) for just $200.

The amount they spent can be seen as a small price to pay for threat actors. With real and unique private data from various people, these can be sold to other bad actors for fraudulent activities for a much higher price.

It Shows OpenAI's Training Data Practices

The researchers found personal information through the prompt, which shed light on the kind of training data that the company uses to train its AI models. It's possible that numerous data can be found publicly and was scraped by OpenAI to train ChatGPT.

It's problematic for the company, considering that it has denied allegations of illegally using private data to train its AI models. There have already been several lawsuits aimed at the AI company, accusing it of copyright infringement, among other complaints.

One of the lawsuits that drew much attention was from a group of authors, which includes the famous "A Game of Thrones" author George R. R. Martin. As reported by ABC News. The lawsuit states that it is "systematic theft on a massive scale."

Cornell University Professor of Digital and Information Law, James Grimmelmann, said that the lawsuit by the authors has the best chance of winning the copyright infringement case against the AI giant.