Why It Doesn't Make Sense to Upload Business Data to AI Chatbots

(Photo : chandlervid85 on Freepik)

Artificial intelligence (AI) is eating the business tech world alive. According to recent research by consulting firm EY, a solid 95% of senior business leaders are currently investing in AI. What's more, AI is not just hype. Fully three-quarters of those leaders said they are seeing positive ROI in operational efficiency, employee productivity, and customer satisfaction.

AI solutions can process information faster than humans can and with far fewer errors. They can take enormous datasets and crunch them to produce meaningful, actionable insights that drive better decision-making. With the help of AI, business leaders can detect and mitigate emerging risks, spot nascent opportunities and capitalize on them before their competitors, and meet customer expectations more effectively. 

With these benefits, it's no wonder that people want to use AI to work on their business data. They hope that Claude, ChatGPT, Gemini, and similar AI platforms can derive strategic insights and deliver useful solutions to their business dilemmas.

But this approach isn't always viable and could even be harmful. Here's why it's a bad idea to share business data with AI chatbots.

Business Databases Are Large and Dynamic

It's one thing to ask ChatGPT to summarize that report you didn't read. Uploading an entire business database is quite another. AI platforms can handle a lot of data, but they also have limitations. 

Today's enterprises collect data from dozens of sources, easily running to millions of database rows, but public-use AI platforms have a strict cap on the amount of data you can upload at a time. 

For example, the latest paid version of ChatGPT limits each end user to 10 GB and each organization to 100 GB. CSV spreadsheets are capped at approximately 50 MB, text files are capped at 2 million tokens per file, and each file of any sort has a hard limit of 512 MB.
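As a rough illustration of the caps quoted above, a short pre-flight check can flag files that would be rejected before any upload is attempted. This is only a sketch: the limits are the figures cited in this article and may change, the function name `upload_problems` is hypothetical, and a megabyte is assumed to mean 1024 x 1024 bytes.

```python
from pathlib import Path

# Limits as quoted in this article; treat them as assumptions that may change.
MB = 1024 * 1024                      # assumed definition of a megabyte
HARD_FILE_LIMIT_BYTES = 512 * MB      # hard cap for any single file
CSV_LIMIT_BYTES = 50 * MB             # approximate cap for CSV spreadsheets


def upload_problems(filename: str, size_bytes: int) -> list[str]:
    """Return the reasons a file of this size would exceed the quoted caps."""
    problems = []
    if size_bytes > HARD_FILE_LIMIT_BYTES:
        problems.append("exceeds the 512 MB per-file limit")
    if Path(filename).suffix.lower() == ".csv" and size_bytes > CSV_LIMIT_BYTES:
        problems.append("exceeds the approximate 50 MB CSV limit")
    return problems


# Hypothetical example: a 60 MB CSV export of an orders table.
print(upload_problems("orders.csv", 60 * MB))
```

A typical export from a production database would fail this check many times over, which is the point of the comparison.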

Uploading datasets of this size is slow, and you'd then have to wait again while the large language model (LLM) processes them. Because business data changes quickly, it could already be out of date by the time you ask your first question.

AI Platforms Shouldn't Be Trusted with Sensitive Information

If you share anything with ChatGPT or any other public AI platform, you should assume it is no longer private. For example, ChatGPT's terms and conditions state that OpenAI can use the data you share to train its models. Once your data is in their repositories, you're trusting their security to keep it safe, and that doesn't come with any guarantee: leaks do happen.

Recently, cybersecurity experts uncovered a defect in Microsoft's AI-powered Copilot Studio that could enable attackers to access sensitive information on internal infrastructure. This server-side request forgery (SSRF) vulnerability might even allow malicious actors to move between Copilot tenants.

"The privacy risks are extreme, in my opinion. Because you're effectively sharing your top-secret corporate information that is completely private and frankly, let's say, offline, and you're sending it to a public service that hosts the chatbot and asking it to analyze it. And that opens up the business to all kinds of issues," warns Avi Perez, CTO of Pyramid Analytics. "In that framework, data privacy and the issues associated with it are tremendous. They're a showstopper."

Pyramid is a decision intelligence platform built to integrate safely with multiple LLMs. Users ask questions about their data, and its AI engine interfaces with the models without giving them access to the underlying data. Pyramid then runs the analysis itself and returns answers and visualizations.
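The general pattern of keeping the data local while the model sees only the question can be sketched as follows. This is not Pyramid's implementation, which the article does not describe in detail. It is a minimal illustration under stated assumptions: `fake_llm` is a hypothetical stand-in for a real model call, the `sales` table and its values are invented, and only the table schema and the user's question are placed in the prompt.

```python
import sqlite3


def build_prompt(question: str, schema: str) -> str:
    """Build a prompt that contains the schema and question, but no row data."""
    return f"Schema:\n{schema}\n\nQuestion: {question}\nReply with one SQL query."


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call: returns a canned query.
    return (
        "SELECT region, SUM(amount) AS total "
        "FROM sales GROUP BY region ORDER BY region"
    )


def answer_locally(conn: sqlite3.Connection, question: str) -> list:
    """Ask the model for a query, then run that query on the local database."""
    rows = conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    schema = "\n".join(row[0] for row in rows)
    sql = fake_llm(build_prompt(question, schema))
    # A real system should validate the generated SQL (for example, allow
    # read-only statements only) before executing it.
    return conn.execute(sql).fetchall()


# Invented sample data that never leaves this process.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100.0), ("West", 250.0), ("East", 50.0)],
)
result = answer_locally(conn, "What are total sales by region?")
print(result)
```

In this arrangement the model contributes language understanding, while the numbers are computed by the database engine on data the model never sees.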

LLMs Can't Reliably Handle Complex Formulas

LLMs are fantastic at certain tasks, but only certain tasks. They're very good at discerning what people intend to find out with their natural language queries and repurposing information to generate a comprehensible answer. But don't let that lead you to put too much faith in their responses. 

To give one surprising example, AI is mediocre, at best, at mathematics. That's because an LLM is not a calculator; it's based on language. Even the best LLMs can't get much beyond 75% in accuracy for math calculations, and sometimes they fail at even basic arithmetic.

"At their core, LLMs are text-prediction engines with a degree of randomness. They rely on probability. Even if they seem to be reasoning logically, they're actually using pattern recognition instead of showing true mathematical understanding," observes Briana Brownell, a data scientist. "Meanwhile, mathematics has specific structures: formulas, equations, and relationships between concepts. LLMs don't (yet) capture these features in the way a purpose-built math system would."
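For contrast, arithmetic done in ordinary code is deterministic: the same inputs produce the same result on every run. The sketch below computes a gross margin percentage with Python's standard `decimal` module. The function name `gross_margin_pct` and the revenue and cost figures are hypothetical and chosen only for illustration.

```python
from decimal import Decimal


def gross_margin_pct(revenue: Decimal, cogs: Decimal) -> Decimal:
    """Return gross margin as a percentage, rounded to two decimal places."""
    if revenue == 0:
        raise ValueError("revenue must be non-zero")
    return ((revenue - cogs) / revenue * 100).quantize(Decimal("0.01"))


# Hypothetical figures: revenue of 1,250,000 and cost of goods sold of 812,500.
margin = gross_margin_pct(Decimal("1250000.00"), Decimal("812500.00"))
print(margin)
```

A formula like this is exactly the kind of work better left to a calculation engine than to a text-prediction model.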

LLMs hit a similar wall when it comes to processing business metrics. They simply don't have the expertise or familiarity with the particulars of your company data to produce reliable answers to business-specific questions. Unfortunately, when an LLM doesn't know the answer, it rarely says so. Instead, it hallucinates an answer. Unless you're checking its responses, you could end up basing serious business decisions on erroneous results. 

OpenAI has been working on changing this dynamic, and its new "o1" model has been designed to unlock new "reasoning" capabilities.

To Use AI for Data Analytics, Proceed with Caution

AI chatbots open up a wide range of capabilities for business users and deliver significant value in many business use cases. But they also bring significant risks and limitations, and they can cause serious harm if they're used without caution. It's vital to find ways to take advantage of the benefits of AI without exposing sensitive data.

© 2024 iTech Post All rights reserved. Do not reproduce without permission.
* This is a contributed article and this content does not necessarily represent the views of itechpost.com
