With more and more AI models and tools being developed, questions about the data that's used to train them have been emerging. Even major AI companies have been accused of illegally scraping data. Zoom has made a point of clarifying that its platform will not be doing so.
Zoom Calls Will Not Be Used to Train AI
The video conferencing platform became widely used during the pandemic, when people were restricted from meeting face-to-face. It has remained in use since then, with companies and organizations relying on it for virtual meetings in hybrid setups.
As a result, Zoom holds a large amount of user data and calls that could be used to train AI. That became a concern when the platform updated its Terms of Service with language implying that video calls could be used to train AI models.
The update stated that service-generated data and customer content could be used "for the purpose of product and service development" like "machine learning or artificial intelligence including for the purposes of training or tuning of algorithms and models."
Users became wary of this, given that Zoom might start using private and sensitive data without their consent. In response, Zoom Chief Product Officer Smita Hashim said that Zoom audio, video, or chat will not be used to train its models without customer consent.
As reported by Ars Technica, "service generated data" referred to telemetry and diagnostic data, not the actual content of users' calls or chats. Since Hashim's post might not have been enough, Zoom clarified the matter in its latest Terms of Service update.
It states that "Zoom does not use any of your audio, video, chat, screen sharing, attachments or other communications-like Customer Content (such as poll results, whiteboard, and reactions) to train Zoom or third-party artificial intelligence models."
Other Platforms Have Taken Measures
Because Zoom calls are private, other AI companies cannot access its data. Other platforms, however, are not so lucky. Social networking sites like Reddit and X (formerly Twitter) have had to resort to extreme measures to prevent user data from being scraped.
X, for instance, faced a lot of criticism over its countermeasure, wherein certain users were given a limit on how many posts they can view in a day. Unverified users were allowed fewer views per day than verified users.
The same concern prompted Reddit's API price change, which eventually spun out of control. The API pricing was aimed at companies and other users who were using Reddit data to train AI models, but it also affected many third-party Reddit apps and was met with protests from moderators.
Illegal scraping of data has been an ongoing issue. The main concern is that AI models might learn from data that is meant to be private and then reveal it when responding to certain text prompts.
Even if that's not the case, users are still entitled to keep their private data private if they choose to.