The BBC conducted a study that found four major AI chatbots—OpenAI’s ChatGPT, Microsoft’s Copilot, Google’s Gemini, and Perplexity AI—struggle to accurately summarize news stories. The study revealed that more than half of the AI-generated summaries contained significant issues, including outright falsehoods and errors in factual statements, numbers, and dates.
For example, Gemini incorrectly stated that the NHS did not recommend vaping as an aid to quit smoking, while ChatGPT and Copilot said Rishi Sunak and Nicola Sturgeon were still in office even after they had left. Perplexity misquoted BBC News in a story about the Middle East, saying Iran initially showed “restraint” and described Israel’s actions as “aggressive”.
Deborah Turness, the CEO of BBC News and Current Affairs, warned that the companies developing these tools are “playing with fire” and called for a new conversation with AI tech providers to address the issue of misinformation.
Apple faced similar issues with its Apple Intelligence feature, pulling it after the BBC complained that its notification summaries misrepresented news stories.
Follow-up: Are those factual inaccuracies due to the chatbots not being up to date in their free versions?
AI chatbots such as ChatGPT and Copilot are indeed prone to factual inaccuracies and update issues in their free versions. The free version of ChatGPT, for instance, was significantly upgraded by OpenAI in May 2024, addressing its biggest pain points, including a lack of internet access and a knowledge cutoff. However, these chatbots can still provide outdated or incorrect information, which can be misleading and have harmful consequences, especially in critical applications like healthcare or finance.
- ChatGPT: A conversational AI chatbot with advanced capabilities, including writing and STEM tasks, but it may provide factual inaccuracies and has update issues.
- Copilot: An AI chatbot that outperformed earlier versions of ChatGPT, but it can still provide incorrect information and has limitations, such as a lack of internet access and a knowledge cutoff.
- Gemini: A chatbot that allows users to generate images, but it can produce distortions, factual inaccuracies, and misleading content when asked about news and current affairs.
- Perplexity: A chatbot that can provide inaccurate information and has been found to have significant issues when asked about current affairs.
The issue of factual inaccuracies in AI chatbots is a growing concern: research has found that more than half of the AI-generated answers these chatbots provide contain significant issues, including incorrect facts, misrepresented information, and opinion presented as up-to-date fact. To mitigate these issues, it is essential to use data-driven feedback loops, keep humans in the learning process, and adjust models through methods such as reinforcement learning, supervised, unsupervised, semi-supervised, and meta-learning. Additionally, leveraging knowledge bases like Wikipedia and semantic interpretation techniques can improve the accuracy of natural language processing systems and reduce factual errors in chatbots.
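To make the knowledge-base idea concrete, here is a minimal sketch (in Python) of one such mitigation: answering only from trusted, retrieved snippets and abstaining when nothing matches, rather than guessing. The `KNOWLEDGE_BASE` contents, the keyword-matching retrieval, and the `answer_grounded` function are all illustrative assumptions, not part of any real chatbot's API.

```python
# Illustrative sketch: ground answers in a trusted knowledge base and abstain
# when no supporting snippet is found, instead of hallucinating an answer.
# The knowledge base and matching logic here are deliberately tiny stand-ins.

KNOWLEDGE_BASE = {
    "nhs vaping": (
        "The NHS does recommend vaping as an aid for adults "
        "who want to quit smoking."
    ),
    "rishi sunak office": (
        "Rishi Sunak is no longer in office as UK Prime Minister."
    ),
}

ABSTAIN = "I don't have a verified source for that."

def answer_grounded(question: str) -> str:
    """Return a trusted snippet only if every keyword of a topic appears
    in the question; otherwise abstain rather than guess."""
    q = question.lower()
    for topic, snippet in KNOWLEDGE_BASE.items():
        if all(word in q for word in topic.split()):
            return snippet
    return ABSTAIN

if __name__ == "__main__":
    # Grounded answer: the question matches the "nhs vaping" topic.
    print(answer_grounded("Does the NHS recommend vaping to quit smoking?"))
    # No matching source: the system abstains instead of inventing facts.
    print(answer_grounded("Who won the 2030 World Cup?"))
```

Real systems replace the keyword match with semantic retrieval over a large corpus, but the design principle is the same: an answer must be traceable to a source, and "no source" must be an allowed output.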
Overall, while AI chatbots have the potential to be incredibly useful tools, their factual inaccuracies and update issues must be addressed to ensure that they provide reliable and trustworthy information.