AI mistakes vs human mistakes


Understanding the nature of AI errors and how they differ from human mistakes

Making mistakes is part of human nature. We all commit errors daily, whether in familiar or unfamiliar tasks. These errors range from insignificant to devastating. When we make mistakes, we can destroy relationships, damage our professional reputation, and sometimes create life-threatening situations.

Throughout history, we’ve developed protective measures against typical human errors. Today’s security practices reflect this: casinos, for example, switch dealers periodically to prevent fatigue-related mistakes. Before surgeries, medical staff mark the correct body part and track all instruments to prevent leaving them inside patients. We’ve established various systems, from copyediting to double-entry bookkeeping to appellate courts, that effectively catch and correct human errors.

Society is now incorporating a fundamentally different type of error-maker: AI. While technologies such as large language models (LLMs) can handle many cognitive tasks traditionally done by humans, they aren’t error-free. When chatbots recommend “eating rocks” or “adding glue to pizza,” it might seem ridiculous. However, what sets AI mistakes apart from human ones isn’t how often they occur or how serious they are—it’s their unusual nature. AI systems make errors in ways that differ fundamentally from human error patterns.

This fundamental difference creates challenges and dangers in how we use AI. We must create new protective measures that address these unique characteristics and prevent AI errors from causing harm.

Human mistakes vs AI mistakes

Life experience lets us generally predict when and where humans will make mistakes. Human errors typically occur at the edge of one’s knowledge: most people, for example, would struggle with calculus problems. Human mistakes also tend to cluster; someone who makes one calculus error is likely to make more. They wax and wane predictably with factors like tiredness and lack of focus. And human mistakes are often accompanied by ignorance about the topic: someone who gets calculus wrong is also likely to admit to not knowing calculus.

Our traditional error-correction methods work well when AI systems make similar mistakes to humans. However, modern AI systems—especially large language models (LLMs)—show different error patterns.


AI errors appear unpredictably and don’t cluster around specific subjects. LLMs tend to distribute mistakes evenly across their knowledge base. They’re just as likely to fail at calculus as they are to make absurd claims like “cabbages eat goats.”

Moreover, AI mistakes aren’t accompanied by ignorance. An LLM will be just as confident when saying something completely wrong, and obviously so to a human, as it will be when saying something true. This random inconsistency makes it difficult to rely on LLMs for complex problems requiring multiple steps. When using AI for business analysis, it’s not enough that it understands what drives profit; you need assurance that it won’t suddenly forget what money is.


Two research directions emerge from this challenge. One involves developing LLMs that produce errors more similar to human ones. The other focuses on creating new error-detection systems specifically designed for typical LLM mistakes.

We’ve already developed tools to make LLMs behave more like humans. Many come from “alignment” research, which strives to make models operate according to their human creators’ intentions and goals. ChatGPT’s breakthrough success largely came from one such technique: reinforcement learning from human feedback. This approach rewards AI models when humans approve of their responses. Similar methods could teach AI systems to make more human-comprehensible mistakes, for instance by specifically penalizing errors that people find incomprehensible.
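To make “rewarding approved responses” concrete, here is a minimal sketch of the pairwise preference loss that many RLHF reward models are trained with; the function name and example tensors are illustrative, not taken from any particular library.

```python
import torch
import torch.nn.functional as F

def reward_model_loss(reward_chosen: torch.Tensor,
                      reward_rejected: torch.Tensor) -> torch.Tensor:
    # Pairwise preference loss: each pair holds the reward model's scalar
    # score for the response a human preferred and for the one they rejected.
    # Minimizing the loss pushes human-preferred responses to score higher.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# If the preferred answer already scores higher, the loss is small.
loss = reward_model_loss(torch.tensor([2.1]), torch.tensor([0.3]))
```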

Some of our existing systems for catching human errors can help identify AI mistakes. Having LLMs verify their own work can reduce errors to some extent. However, LLMs might also provide explanations that sound reasonable but are actually nonsensical.
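As an illustration (not any particular vendor’s API), a self-check can be as simple as feeding the model its own answer back for review; `client.complete` below is a hypothetical single-prompt completion call.

```python
def self_check(client, question: str, answer: str) -> str:
    # Ask the model to audit its own earlier answer. `client` is assumed to
    # expose a complete(prompt) -> str method; any chat-completion API would do.
    critique_prompt = (
        f"Question: {question}\n"
        f"Proposed answer: {answer}\n"
        "Re-derive the answer step by step. If the proposed answer is correct, "
        "reply 'OK'. Otherwise, describe the mistake."
    )
    return client.complete(critique_prompt)
```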

AI requires some error prevention methods that differ completely from those we use for humans. Since machines don’t experience fatigue or frustration like people do, one effective approach involves asking an LLM the same question multiple times with slight variations, then combining these responses. While humans would find such repetition irritating, machines can handle it without complaint. By comparing multiple responses to similar questions, you can identify potential errors or inconsistencies in the AI’s outputs.
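A sketch of that repeated-questioning idea, again assuming a hypothetical `client.complete` call: ask the question several ways, then treat low agreement between the answers as a warning sign.

```python
from collections import Counter

def ask_with_variations(client, question: str, templates: list[str]):
    # `templates` are paraphrase patterns, e.g. "Answer briefly: {q}".
    answers = [client.complete(t.format(q=question)) for t in templates]
    # Majority-vote the normalized answers; the agreement ratio doubles as a
    # rough confidence score for flagging inconsistent (likely wrong) outputs.
    counts = Counter(a.strip().lower() for a in answers)
    best_answer, votes = counts.most_common(1)[0]
    return best_answer, votes / len(answers)
```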


Similarities and differences

Researchers don’t yet fully understand how LLM errors differ from human ones. Some AI peculiarities appear more human-like upon closer examination. Take prompt sensitivity: LLMs can give vastly different answers to slightly altered questions. Survey researchers observe similar behavior in humans, where small changes in question wording can dramatically affect poll responses.

LLMs also seem to have a bias towards repeating the words that were most common in their training data. This might mirror the human “availability heuristic,” where we spit out the first thing we remember instead of thinking carefully. Similar to humans, some LLMs seem to lose focus in lengthy texts, recalling information better from the beginning and end. However, research shows improvement in this area: LLMs trained extensively on information retrieval from long texts show more consistent performance throughout documents.
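Claims like this are usually tested with a simple recall probe: hide one fact at a chosen depth inside filler text and see whether the model can still retrieve it. A hedged sketch, again with a hypothetical `client.complete` call:

```python
def recall_probe(client, fact: str, question: str, depth: float,
                 filler: str = "The sky was clear that day. ",
                 n_sentences: int = 400) -> str:
    # depth is a fraction from 0.0 (start of the context) to 1.0 (end).
    # Running the probe across many depths shows whether the model recalls
    # the middle of a long document as well as its beginning and end.
    sentences = [filler] * n_sentences
    sentences.insert(int(depth * n_sentences), fact + " ")
    prompt = "".join(sentences) + "\n" + question
    return client.complete(prompt)
```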

Sometimes, strangely, LLMs behave more like humans than we would expect. Some effective methods for “jailbreaking” LLMs (making them ignore their programmed restrictions) resemble human social-manipulation tactics, like impersonation or claiming a request is just a joke. However, other successful jailbreaking techniques would never fool a human: one research team discovered that posing a dangerous question, such as asking for bomb-making instructions, as ASCII art (text-based pictures) would bypass an LLM’s safeguards.

While humans occasionally make inexplicable, inconsistent errors, these instances are uncommon and often signal underlying issues. We typically don’t allow people showing such behavior to make important decisions. Similarly, we should limit AI systems to tasks that match their actual capabilities, always considering the potential consequences of their mistakes.

While we can often spot human errors through context, inconsistency, or lack of confidence, AI systems can present incorrect information with complete assurance and in ways that seem perfectly logical at first glance.

This challenge becomes especially concerning in our current digital age, where information spreads rapidly across social media and other platforms. When AI systems generate content that contains subtle but significant errors, these mistakes can quickly propagate through shares, reposts, and citations before anyone realizes they’re incorrect. Unlike human-generated misinformation, which often shows clear signs of bias or logical flaws, AI-generated errors can be remarkably sophisticated and harder to identify without careful verification.


However, the solution isn’t to hide or censor AI mistakes when they occur. Instead, we need transparency and open discussion about these errors to better understand them and improve our systems. Censorship would not only be ineffective but could also create a dangerous illusion of infallibility. By acknowledging and studying AI errors openly, we can develop better detection methods and help users become more discerning consumers of AI-generated content.

Crucially, we must ensure that AI systems remain tools to assist human decision-making rather than becoming autonomous arbiters of human fate. This is particularly vital in contexts where AI decisions can significantly impact people’s lives and livelihoods, such as content moderation on social media platforms. When AI systems flag potential violations that could result in account bans or revenue loss, there must be robust human oversight and clear appeal processes. We cannot allow automated systems to make unilateral decisions that could devastate individuals’ careers and businesses without meaningful human review and recourse.

Moving forward, success will likely come from a hybrid approach: adapting traditional error-checking methods where appropriate while developing novel safeguards specifically designed for AI systems. This might include implementing multiple verification layers, creating better alignment techniques, and establishing clear boundaries for AI system deployment based on their reliability in specific contexts. Most importantly, we need to cultivate a healthy skepticism and implement robust fact-checking processes when working with AI-generated content.

The key is not to view AI systems as inherently more or less error-prone than humans, but rather to recognize them as fundamentally different types of error-makers. By understanding these differences, we can better harness AI’s potential while protecting against its unique vulnerabilities.