AI pollution and the future of content

How AI-generated content is flooding the Internet and threatening the quality of digital information

Quality vs. Speed

We have learned to integrate artificial intelligence into our lives, just as we long ago became accustomed to using the internet for everything. AI has become a valuable source of information, especially when it helps us understand a concept or learn something new. Many people, however, have quickly seized the opportunity to exploit it not as intelligent assistance (filling knowledge gaps or speeding up repetitive tasks) but as a complete replacement for themselves in creating content. As a result, the internet is gradually filling with junk content that pollutes the existing pool of information, and it is becoming increasingly difficult to tell content created by people, even when it is of low quality, from content generated by AI.

This problem manifests itself in various areas of content creation: textual, photographic, audio, and video.

Text

In text generated entirely by AI, conceptual, grammatical, and translation errors are likely. The opportunistic user typically doesn’t bother to correct them and instead turns the output as quickly as possible into a product to present or sell.

Images

AI-generated images are often eye-catching and well made, but they require a completely different analytical approach from traditional photos. With Photoshop, we started from a real photo and corrected specific defects to improve it. With artificially generated images, we must examine the whole picture to find inconsistencies and imperfections that its apparent quality may hide. Yet people are often captivated by an attractive image even when it is full of inconsistencies and errors, and those who publish such images have little interest in correcting them.

Audio

AI-generated audio is harder to identify because voices and their nuances sound increasingly natural, but it often has problems with pronunciation and prosody, and AI-generated music often has structural problems as well. Those who try to create and sell songs with AI frequently lack a real understanding of musical structure and the rules of composition, so the results are technically inadequate.

Video

Video remains the most complex content to create artificially. It is therefore still easy to identify and hard to exploit, although some people try to use it anyway.

Ethics

All of this is comparable to a food industry that, favoring quantity and speed over quality, tries to sell substandard products at all costs by disguising them with attractive packaging. As with food, in digital content too, quality is sacrificed in favor of production speed and immediate profit.

Ethics is increasingly taking a back seat to profit. On one hand, we must fight those who knowingly spread inaccurate and misleading information to manipulate the public. On the other hand, there are those who simply want to make money and look for shortcuts, which results in content that is poor in quality, if not full of errors.

The problem is that the public also accepts low-quality content full of defects. Many people lack the critical judgment to reject carelessly made content. This doesn’t mean prioritizing aesthetics. It means being able to evaluate the creator’s intent and understand whether they are creating something for a deeper purpose than just selling a product.

The future of AI training

Another critical aspect of this trend concerns the training of future AIs. Even without considering the commercial exploitation of artificial content, we face a significant challenge: future AIs will inevitably be trained on content generated by AIs themselves (so-called bootstrapping). This creates a potentially problematic cycle, where each generation of AI could amplify the errors of the previous one. Previously, most existing content was produced by experts who dedicated time and study to its creation, but now we risk entering a spiral of qualitative degradation.

Added to this is the problem of AI ‘hallucinations,’ that is, the tendency to generate information that seems plausible and coherent but is actually inaccurate or completely invented.

Hallucinations are particularly insidious because AI presents this information with the same confidence and style as correct information, which makes it difficult for the user to distinguish real facts from erroneously generated content.

To mitigate these problems, we would need systems that track and verify the origin of content, along with separate datasets that clearly distinguish verified human content from AI-generated content. This would make it possible to maintain a ‘clean’ database for training future AIs and avoid amplifying errors and inaccuracies through successive generations of models.
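The idea of keeping separate datasets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Record` type, the `provenance` field, and its label values are hypothetical, and the sketch assumes that some upstream process has already verified and labeled each item, which is the hard part in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A piece of content with a provenance label (hypothetical schema)."""
    text: str
    # Assumed labels: "human_verified", "ai_generated", or "unknown".
    provenance: str


def split_by_provenance(records):
    """Separate verified human content from everything else.

    Anything not explicitly labeled "human_verified" is kept out of the
    clean set, so unlabeled content is treated as untrusted by default.
    """
    clean, quarantined = [], []
    for record in records:
        if record.provenance == "human_verified":
            clean.append(record)
        else:
            quarantined.append(record)
    return clean, quarantined


if __name__ == "__main__":
    sample = [
        Record("Hand-written tutorial", "human_verified"),
        Record("Auto-generated summary", "ai_generated"),
        Record("Scraped page of unclear origin", "unknown"),
    ]
    clean, quarantined = split_by_provenance(sample)
    print(len(clean), "clean,", len(quarantined), "quarantined")
```

Treating unlabeled content as untrusted is a design choice made for this sketch: it keeps the clean set small but avoids letting unverified material slip into training data.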

A new approach to search

The challenge that AI has posed to Google has created an additional problem: a change in the search process. Previously, we would search Google for anything and it would return a series of pages to choose from. With AI search, we no longer choose the sources. Instead, we are automatically given answers to our questions. This change may lead to more targeted results, but it risks reducing the plurality of content and could facilitate forms of indirect censorship.

Furthermore, this new paradigm raises important questions about the sustainability of content creation. We had become accustomed to a system where creators were primarily compensated through Google advertising, which encouraged them to produce quality content. But if answers are drawn directly from an AI, who will visit creators’ websites? How will the production of original content remain economically sustainable? What will happen to in-depth research, specialized discussion forums, niche communities, and searching for files across the network? The loss of these possibilities would represent a significant impoverishment of the digital ecosystem.
