Their ability to influence our thoughts and behaviors in real time also opens the door to dangerous manipulation

Your ears will soon become the home of an AI assistant that whispers guidance to you as you go about your everyday routine. It will actively participate in every aspect of your life, offering helpful information. All of your experiences, including interactions with strangers, friends, family, and coworkers, will be mediated by this AI.

It goes without saying that the word “mediate” is a euphemism for giving an AI control over your actions, words, feelings, and thoughts. Many will find this idea unsettling, but as a society, we will embrace the technology and allow ourselves to be constantly mentored by friendly voices that advise and lead us with such competence that we will quickly wonder how we ever managed without real-time support.

Context awareness

Most people associate the term “AI assistant” with outdated tools like Siri or Alexa, which let you make straightforward requests by speaking commands. That is the wrong mental model, because next-generation assistants will feature a new element that changes everything: context awareness.

With this extra feature, these systems will be able to react not just to your spoken words but also to the sounds and sights you are currently taking in from your surroundings, which are being recorded by microphones and cameras on AI-powered wearables.

Whether you like it or not, context-aware AI assistants will become commonplace and will profoundly alter our society in the near term, unleashing a barrage of new threats to privacy alongside a plethora of powerful capabilities.

Wherever you go, these assistants will offer insightful information that is perfectly timed to what you are doing, saying, and seeing. Because the advice is delivered so effortlessly and naturally, it will feel like a superpower: a voice in your head that knows everything, from the names of plants you pass on a hike to the specifications of products in store windows to the best recipe you can make with the random ingredients in your refrigerator.

On the downside, if companies use these trusted assistants to deliver targeted conversational advertising, this omnipresent voice could be extremely persuasive, even manipulative, as it helps you with your everyday tasks.

Multi-modal LLMs

It is possible to reduce the risk of AI manipulation, but doing so requires legislators to focus on this important matter, which has received little attention so far. Regulators have not had much time, of course: less than a year has passed since the invention of the technology that makes context-aware assistants viable for general use.

The technology is called a multi-modal large language model, a new class of LLM that can take in audio, video, and images in addition to text. This is a significant development because multi-modal models have suddenly given AI systems eyes and ears. The systems will use these sensory organs to evaluate the environment and provide real-time guidance.

In March 2023, OpenAI released GPT-4, the first multi-modal model to see widespread use. The most recent major entrant in this market is Google, which just launched its Gemini LLM.

The most intriguing contribution is the AnyMAL multi-modal LLM from Meta, which additionally recognizes motion cues. This model incorporates a vestibular sense of movement in addition to eyes and ears. It could be used to build an AI assistant that takes into account your physical state of motion in addition to seeing and hearing everything you see and hear.

Meta’s new glasses

Now that this AI technology is accessible to the general public, companies are racing to incorporate it into products that can assist you in your daily interactions. This means attaching motion sensors, a microphone, and a camera to your body in a way that feeds the AI model and allows it to provide context-aware assistance throughout your daily life.

Glasses are the most logical location for these sensors because they guarantee that cameras are aimed in the direction of the wearer’s gaze. In addition to capturing the soundscape with spatial fidelity, stereo microphones on eyewear (or earbuds) allow the AI to identify the direction of sounds, such as crying children, honking cars, and barking dogs.

Meta is the company that is now setting the standard for products in this area. They started selling a new version of their Ray-Ban smart glasses with superior AI models two months ago.

Humane, a prominent company that also joined this market, created a wearable pin that has cameras and microphones. When it begins shipping, this gadget is sure to pique the interest of hardcore tech fans.

Nevertheless, glasses-worn sensors perform better than body-worn sensors because they can sense the direction in which the wearer is looking and can add visual elements to the line of sight. These elements are currently just simple overlays, but in the next five years they will develop into rich, immersive mixed-reality experiences.

In the coming years, context-aware AI assistants will be widely adopted, regardless of whether they are enabled by sensor-equipped glasses, earbuds, or pins. This is due to the powerful features they will provide, such as historical information and real-time translation of foreign languages.

The most important thing about these devices, though, is that they will help us in real time when we interact with others. For example, they can remind us of the names of coworkers we meet on the street, suggest funny things to say during lulls in conversation, or even alert us to subtle facial or vocal cues that indicate when someone is getting bored or irritated. These are micro-expressions that are invisible to humans but easily picked up by artificial intelligence.

Indeed, as they provide us with real-time coaching, whispering AI assistants will make everyone appear more charming, more intelligent, more socially aware, and possibly more persuasive. It will also turn into an arms race in which assistants try to give us the upper hand while shielding us from the persuasion of others.

Conversational influence

Naturally, the greatest dangers do not come from AI assistants chiming in when we talk with loved ones, friends, and romantic partners. The largest threats come from the potential for corporate or governmental organizations to inject their own agendas, opening the door to powerful conversational influence techniques that target us with AI-generated content tailored to each person in order to maximize its impact. The Responsible Metaverse Alliance recently released Privacy Lost to inform the world about these manipulative threats.

Many individuals would prefer to avoid the unsettling possibility of having AI assistants whisper in their ears. The issue is that those of us who reject the features will be at a disadvantage once a sizable portion of users are being coached by powerful AI technologies.

People you meet will probably expect you to receive real-time information about them while you converse, and AI coaching will become ingrained in everyday social norms. Asking someone what they do for a living or where they grew up could become impolite because such details will either be whispered in your ear or appear in your glasses.

Furthermore, when you say something clever or insightful, no one will be able to tell whether you came up with it yourself or are simply repeating the AI assistant in your head. The truth is that we are moving toward a new social order in which we are not merely influenced by AI but effectively augmented in our mental and social abilities by AI tools that corporations provide.

Although this technological trend, which can be called “augmented mentality,” is unavoidable, it seemed that more time would pass before AI products were fully capable of directing our everyday thoughts and actions. However, recent developments like context-aware LLMs mean there are no longer any technological obstacles.

This is going to happen, and it will probably start an arms race in which the titans of big tech compete to be the first to put the strongest AI guidance into your eyes and ears. Naturally, this corporate push may also create a dangerous digital divide between those who can afford intelligence-enhancing tools and those who cannot. Alternatively, people who cannot pay a subscription fee could be pressured to accept sponsored advertising delivered through aggressive AI-powered conversational influence.

Corporations will soon have the ability to literally put voices in our heads, influencing our thoughts, feelings, and behavior. This is the AI manipulation problem, and it is deeply concerning.

Unfortunately, this issue was not addressed in the recent White House Executive Order on AI, and it was only briefly mentioned in the EU’s recent AI Act.

If these challenges are appropriately addressed, consumers can benefit from AI assistance without it leading society down a bad path.

The advent of context-aware AI assistants raises legitimate concerns about their impact on human relationships and authenticity. While these assistants promise to provide constant help in daily life, they could lead to a growing distortion of reality and to interactions based on pretense.

When people rely on AI to suggest what to say and how to behave, it will be difficult to distinguish what truly comes from the individual from what is dictated by the algorithm. In this way, people will end up wearing a kind of “digital mask” in social relationships.

Moreover, access to these assistants risks creating an elite group of artificially “empowered” people at the expense of those who cannot afford them.

Rather than improving the quality of human relationships, the pervasive “secret prompter” provided by AI assistants could paradoxically distance us even more from each other, making interactions colder and more artificial, so that the most sincere people will be those excluded from the technology.