Gopher’s 280 billion parameters versus GPT-3’s 175 billion

DeepMind, the London-based A.I. research company owned by Google parent Alphabet, has created an artificial intelligence algorithm for language tasks, including reading comprehension and answering questions on a broad range of subjects, that outperforms existing software such as GPT-3. The system is called Gopher, and in a few areas, such as a high-school reading comprehension test, it approaches human-level performance, though it falls short on common-sense and mathematical reasoning.

With Gopher, DeepMind has made it clear that it wants to play a bigger role in the advancement of natural language processing. The company is best known for developing an artificial intelligence system that beat the world’s best human player at the strategy game Go, a key milestone in computer science, and it recently made a breakthrough in applying A.I. to predicting protein structure. Nonetheless, compared to competing laboratories such as OpenAI (creator of GPT-3) and the A.I. research arms of Facebook, Microsoft, Alibaba, Baidu, and even its sister company Google, DeepMind has done significantly less work on natural language processing (NLP).

These companies have developed huge linguistic A.I. systems based on neural networks with hundreds of millions to hundreds of billions of parameters. They are trained on massive archives of books and material gathered from the Internet, and are known among A.I. experts as “ultra-large language models.” The benefit is that they can perform a wide range of language tasks, such as translation, question answering, and text authoring, with little or no specific training in those areas.


According to the data published by DeepMind, their language model was significantly more accurate than existing ultra-large language models on many tasks, particularly answering questions about specialized subjects like science and the humanities, and equal or nearly equal to them on others, such as logical reasoning and mathematics.

Gopher has around 280 billion distinct parameters. This puts it ahead of OpenAI’s GPT-3, which has 175 billion. However, it is smaller than the 535-billion-parameter Megatron system that Microsoft and Nvidia worked on earlier this year, as well as Google’s 1.6-trillion-parameter and Alibaba’s 10-trillion-parameter models.

Larger language models have already resulted in more fluent chatbots and digital assistants, more accurate translation software, better search engines, and systems that can summarize complex documents. However, DeepMind says it doesn’t intend to commercialize Gopher.

Because most human knowledge is contained in language, some academics, including some at OpenAI, believe that by creating larger and larger language models, scientists will eventually reach “artificial general intelligence.” This is the term computer scientists use to describe artificial intelligence (A.I.) that is as flexible as a human’s.

At the same time, A.I. researchers and social scientists have raised ethical concerns about ultra-large language models, since they frequently learn racial, ethnic, and gender biases from the texts on which they are trained, and the models are so complex that it is difficult to detect and trace these biases before the systems are deployed.

Another problem with such models is that they use a lot of electricity to train and run, which could exacerbate global warming. In addition, some linguists and A.I. researchers have urged tech corporations to stop building ever-larger A.I. systems because, despite their scale, they still don’t achieve human-level language understanding.


However, the DeepMind ethics team stated that there is no one-size-fits-all solution to many of the problems that ultra-large language models cause.

DeepMind also published separate research on a technique that could make the creation of large language models more energy-efficient, and potentially make it easier for researchers to detect bias and toxic language and to verify information sources. The technology, dubbed the Retrieval-Enhanced Transformer (Retro), has access to a 2-trillion-word database that the software uses as a kind of memory.

When given a human-written prompt, the system looks up the passage in its database that is closest to the prompt, then the next-closest passage, and uses those retrieved passages to guide its response.
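The retrieval step described above can be sketched in a few lines. This is only an illustrative toy, not DeepMind’s actual Retro implementation: real Retro compares BERT embeddings over a 2-trillion-word database with an approximate nearest-neighbor index, whereas here the sample passages, the word-count “embedding,” and the cosine-similarity lookup are all stand-in assumptions.

```python
# Toy sketch of nearest-neighbor passage retrieval (NOT the real Retro system).
from collections import Counter
import math

def embed(text):
    """Stand-in embedding: a word-count vector (real Retro uses BERT embeddings)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(prompt, database, k=2):
    """Return the k passages most similar to the prompt."""
    q = embed(prompt)
    return sorted(database, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]

# Hypothetical miniature "database" standing in for Retro's 2-trillion-word store.
database = [
    "Gophers are burrowing rodents found across North America.",
    "The Go board game originated in China thousands of years ago.",
    "Transformers are neural networks built around attention.",
]

neighbors = retrieve("burrowing rodents of North America", database, k=2)
# The retrieved passages would then be fed to the language model
# alongside the prompt to condition ("guide") the generated text.
```

Because the retrieved passages are explicit inputs rather than knowledge baked into the model’s weights, this design is what lets researchers trace which source text influenced a given output.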

This type of memory minimizes the amount of data the language model must internalize, so the model can be smaller and use less energy. DeepMind claims that its Retro model, with 7 billion parameters, can match the performance of OpenAI’s GPT-3, despite GPT-3 being 25 times larger. And since researchers can see exactly which passages from the database the Retro software used to generate its output, the DeepMind researchers believe it will be easier to uncover bias or inaccuracy.

It’s good that companies like DeepMind are trying to address A.I.’s possible ethical issues, but more remains to be done, for example, giving people a way to verify that an A.I. is truly neutral in any situation.

Source fortune.com