Even machines can create something new

Human imagination is amazing. You can picture a pink elephant, then give the same elephant polka dots and send it flying through the sky. How does it happen? Your brain’s neurons activate in different ways based on what you already know.

For us humans, it’s easy to envision an object with different attributes. However, despite breakthroughs in deep neural networks that can match or even outperform humans in certain tasks, computers still struggle with the uniquely human capability of “imagination”.

Now, a USC research team has developed an A.I. that uses human-like skills to imagine a never-before-seen object with different attributes. The paper, titled “Zero-Shot Synthesis with Group-Supervised Learning”, was published at the 2021 International Conference on Learning Representations on May 7.

“We were inspired by human visual generalization capabilities to try to simulate human imagination in machines”, said the study’s lead author Yunhao Ge, a computer science Ph.D. student working under the supervision of Laurent Itti, a computer science professor.

“Humans can separate their learned knowledge by attributes, for instance, shape, pose, position, color, and then recombine them to imagine a new object. Our paper attempts to simulate this process using neural networks”.

It’s easy for us to imagine a red apple and a blue car, and then take the car and apply the color of the apple to it. Now, this can be simulated with neural networks.

Neural networks typically generate new images from the examples they are given, but they tend to treat each image as a whole rather than extracting specific attributes.

The goal now is an A.I. that can extract specific attributes and apply them to a wide range of examples it has never seen before.

In this new study, the researchers attempt to overcome this limitation using a concept called disentanglement. Disentanglement can be used to generate deepfakes, for instance, by separating the movements of a human face from its identity.

By doing this, said Ge, “people can synthesize new images and videos that substitute the original person’s identity with another person, but keep the original movement”.

Similarly, the new approach takes a group of sample images, rather than one sample at a time as traditional algorithms have done, and mines the similarity between them to achieve something called controllable disentangled representation learning.

Then, it recombines this knowledge to achieve controllable novel image synthesis, or what you might call imagination.

To do that, they used Group-Supervised Learning (GSL), a machine learning task that decomposes inputs into a disentangled representation with swappable components that can be recombined to create a new object.

For example, take a set of font images that vary in letter, color, background, size, and style. The attributes are separated and then recombined across images to produce a new one: a font built from a combination of the existing elements.

font disentanglement
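The idea of swappable components can be illustrated with a small Python sketch. This is a toy illustration, not the researchers' implementation: the attribute names come from the font example above, while the `Representation` class and the `swap` and `recombine` helpers are hypothetical. Each slot here holds a plain label, whereas a trained model would presumably hold a learned numeric code per attribute and decode the result back into an image.

```python
from dataclasses import dataclass, replace

# Hypothetical attribute names, taken from the font example in the text.
ATTRIBUTES = ("letter", "color", "background", "size", "style")


@dataclass(frozen=True)
class Representation:
    """Toy disentangled representation: one independent slot per attribute."""
    letter: str
    color: str
    background: str
    size: str
    style: str


def swap(a, b, attribute):
    """Return copies of a and b with a single attribute slot exchanged."""
    if attribute not in ATTRIBUTES:
        raise ValueError(f"unknown attribute: {attribute}")
    new_a = replace(a, **{attribute: getattr(b, attribute)})
    new_b = replace(b, **{attribute: getattr(a, attribute)})
    return new_a, new_b


def recombine(sources):
    """Build a new representation, copying each attribute from a chosen sample.

    `sources` maps an attribute name to the sample that attribute is taken from.
    """
    return Representation(**{name: getattr(sources[name], name) for name in ATTRIBUTES})


red_a = Representation("A", "red", "white", "large", "serif")
blue_b = Representation("B", "blue", "black", "small", "italic")

# Exchange only the color; every other attribute stays where it was.
blue_a, red_b = swap(red_a, blue_b, "color")

# Mix attributes from both samples into a combination neither sample had.
novel = recombine({
    "letter": red_a,
    "color": blue_b,
    "background": red_a,
    "size": blue_b,
    "style": red_a,
})
```

Because every attribute lives in its own slot, changing one of them leaves the others untouched, which is what makes the recombination controllable.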

Using their technique, the group generated a new dataset containing 1.56 million images that could help future research in the field.

While disentanglement is not a new idea, the researchers say their framework is compatible with nearly any type of data or knowledge. This widens the range of possible applications.

In the field of medicine, it could help doctors and biologists discover more useful drugs by disentangling a medicine’s function from its other properties and then recombining them to synthesize new medicines. Imbuing machines with imagination could also help create safer A.I. by, for instance, allowing autonomous vehicles to imagine and avoid dangerous scenarios they never saw during training.

The following image shows an example of how disentanglement occurs.

Disentanglement of a chair

The following image shows another example of disentanglement with a chair.

disentanglement of a chair

Given an input shape (a), a geometry code and a structure code are extracted. One of them is held fixed while a random sample is generated for the other (b). In the first row of (b), the geometry code is kept unchanged. In the second row, the structure code is kept unchanged.
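The "fix one code, sample the other" procedure in the caption can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the codes are short lists of numbers, random codes are drawn from a standard normal distribution, and `sample_code` and `generate_row` are hypothetical helper names. The encoder that extracts the codes and the decoder that turns them back into a shape are left out.

```python
import random


def sample_code(rng, dim):
    """Draw a random code of the given length (assumed standard normal entries)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def generate_row(geometry, structure, fixed, count, rng):
    """Return `count` (geometry, structure) pairs, keeping the `fixed` code unchanged."""
    if fixed not in ("geometry", "structure"):
        raise ValueError("fixed must be 'geometry' or 'structure'")
    row = []
    for _ in range(count):
        if fixed == "geometry":
            # Keep the input geometry, draw a fresh structure code.
            row.append((list(geometry), sample_code(rng, len(structure))))
        else:
            # Keep the input structure, draw a fresh geometry code.
            row.append((sample_code(rng, len(geometry)), list(structure)))
    return row


rng = random.Random(0)  # seeded so the sketch is repeatable

# Hypothetical codes standing in for what an encoder would extract from shape (a).
input_geometry = [0.2, -1.0, 0.5]
input_structure = [1.0, 0.0]

first_row = generate_row(input_geometry, input_structure, "geometry", 4, rng)
second_row = generate_row(input_geometry, input_structure, "structure", 4, rng)
```

Each pair in `first_row` shares the input's geometry code, and each pair in `second_row` shares its structure code, mirroring the two rows described in the caption.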

Imagination seemed to be one of the only human capabilities an A.I. couldn’t simulate. However, these new studies suggest that artificial intelligence can match, and even go beyond, skills once thought to be exclusively human. What will be left for us?

Will we reach another level of awareness thanks to the answers A.I. gives us? Will we focus more on spiritual matters than on material ones once A.I. automates our daily routines? If the answer is that we will live better together, because we focus more on relationships than on things, that may be a positive outlook. If not, we might become slaves to technology, living without empathy.

Source: thebrighterside.com