Conversations between humans and machines are becoming a normal part of life. Machines respond to user instructions, emotions and the context of communication. As machines communicate more naturally, they will provide people with new insights and assist human thinking.
Humans are increasingly talking with computers. Upwards of 100 million smartphones are currently equipped with a voice assistant and the sales of smart speakers are rapidly increasing. Microwave ovens, refrigerators, toilets and even electronic pianos are being equipped with voice interfaces. In addition, a chaotic array of services exists for text-based chatbots, which are being adopted by many companies.
These products offer a broad range of functions including information search, device operation and product ordering, although only a limited number of these capabilities are used on a regular basis. A 2018 study [1] reported that approximately 60% of smart speakers were used primarily for playing music, with other capabilities such as the operation of home electronics far less popular. For such interfaces to become more useful, machines must be able to converse with people in a more natural way.
[1] The Future of Retail 2018, Walker Sands Communications
Efforts toward making conversations with machines more natural are progressing, with one effort focusing on conversation continuity. A conversation rarely ends after a single exchange of sentences. It continues as a sequence of turns, during which the subject and object of a sentence are frequently omitted. While humans can infer this omitted information, it is difficult for machines to do so. Furthermore, machines presently require a wake-up command before they start listening. Some voice assistants have overcome this problem for every utterance except the first one you speak to them. Machines can also memorize the content of previous conversations, which helps them determine the omitted subjects and objects in a sentence.
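The idea of using conversational memory to fill in omitted information can be sketched in a few lines of Python. This is a minimal illustration only, not how any particular voice assistant works: the class name DialogueMemory, the slot names (topic, place, day) and the example requests are all hypothetical, and a real system would first have to extract these slots from speech or text.

```python
class DialogueMemory:
    """Toy conversational memory (hypothetical): keeps the last known value of each slot."""

    def __init__(self):
        self.last = {}

    def complete(self, request):
        # Fill any slot the speaker omitted (None) with the value from earlier turns.
        completed = {
            slot: value if value is not None else self.last.get(slot)
            for slot, value in request.items()
        }
        # Remember everything known so far for later turns.
        self.last.update({s: v for s, v in completed.items() if v is not None})
        return completed


memory = DialogueMemory()

# Turn 1: "What is the weather in Tokyo today?"
first = memory.complete({"topic": "weather", "place": "Tokyo", "day": "today"})

# Turn 2: "How about tomorrow?" (topic and place are omitted by the speaker)
follow_up = memory.complete({"topic": None, "place": None, "day": "tomorrow"})

print(follow_up)
```

In the second turn the speaker names only the day, and the memory supplies the topic and place carried over from the first turn.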
The best current example of technology that uses this type of continuity and conversational memory is called Duplex, an AI technology recently released by Google. It enables AI to have exchanges similar to human phone conversations. Other initiatives underway are focusing on the use of longer-term memory. Such machines can recall content after being instructed by a human partner and provide data like the weather without added directives. The ability to carry on a conversation while searching memory will make human-machine communication more natural, eliminating the need to ask the same questions again and again.
Research is underway on AI-based natural language processing, although progress has been slow compared to other AI-related capabilities such as image recognition. This is due to a variety of problems, including the presence of words with multiple meanings and the lack of learning data. However, a new technology has emerged to help resolve these problems by considering context to correctly determine the meaning of words. A pre-trained model built with this technology has been introduced. As a result, the machine needs only small amounts of data to learn various tasks, including extracting locations, people and other named entities and answering questions appropriately. In fact, this technology and model achieved high scores in eleven tests used to measure the accuracy of natural language processing, such as the Stanford Question Answering Dataset (SQuAD). The advancement of AI is expected to provide a key to the ability of machines to accurately and flexibly respond to spontaneous conversations with humans.
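To make the role of context concrete, the following Python sketch picks the meaning of an ambiguous word by counting how many cue words for each meaning appear in the surrounding sentence. It is a deliberately simple word-overlap heuristic, not the pre-trained model described above, and the SENSES table, its cue words and the function name disambiguate are assumptions made up for this illustration.

```python
import re

# Hypothetical hand-written sense inventory: each meaning lists a few cue words.
SENSES = {
    "bank": {
        "financial institution": {"money", "deposit", "loan", "account"},
        "river edge": {"river", "water", "fishing", "shore"},
    }
}


def disambiguate(word, sentence):
    """Return the sense whose cue words overlap most with the sentence."""
    context = set(re.findall(r"[a-z]+", sentence.lower()))
    senses = SENSES[word]
    return max(senses, key=lambda sense: len(senses[sense] & context))


print(disambiguate("bank", "She opened an account at the bank to deposit money."))
print(disambiguate("bank", "They went fishing on the bank of the river."))
```

The same word resolves to a different meaning in each sentence because the neighboring words differ. Models that learn from large amounts of text capture this kind of contextual signal automatically instead of relying on a hand-built table.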
Approximately 35% of the meaning in a conversation is communicated verbally, with the remaining 65% deduced from nonverbal cues such as gestures and facial expressions [2]. Mastery of nonverbal information will enable machine-to-human communications to more closely replicate human-to-human conversations.
To capture nonverbal information, researchers are actively working on the sensing of human expressions used in conversation. For example, recent research has developed a chatbot that infers its partner’s mood based on facial expressions. This chatbot can also display an expression that matches that of its human partner. For instance, the chatbot speaks with a smile to a happy partner and with a worried look to an unhappy partner. A machine’s ability to recognize and react to a partner’s expression will likely improve machine conversations with people.
The use of nonverbal information is also evolving in many other fields, such as inferring emotion based on tone of voice and generating an emotive comment like “Cute!” based on an image. A machine’s ability to capture emotions and to understand surroundings, as well as a partner’s situation, will enable it to detect subtle emotional changes in a person and adapt appropriately in real time. Two illustrations of this trend are a robot assistant that suggests taking cold medicine when its partner sneezes, and a chatbot that infers the possibility of depression based on a conversation.
[2] Ray L. Birdwhistell, Kinesics and Context: Essays on Body Motion Communication
Humans initiate conversations for a diverse array of reasons, including asking questions, giving instructions, expressing empathy, relieving stress and persuading others. Once machines acquire the ability to converse naturally at a human level and to understand situations and emotions, they will likely be present in many situations where humans engage in conversations. For example, a system that listens to a human-to-human conversation and displays relevant information in real time has enormous potential to enhance understanding between speakers of different nationalities and generations.
Debate showdowns between humans and machines have served as one of the catalysts for realizing machines’ future interactive capabilities. Machines have made significant progress when competing against human debate champions. They now draw on an enormous store of knowledge to unpack a topic they have not seen before and to construct a thesis to advocate, while connecting arguments in a way that facilitates human understanding. As a result, the day may come when humans engage in a series of discussions with machines to inspire new ideas and to make more informed decisions.
From antiquity, humans have used conversations to share ideas, enhance cooperation and develop societies. In the future, machines may transform and enhance human-to-human conversations, making them more seamless, creative and productive.