Natural language processing: intelligent agents

Interview with Intel’s Pilar Manchon

Natural language processing in action

Where are we with NLP? Simple speech-based systems that understand natural language are already widely used.

The AI can answer questions about things like flight times, give directions, tell you where restaurants are, and perform simple financial transactions. Such systems do not need to understand a sentence in full, but they do need to recognize keywords that indicate user intent (e.g. “Make a reservation…”) and task parameters (“… for 6 p.m. at the French bistro.”).
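
As a rough illustration of this kind of keyword spotting, the sketch below pulls an intent and two task parameters out of the reservation example with plain regular expressions. The intent name, slot names, and patterns are made up for this one example, not taken from any particular assistant.

```python
import re

# Minimal keyword-spotting sketch. The intent keyword and parameter patterns below
# are invented for this single example; a real assistant would have many more.
INTENT_PATTERNS = {
    "make_reservation": re.compile(r"\bmake a reservation\b", re.IGNORECASE),
}
TIME_PATTERN = re.compile(r"\b(\d{1,2}(?::\d{2})?\s*(?:a\.?m\.?|p\.?m\.?))", re.IGNORECASE)
PLACE_PATTERN = re.compile(r"\bat (the [\w ]+?)(?:\.|$)", re.IGNORECASE)

def parse(utterance: str) -> dict:
    """Return the detected intent and any task parameters found in the utterance."""
    intent = next((name for name, pattern in INTENT_PATTERNS.items()
                   if pattern.search(utterance)), None)
    time = TIME_PATTERN.search(utterance)
    place = PLACE_PATTERN.search(utterance)
    return {
        "intent": intent,
        "time": time.group(1) if time else None,
        "place": place.group(1) if place else None,
    }

print(parse("Make a reservation for 6 p.m. at the French bistro."))
# {'intent': 'make_reservation', 'time': '6 p.m.', 'place': 'the French bistro'}
```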

More advanced systems can summarize news articles and recognize complex linguistic structures. Such systems must have at least a rough understanding of the text in order to compress articles without losing key meaning.

How does NLP work? NLP uses two main techniques: symbolic and statistical. The symbolic approach is based on a set of pre-programmed rules covering grammar, syntax, and so on. The statistical approach uses machine learning algorithms.
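
As a minimal sketch of the contrast, the snippet below encodes one intent as a hand-written rule and then trains a small classifier to learn the same distinction from examples. The training utterances, intent labels, and the choice of scikit-learn are illustrative assumptions, not anything prescribed by the article.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Symbolic approach: a hand-written rule that covers one narrow case explicitly.
def rule_based_intent(utterance: str) -> str:
    if utterance.lower().startswith("make a reservation"):
        return "reservation"
    return "unknown"

# Statistical approach: a classifier learns the same distinction from labelled examples.
train_texts = [
    "make a reservation for two",
    "book a table tonight",
    "what time does the flight leave",
    "when does my plane depart",
]
train_labels = ["reservation", "reservation", "flight_info", "flight_info"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

print(rule_based_intent("Make a reservation for 6 p.m."))  # reservation (matched the rule)
print(model.predict(["book a table for 6 p.m."])[0])       # reservation (learned, not hard-coded)
```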

Main challenges: context and ambiguity

Context

Example: “clear” can be a verb or an adjective. A machine can determine which form a word takes in a sentence through part-of-speech (PoS) tagging.
Sentence: James has cleared the way
Rule: a word is a verb if the preceding tag is a pronoun (in this case “James”)

Therefore, the machine knows that “cleared” is a verb in the example sentence and can go on to determine that “way” is a noun.
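
For illustration, a widely used off-the-shelf tagger (NLTK's, here; the article does not name a specific tool) assigns PoS tags along these lines. Exact resource names and tags can vary with the NLTK version installed.

```python
import nltk

# Download tokenizer and tagger models (names may differ slightly across NLTK versions).
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

tokens = nltk.word_tokenize("James has cleared the way")
print(nltk.pos_tag(tokens))
# Expected output along the lines of:
# [('James', 'NNP'), ('has', 'VBZ'), ('cleared', 'VBN'), ('the', 'DT'), ('way', 'NN')]
```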

Ambiguity

Word sense disambiguation (WSD) is used in cases of polysemy (one word has multiple meanings) and synonymy (different words have similar meanings).

Example of a polysemous word: “fix”
He fixed dinner yesterday (prepared)
He fixed the car yesterday (repaired)

In this case, PoS tagging and syntactic analysis give the same result for both sentences, so a deeper approach is needed to identify the exact meaning based on real-world understanding. In the example above, that means knowing that you can’t “repair” a dinner. For WSD, WordNet is the go-to resource as the most comprehensive lexical database for the English language.
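
One simple, widely available baseline for WSD over WordNet is the Lesk algorithm shipped with NLTK. The article does not name an algorithm, so this is only an illustrative choice; Lesk picks the WordNet sense whose dictionary gloss overlaps most with the surrounding words and can choose a different sense than a human would.

```python
import nltk
from nltk.corpus import wordnet as wn
from nltk.wsd import lesk

nltk.download("wordnet", quiet=True)  # WordNet data for NLTK

# Disambiguate "fixed" in the two example sentences using gloss overlap (Lesk).
for sentence in ("He fixed dinner yesterday", "He fixed the car yesterday"):
    sense = lesk(sentence.split(), "fixed", pos=wn.VERB)
    definition = sense.definition() if sense else "no sense found"
    print(f"{sentence} -> {sense}: {definition}")
```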

Listening is not the same as hearing

Voice interaction will become increasingly necessary as we create more and more keyboard-less devices such as wearables, robots, AR/VR displays, self-driving cars, and Internet of Things (IoT) devices. This will require something more robust than the scripted pseudo-intelligence that digital assistants offer today. We will need digital assistants that speak, listen, explain, adapt, and understand context: intelligent agents.

Not so long ago, voice recognition was so bad that we were surprised when it worked; now it’s so good that we’re surprised when it doesn’t. Over the past five years, speech recognition has improved at an annual rate of 15-20% and is approaching the accuracy with which humans recognize speech. There are three main drivers at work here.

First, teaching a computer to understand speech requires sample data, and the amount of available sample data has increased a hundredfold, with data extracted from search engines increasingly serving as the source.

Second, new algorithms called deep neural networks have been developed that are particularly well suited to recognizing patterns in ways that mimic the human brain.

Finally, recognition technologies have moved from a single device to the cloud, where large sets of data can be kept, and where computing cores and memory are nearly infinite. And although sending speech over a network can delay the response, latencies in mobile networks are decreasing.

The results? My kids are increasingly talking to their smartphones, using digital assistants to ask for directions, ask for information, find a TV show to watch, and message friends.

Talking doesn’t make you smart

But to make interaction truly natural, machines must also make sense of speech. Today’s digital assistants seem incredibly smart, but they actually use a superficial form of understanding called intents and mentions, which detects the task the user is trying to accomplish (the intent) and the properties of that task (the mentions).

Basically, the system recognizes a command phrase (usually a verb) that identifies a task area such as “call”, “set alarm for”, or “find”. If it can’t find all the necessary information in the user’s utterance, it asks for more details in a scripted dialog.
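
A sketch of that intents-and-mentions pattern is shown below. The task names, required mentions, and wording of the scripted follow-up question are made up for illustration.

```python
# Detect a command phrase, collect the task's required properties (mentions),
# and fall back to a scripted follow-up question when something is missing.
REQUIRED_MENTIONS = {
    "set alarm for": ["time"],
    "call": ["contact"],
    "find": ["query"],
}

def handle(utterance: str, mentions: dict) -> str:
    """Route an utterance to a task and ask a scripted question for any missing mention."""
    intent = next((cmd for cmd in REQUIRED_MENTIONS if cmd in utterance.lower()), None)
    if intent is None:
        return "Sorry, I can only search the web for that."       # fall back to web search
    missing = [slot for slot in REQUIRED_MENTIONS[intent] if slot not in mentions]
    if missing:
        return f"What {missing[0]} should I use for '{intent}'?"  # scripted follow-up
    return f"OK, doing '{intent}' with {mentions}."

print(handle("Set alarm for tomorrow", {}))        # asks for the missing time
print(handle("Call mom", {"contact": "mom"}))      # has everything it needs
```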

These assistants take orders well, but they’re far from a personal concierge who intuitively understands your desires and can even suggest things you wouldn’t think to ask. Today’s helpers cannot break out of the script when recognition fails. Often they cannot explain their own suggestions. They cannot anticipate problems and offer alternatives. They rarely take the initiative.

You have to explain everything to a digital assistant, and even then you might not get what you want. Soon we will stop being amazed by their imitation of intelligence and start demanding real intelligence.

What would that look like? What would a truly intelligent, conversational, and capable agent look like in the coming IoT revolution, as we move toward a stage of augmented innovation in which wearable devices, autonomous vehicles, robots, and on-board devices abound?

Intelligent agents can talk, listen and hear

Among enthusiasts, an intelligent agent is an artificial intelligence (AI) capable of making decisions based on past experience. Among consumers, an intelligent agent would need a few additional qualities.

Conversational: Language understanding should be less superficial than what we have today. Today’s assistants can easily miss an intent or become confused and fall back on a simple web search. That’s because the system doesn’t really understand what you’re saying: if it doesn’t recognize the type of task it’s being asked to do, it has no predefined script with which to ask for more details. A human being would be able to repair the specific misunderstanding by saying, “I’m sorry. What kind of restaurant were you looking for?”

Explanatory: With a deeper language model, a conversational system can explain why it is recommending a particular action or why it thinks something is true, just as a human can, unlike today’s “black box” recommender systems. For example, if I ask my TV for a legal drama and the system recommends the Marvel show Daredevil, I may need an explanation at first, because I may not know that the main character is a lawyer by day when he’s not cleaning up the streets with his fists at night.

Resourceful: Human assistants are resourceful. When we detect a problem, we can work around it and offer alternatives. A deeply intelligent agent should proactively notice, for example, that the restaurant where I had planned to have lunch with a colleague is closed that day for a religious holiday (this just happened to me).

Attentive: Intelligent agents must be constantly attentive. If one of my kids tells me we just ran out of milk, the agent should notice and add milk to my online cart without my having to ask.

Sociable: Intelligent agents need to be aware of my engagement with other people in my environment and know when and where not to interrupt.

Context-aware: Social intelligence is actually a subset (though important enough to be called out separately) of the broader category of contextual intelligence, which requires understanding the situation a person is in and proactively offering the services they have used in similar situations. Toward the end of dinner at a restaurant, for example, the intelligent agent should offer to call a taxi.

Engaging: Perhaps most importantly, I want my intelligent agent to engage with me and show that it understands the importance of my requests. In human conversation, a tone of urgency is met with responsiveness, humor with playfulness, and concern with suggestions. I’m not looking for a mechanical personality to replace human companionship, just a genuine conversationalist that offers a level of engagement showing it understands, and will act on, my urgency, cheerfulness, or resolve.

We will soon need intelligent agents

Deep intelligence will be even more important in tomorrow’s environments than in today’s smartphones, as robots, self-driving cars, and smart homes will need to converse, explain, reschedule, and engage in ways appropriate to the user and the situation. The same deep learning technologies that made speech recognition amazingly accurate can achieve this.

With so much potential at your fingertips, what else would you want to see from intelligent agents?

James G. Williams