
Image: how we imagine talking to computers (Luke Skywalker and C-3PO) versus how we actually talk to computers (a man speaking to a smartphone: Siri, Google Voice, Alexa and Cortana)


There is a big gap between our expectations of Artificial Intelligence and the reality. We can put the blame on natural language, which is more difficult to master than we tend to believe.

However, the computer still represents a docile and powerful assistant and we want it to be smarter. Many people are pushing research forward through worldwide challenges and competitions, such as the Allen AI Science Challenge (which aims to prove that an AI can be smarter than an 8th grader).

One of the most famous examples of a realistic artificial intelligence came from the director Stanley Kubrick in his 1968 movie 2001: A Space Odyssey.

How far are we from a fictional assistant similar to HAL-9000, capable of helping David Bowman in his mission?

Let’s compare HAL-9000 with Siri (from the dialogue with Stephen Colbert in 2011 on The Colbert Report, an American TV show).

David Bowman: Open the pod bay doors, HAL.
HAL: I’m sorry, Dave, I’m afraid I can’t do that.
David: What are you talking about, HAL?
HAL: I know that you and Frank were planning to disconnect me, and I’m afraid that’s something I cannot allow to happen.

Versus

Colbert: … I don’t want to search for anything! I want to write the show!
Siri: Searching the Web for “search for anything. I want to write the shuffle.”
Colbert: … For the love of God, the cameras are on, give me something?
Siri: What kind of place are you looking for? Camera stores or churches?

HAL-9000 versus Siri (The Colbert Report)

HAL-9000 gives the feeling that it fully grasps the situation in its complexity, whereas Siri doesn’t even understand a simple request… 🙂

In Computer Science, the process of “understanding” the meaning behind a sentence is called Natural Language Understanding (NLU).


Ex: Should I take an umbrella today? => Will it rain today?

It is different from Natural Language Processing (NLP), used here in the narrow sense of determining the grammatical role of every word in a sentence and the relations between them.

Ex: The/DT man/NN who/WP gave/VBD Bill/NNP the/DT money/NN
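The word/TAG notation in the example above can be read mechanically. The following is a minimal sketch that splits such a string into (word, tag) pairs. It assumes the tags follow the Penn Treebank tag set, which the abbreviations suggest but the example does not state; `parse_tagged` and `TAG_GLOSSES` are illustrative names, not part of any library.

```python
# Short glosses for the tags in the example.
# Assumption: the tags come from the Penn Treebank tag set.
TAG_GLOSSES = {
    "DT": "determiner",
    "NN": "noun, singular or mass",
    "WP": "wh-pronoun",
    "VBD": "verb, past tense",
    "NNP": "proper noun, singular",
}


def parse_tagged(sentence):
    """Split whitespace-separated 'word/TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in sentence.split():
        word, tag = token.rsplit("/", 1)  # split on the last slash only
        pairs.append((word, tag))
    return pairs


tagged = "The/DT man/NN who/WP gave/VBD Bill/NNP the/DT money/NN"
for word, tag in parse_tagged(tagged):
    print(f"{word:6} {tag:4} {TAG_GLOSSES.get(tag, 'unknown tag')}")
```

Note that this only reads tags that are already there. Producing them from raw text is the actual NLP task, and it is the part that requires grammar rules or a trained model.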

Let’s start with a brief history of NLU, then look at the main problems in this field of Artificial Intelligence.



A brief history of Natural Language Understanding


  •      1950s: Beginning of NLU.

Turing addressed the problem of artificial intelligence, and proposed an experiment which became known as the Turing test, an attempt to define a standard for a machine to be called “intelligent”.

In the early days, developers evaluated a user’s input with a few pattern-matching rules.

Example: if “Hello <-VARIABLE->” then greetings.
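A rule of this kind can be sketched with a regular expression. This is only an illustration of the idea, not a reconstruction of any historical system; the names `GREETING_RULE` and `evaluate`, and the choice to treat the variable as a single word, are assumptions made for the example.

```python
import re

# "Hello <VARIABLE>": the word "hello" followed by one captured word.
GREETING_RULE = re.compile(r"^hello\s+(?P<variable>\w+)", re.IGNORECASE)


def evaluate(user_input):
    """Return (label, captured variable) for a user's input."""
    match = GREETING_RULE.match(user_input.strip())
    if match:
        return ("greetings", match.group("variable"))
    return ("unknown", None)


print(evaluate("Hello Dave"))
print(evaluate("Open the pod bay doors, HAL."))
```

The rule only fires on the exact surface form it was written for. "Good morning Dave" or "Hi Dave" would need rules of their own, which is why this approach does not scale to real conversations.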


  •      1970-80s: Linguists started to “code”.

Linguists started to contribute to NLU by “coding” grammar and semantic rules. That produced convincing software such as:


Human: What does the box contain?
Computer: The blue pyramid and the blue block.
Human: What is the pyramid supported by?
Computer: The box.
Human: How many blocks are not in the box?
Computer: Four of them.
Human: Is at least one of them narrower than the one which I told you to pick up?
Computer: Yes, the red cube.
Human: Is it supported?
Computer: Yes, by the table.

Winograd 1972


Q: Is there more than one country in each continent?
A: No.
Q: What are the countries from which a river flows into the Black_Sea?
A: romania.
Q: What is the total area of countries south of the Equator and not in Australasia?
A: 10239 sq miles.
Q: Which country bordering the Mediterranean borders a country that is bordered by a country whose population exceeds the population of India?
A: turkey.
Q: What countries border Denmark?
A: west_germany.

Pereira 1980


Both were linguistically rich and logic-driven.

One could object that these questions came from a sandbox of easy cases, but this was 35 years ago.

One of the biggest problems at the time was the grammatical interpretation of a sentence (NLP). The error rate was high.


  •      1990-2015: Statistical revolution in Natural Language Processing.

The statistical revolution in Natural Language Processing led to a decline in NLU research.

Most NLP models now rely on what is called “Machine Learning”: probabilistic models trained on data. As a rule, the more data you provide, the more accurate the model becomes.

Today, the results are pretty amazing: we can process a sentence with more than 98% accuracy.


What are the main problems of Natural Language Understanding?


First of all, we have to admit that NLU is a thankless field: we are very demanding when it comes to computers’ understanding and knowledge.

Do we really need a personal robot today that we can have a philosophical discussion with?

Or do we just need to automate daily tasks, like creating a shopping list?

Technically, there are two main problems:


There are multiple ways to express the same idea.

Example: When you want to make an appointment with your doctor, you may say:

  •   I need to make an appointment.
  •   I need to see the doctor.
  •   When is the doctor free?
  •   I need to renew my prescription.
  •   Do you think the doctor could squeeze me in today?
  •   I need to make an appointment for my husband.
  •   My child needs to come in for a check-up.
  •   The doctor wants to see me again in two weeks’ time.

There are many ways to ask for an appointment.

In order to understand the whole sentence, we have to link together a lot of concepts by creating associations between words. (prescription <=> doctor <=> cold <=> check-up)
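The idea of associating words with a shared concept can be sketched as a simple keyword lookup. This is a deliberately naive illustration, assuming a hand-written set of associated words; `APPOINTMENT_CONCEPTS`, `tokenize` and `mentions_appointment` are hypothetical names, and "cold" from the association chain is left out because on its own it would also match sentences about the weather.

```python
import re

# Hand-picked words associated with "seeing the doctor" (an assumption
# for this sketch, based on the sentences and associations listed above).
APPOINTMENT_CONCEPTS = {"appointment", "doctor", "prescription", "check-up"}


def tokenize(sentence):
    """Lowercase the sentence and return its words (hyphenated words kept whole)."""
    return set(re.findall(r"[a-z]+(?:-[a-z]+)*", sentence.lower()))


def mentions_appointment(sentence):
    """True if the sentence shares at least one word with the concept set."""
    return bool(tokenize(sentence) & APPOINTMENT_CONCEPTS)


sentences = [
    "I need to make an appointment.",
    "When is the doctor free?",
    "I need to renew my prescription.",
    "My child needs to come in for a check-up.",
    "Should I take an umbrella today?",
]
for sentence in sentences:
    print(mentions_appointment(sentence), sentence)
```

A lookup like this recognizes the eight sentences in the list, but it breaks as soon as the user says something like "I am not feeling well", which contains none of the listed words. Every new phrasing requires a new association.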

All these words lead us to the second main problem.


Words and sentences depend on the context.

But first of all, we need to define what context is: we can say that it is something that helps us understand something else, be it a text, a joke, an event…

In other words, context is the set of circumstances in which something happens.
It can be a story shared by two people in a group of ten (a private joke), which gives it a specific meaning for the two of them, different from the one understood by the rest of the group.

It can also depend on a situation. Let’s take an example: if you read somewhere “… and bacon”, what is the meaning of these two words?

Let’s begin with the first word, “and”: it marks the end of a list. The second word, “bacon”, is a meat product.

Does it imply ordering something? Does it imply listing all pork recipes? Does it imply completing a shopping list?

We cannot guess the point of such a sentence without context. Yet this is exactly what we expect a computer to do.
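The “… and bacon” example can be made concrete with a small sketch in which the same fragment maps to different actions depending on a context supplied by the caller. The context labels and the `interpret` function are assumptions made for illustration; they mirror the three questions asked above.

```python
# One action template per context (hypothetical labels for this sketch).
ACTIONS = {
    "ordering": "add '{}' to the order",
    "recipes": "list recipes containing '{}'",
    "shopping_list": "add '{}' to the shopping list",
}


def interpret(fragment, context=None):
    """Turn a fragment such as '… and bacon' into an action, if context allows."""
    item = fragment.replace("…", "").replace("...", "").strip()
    if item.lower().startswith("and "):
        item = item[4:].strip()  # drop the leading "and"
    if context not in ACTIONS:
        return "ambiguous: need more context for '{}'".format(item)
    return ACTIONS[context].format(item)


print(interpret("… and bacon"))
print(interpret("… and bacon", context="shopping_list"))
print(interpret("… and bacon", context="recipes"))
```

The fragment itself never changes. Only the context argument decides which action is meant, and without it the function can do no better than report the ambiguity.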

Actually, I think we are not approaching the problem from the right angle.

Even a human cannot understand the meaning behind random words without context. The only one who can provide enough data while talking to the computer is the user, so understanding cannot be based on a “probabilistic model” alone.

We have to find a way to help developers add more intelligence to their software, and to do so, everybody has to contribute to Artificial Intelligence.
Together, we will crack Natural Language Understanding and build a better Artificial Intelligence!


Gaetan JUVIN – Recast.AI

