When you work on conversational bots, you get to explore how machines can best interact with people. Bots are meant to be a new way to communicate with machines. We can think of human-machine interaction as an input-output system: the keyboard and webcam are like the human senses, and the screen is like the human face. When two entities send each other messages, they build a conversation; they communicate. For that communication to be effective, we need Dialog Management.

Human-machine dialog

Imagine a world where machines are everywhere and interacting with them is routine. What would be the most efficient way to communicate with them? Machines talk to each other in their own way: they use binary protocols at the speed of light, in their own language. They can share hundreds of gigabits of information per second. In this kind of environment, wouldn’t it be easier to learn to speak the machines’ language? Humans, while versatile and adaptable, are used to talking to each other in their own languages. Human languages are adapted to our societies and our environment, and they are most probably the best languages for our brains. Could we use our own languages to interact with machines? In a world of two factions, humans and machines, one side has to learn the other side’s language in order to grow. Learning a new language is time-consuming and human time is precious, so let the machines learn our languages.

Let’s say machines could understand human sentences and convey information in our languages. Would language processing be enough to meaningfully communicate with us?

Example 1 :

John : Hello, I’d like a pizza.
Robot : I ordered you a pizza, it will be delivered in one week in Shanghai Shi, Xuhui Qu, China.

Example 2 :

John : Hello.
Robot : Hi.
John : I am with four friends at Grand Boulevard, how do we get to La Defense?
Robot : Go west and walk for two hours.

The Robot in the examples achieved what it was made for, but not in a useful way. We would like to have a pizza at home, maybe in 15 minutes. We would like to go to La Defense as soon as possible. Here the Robot is lacking information about John to fulfil his request, and John is lacking information about the robot to phrase his request better.

For the Robot to be useful to John, it has to discover what John is looking for. A better scenario would be :

Example 1 :

John : Hello, I’d like a pizza.
Robot : Hello, are you in Paris? May I ask you where you want the pizza to be delivered?
John : At 45 bis rue du Pont.
Robot : At what time would you like to eat please?
John : At 8 o’clock.
Robot : What pizza would you like? Today’s specials are Four Cheese Premium, Veggie Avocado and Norwegian in your area.
John : I’d like a pizza with beef.
Robot : Here are three pizzas with beef : BestBeef Pizza from Pizza Minute 9€, BeefForever from Sergio 12€ and Carnivore from Pizze Chef.
John : I would like the pizza from Pizze Chef please.
Robot : Let me summarize your order : one Carnivore from Pizze Chef, at 45 bis rue du Pont for 8 o’clock. Do you want to add any delivery instructions? If not, we will call you at this number.
John : Ok call me on this number.
Robot : Your order has been confirmed by Pizze Chef.

Example 2 :

John : How do I get to La Defense?
Robot : Could you tell me your location?
John : We are at Grand Boulevard.
Robot : If you want to go now, you can use the subway : Ligne A for about 2€, it will take around 30 minutes.
John : Is there a faster way?
Robot : I advise you to look at the Uber service, I can help you to book one.

The robot is trying to understand what the user wants. Ideally, the robot would read our minds, which would save a lot of time. Time is a scarce resource: if there are two paths to the same result, the quickest path is preferable. How does this principle apply to conversation? Let’s replay the pizza order example. The result is the same, John orders a pizza he likes, but the conversation follows a different path :

John : Hello, I’d like a pizza.
Robot : Hello, are you in Paris?
John : Yes.
Robot : May I ask you where you want the pizza to be delivered?
John : At 45 bis rue du Pont.
Robot : At what time would you like to eat please?
John : At 8 o’clock.
Robot : What pizza would you like?
John : What are today’s pizzas?
Robot : Today’s chef’s pizzas in your area are Four Cheese Premium, Veggie Avocado and Norwegian.
John : I’d like a pizza with beef.
Robot : Here is one pizza with beef in your area : BestBeef Pizza from Pizza Minute.
John : How much is it?
Robot : 9€.
John : Are there other beef pizzas?
Robot : BeefForever from Sergio.
John : Is there a beef pizza from Pizze Chef?
Robot : Yes, the Carnivore from Pizze Chef.
John : I would like the pizza from Pizze Chef please.
Robot : Let me summarize your order : one Carnivore from Pizze Chef, at 45 bis rue du Pont for 8 o’clock. Do you want to add any delivery instructions?
John : Should I give you my phone number or the door code?
Robot : You can give me the door code. If not, we will contact you on this phone number.
John : Ok call me on this number.
Robot : Your order has been confirmed by Pizze Chef.

In this example, there are many more exchanges for the same result. Some users may prefer it, while others might find the discussion too long. External factors influence the type of conversation you have with an interlocutor : Is it the first time you’ve spoken to that interlocutor? Is your environment noisy? Are there any distractions? Is the conversation by text or over the phone? Are you in a crowded place? Do you trust your interlocutor? Did you just move to a new city?

Homer Simpson and Bender have common interests

A new field of AI is born (again)

The situation described in the previous section is formalized in dialog management theory, and our bots are dialog systems. Dialog Management is responsible for describing the flow of a conversation, and it is a growing research field in computer science. One can think of a dialog manager as a finite state machine, where the state transitions are based on the information acquired so far and the messages from the interlocutors.
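To make the finite state machine view concrete, here is a minimal Python sketch of the pizza order from the earlier example. The state names, prompts and the `PizzaDialog` class are assumptions made for illustration, not part of any real bot framework, and a real bot would extract the address, time and pizza with language processing rather than store the raw message.

```python
# Hypothetical transition table: current state -> (next state, prompt to send).
# Each state is named after the piece of information the bot is waiting for.
TRANSITIONS = {
    "start": ("address", "Where should the pizza be delivered?"),
    "address": ("time", "At what time would you like to eat?"),
    "time": ("pizza", "What pizza would you like?"),
    "pizza": ("done", None),
}


class PizzaDialog:
    """Toy dialog manager: one state per missing piece of information."""

    def __init__(self):
        self.state = "start"
        self.slots = {}  # information acquired so far

    def step(self, message):
        """Consume one user message and return the bot's reply."""
        if self.state == "done":
            return "Your order is already confirmed."
        if self.state != "start":
            # Assumption: the whole message is the answer to the last prompt.
            self.slots[self.state] = message.strip()
        next_state, prompt = TRANSITIONS[self.state]
        self.state = next_state
        if next_state == "done":
            return "Order: {pizza}, at {address}, for {time}.".format(**self.slots)
        return prompt


bot = PizzaDialog()
for message in ["Hello, I'd like a pizza.", "45 bis rue du Pont", "8 o'clock", "Carnivore"]:
    print("John  :", message)
    print("Robot :", bot.step(message))
```

Each user message moves the machine one state forward, and the acquired information accumulates in `slots`. The shorter and longer pizza conversations above would correspond to different transition tables reaching the same final state.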

The term Dialog Management isn’t new, though. The 1998 paper “Combining Expression and Content in Domains for Dialog Managers” already tried to introduce a generalization of dialog managers in order to build dialog manager development tools. More recently, a paper by A. Hunter discusses the process of debating, arguing and persuasion using propositional executable logic. Other works propose Machine Learning algorithms in order to build more complex state machines.

In dialog systems, bots are described with different roles, such as Information Providers, Advisors, Tutors and Conversational Partners, and can be used for decision-making, multi-party interaction, persuasion and conflict resolution. Likewise, parts of a dialog can be abstracted :

Question then Answer
Propose then Accept/Reject/Challenge
Offer then Accept/Decline
Compliment then Refusal/Thanks
Greeting then Greeting
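The pairs above can be written down as a small lookup table. The following Python sketch assumes each message has already been labeled with a dialog act; the labels and the `is_expected` helper are hypothetical names chosen for illustration.

```python
# Hypothetical table: first act of a pair -> acts that may follow it.
EXPECTED_RESPONSES = {
    "question": {"answer"},
    "propose": {"accept", "reject", "challenge"},
    "offer": {"accept", "decline"},
    "compliment": {"refusal", "thanks"},
    "greeting": {"greeting"},
}


def is_expected(first_act, reply_act):
    """Return True if reply_act is one of the abstracted follow-ups to first_act."""
    return reply_act in EXPECTED_RESPONSES.get(first_act, set())


print(is_expected("offer", "decline"))    # an offer may be declined
print(is_expected("question", "thanks"))  # thanks is not listed as a follow-up to a question
```

A bot could use such a table to notice when a reply does not fit the pair it opened, for instance when the user changes topic instead of answering a question.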

Furthermore, global structures can emerge from the dialogs, for example Opening with greetings, Body with topics then Closing with farewells. Finally, understanding Topic Transitions, the ability to switch contexts and subjects, is a crucial part in developing more friendly conversations.

An old real world example : Games

If you have played a video game, you may have chosen to engage with Non-Player Characters, or NPCs. They are so called because they look like human players, but they behave in a scripted manner. In essence, they are simple bots created for one specific task. Narrative video games, which are text-based video games, offer many decision paths to the player. A good narrative game should feel coherent and unpredictable, and the illusion of a virtual world unfolding in front of the player should be total. Often, the path taken by the player is guided by the Game Designer, who uses tricks to influence the player’s choices and control the path taken while maintaining the illusion of freedom of choice.

To represent the path a user can take, the dialog is compiled into a Dialog Tree. Dialog Trees are compact representations of the message sequence inside a dialog. Image from Roblox.
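As a rough illustration of a Dialog Tree, the Python sketch below stores one NPC line per node and maps each player reply to a child node. The `DialogNode` class, the helper functions and the shopkeeper script are all invented for this example and are not taken from any game engine.

```python
from dataclasses import dataclass, field


@dataclass
class DialogNode:
    line: str                                    # what the NPC says at this node
    choices: dict = field(default_factory=dict)  # player reply -> next DialogNode


def walk(root, replies):
    """Follow a sequence of player replies from the root; return the NPC lines seen."""
    node, lines = root, [root.line]
    for reply in replies:
        node = node.choices[reply]  # raises KeyError if the reply is not scripted
        lines.append(node.line)
    return lines


def count_paths(node):
    """Count the distinct root-to-leaf conversations the tree allows."""
    if not node.choices:
        return 1
    return sum(count_paths(child) for child in node.choices.values())


# Hypothetical shopkeeper script.
tree = DialogNode("Welcome, traveler. Need anything?", {
    "Buy": DialogNode("Sword or shield?", {
        "Sword": DialogNode("That will be 10 gold."),
        "Shield": DialogNode("That will be 8 gold."),
    }),
    "Leave": DialogNode("Safe travels."),
})

print(walk(tree, ["Buy", "Sword"]))
print(count_paths(tree))
```

Because every reply must be a key in `choices`, the designer controls exactly which paths exist. An unscripted reply raises a `KeyError` here, which is the kind of case a bot in the real world has to handle gracefully.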

When you build a bot for the real world, you face new challenges : a) the world and people are complex, so it may be difficult to predict all the paths they may take, and b) you cannot save checkpoints to go back in time.

 
