Tell, don’t show: how to teach AI

Should we teach good behaviour to Artificial Intelligence (AI) through our feedback, or should we try to tell it a set of rules explaining what good behaviour is? Both approaches have advantages and limitations, but when we tested them in a complex scenario, one of them emerged as the winner.

If AI is the future, how will we tell it what we want it to do?

Artificial intelligence is capable of crunching through enormous datasets and providing us assistance in many facets of our lives. Indeed, it seems this is our future. An AI assistant may help you decide what gifts to buy for a friend, or what books to read, or who to meet, or what to do on the weekend. In the worst case, of course, this could be dystopian – AI controls us, and not the other way around, we’ve all heard that story – but in the best case, it could be incredibly stimulating, deeply satisfying, and profoundly liberating.

But an important and unsolved problem is that of specifying our intent, our goals, and our desires, for the AI system. Assuming we know what we want from the AI system (this is not always the case, as we’ll see later), how do we teach the system? How do we help the system learn what gifts might be good for a friend, what books we might like to read, the people we might like to meet, and the weekend activities we care about?

There are many parts to this problem, and many solutions. The solution ultimately depends on the context in which we’re teaching the AI, and the task we’re recruiting it to do for us. So in order to study this, we need a concrete problem. Luckily for me, Ruixue Liu decided to join us at Microsoft for an internship in which she explored a uniquely interesting version of it: how to teach an AI system to give us information about a meeting when, for whatever reason, we can’t see the meeting room.

Our problem: eyes-free meeting participation

When people enter a meeting room, they can typically pick up several cues: Who is in the meeting? Where in the room are they? Are they seated or standing? Who is speaking? What are they doing? Research shows that not having this information can be very detrimental to meeting participation.

Unfortunately, in many modern meeting scenarios, this is exactly the situation we find ourselves in. People often join online meetings remotely without access to video, due to device limitations, poor Internet connections, or because they are engaged in parallel “eyes-busy” tasks such as driving, cooking, or going to the gym. People who are blind or low vision also describe this lack of information as a major hurdle in meetings, whether in-person or online.

We think an AI system could use cameras in meeting rooms to present this information to people who, for whatever reason, cannot see the meeting room. This information could be relayed via computer-generated speech, or special sound signals, or even through haptics. Given that the participant only has a few moments to understand this information as they join a meeting, it’s important that only the most useful information is given to the user. Does the user want to know about people’s locations? Their pose? Their clothes? What information would be useful and helpful for meeting participation?

However, what counts as ‘most useful’ varies from user to user, and context to context. One goal of the AI system is to learn this, but it can’t do so without help from the user. Here is the problem: should the user tell the system what information is most useful, by specifying a set of rules about what information they want in each scenario, or should the user give feedback to the system, saying whether or not it did a good job over the course of many meetings, with the aim of teaching it correct behaviour in the long term?

Our study, in which we made people attend over 100 meetings

Don’t worry – luckily for the sanity of our participants, these weren’t real meetings. We created a meeting simulator which could randomly generate meeting scenarios. Each simulated meeting had a set of people – we generated names, locations (within the room), poses, whether they were speaking or not, and several other pieces of information. Because we were testing eyes-free meeting participation, we didn’t visualise this information – the objective was for the user to train the system to present a useful summary of this information in audio form.
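To give a flavour of what the simulator produced, here is a minimal sketch of how such a meeting generator might look. The attribute names (name, location, pose, is_speaking) and the example values are illustrative assumptions on my part, not the exact schema we used in the study.

```python
import random

# Illustrative pools of values; the study's simulator used its own vocabulary.
NAMES = ["Alice", "Bob", "Carol", "Dev", "Elena", "Femi"]
LOCATIONS = ["near the door", "by the window", "at the table", "at the whiteboard"]
POSES = ["seated", "standing"]

def generate_meeting(max_people=6):
    """Randomly generate one simulated meeting: a list of attendees,
    each with a name, location in the room, pose, and speaking status."""
    attendees = []
    for name in random.sample(NAMES, random.randint(1, max_people)):
        attendees.append({
            "name": name,
            "location": random.choice(LOCATIONS),
            "pose": random.choice(POSES),
            "is_speaking": random.random() < 0.3,
        })
    return attendees

if __name__ == "__main__":
    print(generate_meeting())
```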

We conducted a study in which 15 participants used two approaches to ‘train’ the system to relay the information they wanted. One approach was a rule-based programming system, where the participant could specify “if this, then that”-style rules. For example, “if the number of people in the meeting is less than 5, then tell me the names of the people in the meeting”.
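As a rough sketch (not the actual rule language from the study), a rule like the one above could be represented as a small function over a simulated meeting, with the summary built by applying each of the user’s rules in turn. The attendee fields follow the illustrative schema from the simulator sketch above.

```python
def few_people_rule(meeting):
    """If fewer than 5 people are in the meeting, tell me their names."""
    if len(meeting) < 5:
        return "In the meeting: " + ", ".join(person["name"] for person in meeting)
    return None  # the rule's condition does not hold, so it contributes nothing

def summarise_with_rules(meeting, rules):
    """Apply each rule and join whatever information the rules produce."""
    parts = []
    for rule in rules:
        result = rule(meeting)
        if result is not None:
            parts.append(result)
    return " ".join(parts) if parts else "You have joined the meeting."

# Example usage, assuming generate_meeting from the sketch above:
# print(summarise_with_rules(generate_meeting(), [few_people_rule]))
```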

The other approach was a feedback-based training system (our technical approach was to use a kind of machine learning called deep reinforcement learning). In the feedback-based training system, the user couldn’t say what they wanted directly, but instead, as they went to various (simulated) meetings, the system would do its best to summarise the information. After each summary, the user provided simple positive/negative feedback, answering “yes” or “no” to the question of whether they were satisfied with the summary.
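The actual system used deep reinforcement learning; purely to show the shape of the interaction loop (summarise, ask for yes/no feedback, update), here is a toy stand-in that learns a score for each type of information from binary feedback. Everything in it, from the information types to the update rule, is a simplifying assumption rather than our implementation.

```python
import random

INFO_TYPES = ["names", "locations", "poses", "who_is_speaking"]  # illustrative

class FeedbackLearner:
    """A toy bandit-style learner standing in for the deep RL agent."""

    def __init__(self, epsilon=0.2):
        self.scores = {info: 0.0 for info in INFO_TYPES}  # estimated usefulness
        self.counts = {info: 0 for info in INFO_TYPES}
        self.epsilon = epsilon  # exploration rate

    def choose_summary(self):
        """Pick which pieces of information to include in the next summary."""
        if random.random() < self.epsilon:
            # occasionally explore a random subset of information types
            return random.sample(INFO_TYPES, k=random.randint(1, len(INFO_TYPES)))
        # otherwise include everything currently believed to be useful
        return [info for info, score in self.scores.items() if score >= 0]

    def update(self, chosen, satisfied):
        """Nudge the score of each included item using the user's yes/no feedback."""
        reward = 1.0 if satisfied else -1.0
        for info in chosen:
            self.counts[info] += 1
            self.scores[info] += (reward - self.scores[info]) / self.counts[info]

# One simulated meeting's interaction might look like:
#   learner = FeedbackLearner()
#   chosen = learner.choose_summary()        # system presents a summary
#   learner.update(chosen, satisfied=True)   # user answers "yes" or "no"
```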

Each participant tried both systems, one after the other in randomised order. We let participants play around, test and tweak and teach the AI as much as they liked, and try out the system’s behaviour on as many simulated meetings as they liked. Many participants “attended” well over 100 meetings, with two participants choosing to attend nearly 160 meetings over the course of the experiment! Who knew meetings could be such fun!

We asked participants to fill out a few questionnaires about their experience of interacting with both systems, and we conducted follow-up interviews to talk about their experience, too.

Results

Participants reported significantly lower cognitive load and higher satisfaction when giving the system rules than when giving feedback. In other words, it was easier and more satisfying to tell the AI how to behave than to show it how to behave through feedback.

Rule-based programming gave participants a greater feeling of control and flexibility, but some participants found it hard to formulate rules from scratch at the beginning of the experiment. They also struggled to understand how different rules worked together, and whether conflicting rules had an order of precedence (they did not).

Feedback-based teaching was seen by participants as easier, but much more imprecise. There were instances where the system did something almost correct, but because the user could only say whether the behaviour was good or bad, they did not have the tools to give more nuanced feedback to the system. Moreover, people don’t just know their preferences, they figure them out over time. With feedback-based teaching, participants worried that they were ‘misleading’ the system with poor feedback at the early stages of training, while they were still figuring out what their preferences were.

Conclusion

Based on our results, we would recommend a rule-based programming interface. But as explained, we found several advantages and limitations to both approaches. In both cases, we found that the first step was for the human to figure out what they wanted from the system! This is hard if the user doesn’t have a clear idea of what the system can and can’t do; our first recommendation is for system designers to make this clear.

Our participants also had a hard time in both cases expressing their preferences exactly: with rules, it was because the rule-based programming language was complex, and with feedback-based teaching, it was because yes/no feedback isn’t precise enough. Our second recommendation is to make clear to users what actions they need to take to specify certain preferences.

Finally, it was difficult for participants to understand the system they had ultimately trained: it was hard to know which rules would apply in certain scenarios, and they also found the feedback-trained system to be unpredictable. Our third recommendation is to provide more information as to why the system does what it does in certain scenarios.

In the future, we should consider blending the two approaches, to get the best of both worlds. For example, the feedback-based system could be used to generate candidate rules, to help users form a better idea of their preferences, or detect hard-to-specify contexts. Rule-based systems could help define context, explain behaviour learnt by the system, and provide a way for specifying and editing information not captured by the feedback-trained system. We aren’t sure what this might look like, but we’re working on it. Until then, let’s aim to tell, and not show, what we want our AI to do.

Here’s a summary poem:

Yes, no, a little more
What do you want?
I can do this, this, and this
But that I can’t

Tell me and I’ll show you
What you can’t see
I’ll do my best to learn from
What you tell me

Want to learn more? Read our study here (click to download PDF), and see the publication details below:

Liu, Ruixue, Advait Sarkar, Erin Solovey, and Sebastian Tschiatschek. “Evaluating Rule-based Programming and Reinforcement Learning for Personalising an Intelligent System.” In IUI Workshops. 2019. http://ceur-ws.org/Vol-2327/#ExSS

People reluctant to use self-driving cars, survey shows

Autonomous vehicles are going to save us from traffic, emissions, and inefficient models of car ownership. But while songs of praise for self-driving cars are regularly sung in Silicon Valley, does the public really want them?

That’s what my student Charlie Hewitt and our collaborators Ioannis Politis and Theocharis Amanatidis set out to study. We decided to conduct a public opinion survey to find out.

However, we first had to solve two problems.

  1. When Charlie started his work, there were no existing surveys designed specifically around autonomous vehicles. There were surveys for technology acceptance in general, and some for cars, which were a good start. So we combined those, added material specific to autonomous vehicles, and produced a new survey designed for them. We called it the Autonomous Vehicle Acceptance Model, or AVAM for short.
  2. When people think of self-driving cars, they generally picture a futuristic pod with no steering wheel or controls that they just step into and get magically transported to their destination. However, the auto industry differentiates between six levels of autonomy. Previous studies had attempted to gauge people’s attitudes towards each of these levels, but it turns out people can’t picture these different levels of autonomy very well, and don’t understand how they differ. So, Charlie created short descriptions to explain the differences between them. These vignettes are a key part of the AVAM, because they help the general public understand the implications of different levels of autonomy.

Here are the six levels of autonomous vehicles as described in our survey:

  • Level 0: No Driving Automation. Your car requires you to fully control steering, acceleration/deceleration and gear changes at all times while driving. No autonomous functionality is present.
  • Level 1: Driver Assistance. Your car requires you to control steering and acceleration/deceleration on most roads. On large, multi-lane highways the vehicle is equipped with cruise-control which can maintain your desired speed, or match the speed of the vehicle to that of the vehicle in front, autonomously. You are required to maintain control of the steering at all times.
  • Level 2: Partial Driving Automation. Your car requires you to control steering and acceleration/deceleration on most roads. On large, multi-lane highways the vehicle is equipped with cruise-control which can maintain your desired speed, or match the speed of the vehicle to that of the vehicle in front, autonomously. The car can also follow the highway’s lane markings and change between lanes autonomously, but may require you to retake control with little or no warning in emergency situations.
  • Level 3: Conditional Driving Automation. Your car can drive partially autonomously on large, multi-lane highways. You must manually steer and accelerate/decelerate when on minor roads, but upon entering a highway the car can take control and steer, accelerate/decelerate and switch lanes as appropriate. The car is aware of potential emergency situations, but if it encounters a confusing situation which it cannot handle autonomously then you will be alerted and must retake control within a few seconds. Upon reaching the exit of the highway the car indicates that you must retake control of the steering and speed control.
  • Level 4: High Driving Automation. Your car can drive fully autonomously only on large, multi-lane highways. You must manually steer and accelerate/decelerate when on minor roads, but upon entering a highway the car can take full control and can steer, accelerate/decelerate and switch lanes as appropriate. The car does not rely on your input at all while on the highway. Upon reaching the exit of the highway the car indicates that you must retake control of the steering and speed control.
  • Level 5: Full Driving Automation. Your car is fully autonomous. You are able to get into the car and instruct it where you would like to travel to, the car then carries out your desired route with no further interaction required from you. There are no steering or speed controls as driving occurs without any interaction from you.

Before you read on, think about each of those levels. What do you think are the advantages and disadvantages of each? Which would you be comfortable with and why?

We sent our survey to 187 drivers recruited from across the USA, and here’s what we found:

Result 1: our respondents were not ready to accept autonomous vehicles.

We found that, on many measures, people reported lower acceptance of higher automation levels: they perceived higher autonomy levels as less safe, reported lower intent to use them, and reported higher anxiety about using them.

We compared some of the results with those from an earlier study, conducted in 2014. We had to make some simplifying assumptions, as the 2014 study wasn’t conducted with the AVAM. However, we still found that our results were mostly similar: both studies found that people (unsurprisingly) expected to have to do less as the level of autonomy increased. Both studies also found that people showed lower intent to use higher autonomy vehicles, and poorer general attitude towards higher autonomy. Self-driving cars seem to be suffering in public opinion!

Result 2: the biggest leap in user perception comes with full autonomy.

We asked people how much they would expect to have to use their hands, feet and eyes while using a vehicle at each level of autonomy. Even though vehicles at the intermediate levels of autonomy (3 and 4) can do significantly more than levels 1 and 2, people did not perceive the higher levels as requiring significantly less engagement. However, at level 5 (full autonomy), there was a dramatic drop in expected engagement. This was an interesting and new finding (albeit not entirely surprising). One explanation for this is that people only really perceive two levels of autonomy: partial and full, and don’t really care about the minor differences in experience with different levels of partial autonomy.

All in all, we were fascinated to learn about people’s attitudes to self-driving cars. Despite the enthusiasm displayed by the tech media, there seems to be consistent concern about their safety, and reluctance to adopt them, amongst the general public. Even if self-driving cars really do end up being safer and better in many other ways than regular cars, automakers will still face this challenge of public perception.

And now, a summary poem:

The iron beast has come alive,
We do not want it, do not want it
Its promises we do not prize
It does not do as we see fit

Only when we can rely
On iron beast with its own eye
Only then will we concede
And disaffection yield to need

If you’re interested in using our questionnaire or our data, please reach out! I’d love to help you build on our research.

Want to learn more about our study? Read it here (click to download PDF) or see the publication details below:

Charlie Hewitt, Ioannis Politis, Theocharis Amanatidis, and Advait Sarkar. 2019. Assessing public perception of self-driving cars: the autonomous vehicle acceptance model. In Proceedings of the 24th International Conference on Intelligent User Interfaces (IUI ’19). ACM, New York, NY, USA, 518-527. DOI: https://doi.org/10.1145/3301275.3302268