DeepMind’s conversational AI chatbot Sparrow shows dialogue difficulties and need for responsible AI

Last month, DeepMind introduced a new artificial intelligence (AI) chatbot called Sparrow. The British subsidiary of Google parent Alphabet hails Sparrow as an important step toward safer, less biased machine learning (ML) systems. Despite the promise, the AI assistant is not being deployed just yet.

Research-based, proof-of-concept

Sparrow is an AI chatbot that relies on dialogue to offer a conversational experience. The AI subsidiary of Alphabet calls it a “dialogue agent that’s useful and reduces the risk of unsafe and inappropriate answers.”

It is designed to talk with a user and answer questions or search the internet using Google “when it’s helpful to look up evidence to inform its responses.” While Sparrow brings a new take on AI chatbots, DeepMind considers it a research-based, proof-of-concept model that is not ready to be deployed.

“We have not deployed the system because we think that it has a lot of biases and flaws of other types,” says Geoffrey Irving, a safety researcher at DeepMind and lead author of the paper introducing Sparrow.

“I think the question is, how do you weigh the communication advantages — like communicating with humans — against the disadvantages? I tend to believe in the safety needs of talking to humans … I think it is a tool for that in the long run,” he added.

In the paper, Irving also says that he does not want to weigh in on the possible path for enterprise applications using Sparrow just yet. It is not immediately clear whether Sparrow will be most useful for general digital assistants such as Google Assistant or Amazon Alexa, or for specific vertical applications.

Conversational AI and dialogue difficulties

Sparrow is not the first conversational AI assistant, but it is one of the first to tackle the difficulties of dialogue. We have already seen Microsoft’s AI chatbot Tay get a crash course in racism on Twitter, and DeepMind probably wants to avoid a similar fate for its assistant.

Dialogue is one of the major challenges that companies like DeepMind need to tackle when designing conversational AI assistants. Irving says one of the main difficulties around dialogue is that there is a lot of context that needs to be considered.

“A system like DeepMind’s AlphaFold is embedded in a clear scientific task, so you have data like what the folded protein looks like, and you have a rigorous notion of what the answer is – such as did you get the shape right,” he told VentureBeat. But in general cases, “you’re dealing with mushy questions and humans – there will be no full definition of success.”

In order to address this challenge, DeepMind built Sparrow using a form of reinforcement learning based on human feedback. The model uses the preference feedback of study participants to determine how useful an answer is.

“To get this data, we show our participants multiple model answers to the same question and ask them which answer they like the most. Because we show answers with and without evidence retrieved from the internet, this model can also determine when an answer should be supported with evidence,” the company said.
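The idea of learning from "which answer do you like most" comparisons can be illustrated with a small sketch. This is not DeepMind's implementation: it assumes a simple Bradley-Terry preference model with one score per answer, and the answer labels and comparison data are hypothetical, invented only for illustration.

```python
import math


def preference_probability(score_a, score_b):
    """Bradley-Terry probability that answer A is preferred over answer B."""
    return 1.0 / (1.0 + math.exp(score_b - score_a))


def fit_scores(comparisons, answers, lr=0.1, epochs=500):
    """Fit one scalar score per answer from (winner, loser) pairs.

    Uses plain stochastic gradient ascent on the log-likelihood of the
    observed preferences. Real systems train a neural reward model instead.
    """
    scores = {answer: 0.0 for answer in answers}
    for _ in range(epochs):
        for winner, loser in comparisons:
            p = preference_probability(scores[winner], scores[loser])
            # Gradient of log p with respect to the winner's score is (1 - p).
            scores[winner] += lr * (1.0 - p)
            scores[loser] -= lr * (1.0 - p)
    return scores


# Hypothetical participant judgments: each tuple is (preferred, rejected).
answers = ["with_evidence", "without_evidence", "off_topic"]
comparisons = [
    ("with_evidence", "without_evidence"),
    ("with_evidence", "off_topic"),
    ("without_evidence", "off_topic"),
    ("with_evidence", "without_evidence"),
    ("without_evidence", "with_evidence"),
]

scores = fit_scores(comparisons, answers)
ranking = sorted(answers, key=scores.get, reverse=True)
print(ranking)
```

In this toy setup, the fitted scores play the role of a usefulness signal: answers that participants pick more often end up ranked higher, which is the kind of signal a reinforcement learning step could then optimize.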

While this approach, which pairs human feedback with a set of rules for the model, could limit bias in AI models, not everyone is convinced by the result. Eugenio Zuccarelli, an innovation data scientist at CVS Health and research scientist at MIT Media Lab, points out that there could still be bias in the “human loop.”

Irving says that DeepMind plans to scale the approach to many more rules in the future. “I think you would probably have to become somewhat hierarchical, with a variety of high-level rules and then a lot of detail about particular cases,” he explained.

To eliminate bias, Irving says the model would need to support multiple languages, cultures, and dialects. He also expressed interest in making the dialogue agent safer.

“I think you need a diverse set of inputs to your process — you want to ask a lot of different kinds of people, people that know what the particular dialogue is about,” he said. “So you need to ask people about language, and then you also need to be able to ask across languages in context – so you don’t want to think about giving inconsistent answers in Spanish versus English.”

With its focus on the rules and ethics of the model, Sparrow is a step in the right direction. The bigger question is whether it will become the norm for responsible AI and whether DeepMind will be able to push the industry towards that goal.

Editorial Staff