Where Archaeology, Physics, and Artificial Intelligence Meet

Author: Dr Marc Tanti

Ancient Egypt is famous for the mummies of Pharaohs, but did you know that there are many mummified animals? Studying them offers scientists a wealth of knowledge on the method and motivation behind this practice. But mummies are fragile artefacts, and museum curators don’t generally appreciate archaeologists dissecting their specimens. To get around this, X-rays help researchers peek inside the mummies without damaging them.

After finishing my Ph.D. in artificial intelligence, I started working as a research support officer at the University of Malta on a collaborative project with archaeologists at the European Synchrotron Radiation Facility (ESRF), a research institute in France. These archaeologists are studying animal mummies from museum collections, such as the Museum of Grenoble, in order to learn about their structure. This institute is better known for its particle accelerator, which sets electrons flying at nearly the speed of light to understand the shape of drugs and other molecules. So, what’s the link with mummies?


We’re exploring Here!

If you had a rich, malleable canvas that could flip rules on their heads and expose truths we take for granted, wouldn’t you use it? Jasper Schellekens writes about games that delve deep into some of our most challenging philosophical questions.

The famous Chinese philosopher Confucius once said, ‘I hear and I forget. I see and I remember. I do and I understand.’ Confucius would likely have been a miserable mystic in modern mainstream education, which demands that students sit and listen to teachers. But it’s not all bad. Technological advancements have brought us something Confucius could never have dreamed of: digital worlds.

A digital world offers interaction within the boundaries of a created environment. It allows you to do things, even if the ‘thing’ amounts to little more than pressing a key. Research at the Institute of Digital Games (IDG) focuses on developing a deeper understanding of how these concepts can be used to teach through doing: looking at how people interact with gameworlds, studying how games can impact them (Issue 24), and designing games that do exactly that.

Doing it digital 

Two millennia later, John Dewey, one of the most prominent American scholars of the 20th century, proposed an educational reform that focused on learning through doing and reflection instead of the ‘factory model’ that was the norm. Dewey’s idea was embraced, and has become a pedagogical tool in many classrooms, now known as experiential learning.

Let’s not pretend that Confucius was thousands of years ahead of his time—after all, apprenticeships have always been an extremely common form of learning. But what if we were to transplant this method of experimentation, trial and error, into a digital world?

It would allow us to do so much! And we’re talking about more than figuring out how to plug in to Assassin’s Creed’s Animus or getting the hang of swinging through New York City as Spider-Man. While these are valuable skills you don’t want to ignore, what we’re really interested in here are virtual laboratories, space simulations, and interactive thought experiments.

Games make an ideal vehicle for experiential learning precisely because they provide a safe and relatively inexpensive digital world for students to learn from.

Think of the value of a flight simulator to train pilots. The IDG applied the same idea to create a virtual chemistry lab for the Envisage Project. They threw in the pedagogical power tools of fun and competition to create what are known as ‘serious games’.

Serious games are at the heart of many of the IDG’s research projects. eCrisis uses games for social inclusion and teaching empathy. iLearn facilitates the learning process for children with dyslexia, and Curio is developing a teaching toolkit to foster curiosity. However, the persuasive power of videogames stretches further than we might think.

In a videogame world, players take intentional actions based on the rules set by the creators. These ‘rules’ are also referred to as ‘game mechanics’. Through these rules, and experiential learning, players can learn to think in a certain, often conventional, way.

Which brings us to HERE.

Prof. Stefano Gualeni is fond of using games to criticise conventions: in Necessary Evil a player takes on the role of an NPC (Non Player Character) monster, in Something Something Soup Something the definition of soup is questioned, while in HERE Gualeni breaks down what ‘here’ means in a digital world.

What’s Here?  

HERE sees the player explore the philosophical concept of ‘indexicality’, the idea that meanings depend on the context in which they occur. A fitting example is the extended index finger, which means different things depending on where it is placed and what movement it makes. Point one way or another to indicate direction, place over the lips to request silence, or shake it from side to side to deny or scold. 

The game explores the word ‘here’ in the digital world. It sheds light on how much we take for granted, and how a lot of concepts are not as straightforward as we think. 

In HERE you play as ‘Wessel the Adventurer’, a cat of acute perception who is sent on a quest by a wizard to find magic symbols and open an enchanted cave. Playing on the tropes of role-playing games, the expectations of the adventurer are thus framed in a conventional manner, but not everything is as it seems.

By subverting players’ expectations of role-playing games, HERE gives them the opportunity to discover what they have been (perhaps unwittingly) taught. They will be confronted with a puzzle involving the many versions of ‘here’ that can co-exist in a digital world. Among their prizes is Gualeni himself performing a philosophical rap.

Explorable Explanations 

Experiential learning isn’t the only way to learn, but video games, with their interactivity and ability to manipulate the gameworld’s rules with ease, offer a ripe environment for it. The digital realm adds a very malleable layer of possibility for learning through doing and interacting with philosophical concepts. HERE is not alone in this approach. 

Words often fall short of the concepts they are trying to convey. How do you explain why people trust each other when there are so many opportunities to betray that trust? Telling people they have cognitive biases is not as effective as showing them acting on those biases.

Explorable Explanations is a collection of games curated by award-winning game developer Nicky Case that dig into these concepts through play. The Evolution of Trust is one of them, breaking down the complex psychological and social phenomena contributing to the seemingly simple concept of trust in society. Adventures in Cognitive Biases shows us how we are biased even when we don’t think we are. HERE delves into our understanding of language and the world around us, showing us (instead of telling us) that learning doesn’t have to be boring. Now go learn something and play HERE.

To try the game yourself visit www.here.gua-le-ni.com

The unusual suspects

For all of technology’s advances, it has long been said that creative tasks will remain out of machines’ reach. Jasper Schellekens writes about one team’s efforts to build a game that proves that notion wrong.

The murder mystery plot is a classic in video games; take Grim Fandango, L.A. Noire, and the epic Witcher III. But as fun as they are, they do have a downside: they rarely offer much replayability. Once you find out the butler did it, there isn’t much point in playing again. However, a team of academics and game designers are joining forces to pair open data with computer-generated content to create a game that gives players a new mystery to solve every time they play.

Dr Antonios Liapis

The University of Malta’s Dr Antonios Liapis and New York University’s Michael Cerny Green, Gabriella A. B. Barros, and Julian Togelius want to break new ground by using artificial intelligence (AI) for content creation. 

They’re handing the design job over to an algorithm. The result is a game in which all characters, places, and items are generated using open data, making every play session, every murder mystery, unique. That game is DATA Agent.

Gameplay vs Technical Innovation 

AI often only enters the conversation in the form of expletives, when people play games such as FIFA and players on their virtual team don’t make the right turn, or when there is a glitch in a first-person shooter like Call of Duty. But the potential applications of AI in games are far greater than merely making objects and characters move through the game world realistically. AI can also be used to create unique content—it can be creative.

While creating content this way is nothing new, the focus on using AI has typically been purely algorithmic, with content being generated through computational procedures. No Man’s Sky, a space exploration game that took the world (and crowdfunding platforms) by storm in 2015, generated a lot of hype around its use of computational procedures to create varied content for each player. The makers of No Man’s Sky promised their players galaxies to explore, but enthusiasm waned in part due to the monotonous gameplay. DATA Agent learnt from this example. The game instead taps into existing information available online from Wikipedia, Wikimedia Commons, and Google Street View and uses that to create a whole new experience.

Data: the Robot’s Muse  

A human designer draws on their experiences for inspiration. But what are experiences if not subjectively recorded data on the unreliable wetware that is the human brain? Similarly, a large quantity of freely available data can be used as a stand-in for human experience to ‘inspire’ a game’s creation. 

According to a report by UK non-profit Nesta, machines will struggle with creative tasks. But researchers in creative computing want AI to create as well as humans can.

However, before we grab our pitchforks and run AI out of town, it must be said that games using online data sources are often rather unplayable. Creating content from unrefined data can lead to absurd and offensive gameplay situations. Angelina, a game-making AI created by Mike Cook at Falmouth University, made A Rogue Dream. This game uses Google Autocomplete to name the player’s abilities, enemies, and healing items based on an initial prompt by the player. Problems occasionally arose as nationalities and gender became linked to racial slurs and dangerous stereotypes. Apparently there are awful people influencing autocomplete results on the internet.

DATA Agent uses backstory to mitigate problems arising from absurd results. A revised user interface also makes playing the game more intuitive and less like poring over musty old data sheets. 

So what is it really? 

In DATA Agent, you are a detective tasked with finding a time-traveling murderer now masquerading as a historical figure. DATA Agent creates a murder victim based on a person’s name and builds the victim’s character and story using data from their Wikipedia article.

This makes the backstory a central aspect of the game. It is carefully crafted to explain the context of the links between the entities found by the algorithm. Firstly, it serves to explain expected inconsistencies. Some characters’ lives did not historically overlap, but they are still grouped together as characters in the game. It also clarifies that the murderer is not a real person but rather a nefarious doppelganger. After all, it would be a bit absurd to have Albert Einstein be a witness to Attila the Hun’s murder. Also, casting a beloved figure as a killer could spoil the game’s enjoyment and start riots. Not to mention that some of the people on Wikipedia are still alive, and no university could afford the inevitable avalanche of legal battles.

Rather than increase the algorithm’s complexity to identify all backstory problems, the game instead makes the issues part of the narrative. In the game’s universe, criminals travel back in time to murder famous people. This murder shatters the existing timeline, causing temporal inconsistencies: that’s why Einstein and Attila the Hun can exist simultaneously. An agent of DATA is sent back in time to find the killer, but time travel scrambles the information they receive, and they can only provide the player with the suspect’s details. The player then needs to gather intel and clues from other non-player characters, objects, and locations to try and identify the culprit, now masquerading as one of the suspects. The murderer, who, like the DATA Agent, is from an alternate timeline, also has incomplete information about the person they are impersonating and will need to improvise answers. If the player catches the suspect in a lie, they can identify the murderous, time-traveling doppelganger and solve the mystery!

De-mystifying the Mystery 

The murder mystery starts where murder mysteries always do, with a murder. And that starts with identifying the victim. The victim’s name becomes the seed for the rest of the characters, places, and items. Suspects are chosen based on their links to the victim and must always share a common characteristic. For example, Britney Spears and Diana Ross are both classified as ‘singer’ in the data used. The algorithm searches for people with links to the victim and turns them into suspects. 

But a good murder-mystery needs more than just suspects and a victim. As Sherlock Holmes says, a good investigation is ‘founded upon the observation of trifles.’ So the story must also have locations to explore, objects to investigate for clues, and people to interrogate. These are the game’s ‘trifles’ and that’s why the algorithm also searches for related articles for each suspect. The related articles about places are converted into locations in the game, and the related articles about people are converted into NPCs. Everything else is made into game items.
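To make that pipeline a little more concrete, here is a minimal sketch of the victim-to-case expansion in Python. It queries the public MediaWiki API for the articles a page links to; the classification helpers (is_person, is_place, has_trait) are hypothetical placeholders for the structured data the real game draws on, so this is an illustration of the idea rather than the team’s actual code.

```python
import requests

API = "https://en.wikipedia.org/w/api.php"

def linked_articles(title):
    """Titles of main-namespace articles linked from a Wikipedia page
    (ignoring API continuation for brevity)."""
    params = {"action": "query", "titles": title, "prop": "links",
              "plnamespace": 0, "pllimit": "max", "format": "json"}
    pages = requests.get(API, params=params).json()["query"]["pages"]
    return [link["title"]
            for page in pages.values()
            for link in page.get("links", [])]

def build_case(victim, trait, is_person, is_place, has_trait, n_suspects=3):
    """Expand a victim's name into suspects, locations, NPCs, and items.
    is_person / is_place / has_trait are hypothetical classifiers standing
    in for the structured sources the real game uses."""
    suspects = [t for t in linked_articles(victim)
                if is_person(t) and has_trait(t, trait)][:n_suspects]
    case = {"victim": victim, "suspects": {}}
    for s in suspects:
        related = linked_articles(s)
        case["suspects"][s] = {
            "locations": [t for t in related if is_place(t)],
            "npcs": [t for t in related if is_person(t) and t != victim],
            "items": [t for t in related if not is_person(t) and not is_place(t)],
        }
    return case
```

Called with a victim such as ‘Britney Spears’ and the shared trait ‘singer’, a pipeline of this shape would yield suspects, locations, NPCs, and items much like those described below.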

The Case of Britney Spears 

This results in games like “The Case of Britney Spears” with Aretha Franklin, Diana Ross, and Taylor Hicks as the suspects. In the case of Britney Spears, the player could interact with NPCs such as Whitney Houston, Jamie Lynn Spears, and Katy Perry. They could also travel from McComb in Mississippi to New York City. As they work their way through the game, they would uncover that the evil time-traveling doppelganger had taken the place of the greatest diva of them all: Diana Ross.

Oops, I learned it again 

DATA Agent goes beyond refining the technical aspects of organising data and gameplay. In an age where so much freely available information is ignored because it is presented in an inaccessible or boring format, data games could be game-changing (pun intended).

In 1985, Broderbund released their game Where in the World is Carmen Sandiego?, where the player tracked criminal henchmen and eventually mastermind Carmen Sandiego herself by following geographical trivia clues. It was a surprise hit, becoming Broderbund’s third best-selling Commodore game as of late 1987. It had tapped into an unanticipated market, becoming an educational staple in many North American schools. 

Facts may have lost some of their lustre since the rise of fake news, but games like Where in the World is Carmen Sandiego? are proof that learning doesn’t have to be boring. And this is where products such as DATA Agent could thrive. After all, the game uses real data and actual facts about the victims and suspects. The player’s main goal is to catch the doppelganger’s mistake in their recounting of facts, requiring careful attention. The kind of attention you may not have when reading a textbook. This type of increased engagement with material has been linked to improved information retention. In the end, when you’ve travelled through the game’s various locations, found a number of items related to the murder victim, and uncovered the time-travelling murderer, you’ll hardly be aware that you’ve been taught.

‘Education never ends, Watson. It is a series of lessons, with the greatest for the last.’ – Sir Arthur Conan Doyle, His Last Bow. 

Wall-E: Ta-dah!

Think sci-fi, think robots. Whether benevolent, benign, or bloodthirsty, these artificially intelligent automatons have long captured our imagination. However, thanks to recent advances in mechanical and programming technology, it looks like they are set to break the bonds of fiction.

Designing the factory of the future

With consumer demand reaching new highs, automation in industry is essential. Dr Ing. Emmanuel Francalanza writes about his contribution to streamlining the complex factory design process for contemporary engineers.

From smartphones to smartwatches, smart cars to smart houses, intelligent technology is inescapable. Busier people have made efficiency a valuable currency, and thus the ‘Internet of Things’ (IoT), and its plethora of connected devices, has become a necessary part of everyday life. The application of this model goes way beyond the regular consumer, however.

Google’s deepest dreams and nightmares

By Ryan Abela

Of all the areas of artificial intelligence, neural networks have always fascinated me. Loosely modelled on the biology of the human brain, artificial neural networks consist of very simple mathematical functions connected to each other through a set of variable parameters. These networks can solve problems that range from mathematical equations to more abstract concepts such as detecting objects in a photo or recognising someone’s voice.
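As a rough illustration of what ‘simple mathematical functions connected through variable parameters’ means, here is a minimal two-layer network in NumPy; the layer sizes and random weights are arbitrary, and no training is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two layers of "neurons": each layer is just a weighted sum of its inputs
# (a matrix multiplication) squashed by a simple non-linear function.
W1 = rng.normal(size=(4, 8))     # variable parameters (weights) of layer 1
W2 = rng.normal(size=(8, 3))     # variable parameters of layer 2

def network(x):
    hidden = np.tanh(x @ W1)     # simple function applied to the input
    return np.tanh(hidden @ W2)  # another simple function, stacked on top

# Four numbers in, three numbers out. Training would mean nudging W1 and W2
# until the outputs match the answers we want (e.g. "this photo contains a cat").
print(network(np.array([0.2, -1.0, 0.5, 0.3])))
```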


An Automatically Tailored Experience

Digital games need to keep players engaged. Since games are interactive media, achieving this goal means that game designers need to anticipate player actions to create a pre-designed experience. Traditionally, developers have achieved this by restricting player freedom to a strict set of actions, thereby curating the player experience and ensuring the fun factor. However, games are now taking a different route, with more users making their own content (user generated content, UGC) through extensive creativity tools, which makes it hard to predict player experience.

Vincent E. Farrugia

To overcome these challenges, Vincent E. Farrugia (supervised by Prof. Georgios N. Yannakakis) merged game design and artificial intelligence (AI). He developed a software framework for handling player engagement in environments which feature user generated content and groups. The three-pronged solution tackles problems during game production, playing the game itself, and making sure the framework is sustainable. To maintain engagement within groups, he analysed data for each individual player as well as patterns common across the whole group. Farrugia created software tools, autonomous AI aids, and tools to test and support the framework.

The software framework is made up of inter-operating modules. Firstly, an engagement policy module allows designers to specify theories to express their vision of positive game engagement. Player modelling then shapes this backbone to specific player engagement needs. The module can autonomously learn from player creations as reactions to game stimuli. Individual and group manager modules use this mixture of expert knowledge, AI learnt data, and player game-play history to automatically adapt game content to solve player engagement problems. This procedural content generation (PCG) is tailored for a specific player and time.
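The shape of that framework can be pictured with a few skeletal Python classes. The interfaces and names below are illustrative only, not Farrugia’s actual code; they simply show how a designer-specified policy, learnt player models, and a group manager could fit together to pick adapted content.

```python
from dataclasses import dataclass, field

@dataclass
class EngagementPolicy:
    """Designer-specified theory of what positive engagement looks like."""
    target_challenge: float = 0.7   # illustrative knobs only
    target_novelty: float = 0.5

@dataclass
class PlayerModel:
    """Learns a player's preferences from their creations and reactions."""
    history: list = field(default_factory=list)

    def observe(self, creation, reaction):
        self.history.append((creation, reaction))

    def predicted_engagement(self, content, policy):
        # Placeholder: a real model would be learnt from self.history.
        return 0.5

class GroupManager:
    """Adapts content for a group by combining individual models with the policy."""
    def __init__(self, policy, players):
        self.policy = policy
        self.players = players  # dict of name -> PlayerModel

    def pick_content(self, candidates):
        # Choose the candidate with the highest average predicted engagement.
        return max(candidates, key=lambda c: sum(
            m.predicted_engagement(c, self.policy)
            for m in self.players.values()) / len(self.players))
```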

The framework’s abilities were showcased in a digital game also developed by Farrugia. Various technologies were incorporated to encourage player creativity in group sessions and to enhance networking. The setup also allowed the AI to quickly learn from each player via parallelism. Initial testing used a simulated environment with software agents, driven by a personality system to validate the underlying algorithms under various conditions. Preliminary testing on real players followed. The resulting diverse game-play styles provide suggestions for AI model improvement. Farrugia is enthusiastic about future work on this AI framework, giving developers better tools to allow player creativity to flourish while maintaining positive game-play experiences.


This research was performed as part of a Master of Science degree at the Institute of Digital Games, University of Malta. It was partly funded by the Strategic Educational Pathways Scholarship (Malta), which is part-financed by the European Union—European Social Fund (ESF) under Operational Programme II—Cohesion Policy 2007—2013, ‘Empowering People for More Jobs and a Better Quality of Life’.

Decoding Language


Maltese needs to be saved from digital extinction. Dr Albert Gatt, Prof. Gordon Pace, and Mike Rosner write about their work making digital tools for Maltese, interpreting legalese, and building a Maltese-speaking robot.

In 2011 an IBM computer called Watson made the headlines after it won an American primetime television quiz called Jeopardy. Over three episodes the computer trounced two human contestants and won a million dollars.

Jeopardy taps into general world knowledge, with contestants being presented with ‘answers’ to which they have to find the right questions. For instance, one of the answers, in the category ‘Dialling for Dialects’, was: ‘While Maltese borrows many words from Italian, it developed from a dialect of this Semitic language.’ To which Watson correctly replied: ‘What is Arabic?’

Watson is a good example of state of the art technology that can perform intelligent data mining, sifting through huge databases of information to identify relevant nuggets. It manages to do so very efficiently by exploiting a grid architecture, which is a design that allows it to harness the power of several computer processors working in tandem.

“Maltese has been described as a language in danger of ‘digital extinction’”

This ability alone would not have been enough for it to win an American TV show watched by millions. Watson was so appealing because it used English as an American would.

Consider what it takes for a machine to understand the above query about Maltese. The TV presenter’s voice would cause the air to vibrate and hit the machine’s microphones. If Watson were human, the vibrations would jiggle the hairs inside his ear, and the brain would then chop up the component sounds and analyse them into words extremely rapidly; Watson was equipped with Automatic Speech Recognition technology to do exactly that. The problem for a computer is that there is more to language than just sounds and words. A human listener would need to do much more: for example, to figure out that ‘it’ in the question probably refers to ‘Maltese’ (rather than, say, ‘Italian’, which is possible though unlikely in this context). They would also need to figure out that ‘borrow’ is being used differently from when one talks about borrowing one’s sister’s car. After all, Maltese did not borrow words from Italian on a short-term basis. Clearly the correct interpretation of ‘borrow’ depends on the listener having identified the intended meaning of ‘Maltese’, namely, that it is a language.

To understand language any listener needs to go beyond mere sound. There are meanings and structures throughout all language levels. A human listener needs to go through them all before saying that they understood the message.

Watson was not just good at understanding; he was pretty good at speaking too. His answers were formulated in a crisp male voice that sounded quite natural, an excellent example of Text-to-Speech synthesis technology. In a fully fledged communication system, whether human or machine, going from text to speech requires formulating the text of the message. The process could be thought of as the reverse of understanding, involving much the same levels of linguistic processing.

 

Machine: say ‘hello’ to Human

The above processes are all classified as Human Language Technology, which can be found in many devices: from Siri or Google Now on smartphones to word processing programs that can check spelling and grammar, or translate.

Seamless human-machine interaction relies on language. The challenge for companies and universities is that, unlike artificial languages (such as those used to program computers or those developed by mathematicians), human languages are riddled with ambiguity. Many words and sentences have multiple meanings, and the intended sense often depends on context and on our knowledge of the world. A second problem is that we do not all speak the same language.

 

Breaking through Maltese

Maltese has been described as a language in danger of ‘digital extinction’. This was the conclusion of a report by META-NET, a European consortium of research centres focusing on language technology. The main problem is a lack of Human Language Technology — resources like word processing programs that can correctly recognise Maltese.

Designing an intelligent computer system with language abilities is far easier in some languages than in others. English was the main language in which most of these technologies were developed. Since researchers can combine these ready-made software components instead of building them from scratch, they can focus on larger challenges, such as winning a million dollars on a TV program. In the case of smaller languages, like Maltese, the basic building blocks are still being assembled.

Perhaps the most fundamental building block for any language system is linguistic data in a form that can be processed automatically by a machine. In Human Language Technology, the first step is usually to acquire a corpus, a large repository of text or speech, in the form of books, articles, recordings, or anything else that happens to be available in the correct form. Such repositories are exploited using machine-learning techniques, to help systems grasp how the language is typically used. To return to the Jeopardy example, there are now programs that can resolve pronouns such as ‘it’ to identify their antecedents, the element to which they refer. The program should identify that ‘it’ refers to Maltese.
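To see why even this apparently small task is hard, consider a deliberately naive resolver that simply picks the nearest preceding candidate noun. The sketch below is illustrative only; real coreference systems are learnt from annotated corpora and weigh grammatical role, agreement, and world knowledge.

```python
def nearest_antecedent(pronoun_index, tokens, candidate_indices):
    """Naive heuristic: the antecedent is the nearest preceding candidate."""
    preceding = [i for i in candidate_indices if i < pronoun_index]
    return tokens[max(preceding)] if preceding else None

tokens = ("While Maltese borrows many words from Italian , "
          "it developed from a dialect of this Semitic language").split()

# Candidate antecedents: 'Maltese' (index 1) and 'Italian' (index 6).
print(nearest_antecedent(tokens.index("it"), tokens, [1, 6]))  # -> 'Italian'
# The naive rule picks the wrong one. A resolver trained on a corpus would
# prefer 'Maltese', the subject of the clause, which is why annotated data
# and machine learning are needed rather than hand-written rules.
```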

For the Maltese language, researchers have developed a large text/speech repository, electronic lexicons (a language’s inventory of its basic units of meaning), and related tools to analyse the language (available for free). Automatic tools exist to annotate this text with basic grammatical and structural information. These tools require a lot of manual work; however, once in place, they allow for the development of sophisticated programs. The rest of this article will look at some of the ongoing research using these basic building blocks.

 

From Legalese to Pets

Many professions benefit from automating tasks using computers. Lawyers and notaries are the next professionals that might benefit from an ongoing project at the University of Malta. These experts draft contracts on a daily basis. For them, machine support is still largely limited to word processing, spell checking, and email services, with no support for a deeper analysis of the contracts they write and the identification of their potential legal consequences, partly through their interaction with other laws.

Contracts pose the same challenges when developing Human Language Technology resources. A saving grace is that they are written in ‘legalese’, which lessens some problems. Technology has advanced enough to allow the development of tools that analyse a text and extract information about the basic elements of a contract, leaving the professional free to analyse its deeper meaning.

Deeper analysis is another big challenge in contract analysis. It is not restricted to just identifying the core ‘meaning’ or message, but needs to account for the underlying reasoning behind legal norms. Such reasoning is different from traditional logic, since it talks about how things should be as opposed to how they are. Formal logical reasoning has a long history, but researchers are still trying to identify how one can think precisely about norms which affect definitions. Misunderstood definitions can land a person in jail.

Consider the following problem. What if a country legislates that: ‘Every year, every person must hand in Form A on 1st January, and Form B on 2nd January, unless stopped by officials.’ Exactly at midnight between the 1st and 2nd of January the police arrest John for not having handed in Form A. He is kept under arrest until the following day, when his case is heard in court. The prosecuting lawyer argues that John should be found guilty because, by not handing in Form A on 1st January, he has violated the law. The defendant’s lawyer argues that, since John was under arrest throughout the 2nd of January, he was being stopped by officials from handing in Form B, absolving him of part of his legal obligation. Hence, he is innocent. Who is right? If we were to analyse the text of the law logically, which version should be adopted? The logical reasoning behind legal documents can be complicated, which is precisely why tools are needed to support lawyers and notaries who draft such texts.
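One way to see the ambiguity precisely is to encode the two readings of the law explicitly. The following is a toy sketch, not the project’s actual formalism: reading one treats the law as two separate obligations, each with its own ‘unless stopped’ exception; reading two treats it as a single compound obligation that the exception cancels as a whole.

```python
# John's situation: he never handed in Form A (and was not stopped from doing
# so), and was under arrest (stopped by officials) for the whole of 2nd January.
handed_in = {"A": False, "B": False}
stopped   = {"A": False, "B": True}

def obligation_met(form):
    """A single obligation is satisfied if the form was handed in,
    or officials prevented it from being handed in."""
    return handed_in[form] or stopped[form]

# Reading 1 (the prosecution): two independent obligations, each with its own exception.
guilty_reading_1 = not (obligation_met("A") and obligation_met("B"))

# Reading 2 (the defence): one compound obligation, excused entirely
# if officials stopped any part of it.
guilty_reading_2 = not ((handed_in["A"] and handed_in["B"]) or any(stopped.values()))

print(guilty_reading_1)  # True  -> guilty: Form A was simply never handed in
print(guilty_reading_2)  # False -> innocent: the whole obligation was excused
```

The two readings give opposite verdicts from the same wording, which is exactly the kind of hidden fork in the logic that drafting tools should surface.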

Figuring out legal documents might seem very different to what Watson was coping with. But there is an important link: both involve understanding natural language (normal every day language) for something, be it computer, robot, or software, to do something specific. Analysing contracts is different because the knowledge required involves reasoning. So we are trying to wed recent advances in Human Language Technology with advances in formal logical reasoning.

Illustration by Sonya Hallett

Contract drafting can be supported in many ways, from a simple cross-referencing facility, enabling an author to identify links between a contract and existing laws, to identifying conflicts within the legal text. Since contracts are written in a natural language, linguistic analysis is vital to properly analyse a text. For example, in a rental contract, a clause about keeping dogs would need a cross-reference to legislation about pet ownership.

We (the authors) are developing tools that integrate with word processors to help lawyers or notaries draft contracts. Results are presented as recommendations rather than automated changes, keeping the lawyer or notary in control.

 

Robots ’R’ Us

So far we have only discussed how language is analysed and produced. Of course, humans are not simply language-producing engines; a large amount of human communication involves body language. We use gestures to enhance communication — for example, to point to things or mime actions as we speak — and facial expressions to show emotions. Watson may be very clever indeed, but is still a disembodied voice. Imagine taking it home to meet the parents.

“Robby the Robot from the 1956 film Forbidden Planet, refused to obey a human’s orders”

Robotics is forging strong links with Human Language Technology. Robots can provide bodies for disembodied sounds, allowing them to communicate in a more human-like manner.

Robots have captured the public imagination since the beginning of science fiction. For example, Robby the Robot from the 1956 film Forbidden Planet refused to obey a human’s orders, a key plot element. He disobeyed because they conflicted with ‘the three laws of robotics’, as laid down by Isaac Asimov in 1942. These imaginary robots are not only somewhat human-shaped and anthropomorphic, but they think and even make value judgements.

Actual robots tend to be more mundane. Industry uses them to cut costs and improve reliability. For example, the Unimate Puma, which was designed in 1963, is a robotic arm used by General Motors to assemble cars.

The Unimate Puma 200

The Puma became popular because of its programmable memory, which allowed quick and cheap reconfiguration to handle different tasks. But the basic design could not cope with unanticipated changes, which inevitably ended in failure. Current research is closing the gap between Robby and Puma.

Opinions may be divided on the exact nature of robots, but three main qualities define one: a physical body; the capacity for complex, autonomous actions; and the ability to communicate. Very roughly, advances in robotics push along these three highly intertwined axes.

At the UoM we are working on research that pushes forward all three, though it might take some time before we construct a Robby 2. We are developing languages for communicating with robots that are natural for humans to use, but are not as complex as natural languages like Maltese. Naturalness is a hard notion to pin down. But we can judge that one thing is more or less natural than another. For example, the language of logic is highly unnatural, while using a restricted form of Maltese would be more natural. It could be restricted in its vocabulary and grammar to make it easier for a robot to handle.

Take the language of a Lego EV3 Mindstorms robot and imagine a three-instruction program. The first instruction would be to start its motors, the second to wait until light intensity drops to a specific amount, the third to stop. The reference to light intensity is not a natural way to communicate information to a robot. When we talk to people, we are not expected to understand how our spoken words relate to their hardware. The program is telling the robot to ‘move forward until you reach a black line’. Unlike the literal translation, this more natural version employs concepts at a much higher level and hence is accessible to anybody with a grasp of English.
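For comparison, here is roughly what the literal three-instruction version looks like when written against the community ev3dev2 Python bindings — an assumption on our part, and the motor ports, sensor, and light threshold are all arbitrary choices for the sketch.

```python
from ev3dev2.motor import MoveTank, OUTPUT_B, OUTPUT_C
from ev3dev2.sensor.lego import ColorSensor

tank = MoveTank(OUTPUT_B, OUTPUT_C)          # drive motors on ports B and C (assumed)
light = ColorSensor()

tank.on(left_speed=30, right_speed=30)       # instruction 1: start the motors
while light.reflected_light_intensity > 15:  # instruction 2: wait until the light
    pass                                     # intensity drops (a dark line below)
tank.off()                                   # instruction 3: stop

# The natural-language version hides all of this behind one sentence:
# "move forward until you reach a black line".
```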

The first step is to develop programs that translate commands spoken by people into underlying machine instructions understood by robots. These commands will typically describe complex physical actions that are carried out in physical space. Robots need to be equipped with the linguistic abilities necessary to understand these commands, so that we can tell a robot something like ‘when you reach the door near the table go through it’.

To develop a robot that can understand this command, a team with a diverse skillset is needed. Language, translation, the robot’s design, its ability to move, and AI (artificial intelligence) all need to work together. The robot must turn language into action. It must know that it needs to go through the door, not through the table, and that it should first perceive the door and then move through it. A problem arises if the door is closed, so the robot must know what a door is used for, how to open and close it, and what the consequences are. For this it needs reasoning ability and the necessary physical coordination. Opening a door might seem simple, but it involves complex hand movements and just the right grip. Robots need to achieve complex behaviours and movements to operate in the real world.

The point is that a robot that can understand these commands is very different to the Puma. To build it we must first solve the problem of understanding the part of natural language dealing with spatially located tasks. In so doing the robot becomes a little bit more human.

A longer-term aim is to engage the robot in two-way conversation and have it report on its observations — as Princess Leia did with R2-D2 in Star Wars, if R2-D2 could speak.

Lego Mindstorms EV3 brick

Language for the World

Human Language Technologies are already changing the world, from automated announcements at airports, to smartphones that can speak back to us, to automatic translation on demand. Human Language Technologies help humans interact with machines and with each other. But the revolution has only just begun. We are beginning to see programs that link language with reasoning, and as robots become mentally and physically more adept, the need to talk with them as partners will become ever more urgent. There are still a lot of hurdles to overcome.

To make the right advances, language experts will need to work with engineers and ICT experts. Then, having won another million bucks on a TV show, a future Watson will get up, shake the host’s hand, and maybe give a cheeky wink to the camera.