Decoding Language

Albert Gatt, Gordon Pace, Mike Rosner

Maltese needs to be saved from digital extinction. Dr Albert Gatt, Prof. Gordon Pace, and Mike Rosner write about their work making digital tools for Maltese, interpreting legalese, and building a Maltese-speaking robot.

In 2011 an IBM computer called Watson made the headlines after it won an American primetime television quiz called Jeopardy. Over three episodes the computer trounced two human contestants and won a million dollars.

Jeopardy taps into general world knowledge, with contestants being presented with ‘answers’ to which they have to find the right questions. For instance, one of the answers, in the category “Dialling for Dialects”, was: While Maltese borrows many words from Italian, it developed from a dialect of this Semitic language. To which Watson correctly replied with: What is Arabic?

Watson is a good example of state-of-the-art technology that can perform intelligent data mining, sifting through huge databases of information to identify relevant nuggets. It manages to do so very efficiently by exploiting a grid architecture, a design that allows it to harness the power of several computer processors working in tandem.


This ability alone would not have been enough for it to win an American TV show watched by millions. Watson was so appealing because it used English as an American would.

Consider what it takes for a machine to understand the above query about Maltese. The TV presenter’s voice would cause the air to vibrate and hit the machine’s microphones. If Watson were human, the vibrations would jiggle the hairs inside his ear so that the brain would then chop up the component sounds and analyse them into words extremely rapidly. The problem for a computer is that there is more to language than just sounds and words. A human listener would need to do much more: for example, figure out that ‘it’ in the question probably refers to ‘Maltese’ (rather than, say, ‘Italian’, which is possible though unlikely in this context). They would also need to figure out that ‘borrow’ is being used differently from when one talks about borrowing one’s sister’s car. After all, Maltese did not borrow words from Italian on a short-term basis. Clearly the correct interpretation of ‘borrow’ depends on the listener having identified the intended meaning of ‘Maltese’, namely, that it is a language. Watson was equipped with Automatic Speech Recognition technology to do exactly that.

To understand language any listener needs to go beyond mere sound. There are meanings and structures throughout all language levels. A human listener needs to go through them all before saying that they understood the message.

Watson was not just good at understanding; he was pretty good at speaking too. His answers were formulated in a crisp male voice that sounded quite natural, an excellent example of Text-to-Speech synthesis technology. In a fully-fledged human or machine communicating system, going from text to speech requires formulating the text of the message. The process could be thought of as the reverse of understanding, involving much the same levels of linguistic processing.

 

Machine: say ‘hello’ to Human

The above processes are all classified as Human Language Technology, which can be found everywhere from Siri or Google Now in smartphones to a word processing program that can check spelling and grammar, or translate.

Human-machine interaction relies on language to become seamless. The challenge for companies and universities is that, unlike artificial languages (such as those used to program computers or those developed by mathematicians), human languages are riddled with ambiguity. Many words and sentences have multiple meanings and the intended sense often depends on context and on our knowledge of the world. A second problem is that we do not all speak the same language.

 

Breaking through Maltese

Maltese has been described as a language in danger of ‘digital extinction’. This was the conclusion of a report by META-NET, a European consortium of research centres focusing on language technology. The main problem is a lack of Human Language Technology — resources like word processing programs that can correctly recognise Maltese.

Designing an intelligent computer system with a language ability is far easier in some languages than it is in others. English was the main language in which most of these technologies were developed. Because researchers can combine these ready-made software components instead of developing them themselves, they can focus on larger challenges, such as winning a million dollars on a TV program. In the case of smaller languages, like Maltese, the basic building blocks are still being assembled.

Perhaps the most fundamental building block for any language system is linguistic data in a form that can be processed automatically by a machine. In Human Language Technology, the first step is usually to acquire a corpus, a large repository of text or speech, in the form of books, articles, recordings, or anything else that happens to be available in the correct form. Such repositories are exploited using machine-learning techniques, to help systems grasp how the language is typically used. To return to the Jeopardy example, there are now programs that can resolve pronouns such as ‘it’ to identify their antecedents, the element to which they refer. The program should identify that ‘it’ refers to Maltese.
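To make the idea concrete, here is a minimal sketch of how counts from a corpus could help a program choose the antecedent of ‘it’. Everything in it is an assumption for illustration: the four-sentence toy corpus is invented, the function names are hypothetical, and treating the first two words of a sentence as subject and verb is a crude stand-in for real grammatical analysis. It is not the method used by Watson or by the tools described in this article.

```python
from collections import Counter

# Invented toy corpus standing in for a large text repository.
TOY_CORPUS = [
    "Maltese developed from a dialect of Arabic",
    "Maltese developed over many centuries",
    "Italian influenced Maltese vocabulary",
    "Maltese borrows words from Italian",
]


def subject_verb_counts(corpus):
    """Count (first word, second word) pairs as a crude subject-verb proxy."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.lower().split()
        if len(words) >= 2:
            counts[(words[0], words[1])] += 1
    return counts


def resolve_pronoun(candidates, verb, counts):
    """Return the candidate seen most often before `verb` in the corpus.

    On a tie, the earliest candidate in the list wins.
    """
    return max(candidates, key=lambda c: counts[(c.lower(), verb)])


counts = subject_verb_counts(TOY_CORPUS)
# In the Jeopardy clue, 'it' is the subject of 'developed'.
antecedent = resolve_pronoun(["Maltese", "Italian"], "developed", counts)
print(antecedent)
```

With a real corpus of millions of words the same principle applies, although practical systems combine many more cues than a single word pair.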

For the Maltese language, researchers have developed a large text/speech repository, electronic lexicons (a language’s inventory of its basic units of meaning), and related tools to analyse the language (available for free). Automatic tools exist to annotate this text with basic grammatical and structural information. These tools require a lot of manual work; however, once in place, they allow for the development of sophisticated programs. The rest of this article will analyse some of the ongoing research using these basic building blocks.

 

From Legalese to Pets

Many professions benefit from automating tasks using computers. Lawyers and notaries are the next professionals that might benefit from an ongoing project at the University of Malta. These experts draft contracts on a daily basis. For them, machine support is still largely limited to word processing, spell checking, and email services, with no support for a deeper analysis of the contracts they write and the identification of their potential legal consequences, partly through their interaction with other laws.

Contracts pose the same challenges as any other text when developing Human Language Technology resources. A saving grace is that they are written in ‘legalese’, which lessens some of the problems. Technology has advanced enough to allow the development of tools that analyse a text and extract information about the basic elements of contracts, leaving the professional free to analyse the deeper meaning of these contracts.

Deeper analysis is another big challenge in contract analysis. It is not restricted to just identifying the core ‘meaning’ or message, but needs to account for the underlying reasoning behind legal norms. Such reasoning is different from traditional logic, since it talks about how things should be as opposed to how they are. Formal logical reasoning has a long history, but researchers are still trying to identify how one can think precisely about norms which affect definitions. Misunderstood definitions can land a person in jail.

Consider the following problem. What if a country legislates that: ‘Every year, every person must hand in Form A on 1st January, and Form B on 2nd January, unless stopped by officials.’ Exactly at midnight between the 1st and 2nd of January the police arrest John for not having handed in Form A. He is kept under arrest until the following day, when his case is heard in court. The prosecuting lawyer argues that John should be found guilty because, by not handing in Form A on 1st January, he has violated the law. The defendant’s lawyer argues that, since John was under arrest throughout the 2nd of January, he was being stopped by officials from handing in Form B, absolving him of part of his legal obligation. Hence, he is innocent. Who is right? If we were to analyse the text of the law logically, which version should be adopted? The logical reasoning behind legal documents can be complicated, which is precisely why tools are needed to support lawyers and notaries who draft such texts.
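The two lawyers’ positions can be read as two different scopes for the phrase ‘unless stopped by officials’. The sketch below encodes both readings so that they can be checked against John’s case. The formalisation is an assumption made for illustration (the names `Facts`, `violates_narrow_scope` and `violates_wide_scope` are hypothetical), not the authors’ analysis, and real legal reasoning tools use far richer logics.

```python
from dataclasses import dataclass


@dataclass
class Facts:
    handed_a_on_jan1: bool
    handed_b_on_jan2: bool
    stopped_on_jan1: bool
    stopped_on_jan2: bool


def violates_narrow_scope(f: Facts) -> bool:
    """Assumed prosecution reading: the exception applies to each form
    separately, and only on the day that form is due."""
    breach_a = not f.handed_a_on_jan1 and not f.stopped_on_jan1
    breach_b = not f.handed_b_on_jan2 and not f.stopped_on_jan2
    return breach_a or breach_b


def violates_wide_scope(f: Facts) -> bool:
    """Assumed defence reading: the exception applies to the obligation
    as a whole, so being stopped on either day lifts it entirely."""
    fulfilled = f.handed_a_on_jan1 and f.handed_b_on_jan2
    stopped = f.stopped_on_jan1 or f.stopped_on_jan2
    return not fulfilled and not stopped


# John handed in neither form and was under arrest on 2nd January only.
john = Facts(
    handed_a_on_jan1=False,
    handed_b_on_jan2=False,
    stopped_on_jan1=False,
    stopped_on_jan2=True,
)

print("Narrow scope, guilty:", violates_narrow_scope(john))
print("Wide scope, guilty:", violates_wide_scope(john))
```

The same facts give opposite verdicts under the two readings, which is exactly the kind of ambiguity a drafting tool could flag before the text becomes law.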

Figuring out legal documents might seem very different to what Watson was coping with. But there is an important link: both involve understanding natural language (normal everyday language) for something, be it computer, robot, or software, to do something specific. Analysing contracts is different because the knowledge required involves reasoning. So we are trying to wed recent advances in Human Language Technology with advances in formal logical reasoning.

Illustration by Sonya Hallett

Contract drafting can be supported in many ways, from a simple cross-referencing facility, enabling an author to identify links between a contract and existing laws, to identifying conflicts within the legal text. Since contracts are written in a natural language, linguistic analysis is vital to properly analyse a text. For example, a clause about keeping dogs in a rent contract would need a cross-reference to legislation about pet ownership.

We (the authors) are developing tools that integrate with word processors to help lawyers or notaries draft contracts. Results are presented as recommendations rather than automated changes, keeping the lawyer or notary in control.

 

Robots ’R’ Us

So far we have only discussed how language is analysed and produced. Of course, humans are not simply language-producing engines; a large amount of human communication involves body language. We use gestures to enhance communication — for example, to point to things or mime actions as we speak — and facial expressions to show emotions. Watson may be very clever indeed, but is still a disembodied voice. Imagine taking it home to meet the parents.


Robotics is forging strong links with Human Language Technology. Robots can provide bodies for disembodied sounds allowing them to communicate in a more human-like manner.

Robots have captured the public imagination since the beginning of science fiction. For example, Robby the Robot, from the 1956 film Forbidden Planet, refused to obey a human’s orders, a key plot element. He disobeyed because they conflicted with ‘the three laws of robotics’, as laid down by Isaac Asimov in 1942. These imaginary robots are not only human-shaped; they also think and even make value judgements.

Actual robots tend to be more mundane. Industry uses them to cut costs and improve reliability. For example, the Unimate Puma, which was designed in 1963, is a robotic arm used by General Motors to assemble cars.

The Unimate Puma 200

The Puma became popular because of its programmable memory, which allowed quick and cheap reconfiguration to handle different tasks. But the basic design could not adapt to unanticipated changes, which inevitably ended in failure. Current research is closing the gap between Robby and Puma.

Opinions may be divided on the exact nature of robots, but three main qualities define one: a robot has a physical body, is capable of complex, autonomous actions, and is able to communicate. Very roughly, advances in robotics push along these three highly intertwined axes.

At the UoM we are working on research that pushes forward all three, though it might take some time before we construct a Robby 2. We are developing languages for communicating with robots that are natural for humans to use, but are not as complex as natural languages like Maltese. Naturalness is a hard notion to pin down. But we can judge that one thing is more or less natural than another. For example, the language of logic is highly unnatural, while using a restricted form of Maltese would be more natural. It could be restricted in its vocabulary and grammar to make it easier for a robot to handle.

Take the language of a Lego Mindstorms EV3 robot and imagine a three-instruction program. The first instruction would be to start its motors, the second to wait until light intensity drops to a specific amount, and the third to stop. The reference to light intensity is not a natural way to communicate information to a robot. When we talk to people, we are not expected to understand how our spoken words relate to their hardware. What the program is really telling the robot is: move forward until you reach a black line. Unlike the literal translation, this more natural version employs concepts at a much higher level and hence is accessible to anybody with a grasp of English.
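The three-instruction program can be sketched in a few lines of Python. This is only an illustration run against a simulated robot: the class `SimulatedRobot`, its methods, and the light threshold of 20 are all hypothetical and do not correspond to the real EV3 programming interface.

```python
class SimulatedRobot:
    """Hypothetical stand-in for a robot with motors and a light sensor."""

    def __init__(self, light_readings):
        # Pre-recorded sensor values, brightest floor first (assumed data).
        self._readings = iter(light_readings)
        self.motors_on = False
        self.readings_taken = 0

    def start_motors(self):
        self.motors_on = True

    def stop_motors(self):
        self.motors_on = False

    def read_light(self):
        """Return the next simulated light intensity reading."""
        self.readings_taken += 1
        return next(self._readings)


def move_forward_until_black_line(robot, threshold=20):
    """The low-level program behind 'move forward until you reach a black line'."""
    robot.start_motors()                   # instruction 1: start the motors
    while robot.read_light() > threshold:  # instruction 2: wait for the light to drop
        pass
    robot.stop_motors()                    # instruction 3: stop


robot = SimulatedRobot([80, 78, 75, 12])  # the last reading is the black line
move_forward_until_black_line(robot)
print(robot.motors_on, robot.readings_taken)
```

Note how the function name carries the natural, high-level meaning while its body talks only about motors and light levels.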

The first step is to develop programs that translate commands spoken by people into underlying machine instructions understood by robots. These commands will typically describe complex physical actions that are carried out in physical space. Robots need to be equipped with the linguistic abilities necessary to understand these commands, so that we can tell a robot something like ‘when you reach the door near the table go through it’.
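As a rough illustration of such a translation step, the sketch below maps one sentence pattern of a restricted, English-like command language onto low-level instructions. The pattern, the instruction names (`START_MOTORS`, `WAIT_UNTIL`, `STOP_MOTORS`) and the light thresholds are assumptions invented for this example; a real system would need proper speech recognition and a much larger grammar.

```python
import re

# Assumed mapping from line colour to a light-sensor condition.
COLOUR_CONDITIONS = {
    "black": ("<", 20),
    "white": (">", 80),
}

# The single sentence pattern accepted by this toy restricted language.
PATTERN = re.compile(r"^move forward until you reach a (?P<colour>\w+) line$")


def translate(command):
    """Translate a restricted-language command into a list of instructions."""
    match = PATTERN.match(command.strip().lower())
    if match is None:
        raise ValueError(f"Command outside the restricted language: {command!r}")
    colour = match.group("colour")
    if colour not in COLOUR_CONDITIONS:
        raise ValueError(f"Unknown line colour: {colour!r}")
    operator, value = COLOUR_CONDITIONS[colour]
    return [
        "START_MOTORS",
        f"WAIT_UNTIL light {operator} {value}",
        "STOP_MOTORS",
    ]


for instruction in translate("Move forward until you reach a black line"):
    print(instruction)
```

Restricting the vocabulary and grammar in this way is what keeps the translation tractable: anything outside the pattern is rejected rather than guessed at.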

To develop a robot that can understand this command, a team with a diverse skillset is needed. Language, translation, the robot’s design and movement, and AI (Artificial Intelligence) all need to work together. The robot must turn language into action. It must know that it needs to go through the door, not through the table, and that it should first perceive the door and then move through it. A problem arises if the door is closed, so the robot must know what a door is used for, how to open and close it, and what the consequences are. For this it needs reasoning ability and the necessary physical coordination. Opening a door might seem simple, but it involves complex hand movements and just the right grip. Robots need to achieve complex behaviours and movements to operate in the real world.

The point is that a robot that can understand these commands is very different to the Puma. To build it we must first solve the problem of understanding the part of natural language dealing with spatially located tasks. In so doing the robot becomes a little bit more human.

A longer-term aim is to engage the robot in two-way conversation and have it report on its observations, as Princess Leia did with R2-D2 in Star Wars, if R2-D2 could speak.

Lego Mindstorms EV3 brick

Language for the World

Human Language Technologies are already changing the world. From automated announcements at airports, to smartphones that can speak back to us, to automatic translation on demand, they help humans interact with machines and with each other. But the revolution has only just begun. We are beginning to see programs that link language with reasoning, and as robots become mentally and physically more adept the need to talk with them as partners will become ever more urgent. There are still a lot of hurdles to overcome.

To make the right advances, language experts will need to work with engineers and ICT experts. Then, having won another million bucks on a TV show, a future Watson will get up, shake the host’s hand, and maybe give a cheeky wink to the camera.

Will robots take over the world?

Unlikely, for the next 100 years. Academics and sci-fi writers take three rough approaches. We will become one with the bots by integrating computers into our body achieving the next stage of evolution. Or, robots will become so powerful so quickly that we’ll become their slaves, helpless to stop them — think the Matrix. Or, robots have certain technological hurdles that will take ages to overcome.

Let’s analyse those hurdles. Computing power: no problem. Manufacturing expense: no problem. Artificial intelligence: could take decades, but we are already mapping and replicating the human brain through computers. Energy: very difficult to power such energy-hungry devices in a mobile way; battery or portable energy generation has a long way to go. The desire to enslave humanity: would require Asimov’s trick or a mad computer scientist to programme it into the bot’s code. Conclusion: unlikely, sleep easy tonight.

Cybersexuality

Relationships have changed hand in hand with society. More couples are living far apart from each other. Marc Buhagiar speaks to Mary Ann Borg Cunen to explore how technology can lend a hand. Illustrations by Sonya Hallett.


Maltish or Engtese

‘Stick to one language!’ was the old maxim. Otherwise, you’ll risk confusing your kids and they will never learn to speak properly. Research by Prof. Helen Grech and her team shows that this is not true: bilinguals usually do better. Teaching your child two languages at a go might delay them initially but helps them in the long run. Words by Dr Edward Duca.


Cockneys vs Zombies — Film Review

Film Review_NT

At a site in East London, two construction workers inadvertently unearth the tomb belonging to the late King Charles II. Upon entering the crypt, they are assaulted, bitten and unkilled by former plague victims. Meanwhile, brothers Terry (Rasmus Hardiker) and Andy (Harry Treadaway), with their cousin Katy (Michelle Ryan), are planning a bank heist. The trio concoct this heinousness with a noble intent: saving their grandad’s (Alan Ford) retirement home from being demolished by heartless property developers. But of course, everything goes pear-shaped when the entire neighbourhood is invaded by hordes of the undead.

Cockneys and zombies: that’s what the title promises and that’s exactly what it delivers. Given the self-consciously schlocky title, you would expect a crudely-made, amateurish production, the likes of which litter the internet. The truth is, thankfully, very different. Cockneys has quite a high production value. It’s not World War Z but footage of London enfolded in chaos and mayhem is rendered in good quality CG, as are the close-up shots of carnage.

Still, one problem with comedy zombie flicks is that they will forever be in the shadow of Edgar Wright’s masterful Shaun of the Dead (2004). Shaun was a perfect storm of comedy, horror, excellent production, inspired casting, and fortuitous timing. Just as everybody was trying to get his/her head around the seemingly dubious merits and immense popularity of torture porn horror films (Saw and The Passion of the Christ were both released in 2004), in waltzed Messrs. Wright, (Simon) Pegg and (Nick) Frost who made everybody’s sides split with laughter.

Luckily, even though Cockneys vs Zombies is nowhere near as brilliant as Shaun, it still can hold its head high. Director Matthias Hoene and writers James Moran (Severance, 2005) and Lucas Roche touch upon, but don’t expand much on, the zombie-as-metaphor angle. They just want to play it for laughs and get more hits than misses. The scene in which poor old Hamish (Richard Briers) is being chased by the notoriously slow-moving zombies is pure gold and West Ham United supporters can put their mind at rest that, even after death, the feud with Millwall still rages on. In an inspired scene, we are at last shown that even infants are not immune to a zombie infestation.

Cockneys is no (early) George A. Romero and does not aspire to be. It just wants you to relax, pop some corn, sip on soda, and enjoy a zombie-tour around the streets of East London.

Future-Safe Malta

Words by Prof. Saviour Formosa
“Extreme weather leaves Mediterranean countries picking up the pieces. Egypt and Lebanon were the hardest hit with over 1.2 million people displaced overnight. Malta didn’t fare much better. The authorities have reported over 2,300 dead or missing, thousands injured and 74,000 persons displaced. Power cuts have been reported all over the island after Turbine Two tripped at the Delimara Power Station. Enemalta have not replied. The islands have taken a major blow to their infrastructure. Debris has been reported 1 km away from the coasts. The AFM and emergency responses were immediately dispatched and are starting to clear arterial roads. Insurance companies are still counting the costs. Valletta, Floriana and parts of Isla were protected from the storm surge by the centuries-old Knights’ fortifications. The following localities have been affected: Birgu, Bormia, Kalkara, Marsa, Gzira, Msida, Pietà, San Giljan, Sliema, Ta’Xbiex, Xghajra, Birzebbuga, Marsascala, Marsaxlokk, Xlendi and Marsalforn.”

The above cutout could become reality if a Category 3 storm lashes Malta with 178 to 208 km per hour winds. The chances are minimal but too probable to ignore, since in 1995 a similar storm formed close to the Maltese Islands followed by others in 1996, 2006, and 2011.  Below are two scenarios that compare Malta as it currently stands against an island with a solid disaster management plan.

 [ SCENARIO 1 – AN UNPREPARED ISLAND]

The emergency forces have been inundated with calls for help and have few plans to operate a workable rescue effort. Key personnel were lost at home or while rushing to the scene, since the infrastructure has been knocked out, paralysing the island.  Power surges or power cuts have caused fires all over the Islands creating an apocalyptic scenario. With the storm still raging, the lack of a back-end ICT network has rendered communication near impossible.

 [ SCENARIO 2 – THE IDEAL SCENARIO]

A fleet of small aerial drones is monitoring the disaster. The authorities are using them to identify the hardest hit areas and map out corridors that allow access on the ground. Emergency vehicles are being deployed safely. Services will be redeployed after safety assessments and clearing of the main infrastructure. Paramedics, NGO rescue teams, and armed forces help move people to safer grounds and carry out rescue operations. Community buildings on higher ground are converted into temporary shelters. In turn, decision-makers are kept informed using an Emergency Room for effective relief.

Etna

The ancients saw volcanoes as the wrath of their mighty gods. Volcanoes have been blamed for clearing whole towns, even planet-wide extinctions. A local team based in Gozo has just found out if Etna affects the Maltese Islands. Words by Dr Edward Duca.


Maniac: Two films. Two reviewers.

Film Review: Noel and Krista

Noel: I recently saw William Lustig’s Maniac (1980) and Franck Khalfoun’s 2012 remake back-to-back. The latter is rather faithful to the original’s spirit. Frank Zito (played by Joe Spinell [1980] and Elijah Wood [2012]) is more of a textbook psychopath, and more brutal in Khalfoun’s film; but still remains faithful to its source.

Krista: I thought the first’s ‘rawness’ was more brutal. The second had a polished style despite the first person perspective. The 1980 film was grittier.


N: True. The remake looks slicker. For instance, the murder scenes are meticulously choreographed, operatic even. Lustig’s film is truer to life, scarier too, because in his lucid moments the killer acts normal.

K: The first person perspective didn’t convince me. Eventually I even forgot about it till it suddenly jumped to the fore again. It was inconsistent and uneasy without being very unsettling. It reminded me of Peeping Tom (1960), which made better use of the first person perspective.

N: Agree, but it didn’t distract me.

K: I hoped it would be more ‘distracting’. It would have been preferable if the first person perspective had been more defamiliarising, puncturing the viewer’s comfort zone — rather than just being ‘naturalised’.

N: The subjective point of view didn’t help me to get closer to the killer. I only saw this technique being used effectively in Enter the Void (2009). I find it a bit distracting because it can turn into a weird game (Spot the reflection in the mirror!). That said, in Maniac they were well aware of this and tried to have fun with it. The moments when the film veers away from the first person perspective, it sort of clicks into another gear.

K: Good point about the first person perspective being the default here, and the veering away from it becoming a ‘moment’ in itself. It calls to mind Bret Easton Ellis’ book American Psycho (1991).

N: I liked the fact that the remake created a deeper relationship between Frank and the mannequins. They are more than just a manifestation of his childhood trauma — a dysfunctional, promiscuous mother. The restoration of the mannequins is a genuine labour of love which underscores the affection that he nurtures towards the photographer (Anna, played by Nora Arnezeder). She is a mediocre artist unable to hold her camera properly. Frank is the real deal, getting his hands dirty.

K: That’s a well-noted criticism of the photographer. In the first movie, I couldn’t really ‘judge’ whether she was a good artist or not: there wasn’t a focus on her art; instead they showed the world she moves around in, which made me think she was a budding artist. In the second one she’s portrayed as an underwhelming artist. She tries to use the mannequins to underpin her art and to somehow appropriate his by projecting an image of her face onto their blank heads.

N: Besides Anna, two other victims in Khalfoun’s film are a dancer and an agent. In both murders the director abandons the first person perspective, suggesting that either Frank is seeing his actions as a form of art, or that we, the audience, should see Frank himself as a work of art.

K: Yes, perhaps even perverting the sublime into the brutally grotesque. Yet ‘getting his hands dirty’ is counterpoised by the film’s stylishness.

N: So which is better?

K: Both films ultimately do different things. This is down to stylistic differences; enjoyably, the remake doesn’t try to ‘replace’ Lustig’s film.

N: Totally agree. They’re like brothers sharing one (hell of a disturbed) mother, similar yet so different.

 

https://www.youtube.com/watch?v=4umIfrP_vMk

Does Alcohol kill brain cells?

This myth is HUGE! Urban legend says that drinking kills cells, some even say: ‘three beers kill 10,000 brain cells.’ Thankfully, they are wrong.

In microbiology labs, a 70% alcohol 30% water mix is used to clean surfaces pretty efficiently. It seems our neurons are made of sturdier stuff.

Alcohol does affect brain cells. Everyone knows that and it isn’t pretty. Alcohol can damage dendrites, which are delicate neural extensions that usually convey signals to other neurons. Damaging them prevents information travelling from one neuron to another — a problem. Luckily, the damage isn’t permanent.