An A.I. researcher watches innovation and ethics collide (Columbia)

Written for the M.A. program in science reporting at Columbia Journalism School. This is a short profile of a computer scientist working at the intersection of natural language processing and robotics, two of the most challenging specialties in artificial intelligence.

Imagine that a delivery worker returns to the deli with a fresh ding on its side from an impatient teenager on a skateboard. The job took longer than allotted: the hot meatball sandwich went cold on the way, and the customer called to complain. As soon as the worker pulls in from the sidewalk, the deli owner shouts across a wall of snacks, demanding to know what happened. The worker scans its memory for every visual, wheel turn, and joint swivel. It reports back to the owner, “I was knocked onto my side for 38 minutes. Right back wheel needs repair.” As it turns out, the delivery worker is a robot.

That a robot should be able to summarize what happened to it is a new idea in artificial intelligence. Right now, if you reviewed a robot’s computational history, or the algorithms that decide its actions, the data would be an uninterpretable string of numbers. But one computer scientist is trying to build the technical foundation for robots to recall the day’s events using words, or what is known in artificial intelligence research as natural language processing. It may prove essential in a world in which humans and robots share the workforce. “Humans often tell each other what we’ve done,” writes Chad DeChant in a preprint paper. “We talk about the distant past as well as things that happened just a few minutes ago. Robots should be learning to do the same.” 

In his New York City apartment, DeChant, a literary type with a thin nose and big, almond-shaped eyes, had just put aside the code he was working on. He is in the sixth year of his PhD in the Computer Science Department at Columbia University, a short window of time in which the capabilities of artificial intelligence have expanded at a staggering pace. “What worries me is that we are developing A.I. capabilities so fast,” DeChant said. “And we are totally unprepared for dealing with it on so many levels.”

DeChant is not only a technologist but an ethicist. He studied both philosophy and international policy before his pivot to numbers. He made the change ten years ago, in a moment of reflection over how he could make a difference in the world. “I thought the biggest issues of the 21st century would be climate change, the rise of China, and A.I.,” DeChant said. “And A.I. would dwarf the other two.” With his sights set, he spent the next few years teaching himself how to code, so that he could speak intelligently about computation. To his surprise, he excelled at it. Encouraged by the feedback, DeChant bootstrapped his way into a computer science doctoral program without much of a plan. “The whole time I thought I wasn’t going to be good enough,” he said. “Because at heart, I’m a wannabe novelist, or something.” (In moments of frustration with his PhD, DeChant still daydreams of abandoning code for fiction.)

Being a computer scientist who loves words, however, has proven to be an asset in the field. DeChant studies the intersection of natural language and robotics, one of artificial intelligence’s most challenging new directions. Computer science is a splintered discipline, and until very recently, the two specialties sat on opposite sides of the field. Robotics is mechanical tinkering; natural language processing is abstract encoding. On top of this, the two areas of research are considered to be some of the hardest puzzles to solve, dissuading computer scientists from combining them. “Problem times problem equals problem squared,” DeChant said. 

Yet the emergence of simulation programs has helped researchers overcome some of these hurdles. Computer scientists can now run tests on digital robots in virtual environments, enabling investigators like DeChant to contribute to the field without a costly physical robot. Researchers like him, who tend to love the humanities as much as the hard sciences, have also advanced a novel idea: integrating language into robotics can drive progress in robotics itself.

Early in his research career, DeChant had to work on a robotic hand. He thought it would be boring. Figuring out how to make a robot pick up a baseball seemed like a mundane task compared to his interest in global techno-ethics. On starting the project, however, he found it to be a deeply complex problem that bordered on the philosophical. Humans learn how to play chess or drive a car, and in turn we can instruct machines to do the same, using logic similar to how we were taught. But there are no instructions for how to grab a novel, turn its pages, and underline the most beautiful quotes. That behavior is innate to humankind. “To get a machine to do things that are natural to us is really hard,” DeChant said, “because we can’t think about all the steps that lead to it. We have no idea how to tell a child to learn how to pick something up. We just expect them to follow us.”

The same challenge applies to robot action summarization, the term DeChant has coined for a computer program’s ability to recount what it has done. What leads a human to log events in memory, recall the sequence later, and explain it in words? In March, DeChant and his thesis advisor, Daniel Bauer of Columbia University, submitted a paper on the topic, titled “Summarizing a Virtual Robot’s Past Actions in Natural Language.” To their knowledge, no one else has done this. Much work exists on how robots can follow natural language instructions. But the complement of that, robots recalling what they have carried out, has been missing. The research gap may exist because few robots work independently of human supervision. Today’s robots are mostly automated factory arms or disc-shaped vacuum cleaners confined to the living room floor. But in the near future that will change. So will the nature of human-robot interaction.

“I think that in the future,” DeChant told me, “all sophisticated robots and machine learning algorithms are going to have to be able to answer to humans and give accurate descriptions of what they've done.” The current barrier is that robots do not think or keep records in natural language. They need to be trained to recast their numerical data into prose. In his research, DeChant found that this is possible, although the robot’s action summaries were not 100 percent accurate. Still, it is a stepping stone toward explainability in artificial intelligence, which will be essential in future scenarios where a robot’s words are used in the arbitration of truth.

In Northern California, on the opposite coast from DeChant, researchers at Google have also started to integrate natural language and robotics. They have a sizable team, likely a colossal budget, and more than a dozen physical robots that roll through real-world rooms. (DeChant, on the other hand, works alone from home.) In April, the Google researchers submitted a paper that made a breakthrough in artificial intelligence. A big question in the field is how to bridge the gap between predictive models and reality. Language models like GPT-3, a text generator that produces human-like prose, choose words based on what is statistically most likely to appear next, not because they understand the words’ meaning. Robots face a similar issue. When placed in real-world environments, they are limited to predetermined actions coded into their datasets. For example, if a robot is given only the word “refrigerator,” then as far as it is concerned, no other object exists in the world. By contrast, a human may come up with 50 different ways to act based on what they see around them.

For the first time, the researchers at Google trained a machine to be more like a human when choosing how to act on the fly. In their study, a natural language model, acting somewhat like a brain, was given a multitude of possible tasks: “find a sponge,” “drive to the store,” “pick up the coffee,” and so on. A white, tubular robot with a camera and a swiveling arm served as the language model’s “eyes and hands,” gathering contextual data on its surroundings. Combining information from both natural language and robotics, the machine was able to improvise actions in a real-world setting. In one demonstration, it figured out how to bring one of its programmers a bottle of water. The machine received a pat on its head.

From a technical standpoint, DeChant considers this research to be incredible. “It’s just the coolest paper,” he said. Like him, the Google researchers were training a robot to use natural language in its reasoning process. Now that it understood human language, it could see our world somewhat as we do, and act somewhat like us. Yet from an ethical perspective, DeChant’s excitement turned to unease. There is almost no oversight or regulation in artificial intelligence. In theory, nothing stops a person from adding unruly commands to a dataset. As DeChant put it, a bad actor could simply add, “I want to kill someone. How can you help me?”

DeChant decided to go into research because it was the best way, in a rapidly evolving field, to know what was happening on the cutting edge. Yet at times, the rate of technical innovation can seem equal parts dazzling and dangerous. “The field of A.I. is so big now. I don't know if any one person, or even one group of people, could realistically control it,” DeChant said. “But there is still a community that wants to steer it in a safe direction, and I hope I can be part of that.”