AI hype in one picture.

Billion-dollar investments. Top-tier scientists. Flo Rida. NeurIPS 2017 was a confusing, absurd, and inspirational roller coaster ride. Let’s try to understand what happened.

Understanding the hype

Hype. Everyone agrees that the hype for NeurIPS 2017 was incredible. Just look at the plot above: the entire conference sold out during early registration. The conference was packed. Big companies were desperately recruiting dewy-eyed and geeky-dispositioned PhD students. There were countless afterparties hosted at luxury venues by Intel, DeepMind, OpenAI, Uber, Facebook, Borealis, CapitalOne, Apple, etc. Intel even got Flo Rida to help them unveil a new chip. You mean the rapper Flo Rida?! Yes. Him. Although I can’t confirm because the bouncer wouldn’t let me in :/

He likes Apple Bottom jeans, boots with the fur...and apparently GPUs.

Money. All this sudden hype stems from the fact that large companies are placing big bets on AI. Many of them (Apple, Google, Microsoft, Intel, Uber, Facebook, Amazon) have their own research labs. Scientists at these labs publish papers, attend conferences, and sometimes advise younger researchers, just as they would in academia. The main difference is that their research is generally focused on projects that these companies find useful. I don’t want to spend too much time talking about money and AI because this NYT article does a better job. That said, I’ll repost a few of its most interesting statistics:

  1. Fewer than 10,000 people in the world have the skills necessary to tackle serious artificial intelligence research, according to Element AI, an independent lab in Montreal.
  2. Last year, Google DeepMind’s “staff costs” were $138 million for 400 employees. That’s $345,000 per employee. These salaries are not uncommon in industrial research labs, even for students fresh out of their PhDs.
  3. Top academic talent has moved into the private sector. Examples: Uber hired 40 people from Carnegie Mellon’s groundbreaking AI program in 2015 to work on its self-driving-car project. Four of the best-known academic AI researchers have left or taken leave from their professorships at Stanford.
Taken from the Economist.

Impact on research. In years past, one of my advisors explained, NeurIPS was a mellow conference. It was generally aimed at professors and their graduate students. Doing a quick scan over the NeurIPS 2017 accepted papers, I found that the organization with the most affiliated authors was Google/DeepMind/Brain (210), followed by Carnegie Mellon (108), MIT (93), Stanford (81), Berkeley (81), and Microsoft (70) [1]. So the majority of NeurIPS attendees are still academic, but industry participation (read: Google) is growing.

Taken from @MLpuppy.

But which researchers are setting trends in the field, and which ones are making relatively small contributions? A quick review of the conference schedule shows that 5 of the 7 invited speakers and 10 of the 19 symposium organizers had industry affiliations. This suggests that industry-funded researchers “set the curve” at NeurIPS.

Why can’t more research be done in academia, where the interests of the community are better served, rather than the interests of a few CEOs? “But they had free fidget spinners…”

After day one, I did my best to look beyond the hype and find examples of people doing good science. My initial pessimism faded and I discovered some interesting themes.

Are we alchemists? Researcher Ali Rahimi received a Test of Time award for his contributions to the field back in 2007-08. He used his acceptance speech as an opportunity to make a strong, controversial claim about the state of machine learning: “it is the new alchemy.” Ali’s point is that we spend too much of our time trying to improve the performance of AI on various datasets and too little time trying to understand why things go right or wrong.

“We’re building systems that govern healthcare and mediate our civil dialogue. We influence elections. I would like to live in a society whose systems are built on top of verifiable, rigorous, thorough knowledge, and not on alchemy.”

An example of "alchemy" in AI, taken from Ali's NeurIPS keynote.

Several hours later, Yann LeCun posted a strong criticism of Ali’s speech. This debate soon diffused into countless lunchtime and hallway conversations. Whether people sided with Yann or Ali on this, they seemed grateful for a chance to discuss the issue. The machine learning community is results-driven and there have been few forums for these debates until now.

Metalearning. Pieter Abbeel and friends are pushing metalearning. Since he is one of the world’s most respected researchers, this was a huge theme at NeurIPS. The idea of metalearning is to teach a computer how to learn. Instead of teaching a computer how to solve a maze, you would teach a computer to teach itself how to solve a maze. Yes, this is more complicated. The idea is that by “learning to learn,” you get AIs that generalize to new situations effectively.

Metalearning for efficient maze navigation.

I think everyone agrees that metalearning is desirable. The real question is how to make it work. Even Pieter was unclear on this point, although he presented a wealth of recent ideas. I especially liked his paper presenting an agent that could explore a maze until it found a target. When dropped back into the maze, the agent used its past experience to navigate quickly to the target.
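To make “learning to learn” a bit more concrete, here is a minimal toy sketch. It is not the maze agent from Pieter’s talk, and it is not taken from any of the papers presented. It assumes a made-up family of tasks (fit `y = slope * x`, with a different slope per task) and uses a simple first-order update in which the shared starting point is nudged toward each task’s adapted solution. All names (`XS`, `adapt`, `meta_train`) and all hyperparameters are hypothetical choices for illustration.

```python
import random

# Toy task family (an assumption for illustration): fit y = slope * x,
# where each task has its own slope drawn from [2.5, 3.5].
XS = [-1.0, -0.5, 0.5, 1.0]


def loss_and_grad(w, slope):
    """Mean squared error of the model y = w * x on one task, and d(loss)/dw."""
    errors = [(w - slope) * x for x in XS]
    loss = sum(e * e for e in errors) / len(XS)
    grad = sum(2 * e * x for e, x in zip(errors, XS)) / len(XS)
    return loss, grad


def adapt(w, slope, steps=3, lr=0.1):
    """Inner loop: ordinary gradient descent on a single task."""
    for _ in range(steps):
        _, grad = loss_and_grad(w, slope)
        w -= lr * grad
    return w


def meta_train(num_tasks=200, meta_lr=0.1, seed=0):
    """Outer loop: nudge the shared starting point toward each adapted solution."""
    rng = random.Random(seed)
    w_meta = 0.0
    for _ in range(num_tasks):
        slope = rng.uniform(2.5, 3.5)      # sample a new task
        w_task = adapt(w_meta, slope)      # learn that task from the shared start
        w_meta += meta_lr * (w_task - w_meta)
    return w_meta


# Compare one gradient step on an unseen task from the learned start
# versus one step from an uninformed start of 0.0.
w_meta = meta_train()
new_slope = 3.2
loss_meta, _ = loss_and_grad(adapt(w_meta, new_slope, steps=1), new_slope)
loss_scratch, _ = loss_and_grad(adapt(0.0, new_slope, steps=1), new_slope)
print(f"learned starting point: {w_meta:.2f}")
print(f"loss after one step from learned start: {loss_meta:.4f}")
print(f"loss after one step from scratch: {loss_scratch:.4f}")
```

The inner loop is plain learning. The outer loop is the “meta” part: it never solves any single task, it only learns a starting point from which new tasks in the same family can be solved in very few steps.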

Deep reinforcement learning (Deep RL). The young and ambitious field of deep reinforcement learning continues to deliver great results. Earlier this year, Google DeepMind published a fourth Nature paper. The team described how to teach an algorithm, which they call AlphaGo Zero, to play Go at a superhuman level, starting from zero human knowledge. During NeurIPS they released an updated version which plays Go, Shogi, and Chess at a dominant level.

AlphaGo Zero rediscovering 3000 years of Go strategy.

The problem with deep RL is that it still learns far too slowly. For example, it can outperform humans at most Atari games…but whereas a human needs a few minutes to learn the game, the computer needs to play for hundreds of hours (see slide 15). Talks, posters, and presentations tended to focus on how to make deep RL learn tasks of greater complexity, more quickly. Popular ideas included hierarchical RL, metalearning, and various unsupervised auxiliary tasks.

Interpretability. There was a big symposium (3000+ people) and two workshops about this. The interpretability issue relates to the fact that we often want to get machine learning systems to explain themselves. Consider applications where human well-being is involved: self-driving cars, medical applications, and financial decisions. In these situations, we want humans to trust the algorithms. The best way to do this is to make the computer explain its decision-making process in a way that humans understand.

Between Ali’s keynote, several new government grants aimed at interpretability, and a push among companies to use AI to solve real-world problems, interpretability felt like a central issue this year. I am happy about this because my research – the reason I attended NeurIPS – is centered around interpretability. Here I am giving a talk about it:

Disentangled representations. The power of deep learning is that it can transform features at the pixel level, such as color and shape, into more complex ones such as “ears”, “wheels”, or “leaves”. Clearly, it’s easier to explain what is going on in a picture using the latter. The problem is that these concepts get mixed together like a plate of spaghetti. As Yoshua Bengio said (hungrily), “If we can take that spaghetti and disentangle it, that would be very nice.” So we’d like algorithms that discover high-level features like “ears,” “wheels,” or “leaves” that are separable. We’d also like to do this in an unsupervised manner.

I attended a workshop centered around this idea. People whose work and ideas I found interesting included Yoshua Bengio (Montreal), Stefano Soatto (UCLA), Josh Tenenbaum (MIT), and DeepMind’s Irina Higgins, Peter Battaglia, David Pfau, and Tejas Kulkarni. This theme was not big at NeurIPS, but I think it is promising.

AI and society

I was surprised to find that some of my conversations were not about science at all. They were about the relationship between AI and society. Many of these occurred at the fascinating but sparsely attended Kinds of Intelligence symposium. This symposium brought together influential thinkers from psychology (Alison Gopnik), neuroscience (Gary Marcus, Lucia Jacobs), deep learning (Demis Hassabis, Zoubin Ghahramani), privacy (Cynthia Dwork), and public policy (David Runciman).

People and perspectives. The Kinds of Intelligence symposium made me think critically about the ways AI will affect society. I ended up having some fascinating conversations on the topic. Here is a brief list of the most striking people and perspectives:

  1. Taras (grad student at KTH) is worried about AI making the poor poorer and the rich richer. Based on how corporate NeurIPS 2017 was, I think this is valid. Far too much of current AI research is aimed at finding better ways to sell things.

  2. Kyle Cranmer (NYU) is leading the effort to bring AI to the natural sciences. Applications include particle track reconstruction (particle physics), tracking supermassive black hole emissions (astronomy), analysis of LIGO data (gravitational waves), and solving the many-body problem (quantum mechanics). These are examples of basic research which can help society as a whole rather than a single company.

  3. Rich Caruana (Microsoft Research) is trying to prevent bias in new AI systems. An example of this bias is the COMPAS system, which was more likely to recommend white inmates for parole than black ones. We can’t let this happen in the future.

  4. Sam Greydanus (Me! Working for the DARPA Explainable AI Project) has decided that if we are going to introduce AI to society, we need to be able to explain its decisions. He introduced a new way of doing this and showed how it can catch AIs that are “cheating” at certain tasks.

  5. Dhruv Batra (Facebook AI Research) is concerned about misreporting of AI in the media. A series of fake news articles about his work recently caused massive – and totally unfounded – hysteria.

  6. Alonso (my Uber driver) was mostly concerned that, “Robots are gonna take over the world!” He should talk to Dhruv.

  7. Nenad (DeepMind Health) pointed to ways AI will improve health care. Examples include personalized medicine, better diagnostic tools, and accelerated drug discovery.

  8. Ishmael (a gorilla in a book I'm reading) would probably say, “Humans have GOT to stop worrying about themselves and start thinking about how their actions affect the rest of the planet. How will AI help or hurt the environment?”

  9. Peter Battaglia (DeepMind) was concerned about how AI will reduce privacy. Corporations and governments already own a massive amount of our personal information but they don’t have the means to piece it together into a comprehensive story. AI will change that.

  10. David Runciman (Cambridge) is interested in the relationship between AI and Artificial Agents (AAs). These AAs are institutions such as states, corporations, or markets. They wield a great deal of influence over our world but have motives and priorities that are different from those of humans. How will AI and AA interact?!

  11. Jonnie Penn (AI historian at Cambridge) reminded me that AI will solve some problems and create new ones. What’s cool is that we get to determine how the story unfolds. This is a big responsibility for us researchers. It means taking the time to communicate our work in a way the public can understand. It means thinking carefully about how our work is changing society…and whether we are proud of these changes.

What can we do? Jonnie and I are organizing an informal group, AI for Good, aimed at addressing these issues. If you want to join the conversation, email me and I will send you an application. If you are a US citizen, you should also email your congress(wo)men. I did it and it only took five minutes.

  1. Note that some authors are featured on more than one paper and are thus counted more than once.