With over 500 highly influential citations, Professor Mausam of IIT-Delhi is one of India’s foremost AI researchers. After completing his computer science degree at IIT-Delhi in 2001, Mausam went on to do his Masters and Doctorate at the University of Washington, where he became one of its youngest research faculty members. In many ways, his work in the space of Open Information Extraction has influenced how the web works today.
Now, he’s focussed on a new solution that could help India get past the data scarcity problem troubling its artificial intelligence research. FactorDaily caught up with the 39-year-old professor to give you a peek into his work, especially at the intersection of machines, knowledge and language, and how it impacts systems around us. In this interview, he also talks about his approach to solving India’s data problem. Edited excerpts:
Can you trace your career trajectory?
Initially, I had a straightforward career trajectory. I went to IIT-Delhi to do computer science as a student and then applied to grad school. When I was finishing my PhD at the University of Washington, I was thinking about going to industrial research centres.
But in the long run, I wanted to come back to India and teach. It was very clear to me that I wanted to come back. So my mentors advised me that if I wanted to come back to teach, I should try to take on a faculty position.
As it happened, just by a stroke of luck, in 2007 a new centre called the Turing Center was starting up at the University of Washington, working at the intersection of machine learning and language. I didn’t have a language background; my area of research had been decision-making. But the person there said that they would rather have somebody who’s good but not in the area than somebody who is in the area but not very good. I was unusually lucky to get a research faculty position at the University of Washington.
For one, I hadn’t applied anywhere else for a faculty position. Second, hardly anybody stays on in their own university as a faculty member after a PhD. So it was an incredible honour, even though it was not a tenured position.
Of course, it is very difficult to become a faculty member in your own university: you have been a student there all along, so you don’t think of yourself as a faculty member, and neither do your fellow students. So it was a bit of a transition for a couple of years. But I started advising students there, and seniors mentored me in advising students, so over time I learned the ropes. Then it was time to come back to India. I joined IIT-Delhi as a faculty member in 2013.
How has your research interest evolved over time?
Most people are masters of something. Or specialists. I think of myself as broad and also shallow. I started working in my area of research as a PhD student, in the field of Markov Decision Processes (a theoretical framework for modelling complex decision-making problems).
And then I completely jumped a few steps to work in language and text. Initially, that was challenging, but over time I started learning about that area, since I was at UDub (University of Washington) and always collaborating with my adviser (Prof Dan Weld). I was also working with Professor Oren Etzioni.
Dan and I had a couple of students together and we got interested in crowdsourcing. Most of my work in my PhD had been theoretical, and I had always hoped to do more practical work; I wanted to see the value of MDPs and reinforcement learning in real-world applications. By some stroke of luck, we hit on the area of crowdsourcing, which was just becoming popular around 2006-2008. In 2009 we wrote our first paper. We realised there was a lot of value for MDPs in that field and that AI could really help.
When I was coming back to India, I had a fairly broad profile, which is unusual for a young faculty member.
At some point you realise that it’s not the area that particularly matters; it is the problem you want to solve. If the problem excites us, we will more or less be able to move into that area. I was lucky to gain that confidence early on.
After coming to India, I have slowly focused more of my efforts on what I call intelligent information systems (IIS): the intersection of machine learning, knowledge, and language. Question answering, chatbots, dialogue systems, populating knowledge bases from text, inferring new facts that have not been stated explicitly, and so forth are the areas I typically look at. I have some other projects going on in MDPs and machine learning as well, but most of my effort has been in IIS.
Can you break down how you apply MDP in AI and crowdsourcing?
Crowdsourcing is this idea that I potentially have a lot of people on demand whom I can use to achieve things I couldn’t have done earlier. A very famous application of this: suppose a blind person wants to know if there’s a bench in a park, how would they do it? Typically, they ask someone. What else could you do? You could use your phone, take a picture, and ask a person or ask AI. AI is not there yet for many of these questions. It can probably recognise a park bench, but it will find it difficult to tell how many calories are in a dish. So you can just ask a person online. That’s one kind of crowdsourcing. Machine-learning practitioners also use the crowd to label training data. The data is there, but labelling it requires manual effort, and researchers don’t have that time. But there are a lot of workers on platforms like Amazon’s Mechanical Turk or oDesk.
The challenge, however, is that these are random people who may or may not be skilled at the task you’re interested in. They may also be spammers. So how do we make sure that the final outcome we get from crowdsourcing is good? That became the single most fundamental challenge that a lot of AI researchers were trying to solve in the early 2010s. We had our set of ideas inspired by the MDP literature.
We said: I can always ask many people to do the same task. For example, if I ask you a yes-or-no question, you can say ‘yes’ but I may not trust you. So I can ask 10 people, and if they all say ‘yes,’ there’s a good chance that the answer is ‘yes.’ If, on the other hand, you say ‘yes,’ somebody else says ‘no,’ yet another person says ‘no,’ and so on, we can be pretty sure that we are confused.
How many people I should ask depends on the hardness of the task, how much I trust the next worker who comes along, and so on. So, can we create a computational model that automatically figures out whether I need to ask another person to get a more trustworthy answer? We modelled that in a Markov decision process framework. We performed experiments on Mechanical Turk and consistently found that the policies and decisions made by the AI were significantly superior to any human-devised policy we may have originally had. The machine was able to do much better by recognising how well individual workers perform, the hardness of a task, and the confidence in the answers. We were able to take further leaps in terms of finding the right tasks for a worker and figuring out how to help the people who define tasks create high-quality ones. That’s a space I have played in for a long time now. My last paper on crowdsourcing was last year. There are many people in the human-computer interaction field doing work in the space now. I feel like it’s time to do different things.
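To make the idea concrete, here is a minimal sketch of the stopping decision described above. It keeps a Bayesian posterior over the true answer to a yes/no question and asks one more simulated worker until either answer is trustworthy enough. The worker accuracy, the confidence threshold, and the greedy threshold policy itself are illustrative assumptions; the actual research used learned MDP policies rather than a fixed rule like this.

```python
import random

# A toy sketch (not the researchers' model): sequentially decide whether
# to ask another crowd worker about a yes/no question. Each worker is
# assumed to be independently correct with a known accuracy.

def ask_until_confident(true_answer, worker_accuracy=0.7,
                        threshold=0.95, max_workers=20):
    p_yes = 0.5  # uniform prior over the true answer
    for n in range(1, max_workers + 1):
        # Simulate one noisy worker vote.
        vote = true_answer if random.random() < worker_accuracy else not true_answer
        # Bayesian update: likelihood of this vote if the answer is yes / no.
        like_yes = worker_accuracy if vote else 1 - worker_accuracy
        like_no = (1 - worker_accuracy) if vote else worker_accuracy
        p_yes = like_yes * p_yes / (like_yes * p_yes + like_no * (1 - p_yes))
        # Stop asking once one answer is trustworthy enough.
        if p_yes > threshold or p_yes < 1 - threshold:
            return p_yes > 0.5, n
    return p_yes > 0.5, max_workers

answer, workers_used = ask_until_confident(True)
print(answer, workers_used)
```

Harder questions and noisier workers push the stopping point later; trading that extra cost against the gain in trustworthiness is exactly what the MDP policies optimise.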
What could be the impact of your paper from an application point of view?
We were one of the early players in the field. So impact happens at multiple levels. It could be about taking things into deployment or developing a field and having lots of researchers follow your work. So I think we’ve definitely had the impact of the latter type. We haven’t done explicit tech transfer but overall we’ve been able to influence the quality over time.
What are some of the industry collaborations you’ve worked on?
With the AI boom, there’s a lot of demand from startups and other companies to interact with professors. If I have a startup, I may have one data scientist and I may start analysing my data. But how do I get out if I get stuck? Typically, at that point you need somebody who understands the subject somewhat more deeply, and that is where a professor comes into play.
I have worked with a few of them. For instance, there was a dating platform trying to figure out who is the best person of the opposite gender for you. I advised a startup that was analysing tweets by converting them into a structured representation. I’ve worked with a platform that sells medicines to understand disease patterns, and another one trying to read resumes automatically. I’ve advised several such ventures.
You mentioned IIS, the intersection of machine learning, knowledge, and language. In today’s situation, which are the key areas that use it at scale?
Pretty much everything. Whenever you interact with a machine that draws on some kind of knowledge to answer you, there is an intelligent information system at work. Google has been the prime example: you give a simple query and it points you to documents. Even Google’s paradigms have changed over time, partly because of our efforts. Now if you ask what the height of Mt Everest is, it will give you the answer and not just the document that contains the answer. That’s because it is able to understand the question and has its own knowledge graph. So instead of satisfying your information needs by pointing you to documents, it satisfies them directly.
There are still many things Google won’t be able to answer. For example, it won’t be able to answer ‘What are all the sports teams based in Arizona?’ That is probably a conscious choice. The research is a little ahead of real-world use because we are not yet at 98-99% accuracy. But maybe 5-10 years from now, we’ll be at a much better place.
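As a toy illustration of the direct-answer paradigm, the sketch below answers single-fact questions from a hand-made triple store; the ‘knowledge graph’, its entries, and the fallback behaviour are all invented here and say nothing about how Google’s actual systems work.

```python
# A hand-made stand-in for a knowledge graph: facts keyed by (entity, relation).
kg = {
    ("Mount Everest", "height"): "8,849 m",
    ("Arizona Cardinals", "based in"): "Arizona",
}

def answer(entity, relation):
    # Single-fact lookups are easy. Aggregate queries like "all sports
    # teams based in Arizona" need a scan over the whole graph and complete
    # coverage, which is where direct answering starts to break down.
    return kg.get((entity, relation), "no direct answer; fall back to documents")

print(answer("Mount Everest", "height"))        # 8,849 m
print(answer("Mount Everest", "first ascent"))  # falls back to documents
```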
Now you have chatbots as well. For instance, if I tell a bot, ‘Send chrysanthemum flowers to this address,’ the agent first has to check whether it has them in stock and can deliver to the location in time. So it has to work with the specific knowledge in its database and understand what the user said. How do I create these chatbots automatically in a new domain without the need for too much annotation? That is something I have been thinking about. Basically, everything that starts from text and creates knowledge, answers questions, carries out dialogues, or summarises it.
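A single turn of such a bot might look like the hypothetical sketch below: parse the request into slots, check the business’s own database, and respond. The shop, its inventory, and the keyword-based slot filling are all made up; in a real system the parsing step is learned, which is exactly where the annotation bottleneck comes in.

```python
# Hypothetical flower-shop data (invented for this sketch).
inventory = {"chrysanthemum": 12, "rose": 0}
serviceable_pins = {"110016", "110017"}  # pin codes we can deliver to in time

def parse(utterance):
    # Naive slot filling by keyword matching; real systems learn this step.
    words = utterance.lower().split()
    flower = next((w for w in words if w in inventory), None)
    pincode = next((w for w in words if w.isdigit() and len(w) == 6), None)
    return flower, pincode

def handle_order(flower, pincode):
    # Ground the request against the bot's own knowledge before replying.
    if flower is None or inventory.get(flower, 0) == 0:
        return "Sorry, that flower is out of stock."
    if pincode not in serviceable_pins:
        return "Sorry, we cannot deliver there in time."
    return f"Order confirmed: {flower}s to {pincode}."

flower, pin = parse("Send chrysanthemum flowers to 110016")
print(handle_order(flower, pin))  # Order confirmed: chrysanthemums to 110016
```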
How do you mean Google’s paradigm has changed because of your efforts?
Traditionally, Google has done information retrieval: you issue a query and Google gives you documents. Information extraction means the system reads the text and extracts bits of information that can be stored as knowledge. Question answering means the system answers your query. The original paradigm at Google was to point you to places that have the answers. The modern paradigm is changing towards: you ask a question, I will try to answer it correctly, and sometimes I will also tell you how I found the answer. This paradigm shift, where the machine does more of the reading and understands what the text says, is new.
With Professor Oren Etzioni and others, I was involved in the open information extraction project. It refers to the paradigm where you read text and convert it to facts. For example, take a sentence like ‘After defeating the Lakers, Sonics are now the top dogs of the NBA.’ From this, a good open information extraction system will be able to extract that the Sonics defeated the Lakers, and things like they are now the top dogs. When someone asks who defeated the Lakers, I can use this information and give the answer right away. This method is called information extraction; when you do it in a vocabulary-agnostic way, without a fixed vocabulary, it’s called open information extraction. Oren Etzioni was the first person to push this research problem, and very soon I joined him in that project. When I came back to India, Oren was moving on to other things, so I brought the project here; now the best open IE system comes out of our lab at IIT-Delhi, and the software has been downloaded widely and is getting a lot of traction. Back in 2010-11, Google wasn’t doing a lot of information extraction, and several of us went to Google and told them about the value of a system like Open IE.
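To give a flavour of the extract-then-answer idea, here is a deliberately crude sketch; the regular expression below stands in for what a real Open IE system derives from full syntactic analysis, and the simplified clauses stand in for the original sentence.

```python
import re

# Crude triple extraction over simplified subject-verb-object clauses.
# Real Open IE systems have no fixed relation vocabulary; this toy regex
# knows only two verbs, purely for illustration.

def extract_triples(clause):
    m = re.search(r"(\w+(?: \w+)?) (defeated|are) (.+)", clause)
    return [(m.group(1), m.group(2), m.group(3))] if m else []

facts = extract_triples("Sonics defeated the Lakers")
facts += extract_triples("Sonics are now the top dogs of NBA")

# Answering "Who defeated the Lakers?" from the extracted facts.
answers = [s for (s, r, o) in facts if r == "defeated" and o == "the Lakers"]
print(answers)  # ['Sonics']
```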
What are some of the challenges that you would pick out from an Indian context when it comes to AI research?
There are no challenges unique to India as such. But there are challenges common to any place that’s similar to India.
In research, students are often the vehicles, and we need an ecosystem where interesting problems can come out. Broadly, the problems can be characterised as human-resource problems, compute-resource problems, and ecosystem issues. We have a huge number of students, but how many of them do research in India is still questionable. The top students from the IITs will probably go to the top 10 institutions in the world. Only a small fraction of AI researchers are in India despite it being one of the most populous countries in the world. That’s mostly because we haven’t had that culture, though that’s changing. Access to high-quality students, barring a few, is a challenge. It’s not just an India challenge; it’s a challenge for equivalently ranked universities anywhere in the world. The best students go to the top 10-20 universities, and then it’s a matter of who can find more diamonds in the rough.
Now if you look at funding, you’ll see that the West is a lot more structured. In India, you have to explore many ways of raising research funding. Everybody has a policy that gives you a small bit as appropriate, but you have to try many sources. It doesn’t come easy to us, but it does come; you can definitely do things in India. But if you have five such things to do, you won’t have the time, and that affects research output. In the West, researchers are shielded from issues like budgeting, procurement, and many other administrative matters.
Lastly, we have an advantage in that India has problems the West does not have. So if you decide to work on problems that affect our own country, there are a lot of interesting things we can do, and many of my colleagues are doing that. It isn’t always easy to get good data either, but there are opportunities that arise from such challenges.
Do you see challenges with data in NLP research in India?
NLP is one of the better communities in India when it comes to data sharing. Except for sensitive data, most researchers share data. We also have unique problems, like NLP for Indian languages, and there are researchers who specialise in these areas. So India doesn’t pose a particular problem with data. That said, a place like Stanford can put $100,000 into creating a dataset that the whole world uses, and if that happens, they may get thousands of citations. For that, they not only had to think of what dataset to collect, they also had enough to spend on creating it. That last bit is hard for us. But it is also an opportunity: now we have to think about low-data machine learning.
Is there work happening in low data machine learning?
Absolutely. Professor Parag Singla has been a collaborator on this. We have been thinking about how to use deep-learning methods when data is scarce. The challenge is that deep learning is compute-hungry and data-hungry. We have the compute, but we don’t have 100,000 or a million labels for every task. That makes things hard. A lot of the success in vision happened because of the ImageNet dataset, which was extremely large. A lot of NLP success is because of the Stanford Question Answering Dataset (SQuAD), which had 100,000 Q&As. What if I had a new task with almost no data? I could pay some people to get data for me, but I may not be able to pay $100,000 for every machine-learning problem. Suppose I paid $1,000 or less and got about 10,000 labels. A deep-learning system will do something with that, but we can do more if we think about the architecture or the formulation and tweak various knobs. That is something Professor Parag Singla and I have been investigating.
Have you made any breakthroughs there?
There are no free lunches. If you don’t have enough data, you need something else. Here, we have human experts. So we are asking: how can a human expert augment the knowledge that a deep-learning system has, so that it gets better performance? A human designer could think in terms of the right features or constraints on the output space. How should the machine-learning system use these additional human inputs? That’s the question we are trying to answer. We have shown that if a human designer provides some basic constraints, then performance in the low-data setting is much higher than for a vanilla system without this additional information.
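One generic way to encode such expert knowledge is as an extra penalty in the training loss. The sketch below, which assumes PyTorch and uses synthetic data, nudges a classifier trained on just 20 labels towards an expert-supplied class prior on unlabelled examples. It illustrates the general constraint-augmentation idea, not the specific formulation from this research; the prior value and the constraint weight are invented knobs.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny labelled set (the low-data regime) plus a larger unlabelled pool.
x_lab = torch.randn(20, 5)
y_lab = (x_lab[:, 0] > 0).long()  # synthetic ground truth
x_unlab = torch.randn(500, 5)

model = nn.Sequential(nn.Linear(5, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
ce = nn.CrossEntropyLoss()

# Human-provided constraint: an expert believes roughly half the examples
# are positive. Deviation from that prior is penalised on unlabelled data.
expert_prior = 0.5
lam = 1.0  # constraint weight: one of the knobs to tune

for step in range(200):
    opt.zero_grad()
    sup_loss = ce(model(x_lab), y_lab)  # supervised loss on the few labels
    p_pos = torch.softmax(model(x_unlab), dim=1)[:, 1].mean()
    constraint_loss = (p_pos - expert_prior) ** 2
    loss = sup_loss + lam * constraint_loss
    loss.backward()
    opt.step()
```

Set lam too high and the prior drowns out the labels; too low and the few labels are all the model sees. Balancing terms like this is the kind of knob-tweaking mentioned above.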
Any specific datasets that you are working with on this project?
We are working with a lot of datasets. Basically, you can take any dataset, pare the data down, and ask the question: what is the best we could have done with only this much data?
So, we are also looking at named entity recognition and part-of-speech tagging; these are basic NLP tasks. We are also looking at image-related problems like depth estimation, and at probabilistic planning problems, where a domain designer not only gives you a simulator but also tells you something about how the domain is structured. There are many problems we have started looking at, and we’re very excited about the possibilities.
Do you see us lagging in commercialising research?
I don’t know a good answer to this. Researchers are really good at research. To productise something is a substantial initiative in itself. Some researchers like to do that, and I know some of my colleagues who are very interested in doing it. Eventually, someone has to really work hard to take a prototype to a well-oiled system. That’s a different level of initiative, of which a researcher is only a small part.