India never lacks world-class AI researchers

Soumyabrata Pal is a postdoctoral researcher at Google Research, India. He received his Ph.D. from the Department of Computer Science at the University of Massachusetts at Amherst.

He was a research intern at the Ernst & Young AI Lab in Palo Alto and, in the spring of 2020, an applied scientist intern at Amazon Search, Berkeley.

INDIAai interviewed Soumyabrata to get his perspective on AI.

How did you get into AI as an electronics and electrical communications engineering student?

At IIT Kharagpur (during my undergraduate studies), students had the luxury of choosing additional or elective courses from other departments. I took an introductory machine learning course (offered in the CS department) on a whim and liked its math- and statistics-oriented style of teaching. I followed through by taking the advanced courses on NLP and ML that were offered, and finally, I was also able to arrange my final-year B.Tech project in the CS department. This led me down the path of ML theory during my undergraduate studies.

However, there is a course offered in EE that is not very well known but is very important for the theoretical parts of ML. The course is called “Information Theory” and is a sibling of “Statistics”. I loved the course (helped by an outstanding teacher) and wrote in my SOP that I wanted to be an information theorist. I was fortunate that my thesis supervisor was an esteemed information theorist working on ML.

What is a typical day in the life of a graduate research assistant?

A typical day in my life as a research assistant usually involved reading an exciting paper or book, or trying to find a proof for a result I had in mind. However, on some days I also needed to code small-scale experiments to support the theoretical guarantees of my work. It’s a bit different from people working in applied ML/NLP, who spend most of their day coding different algorithms.

What were your first challenges as a graduate research assistant? How did you manage them?

I had no research experience in ML theory when I joined the team as a graduate research assistant. So it was like I was thrown into an ocean with the hope that I would learn to swim. At first it was difficult, but my advisor helped me get stronger. Reading relevant classic books that start from scratch helped me a lot. I always try to read books first (instead of recent papers) when I need to understand a niche area in ML theory.

Tell us about your area of doctoral research.

My main research interests are theoretical machine learning and applied statistics. I’m interested in designing algorithms with theoretical/provable guarantees that can guide practical solutions to relevant problems.

For example, consider a raw image captured by a high-quality DSLR camera, which can easily exceed 50MB in size. Image compression schemes such as PNG and JPEG can significantly reduce the size of the image without making any visible difference to the human eye. Therefore, we can conclude that much of the pixel information in the raw image is redundant, and it would be useful to design algorithms that can compress and recover such images with provable guarantees. As another example, consider streaming services like Netflix, which make personalized recommendations to millions of users. Many users have similar tastes, so their ratings can be aggregated to make better recommendations, an approach known as collaborative filtering. However, the similarities and preferences are unknown and must be learned quickly online.

My research focuses on the design of learning algorithms that exploit such structure in the data. Such structure occurs naturally in many applications (e.g., image and speech signals are sparse) or is often embedded in the data.
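To make the idea of sparsity concrete, here is a minimal sketch in Python. It is not the interviewee's method, only a toy illustration under stated assumptions: the signal is synthetic and built to be exactly sparse in a cosine (DCT) basis, and the helper names `dct`, `idct`, `keep_largest`, and `relative_error` are hypothetical. It shows that when a signal is sparse in some basis, keeping only a few coefficients is enough to recover it.

```python
import math


def dct(signal):
    """Orthonormal DCT-II of a list of floats (naive O(n^2) version)."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(signal))
        coeffs.append(scale * total)
    return coeffs


def idct(coeffs):
    """Inverse of dct above (orthonormal DCT-III)."""
    n = len(coeffs)
    signal = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        signal.append(total)
    return signal


def keep_largest(coeffs, k):
    """Zero out all but the k largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda j: abs(coeffs[j]), reverse=True)
    keep = set(order[:k])
    return [c if j in keep else 0.0 for j, c in enumerate(coeffs)]


def relative_error(original, approx):
    """Euclidean error of approx, relative to the norm of original."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(original, approx)))
    den = math.sqrt(sum(a ** 2 for a in original))
    return num / den


# Assumption: a synthetic 64-sample signal made of exactly two cosine
# components, so it has only two significant DCT coefficients.
n = 64
signal = [2.0 * math.cos(math.pi * (i + 0.5) * 3 / n)
          + 0.5 * math.cos(math.pi * (i + 0.5) * 10 / n)
          for i in range(n)]

compressed = keep_largest(dct(signal), k=2)   # store 2 numbers instead of 64
recovered = idct(compressed)
print("relative error with 2 of 64 coefficients:",
      relative_error(signal, recovered))
```

In this toy setting the two retained coefficients carry essentially all of the signal, so the reconstruction error should be at the level of floating-point rounding. Real images and speech are only approximately sparse, so the error there would be small rather than negligible.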

How is India doing in AI and ML? In this situation, what do you think are our strengths and weaknesses?

I believe that India is doing exceptionally well these days and is catching up with western nations. The strength of India is that there are world-class researchers in many different institutions working on ML. However, from what I understand, the main weakness is a huge gap between theory and practice in India. To close this gap, the business world and universities need to work together more, which is not happening now. In contrast, Silicon Valley in the United States, renowned for its innovation capabilities, has two of the best universities in the world that regularly collaborate with industry professionals.

In your opinion, what aspects should Indian universities improve?

In my opinion, two aspects need to be improved in Indian universities: 1) increase collaboration with industry to understand practical and relevant problems, and 2) improve the postdoc culture in India, which in my opinion is not as mature as in Western countries.

How do you handle difficult situations as a researcher, like the rejection of a paper? How do you compose yourself?

Rejection is hard, and I always feel like I’m not good at dealing with it. Still, it’s a chance to make changes to the paper and try again. Since we want to improve the science, a better version of the paper will have more impact.

What advice do you give to people who want to work in artificial intelligence research? What do they need to focus on to progress?

There are many opportunities in AI research these days, whether in industry or academia. I recommend taking courses on ML/AI and understanding the basics first. Nowadays, with the advent of modern computing, it has become easy to train complex models and apply them. But first, it is essential to develop intuition and understanding. The gigantic number of online courses available can help anyone who wants to enter the field.

What research papers and books have had a significant impact on your life?

Three books have had an impact on my understanding of ML. I have listed them below:

  1. An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
  2. Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
  3. The Probabilistic Method by Noga Alon and Joel Spencer

Apart from these, there are websites (Machine Learning Mastery, for example) and blogs that are sometimes super helpful for quickly gaining a high-level understanding of some complicated topics.

James G. Williams