Bahman Moraffah - How to Do Research?

How to begin research in machine learning and statistics

To do state-of-the-art research in machine learning and statistics, you are assumed to have solid technical skills in linear algebra, probability, calculus, and computing. To get started, you may want to read some of the books in the sub-field that interests you. A common way to find out what the current developments are is to talk to people and to look at papers from the main conferences, such as NeurIPS, ICML, AISTATS, COLT, and ICASSP. There is a misconception that papers are too technical, but I can guarantee they are not. With the surge in the number of papers, it is very difficult to keep up with all of them. However, you will develop the skill of telling which papers are most relevant to your research.

It is very important to read strategically and NOT linearly. Reading for graduate school is different from reading a book for fun. When we read for pleasure, we often start at the beginning of the book and read carefully in a linear fashion. If you do this with your academic material, it will take five times as long, and you likely won’t retain the right kind of information from the reading. As an academic reader, your job is to mine the text for information. Instead of cruising along the narrative, you need to dive in, find the information you need, and move along. Here are a few points to help you read a paper:

  1. Read the title carefully and think about what the paper is trying to do.

  2. Read the abstract and ask yourself what you would do if you wanted to solve this problem. A good abstract clearly explains the main argument of the article, the kind of evidence the authors use, and the authors’ contributions.

  3. Take a look at the figures. What data is being shown, and how is it being shown? For each figure in the paper, work out what it is trying to communicate, and think about whether it does so effectively.

  4. After the figures, focus a little more on what the paper’s goals are. If you have not figured them out by now, skim through the paper and read the introduction and conclusion. A paper will have a “big picture”, then a narrower field, and finally a specific question or topic. For each paper, decide what you think fits into each of those categories based on the abstract, introduction, and conclusion.

  5. Check the references for other related papers.

  6. If you find the paper interesting and closely related to your research, skim through the entire paper without working through the equations.

  7. At this point, you should know what this paper is all about. If you find it very intriguing, it is time to work out the math and fully understand it. Note that only a few papers are really fascinating.

Throughout this process, keep asking yourself: Why do we care about this problem? How would I solve this problem if I were working on it? What are the pitfalls of this model, and how can they be resolved?

Never forget to take notes. Take good notes on the papers, and know where you can find the information you need when you need it. Keep a little notebook with a summary of each paper you have read along with its main idea. We stand on the shoulders of giants when we write a good paper. Knowing many ideas helps us come up with our own solution to a problem, in the hope that it will work out.

Initially, reading a paper might seem tedious. You may not even understand half of the paper you are reading. But don't give up! It sometimes takes weeks to read your first paper and fully understand it, but with time your pace will increase. Just keep reading and understanding. You can Google technical terms used in the paper, or refer to Wikipedia or even textbooks. Again, do NOT forget to take notes.

Here are some useful links and tutorials that can help you conduct research in machine learning and statistics: