86 days to go: How to identify a trustworthy paper?
by Franziska Boenisch and Adam Dziedzic
There are many papers on the web, but how do you find the trustworthy ones? There are several ways to filter for papers that are worth reading: (1) the easiest approach is to go through the list of papers published at top conferences, (2) search for highly influential (and highly cited) papers that have not been accepted at any conference, or (3) use heuristics such as a list of authors with a strong reputation, or very recent papers from your area that describe novel ideas.
The first step is to check the latest proceedings of renowned conferences in your field. The advantage of this approach is that the published papers were peer reviewed by other researchers and judged worthy of publication. Thus, in most cases, three or more researchers have already agreed that these are trustworthy papers. In principle, this is the best approach, since you can easily cite these reviewed and accepted papers when you want to publish your own work. Also, if your new paper builds on previously published papers, it already passes a first test: it pushes the envelope of science forward. It is also good practice to cite such papers with the venue where they were published, instead of only citing their arXiv versions. arXiv hosts preprints, that is, papers that have not gone through any formal scientific review.
Another way to find trustworthy papers is to check who cited them and how often they were cited. Some papers have not been accepted at any conference but still have many citations, including from other very influential and officially published papers. In some cases, the authors of such highly cited but unpublished papers did submit them to a conference, but the reviewers did not agree that the paper should be published. This can happen, for instance, when the idea is very novel and the reviewers judge the paper to be out of scope for that conference. Such papers might later emerge as pioneering and highly influential work. Thus, you might find very interesting papers that are worth reading even though they have not been formally published.
Finally, you have to develop your own judgment. We work on machine learning, which is a very fast-moving field. If you want to stay on top of it, you may not be able to wait until a paper is officially reviewed and published. By then, it might be too late, because someone else has already built on top of that work. You can subscribe to recommended articles from Google Scholar, the paper feed from Semantic Scholar, or the daily arXiv digest for your field.
If you are already an expert in your field, judging new papers might be easier for you. The general strategy is to check who wrote a paper. Often you will recognize authors from papers you have read before, trust that they publish solid work, and want to read their new articles as well. Many researchers build their reputation over many years, so you can have more confidence in relying on their work.
If these first signs of how trustworthy a paper is are not sufficient, you simply have to look at the paper itself. Scan through it and check whether it is well organized and how much care was put into the figures, the spelling, and so on. If the paper still looks interesting, read it more closely. Many experienced paper readers recommend three passes. First, you skim the paper by reading the abstract, the introduction, maybe the related work, the titles of sections and subsections, and the conclusions. Second, if you find the paper interesting, you check the method in more detail, check whether the experiments support the method, and check whether the expected proofs are provided. If the paper lacks some of these aspects, you might stop there; otherwise, it might be a trustworthy paper that deserves your careful reading.