Scientific results are communicated visually in the literature through diagrams, visualizations, and photographs. These information-dense objects have been largely ignored in bibliometric and scientometric studies, which focus on citations and text. In this project, we use techniques from computer vision and machine learning to classify more than 8 million figures from PubMed into 5 figure types and study the resulting patterns of visual information as they relate to impact. We find that the distribution of figures and figure types in the literature has remained relatively constant over time but can vary widely across fields and topics. We find a significant correlation between scientific impact and the use of visual information: higher-impact papers tend to include more diagrams and, to a lesser extent, more plots and photographs. To explore these results and other ways of extracting this visual information, we have built a visual browser that illustrates the concept and explores design alternatives for supporting viziometric analysis and organizing visual information. We use these results to articulate a new research agenda, viziometrics, to study the organization and presentation of visual information in the scientific literature.
June 2016: The Economist has written a nice print piece on our arXiv paper.
June 2016: Featured on LabWorm, a discovery platform that surfaces top research tools with the goal of promoting more open, collaborative, and cutting-edge science.
June 2016: MIT Technology Review wrote a nice piece on our project: “The First Visual Search Engine for Scientific Diagrams.”
We originally used patch-based machine vision techniques to classify figures by visualization type, achieving 91% accuracy on a test set spanning 5 categories: equations (394), photos (782), tables (436), visualizations (890), and diagrams (769). More recently, we have begun using deep learning to achieve higher-quality results at the expense of training time. While classifying the millions of images extracted from source papers, we found that approximately 35% of them contain multiple sub-figures. A dismantling algorithm we proposed at ICPRAM 2015 addresses this by parsing each composite figure into its constituent sub-figures. The algorithm recursively splits each composite figure into visual “tokens,” classifies each token as either auxiliary (e.g., a text fragment) or a standalone figure, then recursively merges the tokens to reconstruct whole figures. The algorithm terminates when the reconstructed figures achieve a sufficient “completeness” score based on their types and positions. Using the dismantler's output, we can classify the sub-figures more precisely.
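To make the recursive structure concrete, the sketch below shows the split/classify/merge idea in Python. It is a toy illustration only: Box, split_region, classify_token, and center_dist are hypothetical stand-ins, and the published dismantler operates on pixel data with learned token classifiers and a type- and position-based completeness score, not the pre-annotated bounding boxes used here.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Box:
    # Toy stand-in for a rectangular region of a composite figure.
    x: float
    y: float
    w: float
    h: float
    children: List["Box"] = field(default_factory=list)

def split_region(box: Box) -> List[Box]:
    # Stand-in for cutting a region along empty gutters; here we simply
    # return the pre-annotated sub-regions, if any.
    return box.children or [box]

def dismantle(box: Box) -> List[Box]:
    # Recursively split a composite figure into atomic visual tokens.
    parts = split_region(box)
    if parts == [box]:
        return [box]
    tokens: List[Box] = []
    for part in parts:
        tokens.extend(dismantle(part))
    return tokens

def classify_token(box: Box) -> str:
    # Toy classifier: tiny regions are auxiliary (labels, text fragments),
    # larger ones are standalone sub-figures. The real system learns this.
    return "standalone" if box.w * box.h >= 1.0 else "auxiliary"

def center_dist(a: Box, b: Box) -> float:
    # L1 distance between region centers, used to attach auxiliary tokens.
    return (abs((a.x + a.w / 2) - (b.x + b.w / 2))
            + abs((a.y + a.h / 2) - (b.y + b.h / 2)))

def merge(tokens: List[Box]) -> List[List[Box]]:
    # Attach each auxiliary token to its nearest standalone figure. The
    # published algorithm instead merges recursively until each
    # reconstructed figure passes a completeness check.
    figures = [t for t in tokens if classify_token(t) == "standalone"]
    groups = {id(f): [f] for f in figures}
    for t in tokens:
        if classify_token(t) == "auxiliary":
            nearest = min(figures, key=lambda f: center_dist(t, f))
            groups[id(nearest)].append(t)
    return list(groups.values())

# A 2-panel composite: each panel is a plot plus a small "(a)"/"(b)" label.
composite = Box(0, 0, 4, 2, children=[
    Box(0, 0, 2, 2, children=[Box(0, 0, 1.8, 1.6), Box(0, 1.7, 0.4, 0.2)]),
    Box(2, 0, 2, 2, children=[Box(2, 0, 1.8, 1.6), Box(2, 1.7, 0.4, 0.2)]),
])
print(merge(dismantle(composite)))  # two groups, one per reconstructed sub-figure

Running the demo prints two groups, each pairing a plot region with its label, mirroring how the dismantler reassembles whole sub-figures before they are passed to the classifier.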
Our data for this research project comes from several sources. Currently, the prototype includes more than 8 million images from PubMed Central. We plan to add other data sets as they become available.
@article{lee2016viziometrics,
  author  = {Lee, Poshen and West, Jevin and Howe, Bill},
  title   = {Viziometrics: Analyzing Visual Information in the Scientific Literature},
  journal = {IEEE Big Data},
  year    = {2016}
}

@inproceedings{lee2016viziometrix,
  author    = {Lee, Poshen and West, Jevin and Howe, Bill},
  title     = {VizioMetrix: A Platform for Analyzing the Visual Information in Big Scholarly Data},
  booktitle = {BigScholar Workshop (co-located at WWW)},
  year      = {2016}
}

@inproceedings{lee2015dismantling,
  author    = {Lee, Poshen and Howe, Bill},
  title     = {Dismantling Composite Visualizations in the Scientific Literature},
  booktitle = {4th International Conference on Pattern Recognition Applications and Methods (ICPRAM)},
  year      = {2015}
}