Several years ago, I was working on modeling the hypothalamic-pituitary axis with my colleague Joe Gonzalez-Heydrich. Unsurprisingly, we could not find any primary data in articles ostensibly describing the relationships among the various hormones of this axis. So I found a very nice shareware program called DataThief. DataThief is "a program to extract (reverse engineer) data points from a graph. Typically, you scan a graph from a publication, load it into DataThief, and save the resulting coordinates, so you can use them in calculations or graphs that include your own data." It worked as billed. Recently, when I was working with my colleague Asher Schacter on predicting outcomes of drug development from pre-clinical data, I remembered how useful DataThief had been and recommended that he use it to extract the primary data from publications for each of the pharmaceuticals he wanted to study. Lo and behold, it worked again!
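For readers curious about what "reverse engineering" a graph involves, here is a minimal Python sketch of the core idea: calibrate each axis from two reference points whose data values are known, then convert the pixel positions of plotted points into data values. This is not DataThief's actual implementation. It assumes linear axes and an unrotated image, and the names `make_axis_map` and `pixels_to_data`, along with all of the pixel positions and values, are hypothetical and chosen only for illustration.

```python
def make_axis_map(pixel_a, value_a, pixel_b, value_b):
    """Return a function mapping a pixel coordinate to a data value.

    Assumes a linear axis: two reference pixels with known data values
    define the scale. (Hypothetical helper, not part of DataThief.)
    """
    if pixel_a == pixel_b:
        raise ValueError("calibration pixels must differ")
    scale = (value_b - value_a) / (pixel_b - pixel_a)
    return lambda pixel: value_a + (pixel - pixel_a) * scale


def pixels_to_data(points, x_map, y_map):
    """Convert (pixel_x, pixel_y) pairs into (x, y) data coordinates."""
    return [(x_map(px), y_map(py)) for px, py in points]


# Made-up calibration: the x axis runs from 0 at pixel 100 to 100 at pixel 500.
x_map = make_axis_map(100, 0.0, 500, 100.0)
# Image rows grow downward, so the y axis runs from 0 at pixel 450 to 100 at pixel 50.
y_map = make_axis_map(450, 0.0, 50, 100.0)

# Made-up pixel positions of three points read off the scanned graph.
clicked = [(100, 450), (300, 250), (500, 50)]
recovered = pixels_to_data(clicked, x_map, y_map)
print(recovered)
```

The recovered values can be no more precise than the published figure and the scan resolution allow, and a logarithmic axis would need the mapping applied to the logarithm of the values instead.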
If only we had a policy in place requiring that all primary data be deposited in one or more public electronic repositories, this additional, laborious, and time-consuming step would be unnecessary. Bioinformaticians have been very effective in demonstrating the value of sharing primary experimental data (e.g., high-throughput data such as gene expression data or gene variant data), but clinical researchers have yet to achieve the same enlightenment. Until they do, please make sure the graphs in your publications are very accurate so that others may benefit from your hard work and the taxpayers' investments in your research.