Lots of data is held in graphs within pdf files.
Some of these graphs are represented using an image format, e.g., jpeg, while others are created using pdf operations (e.g., draw a cross at 10, 20).
If the pdf operations that create a graph are known, the coordinates of the points in the graph can be extracted; a proof of concept exists.
This project aims to add an option to Mozilla's pdf renderer that extracts the x/y coordinates of all the points appearing in a graph highlighted by the user.
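To make the idea of reading points back from pdf drawing operations concrete, here is a minimal Python sketch. It assumes the page's content stream has already been decompressed to text (qpdf's qdf mode is one way to get this), that each cross is drawn as one horizontal and one vertical line segment using the standard `m` (move to) and `l` (line to) path operators, and that no coordinate transformations are in effect. The function names `extract_segments` and `cross_centres` are made up for this illustration; they are not part of any existing tool.

```python
def extract_segments(content):
    """Return ((x0, y0), (x1, y1)) line segments built from m/l operators.

    Assumes `content` is an uncompressed pdf content stream in which
    tokens are separated by whitespace; all other operators are ignored.
    """
    operands = []    # numbers seen since the last operator
    segments = []
    current = None   # current point of the path being built
    for token in content.split():
        try:
            operands.append(float(token))
            continue
        except ValueError:
            pass     # not a number, so treat the token as an operator
        if token == "m" and len(operands) >= 2:
            current = (operands[-2], operands[-1])
        elif token == "l" and len(operands) >= 2 and current is not None:
            nxt = (operands[-2], operands[-1])
            segments.append((current, nxt))
            current = nxt
        operands.clear()
    return segments


def cross_centres(segments):
    """Return midpoints shared by a horizontal and a vertical segment."""
    def mid(seg):
        (x0, y0), (x1, y1) = seg
        return ((x0 + x1) / 2, (y0 + y1) / 2)

    horizontal = [s for s in segments if s[0][1] == s[1][1]]
    vertical = [s for s in segments if s[0][0] == s[1][0]]
    vertical_mids = {mid(s) for s in vertical}
    return [mid(s) for s in horizontal if mid(s) in vertical_mids]


# A cross centred on (10, 20), drawn as two short strokes.
sample = "8 20 m 12 20 l S 10 18 m 10 22 l S"
print(cross_centres(extract_segments(sample)))  # [(10.0, 20.0)]
```

The coordinates recovered this way are page coordinates, not data values; mapping them to the units on the graph's axes would need a separate calibration step, and real files would also need a tolerance when comparing floating-point values.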
qpdf does an excellent job of mapping the contents of a pdf to text.
pdffigures extracts figures from pdfs.
Another approach is manual conversion of the pdf to svg, followed by automatic extraction of the data from the svg.
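The automatic step from svg could look something like the following Python sketch. It assumes, purely for illustration, that the data points survive the conversion as `<circle>` elements; in practice a converter may emit `<path>` elements instead, which would need their `d` attribute parsed. The function name `circle_centres` is hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def circle_centres(svg_text):
    """Return the (cx, cy) centre of every <circle> in an svg document.

    Missing cx/cy attributes default to 0, as in the svg specification.
    """
    root = ET.fromstring(svg_text)
    return [
        (float(c.get("cx", "0")), float(c.get("cy", "0")))
        for c in root.iter(SVG_NS + "circle")
    ]
```

Note that the svg y axis points down the page, whereas the default pdf y axis points up, so values taken from svg may need flipping before they are compared with pdf coordinates.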
chemdataextractor, as the name suggests, is oriented towards extracting chemical information from pdfs, e.g., chemical names and formulae.
utopia attempts to extract structural features of an article, including citations.
xpdf is used as a library by many tools.
poppler is a popular pdf rendering library.