Dexter

Note that the sources on sourceforge are legacy and aren't maintained any more. The current source is in GAVO's version control.

The following is an excerpt from a poster presented at the American Astronomical Society's 2000 Summer meeting in Rochester, NY:

ADS' roughly 1,000,000 scanned pages contain numerous diagrams and figures for which the original data sets are lost or inaccessible. Having scans for the figures invites digitizing the data points to recover at least a part of these data. Performing this digitization automatically is still beyond the capabilities of current OCR systems, but the computer can ease this process for a human.

This was the starting point for Dexter, a Java applet that runs in the users' browsers and provides an interface for selecting the part of the page that is of interest. On that selection, coordinate axes, points and error bars can be marked and, of course, corrected. [...]
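To illustrate why marking the coordinate axes matters, here is a minimal sketch of how marked axis positions could be used to turn pixel positions of data points into data values. It is not taken from Dexter's source; the function name make_axis_map, the calibration numbers, and the assumption that the scan is not rotated (so each axis can be handled independently) are all hypothetical.

```python
import math


def make_axis_map(pix0, val0, pix1, val1, log=False):
    """Return a function mapping a pixel coordinate to a data value.

    (pix0, val0) and (pix1, val1) are two marked positions on one axis,
    given as pixel coordinate and the data value read off the axis there.
    Hypothetical helper for illustration, not part of Dexter.
    """
    if pix0 == pix1:
        raise ValueError("axis reference points must differ in pixel position")
    if log:
        if val0 <= 0 or val1 <= 0:
            raise ValueError("logarithmic axes need positive reference values")
        # Interpolate linearly in log10 space for logarithmic axes.
        val0, val1 = math.log10(val0), math.log10(val1)
    scale = (val1 - val0) / (pix1 - pix0)

    def to_data(pix):
        value = val0 + (pix - pix0) * scale
        return 10 ** value if log else value

    return to_data


# Made-up calibration: x axis runs from pixel 100 (value 0) to pixel 500
# (value 10); y axis from pixel 400 (value 0) up to pixel 50 (value 7).
# Image y coordinates grow downward, which the negative scale absorbs.
x_map = make_axis_map(100, 0.0, 500, 10.0)
y_map = make_axis_map(400, 0.0, 50, 7.0)

# Convert one marked point from pixel to data coordinates.
point = (x_map(300), y_map(225))
print(point)
```

Two reference marks per axis are enough for a linear or logarithmic scale; a rotated or distorted scan would need a full two-dimensional transformation instead.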

In the future, we plan to implement some recognition algorithms that would, e.g., trace a line for the user or automatically search for markers.

Some recognition capabilities are present in the sourceforge release, though you'd better not look at the implementation :-)

In the release on sourceforge, there is a rudimentary standalone version of Dexter called Debuxter (the name should indicate that it was written to ease development) and a more refined one called goucho. Read HOWTO.standalone in the documentation or say make test.

If all you want is to run Dexter on an image or PDF of your own, consider using Standalone Dexter at the GAVO data center.

As for documentation, the HTML help file used at ADS is included with the distribution. We have also written a paper on Dexter for the ADASS 2000 conference. It is available online, and we also offer the original poster without background.

If you like this program, please consider contributing code.


Last change: 2021-03-25

Markus Demleitner