
OCR is not working well, even on images that are almost entirely text.


171k images. 86k have a TIFF file called "largest." 84k have a TIFF file called "larger." The two differ in resolution.

Jeff: need to develop ranges to process
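
One possible reading of "ranges" is contiguous index ranges over the image list, so the corpus can be processed in batches. A minimal Python sketch under that assumption follows; make_ranges and the chunk size are hypothetical names and values, not anything decided in the meeting.

```python
def make_ranges(total, chunk_size):
    """Split indices 0..total-1 into half-open (start, stop) ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # The last range is shorter when total is not a multiple of chunk_size.
    return [(start, min(start + chunk_size, total))
            for start in range(0, total, chunk_size)]


if __name__ == "__main__":
    # 171_000 approximates the corpus size from the notes; 10_000 is a placeholder.
    for start, stop in make_ranges(171_000, 10_000):
        print(start, stop)
```

Half-open ranges make it easy to hand each batch to a separate job without overlap.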


It would be useful to know how long a run over the full corpus would take, to make the case for using supercomputers.
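
A rough way to get that number is to time OCR on a small sample and extrapolate linearly. The Python sketch below assumes per-image time is roughly constant and that work splits evenly across workers; estimate_total_seconds and all sample figures are hypothetical placeholders, not measurements.

```python
def estimate_total_seconds(sample_seconds, sample_size, corpus_size, workers=1):
    """Linearly extrapolate a timed sample to the full corpus.

    Assumes constant per-image cost and perfect scaling across workers,
    so the result is an optimistic estimate.
    """
    if sample_size <= 0 or workers <= 0:
        raise ValueError("sample_size and workers must be positive")
    return sample_seconds * corpus_size / (sample_size * workers)


if __name__ == "__main__":
    # Placeholder inputs: replace with a real timing of a real sample.
    seconds = estimate_total_seconds(
        sample_seconds=120.0,  # hypothetical wall-clock time for the sample
        sample_size=100,       # hypothetical number of images timed
        corpus_size=171_000,   # approximate corpus size from the notes
    )
    print(f"Estimated single-worker runtime: {seconds / 3600:.1f} hours")
```

Because the "largest" and "larger" TIFFs differ in resolution, it may be worth timing a sample from each group separately and summing the two estimates.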


Action items