OCR Post-Correction Evaluation of Early Dutch Books Online - Revisited

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

We present further work on the evaluation of the fully automatic post-correction of Early Dutch Books Online, a collection of 10,333 18th-century books. In prior work we evaluated the new implementation of Text-Induced Corpus Clean-up (TICCL) on the basis of a single-book Gold Standard derived from this collection. In the current paper we revisit the same collection on the basis of a sizeable 1020-item random sample of OCR post-corrected strings from the full collection. Both evaluations have their own stories to tell and lessons to teach.
Original language: English
Title of host publication: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)
Editors: Calzolari
Number of pages: 8
State: Published - 2016
Event: Language Resources and Evaluation Conference - Grand Hotel Bernardin Conference Center, Portorož, Slovenia
Duration: 23 May 2016 → 28 May 2016


Conference: Language Resources and Evaluation Conference
Abbreviated title: LREC-2016
    Research areas

  • TICCL, OCR post-correction, evaluation, EDBO, Nederlab, CLARIAH
