OCR Post-Correction Evaluation of Early Dutch Books Online - Revisited

Research output: Conference contribution (scientific, peer-reviewed)

We present further work on the evaluation of the fully automatic post-correction of Early Dutch Books Online, a collection of 10,333 18th-century books. In prior work we evaluated the new implementation of Text-Induced Corpus Clean-up (TICCL) on the basis of a single-book Gold Standard derived from this collection. In the current paper we revisit the same collection on the basis of a sizeable 1020-item random sample of OCR post-corrected strings from the full collection. Both evaluations have their own stories to tell and lessons to teach.
Original language: English
Title of host publication: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)
Editors: Calzolari
Publisher: ELRA
Pages: 967-974
Number of pages: 8
State: Published - 2016
Event: Language Resources and Evaluation Conference - Portorož, Slovenia

Conference

Conference: Language Resources and Evaluation Conference
Abbreviated title: LREC-2016
Country: Slovenia
City: Portorož
Period: 23/05/16 - 28/05/16

Research areas

  • TICCL, OCR post-correction, evaluation, EDBO, Nederlab, CLARIAH
