Test data for testing specs and software in @OCR-D
- SBB0000F29300010000: Pages 1-5 of http://resolver.staatsbibliothek-berlin.de/SBB0000F29300010000
- kant_aufklaerung_1784: http://ocr-d.de/sites/all/GTDaten/kant_aufklaerung_1784.zip, with TIFF compressed with JPEG + METS for second page
- kant_aufklaerung_1784-binarized: http://ocr-d.de/sites/all/GTDaten/kant_aufklaerung_1784.zip, with binarized/gray produced by ocropus-nlbin + METS for all
- test.ocrd.zip: OCRD-ZIP of kant_aufklaerung_1784.
- param-binarize.json: Sample parameter JSON file
- page-with-glyphs.xml: https://github.com/impactcentre/iif-testfiles/blob/master/testfiles/res.xml
- bagit_fetch_PPN595930174: OCRD-ZIP of PPN595930174 (simplified to file group GDZOCR and PRESENTATION).
- column-samples: Samples for column detection
- DIBCO11-machine_printed: Test set for the DIBCO11 challenge
- page_dewarp: Dewarping samples by @mzucker
- leptonica_samples: Sample facsimile from the leptonica computer vision library

For more information and the latest PAGE schema, see https://github.com/PRImA-Research-Lab/PAGE-XML/wiki
- schema/2009-03-16.xsd: PAGE XSD, version 2009-03-16
- schema/2010-01-12.xsd: PAGE XSD, version 2010-01-12
- schema/2010-03-19.xsd: PAGE XSD, version 2010-03-19
- schema/2013-07-15.xsd: PAGE XSD, version 2013-07-15
- schema/2016-07-15.xsd: PAGE XSD, version 2016-07-15
- schema/2017-07-15.xsd: PAGE XSD, version 2017-07-15
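
As a rough illustration of how the schema files listed above might be used, the following sketch picks the matching XSD for a PAGE document by reading the version date from the namespace of its root element. It assumes the usual PAGE namespace form `http://schema.primaresearch.org/PAGE/gts/pagecontent/<version>`; the function name `schema_path_for` and the `KNOWN_VERSIONS` list are hypothetical helpers, not part of this repository.

```python
import re
import xml.etree.ElementTree as ET

# Versions taken from the schema/ list in this README.
KNOWN_VERSIONS = [
    "2009-03-16", "2010-01-12", "2010-03-19",
    "2013-07-15", "2016-07-15", "2017-07-15",
]

# Assumption: the PAGE namespace ends in the schema version date.
NAMESPACE_RE = re.compile(
    r"^\{http://schema\.primaresearch\.org/PAGE/gts/pagecontent/"
    r"(\d{4}-\d{2}-\d{2})\}"
)


def schema_path_for(page_xml: str) -> str:
    """Return the relative path of the XSD matching a PAGE XML string."""
    root = ET.fromstring(page_xml)
    match = NAMESPACE_RE.match(root.tag)
    if match is None:
        raise ValueError(f"not a PAGE document: root tag is {root.tag!r}")
    version = match.group(1)
    if version not in KNOWN_VERSIONS:
        raise ValueError(f"no bundled schema for PAGE version {version}")
    return f"schema/{version}.xsd"


if __name__ == "__main__":
    sample = (
        '<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/'
        'pagecontent/2017-07-15"><Page/></PcGts>'
    )
    print(schema_path_for(sample))
```

The returned path can then be handed to whichever XSD validator is in use; the sketch itself only resolves the file name and does not validate anything.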