Comments (5)
Failed to load list of training filenames from data/eng/list.train
Did you check the file?
Yes, it's empty and I don't know why. I'm training Tesseract on line images.
You are useing make version: 4.3
combine_tessdata -u /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata data/eng/eng
Extracting tessdata components from /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata
Wrote data/eng/eng.lstm
Wrote data/eng/eng.lstm-punc-dawg
Wrote data/eng/eng.lstm-word-dawg
Wrote data/eng/eng.lstm-number-dawg
Wrote data/eng/eng.lstm-unicharset
Wrote data/eng/eng.lstm-recoder
Wrote data/eng/eng.version
unicharset_extractor --output_unicharset "data/eng/my.unicharset" --norm_mode 2 "data/eng/all-gt"
merge_unicharsets data/eng/eng.lstm-unicharset data/eng/my.unicharset "data/eng/unicharset"
Loaded unicharset of size 112 from file data/eng/eng.lstm-unicharset
Loaded unicharset of size 81 from file data/eng/my.unicharset
Wrote unicharset file data/eng/unicharset.
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-000x-04.tif" -t "data/eng-ground-truth/a01-000x-04.gt.txt" > "data/eng-ground-truth/a01-000x-04.box"
tesseract "data/eng-ground-truth/a01-000x-04.tif" data/eng-ground-truth/a01-000x-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-000x-05.tif" -t "data/eng-ground-truth/a01-000x-05.gt.txt" > "data/eng-ground-truth/a01-000x-05.box"
tesseract "data/eng-ground-truth/a01-000x-05.tif" data/eng-ground-truth/a01-000x-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-003-09.tif" -t "data/eng-ground-truth/a01-003-09.gt.txt" > "data/eng-ground-truth/a01-003-09.box"
tesseract "data/eng-ground-truth/a01-003-09.tif" data/eng-ground-truth/a01-003-09 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-003x-02.tif" -t "data/eng-ground-truth/a01-003x-02.gt.txt" > "data/eng-ground-truth/a01-003x-02.box"
tesseract "data/eng-ground-truth/a01-003x-02.tif" data/eng-ground-truth/a01-003x-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-007u-00.tif" -t "data/eng-ground-truth/a01-007u-00.gt.txt" > "data/eng-ground-truth/a01-007u-00.box"
tesseract "data/eng-ground-truth/a01-007u-00.tif" data/eng-ground-truth/a01-007u-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-007u-09.tif" -t "data/eng-ground-truth/a01-007u-09.gt.txt" > "data/eng-ground-truth/a01-007u-09.box"
tesseract "data/eng-ground-truth/a01-007u-09.tif" data/eng-ground-truth/a01-007u-09 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-014-01.tif" -t "data/eng-ground-truth/a01-014-01.gt.txt" > "data/eng-ground-truth/a01-014-01.box"
tesseract "data/eng-ground-truth/a01-014-01.tif" data/eng-ground-truth/a01-014-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-014-04.tif" -t "data/eng-ground-truth/a01-014-04.gt.txt" > "data/eng-ground-truth/a01-014-04.box"
tesseract "data/eng-ground-truth/a01-014-04.tif" data/eng-ground-truth/a01-014-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-014-07.tif" -t "data/eng-ground-truth/a01-014-07.gt.txt" > "data/eng-ground-truth/a01-014-07.box"
tesseract "data/eng-ground-truth/a01-014-07.tif" data/eng-ground-truth/a01-014-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-014x-08.tif" -t "data/eng-ground-truth/a01-014x-08.gt.txt" > "data/eng-ground-truth/a01-014x-08.box"
tesseract "data/eng-ground-truth/a01-014x-08.tif" data/eng-ground-truth/a01-014x-08 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-020u-05.tif" -t "data/eng-ground-truth/a01-020u-05.gt.txt" > "data/eng-ground-truth/a01-020u-05.box"
tesseract "data/eng-ground-truth/a01-020u-05.tif" data/eng-ground-truth/a01-020u-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-020x-01.tif" -t "data/eng-ground-truth/a01-020x-01.gt.txt" > "data/eng-ground-truth/a01-020x-01.box"
tesseract "data/eng-ground-truth/a01-020x-01.tif" data/eng-ground-truth/a01-020x-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-020x-02.tif" -t "data/eng-ground-truth/a01-020x-02.gt.txt" > "data/eng-ground-truth/a01-020x-02.box"
tesseract "data/eng-ground-truth/a01-020x-02.tif" data/eng-ground-truth/a01-020x-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-026-07.tif" -t "data/eng-ground-truth/a01-026-07.gt.txt" > "data/eng-ground-truth/a01-026-07.box"
tesseract "data/eng-ground-truth/a01-026-07.tif" data/eng-ground-truth/a01-026-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-026u-04.tif" -t "data/eng-ground-truth/a01-026u-04.gt.txt" > "data/eng-ground-truth/a01-026u-04.box"
tesseract "data/eng-ground-truth/a01-026u-04.tif" data/eng-ground-truth/a01-026u-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-030-00.tif" -t "data/eng-ground-truth/a01-030-00.gt.txt" > "data/eng-ground-truth/a01-030-00.box"
tesseract "data/eng-ground-truth/a01-030-00.tif" data/eng-ground-truth/a01-030-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-030-02.tif" -t "data/eng-ground-truth/a01-030-02.gt.txt" > "data/eng-ground-truth/a01-030-02.box"
tesseract "data/eng-ground-truth/a01-030-02.tif" data/eng-ground-truth/a01-030-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-030u-08.tif" -t "data/eng-ground-truth/a01-030u-08.gt.txt" > "data/eng-ground-truth/a01-030u-08.box"
tesseract "data/eng-ground-truth/a01-030u-08.tif" data/eng-ground-truth/a01-030u-08 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038-02.tif" -t "data/eng-ground-truth/a01-038-02.gt.txt" > "data/eng-ground-truth/a01-038-02.box"
tesseract "data/eng-ground-truth/a01-038-02.tif" data/eng-ground-truth/a01-038-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038-03.tif" -t "data/eng-ground-truth/a01-038-03.gt.txt" > "data/eng-ground-truth/a01-038-03.box"
tesseract "data/eng-ground-truth/a01-038-03.tif" data/eng-ground-truth/a01-038-03 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038-08.tif" -t "data/eng-ground-truth/a01-038-08.gt.txt" > "data/eng-ground-truth/a01-038-08.box"
tesseract "data/eng-ground-truth/a01-038-08.tif" data/eng-ground-truth/a01-038-08 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038x-01.tif" -t "data/eng-ground-truth/a01-038x-01.gt.txt" > "data/eng-ground-truth/a01-038x-01.box"
tesseract "data/eng-ground-truth/a01-038x-01.tif" data/eng-ground-truth/a01-038x-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038x-05.tif" -t "data/eng-ground-truth/a01-038x-05.gt.txt" > "data/eng-ground-truth/a01-038x-05.box"
tesseract "data/eng-ground-truth/a01-038x-05.tif" data/eng-ground-truth/a01-038x-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038x-07.tif" -t "data/eng-ground-truth/a01-038x-07.gt.txt" > "data/eng-ground-truth/a01-038x-07.box"
tesseract "data/eng-ground-truth/a01-038x-07.tif" data/eng-ground-truth/a01-038x-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038x-08.tif" -t "data/eng-ground-truth/a01-038x-08.gt.txt" > "data/eng-ground-truth/a01-038x-08.box"
tesseract "data/eng-ground-truth/a01-038x-08.tif" data/eng-ground-truth/a01-038x-08 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-043-02.tif" -t "data/eng-ground-truth/a01-043-02.gt.txt" > "data/eng-ground-truth/a01-043-02.box"
tesseract "data/eng-ground-truth/a01-043-02.tif" data/eng-ground-truth/a01-043-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-049u-07.tif" -t "data/eng-ground-truth/a01-049u-07.gt.txt" > "data/eng-ground-truth/a01-049u-07.box"
tesseract "data/eng-ground-truth/a01-049u-07.tif" data/eng-ground-truth/a01-049u-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-049x-00.tif" -t "data/eng-ground-truth/a01-049x-00.gt.txt" > "data/eng-ground-truth/a01-049x-00.box"
tesseract "data/eng-ground-truth/a01-049x-00.tif" data/eng-ground-truth/a01-049x-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-053-00.tif" -t "data/eng-ground-truth/a01-053-00.gt.txt" > "data/eng-ground-truth/a01-053-00.box"
tesseract "data/eng-ground-truth/a01-053-00.tif" data/eng-ground-truth/a01-053-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-053u-08.tif" -t "data/eng-ground-truth/a01-053u-08.gt.txt" > "data/eng-ground-truth/a01-053u-08.box"
tesseract "data/eng-ground-truth/a01-053u-08.tif" data/eng-ground-truth/a01-053u-08 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-053x-00.tif" -t "data/eng-ground-truth/a01-053x-00.gt.txt" > "data/eng-ground-truth/a01-053x-00.box"
tesseract "data/eng-ground-truth/a01-053x-00.tif" data/eng-ground-truth/a01-053x-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-058-05.tif" -t "data/eng-ground-truth/a01-058-05.gt.txt" > "data/eng-ground-truth/a01-058-05.box"
tesseract "data/eng-ground-truth/a01-058-05.tif" data/eng-ground-truth/a01-058-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-058u-06.tif" -t "data/eng-ground-truth/a01-058u-06.gt.txt" > "data/eng-ground-truth/a01-058u-06.box"
tesseract "data/eng-ground-truth/a01-058u-06.tif" data/eng-ground-truth/a01-058u-06 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-063-01.tif" -t "data/eng-ground-truth/a01-063-01.gt.txt" > "data/eng-ground-truth/a01-063-01.box"
tesseract "data/eng-ground-truth/a01-063-01.tif" data/eng-ground-truth/a01-063-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-063x-00.tif" -t "data/eng-ground-truth/a01-063x-00.gt.txt" > "data/eng-ground-truth/a01-063x-00.box"
tesseract "data/eng-ground-truth/a01-063x-00.tif" data/eng-ground-truth/a01-063x-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-072u-00.tif" -t "data/eng-ground-truth/a01-072u-00.gt.txt" > "data/eng-ground-truth/a01-072u-00.box"
tesseract "data/eng-ground-truth/a01-072u-00.tif" data/eng-ground-truth/a01-072u-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-077u-01.tif" -t "data/eng-ground-truth/a01-077u-01.gt.txt" > "data/eng-ground-truth/a01-077u-01.box"
tesseract "data/eng-ground-truth/a01-077u-01.tif" data/eng-ground-truth/a01-077u-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-077u-05.tif" -t "data/eng-ground-truth/a01-077u-05.gt.txt" > "data/eng-ground-truth/a01-077u-05.box"
tesseract "data/eng-ground-truth/a01-077u-05.tif" data/eng-ground-truth/a01-077u-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-082u-00.tif" -t "data/eng-ground-truth/a01-082u-00.gt.txt" > "data/eng-ground-truth/a01-082u-00.box"
tesseract "data/eng-ground-truth/a01-082u-00.tif" data/eng-ground-truth/a01-082u-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-082u-02.tif" -t "data/eng-ground-truth/a01-082u-02.gt.txt" > "data/eng-ground-truth/a01-082u-02.box"
tesseract "data/eng-ground-truth/a01-082u-02.tif" data/eng-ground-truth/a01-082u-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-082u-05.tif" -t "data/eng-ground-truth/a01-082u-05.gt.txt" > "data/eng-ground-truth/a01-082u-05.box"
tesseract "data/eng-ground-truth/a01-082u-05.tif" data/eng-ground-truth/a01-082u-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-091-02.tif" -t "data/eng-ground-truth/a01-091-02.gt.txt" > "data/eng-ground-truth/a01-091-02.box"
tesseract "data/eng-ground-truth/a01-091-02.tif" data/eng-ground-truth/a01-091-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-096u-01.tif" -t "data/eng-ground-truth/a01-096u-01.gt.txt" > "data/eng-ground-truth/a01-096u-01.box"
tesseract "data/eng-ground-truth/a01-096u-01.tif" data/eng-ground-truth/a01-096u-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-113u-10.tif" -t "data/eng-ground-truth/a01-113u-10.gt.txt" > "data/eng-ground-truth/a01-113u-10.box"
tesseract "data/eng-ground-truth/a01-113u-10.tif" data/eng-ground-truth/a01-113u-10 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-117-07.tif" -t "data/eng-ground-truth/a01-117-07.gt.txt" > "data/eng-ground-truth/a01-117-07.box"
tesseract "data/eng-ground-truth/a01-117-07.tif" data/eng-ground-truth/a01-117-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-117u-05.tif" -t "data/eng-ground-truth/a01-117u-05.gt.txt" > "data/eng-ground-truth/a01-117u-05.box"
tesseract "data/eng-ground-truth/a01-117u-05.tif" data/eng-ground-truth/a01-117u-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-128-02.tif" -t "data/eng-ground-truth/a01-128-02.gt.txt" > "data/eng-ground-truth/a01-128-02.box"
tesseract "data/eng-ground-truth/a01-128-02.tif" data/eng-ground-truth/a01-128-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-132-01.tif" -t "data/eng-ground-truth/a01-132-01.gt.txt" > "data/eng-ground-truth/a01-132-01.box"
tesseract "data/eng-ground-truth/a01-132-01.tif" data/eng-ground-truth/a01-132-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-132-03.tif" -t "data/eng-ground-truth/a01-132-03.gt.txt" > "data/eng-ground-truth/a01-132-03.box"
tesseract "data/eng-ground-truth/a01-132-03.tif" data/eng-ground-truth/a01-132-03 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-132u-01.tif" -t "data/eng-ground-truth/a01-132u-01.gt.txt" > "data/eng-ground-truth/a01-132u-01.box"
tesseract "data/eng-ground-truth/a01-132u-01.tif" data/eng-ground-truth/a01-132u-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-132u-05.tif" -t "data/eng-ground-truth/a01-132u-05.gt.txt" > "data/eng-ground-truth/a01-132u-05.box"
tesseract "data/eng-ground-truth/a01-132u-05.tif" data/eng-ground-truth/a01-132u-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-132x-05.tif" -t "data/eng-ground-truth/a01-132x-05.gt.txt" > "data/eng-ground-truth/a01-132x-05.box"
tesseract "data/eng-ground-truth/a01-132x-05.tif" data/eng-ground-truth/a01-132x-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-132x-07.tif" -t "data/eng-ground-truth/a01-132x-07.gt.txt" > "data/eng-ground-truth/a01-132x-07.box"
tesseract "data/eng-ground-truth/a01-132x-07.tif" data/eng-ground-truth/a01-132x-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-000-07.tif" -t "data/eng-ground-truth/a02-000-07.gt.txt" > "data/eng-ground-truth/a02-000-07.box"
tesseract "data/eng-ground-truth/a02-000-07.tif" data/eng-ground-truth/a02-000-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-004-00.tif" -t "data/eng-ground-truth/a02-004-00.gt.txt" > "data/eng-ground-truth/a02-004-00.box"
tesseract "data/eng-ground-truth/a02-004-00.tif" data/eng-ground-truth/a02-004-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-004-01.tif" -t "data/eng-ground-truth/a02-004-01.gt.txt" > "data/eng-ground-truth/a02-004-01.box"
tesseract "data/eng-ground-truth/a02-004-01.tif" data/eng-ground-truth/a02-004-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-008-05.tif" -t "data/eng-ground-truth/a02-008-05.gt.txt" > "data/eng-ground-truth/a02-008-05.box"
tesseract "data/eng-ground-truth/a02-008-05.tif" data/eng-ground-truth/a02-008-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-017-03.tif" -t "data/eng-ground-truth/a02-017-03.gt.txt" > "data/eng-ground-truth/a02-017-03.box"
tesseract "data/eng-ground-truth/a02-017-03.tif" data/eng-ground-truth/a02-017-03 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-017-05.tif" -t "data/eng-ground-truth/a02-017-05.gt.txt" > "data/eng-ground-truth/a02-017-05.box"
tesseract "data/eng-ground-truth/a02-017-05.tif" data/eng-ground-truth/a02-017-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-037-04.tif" -t "data/eng-ground-truth/a02-037-04.gt.txt" > "data/eng-ground-truth/a02-037-04.box"
tesseract "data/eng-ground-truth/a02-037-04.tif" data/eng-ground-truth/a02-037-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-053-07.tif" -t "data/eng-ground-truth/a02-053-07.gt.txt" > "data/eng-ground-truth/a02-053-07.box"
tesseract "data/eng-ground-truth/a02-053-07.tif" data/eng-ground-truth/a02-053-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-057-00.tif" -t "data/eng-ground-truth/a02-057-00.gt.txt" > "data/eng-ground-truth/a02-057-00.box"
tesseract "data/eng-ground-truth/a02-057-00.tif" data/eng-ground-truth/a02-057-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-072-04.tif" -t "data/eng-ground-truth/a02-072-04.gt.txt" > "data/eng-ground-truth/a02-072-04.box"
tesseract "data/eng-ground-truth/a02-072-04.tif" data/eng-ground-truth/a02-072-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-078-06.tif" -t "data/eng-ground-truth/a02-078-06.gt.txt" > "data/eng-ground-truth/a02-078-06.box"
tesseract "data/eng-ground-truth/a02-078-06.tif" data/eng-ground-truth/a02-078-06 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-086-02.tif" -t "data/eng-ground-truth/a02-086-02.gt.txt" > "data/eng-ground-truth/a02-086-02.box"
tesseract "data/eng-ground-truth/a02-086-02.tif" data/eng-ground-truth/a02-086-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-098-01.tif" -t "data/eng-ground-truth/a02-098-01.gt.txt" > "data/eng-ground-truth/a02-098-01.box"
tesseract "data/eng-ground-truth/a02-098-01.tif" data/eng-ground-truth/a02-098-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-116-05.tif" -t "data/eng-ground-truth/a02-116-05.gt.txt" > "data/eng-ground-truth/a02-116-05.box"
tesseract "data/eng-ground-truth/a02-116-05.tif" data/eng-ground-truth/a02-116-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-120-04.tif" -t "data/eng-ground-truth/a02-120-04.gt.txt" > "data/eng-ground-truth/a02-120-04.box"
tesseract "data/eng-ground-truth/a02-120-04.tif" data/eng-ground-truth/a02-120-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-017-03.tif" -t "data/eng-ground-truth/a03-017-03.gt.txt" > "data/eng-ground-truth/a03-017-03.box"
tesseract "data/eng-ground-truth/a03-017-03.tif" data/eng-ground-truth/a03-017-03 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-017-04.tif" -t "data/eng-ground-truth/a03-017-04.gt.txt" > "data/eng-ground-truth/a03-017-04.box"
tesseract "data/eng-ground-truth/a03-017-04.tif" data/eng-ground-truth/a03-017-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-017-07.tif" -t "data/eng-ground-truth/a03-017-07.gt.txt" > "data/eng-ground-truth/a03-017-07.box"
tesseract "data/eng-ground-truth/a03-017-07.tif" data/eng-ground-truth/a03-017-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-023-01.tif" -t "data/eng-ground-truth/a03-023-01.gt.txt" > "data/eng-ground-truth/a03-023-01.box"
tesseract "data/eng-ground-truth/a03-023-01.tif" data/eng-ground-truth/a03-023-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-023-06.tif" -t "data/eng-ground-truth/a03-023-06.gt.txt" > "data/eng-ground-truth/a03-023-06.box"
tesseract "data/eng-ground-truth/a03-023-06.tif" data/eng-ground-truth/a03-023-06 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-034-04.tif" -t "data/eng-ground-truth/a03-034-04.gt.txt" > "data/eng-ground-truth/a03-034-04.box"
tesseract "data/eng-ground-truth/a03-034-04.tif" data/eng-ground-truth/a03-034-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-050-03.tif" -t "data/eng-ground-truth/a03-050-03.gt.txt" > "data/eng-ground-truth/a03-050-03.box"
tesseract "data/eng-ground-truth/a03-050-03.tif" data/eng-ground-truth/a03-050-03 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-054-04.tif" -t "data/eng-ground-truth/a03-054-04.gt.txt" > "data/eng-ground-truth/a03-054-04.box"
tesseract "data/eng-ground-truth/a03-054-04.tif" data/eng-ground-truth/a03-054-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-054-05.tif" -t "data/eng-ground-truth/a03-054-05.gt.txt" > "data/eng-ground-truth/a03-054-05.box"
tesseract "data/eng-ground-truth/a03-054-05.tif" data/eng-ground-truth/a03-054-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-059-06.tif" -t "data/eng-ground-truth/a03-059-06.gt.txt" > "data/eng-ground-truth/a03-059-06.box"
tesseract "data/eng-ground-truth/a03-059-06.tif" data/eng-ground-truth/a03-059-06 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-059-08.tif" -t "data/eng-ground-truth/a03-059-08.gt.txt" > "data/eng-ground-truth/a03-059-08.box"
tesseract "data/eng-ground-truth/a03-059-08.tif" data/eng-ground-truth/a03-059-08 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-059-11.tif" -t "data/eng-ground-truth/a03-059-11.gt.txt" > "data/eng-ground-truth/a03-059-11.box"
tesseract "data/eng-ground-truth/a03-059-11.tif" data/eng-ground-truth/a03-059-11 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-063-02.tif" -t "data/eng-ground-truth/a03-063-02.gt.txt" > "data/eng-ground-truth/a03-063-02.box"
tesseract "data/eng-ground-truth/a03-063-02.tif" data/eng-ground-truth/a03-063-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-066-00.tif" -t "data/eng-ground-truth/a03-066-00.gt.txt" > "data/eng-ground-truth/a03-066-00.box"
tesseract "data/eng-ground-truth/a03-066-00.tif" data/eng-ground-truth/a03-066-00 --psm 7 lstm.train
python3 shuffle.py 0 "data/eng/all-lstmf"
if [ "" = "Windows_NT" ]; then
dos2unix "data/eng/eng.numbers";
dos2unix "data/eng/eng.punc";
dos2unix "data/eng/eng.wordlist";
dos2unix "data/langdata/eng/eng.config";
fi
combine_lang_model \
  --input_unicharset data/eng/unicharset \
  --script_dir data/langdata \
  --numbers data/eng/eng.numbers \
  --puncs data/eng/eng.punc \
  --words data/eng/eng.wordlist \
  --output_dir data \
  --lang eng
lstmtraining \
  --debug_interval 0 \
  --traineddata data/eng/eng.traineddata \
  --old_traineddata /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata \
  --continue_from data/eng/eng.lstm \
  --learning_rate 0.0001 \
  --model_output data/eng/checkpoints/eng \
  --train_listfile data/eng/list.train \
  --eval_listfile data/eng/list.eval \
  --max_iterations 10000 \
  --target_error_rate 0.01
/bin/bash: line 2: bc: command not found
/bin/bash: line 5: bc: command not found
+ head -n '' data/eng/all-lstmf
head: invalid number of lines: ''
+ tail -n '' data/eng/all-lstmf
tail: invalid number of lines: ''
+ '[' '' = Windows_NT ']'
Failed to read data from: data/eng/eng.wordlist
Failed to read data from: data/eng/eng.punc
Failed to read data from: data/eng/eng.numbers
Loaded unicharset of size 112 from file data/eng/unicharset
Setting unichar properties
Other case É of é is not in unicharset
Setting script properties
Warning: properties incomplete for index 47 = ~
Config file is optional, continuing...
Failed to read data from: data/langdata/eng/eng.config
Null char=2
Failed to load list of training filenames from data/eng/list.train
make: *** [Makefile:327: data/eng/checkpoints/eng_checkpoint] Error 1
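Reading the log above, the two `bc: command not found` errors are immediately followed by `head -n ''` and `tail -n ''` failing. That suggests the line counts for the train/eval split were never computed, which would leave `data/eng/list.train` empty. Installing `bc` and re-running is probably the direct fix. To illustrate what the split step appears to be doing, here is a minimal Python sketch. The 0.9 ratio and the names `split_lstmf_list` and `write_split` are assumptions for illustration, not taken from the Makefile.

```python
from pathlib import Path


def split_lstmf_list(lines, ratio_train=0.9):
    """Split a list of .lstmf paths into (train, eval) parts.

    ratio_train is an assumed value; adjust it to match your setup.
    """
    # Drop blank lines and surrounding whitespace.
    entries = [line.strip() for line in lines if line.strip()]
    # int() truncates, so any remainder goes to the evaluation part.
    n_train = int(len(entries) * ratio_train)
    return entries[:n_train], entries[n_train:]


def write_split(all_lstmf, list_train, list_eval, ratio_train=0.9):
    """Read the shuffled file list and write the train and eval list files."""
    lines = Path(all_lstmf).read_text(encoding="utf-8").splitlines()
    train, evaluation = split_lstmf_list(lines, ratio_train)
    if not train:
        # An empty training list is exactly the failure reported in this thread.
        raise ValueError(f"no training entries found in {all_lstmf}")
    Path(list_train).write_text("".join(f"{p}\n" for p in train), encoding="utf-8")
    Path(list_eval).write_text("".join(f"{p}\n" for p in evaluation), encoding="utf-8")
    return len(train), len(evaluation)
```

With the paths from the log, a call would look like `write_split("data/eng/all-lstmf", "data/eng/list.train", "data/eng/list.eval")`. This is only meant to show the logic; the build may regenerate these files once `bc` is available.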
I managed to solve the problem I was having, but now I get this error: make: *** [Makefile:327: data/hw/checkpoints/hw_checkpoint] Segmentation fault (core dumped)
How did you solve this problem? Could you elaborate?
I don't remember exactly what I did anymore; I just used make training MODEL_NAME=<MODEL_NAME>. I think I had added the wrong parameters to the command.
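Before re-running `make training MODEL_NAME=<MODEL_NAME>`, a quick pre-flight check can rule out two problems that show up in this thread: a missing command-line tool and an incomplete ground-truth directory. The sketch below is illustrative only. The tool list, the `data/<MODEL_NAME>-ground-truth` layout (inferred from the paths in the log) and the helper names `missing_tools` and `unpaired_images` are all assumptions.

```python
import shutil
from pathlib import Path


def missing_tools(tools=("bc", "tesseract", "combine_tessdata", "lstmtraining")):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def unpaired_images(ground_truth_dir):
    """Return the names of .tif files that have no matching .gt.txt file."""
    directory = Path(ground_truth_dir)
    return sorted(
        tif.name
        for tif in directory.glob("*.tif")
        # a01-000x-04.tif is expected to sit next to a01-000x-04.gt.txt
        if not tif.with_name(tif.stem + ".gt.txt").exists()
    )
```

For example, `missing_tools()` would have reported `bc` on the machine that produced the log above, and `unpaired_images("data/eng-ground-truth")` lists any line image that lacks a transcription.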