Comments (5)

zdenop commented on June 19, 2024

Failed to load list of training filenames from data/eng/list.train
Did you check the file?
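
A quick way to check it, as a minimal sketch: the helper name `check_list_file` is hypothetical (it is not part of tesstrain), and the path is the one from the error message. It reports how many entries the list contains and which of the listed files do not exist on disk.

```python
from pathlib import Path


def check_list_file(list_path):
    """Return (entries, missing) for a training list file.

    entries: the non-empty lines of the list file
    missing: the entries whose path does not exist on disk
    A list file that does not exist is treated like an empty one.
    """
    path = Path(list_path)
    if not path.is_file():
        return [], []
    lines = path.read_text(encoding="utf-8").splitlines()
    entries = [line.strip() for line in lines if line.strip()]
    missing = [entry for entry in entries if not Path(entry).is_file()]
    return entries, missing


if __name__ == "__main__":
    # Path taken from the error message; adjust for your model name.
    entries, missing = check_list_file("data/eng/list.train")
    print(f"{len(entries)} entries, {len(missing)} missing on disk")
```

If this reports 0 entries, the training step has nothing to load, which matches the error above.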

from tessdata.

xReniar commented on June 19, 2024

Yes, it's empty and I don't know why. I'm training Tesseract on line images. This is the output:

You are useing make version: 4.3
combine_tessdata -u /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata data/eng/eng
Extracting tessdata components from /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata
Wrote data/eng/eng.lstm
Wrote data/eng/eng.lstm-punc-dawg
Wrote data/eng/eng.lstm-word-dawg
Wrote data/eng/eng.lstm-number-dawg
Wrote data/eng/eng.lstm-unicharset
Wrote data/eng/eng.lstm-recoder
Wrote data/eng/eng.version
unicharset_extractor --output_unicharset "data/eng/my.unicharset" --norm_mode 2 "data/eng/all-gt"
merge_unicharsets data/eng/eng.lstm-unicharset data/eng/my.unicharset "data/eng/unicharset"
Loaded unicharset of size 112 from file data/eng/eng.lstm-unicharset
Loaded unicharset of size 81 from file data/eng/my.unicharset
Wrote unicharset file data/eng/unicharset.
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-000x-04.tif" -t "data/eng-ground-truth/a01-000x-04.gt.txt" > "data/eng-ground-truth/a01-000x-04.box"
tesseract "data/eng-ground-truth/a01-000x-04.tif" data/eng-ground-truth/a01-000x-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-000x-05.tif" -t "data/eng-ground-truth/a01-000x-05.gt.txt" > "data/eng-ground-truth/a01-000x-05.box"
tesseract "data/eng-ground-truth/a01-000x-05.tif" data/eng-ground-truth/a01-000x-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-003-09.tif" -t "data/eng-ground-truth/a01-003-09.gt.txt" > "data/eng-ground-truth/a01-003-09.box"
tesseract "data/eng-ground-truth/a01-003-09.tif" data/eng-ground-truth/a01-003-09 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-003x-02.tif" -t "data/eng-ground-truth/a01-003x-02.gt.txt" > "data/eng-ground-truth/a01-003x-02.box"
tesseract "data/eng-ground-truth/a01-003x-02.tif" data/eng-ground-truth/a01-003x-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-007u-00.tif" -t "data/eng-ground-truth/a01-007u-00.gt.txt" > "data/eng-ground-truth/a01-007u-00.box"
tesseract "data/eng-ground-truth/a01-007u-00.tif" data/eng-ground-truth/a01-007u-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-007u-09.tif" -t "data/eng-ground-truth/a01-007u-09.gt.txt" > "data/eng-ground-truth/a01-007u-09.box"
tesseract "data/eng-ground-truth/a01-007u-09.tif" data/eng-ground-truth/a01-007u-09 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-014-01.tif" -t "data/eng-ground-truth/a01-014-01.gt.txt" > "data/eng-ground-truth/a01-014-01.box"
tesseract "data/eng-ground-truth/a01-014-01.tif" data/eng-ground-truth/a01-014-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-014-04.tif" -t "data/eng-ground-truth/a01-014-04.gt.txt" > "data/eng-ground-truth/a01-014-04.box"
tesseract "data/eng-ground-truth/a01-014-04.tif" data/eng-ground-truth/a01-014-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-014-07.tif" -t "data/eng-ground-truth/a01-014-07.gt.txt" > "data/eng-ground-truth/a01-014-07.box"
tesseract "data/eng-ground-truth/a01-014-07.tif" data/eng-ground-truth/a01-014-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-014x-08.tif" -t "data/eng-ground-truth/a01-014x-08.gt.txt" > "data/eng-ground-truth/a01-014x-08.box"
tesseract "data/eng-ground-truth/a01-014x-08.tif" data/eng-ground-truth/a01-014x-08 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-020u-05.tif" -t "data/eng-ground-truth/a01-020u-05.gt.txt" > "data/eng-ground-truth/a01-020u-05.box"
tesseract "data/eng-ground-truth/a01-020u-05.tif" data/eng-ground-truth/a01-020u-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-020x-01.tif" -t "data/eng-ground-truth/a01-020x-01.gt.txt" > "data/eng-ground-truth/a01-020x-01.box"
tesseract "data/eng-ground-truth/a01-020x-01.tif" data/eng-ground-truth/a01-020x-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-020x-02.tif" -t "data/eng-ground-truth/a01-020x-02.gt.txt" > "data/eng-ground-truth/a01-020x-02.box"
tesseract "data/eng-ground-truth/a01-020x-02.tif" data/eng-ground-truth/a01-020x-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-026-07.tif" -t "data/eng-ground-truth/a01-026-07.gt.txt" > "data/eng-ground-truth/a01-026-07.box"
tesseract "data/eng-ground-truth/a01-026-07.tif" data/eng-ground-truth/a01-026-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-026u-04.tif" -t "data/eng-ground-truth/a01-026u-04.gt.txt" > "data/eng-ground-truth/a01-026u-04.box"
tesseract "data/eng-ground-truth/a01-026u-04.tif" data/eng-ground-truth/a01-026u-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-030-00.tif" -t "data/eng-ground-truth/a01-030-00.gt.txt" > "data/eng-ground-truth/a01-030-00.box"
tesseract "data/eng-ground-truth/a01-030-00.tif" data/eng-ground-truth/a01-030-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-030-02.tif" -t "data/eng-ground-truth/a01-030-02.gt.txt" > "data/eng-ground-truth/a01-030-02.box"
tesseract "data/eng-ground-truth/a01-030-02.tif" data/eng-ground-truth/a01-030-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-030u-08.tif" -t "data/eng-ground-truth/a01-030u-08.gt.txt" > "data/eng-ground-truth/a01-030u-08.box"
tesseract "data/eng-ground-truth/a01-030u-08.tif" data/eng-ground-truth/a01-030u-08 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038-02.tif" -t "data/eng-ground-truth/a01-038-02.gt.txt" > "data/eng-ground-truth/a01-038-02.box"
tesseract "data/eng-ground-truth/a01-038-02.tif" data/eng-ground-truth/a01-038-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038-03.tif" -t "data/eng-ground-truth/a01-038-03.gt.txt" > "data/eng-ground-truth/a01-038-03.box"
tesseract "data/eng-ground-truth/a01-038-03.tif" data/eng-ground-truth/a01-038-03 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038-08.tif" -t "data/eng-ground-truth/a01-038-08.gt.txt" > "data/eng-ground-truth/a01-038-08.box"
tesseract "data/eng-ground-truth/a01-038-08.tif" data/eng-ground-truth/a01-038-08 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038x-01.tif" -t "data/eng-ground-truth/a01-038x-01.gt.txt" > "data/eng-ground-truth/a01-038x-01.box"
tesseract "data/eng-ground-truth/a01-038x-01.tif" data/eng-ground-truth/a01-038x-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038x-05.tif" -t "data/eng-ground-truth/a01-038x-05.gt.txt" > "data/eng-ground-truth/a01-038x-05.box"
tesseract "data/eng-ground-truth/a01-038x-05.tif" data/eng-ground-truth/a01-038x-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038x-07.tif" -t "data/eng-ground-truth/a01-038x-07.gt.txt" > "data/eng-ground-truth/a01-038x-07.box"
tesseract "data/eng-ground-truth/a01-038x-07.tif" data/eng-ground-truth/a01-038x-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-038x-08.tif" -t "data/eng-ground-truth/a01-038x-08.gt.txt" > "data/eng-ground-truth/a01-038x-08.box"
tesseract "data/eng-ground-truth/a01-038x-08.tif" data/eng-ground-truth/a01-038x-08 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-043-02.tif" -t "data/eng-ground-truth/a01-043-02.gt.txt" > "data/eng-ground-truth/a01-043-02.box"
tesseract "data/eng-ground-truth/a01-043-02.tif" data/eng-ground-truth/a01-043-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-049u-07.tif" -t "data/eng-ground-truth/a01-049u-07.gt.txt" > "data/eng-ground-truth/a01-049u-07.box"
tesseract "data/eng-ground-truth/a01-049u-07.tif" data/eng-ground-truth/a01-049u-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-049x-00.tif" -t "data/eng-ground-truth/a01-049x-00.gt.txt" > "data/eng-ground-truth/a01-049x-00.box"
tesseract "data/eng-ground-truth/a01-049x-00.tif" data/eng-ground-truth/a01-049x-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-053-00.tif" -t "data/eng-ground-truth/a01-053-00.gt.txt" > "data/eng-ground-truth/a01-053-00.box"
tesseract "data/eng-ground-truth/a01-053-00.tif" data/eng-ground-truth/a01-053-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-053u-08.tif" -t "data/eng-ground-truth/a01-053u-08.gt.txt" > "data/eng-ground-truth/a01-053u-08.box"
tesseract "data/eng-ground-truth/a01-053u-08.tif" data/eng-ground-truth/a01-053u-08 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-053x-00.tif" -t "data/eng-ground-truth/a01-053x-00.gt.txt" > "data/eng-ground-truth/a01-053x-00.box"
tesseract "data/eng-ground-truth/a01-053x-00.tif" data/eng-ground-truth/a01-053x-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-058-05.tif" -t "data/eng-ground-truth/a01-058-05.gt.txt" > "data/eng-ground-truth/a01-058-05.box"
tesseract "data/eng-ground-truth/a01-058-05.tif" data/eng-ground-truth/a01-058-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-058u-06.tif" -t "data/eng-ground-truth/a01-058u-06.gt.txt" > "data/eng-ground-truth/a01-058u-06.box"
tesseract "data/eng-ground-truth/a01-058u-06.tif" data/eng-ground-truth/a01-058u-06 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-063-01.tif" -t "data/eng-ground-truth/a01-063-01.gt.txt" > "data/eng-ground-truth/a01-063-01.box"
tesseract "data/eng-ground-truth/a01-063-01.tif" data/eng-ground-truth/a01-063-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-063x-00.tif" -t "data/eng-ground-truth/a01-063x-00.gt.txt" > "data/eng-ground-truth/a01-063x-00.box"
tesseract "data/eng-ground-truth/a01-063x-00.tif" data/eng-ground-truth/a01-063x-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-072u-00.tif" -t "data/eng-ground-truth/a01-072u-00.gt.txt" > "data/eng-ground-truth/a01-072u-00.box"
tesseract "data/eng-ground-truth/a01-072u-00.tif" data/eng-ground-truth/a01-072u-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-077u-01.tif" -t "data/eng-ground-truth/a01-077u-01.gt.txt" > "data/eng-ground-truth/a01-077u-01.box"
tesseract "data/eng-ground-truth/a01-077u-01.tif" data/eng-ground-truth/a01-077u-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-077u-05.tif" -t "data/eng-ground-truth/a01-077u-05.gt.txt" > "data/eng-ground-truth/a01-077u-05.box"
tesseract "data/eng-ground-truth/a01-077u-05.tif" data/eng-ground-truth/a01-077u-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-082u-00.tif" -t "data/eng-ground-truth/a01-082u-00.gt.txt" > "data/eng-ground-truth/a01-082u-00.box"
tesseract "data/eng-ground-truth/a01-082u-00.tif" data/eng-ground-truth/a01-082u-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-082u-02.tif" -t "data/eng-ground-truth/a01-082u-02.gt.txt" > "data/eng-ground-truth/a01-082u-02.box"
tesseract "data/eng-ground-truth/a01-082u-02.tif" data/eng-ground-truth/a01-082u-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-082u-05.tif" -t "data/eng-ground-truth/a01-082u-05.gt.txt" > "data/eng-ground-truth/a01-082u-05.box"
tesseract "data/eng-ground-truth/a01-082u-05.tif" data/eng-ground-truth/a01-082u-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-091-02.tif" -t "data/eng-ground-truth/a01-091-02.gt.txt" > "data/eng-ground-truth/a01-091-02.box"
tesseract "data/eng-ground-truth/a01-091-02.tif" data/eng-ground-truth/a01-091-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-096u-01.tif" -t "data/eng-ground-truth/a01-096u-01.gt.txt" > "data/eng-ground-truth/a01-096u-01.box"
tesseract "data/eng-ground-truth/a01-096u-01.tif" data/eng-ground-truth/a01-096u-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-113u-10.tif" -t "data/eng-ground-truth/a01-113u-10.gt.txt" > "data/eng-ground-truth/a01-113u-10.box"
tesseract "data/eng-ground-truth/a01-113u-10.tif" data/eng-ground-truth/a01-113u-10 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-117-07.tif" -t "data/eng-ground-truth/a01-117-07.gt.txt" > "data/eng-ground-truth/a01-117-07.box"
tesseract "data/eng-ground-truth/a01-117-07.tif" data/eng-ground-truth/a01-117-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-117u-05.tif" -t "data/eng-ground-truth/a01-117u-05.gt.txt" > "data/eng-ground-truth/a01-117u-05.box"
tesseract "data/eng-ground-truth/a01-117u-05.tif" data/eng-ground-truth/a01-117u-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-128-02.tif" -t "data/eng-ground-truth/a01-128-02.gt.txt" > "data/eng-ground-truth/a01-128-02.box"
tesseract "data/eng-ground-truth/a01-128-02.tif" data/eng-ground-truth/a01-128-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-132-01.tif" -t "data/eng-ground-truth/a01-132-01.gt.txt" > "data/eng-ground-truth/a01-132-01.box"
tesseract "data/eng-ground-truth/a01-132-01.tif" data/eng-ground-truth/a01-132-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-132-03.tif" -t "data/eng-ground-truth/a01-132-03.gt.txt" > "data/eng-ground-truth/a01-132-03.box"
tesseract "data/eng-ground-truth/a01-132-03.tif" data/eng-ground-truth/a01-132-03 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-132u-01.tif" -t "data/eng-ground-truth/a01-132u-01.gt.txt" > "data/eng-ground-truth/a01-132u-01.box"
tesseract "data/eng-ground-truth/a01-132u-01.tif" data/eng-ground-truth/a01-132u-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-132u-05.tif" -t "data/eng-ground-truth/a01-132u-05.gt.txt" > "data/eng-ground-truth/a01-132u-05.box"
tesseract "data/eng-ground-truth/a01-132u-05.tif" data/eng-ground-truth/a01-132u-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-132x-05.tif" -t "data/eng-ground-truth/a01-132x-05.gt.txt" > "data/eng-ground-truth/a01-132x-05.box"
tesseract "data/eng-ground-truth/a01-132x-05.tif" data/eng-ground-truth/a01-132x-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a01-132x-07.tif" -t "data/eng-ground-truth/a01-132x-07.gt.txt" > "data/eng-ground-truth/a01-132x-07.box"
tesseract "data/eng-ground-truth/a01-132x-07.tif" data/eng-ground-truth/a01-132x-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-000-07.tif" -t "data/eng-ground-truth/a02-000-07.gt.txt" > "data/eng-ground-truth/a02-000-07.box"
tesseract "data/eng-ground-truth/a02-000-07.tif" data/eng-ground-truth/a02-000-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-004-00.tif" -t "data/eng-ground-truth/a02-004-00.gt.txt" > "data/eng-ground-truth/a02-004-00.box"
tesseract "data/eng-ground-truth/a02-004-00.tif" data/eng-ground-truth/a02-004-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-004-01.tif" -t "data/eng-ground-truth/a02-004-01.gt.txt" > "data/eng-ground-truth/a02-004-01.box"
tesseract "data/eng-ground-truth/a02-004-01.tif" data/eng-ground-truth/a02-004-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-008-05.tif" -t "data/eng-ground-truth/a02-008-05.gt.txt" > "data/eng-ground-truth/a02-008-05.box"
tesseract "data/eng-ground-truth/a02-008-05.tif" data/eng-ground-truth/a02-008-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-017-03.tif" -t "data/eng-ground-truth/a02-017-03.gt.txt" > "data/eng-ground-truth/a02-017-03.box"
tesseract "data/eng-ground-truth/a02-017-03.tif" data/eng-ground-truth/a02-017-03 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-017-05.tif" -t "data/eng-ground-truth/a02-017-05.gt.txt" > "data/eng-ground-truth/a02-017-05.box"
tesseract "data/eng-ground-truth/a02-017-05.tif" data/eng-ground-truth/a02-017-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-037-04.tif" -t "data/eng-ground-truth/a02-037-04.gt.txt" > "data/eng-ground-truth/a02-037-04.box"
tesseract "data/eng-ground-truth/a02-037-04.tif" data/eng-ground-truth/a02-037-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-053-07.tif" -t "data/eng-ground-truth/a02-053-07.gt.txt" > "data/eng-ground-truth/a02-053-07.box"
tesseract "data/eng-ground-truth/a02-053-07.tif" data/eng-ground-truth/a02-053-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-057-00.tif" -t "data/eng-ground-truth/a02-057-00.gt.txt" > "data/eng-ground-truth/a02-057-00.box"
tesseract "data/eng-ground-truth/a02-057-00.tif" data/eng-ground-truth/a02-057-00 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-072-04.tif" -t "data/eng-ground-truth/a02-072-04.gt.txt" > "data/eng-ground-truth/a02-072-04.box"
tesseract "data/eng-ground-truth/a02-072-04.tif" data/eng-ground-truth/a02-072-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-078-06.tif" -t "data/eng-ground-truth/a02-078-06.gt.txt" > "data/eng-ground-truth/a02-078-06.box"
tesseract "data/eng-ground-truth/a02-078-06.tif" data/eng-ground-truth/a02-078-06 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-086-02.tif" -t "data/eng-ground-truth/a02-086-02.gt.txt" > "data/eng-ground-truth/a02-086-02.box"
tesseract "data/eng-ground-truth/a02-086-02.tif" data/eng-ground-truth/a02-086-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-098-01.tif" -t "data/eng-ground-truth/a02-098-01.gt.txt" > "data/eng-ground-truth/a02-098-01.box"
tesseract "data/eng-ground-truth/a02-098-01.tif" data/eng-ground-truth/a02-098-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-116-05.tif" -t "data/eng-ground-truth/a02-116-05.gt.txt" > "data/eng-ground-truth/a02-116-05.box"
tesseract "data/eng-ground-truth/a02-116-05.tif" data/eng-ground-truth/a02-116-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a02-120-04.tif" -t "data/eng-ground-truth/a02-120-04.gt.txt" > "data/eng-ground-truth/a02-120-04.box"
tesseract "data/eng-ground-truth/a02-120-04.tif" data/eng-ground-truth/a02-120-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-017-03.tif" -t "data/eng-ground-truth/a03-017-03.gt.txt" > "data/eng-ground-truth/a03-017-03.box"
tesseract "data/eng-ground-truth/a03-017-03.tif" data/eng-ground-truth/a03-017-03 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-017-04.tif" -t "data/eng-ground-truth/a03-017-04.gt.txt" > "data/eng-ground-truth/a03-017-04.box"
tesseract "data/eng-ground-truth/a03-017-04.tif" data/eng-ground-truth/a03-017-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-017-07.tif" -t "data/eng-ground-truth/a03-017-07.gt.txt" > "data/eng-ground-truth/a03-017-07.box"
tesseract "data/eng-ground-truth/a03-017-07.tif" data/eng-ground-truth/a03-017-07 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-023-01.tif" -t "data/eng-ground-truth/a03-023-01.gt.txt" > "data/eng-ground-truth/a03-023-01.box"
tesseract "data/eng-ground-truth/a03-023-01.tif" data/eng-ground-truth/a03-023-01 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-023-06.tif" -t "data/eng-ground-truth/a03-023-06.gt.txt" > "data/eng-ground-truth/a03-023-06.box"
tesseract "data/eng-ground-truth/a03-023-06.tif" data/eng-ground-truth/a03-023-06 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-034-04.tif" -t "data/eng-ground-truth/a03-034-04.gt.txt" > "data/eng-ground-truth/a03-034-04.box"
tesseract "data/eng-ground-truth/a03-034-04.tif" data/eng-ground-truth/a03-034-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-050-03.tif" -t "data/eng-ground-truth/a03-050-03.gt.txt" > "data/eng-ground-truth/a03-050-03.box"
tesseract "data/eng-ground-truth/a03-050-03.tif" data/eng-ground-truth/a03-050-03 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-054-04.tif" -t "data/eng-ground-truth/a03-054-04.gt.txt" > "data/eng-ground-truth/a03-054-04.box"
tesseract "data/eng-ground-truth/a03-054-04.tif" data/eng-ground-truth/a03-054-04 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-054-05.tif" -t "data/eng-ground-truth/a03-054-05.gt.txt" > "data/eng-ground-truth/a03-054-05.box"
tesseract "data/eng-ground-truth/a03-054-05.tif" data/eng-ground-truth/a03-054-05 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-059-06.tif" -t "data/eng-ground-truth/a03-059-06.gt.txt" > "data/eng-ground-truth/a03-059-06.box"
tesseract "data/eng-ground-truth/a03-059-06.tif" data/eng-ground-truth/a03-059-06 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-059-08.tif" -t "data/eng-ground-truth/a03-059-08.gt.txt" > "data/eng-ground-truth/a03-059-08.box"
tesseract "data/eng-ground-truth/a03-059-08.tif" data/eng-ground-truth/a03-059-08 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-059-11.tif" -t "data/eng-ground-truth/a03-059-11.gt.txt" > "data/eng-ground-truth/a03-059-11.box"
tesseract "data/eng-ground-truth/a03-059-11.tif" data/eng-ground-truth/a03-059-11 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-063-02.tif" -t "data/eng-ground-truth/a03-063-02.gt.txt" > "data/eng-ground-truth/a03-063-02.box"
tesseract "data/eng-ground-truth/a03-063-02.tif" data/eng-ground-truth/a03-063-02 --psm 7 lstm.train
PYTHONIOENCODING=utf-8 python3 generate_line_box.py -i "data/eng-ground-truth/a03-066-00.tif" -t "data/eng-ground-truth/a03-066-00.gt.txt" > "data/eng-ground-truth/a03-066-00.box"
tesseract "data/eng-ground-truth/a03-066-00.tif" data/eng-ground-truth/a03-066-00 --psm 7 lstm.train
python3 shuffle.py 0 "data/eng/all-lstmf"
if [ "" = "Windows_NT" ]; then
dos2unix "data/eng/eng.numbers";
dos2unix "data/eng/eng.punc";
dos2unix "data/eng/eng.wordlist";
dos2unix "data/langdata/eng/eng.config";
fi
combine_lang_model \
  --input_unicharset data/eng/unicharset \
  --script_dir data/langdata \
  --numbers data/eng/eng.numbers \
  --puncs data/eng/eng.punc \
  --words data/eng/eng.wordlist \
  --output_dir data \
  --lang eng
lstmtraining \
  --debug_interval 0 \
  --traineddata data/eng/eng.traineddata \
  --old_traineddata /usr/share/tesseract-ocr/4.00/tessdata/eng.traineddata \
  --continue_from data/eng/eng.lstm \
  --learning_rate 0.0001 \
  --model_output data/eng/checkpoints/eng \
  --train_listfile data/eng/list.train \
  --eval_listfile data/eng/list.eval \
  --max_iterations 10000 \
  --target_error_rate 0.01
/bin/bash: line 2: bc: command not found
/bin/bash: line 5: bc: command not found

+ head -n '' data/eng/all-lstmf
head: invalid number of lines: ''
+ tail -n '' data/eng/all-lstmf
tail: invalid number of lines: ''
+ '[' '' = Windows_NT ']'
Failed to read data from: data/eng/eng.wordlist
Failed to read data from: data/eng/eng.punc
Failed to read data from: data/eng/eng.numbers
Loaded unicharset of size 112 from file data/eng/unicharset
Setting unichar properties
Other case É of é is not in unicharset
Setting script properties
Warning: properties incomplete for index 47 = ~
Config file is optional, continuing...
Failed to read data from: data/langdata/eng/eng.config
Null char=2
Failed to load list of training filenames from data/eng/list.train
make: *** [Makefile:327: data/eng/checkpoints/eng_checkpoint] Error 1
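
Judging from the log above, the empty list most likely traces back to the two `bc: command not found` lines: the line counts handed to `head -n` and `tail -n` came out as empty strings, so nothing was written to data/eng/list.train. The sketch below only illustrates that reasoning and is not the Makefile's actual code. The names `split_counts` and `missing_tools` and the 90% training share are assumptions made for the example.

```python
import shutil


def split_counts(total, train_percent=90):
    """Split `total` lines into (train, eval) counts.

    Integer arithmetic is used, so the training count is rounded down.
    The 90% default is an assumption for this example.
    """
    train = total * train_percent // 100
    return train, total - train


def missing_tools(tools=("bc",)):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # 82 line images appear in the log above.
    train, evaluation = split_counts(82)
    print(f"head -n {train} / tail -n {evaluation}")
    print("missing tools:", missing_tools())
```

If `missing_tools()` reports `bc`, installing it through the distribution's package manager and rerunning the training would be the first thing to try.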


xReniar commented on June 19, 2024

I managed to solve the problem I was having, but now I get this error: make: *** [Makefile:327: data/hw/checkpoints/hw_checkpoint] Segmentation fault (core dumped)


Sairahul07-25 commented on June 19, 2024

How did you solve this problem? Could you elaborate?


xReniar commented on June 19, 2024

I don't remember what I did anymore; I just ran make training MODEL_NAME=<MODEL_NAME>. I think I had passed wrong parameters to the command.
