doc-analysis / tablebank
TableBank: A Benchmark Dataset for Table Detection and Recognition
License: Apache License 2.0
Thanks for your great work! Could the method used to generate TableBank be released?
As the title says. This would help reproduce your experiments.
I am running the code on google colab, and running:
!python detectron/tools/infer_simple.py --cfg /content/All_X101.yaml --output-dir /tmp/detectron-tablebank --image-ext jpg \
--wts /content/model_final.pth /content/drive/MyDrive/TableBank/Image
The error is:
Found Detectron ops lib: /usr/local/lib/python3.7/dist-packages/torch/lib/libcaffe2_detectron_ops_gpu.so
[E init_intrinsics_check.cc:44] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:44] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:44] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
Traceback (most recent call last):
File "detectron/tools/infer_simple.py", line 185, in <module>
main(args)
File "detectron/tools/infer_simple.py", line 125, in main
merge_cfg_from_file(args.cfg)
File "/content/detectron/detectron/core/config.py", line 1152, in merge_cfg_from_file
_merge_a_into_b(yaml_cfg, __C)
File "/content/detectron/detectron/core/config.py", line 1202, in _merge_a_into_b
raise KeyError('Non-existent config key: {}'.format(full_key))
KeyError: 'Non-existent config key: _BASE_'
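`_BASE_` is the key that Detectron2-style configs use to inherit from another config file, while the original Detectron loader rejects any key it does not define, which matches the `KeyError` above. A minimal sketch, assuming the config is a plain YAML text file, for checking which framework a config targets before passing it to `infer_simple.py` (the function name is hypothetical):

```python
def uses_detectron2_base_key(cfg_text):
    """Return True if a top-level line of the YAML text defines `_BASE_`."""
    for line in cfg_text.splitlines():
        # Drop trailing comments, keep leading whitespace so that only
        # top-level (unindented) keys are matched.
        stripped = line.split("#", 1)[0].rstrip()
        if stripped.startswith("_BASE_:"):
            return True
    return False


if __name__ == "__main__":
    sample = '_BASE_: "../Base-RCNN-FPN.yaml"\nMODEL:\n  WEIGHTS: "model_final.pth"\n'
    print(uses_detectron2_base_key(sample))
```

If the check returns True, the config is most likely meant for Detectron2 rather than for the original Detectron tools.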
My command:
python train.py -model_type img -data data/demo -save_model demo-model -gpu_ranks 0 -batch_size 4 -learning_rate 0.1 -word_vec_size 80 -encoder_type brnn -image_channel_size 3
My GPU run crashes at around step 10000 of 100000.
I set batch_size to 4 because the specified value of 20 does not run.
Hi,
I'd like to thank you for releasing this nice dataset. However, I found that the annotation quality is not very high, mainly because of two issues: (1) missing annotations and (2) incorrect annotations.
Issue 1 was mentioned in #9, where the author answered:
some error may cause a little table unlabeled
However, I plotted the first 100 image ids and their annotations in /Detection_data/Word and found that 21 of the 100 images have missing annotations (between 1 and 3 tables missing per image). Unless I was extremely unlucky to hit these problematic annotations in the first 100 plots, the issue is not limited to 'a little table'.
To be specific, here are the imgIds for those 21 images:
3, 9, 10, 27, 32, 33, 39, 47, 51, 56, 57, 58, 59, 60, 61, 62, 73, 76, 77, 87, 95
As for issue 2, I found 3 images (out of 100 tested images) with incorrect annotations:
18, 62, 83
I understand from the paper that these annotations are generated by parsing the PDF/Word documents, and that the document-parsing code could not catch all the tables. I am posting this only to give researchers some information they might care about.
Issue 1 is actually not hard to fix. I have trained a table detection model (on other datasets) with decent performance. I'd like to run this model once over all the data provided here, hopefully spot a large number of missing annotations, and then fix those manually. I'd be happy to share and discuss more.
I load the data with pycocotools and get the annotations for each image using:
img_ann = coco.loadAnns(coco.getAnnIds(imgIds = image_id))
and plotted the annotations on a matplotlib figure using
coco.showAnns(img_ann)
The missing/incorrect annotations were then spotted by eye.
I'd be happy to discuss more and provide the test notebook (ipynb) if wanted.
Best,
Julian
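The manual check described in the post above can be narrowed down with a small pre-filter. The sketch below assumes the annotation file is a COCO-style JSON with `images` and `annotations` lists (the layout pycocotools expects) and lists image ids that have no annotation at all; the function names are hypothetical. Pages where only some of the tables are missing would still need the visual inspection with `coco.showAnns`.

```python
def annotation_counts(coco):
    """Map every image id in a COCO-style dict to its number of annotations."""
    counts = {img["id"]: 0 for img in coco["images"]}
    for ann in coco["annotations"]:
        counts[ann["image_id"]] = counts.get(ann["image_id"], 0) + 1
    return counts


def images_without_annotations(coco):
    """Return the sorted ids of images that have zero annotations."""
    return sorted(i for i, n in annotation_counts(coco).items() if n == 0)


if __name__ == "__main__":
    # Tiny made-up example, not real TableBank data.
    sample = {
        "images": [{"id": 1}, {"id": 2}, {"id": 3}],
        "annotations": [{"image_id": 1}, {"image_id": 1}, {"image_id": 3}],
    }
    print(images_without_annotations(sample))
```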
how do I get the coordinates of the boxes?
I think I saw it somewhere, I can't find it now.
I downloaded the dataset parts but I cannot manage to extract the files correctly.
I tried different commands cited here: https://unix.stackexchange.com/questions/40480/how-to-unzip-a-multipart-spanned-zip-on-linux
But the only method that worked was this one:
cat test.zip.* >test.zip
zip -FF test.zip --out test-full.zip
unzip test-full.zip
However, after the extraction one of the annotation JSON files is broken and has not been extracted correctly.
Can someone share how they extracted the dataset, please?
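For readers without `cat` (for example on Windows), the concatenation step from the post above can be written in Python. This is a minimal sketch, assuming the parts share a common prefix and sort correctly by name (as zero-padded suffixes such as `.001`, `.002` do); the function name is hypothetical. It only joins the bytes, so the resulting file may still need the `zip -FF` repair step mentioned above.

```python
import glob
import shutil


def join_parts(prefix, out_path):
    """Concatenate files named `prefix.*`, in sorted name order, into out_path."""
    parts = sorted(
        p for p in glob.glob(glob.escape(prefix) + ".*") if p != out_path
    )
    if not parts:
        raise FileNotFoundError("no parts found for prefix: " + prefix)
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream each part so large archives are never fully in memory.
                shutil.copyfileobj(src, out)
    return parts
```

For example, `join_parts("test.zip", "test-joined.zip")` would mirror `cat test.zip.* > test-joined.zip`.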
First, thank you very much for your code. However, I couldn't find a way to use TableBank to recognize table structure. Could you provide one? Thank you.
Hi, is there any upcoming update to make the models work with Detectron2?
Thanks in advance,
I get the error below when running this command with the recognition model saved locally as model.pt:
python translate.py -model model.pt --src_dir recognition.jpg -output pred.txt
usage: translate.py [-h] [-config CONFIG] [-save_config SAVE_CONFIG] --model
MODEL [MODEL ...] [--fp32] [--avg_raw_probs]
[--data_type DATA_TYPE] --src SRC [--src_dir SRC_DIR]
[--tgt TGT] [--shard_size SHARD_SIZE] [--output OUTPUT]
[--report_bleu] [--report_rouge] [--report_time]
[--dynamic_dict] [--share_vocab]
[--random_sampling_topk RANDOM_SAMPLING_TOPK]
[--random_sampling_temp RANDOM_SAMPLING_TEMP]
[--seed SEED] [--beam_size BEAM_SIZE]
[--min_length MIN_LENGTH] [--max_length MAX_LENGTH]
[--max_sent_length] [--stepwise_penalty]
[--length_penalty {none,wu,avg}] [--ratio RATIO]
[--coverage_penalty {none,wu,summary}] [--alpha ALPHA]
[--beta BETA] [--block_ngram_repeat BLOCK_NGRAM_REPEAT]
[--ignore_when_blocking IGNORE_WHEN_BLOCKING [IGNORE_WHEN_BLOCKING ...]]
[--replace_unk] [--phrase_table PHRASE_TABLE] [--verbose]
[--log_file LOG_FILE]
[--log_file_level {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET,50,40,30,20,10,0}]
[--attn_debug] [--dump_beam DUMP_BEAM] [--n_best N_BEST]
[--batch_size BATCH_SIZE] [--gpu GPU]
[--sample_rate SAMPLE_RATE] [--window_size WINDOW_SIZE]
[--window_stride WINDOW_STRIDE] [--window WINDOW]
[--image_channel_size {3,1}]
translate.py: error: the following arguments are required: --src/-src
pytorch 0.4.1
py36_cuda9.2.148_cudnn7.1.4_1
Can someone tell me what the error could be?
I don't know what to pass as the argument to -src.
I see here that -src means "Source sequence to decode (one line per sequence)", which is what I don't understand!
The suggestion mentioned in this issue doesn't help either.
How are cell points annotated in the table recognition datasets?
I want to run this model in Detectron on CPU (not GPU).
I'm using the table detection model and the Detectron code.
Can anybody help me with how to do that?
I tried to download the TableBank dataset from the official website https://doc-analysis.github.io/tablebank-page/index.html, but it seems that I do not have the permission to download it. How can I obtain this dataset?
I would like to know the steps for fine-tuning the model on custom classes.
I have downloaded the TableBank dataset from your dataset homepage and checked it.
I found some issues in the annotations. The README gives the number of tables in the Table Detection task as follows:
Task | Word | Latex | Word+Latex |
---|---|---|---|
Table detection | 163,417 | 253,817 | 417,234 |
But when I ran my script to check the annotations, it showed only 101889 tables in the Word subset.
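A minimal sketch of such a counting script, assuming the Word annotations are stored as COCO-style JSON with an `annotations` list in which each entry is one table (the function name and the file path in the comment are hypothetical). If the subset ships as several JSON files, the totals would need to be summed before comparing with the README figure.

```python
import json
from collections import Counter


def summarize_annotations(coco):
    """Return (total annotations, per-category counts, number of annotated images)."""
    anns = coco["annotations"]
    per_category = Counter(a["category_id"] for a in anns)
    annotated_images = len({a["image_id"] for a in anns})
    return len(anns), dict(per_category), annotated_images


# Example usage (the path is an assumption, adjust to your download):
# with open("Word.json") as f:
#     print(summarize_annotations(json.load(f)))
```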
Hi,
What is the breakdown of topics or document types (news articles, scientific papers, etc.) in the TableBank dataset?
I used the following command to predict the structure of the table:
python translate.py -model model.pt --src_dir './tables/' --src './src_txt.txt' -output pred.txt
and I get the following error:
AssertionError: Cannot use _dir with TextDataReader.
From your previous replies to issues https://github.com/doc-analysis/TableBank/issues/12 and https://github.com/doc-analysis/TableBank/issues/10, it looks like I can test the model by using -tgt (providing a ground truth file).
Can I not simply predict on a sample?
I am a bit confused. Could anyone please explain this to me? :)
I tried it this way:
python drive/My\ Drive/OpenNMT-py/translate.py -data_type img -model drive/My\ Drive/Pretrained_Word_Embeddings/detectron_table_detection/model.pt -src_dir drive/My\ Drive/datasets/table_dataset_sample/8.jpg \
-output pred.txt -max_length 150 -beam_size 5 -gpu 0 -verbose
I am getting the same issue. I don't know what -src_dir and -src are.
usage: translate.py [-h] [-config CONFIG] [-save_config SAVE_CONFIG] --model
MODEL [MODEL ...] [--fp32] [--avg_raw_probs]
[--data_type DATA_TYPE] --src SRC [--src_dir SRC_DIR]
[--tgt TGT] [--shard_size SHARD_SIZE] [--output OUTPUT]
[--report_bleu] [--report_rouge] [--report_time]
[--dynamic_dict] [--share_vocab]
[--random_sampling_topk RANDOM_SAMPLING_TOPK]
[--random_sampling_temp RANDOM_SAMPLING_TEMP]
[--seed SEED] [--beam_size BEAM_SIZE]
[--min_length MIN_LENGTH] [--max_length MAX_LENGTH]
[--max_sent_length] [--stepwise_penalty]
[--length_penalty {none,wu,avg}] [--ratio RATIO]
[--coverage_penalty {none,wu,summary}] [--alpha ALPHA]
[--beta BETA] [--block_ngram_repeat BLOCK_NGRAM_REPEAT]
[--ignore_when_blocking IGNORE_WHEN_BLOCKING [IGNORE_WHEN_BLOCKING ...]]
[--replace_unk] [--phrase_table PHRASE_TABLE] [--verbose]
[--log_file LOG_FILE]
[--log_file_level {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET,50,40,30,20,10,0}]
[--attn_debug] [--dump_beam DUMP_BEAM] [--n_best N_BEST]
[--batch_size BATCH_SIZE] [--gpu GPU]
[--sample_rate SAMPLE_RATE] [--window_size WINDOW_SIZE]
[--window_stride WINDOW_STRIDE] [--window WINDOW]
[--image_channel_size {3,1}]
translate.py: error: the following arguments are required: --src/-src
The documentation (Image to Text) says:
python translate.py -data_type img -model demo-model_acc_x_ppl_x_e13.pt -src_dir data/im2text/images \
-src data/im2text/src-test.txt -output pred.txt -max_length 150 -beam_size 5 -gpu 0 -verbose
-src_dir: The directory containing the images.
Then why do I need -src data/im2text/src-test.txt?
We want image to text, so why is the source a txt file? Can anyone clarify what it is?
Thank you all
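As far as the OpenNMT-py image-to-text example goes, the `-src` file is a plain text list with one image filename per line, given relative to `-src_dir`, which would explain why both arguments are required. A minimal sketch, under that assumption, for generating such a list from a folder of images (the function name, the extension list and the output path are hypothetical):

```python
import os

IMAGE_EXTS = (".jpg", ".jpeg", ".png")


def write_src_list(src_dir, src_file):
    """Write one image filename per line (relative to src_dir) into src_file."""
    names = sorted(
        n for n in os.listdir(src_dir) if n.lower().endswith(IMAGE_EXTS)
    )
    with open(src_file, "w", encoding="utf-8") as f:
        for name in names:
            f.write(name + "\n")
    return names
```

With a list written by, for example, `write_src_list("./tables", "src-test.txt")`, the command from the documentation would be run with `-data_type img -src_dir ./tables -src src-test.txt`.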
Decompressing TableBank.zip.001 requires a password. Where can I get it?
I found a problem in the data: some tables are not labeled. Two examples from Word.json:
{'category_id': 1, 'area': 46280, 'iscrowd': 0, 'segmentation': [[71, 176, 71, 280, 516, 280, 516, 176]], 'id': 69303, 'image_id': 53565, 'bbox': [71, 176, 445, 104]}
{'category_id': 1, 'area': 143613, 'iscrowd': 0, 'segmentation': [[66, 72, 66, 269, 795, 269, 795, 72]], 'id': 67935, 'image_id': 52492, 'bbox': [66, 72, 729, 197]}
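In the two records above, `bbox` follows the COCO convention `[x, y, width, height]`, which is consistent with the listed `segmentation` polygons and `area` values (for the first record, 71 + 445 = 516, 176 + 104 = 280 and 445 * 104 = 46280). A minimal sketch for turning such a box into corner coordinates (the function name is hypothetical):

```python
def bbox_to_corners(bbox):
    """Convert a COCO [x, y, width, height] box to (x1, y1, x2, y2)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


if __name__ == "__main__":
    # Values taken from the first record quoted above.
    print(bbox_to_corners([71, 176, 445, 104]))
```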
I have submitted the form but haven't received the reply email. Could you please send me the download link? My Gmail address is [email protected]. Thanks a lot.
I tried table structure recognition on a table image using the TableBank pretrained model.
Here are the results:
python drive/My\ Drive/OpenNMT-py/translate.py -data_type img -model drive/My\ Drive/Pretrained_Word_Embeddings/detectron_table_detection/model.pt -src_dir drive/My\ Drive/datasets/table_dataset_sample/ \
-src src-test.txt -output pred.txt -max_length 150 -beam_size 5 -gpu 0 -verbose
[2019-08-21 05:40:53,223 INFO] Translating shard 0.
/usr/local/lib/python3.6/dist-packages/torchtext/data/field.py:359: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
var = torch.tensor(arr, dtype=self.dtype, device=device)
SENT 1: None
PRED 1: <tabular> <tbody> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> </tbody> </tabular>
PRED SCORE: -1.2848
PRED AVG SCORE: -0.0111, PRED PPL: 1.0111
Please correct me if I am wrong. :)
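The `PRED` line above is a flat sequence of structure tags. A minimal sketch for turning it into rows of cells, assuming `<tdy>` marks a cell that contains text and `<tdn>` a cell without text (this reading of the two tags is an assumption, and the function name is hypothetical):

```python
def rows_from_prediction(pred):
    """Split a predicted tag sequence into rows of flags (True = cell has text)."""
    rows, current = [], None
    for tok in pred.split():
        if tok == "<tr>":
            current = []
        elif tok == "</tr>":
            if current is not None:
                rows.append(current)
            current = None
        elif tok in ("<tdy>", "<tdn>") and current is not None:
            current.append(tok == "<tdy>")
    return rows


if __name__ == "__main__":
    pred = "<tabular> <tbody> <tr> <tdy> <tdn> </tr> <tr> <tdy> <tdy> </tr> </tbody> </tabular>"
    rows = rows_from_prediction(pred)
    print(len(rows), [len(r) for r in rows])
```

The output only describes the table layout (rows, cells, and whether each cell has text); it does not contain the cell text itself.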
Unable to download TableBank.zip[1]~[5] from
https://doc-analysis.github.io/tablebank-page/index.html
This XML file does not appear to have any style information associated with it. The document tree is shown below.
AuthenticationFailed
Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. ...
Traceback (most recent call last):
File "tools/infer_simple.py", line 185, in <module>
main(args)
File "tools/infer_simple.py", line 125, in main
merge_cfg_from_file(args.cfg)
File "/home/anshuman/detectron/detectron/core/config.py", line 1148, in merge_cfg_from_file
yaml_cfg = AttrDict(load_cfg(f))
File "/home/anshuman/detectron/detectron/core/config.py", line 1142, in load_cfg
return envu.yaml_load(cfg_to_load)
File "/home/anshuman/Downloads/envs/myenv/lib/python3.7/site-packages/yaml/__init__.py", line 70, in load
loader = Loader(stream)
File "/home/anshuman/Downloads/envs/myenv/lib/python3.7/site-packages/yaml/loader.py", line 34, in __init__
Reader.__init__(self, stream)
File "/home/anshuman/Downloads/envs/myenv/lib/python3.7/site-packages/yaml/reader.py", line 74, in __init__
self.check_printable(stream)
File "/home/anshuman/Downloads/envs/myenv/lib/python3.7/site-packages/yaml/reader.py", line 144, in check_printable
'unicode', "special characters are not allowed")
yaml.reader.ReaderError: unacceptable character #x0002: special characters are not allowed
in "", position 0
Is it possible to get the code used to calculate the Precision, Recall and F1 scores reported in the table here on GitHub? I'm talking about the results obtained with the latest released Detectron2 checkpoints.
The instructions in the paper are not very clear.
https://layoutlm.blob.core.windows.net/tablebank/model_zoo/detection/Latex_X152/model_final.pth
<Error>
<Code>PublicAccessNotPermitted</Code>
<Message>
Public access is not permitted on this storage account. RequestId:f2d499af-401e-0029-4a9c-a9d626000000 Time:2023-06-28T08:40:37.7148634Z
</Message>
</Error>
How can I download the following?
X152(Latex) | Model/Config | 1.03GB
Getting this error while trying to infer
INFO net.py: 96: rpn_bbox_pred_fpn2_w [+ momentum] loaded from weights file into gpu_0/rpn_bbox_pred_fpn2_w: (12, 256, 1, 1)
INFO net.py: 96: rpn_bbox_pred_fpn2_b [+ momentum] loaded from weights file into gpu_0/rpn_bbox_pred_fpn2_b: (12,)
INFO net.py: 96: fc6_w [+ momentum] loaded from weights file into gpu_0/fc6_w: (1024, 12544)
INFO net.py: 96: fc6_b [+ momentum] loaded from weights file into gpu_0/fc6_b: (1024,)
INFO net.py: 96: fc7_w [+ momentum] loaded from weights file into gpu_0/fc7_w: (1024, 1024)
INFO net.py: 96: fc7_b [+ momentum] loaded from weights file into gpu_0/fc7_b: (1024,)
INFO net.py: 96: cls_score_w [+ momentum] loaded from weights file into gpu_0/cls_score_w: (2, 1024)
INFO net.py: 96: cls_score_b [+ momentum] loaded from weights file into gpu_0/cls_score_b: (2,)
INFO net.py: 96: bbox_pred_w [+ momentum] loaded from weights file into gpu_0/bbox_pred_w: (8, 1024)
INFO net.py: 96: bbox_pred_b [+ momentum] loaded from weights file into gpu_0/bbox_pred_b: (8,)
INFO net.py: 133: pred_b preserved in workspace (unused)
INFO net.py: 133: pred_w preserved in workspace (unused)
[I net_dag_utils.cc:102] Operator graph pruning prior to chain compute took: 7.9674e-05 secs
[I net_dag_utils.cc:102] Operator graph pruning prior to chain compute took: 5.9175e-05 secs
INFO infer_simple.py: 147: Processing /images/ -> /tmp/detectron-tablebank/.pdf
Traceback (most recent call last):
File "detectron/tools/infer_simple.py", line 185, in <module>
main(args)
File "detectron/tools/infer_simple.py", line 153, in main
model, im, None, timers=timers
File "/content/detectron/detectron/core/test.py", line 66, in im_detect_all
model, im, cfg.TEST.SCALE, cfg.TEST.MAX_SIZE, boxes=box_proposals
File "/content/detectron/detectron/core/test.py", line 137, in im_detect_bbox
inputs, im_scale = _get_blobs(im, boxes, target_scale, target_max_size)
File "/content/detectron/detectron/core/test.py", line 946, in _get_blobs
blob_utils.get_image_blob(im, target_scale, target_max_size)
File "/content/detectron/detectron/utils/blob.py", line 52, in get_image_blob
im, cfg.PIXEL_MEANS, target_scale, target_max_size
File "/content/detectron/detectron/utils/blob.py", line 108, in prep_im_for_blob
im = im.astype(np.float32, copy=False)
AttributeError: 'NoneType' object has no attribute 'astype'
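The `AttributeError` means `im` was `None` when it reached `prep_im_for_blob`; `cv2.imread` returns `None` instead of raising when it cannot read a path. The log line shows the tool processing `/images/` itself, which suggests (a guess from the log, not confirmed in the thread) that the argument did not resolve to a readable image file. A minimal pre-flight check using only the standard library, with a hypothetical function name:

```python
import glob
import os


def list_images(im_or_folder, image_ext="jpg"):
    """Return the image paths to process, failing early if none can be found."""
    if os.path.isdir(im_or_folder):
        pattern = os.path.join(glob.escape(im_or_folder), "*." + image_ext)
        paths = sorted(glob.glob(pattern))
    elif os.path.isfile(im_or_folder):
        paths = [im_or_folder]
    else:
        raise FileNotFoundError("no such file or directory: " + im_or_folder)
    if not paths:
        raise FileNotFoundError(
            "no *.{} files found in {}".format(image_ext, im_or_folder)
        )
    return paths
```

Running this on the same path and extension passed to `infer_simple.py` shows whether any image would actually be loaded.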
Hello Team,
I have built a model using RetinaNet and would like to use your weights in my model for training. Could you please explain how to use them?
If that is not possible with RetinaNet, how can I use the pretrained model and weights for inference on new images?
Weights downloaded from: https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/25093814/X-152-32x8d-IN5k.pkl
I am using CPU system.
Thank you for your nice work, I want to know whether I can get the docx files and where to get them.
TableBank can express whether a cell has text and how many cells there are, but it cannot express whether a cell has borders or whether cells are merged, right? Table borders and merged cells are important information; how can we get them?
How can the table detection models be converted to ONNX, and how can such ONNX models be used?
I want to train a table-line segmentation task, but I do not understand the HTML-format labels. Why not use a point format such as [[x1,y1],[x2,y2]]?
Hi, for table structure recognition, a cell may sometimes span multiple rows or columns. What do the annotations look like in this situation?
I tried to load up your model in model_zoo for detectron but seems like your fine tuned model is not in their repo. I downloaded your model but it only has a yaml file and a pth file, it does not have a frozen graph to make inferences with.
How can I deploy your model to make predictions on my own data?
Please update TestPretrainedModel.md for Detectron2. Also, I think you have uploaded Detectron2 weights with a Detectron1 config. Correct me if I am wrong.
With the same validation/testing set and evaluation tools, we can make a fair comparison between other methods and the baselines. Thank you!