tablebank's People

Contributors

haoransh, liminghao1630, wolfshow

tablebank's Issues

KeyError: 'Non-existent config key: _BASE_'

I am running the code on Google Colab with the following command:

!python detectron/tools/infer_simple.py --cfg /content/All_X101.yaml --output-dir /tmp/detectron-tablebank --image-ext jpg \
    --wts /content/model_final.pth /content/drive/MyDrive/TableBank/Image

The error is:

Found Detectron ops lib: /usr/local/lib/python3.7/dist-packages/torch/lib/libcaffe2_detectron_ops_gpu.so
[E init_intrinsics_check.cc:44] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:44] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:44] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
Traceback (most recent call last):
  File "detectron/tools/infer_simple.py", line 185, in <module>
    main(args)
  File "detectron/tools/infer_simple.py", line 125, in main
    merge_cfg_from_file(args.cfg)
  File "/content/detectron/detectron/core/config.py", line 1152, in merge_cfg_from_file
    _merge_a_into_b(yaml_cfg, __C)
  File "/content/detectron/detectron/core/config.py", line 1202, in _merge_a_into_b
    raise KeyError('Non-existent config key: {}'.format(full_key))
KeyError: 'Non-existent config key: _BASE_'
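
One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: `_BASE_` is the key Detectron2-style YAML configs use for inheritance, while `merge_cfg_from_file` in the original Detectron rejects any key it does not already know. The sketch below uses only the standard library to list the top-level keys of a YAML file and flag `_BASE_`. The helper names and the sample config text are hypothetical.

```python
import re

# Matches an unindented "KEY:" at the start of a line (a top-level mapping key).
_TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*:")


def top_level_keys(yaml_text):
    """Return top-level keys found by a simple line scan (not a full YAML parser)."""
    keys = []
    for line in yaml_text.splitlines():
        match = _TOP_LEVEL_KEY.match(line)
        if match:
            keys.append(match.group(1))
    return keys


def looks_like_detectron2_config(yaml_text):
    """Heuristic (assumption): a top-level _BASE_ key suggests a Detectron2-style config."""
    return "_BASE_" in top_level_keys(yaml_text)


# Hypothetical config text, used only to exercise the helpers.
sample = '''_BASE_: "Base-RCNN-FPN.yaml"
MODEL:
  WEIGHTS: "model_final.pth"
'''
print(top_level_keys(sample))
print(looks_like_detectron2_config(sample))
```

If the check returns `True` for the downloaded config, running it through a Detectron2 entry point rather than `detectron/tools/infer_simple.py` may be worth trying.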

Question on the quality of table annotations

Hi,

The Problem

I'd like to thank you for releasing this nice dataset. However, I found that the quality of the annotations is not very high. There are two main issues:

  1. Missing labels: no annotation is present for an existing table
  2. Inaccurate annotations: some bboxes do not cover the whole table region

Issue 1 was already mentioned in #9, where the author answered:

some error may cause a little table unlabeled

However, I plotted the first 100 image ids and their annotations in /Detection_data/Word and found that 21 of the 100 images have missing annotations (between 1 and 3 tables missing per image). Unless I happened to catch an unusually bad batch in the first 100 plots, this issue is not limited to 'a little table'.

To be specific, I post the imgIds for those 21 images:

3, 9, 10, 27, 32, 33, 39, 47, 51, 56, 57, 58, 59, 60, 61, 62, 73, 76, 77, 87, 95

As for issue 2, I found 3 images (out of 100 tested images) with incorrect annotations:

18, 62, 83

I understand from the paper that these annotations are generated by parsing the PDF/Word documents, and that the document parsing code could not catch all the tables. I am posting this only to give researchers some information they might care about.

Possible Fix

Issue 1 is actually not hard to fix. I have trained a table detection model (on other datasets) with decent performance. I'd like to run this model once over all the data provided here, hopefully spot a large number of missing annotations, and then fix those manually. I'd be happy to share and discuss more.
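
A minimal sketch of how such a pass could flag candidates, assuming detections and annotations are both given as COCO-style `[x, y, width, height]` boxes. A detection that overlaps no existing annotation is reported for manual review. The function names, the 0.5 threshold, and the second detection box are illustrative assumptions, not part of the proposal above.

```python
def to_corners(bbox):
    """Convert a COCO-style [x, y, width, height] box to (x1, y1, x2, y2)."""
    x, y, w, h = bbox
    return x, y, x + w, y + h


def iou(box_a, box_b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax1, ay1, ax2, ay2 = to_corners(box_a)
    bx1, by1, bx2, by2 = to_corners(box_b)
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def unmatched_detections(detections, annotations, iou_threshold=0.5):
    """Return detections whose IoU with every annotation is below the threshold."""
    return [
        det for det in detections
        if all(iou(det, ann) < iou_threshold for ann in annotations)
    ]


annotations = [[71, 176, 445, 104]]           # one labeled table
detections = [[70, 175, 447, 106],            # overlaps the labeled table
              [60, 400, 300, 120]]            # hypothetical unlabeled table
print(unmatched_detections(detections, annotations))  # [[60, 400, 300, 120]]
```

Flagged boxes are only candidates: a false positive from the detector would be reported in the same way, so a manual check is still needed.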

FYI

I loaded the data with pycocotools and got the annotations for each image using:

img_ann = coco.loadAnns(coco.getAnnIds(imgIds = image_id))

and plotted the annotations on a matplotlib figure using

coco.showAnns(img_ann)

The missing/incorrect annotations were then spotted by eye.
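
For readers who want a quick overview before plotting, here is a small sketch that groups boxes by image using only the parsed JSON dictionary, assuming the file follows the usual COCO layout with `images` and `annotations` lists. It can only reveal images with no annotation at all; partially labeled images still need inspection by eye. The toy data and image id 7 are made up.

```python
def boxes_by_image(coco_dict):
    """Map every image id to its list of [x, y, width, height] boxes (possibly empty)."""
    boxes = {image["id"]: [] for image in coco_dict["images"]}
    for ann in coco_dict["annotations"]:
        boxes.setdefault(ann["image_id"], []).append(ann["bbox"])
    return boxes


# Tiny hand-made dataset; image id 7 is hypothetical and has no annotation.
toy = {
    "images": [{"id": 53565}, {"id": 52492}, {"id": 7}],
    "annotations": [
        {"image_id": 53565, "bbox": [71, 176, 445, 104]},
        {"image_id": 52492, "bbox": [66, 72, 729, 197]},
    ],
}
grouped = boxes_by_image(toy)
unannotated = sorted(image_id for image_id, found in grouped.items() if not found)
print(unannotated)  # [7]
```

For a real file, the dictionary can be obtained with `json.load` from the standard library.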

I'd be happy to discuss more and provide the test notebook (.ipynb) if wanted.

Best,
Julian

How to extract the dataset?

I downloaded the dataset parts but I cannot manage to extract the files correctly.

I tried different commands cited here: https://unix.stackexchange.com/questions/40480/how-to-unzip-a-multipart-spanned-zip-on-linux

But the only successful method was this one:

cat test.zip.* >test.zip
zip -FF test.zip --out test-full.zip
unzip test-full.zip

However, after the extraction, one of the annotation JSON files is broken and was not extracted correctly.

Can someone please share their way of extracting the dataset?
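
A sketch of the same idea in Python, assuming the parts are a plain byte-wise split of one archive (as produced by `split`) rather than a true spanned zip, which would need different handling. It concatenates the parts and then asks `zipfile` to CRC-check every member, which may help locate a damaged file. The output name `test-joined.zip` is arbitrary.

```python
import glob
import shutil
import zipfile


def join_parts(part_paths, output_path):
    """Concatenate byte-split parts, in the given order, into one file."""
    with open(output_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return output_path


def first_corrupt_member(zip_path):
    """Return the name of the first member failing its CRC check, or None if all pass."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.testzip()


if __name__ == "__main__":
    # Same pattern as `cat test.zip.*`; sorting assumes zero-padded part suffixes.
    parts = sorted(glob.glob("test.zip.*"))
    if parts:
        joined = join_parts(parts, "test-joined.zip")
        print("first corrupt member:", first_corrupt_member(joined))
```

If a member is reported as corrupt, downloading the affected part again is probably the first thing to try.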

Error in running recognition model?

I saved the recognition model locally as model.pt and ran this command:

python translate.py -model model.pt --src_dir recognition.jpg -output pred.txt

usage: translate.py [-h] [-config CONFIG] [-save_config SAVE_CONFIG] --model
                    MODEL [MODEL ...] [--fp32] [--avg_raw_probs]
                    [--data_type DATA_TYPE] --src SRC [--src_dir SRC_DIR]
                    [--tgt TGT] [--shard_size SHARD_SIZE] [--output OUTPUT]
                    [--report_bleu] [--report_rouge] [--report_time]
                    [--dynamic_dict] [--share_vocab]
                    [--random_sampling_topk RANDOM_SAMPLING_TOPK]
                    [--random_sampling_temp RANDOM_SAMPLING_TEMP]
                    [--seed SEED] [--beam_size BEAM_SIZE]
                    [--min_length MIN_LENGTH] [--max_length MAX_LENGTH]
                    [--max_sent_length] [--stepwise_penalty]
                    [--length_penalty {none,wu,avg}] [--ratio RATIO]
                    [--coverage_penalty {none,wu,summary}] [--alpha ALPHA]
                    [--beta BETA] [--block_ngram_repeat BLOCK_NGRAM_REPEAT]
                    [--ignore_when_blocking IGNORE_WHEN_BLOCKING [IGNORE_WHEN_BLOCKING ...]]
                    [--replace_unk] [--phrase_table PHRASE_TABLE] [--verbose]
                    [--log_file LOG_FILE]
                    [--log_file_level {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET,50,40,30,20,10,0}]
                    [--attn_debug] [--dump_beam DUMP_BEAM] [--n_best N_BEST]
                    [--batch_size BATCH_SIZE] [--gpu GPU]
                    [--sample_rate SAMPLE_RATE] [--window_size WINDOW_SIZE]
                    [--window_stride WINDOW_STRIDE] [--window WINDOW]
                    [--image_channel_size {3,1}]

translate.py: error: the following arguments are required: --src/-src

pytorch 0.4.1
py36_cuda9.2.148_cudnn7.1.4_1

Can someone tell me what the error could be?
I don't know what to pass as the argument to -src.
I see here that -src means "Source sequence to decode (one line per sequence)", which is what I don't understand.

This doesn't help either, as mentioned in this issue.

Run on CPU

I want to run this model in Detectron on CPU (not GPU).
I'm using the table detection model and the Detectron code.
Can anybody tell me how to do that?

Table Detection data mismatch in Word subset

I have downloaded and checked the TableBank dataset from your dataset homepage.

I have found some issues in the annotations. The README gives the number of tables in the Table Detection task as follows:

Task              Word      Latex     Word+Latex
Table detection   163,417   253,817   417,234

But when I ran my script to check the annotations, it showed only 101,889 tables in the Word subset.
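
The check described here can be reproduced along these lines, assuming the annotation file is COCO-style and each table is one entry in `annotations`. The README figures and the reported count come from this issue; the helper name and the toy dictionary are illustrative only.

```python
from collections import Counter


def count_annotations(coco_dict):
    """Count annotations per category_id in a COCO-style dictionary."""
    return Counter(ann["category_id"] for ann in coco_dict["annotations"])


# Toy input standing in for the real Word.json contents.
toy = {"annotations": [{"category_id": 1}, {"category_id": 1}]}
print(count_annotations(toy)[1])  # 2

readme_word, readme_latex = 163417, 253817
counted_word = 101889
print(readme_word + readme_latex)   # 417234, matches the README total
print(readme_word - counted_word)   # 61528 tables unaccounted for
```

Running the counter separately on each split file (if the subset is distributed in several files) would show whether the gap comes from one missing split or from the annotations as a whole.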

Prediction Using Table Recognition

I used the following command to predict the structure of the table:

python translate.py -model model.pt --src_dir './tables/' --src './src_txt.txt' -output pred.txt

and I get the following error:
AssertionError: Cannot use _dir with TextDataReader.

From your previous replies to issues https://github.com/doc-analysis/TableBank/issues/12 and https://github.com/doc-analysis/TableBank/issues/10, it looks like I can only test the model by using -tgt (providing a ground truth file).

Can't I simply predict on a sample?
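
A plausible cause, stated as an assumption: the command above has no `-data_type img`, so the script falls back to its text reader, which refuses `-src_dir`. The sketch below only assembles an argument list that includes that flag, using the flags that appear in this thread. The function name is hypothetical and nothing here calls OpenNMT itself.

```python
import shlex


def build_translate_command(model, src_dir, src_list, output, data_type="img"):
    """Assemble a translate.py invocation that declares the input as images."""
    return [
        "python", "translate.py",
        "-data_type", data_type,   # the flag missing from the failing command
        "-model", model,
        "-src_dir", src_dir,       # directory that holds the images
        "-src", src_list,          # text file listing the image names
        "-output", output,
    ]


cmd = build_translate_command("model.pt", "./tables/", "./src_txt.txt", "pred.txt")
print(" ".join(shlex.quote(part) for part in cmd))
```

No `-tgt` appears in the list; whether prediction without a ground truth file works for this model is something to verify by running it.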

Why is -src src-test.txt needed for image-to-text in OpenNMT?

I am a bit confused. Could anyone please explain this to me? :)

I tried it this way:

python drive/My\ Drive/OpenNMT-py/translate.py -data_type img -model drive/My\ Drive/Pretrained_Word_Embeddings/detectron_table_detection/model.pt -src_dir drive/My\ Drive/datasets/table_dataset_sample/8.jpg \
  -output pred.txt -max_length 150 -beam_size 5 -gpu 0 -verbose

I am getting the same issue. I don't know what -src_dir and -src are.

usage: translate.py [-h] [-config CONFIG] [-save_config SAVE_CONFIG] --model
                    MODEL [MODEL ...] [--fp32] [--avg_raw_probs]
                    [--data_type DATA_TYPE] --src SRC [--src_dir SRC_DIR]
                    [--tgt TGT] [--shard_size SHARD_SIZE] [--output OUTPUT]
                    [--report_bleu] [--report_rouge] [--report_time]
                    [--dynamic_dict] [--share_vocab]
                    [--random_sampling_topk RANDOM_SAMPLING_TOPK]
                    [--random_sampling_temp RANDOM_SAMPLING_TEMP]
                    [--seed SEED] [--beam_size BEAM_SIZE]
                    [--min_length MIN_LENGTH] [--max_length MAX_LENGTH]
                    [--max_sent_length] [--stepwise_penalty]
                    [--length_penalty {none,wu,avg}] [--ratio RATIO]
                    [--coverage_penalty {none,wu,summary}] [--alpha ALPHA]
                    [--beta BETA] [--block_ngram_repeat BLOCK_NGRAM_REPEAT]
                    [--ignore_when_blocking IGNORE_WHEN_BLOCKING [IGNORE_WHEN_BLOCKING ...]]
                    [--replace_unk] [--phrase_table PHRASE_TABLE] [--verbose]
                    [--log_file LOG_FILE]
                    [--log_file_level {CRITICAL,ERROR,WARNING,INFO,DEBUG,NOTSET,50,40,30,20,10,0}]
                    [--attn_debug] [--dump_beam DUMP_BEAM] [--n_best N_BEST]
                    [--batch_size BATCH_SIZE] [--gpu GPU]
                    [--sample_rate SAMPLE_RATE] [--window_size WINDOW_SIZE]
                    [--window_stride WINDOW_STRIDE] [--window WINDOW]
                    [--image_channel_size {3,1}]
translate.py: error: the following arguments are required: --src/-src

The documentation (Image to Text) says:

python translate.py -data_type img -model demo-model_acc_x_ppl_x_e13.pt -src_dir data/im2text/images \
    -src data/im2text/src-test.txt -output pred.txt -max_length 150 -beam_size 5 -gpu 0 -verbose

-src_dir: The directory containing the images.

Then why do I need -src data/im2text/src-test.txt?

We want image to text, so why is the source a .txt file? Can anyone clarify what it is?

Thank you all
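
As far as I understand the OpenNMT-py image-to-text setup (an assumption worth checking against its documentation), `-src` is a text file that names one image per line, and each name is resolved relative to `-src_dir`. Under that assumption, a list file for a folder of images could be generated like this. The function name and the extension list are illustrative.

```python
import os


def write_src_list(image_dir, list_path, extensions=(".jpg", ".jpeg", ".png")):
    """Write one image file name per line, relative to image_dir, and return the names."""
    names = sorted(
        name for name in os.listdir(image_dir)
        if name.lower().endswith(extensions)
    )
    with open(list_path, "w", encoding="utf-8") as handle:
        for name in names:
            handle.write(name + "\n")
    return names


if __name__ == "__main__":
    folder = "data/im2text/images"  # path taken from the documented command
    if os.path.isdir(folder):
        print(write_src_list(folder, "src-test.txt"))
```

With such a file in place, `-src_dir` would point at the folder and `-src` at the generated list, rather than `-src_dir` pointing at a single .jpg file.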

some table not labeled

I found a problem in the data: some tables are not labeled. Here are two examples from Word.json:

{'category_id': 1, 'area': 46280, 'iscrowd': 0, 'segmentation': [[71, 176, 71, 280, 516, 280, 516, 176]], 'id': 69303, 'image_id': 53565, 'bbox': [71, 176, 445, 104]}
{'category_id': 1, 'area': 143613, 'iscrowd': 0, 'segmentation': [[66, 72, 66, 269, 795, 269, 795, 72]], 'id': 67935, 'image_id': 52492, 'bbox': [66, 72, 729, 197]}

53565

52492
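
For readers unfamiliar with the fields, the two records above follow the COCO convention in which `bbox` is `[x, y, width, height]`. The sketch below checks that `area` and `segmentation` in each record agree with its `bbox`, which they do here, so the records themselves are well formed and the complaint concerns tables that have no record at all. The helper names are hypothetical.

```python
def bbox_to_polygon(bbox):
    """Corners of an [x, y, width, height] box, in the order used by the records above."""
    x, y, w, h = bbox
    return [x, y, x, y + h, x + w, y + h, x + w, y]


def is_consistent(ann):
    """True if area equals width * height and the polygon is exactly the box outline."""
    _, _, w, h = ann["bbox"]
    return ann["area"] == w * h and ann["segmentation"] == [bbox_to_polygon(ann["bbox"])]


records = [
    {"area": 46280, "segmentation": [[71, 176, 71, 280, 516, 280, 516, 176]],
     "image_id": 53565, "bbox": [71, 176, 445, 104]},
    {"area": 143613, "segmentation": [[66, 72, 66, 269, 795, 269, 795, 72]],
     "image_id": 52492, "bbox": [66, 72, 729, 197]},
]
print([is_consistent(record) for record in records])  # [True, True]
```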

No email reply

I have submitted the form but haven't received the reply email. Could you please send me the download link? My Gmail address is [email protected]. Thanks a lot.

Why are the results bad for table structure recognition?

I tried to extract the text of a table image using the TableBank pretrained model (table structure recognition).

Here are the results:

python drive/My\ Drive/OpenNMT-py/translate.py -data_type img -model drive/My\ Drive/Pretrained_Word_Embeddings/detectron_table_detection/model.pt -src_dir drive/My\ Drive/datasets/table_dataset_sample/ \
 -src src-test.txt -output pred.txt -max_length 150 -beam_size 5 -gpu 0 -verbose
[2019-08-21 05:40:53,223 INFO] Translating shard 0.
/usr/local/lib/python3.6/dist-packages/torchtext/data/field.py:359: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  var = torch.tensor(arr, dtype=self.dtype, device=device)

SENT 1: None
PRED 1: <tabular> <tbody> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> <tr> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdy> <tdn> </tr> </tbody> </tabular>
PRED SCORE: -1.2848
PRED AVG SCORE: -0.0111, PRED PPL: 1.0111

Please correct me if I am wrong. :)
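
The prediction may be less wrong than it looks if, as I understand it, the recognition model emits structure tags rather than cell text. Assuming `<tdy>` marks a cell with text and `<tdn>` an empty cell, the sequence can be turned into a grid as sketched below. The parser and the short sample string are illustrative, not part of TableBank.

```python
def parse_structure(pred):
    """Turn a tag sequence into rows of booleans (True = cell has text)."""
    rows, current = [], None
    for token in pred.split():
        if token == "<tr>":
            current = []
        elif token == "</tr>":
            if current is not None:
                rows.append(current)
            current = None
        elif token in ("<tdy>", "<tdn>") and current is not None:
            current.append(token == "<tdy>")
    return rows


# Shortened sample in the same format as the PRED line above.
pred = ("<tabular> <tbody> <tr> <tdy> <tdy> <tdn> </tr> "
        "<tr> <tdy> <tdn> <tdn> </tr> </tbody> </tabular>")
grid = parse_structure(pred)
print(len(grid), [len(row) for row in grid])  # 2 [3, 3]
```

Recovering the actual cell text would then need a separate step, such as OCR on the cell regions.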

Getting this error : yaml.reader.ReaderError

Traceback (most recent call last):
  File "tools/infer_simple.py", line 185, in <module>
    main(args)
  File "tools/infer_simple.py", line 125, in main
    merge_cfg_from_file(args.cfg)
  File "/home/anshuman/detectron/detectron/core/config.py", line 1148, in merge_cfg_from_file
    yaml_cfg = AttrDict(load_cfg(f))
  File "/home/anshuman/detectron/detectron/core/config.py", line 1142, in load_cfg
    return envu.yaml_load(cfg_to_load)
  File "/home/anshuman/Downloads/envs/myenv/lib/python3.7/site-packages/yaml/__init__.py", line 70, in load
    loader = Loader(stream)
  File "/home/anshuman/Downloads/envs/myenv/lib/python3.7/site-packages/yaml/loader.py", line 34, in __init__
    Reader.__init__(self, stream)
  File "/home/anshuman/Downloads/envs/myenv/lib/python3.7/site-packages/yaml/reader.py", line 74, in __init__
    self.check_printable(stream)
  File "/home/anshuman/Downloads/envs/myenv/lib/python3.7/site-packages/yaml/reader.py", line 144, in check_printable
    'unicode', "special characters are not allowed")
yaml.reader.ReaderError: unacceptable character #x0002: special characters are not allowed
  in "", position 0
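
The character `#x0002` at position 0 suggests that the file passed as the config is not plain text, for example a binary weights file or a damaged download. That is a guess, not something the traceback proves. A simplified check for control bytes at the start of a file could look like this. The helper names are hypothetical, and the rule is looser than the one PyYAML applies.

```python
def first_unprintable_byte(data):
    """Return (offset, byte) of the first control byte other than tab, LF or CR, else None."""
    allowed = {0x09, 0x0A, 0x0D}
    for offset, byte in enumerate(data):
        if byte < 0x20 and byte not in allowed:
            return offset, byte
    return None


def check_config_file(path, sample_size=4096):
    """Inspect the first bytes of a file that is supposed to be a YAML config."""
    with open(path, "rb") as handle:
        return first_unprintable_byte(handle.read(sample_size))


print(first_unprintable_byte(b"\x02binary payload"))   # (0, 2)
print(first_unprintable_byte(b"MODEL:\n  TYPE: x\n"))  # None
```

A result of `(0, 2)` for the file given to `--cfg` would match the error above and point to a swapped or corrupted file.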

AttributeError: 'NoneType' object has no attribute 'astype'

Getting this error while trying to infer

INFO net.py: 96: rpn_bbox_pred_fpn2_w [+ momentum] loaded from weights file into gpu_0/rpn_bbox_pred_fpn2_w: (12, 256, 1, 1)
INFO net.py: 96: rpn_bbox_pred_fpn2_b [+ momentum] loaded from weights file into gpu_0/rpn_bbox_pred_fpn2_b: (12,)
INFO net.py: 96: fc6_w [+ momentum] loaded from weights file into gpu_0/fc6_w: (1024, 12544)
INFO net.py: 96: fc6_b [+ momentum] loaded from weights file into gpu_0/fc6_b: (1024,)
INFO net.py: 96: fc7_w [+ momentum] loaded from weights file into gpu_0/fc7_w: (1024, 1024)
INFO net.py: 96: fc7_b [+ momentum] loaded from weights file into gpu_0/fc7_b: (1024,)
INFO net.py: 96: cls_score_w [+ momentum] loaded from weights file into gpu_0/cls_score_w: (2, 1024)
INFO net.py: 96: cls_score_b [+ momentum] loaded from weights file into gpu_0/cls_score_b: (2,)
INFO net.py: 96: bbox_pred_w [+ momentum] loaded from weights file into gpu_0/bbox_pred_w: (8, 1024)
INFO net.py: 96: bbox_pred_b [+ momentum] loaded from weights file into gpu_0/bbox_pred_b: (8,)
INFO net.py: 133: pred_b preserved in workspace (unused)
INFO net.py: 133: pred_w preserved in workspace (unused)
[I net_dag_utils.cc:102] Operator graph pruning prior to chain compute took: 7.9674e-05 secs
[I net_dag_utils.cc:102] Operator graph pruning prior to chain compute took: 5.9175e-05 secs
INFO infer_simple.py: 147: Processing /images/ -> /tmp/detectron-tablebank/.pdf
Traceback (most recent call last):
  File "detectron/tools/infer_simple.py", line 185, in <module>
    main(args)
  File "detectron/tools/infer_simple.py", line 153, in main
    model, im, None, timers=timers
  File "/content/detectron/detectron/core/test.py", line 66, in im_detect_all
    model, im, cfg.TEST.SCALE, cfg.TEST.MAX_SIZE, boxes=box_proposals
  File "/content/detectron/detectron/core/test.py", line 137, in im_detect_bbox
    inputs, im_scale = _get_blobs(im, boxes, target_scale, target_max_size)
  File "/content/detectron/detectron/core/test.py", line 946, in _get_blobs
    blob_utils.get_image_blob(im, target_scale, target_max_size)
  File "/content/detectron/detectron/utils/blob.py", line 52, in get_image_blob
    im, cfg.PIXEL_MEANS, target_scale, target_max_size
  File "/content/detectron/detectron/utils/blob.py", line 108, in prep_im_for_blob
    im = im.astype(np.float32, copy=False)
AttributeError: 'NoneType' object has no attribute 'astype'
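
Here `im` is `None`, which typically means the image could not be read. The log line "Processing /images/ -> ..." hints that the path `/images/` itself, rather than an image file inside it, reached the reader. This reading is an assumption. A pre-flight check along the following lines, written with the standard library only and not taken from Detectron, fails early with a clearer message.

```python
import glob
import os


def collect_images(im_or_folder, image_ext="jpg"):
    """Return image paths for a file or folder argument, or raise a descriptive error."""
    if os.path.isdir(im_or_folder):
        paths = sorted(glob.glob(os.path.join(im_or_folder, "*." + image_ext)))
    elif os.path.isfile(im_or_folder):
        paths = [im_or_folder]
    else:
        raise FileNotFoundError("No such file or directory: " + im_or_folder)
    if not paths:
        raise ValueError("No *.%s files found in %s" % (image_ext, im_or_folder))
    return paths


if __name__ == "__main__":
    try:
        print(collect_images("/images/", "jpg"))
    except (FileNotFoundError, ValueError) as error:
        print("input problem:", error)
```

If the folder exists but holds files with another extension, passing a matching `--image-ext` value may be all that is needed.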

Table border and cell merge

TableBank can express whether a cell has text and how many cells there are, but it cannot express whether a cell has a border or whether cells are merged, right? Table borders and merged cells are important information. How can we get them?

How do I use your model to make inferences on my own data?

I tried to load your model from the Detectron model_zoo, but it seems your fine-tuned model is not in their repo. I downloaded your model, but it only contains a yaml file and a pth file; it does not have a frozen graph to make inferences with.

How can I deploy your model to make predictions on my own data?

TestPretrainedModel.md detectron2

Please update TestPretrainedModel.md for detectron2. I think you have uploaded detectron2 weights together with a detectron1 config. Correct me if I am wrong.
