iterx's Issues

allennlp.common.checks.ConfigurationError: key "type" is required at location "model.graph_encoder."

I'm trying to run the model training script for some of my experiments. I installed all the packages in a separate Python environment, following the instructions on the main repo page.

Then I ran:

PYTHONPATH=./src allennlp train \
  --include-package iterx \
  -s new_iterx_dir \
  /data/sid/iterx/resources/training_configs/muc_config.jsonnet

and I get the following error:

allennlp.common.checks.ConfigurationError: key "type" is required at location "model.graph_encoder."

Below is the full log:

2023-07-08 08:58:24,308 - INFO - numexpr.utils - Note: NumExpr detected 12 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2023-07-08 08:58:24,308 - INFO - numexpr.utils - NumExpr defaulting to 8 threads.
2023-07-08 08:58:24,572 - INFO - allennlp.common.params - evaluation = None
2023-07-08 08:58:24,572 - INFO - allennlp.common.params - include_in_archive = None
2023-07-08 08:58:24,572 - INFO - allennlp.common.params - random_seed = 13370
2023-07-08 08:58:24,572 - INFO - allennlp.common.params - numpy_seed = 1337
2023-07-08 08:58:24,572 - INFO - allennlp.common.params - pytorch_seed = 133
2023-07-08 08:58:24,573 - INFO - allennlp.common.checks - Pytorch version: 2.0.1
2023-07-08 08:58:24,574 - INFO - allennlp.common.params - type = default
2023-07-08 08:58:24,574 - INFO - allennlp.common.params - dataset_reader.type = muc
2023-07-08 08:58:24,574 - INFO - allennlp.common.params - dataset_reader.max_instances = None
2023-07-08 08:58:24,575 - INFO - allennlp.common.params - dataset_reader.manual_distributed_sharding = False
2023-07-08 08:58:24,575 - INFO - allennlp.common.params - dataset_reader.manual_multiprocess_sharding = False
2023-07-08 08:58:24,575 - INFO - allennlp.common.params - dataset_reader.definition_file = resources/data/muc/definitions.json
2023-07-08 08:58:24,575 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.type = pretrained_transformer_mismatched
2023-07-08 08:58:24,575 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.token_min_padding_length = 0
2023-07-08 08:58:24,575 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.model_name = t5-large
2023-07-08 08:58:24,575 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.namespace = tags
2023-07-08 08:58:24,575 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.max_length = 1024
2023-07-08 08:58:24,575 - INFO - allennlp.common.params - dataset_reader.token_indexers.tokens.tokenizer_kwargs = None
Downloading (…)lve/main/config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.21k/1.21k [00:00<00:00, 11.2MB/s]
Downloading (…)ve/main/spiece.model: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 792k/792k [00:00<00:00, 12.1MB/s]
Downloading (…)/main/tokenizer.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.39M/1.39M [00:00<00:00, 17.0MB/s]
/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/transformers/models/t5/tokenization_t5_fast.py:155: FutureWarning: This tokenizer was incorrectly instantiated with a model max length of 512 which will be corrected in Transformers v5.
For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
- Be aware that you SHOULD NOT rely on t5-large automatically truncating your input to 512 when padding/encoding.
- If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.
- To avoid this warning, please instantiate this tokenizer with `model_max_length` set to your preferred value.
  warnings.warn(
2023-07-08 08:58:26,249 - INFO - allennlp.common.params - dataset_reader.is_training = True
2023-07-08 08:58:26,249 - INFO - allennlp.common.params - dataset_reader.skip_docs_without_templates = False
2023-07-08 08:58:26,249 - INFO - allennlp.common.params - dataset_reader.skip_docs_without_spans = True
2023-07-08 08:58:26,249 - INFO - allennlp.common.params - dataset_reader.verbose = False
2023-07-08 08:58:26,265 - INFO - allennlp.common.params - train_data_path = resources/data/muc/preprocessed/tokenized/train.json
2023-07-08 08:58:26,266 - INFO - allennlp.common.params - datasets_for_vocab_creation = None
2023-07-08 08:58:26,266 - INFO - allennlp.common.params - validation_dataset_reader.type = muc
2023-07-08 08:58:26,266 - INFO - allennlp.common.params - validation_dataset_reader.max_instances = None
2023-07-08 08:58:26,266 - INFO - allennlp.common.params - validation_dataset_reader.manual_distributed_sharding = False
2023-07-08 08:58:26,266 - INFO - allennlp.common.params - validation_dataset_reader.manual_multiprocess_sharding = False
2023-07-08 08:58:26,266 - INFO - allennlp.common.params - validation_dataset_reader.definition_file = resources/data/muc/definitions.json
2023-07-08 08:58:26,266 - INFO - allennlp.common.params - validation_dataset_reader.token_indexers.tokens.type = pretrained_transformer_mismatched
2023-07-08 08:58:26,267 - INFO - allennlp.common.params - validation_dataset_reader.token_indexers.tokens.token_min_padding_length = 0
2023-07-08 08:58:26,267 - INFO - allennlp.common.params - validation_dataset_reader.token_indexers.tokens.model_name = t5-large
2023-07-08 08:58:26,267 - INFO - allennlp.common.params - validation_dataset_reader.token_indexers.tokens.namespace = tags
2023-07-08 08:58:26,267 - INFO - allennlp.common.params - validation_dataset_reader.token_indexers.tokens.max_length = 1024
2023-07-08 08:58:26,267 - INFO - allennlp.common.params - validation_dataset_reader.token_indexers.tokens.tokenizer_kwargs = None
2023-07-08 08:58:26,268 - INFO - allennlp.common.params - validation_dataset_reader.is_training = False
2023-07-08 08:58:26,268 - INFO - allennlp.common.params - validation_dataset_reader.skip_docs_without_templates = True
2023-07-08 08:58:26,268 - INFO - allennlp.common.params - validation_dataset_reader.skip_docs_without_spans = True
2023-07-08 08:58:26,268 - INFO - allennlp.common.params - validation_dataset_reader.verbose = False
2023-07-08 08:58:26,268 - INFO - allennlp.common.params - validation_data_path = resources/data/muc/preprocessed/tokenized/dev.json
2023-07-08 08:58:26,268 - INFO - allennlp.common.params - validation_data_loader = None
2023-07-08 08:58:26,268 - INFO - allennlp.common.params - test_data_path = resources/data/muc/preprocessed/tokenized/test.json
2023-07-08 08:58:26,268 - INFO - allennlp.common.params - evaluate_on_test = False
2023-07-08 08:58:26,268 - INFO - allennlp.common.params - batch_weight_key =
2023-07-08 08:58:26,268 - INFO - allennlp.common.params - data_loader.type = multiprocess
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.batch_size = None
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.drop_last = False
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.shuffle = False
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.batch_sampler.type = bucket
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.batch_sampler.batch_size = 1
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.batch_sampler.sorting_keys = ['text']
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.batch_sampler.padding_noise = 0.1
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.batch_sampler.drop_last = False
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.batch_sampler.shuffle = True
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.batches_per_epoch = None
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.num_workers = 0
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.max_instances_in_memory = None
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.start_method = fork
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.cuda_device = None
2023-07-08 08:58:26,269 - INFO - allennlp.common.params - data_loader.quiet = False
2023-07-08 08:58:26,270 - INFO - allennlp.common.params - data_loader.collate_fn = <allennlp.data.data_loaders.data_collator.DefaultDataCollator object at 0x7f7065851a90>
loading instances: 0it [00:00, ?it/s]2023-07-08 08:58:26,305 - WARNING - allennlp.data.fields.sequence_label_field - Your label namespace was 'event_types'. We recommend you use a namespace ending with 'labels' or 'tags', so we don't add UNK and PAD tokens by default to your vocabulary.  See documentation for `non_padded_namespaces` parameter in Vocabulary.
2023-07-08 08:58:26,306 - WARNING - allennlp.data.fields.label_field - Your label namespace was 'slot_types'. We recommend you use a namespace ending with 'labels' or 'tags', so we don't add UNK and PAD tokens by default to your vocabulary.  See documentation for `non_padded_namespaces` parameter in Vocabulary.
loading instances: 3995it [00:04, 792.24it/s] 2023-07-08 08:58:30,380 - WARNING - iterx.data.dataset.muc_dataset - Read 1300 documents. Of these, 672 had both templates and spans. 600 had no templates and 628 had no spans.
loading instances: 4032it [00:04, 980.93it/s]
2023-07-08 08:58:30,380 - INFO - allennlp.common.params - data_loader.type = multiprocess
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.batch_size = None
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.drop_last = False
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.shuffle = False
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.batch_sampler.type = bucket
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.batch_sampler.batch_size = 1
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.batch_sampler.sorting_keys = ['text']
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.batch_sampler.padding_noise = 0.1
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.batch_sampler.drop_last = False
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.batch_sampler.shuffle = True
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.batches_per_epoch = None
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.num_workers = 0
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.max_instances_in_memory = None
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.start_method = fork
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.cuda_device = None
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.quiet = False
2023-07-08 08:58:30,381 - INFO - allennlp.common.params - data_loader.collate_fn = <allennlp.data.data_loaders.data_collator.DefaultDataCollator object at 0x7f7065851a90>
loading instances: 613it [00:00, 3097.39it/s]2023-07-08 08:58:30,604 - WARNING - iterx.data.dataset.muc_dataset - Read 200 documents. Of these, 111 had both templates and spans. 84 had no templates and 89 had no spans.
loading instances: 666it [00:00, 2997.71it/s]
2023-07-08 08:58:30,604 - INFO - allennlp.common.params - data_loader.type = multiprocess
2023-07-08 08:58:30,604 - INFO - allennlp.common.params - data_loader.batch_size = None
2023-07-08 08:58:30,604 - INFO - allennlp.common.params - data_loader.drop_last = False
2023-07-08 08:58:30,604 - INFO - allennlp.common.params - data_loader.shuffle = False
2023-07-08 08:58:30,605 - INFO - allennlp.common.params - data_loader.batch_sampler.type = bucket
2023-07-08 08:58:30,605 - INFO - allennlp.common.params - data_loader.batch_sampler.batch_size = 1
2023-07-08 08:58:30,605 - INFO - allennlp.common.params - data_loader.batch_sampler.sorting_keys = ['text']
2023-07-08 08:58:30,605 - INFO - allennlp.common.params - data_loader.batch_sampler.padding_noise = 0.1
2023-07-08 08:58:30,605 - INFO - allennlp.common.params - data_loader.batch_sampler.drop_last = False
2023-07-08 08:58:30,605 - INFO - allennlp.common.params - data_loader.batch_sampler.shuffle = True
2023-07-08 08:58:30,605 - INFO - allennlp.common.params - data_loader.batches_per_epoch = None
2023-07-08 08:58:30,605 - INFO - allennlp.common.params - data_loader.num_workers = 0
2023-07-08 08:58:30,605 - INFO - allennlp.common.params - data_loader.max_instances_in_memory = None
2023-07-08 08:58:30,605 - INFO - allennlp.common.params - data_loader.start_method = fork
2023-07-08 08:58:30,605 - INFO - allennlp.common.params - data_loader.cuda_device = None
2023-07-08 08:58:30,605 - INFO - allennlp.common.params - data_loader.quiet = False
2023-07-08 08:58:30,605 - INFO - allennlp.common.params - data_loader.collate_fn = <allennlp.data.data_loaders.data_collator.DefaultDataCollator object at 0x7f7065851a90>
loading instances: 409it [00:00, 757.83it/s]2023-07-08 08:58:31,343 - WARNING - iterx.data.dataset.muc_dataset - Read 200 documents. Of these, 123 had both templates and spans. 74 had no templates and 77 had no spans.
loading instances: 738it [00:00, 1000.58it/s]
2023-07-08 08:58:31,343 - INFO - allennlp.common.params - vocabulary.type = from_files
2023-07-08 08:58:31,343 - INFO - allennlp.common.params - vocabulary.directory = resources/data/muc/vocabulary
2023-07-08 08:58:31,343 - INFO - allennlp.common.params - vocabulary.padding_token = @@PADDING@@
2023-07-08 08:58:31,343 - INFO - allennlp.common.params - vocabulary.oov_token = @@UNKNOWN@@
2023-07-08 08:58:31,344 - INFO - allennlp.data.vocabulary - Loading token dictionary from resources/data/muc/vocabulary.
2023-07-08 08:58:31,344 - INFO - allennlp.common.params - model.type = iterative_template_extraction
2023-07-08 08:58:31,344 - INFO - allennlp.common.params - model.regularizer = None
2023-07-08 08:58:31,344 - INFO - allennlp.common.params - model.definition_file = resources/data/muc/definitions.json
2023-07-08 08:58:31,345 - CRITICAL - root - Uncaught exception
Traceback (most recent call last):
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/common/params.py", line 211, in pop
    value = self.params.pop(key)
            ^^^^^^^^^^^^^^^^^^^^
KeyError: 'type'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/sidvash/.conda/envs/iterx/bin/allennlp", line 8, in <module>
    sys.exit(run())
             ^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/__main__.py", line 39, in run
    main(prog="allennlp")
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/commands/__init__.py", line 120, in main
    args.func(args)
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/commands/train.py", line 111, in train_model_from_args
    train_model_from_file(
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/commands/train.py", line 177, in train_model_from_file
    return train_model(
           ^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/commands/train.py", line 258, in train_model
    model = _train_worker(
            ^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/commands/train.py", line 494, in _train_worker
    train_loop = TrainModel.from_params(
                 ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/common/from_params.py", line 604, in from_params
    return retyped_subclass.from_params(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/common/from_params.py", line 638, in from_params
    return constructor_to_call(**kwargs)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/commands/train.py", line 770, in from_partial_objects
    model_ = model.construct(
             ^^^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/common/lazy.py", line 82, in construct
    return self.constructor(**contructor_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/common/lazy.py", line 66, in constructor_to_use
    return self._constructor.from_params(  # type: ignore[union-attr]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/common/from_params.py", line 604, in from_params
    return retyped_subclass.from_params(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/common/from_params.py", line 636, in from_params
    kwargs = create_kwargs(constructor_to_inspect, cls, params, **extras)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/common/from_params.py", line 206, in create_kwargs
    constructed_arg = pop_and_construct_arg(
                      ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/common/from_params.py", line 314, in pop_and_construct_arg
    return construct_arg(class_name, name, popped_params, annotation, default, **extras)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/common/from_params.py", line 348, in construct_arg
    result = annotation.from_params(params=popped_params, **subextras)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/common/from_params.py", line 585, in from_params
    choice = params.pop_choice(
             ^^^^^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/common/params.py", line 314, in pop_choice
    value = self.pop(key, default)
            ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sidvash/.conda/envs/iterx/lib/python3.11/site-packages/allennlp/common/params.py", line 216, in pop
    raise ConfigurationError(msg)
allennlp.common.checks.ConfigurationError: key "type" is required at location "model.graph_encoder."

It looks like I need to change something in the config file?
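
The message suggests that the config does contain a `model.graph_encoder` block, but that the block has no `"type"` key for AllenNLP to pick a registered class from. A minimal sketch for checking this on an already-evaluated config follows. It assumes the jsonnet file has been turned into a plain dict first (for instance with AllenNLP's `Params.from_file(...).as_dict()`). The helper name `has_type_key`, the `example_config` contents, and the `hidden_dim` key are hypothetical stand-ins, not taken from the repository.

```python
import json
from typing import Any, Dict


def has_type_key(config: Dict[str, Any], dotted_path: str) -> bool:
    """Return True if the dict found at `dotted_path` contains a "type" key."""
    node: Any = config
    # Walk the nested dict one path component at a time.
    for part in dotted_path.strip(".").split("."):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"{dotted_path!r} not found (stopped at {part!r})")
        node = node[part]
    return isinstance(node, dict) and "type" in node


# Hypothetical stand-in for the evaluated muc_config.jsonnet. Only the shape
# matters here; "hidden_dim" is an invented key used for illustration.
example_config = json.loads("""
{
  "model": {
    "type": "iterative_template_extraction",
    "graph_encoder": {"hidden_dim": 8}
  }
}
""")

# The top-level model block declares its type ...
print(has_type_key(example_config, "model"))
# ... but the nested graph_encoder block does not.
print(has_type_key(example_config, "model.graph_encoder"))
```

If the second check comes back false on the real config, the fix would presumably be to add a `"type"` entry naming whichever graph encoder the iterx package registers. That name is not shown in the log above, so it would have to be looked up in the source.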

Instructions for Model Inference

There are currently no instructions for running inference on a trained model in the README (only instructions for training and for scoring). We should fix this.

When will the code be released?

Hi,

I am really interested in this paper on using imitation for doc-level extraction, and congratulations on the work finally being accepted at EACL. Will you release the code then?

Issues with MUC configuration

The current version of resources/training_configs/muc_config.jsonnet does not work out of the box, due to some minor discrepancies between the public code release and the code for which this file was written. I will submit a PR with the relevant changes.
