
gollie's Introduction



Guideline following Large Language Model for Information Extraction



We present GoLLIE, a Large Language Model trained to follow annotation guidelines. GoLLIE outperforms previous approaches on zero-shot Information Extraction and allows users to perform inference with annotation schemas defined on the fly. Unlike previous approaches, GoLLIE is able to follow detailed definitions and does not rely solely on the knowledge already encoded in the LLM. Code and models are publicly available.

Schema definition and inference example

The labels are represented as Python classes, and the guidelines or instructions are introduced as docstrings. The model starts generating after the result = [ line.
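Below is a minimal sketch of what such an input looks like; the class names, guideline text, and example sentence are illustrative and not taken verbatim from the GoLLIE training schemas.

# Illustrative only: labels are Python classes, guidelines are docstrings,
# and the model completes the annotation list after the "result = [" line.
from dataclasses import dataclass

@dataclass
class Launcher:
    """An entity (agency, company, ...) that operates the launch of a mission."""
    span: str

@dataclass
class Mission:
    """The name of a space mission or flight."""
    span: str

text = "The Ares 3 mission to Mars is scheduled for 2032 and will be launched by NASA."

# Everything above is the prompt; GoLLIE generates the list contents from here on.
result = [
    Mission(span="Ares 3"),
    Launcher(span="NASA"),
]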

Installation

You will need to install the following dependencies to run the GoLLIE codebase:

PyTorch >= 2.0.0 | https://pytorch.org/get-started
We recommend that you install the 2.1.0 version or newer, as it includes important bug fixes.

transformers >= 4.33.1
pip install --upgrade transformers

PEFT >= 0.4.0
pip install --upgrade peft

bitsandbytes >= 0.40.0
pip install --upgrade bitsandbytes

Flash Attention 2.0
pip install flash-attn --no-build-isolation
pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary

You will also need these dependencies:

pip install numpy black Jinja2 tqdm rich psutil datasets ruff wandb fschat
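Once everything is installed, a quick sanity check (a sketch, not part of the repository) can confirm that the core libraries are importable and meet the minimum versions listed above:

# Quick sanity check for the core dependencies; the expected versions are the
# minimums listed above, adjust if the requirements change.
import torch, transformers, peft, bitsandbytes

print("torch:", torch.__version__)                 # expect >= 2.0.0 (2.1.0+ recommended)
print("transformers:", transformers.__version__)   # expect >= 4.33.1
print("peft:", peft.__version__)                   # expect >= 0.4.0
print("bitsandbytes:", bitsandbytes.__version__)   # expect >= 0.40.0
print("CUDA available:", torch.cuda.is_available())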

Pretrained models

We release three GoLLIE models based on Code Llama (7B, 13B, and 34B). The models are available in the 🤗 HuggingFace Hub.

Model        Supervised average F1    Zero-shot average F1    🤗 HuggingFace Hub
GoLLIE-7B    73.0                     55.3                    HiTZ/GoLLIE-7B
GoLLIE-13B   73.9                     56.0                    HiTZ/GoLLIE-13B
GoLLIE-34B   75.0                     57.2                    HiTZ/GoLLIE-34B

How to use GoLLIE

Please take a look at our 🚀 Example Jupyter Notebooks to learn how to use GoLLIE: GoLLIE Notebooks
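For quick experimentation outside the notebooks, the checkpoints can also be loaded directly with the 🤗 Transformers API. This is only a sketch: the official notebooks use the repository's own load_model helper and prompt-building utilities, which are not reproduced here, and the prompt below is a toy example of the expected format.

# Minimal sketch: load GoLLIE-7B with plain transformers and greedy-decode a prompt.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "HiTZ/GoLLIE-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # use torch.float16 on GPUs without bfloat16 support
    device_map="auto",
)

# Toy prompt in the schema-definition format described above; it must end with "result = [".
prompt = (
    "@dataclass\n"
    "class Person:\n"
    '    """A person name mentioned in the text."""\n'
    "    span: str\n"
    "\n"
    'text = "Barack Obama visited Paris."\n'
    "result = ["
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))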

Currently supported tasks

This is the list of tasks used for training and evaluating GoLLIE. However, as demonstrated in the 🚀 Create Custom Task notebook, GoLLIE can perform a wide range of unseen tasks. For more info, read our 📖 Paper.

We plan to continue adding more tasks to the list. If you want to contribute, please feel free to open a PR or contact us. You can use the tasks already implemented in the src/tasks folder as examples.

Generate the GoLLIE dataset

The configuration files used to generate the GoLLIE dataset are available in the configs/data_configs/ folder. You can generate the dataset by running the following command (See bash_scripts/generate_data.sh for more info):

CONFIG_DIR="configs/data_configs"
OUTPUT_DIR="data/processed_w_examples"

python -m src.generate_data \
     --configs \
        ${CONFIG_DIR}/ace_config.json \
        ${CONFIG_DIR}/bc5cdr_config.json \
        ${CONFIG_DIR}/broadtwitter_config.json \
        ${CONFIG_DIR}/casie_config.json \
        ${CONFIG_DIR}/conll03_config.json \
        ${CONFIG_DIR}/crossner_ai_config.json \
        ${CONFIG_DIR}/crossner_literature_config.json \
        ${CONFIG_DIR}/crossner_music_config.json \
        ${CONFIG_DIR}/crossner_politics_config.json \
        ${CONFIG_DIR}/crossner_science_config.json \
        ${CONFIG_DIR}/diann_config.json \
        ${CONFIG_DIR}/e3c_config.json \
        ${CONFIG_DIR}/europarl_config.json \
        ${CONFIG_DIR}/fabner_config.json \
        ${CONFIG_DIR}/harveyner_config.json \
        ${CONFIG_DIR}/mitmovie_config.json \
        ${CONFIG_DIR}/mitrestaurant_config.json \
        ${CONFIG_DIR}/multinerd_config.json \
        ${CONFIG_DIR}/ncbidisease_config.json \
        ${CONFIG_DIR}/ontonotes_config.json \
        ${CONFIG_DIR}/rams_config.json \
        ${CONFIG_DIR}/tacred_config.json \
        ${CONFIG_DIR}/wikievents_config.json \
        ${CONFIG_DIR}/wnut17_config.json \
     --output ${OUTPUT_DIR} \
     --overwrite_output_dir \
     --include_examples

We do not redistribute the datasets used to train and evaluate GoLLIE. Not all of them are publicly available; some require a license to access them.

For the datasets available in the HuggingFace Datasets library, the script will download them automatically.

For the following datasets, you must provide the path to the dataset by modifying the corresponding configs/data_configs/ file: ACE05 (Preprocessing script), CASIE, CrossNer, DIANN, E3C, HarveyNER, MitMovie, MitRestaurant, RAMS, TACRED, WikiEvents.
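As an illustration only, pointing a config at a local copy of a dataset amounts to editing the path fields of the corresponding JSON file. The key names below ("train_file", "dev_file", "test_file") are hypothetical placeholders, so check the real fields of the file under configs/data_configs/ before editing.

# Hedged illustration: update dataset paths in a data config file.
# The keys "train_file"/"dev_file"/"test_file" are hypothetical placeholders;
# use whatever path fields the actual config defines.
import json

config_path = "configs/data_configs/ace_config.json"
with open(config_path) as f:
    config = json.load(f)

config["train_file"] = "/path/to/ACE05/train.jsonl"  # hypothetical key and path
config["dev_file"] = "/path/to/ACE05/dev.jsonl"
config["test_file"] = "/path/to/ACE05/test.jsonl"

with open(config_path, "w") as f:
    json.dump(config, f, indent=4)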

If you encounter difficulties generating the dataset, please don't hesitate to contact us.

How to train your own GoLLIE

First, you need to generate the GoLLIE dataset. See the previous section for more info.

Second, you must create a configuration file. Please see the configs/model_configs folder for examples.

Finally, you can train your own GoLLIE by running the following command (See bash_scripts/ folder for more examples):

CONFIGS_FOLDER="configs/model_configs"
python3 -m src.run ${CONFIGS_FOLDER}/GoLLIE+-7B_CodeLLaMA.yaml

How to evaluate a model

First, you need to generate the GoLLIE dataset. See the previous section for more info.

Second, you must create a configuration file. Please see the configs/model_configs/eval folder for examples.

Finally, you can evaluate your own GoLLIE by running the following command (See bash_scripts/eval folder for more examples):

CONFIGS_FOLDER="configs/model_configs/eval"
python3 -m src.run ${CONFIGS_FOLDER}/GoLLIE+-7B_CodeLLaMA.yaml

Citation

@misc{sainz2023gollie,
      title={GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction}, 
      author={Oscar Sainz and Iker García-Ferrero and Rodrigo Agerri and Oier Lopez de Lacalle and German Rigau and Eneko Agirre},
      year={2023},
      eprint={2310.03668},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

gollie's People

Contributors

ikergarcia1996, osainz59


gollie's Issues

Dataset Generation

The link to the ACE05 preprocessing code (Preprocessing script) seems to be broken. Could you provide it again, please?

ptxas fatal : Ptx assembly aborted due to errors

RuntimeError: Internal Triton PTX codegen error:
ptxas /tmp/compile-ptx-src-38da7f, line 91; error : Feature '.bf16' requires .target sm_80 or higher
ptxas /tmp/compile-ptx-src-38da7f, line 91; error : Feature 'cvt with .bf16' requires .target sm_80 or higher
[the same '.bf16' and 'cvt with .bf16' / 'cvt with .f32.bf16' errors repeat for PTX lines 92, 102, 104, 107, 108, 118, 120, 129, 131, 140, 142, 158, 160, 168, and 170]
ptxas fatal : Ptx assembly aborted due to errors
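These errors mean the bf16 PTX instructions require compute capability 8.0 (Ampere) or newer. A quick way to check whether the current GPU supports bfloat16 (a sketch, not part of the repository):

# Check whether the GPU can run bfloat16 kernels (requires sm_80 / Ampere or newer).
import torch

major, minor = torch.cuda.get_device_capability()
print(f"Compute capability: sm_{major}{minor}")
if major >= 8:
    print("bfloat16 is supported; torch_dtype='bfloat16' should work.")
else:
    print("bfloat16 is NOT supported on this GPU; load the model in float16 instead.")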

[BUG] RuntimeError: expected scalar type Float but found BFloat16

Describe the task

  1. Model: I was testing GoLLIE-7B with the create custom task.ipynb file
  2. Task: create custom task

Describe the bug
I set use_flash_attention=False in

model, tokenizer = load_model(
    inference=True,
    model_weights_name_or_path="/data2/home/ruiqi/GoLLIE/model",
    quantization=None,
    use_lora=False,
    force_auto_device_map=True,
    use_flash_attention=False,
    torch_dtype="auto"
    # torch_dtype="bfloat16"
)

Then everything went well until RUN GoLLIE

model_output = model.generate(
    **model_input.to(model.device),
    max_new_tokens=128,
    do_sample=False,
    min_new_tokens=0,
    num_beams=1,
    num_return_sequences=1,
)

and there was an error message:

RuntimeError                              Traceback (most recent call last)
File <timed exec>:1

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/generation/utils.py:1673, in GenerationMixin.generate(self, inputs, generation_config, ...)
   1672     # 11. run greedy search
-> 1673     return self.greedy_search(

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/generation/utils.py:2521, in GenerationMixin.greedy_search(self, input_ids, ...)
   2520 # forward pass to get next token
-> 2521 outputs = self(

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:1034, in LlamaForCausalLM.forward(...)
-> 1034 outputs = self.model(

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:922, in LlamaModel.forward(...)
--> 922     layer_outputs = decoder_layer(

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:672, in LlamaDecoderLayer.forward(...)
--> 672 hidden_states, self_attn_weights, present_key_value = self.self_attn(

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py:366, in LlamaAttention.forward(...)
--> 366     query_states = self.q_proj(hidden_states)

File ~/anaconda3/envs/llm/lib/python3.11/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
--> 114     return F.linear(input, self.weight, self.bias)

(intermediate torch.nn.modules.module._call_impl and accelerate.hooks frames omitted)

RuntimeError: expected scalar type Float but found BFloat16

System Info

  1. GPU: Nvidia GTX 2080 x 2
  2. Pytorch version: 2.1.0
  3. Transformers version: 4.35
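
The RuntimeError above is a dtype mismatch: fp32 activations are reaching a bf16 q_proj weight, which typically happens when part of the model (or a LoRA adapter) stays in fp32 while the rest is loaded in bf16. Note also that bf16 and FlashAttention 2 are only supported on Ampere or newer GPUs, so on a GTX 2080 fp16 is the safer choice. A minimal sketch, assuming the plain transformers API rather than the repository's own loader:

# A minimal sketch (not the notebook's exact code): load every weight in a single
# dtype so no fp32 tensor ever reaches a bf16 layer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "HiTZ/GoLLIE-7B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # prefer fp16 over bf16 on pre-Ampere GPUs such as the GTX 2080
    device_map="auto",
)
model.eval()

# If a separate LoRA adapter is applied, cast it to the same dtype, e.g.:
# from peft import PeftModel
# model = PeftModel.from_pretrained(model, "path/to/adapter").to(torch.float16)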

OOM

Loading the checkpoint keeps getting killed. It seems to need about 33 GB of memory because the model is being loaded in fp32. Any help would be appreciated.
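
The 7B checkpoint alone takes roughly 28 GB in fp32, so loading it in fp16 (about half that) or quantizing it at load time avoids the out-of-memory kill. A minimal sketch, assuming the standard transformers/bitsandbytes path rather than the repository's own loading script:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "HiTZ/GoLLIE-7B"  # the 13B/34B checkpoints need proportionally more memory

# 4-bit NF4 quantization keeps the 7B weights in roughly 4-5 GB of GPU memory
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    device_map="auto",  # shard across the available GPUs, offloading to CPU if needed
)

With device_map="auto", accelerate will also spread the layers across both GPUs when a single card does not have enough memory.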

[TASK] Support for another language in GoLLIE (specifically Vietnamese)

Hi GoLLIE research team, I am part of a group of Vietnamese university students who want to present your paper at an upcoming seminar in our "Introduction to Natural Language Processing" course. Our task is to summarize and explain the contents of your paper to our fellow students and lecturers.

To make it easier to understand for our classmates, we are interested in training GoLLIE using Vietnamese datasets. If it's possible, we would greatly appreciate it if you could provide us with some instructions on how to proceed with this. We sincerely enjoyed reading your paper and believe that it would greatly benefit our presentation.

Here are some datasets for the named-entity-recognition subtask that I found on Hugging Face:

We would be extremely grateful if you could provide us with any guidance or assistance on our endeavor. Please feel free to reach out if you have any questions or require more information from us. We are more than willing to cooperate to make this collaboration successful.

Generate dataset

When generating the datasets, it seems to take a very long time, and I am not sure whether it is actually completing. The output is interleaved with warnings:

Repo card metadata block was not found. Setting CardData to empty.
WARNING:huggingface_hub.repocard:Repo card metadata block was not found. Setting CardData to empty.
Repo card metadata block was not found. Setting CardData to empty.
WARNING:huggingface_hub.repocard:Repo card metadata block was not found. Setting CardData to empty.
NcbiDisease-NER-dev: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 924/924 [00:03<00:00, 238.68it/s]
NcbiDisease-NER-test: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 941/941 [00:04<00:00, 196.24it/s]

BC5CDR-NER-train-0: 13%|β–ˆβ–ˆβ–Œ | 609/4561 [00:03<00:30, 131.50it/s]
BC5CDR-NER-train-0: 38%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Š | 1715/4561 [00:09<00:19, 143.23it/s]
BC5CDR-NER-train-0: 61%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 2796/4561 [00:14<00:07, 227.09it/s]
BC5CDR-NER-test: 27%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 1272/4798 [00:09<00:27, 126.57it/s]
BC5CDR-NER-test: 39%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Ž | 1890/4798 [00:14<00:23, 124.91it/s]
BroadTwitter-NER-dev: 59%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 1188/2002 [00:10<00:07, 115.30it/s]
BroadTwitter-NER-dev: 88%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1755/2002 [00:15<00:02, 114.49it/s]

BroadTwitter-NER-dev: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 2002/2002 [00:17<00:00, 115.12it/s]
WNUT17-NER-test: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1287/1287 [00:17<00:00, 71.52it/s]
BC5CDR-NER-train-0: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4561/4561 [00:22<00:00, 199.49it/s]
WNUT17-NER-train-0: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 3394/3394 [00:25<00:00, 133.29it/s]
BroadTwitter-NER-test: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 2002/2002 [00:13<00:00, 146.04it/s]
CoNLL03-NER-test: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 3453/3453 [00:34<00:00, 99.02it/s]

BC5CDR-NER-train-0: 72%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 3290/4561 [00:17<00:05, 223.53it/s]
BC5CDR-NER-train-0: 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 3434/4561 [00:17<00:05, 224.29it/s]
BC5CDR-NER-train-0: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 4545/4561 [00:22<00:00, 222.88it/s]
BC5CDR-NER-test: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4798/4798 [00:36<00:00, 130.22it/s]
FabNER-NER-dev: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 2183/2183 [00:46<00:00, 46.57it/s]
BC5CDR-NER-train-24: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4561/4561 [00:25<00:00, 175.85it/s]
WNUT17-NER-train-24: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 3394/3394 [00:29<00:00, 115.00it/s]
BC5CDR-NER-test: 90%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 4341/4798 [00:34<00:03, 122.35it/s]
BC5CDR-NER-test: 97%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 4660/4798 [00:36<00:00, 208.44it/s]
... (more hidden) ...

BC5CDR-NER-train-24: 42%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 1909/4561 [00:14<00:18, 141.97it/s]
BC5CDR-NER-train-24: 89%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 4048/4561 [00:23<00:02, 228.24it/s]
BC5CDR-NER-train-42: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4561/4561 [00:20<00:00, 221.22it/s]
BC5CDR-NER-train-42:  28%|β–ˆβ–ˆβ–ˆβ–ˆβ–Š        | 1276/4561 [00:05<00:14, 233.07it/s]
Repo card metadata block was not found. Setting CardData to empty.
WARNING:huggingface_hub.repocard:Repo card metadata block was not found. Setting CardData to empty.
BroadTwitter-NER-train-0:  46%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–     | 2431/5342 [00:15<00:23, 125.62it/s]
BroadTwitter-NER-train-0: 74%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 3979/5342 [00:28<00:11, 113.66it/s]

BC5CDR-NER-train-42: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 4559/4561 [00:20<00:00, 228.19it/s]

BroadTwitter-NER-train-0: 79%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 4219/5342 [00:30<00:10, 112.01it/s]
BroadTwitter-NER-train-0: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 5342/5342 [00:36<00:00, 146.11it/s]
WNUT17-NER-train-42: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 3394/3394 [00:30<00:00, 111.55it/s]
FabNER-NER-test: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 2064/2064 [00:37<00:00, 54.63it/s]

BroadTwitter-NER-train-0: 96%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 5114/5342 [00:35<00:01, 197.95it/s]

BC5CDR-NER-dev: 10%|β–ˆβ–ˆβ–Ž | 456/4582 [00:03<00:34, 120.66it/s]
CoNLL03-NER-train-0: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14041/14041 [01:33<00:00, 149.84it/s]
WNUT17-NER-dev: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1009/1009 [00:08<00:00, 124.45it/s]
BC5CDR-NER-dev: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4582/4582 [00:27<00:00, 166.55it/s]
BroadTwitter-NER-train-24: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 5342/5342 [00:25<00:00, 213.61it/s]
BroadTwitter-NER-train-42: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 5342/5342 [00:24<00:00, 218.66it/s]
CoNLL03-NER-train-24: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14041/14041 [01:20<00:00, 174.10it/s]
CoNLL03-NER-train-42: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14041/14041 [01:19<00:00, 176.13it/s]
CoNLL03-NER-dev: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 3250/3250 [00:20<00:00, 156.14it/s]
OntoNotes5-NER-test: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 12217/12217 [04:51<00:00, 41.97it/s]
MultiNERD-NER-test: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 32908/32908 [10:59<00:00, 49.93it/s]
NcbiDisease-NER-train-0: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 5433/5433 [00:15<00:00, 351.93it/s]
NcbiDisease-NER-train-24: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 5433/5433 [00:15<00:00, 352.46it/s]
NcbiDisease-NER-train-42: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 5433/5433 [00:15<00:00, 351.97it/s]

BC5CDR-NER-dev: 78%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 3554/4582 [00:22<00:05, 194.35it/s]
BC5CDR-NER-dev: 83%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– | 3781/4582 [00:23<00:03, 221.54it/s]
BC5CDR-NER-dev: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 4569/4582 [00:27<00:00, 211.22it/s]
BroadTwitter-NER-train-24: 81%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰ | 4352/5342 [00:20<00:05, 191.82it/s]
BroadTwitter-NER-train-24: 95%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–| 5067/5342 [00:23<00:01, 187.31it/s]
BroadTwitter-NER-train-24: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‰| 5327/5342 [00:24<00:00, 205.77it/s]
BroadTwitter-NER-train-42: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 5342/5342 [00:24<00:00, 202.92it/s]

EE task is actually ED

The EE task in your paper is actually ED (event detection).
According to src/tasks/wikievents/prompts.py and src/tasks/wikievents/data_loader.py, your coarse event definitions only contain a trigger, and the EE task uses only these coarse events.
The model therefore only generates the trigger of an event, which makes this an event detection task rather than event extraction.
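
For readers following along, the distinction can be illustrated with GoLLIE-style class definitions. This is a hypothetical sketch, not the repository's actual wikievents schema: an ED-style class only asks for the trigger mention, while a full EE-style class would also declare argument slots.

# Hypothetical schema sketch (not the repo's wikievents definitions).
from dataclasses import dataclass
from typing import List

@dataclass
class Attack:
    """An Attack event, as used for event detection (ED): only the trigger is predicted."""

    mention: str  # the text span that triggers the event

@dataclass
class AttackWithArguments:
    """An Attack event with argument slots, as full event extraction (EE) would require."""

    mention: str          # trigger span
    attacker: List[str]   # entities that carried out the attack
    target: List[str]     # entities that were attacked
    place: List[str]      # where the attack happened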
