Comments (17)
Whoops, it looks like I fixed one part, but I still need to fix the summarizer contract. I will get to that this weekend.
from bert-extractive-summarizer.
You should be able to load a custom (Transformers-based) model using the library. Here is an example from the readme; let me know if you are still having issues.

```python
from transformers import AutoConfig, AutoModel, AutoTokenizer
from summarizer import Summarizer

# Load model, model config and tokenizer via Transformers
custom_config = AutoConfig.from_pretrained('allenai/scibert_scivocab_uncased')
custom_config.output_hidden_states = True
custom_tokenizer = AutoTokenizer.from_pretrained('allenai/scibert_scivocab_uncased')
custom_model = AutoModel.from_pretrained('allenai/scibert_scivocab_uncased', config=custom_config)

body = 'Text body that you want to summarize with BERT'
body2 = 'Something else you want to summarize with BERT'
model = Summarizer(custom_model=custom_model, custom_tokenizer=custom_tokenizer)
model(body)
model(body2)
```
Sorry, I actually fixed this last night, and forgot to commit. I will update when I get home this evening.
I am having the same issue. I am trying to load a trained model using:

```python
ext_model = Summarizer(model="../models/CNN_DailyMail_Extractive/bertext_cnndm_transformer.pt")
```

I also tried:

```python
ext_model = Summarizer(custom_model="../models/CNN_DailyMail_Extractive/bertext_cnndm_transformer.pt")
```

However, I get the following error:

```
File "/usr/local/lib/python3.6/dist-packages/summarizer/BertParent.py", line 38, in __init__
    self.model = base_model.from_pretrained(model, output_hidden_states=True)
AttributeError: 'NoneType' object has no attribute 'from_pretrained'
```
I'll take a look.
Experiencing the same thing
How do I use this with Docker? I'm trying the german-bert from here:

```shell
docker run --rm -it -p 5000:5000 summary-service:latest -model bert-base-german-cased
```

but get:

```
root@docker2:~/bert-extractive-summarizer/summarizer# docker run --rm -it -p 5000:5000 summary-service:latest -model bert-base-german-cased
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Unzipping tokenizers/punkt.zip.
100%|##########| 40155833/40155833 [00:02<00:00, 19742925.37B/s]
Using Model: bert-base-german-cased
Traceback (most recent call last):
  File "./server.py", line 86, in <module>
    summarizer = Summarizer(args.model, int(args.hidden), args.reduce, float(args.greediness))
  File "/usr/local/lib/python3.6/dist-packages/summarizer/model_processors.py", line 73, in __init__
    super(Summarizer, self).__init__(model, hidden, reduce_option, greedyness)
  File "/usr/local/lib/python3.6/dist-packages/summarizer/model_processors.py", line 53, in __init__
    super(SingleModel, self).__init__(model, hidden, reduce_option, greedyness)
  File "/usr/local/lib/python3.6/dist-packages/summarizer/model_processors.py", line 15, in __init__
    self.model = BertParent(model)
  File "/usr/local/lib/python3.6/dist-packages/summarizer/BertParent.py", line 41, in __init__
    self.model = base_model.from_pretrained(model, output_hidden_states=True)
AttributeError: 'NoneType' object has no attribute 'from_pretrained'
```
@davidlenz Sorry, I wasn't using Docker when checking in the last changes for this issue, so I didn't look at that setup. Making this work would require some more code updates, I think: `server.py` and `summarize.py` would need to be updated to accept arguments for, e.g., the path where your custom model is stored, plus some code to create a `BertModel` (and a `BertTokenizer`, if required) from those paths, which can then be passed into the `Summarizer(...)` constructor.
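As a rough illustration of the changes described above, a sketch of what the argument handling in `server.py` might look like. The flag names (`-custom-model`, `-custom-tokenizer`) are hypothetical, not the service's actual CLI:

```python
import argparse


def parse_args(argv=None):
    """Hypothetical argument parsing for server.py: in addition to the
    existing -model flag, accept paths (or hub names) for a custom
    pretrained model and tokenizer."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-model', default='bert-large-uncased')
    parser.add_argument('-custom-model', dest='custom_model', default=None,
                        help='path or hub name of a pretrained model')
    parser.add_argument('-custom-tokenizer', dest='custom_tokenizer', default=None,
                        help='path or hub name of a pretrained tokenizer')
    return parser.parse_args(argv)


# The parsed paths would then be turned into objects and handed to the
# Summarizer (sketch only; requires transformers + summarizer installed):
#
# from transformers import BertModel, BertTokenizer
# args = parse_args()
# custom_model = BertModel.from_pretrained(args.custom_model, output_hidden_states=True)
# custom_tokenizer = BertTokenizer.from_pretrained(args.custom_tokenizer)
# summarizer = Summarizer(custom_model=custom_model, custom_tokenizer=custom_tokenizer)
```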
@hdatteln this is a good starting point, thanks! Got the following afterwards:

```
Traceback (most recent call last):
  File "./server.py", line 87, in <module>
    summarizer = Summarizer(args.model, args.custom_model, args.custom_tokenizer, int(args.hidden), args.reduce, float(args.greediness))
TypeError: __init__() takes from 1 to 5 positional arguments but 7 were given
```

So, from the `requirements-service.txt` here, it looks like `bert-extractive-summarizer` is installed via pip as version 0.2.0, which needs to be changed to reflect the latest changes in version 0.2.2.

I applied the changes locally and rebuilt the Docker container (`docker build` uses the local `server.py` and `requirements-service.txt`), but had no luck. I am actually uncertain how to correctly provide inputs to `custom_model` and `custom_tokenizer`.

Staring at the code for a while, I came to the conclusion that my model is not really a custom model in the sense it is meant here, but rather another pretrained model already in the transformers repo. Thus I concluded it would suffice to include `bert-base-german-cased` in the MODELS dict from `BertParent.py`. However, as I currently understand it, these changes would also need to be published to PyPI to be usable with Docker.
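To make the proposed fix concrete, here is a minimal sketch of the MODELS lookup; this is not the actual `BertParent.py` source, and the `(model class, tokenizer class)` values are stubbed as strings for illustration (in the real code they would be transformers classes such as `BertModel` / `BertTokenizer`):

```python
# Hypothetical sketch of the MODELS dict in BertParent.py.
MODELS = {
    'bert-base-uncased': ('BertModel', 'BertTokenizer'),
    'bert-large-uncased': ('BertModel', 'BertTokenizer'),
    # Proposed addition, so the lookup no longer falls through to (None, None):
    'bert-base-german-cased': ('BertModel', 'BertTokenizer'),
}


def lookup(model_name):
    # An unknown name yields (None, None); calling
    # None.from_pretrained(...) is exactly the AttributeError in the
    # tracebacks above.
    return MODELS.get(model_name, (None, None))
```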
Thanks for the feedback! Unfortunately, it is still not working for me, and I am not sure how to go on or how to correctly use the german-bert.

```shell
docker run --rm -it -p 5000:5000 summary-service:latest -model bert-large-uncased
```

works well, but

```shell
docker run --rm -it -p 5000:5000 summary-service:latest -model bert-base-german-cased
```

still throws `AttributeError: 'NoneType' object has no attribute 'from_pretrained'`:

```
root@docker:~/bert-extractive-summarizer# docker run --rm -it -p 5000:5000 summary-service:latest -model bert-base-german-cased
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Unzipping tokenizers/punkt.zip.
100%|##########| 40155833/40155833 [00:02<00:00, 17176330.37B/s]
Using Model: bert-base-german-cased
Traceback (most recent call last):
  File "./server.py", line 90, in <module>
    greedyness=float(args.greediness)
  File "/usr/local/lib/python3.6/dist-packages/summarizer/model_processors.py", line 106, in __init__
    super(Summarizer, self).__init__(model, custom_model, custom_tokenizer, hidden, reduce_option, greedyness, language, random_state)
  File "/usr/local/lib/python3.6/dist-packages/summarizer/model_processors.py", line 80, in __init__
    greedyness, language=language, random_state=random_state)
  File "/usr/local/lib/python3.6/dist-packages/summarizer/model_processors.py", line 25, in __init__
    self.model = BertParent(model, custom_model, custom_tokenizer)
  File "/usr/local/lib/python3.6/dist-packages/summarizer/BertParent.py", line 38, in __init__
    self.model = base_model.from_pretrained(model, output_hidden_states=True)
AttributeError: 'NoneType' object has no attribute 'from_pretrained'
```
Is it also possible to add the multilingual BERT models as an option? (Or make it possible to indicate which BERT tokenizer, model, and pre-trained weights it should use?)

Update: I found out it is already possible, but the documentation leaves some room for interpretation (namely, that the custom model needs to already be pre-trained). Maybe it is possible to include the following passage for others to see how they can use it? @dmmiller612

```python
import transformers
from summarizer import Summarizer

bert_model = "bert-base-multilingual-cased"
custom_model = transformers.BertModel.from_pretrained(bert_model, output_hidden_states=True)
custom_tokenizer = transformers.BertTokenizer.from_pretrained(bert_model)
model = Summarizer(model=bert_model, custom_model=custom_model, custom_tokenizer=custom_tokenizer)
```
Yep, I can update the documentation.
I am having the same issue:

```
$ docker run --rm -it -p 5000:5000 summary-service:latest -model bert-base-multilingual-cased
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Unzipping tokenizers/punkt.zip.
100%|##########| 40155833/40155833 [00:25<00:00, 1561989.72B/s]
Using Model: bert-base-multilingual-cased
Traceback (most recent call last):
  File "./server.py", line 90, in <module>
    greedyness=float(args.greediness)
  File "/usr/local/lib/python3.6/dist-packages/summarizer/model_processors.py", line 106, in __init__
    super(Summarizer, self).__init__(model, custom_model, custom_tokenizer, hidden, reduce_option, greedyness, language, random_state)
  File "/usr/local/lib/python3.6/dist-packages/summarizer/model_processors.py", line 80, in __init__
    greedyness, language=language, random_state=random_state)
  File "/usr/local/lib/python3.6/dist-packages/summarizer/model_processors.py", line 25, in __init__
    self.model = BertParent(model, custom_model, custom_tokenizer)
  File "/usr/local/lib/python3.6/dist-packages/summarizer/BertParent.py", line 38, in __init__
    self.model = base_model.from_pretrained(model, output_hidden_states=True)
AttributeError: 'NoneType' object has no attribute 'from_pretrained'
```
Yeah, right now the service doesn't have a good way to load a custom model (it can easily be done with the library). I'll add something to hopefully address the issue sometime this week.
There's an ad-hoc solution if it's urgent. Replace this part in `bert_parent.py`:

```python
base_model, base_tokenizer = self.MODELS.get(model, (None, None))

if custom_model:
    self.model = custom_model
else:
    self.model = base_model.from_pretrained(model, output_hidden_states=True)

if custom_tokenizer:
    self.tokenizer = custom_tokenizer
else:
    self.tokenizer = base_tokenizer.from_pretrained(model)
```

with

```python
base_model, base_tokenizer = self.MODELS.get('bert-large-uncased', (None, None))

if custom_model:
    self.model = base_model.from_pretrained(custom_model, output_hidden_states=True)
else:
    self.model = base_model.from_pretrained(model, output_hidden_states=True)

if custom_tokenizer:
    self.tokenizer = base_tokenizer.from_pretrained(custom_tokenizer)
else:
    self.tokenizer = base_tokenizer.from_pretrained(model)
```

to make it work. Use with caution, since it's not a permanent solution. You can now use `Summarizer(custom_model='path_or_model', custom_tokenizer='path_or_model')`.
> Yeah, right now the service doesn't have a good way to load a custom model (it can easily be done with the library). I'll add something to hopefully address the issue sometime this week.

Hi, any update on loading the custom model?
Closing as stale. Let me know if any issues arise here.