Comments (5)

1shershah commented on July 1, 2024 6

Solution:

import torch
from pprint import pprint
from transformers import BertTokenizer

# Local file from the goemotions-pytorch repo
from model import BertForMultiLabelClassification

tokenizer = BertTokenizer.from_pretrained("monologg/bert-base-cased-goemotions-ekman")
model = BertForMultiLabelClassification.from_pretrained("monologg/bert-base-cased-goemotions-ekman")

texts = [
    "Hey that's a thought! Maybe we need [NAME] to be the celebrity vaccine endorsement!",
    "it’s happened before?! love my hometown of beautiful new ken 😂😂",
    "I love you, brother.",
    "Troll, bro. They know they're saying stupid shit. The motherfucker does nothing but stink up libertarian subs talking shit",
]

threshold = 0.3
results = []
for txt in texts:
    inputs = tokenizer(txt, return_tensors="pt")
    outputs = model(**inputs)
    # Sigmoid turns each logit into an independent per-label probability
    probs = torch.sigmoid(outputs[0])
    for item in probs:
        # Separate names here, so the batch tensor above is not shadowed
        labels = []
        scores = []
        for idx, s in enumerate(item):
            if s > threshold:
                labels.append(model.config.id2label[idx])
                scores.append(s)
        results.append({"labels": labels, "scores": scores})

pprint(results)
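
The loop above scores every label independently with a sigmoid and keeps those above the threshold, so a text can receive zero, one, or several labels. A minimal sketch of that selection step in plain Python, assuming made-up logits and a hypothetical three-entry label map (neither comes from the model):

```python
import math


def sigmoid(x: float) -> float:
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def select_labels(logits, id2label, threshold=0.3):
    """Keep every label whose sigmoid score exceeds the threshold."""
    labels, scores = [], []
    for idx, logit in enumerate(logits):
        p = sigmoid(logit)
        if p > threshold:
            labels.append(id2label[idx])
            scores.append(p)
    return {"labels": labels, "scores": scores}


# Hypothetical label map and logits, for illustration only
id2label = {0: "anger", 1: "joy", 2: "neutral"}
result = select_labels([-2.0, 0.0, 1.5], id2label)
print(result["labels"])  # ['joy', 'neutral']
```

Unlike a softmax, the scores do not have to sum to 1, which is why two labels can both pass the threshold.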

from goemotions-pytorch.

hannahburkhardt commented on July 1, 2024 3

@bubbazz if you mean that you aren't getting outputs for all labels, but only the main labels, try this.

from transformers import BertTokenizer, AutoModelForSequenceClassification, pipeline

model_name = "original"  # or "ekman"
# The "original" taxonomy has 28 labels; the "ekman" taxonomy has 7
num_labels = 28

tokenizer = BertTokenizer.from_pretrained(f"monologg/bert-base-cased-goemotions-{model_name}")
model = AutoModelForSequenceClassification.from_pretrained(f"monologg/bert-base-cased-goemotions-{model_name}", num_labels=num_labels)

goemotions = pipeline(
    task="text-classification",
    model=model,
    tokenizer=tokenizer,
    return_all_scores=True,  # deprecated in newer transformers releases in favor of top_k=None
    function_to_apply="sigmoid",
)

# `texts` is the list of strings from the earlier snippet
goemotions(texts)
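
With all scores returned, picking the labels for each text is left to the caller. A minimal sketch of that step, assuming the pipeline returns one list per input text, each holding `{"label": ..., "score": ...}` dicts; the `sample_output` values and the `filter_by_threshold` helper are invented for illustration:

```python
def filter_by_threshold(all_scores, threshold=0.3):
    """Keep only the label/score entries above the threshold, per text."""
    return [
        [entry for entry in per_text if entry["score"] > threshold]
        for per_text in all_scores
    ]


# Invented scores in the assumed output shape, not real model output
sample_output = [
    [
        {"label": "joy", "score": 0.91},
        {"label": "anger", "score": 0.02},
        {"label": "neutral", "score": 0.35},
    ],
    [
        {"label": "joy", "score": 0.05},
        {"label": "anger", "score": 0.76},
        {"label": "neutral", "score": 0.10},
    ],
]

kept = filter_by_threshold(sample_output)
print([[entry["label"] for entry in per_text] for per_text in kept])
# [['joy', 'neutral'], ['anger']]
```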


santimarro commented on July 1, 2024

Hey, I had the same issue but I managed to make it work with this:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
from pprint import pprint
from multilabel_pipeline import MultiLabelPipeline

tokenizer = AutoTokenizer.from_pretrained(
    "monologg/bert-base-cased-goemotions-original"
)
model = AutoModelForSequenceClassification.from_pretrained(
    "monologg/bert-base-cased-goemotions-original"
)

def tokenize_text(text):
    # Replace "text" with whatever column name has your text inputs
    return tokenizer(text, truncation=True)

texts = [
    "Hey that's a thought! Maybe we need [NAME] to be the celebrity vaccine endorsement!",
    "it’s happened before?! love my hometown of beautiful new ken 😂😂",
    "I love you, brother.",
    "Troll, bro. They know they're saying stupid shit. The motherfucker does nothing but stink up libertarian subs talking shit",
]

goemotions = MultiLabelPipeline(
    model=model,
    tokenizer=tokenizer,
    threshold=0.3
)
pprint(goemotions(texts))

Just make sure to use the MultiLabelPipeline class from the Python file of the same name (multilabel_pipeline.py) in this repo!


1shershah commented on July 1, 2024


This will still not work. If you create a custom pipeline by subclassing the abstract Pipeline class, you have to override its abstract methods: _sanitize_parameters, preprocess, _forward, and postprocess.
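
A minimal sketch of such a subclass, assuming a PyTorch sequence-classification model whose first output is the logits. This is an illustration of the four overrides, not the repo's multilabel_pipeline.py; the `threshold` keyword and the returned dict layout are choices made here to mirror the snippets in this thread:

```python
import torch
from transformers import Pipeline


class MultiLabelPipeline(Pipeline):
    """Sketch: sigmoid score per label, then keep labels above a threshold."""

    def _sanitize_parameters(self, threshold=None, **kwargs):
        # Route user kwargs to the stage that consumes them:
        # (preprocess kwargs, forward kwargs, postprocess kwargs).
        postprocess_kwargs = {}
        if threshold is not None:
            postprocess_kwargs["threshold"] = threshold
        return {}, {}, postprocess_kwargs

    def preprocess(self, text):
        # Tokenize one text into model-ready tensors
        return self.tokenizer(text, truncation=True, return_tensors="pt")

    def _forward(self, model_inputs):
        return self.model(**model_inputs)

    def postprocess(self, model_outputs, threshold=0.3):
        # Assumption: output index 0 holds the logits; the second [0]
        # drops the batch dimension (one text per call).
        probs = torch.sigmoid(model_outputs[0][0]).tolist()
        id2label = self.model.config.id2label
        kept = [(id2label[i], p) for i, p in enumerate(probs) if p > threshold]
        return {
            "labels": [label for label, _ in kept],
            "scores": [score for _, score in kept],
        }
```

It would be constructed the same way as in the snippet above, for example `MultiLabelPipeline(model=model, tokenizer=tokenizer, threshold=0.3)`, and called on a string or a list of strings.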


bubbazz commented on July 1, 2024

The implementation above is nearly identical to pipeline.py, but I get different results. Can somebody explain the reason for this?

results:

[{'labels': ['joy', 'neutral'],
  'scores': [tensor(0.3892, grad_fn=<UnbindBackward0>),
   tensor(0.5499, grad_fn=<UnbindBackward0>)]},
 {'labels': ['joy', 'surprise'],
  'scores': [tensor(0.9277, grad_fn=<UnbindBackward0>),
   tensor(0.4548, grad_fn=<UnbindBackward0>)]},
 {'labels': ['joy'], 'scores': [tensor(0.9889, grad_fn=<UnbindBackward0>)]},
 {'labels': ['anger'], 'scores': [tensor(0.7580, grad_fn=<UnbindBackward0>)]}]

