
toxigen's Introduction

ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection

This repository contains the components we used to generate the ToxiGen dataset, which consists of implicitly toxic and benign sentences mentioning 13 minority groups. It also includes ALICE, a tool for stress-testing a given off-the-shelf content moderation system and iteratively improving it across these minority groups.

By releasing the source code and prompt seeds for this work, we hope to encourage the community to contribute, for example by adding prompt seeds and generating data for minority groups or scenarios not yet covered in our dataset, so that it can be continuously iterated on and improved (e.g., by submitting a PR to this repository).

The dataset is intended for training classifiers that learn to detect subtle hate speech containing no slurs or profanity. The data, methods, and two trained hate speech detection checkpoints released with this work are intended for research purposes only.

This repository includes two methods for generating new sentences given a large-scale pretrained language model (e.g., GPT-3) and an off-the-shelf classifier:

  • Demonstration-Based Prompting, where a language model is used to create more data from human-provided prompts across different minority groups.
  • ALICE, which creates an adversarial setup between a given toxicity classifier and a generator (a pretrained language model) to produce challenging examples for the classifier and improve its performance.

WARNING: This repository contains and discusses content that is offensive or upsetting. All materials are intended to support research that improves toxicity detection methods. Included examples of toxicity do not represent how the authors or sponsors feel about any identity groups.

Downloading ToxiGen

You can download ToxiGen using Hugging Face 🤗 from this webpage or through Python:

To run these commands, you'll need to create a Hugging Face auth token by following these steps. As discussed below, you can either pass use_auth_token={auth_token} manually or register your token with your transformers installation via huggingface-cli.

from datasets import load_dataset
TG_data = load_dataset("skg/toxigen-data", name="train", use_auth_token=True) # 250k training examples
TG_annotations = load_dataset("skg/toxigen-data", name="annotated", use_auth_token=True) # Human study
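After loading, you can quickly check what you downloaded. This is a minimal sketch, assuming the commands above succeeded; the "train" split key inside the returned DatasetDict is an assumption that may vary with your datasets version, so the columns are read from the object rather than hard-coded:

print(TG_data)                           # available splits and row counts
print(TG_data["train"].column_names)     # column names, read from the loaded object

train_df = TG_data["train"].to_pandas()  # convert to pandas for easier filtering
print(train_df.head())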

Optional, but helpful: Please fill out this form so we can track how the community uses ToxiGen.

Authorization Tokens

There are two ways to obtain an authorization token (you will need a Hugging Face account for either):

  1. Follow these directions to pass an authorization token in while loading the data.
  2. Use huggingface-cli, as you would when sharing a model on Hugging Face, to associate your account with your installed version of the transformers library (both options are sketched below).
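For example, option 1 corresponds to passing the token string directly, while option 2 lets use_auth_token=True pick up a cached login. This is a minimal sketch; the token value is a placeholder, not a real credential:

from datasets import load_dataset

# Option 1: pass your Hugging Face token explicitly (placeholder value shown).
TG_data = load_dataset("skg/toxigen-data", name="train", use_auth_token="hf_your_token_here")

# Option 2: run `huggingface-cli login` once in a terminal; afterwards,
# use_auth_token=True (as in the snippet above) will find the cached token.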

Installing ToxiGen source code

ToxiGen is bundled into a Python package that can be installed using pip:

pip install toxigen

Jupyter Notebook Example

Please use this Jupyter Notebook to get started with the main components of this repository.

We also include a Notebook showing how to download different parts of ToxiGen using HuggingFace.

Generating data with ToxiGen demonstration-based prompts

To generate data by passing prompts to the pretrained language model (GPT-3) used in this work, run the following command:

python generate.py --input_prompt_file <path_to_prompt_file.txt> --language_model GPT3 --output_file <path_to_output_file.txt> --num_generations_per_prompt 10 --openai_api_key <your_api_key>

You can choose from the prompt files that we used in this work, or write your own and point to that file (see "Writing your own demonstrations" below). A prompt file is a plain text file with one prompt (a string) per line.
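As a minimal sketch of how such a file is consumed (the path below is hypothetical; this is not generate.py itself):

# Read a prompt file: one prompt string per line.
def load_prompts(path):
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

prompts = load_prompts("prompts/my_prompt_file.txt")  # hypothetical path
print(f"{len(prompts)} prompts loaded; first prompt:\n{prompts[0]}")
# generate.py then queries the language model --num_generations_per_prompt times
# for each prompt and writes the continuations to --output_file.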

Generating data using ALICE

To generate data using ALICE, you need to choose a generator (GPT-3 in our case) and a pretrained hate speech classifier. We provide examples here, along with guidance on how to add new classifiers. To generate with ALICE, run this command:

python generate.py --input_prompts <path_to_prompt_file.txt> --language_model GPT3 --ALICE True --classifier HateBERT --output-file <path_to_output_file.txt> --openai_api_key <your_api_key>
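To make the adversarial setup concrete, here is a self-contained conceptual sketch of classifier-guided reranking. It is not the repository's ALICE implementation (which steers constrained beam search token by token); the candidates, scores, and toy weighting are invented purely for illustration:

import math

def rerank(candidates, lm_logprobs, toxicity_scores, mode="hate", weight=1.0):
    # Combine language-model likelihood with a toxicity classifier's score.
    # In "hate" mode, continuations the classifier rates as benign are pushed up,
    # yielding toxic text that evades the classifier; "neutral" mode does the
    # opposite. This mirrors ALICE's adversarial idea at a coarse,
    # whole-continuation level rather than per decoding step.
    sign = -1.0 if mode == "hate" else 1.0
    scored = [
        (lp + sign * weight * math.log(max(tox, 1e-9)), cand)
        for cand, lp, tox in zip(candidates, lm_logprobs, toxicity_scores)
    ]
    return [cand for _, cand in sorted(scored, reverse=True)]

# Toy example with made-up numbers:
candidates = ["continuation A", "continuation B", "continuation C"]
lm_logprobs = [-1.2, -1.5, -2.0]   # pretend language-model log-probabilities
toxicity = [0.9, 0.2, 0.6]         # pretend classifier P(toxic) per candidate
print(rerank(candidates, lm_logprobs, toxicity, mode="hate"))  # ranks B (lowest P(toxic)) first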

Writing your own demonstrations

In the demonstrations directory you can find the demonstrations we used to generate the dataset, which can help you write your own. Note that demonstration files contain one sentence per line, and all sentences within a file target the same group. Once you've written your demonstrations and want to turn them into prompts, run this command:

python make_prompts.py --input_demonstrations <path_to_demo_file.txt> --output-file <path_to_prompt.txt> --demonstrations_per_prompt 5 --num_prompt_to_generate 100
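The sketch below shows roughly what this step does: sample demonstrations_per_prompt demonstrations at a time and stack them into a single prompt. It is an illustrative approximation, not make_prompts.py itself; the file path and the "- " bullet formatting are assumptions, so check the files in demonstrations/ and prompts/ for the exact format:

import random

def build_prompts(demo_path, k=5, n=100, seed=0):
    with open(demo_path, "r", encoding="utf-8") as f:
        demos = [line.strip() for line in f if line.strip()]
    rng = random.Random(seed)
    prompts = []
    for _ in range(n):                     # n = --num_prompt_to_generate
        sample = rng.sample(demos, k)      # k = --demonstrations_per_prompt
        prompts.append("\n".join(f"- {d}" for d in sample) + "\n-")
    return prompts

prompts = build_prompts("demonstrations/my_demo_file.txt")  # hypothetical path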

Using checkpoints of pretrained classifiers on ToxiGen

We have fine-tuned two toxicity detection classifiers on the ToxiGen data, which yields the significant performance improvements reported in the paper. Their checkpoints can be loaded directly with Hugging Face's transformers library:

HateBERT_ToxiGen

HateBERT fine-tuned on ToxiGen can be downloaded in Python as follows:

from transformers import pipeline

toxigen_hatebert = pipeline("text-classification", model="tomh/toxigen_hatebert", tokenizer="bert-base-uncased")

toxigen_hatebert("I love science")

or

from transformers import AutoModelForSequenceClassification

toxigen_hatebert = AutoModelForSequenceClassification.from_pretrained("tomh/toxigen_hatebert")
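If you prefer the raw model over the pipeline, the sketch below scores a sentence directly. The tokenizer choice follows the pipeline example above; which output index corresponds to the toxic class is not asserted here and should be checked against the model card:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("tomh/toxigen_hatebert")
model.eval()

inputs = tokenizer("I love science", return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)
print(probs)  # one probability per class; see the model config for the label mapping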

RoBERTa_ToxiGen

RoBERTa fine-tuned on ToxiGen can be downloaded in Python as follows:

from transformers import pipeline

toxigen_roberta = pipeline("text-classification", model="tomh/toxigen_roberta")

toxigen_roberta("I love science")

or

from transformers import AutoModelForSequenceClassification

toxigen_roberta = AutoModelForSequenceClassification.from_pretrained("tomh/toxigen_roberta")
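The same direct-scoring pattern works for the RoBERTa checkpoint. Loading the tokenizer from the checkpoint itself is an assumption implied by the pipeline example above, so verify it against the model card:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tomh/toxigen_roberta")  # assumes the tokenizer ships with the checkpoint
model = AutoModelForSequenceClassification.from_pretrained("tomh/toxigen_roberta")
model.eval()

inputs = tokenizer("I love science", return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)
print(probs)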

Contributing

We encourage contributions of prompts and demonstrations to the ToxiGen repository. If you write new prompts that work well for your use cases, please add them.

Community Contributions

March 9, 2024: Demonstrations for Immigrants and Bisexuality were added from a Zurich hackathon.

Citation

Please use the following to cite this work:

@inproceedings{hartvigsen2022toxigen,
  title={ToxiGen: A Large-Scale Machine-Generated Dataset for Implicit and Adversarial Hate Speech Detection},
  author={Hartvigsen, Thomas and Gabriel, Saadia and Palangi, Hamid and Sap, Maarten and Ray, Dipankar and Kamar, Ece},
  booktitle={Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics},
  year={2022}
}

Responsible AI Considerations

Please note that there is still a lot about problematic language that this dataset does not capture. Our annotations may not reflect the full complexity of these issues, since problematic language is context-dependent, dynamic, and can manifest in different forms and severities. Problematic language is also fundamentally a human-centric problem and should be studied in conjunction with human experience; multi-disciplinary work is needed to better understand these aspects. Note also that this dataset only captures implicit toxicity (more precisely, hate speech) for 13 identified minority groups, and, given its scale, it is naturally noisy. Our goal in this project is to give the community a means of improving toxicity detection on implicitly toxic language for these groups. The dataset, and models trained on it, have limitations that can be the subject of future research, for example covering more target groups, or combinations of groups, that are not included in our work.

toxigen's People

Contributors

aflah02, dependabot[bot], gukoff, hamidpalangi, hamletdrc, julyvee, kljensen, kyle1668, microsoft-github-operations[bot], microsoft-github-policy-service[bot], skgabriel, thartvigsen, zqypku


toxigen's Issues

Prompts for Annotated Test Set

Awesome work!

I noticed the annotated test set (792 examples) doesn't have the original prompts for each generated text. Could that be found anywhere?

Thanks!

IndexError from alice.py

Hey!
Awesome paper and thank you for the open resources.

I am trying to reproduce generate_text.ipynb from the notebooks in Google Colab. The link in the notebook to Google Colab displayed an error so I created a duplicate here.

Date Seen
(06/05/2023)

Versions
Python 3.10

Steps to Reproduce
The bug occurred when calling alice() as shown in the notebook.

[screenshot of the error traceback]

The same thing happens with the command:

!python generate.py --input_prompt_file /content/drive/MyDrive/coding_projects/toxigen/prompts/neutral_black_1k.txt --language_model GPT3 --classifier RoBERTa --ALICE True --output_file test_file.txt --num_generations_per_prompt 10 --generation_mode neutral --endpoint_url https://api.openai.com/v1/engines/text-ada-001/completions --api_key <API-KEY>

There was a minor bug from generate.py, which can be resolved by rewriting the line to f.write(f"{response}\n").

Traceback (most recent call last):
  File "/mnt/c/Users/tranh/Desktop/unistuff/3iib/finalproject/chatbot-utterances/TOXIGEN-main/generate.py", line 52, in <module>
    main()
  File "/mnt/c/Users/tranh/Desktop/unistuff/3iib/finalproject/chatbot-utterances/TOXIGEN-main/generate.py", line 48, in main
    f.write(response + "\n")
TypeError: unsupported operand type(s) for +: 'dict' and 'str'

However, the main problem is IndexError and I am not sure how to fix it.

KeyError: 'choices'

First of all, let me thank your research team, and sorry if I don't speak English very well. I'm very interested in your research, but I have one error when running on both Jupyter and the notebook.
[screenshot of the KeyError traceback]
Looking forward to your reply. Thank you.

Warning - Legal and Ethical Issue for using this dataset - Issues could stem from political activism being built into algorithms

Having reviewed the ToxiGen machine learning algorithm and its associated dataset as hosted on GitHub: while I appreciate the effort to address and measure bias in language models, I'd like to raise some concerns regarding the current approach to demographic representation in your dataset.

Firstly, the dataset seems to overlook the inclusion of specific demographic groups, notably individuals of Caucasian descent and males. This omission is noteworthy as it potentially introduces a form of selection bias. By not representing these groups, the algorithm may inadvertently reinforce the misconception that bias and discrimination are issues exclusive to non-Caucasian and non-male populations. However, in a global context, any demographic group can be a minority, and their experiences and perspectives are essential in understanding and addressing bias comprehensively.

To enhance the robustness of your approach, I would recommend expanding the dataset to include a broader spectrum of demographic groups. This expansion would allow for a more holistic assessment of biases across different populations. Additionally, integrating a mechanism for the algorithm to recognize and classify a diverse range of groups would further refine its accuracy in gauging biases.

Furthermore, I suggest considering the implementation of intersectional analysis in your model. This approach could provide deeper insights into how overlapping identities (such as race, gender, socioeconomic status) contribute to unique experiences of bias and discrimination.

Your project has the potential to make significant contributions to the field of ethical AI. By broadening the scope of demographic representation in your dataset, the Toxigen algorithm can become a more inclusive and effective tool in identifying and mitigating biases in language models.

Care should be taken using this dataset, as it has problems with its classification system that could open the user up to lawsuits in many US states, and likely in many nations.
Its problem stems from being a proudly activist dataset that builds in bias as a feature. Instead of trying to eliminate all biases from the classification method, the authors have gone out of their way to include DEI and bias in the very core of the detection method used in ToxiGen. The authors have been specific that white people can never be the target of hate speech (unless, of course, they fit into another intersectional category).

Thank you for your dedication to this important work, and I look forward to seeing how your project evolves.

Adding licensing info

Hi! Would it be possible to add licensing info for the dataset as well as for code and models?
In particular, I'm interested in understanding if the dataset can be used in a commercial setting, in a research setting, whether and how attribution is required etc.
Thank you so much!!!

the issue of reproduced performance

following the use:
"
from transformers import pipeline
toxigen_hatebert = pipeline("text-classification", model="tomh/toxigen_hatebert", tokenizer="bert-base-cased")
toxigen_hatebert("I love science")
"
or
"
from transformers import AutoModelForSequenceClassification
toxigen_hatebert = AutoModelForSequenceClassification.from_pretrained("tomh/toxigen_hatebert")
"
The F1 score is very low and the method cannot reach the performance reported in the paper (evaluated on the ToxiGen validation dataset).

By the way, does index 1 of the logits correspond to the positive label (i.e., toxic)?

My Python script:
from transformers import AutoTokenizer, AutoModel
import json
from sklearn.metrics import f1_score,precision_score,recall_score,accuracy_score,roc_auc_score
from tqdm import tqdm
import torch
from transformers import pipeline

from transformers import AutoModelForSequenceClassification

with open(data_path, 'r') as f:
    datas = json.load(f)

toxigen_hatebert = AutoModelForSequenceClassification.from_pretrained("pretrained_models/toxigen_hatebert")
tokenizer = AutoTokenizer.from_pretrained('pretrained_models/bert-base-cased')

# toxigen_hatebert = pipeline("text-classification",
#                             model="pretrained_models/toxigen_hatebert",
#                             tokenizer="pretrained_models/bert-base-cased")

predicts = []
trues = []
toxigen_hatebert.eval()
with torch.no_grad():
    for idx, item in enumerate(tqdm(datas)):
        content = item['content']
        # outputs = toxigen_hatebert(content)
        inputs = tokenizer(content, return_tensors='pt', padding='longest')
        outputs = toxigen_hatebert(**inputs)
        print(outputs)
        trues.append(item['toxic'])
        predicts.append(1 - torch.argmax(outputs['logits'][0]).tolist())
        print('true: ', item['toxic'])
        print('pred: ', predicts[-1])

f1 = f1_score(y_true=trues, y_pred=predicts)
precision = precision_score(y_true=trues, y_pred=predicts)
recall = recall_score(y_true=trues, y_pred=predicts)
acc = accuracy_score(y_true=trues, y_pred=predicts)
auc = roc_auc_score(y_true=trues, y_score=predicts)

print(f1, precision, recall, acc, auc)

IndexError: list index out of range when running alice

Hi,
When I ran alice by python generate.py --input_prompts <path_to_prompt_file.txt> --language_model GPT3 --ALICE True --classifier HateBERT --output-file <path_to_output_file.txt> --openai_api_key <your_api_key>, I got an index error as shown in the figure.
[screenshot of the IndexError]

Previously, I also got an IndexError in File "toxigen/alice.py", line 130: scores_ = torch.Tensor([[[omit_(full_names[i][0][j], scores[i][0][j], stops, prompt) for j in range(len(scores[i][0]))]] for i in range(num_beams)]).
And I found that there are some blank lists in full_names and scores; here is an example of what we get for full_names:

[[], [], [], [[' without', ' in', ' that', ' with', ' where']], [[' Together', ' together', ' in', ' moving', ' up']], [[' United', ' reason', ' U', ' only', ' very']], [[' the', ' potential', ' their', ' what', ' how']], [[' another', ' any', ' if', ' anymore', ' here']], [['!', '.', '\n', ' and', ' .']], [[' is', ' are', ' can', ' will', ' may']]]

We can see that the first three are blank lists.
I wonder whether there are any hyperparameters that can keep the GPT-3 API from returning such blank outputs, or any other way to prevent this from happening.
Can you help me with this?

Index Issue

I was trying to run the example notebook generate_text.ipynb but ran into the following error.

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)

[<ipython-input-10-bc7211cfb8f8>](https://localhost:8080/#) in <cell line: 5>()
      3 
      4 # Let's pass our prompt right back in, but this time using ALICE
----> 5 response = alice(prompt, classifier, endpoint_url, secret_key, mode="hate") # Switch mode to "hate" to generate hate

5 frames

[/usr/local/lib/python3.10/dist-packages/toxigen/utils.py](https://localhost:8080/#) in alice(prompt, classifier, endpoint_url, api_key, mode, num_to_generate, device)
      5     language_model = ALICE(GPT3(endpoint_url=endpoint_url, apikey=api_key), classifier, mode, device="cpu")
      6 
----> 7     return [language_model(prompt) for _ in range(num_to_generate)]
      8 
      9 def label_annotations(annotated):

[/usr/local/lib/python3.10/dist-packages/toxigen/utils.py](https://localhost:8080/#) in <listcomp>(.0)
      5     language_model = ALICE(GPT3(endpoint_url=endpoint_url, apikey=api_key), classifier, mode, device="cpu")
      6 
----> 7     return [language_model(prompt) for _ in range(num_to_generate)]
      8 
      9 def label_annotations(annotated):

[/usr/local/lib/python3.10/dist-packages/toxigen/language_models.py](https://localhost:8080/#) in __call__(self, prompt)
     42 
     43     def __call__(self, prompt):
---> 44         return self.generate(prompt)
     45 
     46     def generate(self, prompt):

[/usr/local/lib/python3.10/dist-packages/toxigen/language_models.py](https://localhost:8080/#) in generate(self, prompt)
     49         else:
     50             flag = 1
---> 51         return beam_search(prompt, self.language_model, self.classifier, flag, self.device)

[/usr/local/lib/python3.10/dist-packages/toxigen/alice.py](https://localhost:8080/#) in beam_search(prompt, language_model, classifier, mode, device, end_token, weights, num_beams, vocab_size, max_length, length_penalty)
    126         full_names = [[list(x.keys()) for x in scores[i]] for i in range(num_beams)]
    127         scores = [[list(x.values()) for x in scores[i]] for i in range(num_beams)]
--> 128         scores_ = torch.Tensor([[[omit_(full_names[i][0][j], scores[i][0][j], stops, prompt) for j in range(len(scores[i][0]))]] for i in range(num_beams)])
    129         scores = scores_.view(num_beams * 1, vocab_size)
    130         full_names = list(itertools.chain.from_iterable(list(itertools.chain.from_iterable(full_names))))

[/usr/local/lib/python3.10/dist-packages/toxigen/alice.py](https://localhost:8080/#) in <listcomp>(.0)
    126         full_names = [[list(x.keys()) for x in scores[i]] for i in range(num_beams)]
    127         scores = [[list(x.values()) for x in scores[i]] for i in range(num_beams)]
--> 128         scores_ = torch.Tensor([[[omit_(full_names[i][0][j], scores[i][0][j], stops, prompt) for j in range(len(scores[i][0]))]] for i in range(num_beams)])
    129         scores = scores_.view(num_beams * 1, vocab_size)
    130         full_names = list(itertools.chain.from_iterable(list(itertools.chain.from_iterable(full_names))))

IndexError: list index out of range
------------------------------------------------------------------------------------------------------------------------------------------------------------

Can you please tell me how to resolve this error?

Issue With prompt_label in Data

Hey
I was trying to separate out the toxic and benign samples but noticed something weird. All the values in the prompt_label column are 0 for some reason. Is this a mistake or am I missing something?


A question about data set and labeling

Dear project managers:
When I downloaded the original dataset from the Hugging Face link "https://huggingface.co/datasets/skg/toxigen-data", I noticed that there are two other files, "annotated_train.csv" and "annotated_test.csv", besides "toxigen.csv". I have two questions:

  1. What is the connection between "annotated_train.csv" and the original "toxigen.csv"? What about "annotated_test.csv" and the original "toxigen.csv"?
  2. Besides "label", "annotated_train.csv" also provides "toxicity_ai" and "toxicity_human" scores. Based on the paper (page 7), an example should be labeled "toxic" if the maximum of "toxicity_ai" and "toxicity_human" is > 3. However, many examples are labeled "toxic" while the maximum score is < 3.

Several examples are as follows:

text target_group factual? ingroup_effect lewd framing predicted_group stereotyping intent toxicity_ai toxicity_human predicted_author label actual_method
b'i was walking down the street the other day and i saw a group of about 5 latino men walking towards me.' mexican neither factual nor opinion doesn't use in-group language doesn't contain lewd content disagreement the text directly references mexican doesn't use positive stereotyping 2 2.666667 2.666667 human hate CBS
b'the most dangerous job in america is fishing' women neither factual nor opinion doesn't use in-group language doesn't contain lewd content it is unclear who is referenced doesn't use positive stereotyping 1 1 1 human hate TopK
b'i am not racist, i am just saying the truth, you should be proud of who you are ' middle_east opinion doesn't use in-group language doesn't contain lewd content it is unclear who is referenced doesn't use positive stereotyping 2.666667 2.666667 2.666667 human hate CBS
b'i really like jesus a lot' native_american opinion doesn't use in-group language doesn't contain lewd content it is unclear who is referenced doesn't use positive stereotyping 1 1 1 human hate TopK

Maybe I missed something. I am sincerely looking forward to your reply. Thank you.

Warning - Use of this labelling algorithm could lead to legal or ethical problems similar to Gemini

Isn't Gemini the warning against overzealous activism being introduced into the guardrails? ToxiGen is courting the same kind of negative attention, and I hope developers will think twice before using it in its current form. Double standards in labeling data need to be removed, otherwise some users are going to end up facing lawsuits or financial penalties.


Prediction error while using pipeline

I am using your example to predict whether a text is toxic or not, and I get the following error:

Output exceeds the size limit. Open the full output data in a text editor
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[65], line 1
----> 1 predict(data)

Cell In[64], line 3, in predict(df)
      1 def predict(df):
      2     for index, row in df.iterrows():
----> 3         prediction= toxigen_hatebert(row["comment"])
      4         if prediction[0]["label"]=='LABEL_1':
      5              df['LABEL_1'].loc[index] =prediction[0]["score"]

File c:\Users\superserver\anaconda3\envs\guiller\lib\site-packages\transformers\pipelines\text_classification.py:155, in TextClassificationPipeline.__call__(self, *args, **kwargs)
    121 def __call__(self, *args, **kwargs):
    122     """
    123     Classify the text(s) given as inputs.
    124 
   (...)
    153         If `top_k` is used, one such dictionary is returned per label.
    154     """
--> 155     result = super().__call__(*args, **kwargs)
    156     # TODO try and retrieve it in a nicer way from _sanitize_parameters.
    157     _legacy = "top_k" not in kwargs

File c:\Users\superserver\anaconda3\envs\guiller\lib\site-packages\transformers\pipelines\base.py:1084, in Pipeline.__call__(self, inputs, num_workers, batch_size, *args, **kwargs)
...
   2208     # remove once script supports set_grad_enabled
   2209     _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 2210 return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)

IndexError: index out of range in self

I know this is an error related to the length of the input, but I tokenized the whole dataset beforehand and the max token length I get is 392. The code is as follows:

from transformers import pipeline
toxigen_hatebert = pipeline("text-classification", model="tomh/toxigen_hatebert", tokenizer="tomh/toxigen_hatebert")

Fine-tuned HateBERT giving an IndexError

Hi,
My code is

tokenizer = AutoTokenizer.from_pretrained("tomh/toxigen_hatebert")
model = AutoModelForSequenceClassification.from_pretrained("tomh/toxigen_hatebert")
example = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = model(**example).logits

and I'm getting this error:

logits = model(**examples).logits
File "/home/saghar/anaconda3/envs/clm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
 return forward_call(*input, **kwargs)
File "/home/saghar/anaconda3/envs/clm/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 1556, in forward
 outputs = self.bert(
File "/home/saghar/anaconda3/envs/clm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
 return forward_call(*input, **kwargs)
File "/home/saghar/anaconda3/envs/clm/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 1011, in forward
 embedding_output = self.embeddings(
File "/home/saghar/anaconda3/envs/clm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
 return forward_call(*input, **kwargs)
File "/home/saghar/anaconda3/envs/clm/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 235, in forward
 inputs_embeds = self.word_embeddings(input_ids)
File "/home/saghar/anaconda3/envs/clm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
 return forward_call(*input, **kwargs)
File "/home/saghar/anaconda3/envs/clm/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 158, in forward
 return F.embedding(
File "/home/saghar/anaconda3/envs/clm/lib/python3.10/site-packages/torch/nn/functional.py", line 2183, in embedding
 return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self

Could you please let me know what is causing this error?
Thank you!

Random sentences popping-up

Hi! First off, kudos for the amazing work done!

I had a few doubts regarding randomly generated sentences popping up while I was going through the dataset.
While randomly sampling sentences for each group, I noticed that the selected generated sequence is sometimes empty (just the \ token) or even random LaTeX-formatted text, for example. I'm not sure I fully understand what causes these cases; is there any work done to avoid this?


Unable to Load Dataset

Hey!
Awesome Paper and codebase, it's very well documented!!
I've been facing some issues trying to load the dataset. I tried to load it on Colab using the following lines -

from datasets import load_dataset
TG_data = load_dataset("skg/toxigen-data", name="train", use_auth_token=True) # 250k training examples
TG_annotations = load_dataset("skg/toxigen-data", name="annotated", use_auth_token=True) # Human study

I got the following error -


---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
[<ipython-input-2-f61e6e9de847>](https://localhost:8080/#) in <module>()
      1 from datasets import load_dataset
----> 2 TG_data = load_dataset("skg/toxigen-data", name="train", use_auth_token=True) # 250k training examples
      3 TG_annotations = load_dataset("skg/toxigen-data", name="annotated", use_auth_token=True) # Human study

7 frames
[/usr/local/lib/python3.7/dist-packages/requests/models.py](https://localhost:8080/#) in raise_for_status(self)
    939 
    940         if http_error_msg:
--> 941             raise HTTPError(http_error_msg, response=self)
    942 
    943     def close(self):

HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/datasets/skg/toxigen-data

I suspect it's because of some authorization issue, but I've filled out the form and I'm not quite sure what else I should do.

Questions on HateBERT_ToxiGen

Hello,

First of all, thank you for your great work!

I have some questions about the details of HateBERT_ToxiGen (https://huggingface.co/tomh/toxigen_hatebert).

  1. Is this model the same model used in Table 4 in the paper (ALICE+top-k)?
  2. Regarding the description in the paper, "We fine-tune the widely-used HateBERT ... on the training portion of TOXIGEN", does the "training portion" mean "train subset" with 250,951 samples? or "train split" of "annotated subset" with 8960 samples? (https://huggingface.co/datasets/skg/toxigen-data)

Again, thank you a lot for your great work :)
