DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter (arXiv:1910.01108)
How to use ahmednasser/DistilBert-FakeNews with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="ahmednasser/DistilBert-FakeNews")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("ahmednasser/DistilBert-FakeNews")
model = AutoModelForSequenceClassification.from_pretrained("ahmednasser/DistilBert-FakeNews")
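As a rough sketch of how the pipeline might be applied to several texts at once, the helper below pairs each input with its predicted label and score. The function name `classify_headlines` is hypothetical, and the label strings returned by this checkpoint (for example `LABEL_0` / `LABEL_1`) are an assumption, since the card does not document them. `truncation=True` is passed because news articles can exceed the model's maximum input length.

```python
def classify_headlines(pipe, texts):
    """Run a text-classification pipeline and pair each text with its prediction.

    `pipe` is any callable behaving like a transformers text-classification
    pipeline: given a list of strings, it returns one dict per input with
    "label" and "score" keys.
    """
    texts = list(texts)
    # truncation=True cuts over-long inputs down to the model's maximum length.
    results = pipe(texts, truncation=True)
    return [(text, r["label"], r["score"]) for text, r in zip(texts, results)]


if __name__ == "__main__":
    # Requires the transformers library and network access to download the model.
    from transformers import pipeline

    pipe = pipeline("text-classification", model="ahmednasser/DistilBert-FakeNews")
    for text, label, score in classify_headlines(pipe, ["Example news headline"]):
        print(f"{label} ({score:.3f}): {text}")
```

Passing the pipeline in as an argument keeps the helper independent of how the model was loaded, so the same function works with a locally saved copy of the checkpoint.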
DistilBERT is produced by applying knowledge distillation during the pre-training phase, which reduces the size of a BERT model by 40% while retaining 97% of its language understanding capabilities and running 60% faster. This makes it smaller and faster than BERT.
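To illustrate the idea behind knowledge distillation, here is a minimal sketch of a soft-target loss: the student is penalized by the cross-entropy between its temperature-softened output distribution and the teacher's. This is only one component; the DistilBERT paper combines a distillation loss with a masked language modeling loss and a cosine embedding loss. The function names and the temperature value of 2.0 are illustrative assumptions, not the authors' code.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student's softened distribution against the teacher's."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher, student))


if __name__ == "__main__":
    teacher = [2.0, 0.0, -1.0]
    # A student that matches the teacher incurs a lower loss than one that disagrees.
    print(distillation_loss(teacher, [2.0, 0.0, -1.0]))
    print(distillation_loss(teacher, [-1.0, 0.0, 2.0]))
```

A higher temperature flattens both distributions, so the student also learns from the relative probabilities the teacher assigns to the less likely classes.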
This model is distilbert-base-uncased fine-tuned on the Fake News dataset with the following hyperparameters:
learning rate: 5e-5
batch size: 32
number of training epochs: 2
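The hyperparameters above could be expressed for the Transformers `Trainer` roughly as follows. This is a sketch, not the author's training script: mapping "batch size 32" to `per_device_train_batch_size` assumes training on a single device, the output directory name is hypothetical, and every setting not listed in the card is left at the library default.

```python
# Hyperparameters as stated in the model card.
HYPERPARAMS = {
    "learning_rate": 5e-5,
    # Assumption: single-device training, so the per-device batch size
    # equals the overall batch size of 32.
    "per_device_train_batch_size": 32,
    "num_train_epochs": 2,
}


def build_training_args(output_dir="distilbert-fakenews-output"):
    """Build TrainingArguments from the card's hyperparameters.

    The default output_dir is a hypothetical placeholder.
    """
    # Imported lazily so the hyperparameter dict is usable without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)


if __name__ == "__main__":
    args = build_training_args()
    print(args.learning_rate, args.per_device_train_batch_size, args.num_train_epochs)
```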
Full code available @ DistilBert-FakeNews
Dataset available @ Fake News dataset: https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset