Fine-Tune Transformer Models For Question Answering On Custom Data

A tutorial on fine-tuning the Hugging Face RoBERTa QA Model on custom data and obtaining significant performance boosts

Jan 18, 2023

Extractive Question Answering | Skanda Vivek

Question Answering and Transformers

BERT is a transformer model, introduced in late 2018, that took the world by storm. BERT was pre-trained on unlabeled text by masking words and training the model to predict the masked words from their context. It was later fine-tuned on multiple tasks and achieved state-of-the-art performance on many specific language tasks. In particular, BERT was fine-tuned on 100k+ question-answer pairs from the SQuAD dataset, which consists of questions posed on Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding passage.
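To make the idea of a "span" concrete, here is a minimal sketch of how an extractive QA example can be represented: the answer is stored as text plus a character offset into the passage. The field names follow the SQuAD-style layout (`context`, `question`, `answers` with `text` and `answer_start`); the example sentence and the helper `extract_span` are made up for illustration.

```python
def extract_span(context, answer_start, answer_text):
    """Return (start, end) character offsets of the answer inside the context.

    Raises ValueError if the stored offset does not point at the answer text,
    which is a cheap sanity check worth running on any custom QA dataset.
    """
    end = answer_start + len(answer_text)
    if context[answer_start:end] != answer_text:
        raise ValueError("answer_start does not match answer_text")
    return answer_start, end


# A hypothetical SQuAD-style record (not taken from the real dataset).
context = "BERT was trained on unlabeled data by masking words."
answer = "by masking words"
example = {
    "context": context,
    "question": "How was BERT trained?",
    "answers": {"text": [answer], "answer_start": [context.index(answer)]},
}

start, end = extract_span(
    example["context"],
    example["answers"]["answer_start"][0],
    example["answers"]["text"][0],
)
print(example["context"][start:end])
```

An extractive QA model is trained to predict exactly these two positions (at the token level rather than the character level), which is why every answer has to appear verbatim in the passage.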

BERT Transformer Architecture from https://arxiv.org/abs/1810.04805

The RoBERTa model, released soon after, built on BERT by modifying key hyperparameters and improving the training procedure. The model we are interested in is the fine-tuned RoBERTa model released by deepset on Hugging Face, which was downloaded 1M+ times last month.
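As a rough sketch, a fine-tuned QA model like this can be queried through the `transformers` question-answering pipeline. The model id `deepset/roberta-base-squad2` is my assumption about which deepset checkpoint is meant, since the text above doesn't name it; the helper names `load_qa_pipeline` and `answer_question` are hypothetical.

```python
# Assumption: this is the deepset RoBERTa checkpoint the post refers to.
MODEL_ID = "deepset/roberta-base-squad2"


def load_qa_pipeline(model_id=MODEL_ID):
    """Build a question-answering pipeline (downloads weights on first use)."""
    # Imported lazily so the rest of this snippet works without transformers.
    from transformers import pipeline

    return pipeline("question-answering", model=model_id, tokenizer=model_id)


def answer_question(qa, question, context):
    """Run one question against one passage and return (answer, score)."""
    result = qa(question=question, context=context)
    return result["answer"], result["score"]
```

Calling `answer_question(load_qa_pipeline(), "Who released the model?", "The model was released by deepset.")` should return a span copied from the passage along with a confidence score. The pipeline result also carries `start` and `end` character offsets for the span.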

As an example, let’s use data from the SubjQA dataset, which contains 10,000 questions over reviews from 6 different domains: books, movies, grocery, electronics, TripAdvisor (i.e. hotels), and restaurants.

In particular, since I’m illustrating the power of fine-tuning, I’m going to go with questions and answers generated from movie reviews. These are conveniently split into 2 CSV files for training (train.csv) and testing (test.csv).
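Here is one way the two CSV files might be read into SQuAD-style records, using only the standard library. The column names (`review`, `question`, `human_ans_spans`, `human_ans_indices`) and the `"(start, end)"` string format for the indices are assumptions based on the public SubjQA release, so check them against your own files. The helpers `rows_to_squad` and `load_squad_csv` are hypothetical names.

```python
import ast
import csv


def rows_to_squad(rows, context_col="review", question_col="question",
                  span_col="human_ans_spans", index_col="human_ans_indices"):
    """Convert CSV rows (dicts) into SQuAD-style records.

    Assumes index_col holds a string such as "(13, 21)" with the character
    offsets of the answer span inside the review text.
    """
    records = []
    for row in rows:
        start, _end = ast.literal_eval(row[index_col])
        records.append({
            "context": row[context_col],
            "question": row[question_col],
            "answers": {"text": [row[span_col]], "answer_start": [start]},
        })
    return records


def load_squad_csv(path):
    """Read one CSV file (e.g. train.csv) and return SQuAD-style records."""
    with open(path, newline="", encoding="utf-8") as f:
        return rows_to_squad(csv.DictReader(f))
```

With the files in the working directory, `load_squad_csv("train.csv")` and `load_squad_csv("test.csv")` would give the training and test sets in a common shape for tokenization and fine-tuning.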

Real-World AI