Summary
Question Answering with BERT (Bidirectional Encoder Representations from Transformers) has revolutionized natural language understanding and information retrieval. BERT, a state-of-the-art transformer-based model, excels at capturing contextual relationships in text due to its bidirectional attention mechanism. In Question Answering tasks, BERT processes a given question and context to generate embeddings for each word, taking into account the entire context surrounding the words. The model then identifies the answer span within the context, effectively understanding and extracting relevant information. Fine-tuned for QA, BERT has demonstrated remarkable performance on various datasets and is widely adopted in applications such as search engines, virtual assistants, and information retrieval systems, enhancing the accuracy and efficiency of extracting precise answers from textual data.
The Python functions and data files needed to run this notebook are available on my GitHub page.
import warnings
warnings.filterwarnings('ignore')
from transformers import BertTokenizerFast, BertForQuestionAnswering, pipeline, \
DataCollatorWithPadding, TrainingArguments, Trainer, \
AutoModelForQuestionAnswering, AutoTokenizer
from datasets import Dataset
import pandas as pd
import matplotlib.pyplot as plt
from bs4 import BeautifulSoup
import requests
There are two types of question answering: extractive and abstractive.

| Extractive Answering | Abstractive Answering |
|---|---|
| The answer to a question is a direct substring of the given context | The answer to a question is a free-form phrase based on the context |
| An encoder alone is enough (e.g., BERT) | A decoder is required (e.g., GPT, T5) |
We can give BERT a question and a context, and it extracts the span of the context that answers the question. This is extractive question answering.
# We use the large uncased BERT: question answering has relatively few labelled
# examples, so starting from a larger pretrained model helps
bert_tokenizer = BertTokenizerFast.from_pretrained('bert-large-uncased', return_token_type_ids=True)
qa_bert = BertForQuestionAnswering.from_pretrained('bert-large-uncased')
Some weights of the model checkpoint at bert-large-uncased were not used when initializing BertForQuestionAnswering: ['cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.seq_relationship.weight'] - This IS expected if you are initializing BertForQuestionAnswering from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model). - This IS NOT expected if you are initializing BertForQuestionAnswering from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model). Some weights of BertForQuestionAnswering were not initialized from the model checkpoint at bert-large-uncased and are newly initialized: ['qa_outputs.bias', 'qa_outputs.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
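Before fine-tuning, it helps to see what the QA head actually does: for every token it produces a start logit and an end logit, and the predicted answer span runs from the argmax of the start logits to the argmax of the end logits. Below is a minimal sketch using the `qa_bert` and `bert_tokenizer` objects loaded above; the toy question and context are made up, and the span will be essentially random (possibly empty) until the newly initialized `qa_outputs` head is trained.

```python
import torch

question = "Where does Mehdi live?"   # toy example for illustration
context = "Mehdi lives in Calgary but Hamid lives in Edmonton."

inputs = bert_tokenizer(question, context, return_tensors='pt')
with torch.no_grad():
    outputs = qa_bert(**inputs)

# The QA head returns one start logit and one end logit per token
start_idx = int(outputs.start_logits.argmax())
end_idx = int(outputs.end_logits.argmax())

# Decode the predicted span (start token through end token, inclusive)
print(bert_tokenizer.decode(inputs['input_ids'][0][start_idx:end_idx + 1]))
```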
SQuAD Data Set¶
The SQuAD 2.0 question-answering data set was downloaded from Kaggle. Each row has a question column whose answer lies within the context column; the text column contains the answer, and answer_start gives the character index where the answer begins inside context.
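To see how these columns relate, here is a quick sketch (assuming the raw Kaggle file train-squad.csv with its original column names) that slices the context at answer_start:

```python
import pandas as pd

# Check that answer_start points at the answer inside the context (raw Kaggle columns)
raw = pd.read_csv('train-squad.csv')
row = raw.iloc[0]
# The slice of 'context' starting at 'answer_start' should equal the 'text' column
print(row.context[int(row.answer_start): int(row.answer_start) + len(row.text)])
print(row.text)
```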
pd.set_option('display.max_colwidth', None)
# load training data set
df_qa = pd.read_csv('train-squad.csv')
df_qa.rename({'text': 'answer'}, axis=1,inplace=True)
df_qa = df_qa[['context','question','answer']]
print(df_qa.shape)
(86821, 3)
df_qa[:3]
context | question | answer | |
---|---|---|---|
0 | Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, record producer and actress. Born and raised in Houston, Texas, she performed in various singing and dancing competitions as a child, and rose to fame in the late 1990s as lead singer of R&B girl-group Destiny's Child. Managed by her father, Mathew Knowles, the group became one of the world's best-selling girl groups of all time. Their hiatus saw the release of Beyoncé's debut album, Dangerously in Love (2003), which established her as a solo artist worldwide, earned five Grammy Awards and featured the Billboard Hot 100 number-one singles "Crazy in Love" and "Baby Boy". | When did Beyonce start becoming popular? | in the late 1990s |
1 | Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, record producer and actress. Born and raised in Houston, Texas, she performed in various singing and dancing competitions as a child, and rose to fame in the late 1990s as lead singer of R&B girl-group Destiny's Child. Managed by her father, Mathew Knowles, the group became one of the world's best-selling girl groups of all time. Their hiatus saw the release of Beyoncé's debut album, Dangerously in Love (2003), which established her as a solo artist worldwide, earned five Grammy Awards and featured the Billboard Hot 100 number-one singles "Crazy in Love" and "Baby Boy". | What areas did Beyonce compete in when she was growing up? | singing and dancing |
2 | Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, record producer and actress. Born and raised in Houston, Texas, she performed in various singing and dancing competitions as a child, and rose to fame in the late 1990s as lead singer of R&B girl-group Destiny's Child. Managed by her father, Mathew Knowles, the group became one of the world's best-selling girl groups of all time. Their hiatus saw the release of Beyoncé's debut album, Dangerously in Love (2003), which established her as a solo artist worldwide, earned five Grammy Awards and featured the Billboard Hot 100 number-one singles "Crazy in Love" and "Baby Boy". | When did Beyonce leave Destiny's Child and become a solo singer? | 2003 |
Start and End Index of Answer within Context¶
def find_idx(big_index, small_index):
    """
    Find the token positions of the sequence 'small_index' within 'big_index'.
    Parameters:
    - big_index (list): The larger sequence of token ids (question + context).
    - small_index (list): The smaller sequence of token ids (the answer) to locate.
    Returns:
    - list: The indices in 'big_index' covered by the first full match of 'small_index',
      or None if the sequence is not found.
    """
    n = len(small_index)
    # Slide a window of length n over 'big_index' and compare it with 'small_index'
    for i in range(len(big_index) - n + 1):
        if big_index[i:i + n] == small_index:
            # Return every index of the matched span (start token through end token)
            return list(range(i, i + n))
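A quick toy call (with made-up token ids) illustrates what find_idx returns:

```python
# The sub-sequence [2, 9] covers positions 2 and 3 of the larger sequence
print(find_idx([5, 8, 2, 9, 3], [2, 9]))   # -> [2, 3]
print(find_idx([5, 8, 2, 9, 3], [7]))      # -> None (not found)
```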
def file_add(x):
    """
    Tokenize the input question and context using the BERT tokenizer and find the token
    indices corresponding to the answer within the tokenized sequence.
    Parameters:
    - x (dict): Input row containing 'question', 'context', and 'answer' keys.
    Returns:
    - tuple: The starting and ending token indices of the answer within the tokenized sequence.
      If the answer is not found, returns (-1, -1).
    """
    # Tokenize the question and context together
    qst_contxt = bert_tokenizer.encode(x['question'], x['context'])
    try:
        # Tokenize the answer and drop the [CLS] and [SEP] special tokens
        answr = bert_tokenizer.encode(x['answer'])[1:-1]
        # Find the indices of the answer tokens within the tokenized question + context
        answr_idx = find_idx(qst_contxt, answr)
        try:
            if len(answr_idx) > 1:
                # Multi-token answer: use the first and last indices as start and end
                tkn_strt, tkn_end = answr_idx[0], answr_idx[-1]
            else:
                # Single-token answer: use the same index for start and end
                tkn_strt, tkn_end = answr_idx[0], answr_idx[0]
        except TypeError:
            # find_idx returned None: the answer tokens were not found in the context
            tkn_strt, tkn_end = -1, -1
        # Return the starting and ending token indices of the answer
        return tkn_strt, tkn_end
    except TypeError:
        # The answer could not be tokenized (e.g., missing/NaN answer)
        return -1, -1
tmp = df_qa.apply(lambda x: file_add(x), axis=1)
df_qa['start_positions'], df_qa['end_positions'] = [i[0] for i in tmp], [i[1] for i in tmp]
df_qa = df_qa[['question', 'context', 'start_positions', 'end_positions', 'answer']]
df_qa[:4]
Token indices sequence length is longer than the specified maximum sequence length for this model (518 > 512). Running this sequence through the model will result in indexing errors
question | context | start_positions | end_positions | answer | |
---|---|---|---|---|---|
0 | When did Beyonce start becoming popular? | Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, record producer and actress. Born and raised in Houston, Texas, she performed in various singing and dancing competitions as a child, and rose to fame in the late 1990s as lead singer of R&B girl-group Destiny's Child. Managed by her father, Mathew Knowles, the group became one of the world's best-selling girl groups of all time. Their hiatus saw the release of Beyoncé's debut album, Dangerously in Love (2003), which established her as a solo artist worldwide, earned five Grammy Awards and featured the Billboard Hot 100 number-one singles "Crazy in Love" and "Baby Boy". | 75 | 78 | in the late 1990s |
1 | What areas did Beyonce compete in when she was growing up? | Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, record producer and actress. Born and raised in Houston, Texas, she performed in various singing and dancing competitions as a child, and rose to fame in the late 1990s as lead singer of R&B girl-group Destiny's Child. Managed by her father, Mathew Knowles, the group became one of the world's best-selling girl groups of all time. Their hiatus saw the release of Beyoncé's debut album, Dangerously in Love (2003), which established her as a solo artist worldwide, earned five Grammy Awards and featured the Billboard Hot 100 number-one singles "Crazy in Love" and "Baby Boy". | 68 | 70 | singing and dancing |
2 | When did Beyonce leave Destiny's Child and become a solo singer? | Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, record producer and actress. Born and raised in Houston, Texas, she performed in various singing and dancing competitions as a child, and rose to fame in the late 1990s as lead singer of R&B girl-group Destiny's Child. Managed by her father, Mathew Knowles, the group became one of the world's best-selling girl groups of all time. Their hiatus saw the release of Beyoncé's debut album, Dangerously in Love (2003), which established her as a solo artist worldwide, earned five Grammy Awards and featured the Billboard Hot 100 number-one singles "Crazy in Love" and "Baby Boy". | 143 | 143 | 2003 |
3 | In what city and state did Beyonce grow up? | Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, record producer and actress. Born and raised in Houston, Texas, she performed in various singing and dancing competitions as a child, and rose to fame in the late 1990s as lead singer of R&B girl-group Destiny's Child. Managed by her father, Mathew Knowles, the group became one of the world's best-selling girl groups of all time. Their hiatus saw the release of Beyoncé's debut album, Dangerously in Love (2003), which established her as a solo artist worldwide, earned five Grammy Awards and featured the Billboard Hot 100 number-one singles "Crazy in Love" and "Baby Boy". | 58 | 60 | Houston, Texas |
df_qa.iloc[0]['context']
'Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, record producer and actress. Born and raised in Houston, Texas, she performed in various singing and dancing competitions as a child, and rose to fame in the late 1990s as lead singer of R&B girl-group Destiny\'s Child. Managed by her father, Mathew Knowles, the group became one of the world\'s best-selling girl groups of all time. Their hiatus saw the release of Beyoncé\'s debut album, Dangerously in Love (2003), which established her as a solo artist worldwide, earned five Grammy Awards and featured the Billboard Hot 100 number-one singles "Crazy in Love" and "Baby Boy".'
# tokens at indices 75 to 78 of the encoded question + context correspond to the answer
bert_tokenizer.decode(bert_tokenizer.encode(df_qa.iloc[0].question, df_qa.iloc[0].context)[75:79])
'in the late 1990s'
Dataset.from_pandas
is a method provided by the Hugging Face datasets
library. It creates a Dataset
object (Arrow-backed) from a pandas DataFrame.
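A tiny round-trip example (hypothetical toy DataFrame, using the pandas and Dataset imports above):

```python
# Toy pandas DataFrame -> datasets.Dataset and back
toy_df = pd.DataFrame({'question': ['Who?', 'Where?'], 'answer': ['Beyoncé', 'Houston']})
toy_ds = Dataset.from_pandas(toy_df)
print(toy_ds)              # Dataset({features: ['question', 'answer'], num_rows: 2})
print(toy_ds.to_pandas())  # back to a pandas DataFrame
```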
Train & Test Split¶
We only use 8,000 examples because the fine-tuning process is computationally expensive.
qa_dataset = Dataset.from_pandas(df_qa.sample(8000, random_state=32))
# Dataset has a built in train test split method
qa_dataset = qa_dataset.train_test_split(test_size=0.2)
qa_dataset
DatasetDict({ train: Dataset({ features: ['question', 'context', 'start_positions', 'end_positions', 'answer', '__index_level_0__'], num_rows: 6400 }) test: Dataset({ features: ['question', 'context', 'start_positions', 'end_positions', 'answer', '__index_level_0__'], num_rows: 1600 }) })
# preprocessing with truncation to handle longer texts
def preprocess(data):
    # anything past the 512-token window is truncated
    return bert_tokenizer(data['question'], data['context'], truncation=True)
qa_dataset = qa_dataset.map(preprocess, batched=True)
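Truncation silently drops tokens past BERT's 512-token limit, which can cut off the answer for very long contexts. A rough check (a sketch, assuming the tokenized qa_dataset above) of how many training examples hit that limit:

```python
# Count tokenized training examples that reached the 512-token cap (i.e., were truncated)
n_truncated = sum(len(ids) >= 512 for ids in qa_dataset['train']['input_ids'])
print(f"Truncated training examples: {n_truncated} / {qa_dataset['train'].num_rows}")
```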
Freeze BERT's Parameters¶
# to speed up training, freeze everything up to the last four encoder layers in BERT
for name, param in qa_bert.bert.named_parameters():
    if 'encoder.layer.20' in name:   # BERT-large has 24 encoder layers; layers 20-23 stay trainable
        break
    param.requires_grad = False      # freeze this parameter
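A quick sanity check (sketch) that the freeze did what we expect: only the last encoder layers plus the qa_outputs head should remain trainable. The count should roughly match the "Number of trainable parameters" line printed by the Trainer below.

```python
# Count trainable vs. total parameters after freezing
n_trainable = sum(p.numel() for p in qa_bert.parameters() if p.requires_grad)
n_total = sum(p.numel() for p in qa_bert.parameters())
print(f"Trainable parameters: {n_trainable:,} of {n_total:,}")
```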
# Dynamic padding to speed up training
data_collator = DataCollatorWithPadding(tokenizer=bert_tokenizer)
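To see what dynamic padding means in practice, here is a small sketch (using the bert_tokenizer and data_collator defined above): the collator pads each batch only up to the longest sequence in that batch, rather than always to 512.

```python
# Two encodings of different lengths get padded to the longer one within the batch
toy_batch = [bert_tokenizer("a short question"),
             bert_tokenizer("a noticeably longer question with quite a few more tokens")]
padded = data_collator(toy_batch)
print(padded['input_ids'].shape)  # (2, length of the longest sequence in this batch)
```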
Fine-tune¶
batch_size = 5
epochs = 2
training_args = TrainingArguments(
output_dir='./qsn_anw/results',
num_train_epochs=epochs,
per_device_train_batch_size=batch_size,
per_device_eval_batch_size=batch_size,
logging_dir='./qsn_anw/logs',
save_strategy='epoch',
logging_steps=10,
evaluation_strategy='epoch',
load_best_model_at_end=True
)
trainer = Trainer(
model=qa_bert, # pretrained BERT
args=training_args,
train_dataset=qa_dataset['train'],
eval_dataset=qa_dataset['test'],
data_collator=data_collator
)
# Get initial metrics
trainer.evaluate()
The following columns in the evaluation set don't have a corresponding argument in `BertForQuestionAnswering.forward` and have been ignored: question, context, __index_level_0__, answer. If question, context, __index_level_0__, answer are not expected by `BertForQuestionAnswering.forward`, you can safely ignore this message. ***** Running Evaluation ***** Num examples = 1600 Batch size = 5 You're using a BertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
{'eval_loss': 5.589837551116943, 'eval_runtime': 1452.9636, 'eval_samples_per_second': 1.101, 'eval_steps_per_second': 0.22}
# The question-answering model is very large, so training takes a long time
trainer.train()
The following columns in the training set don't have a corresponding argument in `BertForQuestionAnswering.forward` and have been ignored: question, context, __index_level_0__, answer. If question, context, __index_level_0__, answer are not expected by `BertForQuestionAnswering.forward`, you can safely ignore this message. ***** Running training ***** Num examples = 6400 Num Epochs = 2 Instantaneous batch size per device = 5 Total train batch size (w. parallel, distributed & accumulation) = 5 Gradient Accumulation steps = 1 Total optimization steps = 2560 Number of trainable parameters = 50386946
Epoch | Training Loss | Validation Loss |
---|---|---|
1 | 2.126700 | 2.168481 |
2 | 1.494200 | 2.071796 |
The following columns in the evaluation set don't have a corresponding argument in `BertForQuestionAnswering.forward` and have been ignored: question, context, __index_level_0__, answer. If question, context, __index_level_0__, answer are not expected by `BertForQuestionAnswering.forward`, you can safely ignore this message. ***** Running Evaluation ***** Num examples = 1600 Batch size = 5 Saving model checkpoint to ./qsn_anw/results\checkpoint-1280 Configuration saved in ./qsn_anw/results\checkpoint-1280\config.json Model weights saved in ./qsn_anw/results\checkpoint-1280\pytorch_model.bin The following columns in the evaluation set don't have a corresponding argument in `BertForQuestionAnswering.forward` and have been ignored: question, context, __index_level_0__, answer. If question, context, __index_level_0__, answer are not expected by `BertForQuestionAnswering.forward`, you can safely ignore this message. ***** Running Evaluation ***** Num examples = 1600 Batch size = 5 Saving model checkpoint to ./qsn_anw/results\checkpoint-2560 Configuration saved in ./qsn_anw/results\checkpoint-2560\config.json Model weights saved in ./qsn_anw/results\checkpoint-2560\pytorch_model.bin Training completed. Do not forget to share your model on huggingface.co/models =) Loading best model from ./qsn_anw/results\checkpoint-2560 (score: 2.071795701980591).
TrainOutput(global_step=2560, training_loss=2.3458202928304672, metrics={'train_runtime': 30721.6795, 'train_samples_per_second': 0.417, 'train_steps_per_second': 0.083, 'total_flos': 5815410818714640.0, 'train_loss': 2.3458202928304672, 'epoch': 2.0})
trainer.save_model()
Saving model checkpoint to ./qsn_anw/results Configuration saved in ./qsn_anw/results\config.json Model weights saved in ./qsn_anw/results\pytorch_model.bin
pipe = pipeline("question-answering", './qsn_anw/results', tokenizer=bert_tokenizer)
loading configuration file ./qsn_anw/results\config.json Model config BertConfig { "_name_or_path": "./qsn_anw/results", "architectures": [ "BertForQuestionAnswering" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "gradient_checkpointing": false, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 1024, "initializer_range": 0.02, "intermediate_size": 4096, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 16, "num_hidden_layers": 24, "pad_token_id": 0, "position_embedding_type": "absolute", "torch_dtype": "float32", "transformers_version": "4.26.1", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 } loading configuration file ./qsn_anw/results\config.json Model config BertConfig { "_name_or_path": "./qsn_anw/results", "architectures": [ "BertForQuestionAnswering" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "gradient_checkpointing": false, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 1024, "initializer_range": 0.02, "intermediate_size": 4096, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 16, "num_hidden_layers": 24, "pad_token_id": 0, "position_embedding_type": "absolute", "torch_dtype": "float32", "transformers_version": "4.26.1", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 } loading weights file ./qsn_anw/results\pytorch_model.bin All model checkpoint weights were used when initializing BertForQuestionAnswering. All the weights of BertForQuestionAnswering were initialized from the model checkpoint at ./qsn_anw/results. If your task is similar to the task the model of the checkpoint was trained on, you can already use BertForQuestionAnswering for predictions without further training.
Test Fine-tuned Model¶
txt = """The brain is an organ that serves as the center of the nervous system in
all vertebrate and most invertebrate animals. Only a few invertebrates such as sponges,
jellyfish, adult sea squirts and starfish do not have a brain; diffuse or localised
nerve nets are present instead. The brain is located in the head, usually close to the
primary sensory organs for such senses as vision, hearing, balance, taste, and smell.
The brain is the most complex organ in a vertebrate's body. In a typical human, the
cerebral cortex (the largest part) is estimated to contain 15–33 billion neurons,
each connected by synapses to several thousand other neurons. These neurons communicate
with one another by means of long protoplasmic fibers called axons, which carry trains
of signal pulses called action potentials to distant parts of the brain or body targeting
specific recipient cells."""
pipe("How are neurons connected?", txt)
Disabling tokenizer parallelism, we're using DataLoader multithreading already
{'score': 0.12055443972349167, 'start': 716, 'end': 724, 'answer': 'synapses'}
The answer to the question above is "synapses".
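The pipeline can also return several candidate spans instead of just the best one. A sketch (note the argument is top_k in recent transformers releases; older versions used topk):

```python
# Ask for the three highest-scoring candidate spans for the same question and context
for candidate in pipe("How are neurons connected?", txt, top_k=3):
    print(candidate['score'], candidate['answer'])
```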
We can also google someone and run the model on the search result, as below:
PERSON = 'Mehdi Rezvandehy'
# Note this is NOT an efficient way to search on Google; it is done simply for educational purposes
google_html = BeautifulSoup(requests.get(f'https://www.google.com/search?q={PERSON}').text).get_text()[:512]
pipe(f'Who is {PERSON}?', google_html)
{'score': 0.18983420729637146, 'start': 278, 'end': 281, 'answer': 'PhD'}
Use Huggingface Fine-tuned QA Model¶
# From Huggingface: https://huggingface.co/bert-large-uncased-whole-word-masking-finetuned-squad
squad_pipe = pipeline("question-answering", "bert-large-uncased-whole-word-masking-finetuned-squad")
loading configuration file config.json from cache at C:\Users\mrezv/.cache\huggingface\hub\models--bert-large-uncased-whole-word-masking-finetuned-squad\snapshots\cca7eb4efca266eff710a8c7154ecbc382b78e77\config.json Model config BertConfig { "_name_or_path": "bert-large-uncased-whole-word-masking-finetuned-squad", "architectures": [ "BertForQuestionAnswering" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 1024, "initializer_range": 0.02, "intermediate_size": 4096, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 16, "num_hidden_layers": 24, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.26.1", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 } loading configuration file config.json from cache at C:\Users\mrezv/.cache\huggingface\hub\models--bert-large-uncased-whole-word-masking-finetuned-squad\snapshots\cca7eb4efca266eff710a8c7154ecbc382b78e77\config.json Model config BertConfig { "_name_or_path": "bert-large-uncased-whole-word-masking-finetuned-squad", "architectures": [ "BertForQuestionAnswering" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 1024, "initializer_range": 0.02, "intermediate_size": 4096, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 16, "num_hidden_layers": 24, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.26.1", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }
loading weights file pytorch_model.bin from cache at C:\Users\mrezv/.cache\huggingface\hub\models--bert-large-uncased-whole-word-masking-finetuned-squad\snapshots\cca7eb4efca266eff710a8c7154ecbc382b78e77\pytorch_model.bin All model checkpoint weights were used when initializing BertForQuestionAnswering. All the weights of BertForQuestionAnswering were initialized from the model checkpoint at bert-large-uncased-whole-word-masking-finetuned-squad. If your task is similar to the task the model of the checkpoint was trained on, you can already use BertForQuestionAnswering for predictions without further training. loading configuration file config.json from cache at C:\Users\mrezv/.cache\huggingface\hub\models--bert-large-uncased-whole-word-masking-finetuned-squad\snapshots\cca7eb4efca266eff710a8c7154ecbc382b78e77\config.json Model config BertConfig { "_name_or_path": "bert-large-uncased-whole-word-masking-finetuned-squad", "architectures": [ "BertForQuestionAnswering" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 1024, "initializer_range": 0.02, "intermediate_size": 4096, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 16, "num_hidden_layers": 24, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.26.1", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 } loading file vocab.txt from cache at C:\Users\mrezv/.cache\huggingface\hub\models--bert-large-uncased-whole-word-masking-finetuned-squad\snapshots\cca7eb4efca266eff710a8c7154ecbc382b78e77\vocab.txt loading file tokenizer.json from cache at C:\Users\mrezv/.cache\huggingface\hub\models--bert-large-uncased-whole-word-masking-finetuned-squad\snapshots\cca7eb4efca266eff710a8c7154ecbc382b78e77\tokenizer.json loading file added_tokens.json from cache at None loading file special_tokens_map.json from cache at None loading file tokenizer_config.json from cache at C:\Users\mrezv/.cache\huggingface\hub\models--bert-large-uncased-whole-word-masking-finetuned-squad\snapshots\cca7eb4efca266eff710a8c7154ecbc382b78e77\tokenizer_config.json loading configuration file config.json from cache at C:\Users\mrezv/.cache\huggingface\hub\models--bert-large-uncased-whole-word-masking-finetuned-squad\snapshots\cca7eb4efca266eff710a8c7154ecbc382b78e77\config.json Model config BertConfig { "_name_or_path": "bert-large-uncased-whole-word-masking-finetuned-squad", "architectures": [ "BertForQuestionAnswering" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 1024, "initializer_range": 0.02, "intermediate_size": 4096, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 16, "num_hidden_layers": 24, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.26.1", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }
squad_pipe("Where is Mehdi living these days?", "Mehdi lives in Calgary but Hamid lives in Edmonton.")
{'score': 0.9923588633537292, 'start': 15, 'end': 22, 'answer': 'Calgary'}
Now the model finds the correct answer with a confidence score of about 99%.
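Because SQuAD 2.0 also contains unanswerable questions, the question-answering pipeline exposes a handle_impossible_answer flag that lets the model return an empty answer when the context does not contain one. A hedged sketch (how well the model abstains depends on whether it was trained with unanswerable examples):

```python
# Allow the model to abstain (return an empty string) when the answer is not in the context
squad_pipe("What is Mehdi's favourite food?",
           "Mehdi lives in Calgary but Hamid lives in Edmonton.",
           handle_impossible_answer=True)
```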
ir = 4000
print (f'question is: \n{df_qa.question.iloc[ir]}\n\n')
print (f'context is: \n{df_qa.context.iloc[ir]}\n\n')
print (f'real answer is: \n{df_qa.answer.iloc[ir]}\n\n')
print (f'Predict it with fine-tuned model \n{squad_pipe(df_qa.question.iloc[ir], df_qa.context.iloc[ir])}\n\n')
question is: What two factors did Lee demonstrate intensified prejudice? context is: Scholars argue that Lee's approach to class and race was more complex "than ascribing racial prejudice primarily to 'poor white trash' ... Lee demonstrates how issues of gender and class intensify prejudice, silence the voices that might challenge the existing order, and greatly complicate many Americans' conception of the causes of racism and segregation." Lee's use of the middle-class narrative voice is a literary device that allows an intimacy with the reader, regardless of class or cultural background, and fosters a sense of nostalgia. Sharing Scout and Jem's perspective, the reader is allowed to engage in relationships with the conservative antebellum Mrs. Dubose; the lower-class Ewells, and the Cunninghams who are equally poor but behave in vastly different ways; the wealthy but ostracized Mr. Dolphus Raymond; and Calpurnia and other members of the black community. The children internalize Atticus' admonition not to judge someone until they have walked around in that person's skin, gaining a greater understanding of people's motives and behavior. real answer is: gender and class Predict it with fine-tuned model {'score': 0.8647028207778931, 'start': 170, 'end': 186, 'answer': 'gender and class'}
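To go beyond spot checks, here is a rough exact-match evaluation over a small random sample of the data set (a sketch only; it runs 50 pipeline calls, so it is slow, and the score will vary with the sample):

```python
# Rough exact-match score of the Hugging Face SQuAD model on a small random sample
sample = df_qa.sample(50, random_state=1)
preds = [squad_pipe(q, c)['answer'] for q, c in zip(sample.question, sample.context)]
em = sum(p.strip().lower() == str(a).strip().lower()
         for p, a in zip(preds, sample.answer)) / len(sample)
print(f"Exact match on the sample: {em:.2%}")
```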
txt = """The brain is an organ that serves as the center of the nervous system in
all vertebrate and most invertebrate animals. Only a few invertebrates such as sponges,
jellyfish, adult sea squirts and starfish do not have a brain; diffuse or localised
nerve nets are present instead. The brain is located in the head, usually close to the
primary sensory organs for such senses as vision, hearing, balance, taste, and smell.
The brain is the most complex organ in a vertebrate's body. In a typical human, the
cerebral cortex (the largest part) is estimated to contain 15–33 billion neurons,
each connected by synapses to several thousand other neurons. These neurons communicate
with one another by means of long protoplasmic fibers called axons, which carry trains
of signal pulses called action potentials to distant parts of the brain or body targeting
specific recipient cells."""
pipe("How are neurons connected?", txt)
{'score': 0.12055443972349167, 'start': 716, 'end': 724, 'answer': 'synapses'}
ir = 4000
print (f'question is: \nHow are neurons connected\n\n')
print (f'context is: \n{txt}\n\n')
print (f'real answer is: \nsynapses\n\n')
print (f'Predict it with fine-tuned model \n{squad_pipe("How are neurons connected?", txt)}\n\n')
question is: How are neurons connected context is: The brain is an organ that serves as the center of the nervous system in all vertebrate and most invertebrate animals. Only a few invertebrates such as sponges, jellyfish, adult sea squirts and starfish do not have a brain; diffuse or localised nerve nets are present instead. The brain is located in the head, usually close to the primary sensory organs for such senses as vision, hearing, balance, taste, and smell. The brain is the most complex organ in a vertebrate's body. In a typical human, the cerebral cortex (the largest part) is estimated to contain 15–33 billion neurons, each connected by synapses to several thousand other neurons. These neurons communicate with one another by means of long protoplasmic fibers called axons, which carry trains of signal pulses called action potentials to distant parts of the brain or body targeting specific recipient cells. real answer is: synapses Predict it with fine-tuned model {'score': 0.3291873335838318, 'start': 713, 'end': 724, 'answer': 'by synapses'}