The Transformer was introduced in the paper "Attention Is All You Need". It is an encoder-decoder architecture, which means that the input is processed (encoded) by one stack, and the result is then used (decoded) by another stack to generate the output.

There are variations on the Transformer architecture: some use only the encoder stack, as in BERT (Bidirectional Encoder Representations from Transformers), and some use only the decoder stack, as in GPT (Generative Pre-trained Transformer). T5 (Text-to-Text Transfer Transformer), created by Google, uses both the encoder and the decoder stack.

The Hugging Face Transformers library provides a pool of pre-trained models for tasks across vision, text, and audio. It provides APIs to download and experiment with these pre-trained models, and we can even fine-tune them on our own datasets.

Why Use the Transformers Library?

Easy-to-use state-of-the-art models: High performance on Natural Language Understanding (NLU) and Generation (NLG), computer vision, and audio tasks.
Lower compute costs, smaller carbon footprint: Researchers can share trained models instead of retraining.
Choose the right framework for every part of a model’s lifetime: Train state-of-the-art models in 3 lines of code, and pick the appropriate framework for training, evaluation, and production.
Easily customize a model or an example to our needs: The library provides examples for each architecture to reproduce the results published by its original authors.

Transformers Pipeline

Pipelines are an abstraction over the complex code behind the Transformers library, and they are the easiest way to use pre-trained models for inference. The library provides easy-to-use pipeline functions for a variety of tasks, including but not limited to Named Entity Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction, and Question Answering.

In a typical machine learning or deep learning experiment, we need to preprocess the data, train the model, and write an inference script. With pipeline functions, by contrast, we only need to import the pipeline and pass it our raw data. The pipeline preprocesses the data behind the scenes, including tokenization, padding, and every other step needed to build the model’s input, and returns the output with a single call.
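To make the "tokenization and padding" step more concrete, here is a minimal sketch of what that preprocessing roughly amounts to. It is not the library's code: the whitespace tokenizer, the tiny `VOCAB` dictionary, and the helper names `encode` and `pad_batch` are hypothetical stand-ins, whereas a real pipeline uses the subword tokenizer that ships with the chosen model.

```python
# Toy illustration of the preprocessing a pipeline hides from us:
# tokenize each sentence, map tokens to ids, and pad the batch.
# VOCAB and the whitespace tokenizer are hypothetical stand-ins.
PAD_ID = 0
UNK_ID = 1
VOCAB = {"[PAD]": PAD_ID, "[UNK]": UNK_ID,
         "this": 2, "movie": 3, "is": 4, "good": 5, "bad": 6}


def encode(sentence):
    """Lowercase, split on whitespace, and map each token to an id."""
    return [VOCAB.get(token, UNK_ID) for token in sentence.lower().split()]


def pad_batch(sentences):
    """Encode a batch and right-pad every sequence to the longest one."""
    encoded = [encode(s) for s in sentences]
    max_len = max(len(ids) for ids in encoded)
    input_ids = [ids + [PAD_ID] * (max_len - len(ids)) for ids in encoded]
    # 1 marks a real token, 0 marks padding the model should ignore
    attention_mask = [[1] * len(ids) + [0] * (max_len - len(ids))
                      for ids in encoded]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = pad_batch(["This movie is good", "Bad movie"])
print(batch["input_ids"])       # [[2, 3, 4, 5], [6, 3, 0, 0]]
print(batch["attention_mask"])  # [[1, 1, 1, 1], [1, 1, 0, 0]]
```

The attention mask is what lets sentences of different lengths share one batch: the padded positions are present in the input but flagged so that the model does not attend to them.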

We need to install the Transformers library to use these fantastic pipeline functions. Open a Jupyter notebook, either locally or in Google Colab (preferred).

Install the library using pip

!pip install transformers

Now, let’s unwrap the magic box and see how it surprises us.

First, import pipeline from the transformers library:

from transformers import pipeline

Let’s begin with Sentiment Analysis.

Sentiment Analysis

Sentiment analysis predicts the sentiment of a text, that is, whether the text is positive or negative. To perform sentiment analysis, we initialize the pipeline with the ‘sentiment-analysis’ task as follows.

sentimentAnalysis_pipeline = pipeline("sentiment-analysis")

test_sentence = "This is a really good movie. I loved it and will watch it again"

print(sentimentAnalysis_pipeline(test_sentence))

We can even pass a list of sentences, and the pipeline will return a prediction for each example in the list.

test_sentence1 = "This is a really good movie. I loved it and will watch it again"

test_sentence2 = "Worst movie i ever saw"

print(sentimentAnalysis_pipeline([test_sentence1, test_sentence2]))

The first time it runs, the pipeline downloads the underlying model. We can choose which model to use with the model parameter; by default, the sentiment-analysis pipeline uses the distilbert-base-uncased-finetuned-sst-2-english model.
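As a rough sketch of how the model parameter might be used, the snippet below pins the sentiment pipeline to an explicit checkpoint and pairs each input sentence with its prediction. The names `DEFAULT_SENTIMENT_MODEL`, `build_classifier`, and `label_sentences` are illustrative helpers, not part of the library, and the default checkpoint is an assumption that may change between releases.

```python
# Assumed default checkpoint for the sentiment-analysis task;
# it may differ in other versions of the library.
DEFAULT_SENTIMENT_MODEL = "distilbert-base-uncased-finetuned-sst-2-english"


def build_classifier(model_name=DEFAULT_SENTIMENT_MODEL):
    """Create a sentiment pipeline pinned to an explicit checkpoint."""
    # Imported lazily so the helper below works without the library loaded.
    from transformers import pipeline
    return pipeline("sentiment-analysis", model=model_name)


def label_sentences(classifier, sentences):
    """Pair each input sentence with the label and score predicted for it."""
    # The pipeline returns one {'label': ..., 'score': ...} dict per sentence.
    predictions = classifier(sentences)
    return [(sentence, pred["label"], pred["score"])
            for sentence, pred in zip(sentences, predictions)]
```

For example, `label_sentences(build_classifier(), [test_sentence1, test_sentence2])` would return one `(sentence, label, score)` tuple per input. Pinning the model name also keeps results reproducible if the task's default checkpoint changes.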

See how easy that was? We can even train a model on custom datasets. Check out my blog to learn how to fine-tune the BERT model for sentiment analysis tasks.

Have you ever imagined being a writer or a poet? Whether or not you have, the following pipeline can help bring out that side of you.

Let’s build a text generation pipeline.

Text Generation

Given a few words or a sentence as a prompt, the model generates the next N tokens.

We need to initialize the Pipeline with the ‘text-generation’ task.

text_gen_pipeline = pipeline('text-generation', model='gpt2')
prompt = 'Before we proceed any further, hear me speak'
text_gen_pipeline(prompt, max_length=60)

By default, the pipeline returns a single output of at most max_length tokens, with the prompt counted in that length. However, we can set the num_return_sequences parameter to return as many sequences as we want.
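A minimal sketch of requesting several continuations at once is shown below. The helper name `generate_variants` is hypothetical; it takes any text-generation pipeline object as `generator`. Passing `do_sample=True` is a choice made here so that the returned sequences can differ from one another; it is not something the original example sets.

```python
def generate_variants(generator, prompt, n=3, max_length=60):
    """Return n sampled continuations of the prompt as plain strings."""
    outputs = generator(
        prompt,
        max_length=max_length,    # total length in tokens, prompt included
        num_return_sequences=n,   # how many continuations to return
        do_sample=True,           # sample, so the n sequences can differ
    )
    # The pipeline returns one {'generated_text': ...} dict per sequence.
    return [item["generated_text"] for item in outputs]
```

With the pipeline built above, `generate_variants(text_gen_pipeline, prompt, n=3)` would return three strings, each starting with the prompt. Because sampling is random, the texts change from run to run.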

To learn how to build a text generation model using LSTM, check out the GitHub repository.

Now let’s build our last Pipeline for the question-answering task.

Question Answering

Given a text (the context) and a question, the model extracts the answer from the context.

For QnA, we need to initialize the Pipeline with the “question-answering” task.

context = '''
Total fees for all services paid by the 
Company and its subsidiaries, on a 
consolidated basis, to statutory auditors 
of the Company and other firms in the 
network entity of which the statutory 
auditors are a part, during the year 
ended March 31, 2021, is 59.73 crore.
During the financial year 2020-21, the 
company issued on private placement 
basis and allotted, Unsecured 
Redeemable Non-Convertible 
Debentures (NCDs) of the face value of 
10,00,000/- (Rupees Ten lakh) each, 
aggregating 24,955 crores in seven 
tranches as per the terms of issue of the 
respective tranches. Further, the third 
tranche of 500 crores was received from 
the holders of partly paid NCDs (Series 
IA). The funds raised through NCDs 
have been utilized for repayment of 
existing borrowings and other purposes 
in the ordinary course of business.
'''

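# Initialize the question-answering pipeline (uses the task's default model)
ques_ans_pipeline = pipeline("question-answering")

ans = ques_ans_pipeline(question='What is the total fees paid by the company to auditors?',
                        context=context)
print(ans)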

Excellent, the model has accurately extracted the answer to the question. It has also returned the start and end offsets, which mark where the answer appears in the context, and a confidence score that indicates how confident the model is in the extracted answer.
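The start and end values are character offsets into the context string, so slicing the context with them should give back the answer text. The sketch below illustrates this with a shortened context and a hand-built result dictionary; both are stand-ins rather than real model output, the score is left out on purpose, and `highlight_answer` is a hypothetical helper.

```python
def highlight_answer(context, result):
    """Wrap the answer span in >> << markers using the character offsets."""
    start, end = result["start"], result["end"]
    return context[:start] + ">>" + context[start:end] + "<<" + context[end:]


# Shortened stand-in for the context used above.
context = ("Total fees paid to statutory auditors during the year "
           "ended March 31, 2021, is 59.73 crore.")

# Hand-built stand-in for the pipeline's result (not real model output).
answer = "59.73 crore"
start = context.find(answer)
result = {"answer": answer, "start": start, "end": start + len(answer)}

print(context[result["start"]:result["end"]])  # 59.73 crore
print(highlight_answer(context, result))
```

Offsets like these are useful when we want to show the answer in place, for example by highlighting it inside a long report instead of printing it on its own.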

The context above is taken from the 2020-2021 Annual Report of Reliance; the link is in the reference section. This is just an example, but the same approach can be used in the financial industry to analyze long, tiring-to-read reports by simply asking the model the right questions.

End Notes

We can just as easily use other pipelines, including text summarization, named entity recognition, language translation, and many more. With this powerful Transformers functionality, we can create excellent applications while writing very little code. One advantage of using pre-trained models is that we don’t have to train models from scratch, which can take days on a large volume of data; this reduces our resource consumption and, ultimately, our running costs.