Text summarization with models from Hugging Face lets developers automatically generate concise summaries of long pieces of text. Pretrained transformer models make it straightforward to build applications that extract key information and present it in a shorter, meaningful form. There are two main approaches:
Extractive: Selects important sentences directly from the text.
Abstractive: Generates new sentences that capture the same meaning.
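To make the extractive idea concrete, here is a minimal sketch that scores each sentence by how frequent its words are in the whole text and returns the top-scoring sentences unchanged. The function name `extractive_summary` and the scoring rule are illustrative assumptions, not part of any library. This naive score also favors long sentences, so treat it as a toy rather than a practical summarizer.

```python
import re
from collections import Counter


def extractive_summary(text: str, num_sentences: int = 1) -> str:
    """Return the highest-scoring sentences, kept in their original order."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the whole text (lowercased).
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    # Score a sentence by summing the frequencies of its words.
    scores = [
        sum(freq[w] for w in re.findall(r"[a-z']+", s.lower()))
        for s in sentences
    ]
    # Pick the best-scoring sentence indices, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)


text = (
    "AI helps companies automate work. "
    "AI also helps companies analyze data and helps teams decide faster. "
    "Some risks remain."
)
print(extractive_summary(text, num_sentences=1))
```

Every sentence in the result is copied verbatim from the input. An abstractive model such as T5, used in the rest of this article, may instead produce wording that never appears in the source.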
Implementation of Text Summarization
Step 1: Set Up the Environment
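First, install the required libraries. The command below installs the Transformers library together with PyTorch, which runs the model. If T5Tokenizer later reports that the sentencepiece package is missing, install it the same way with pip install sentencepiece.
Run the following command in your terminal or command prompt: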
pip install transformers torch
Step 2: Import Required Classes
T5Tokenizer: Converts text into numerical tokens that the model can process
T5ForConditionalGeneration: Generates new text based on the input
T5 is a text-to-text model that is told which task to perform through a prefix on the input, so prepending "summarize: " tells the model to produce a summary. Without this prefix, it won't clearly know that it needs to generate a summary.
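Adding the prefix is plain string handling, so it can be sketched without loading any model. The helper name `build_t5_input` and the whitespace cleanup are assumptions made for illustration. Only the "summarize: " prefix itself comes from T5.

```python
def build_t5_input(text: str, task_prefix: str = "summarize: ") -> str:
    """Prepend the T5 task prefix and collapse newlines and repeated spaces."""
    cleaned = " ".join(text.split())
    return task_prefix + cleaned


article = """AI systems are automating processes
and analyzing data."""

input_text = build_t5_input(article)
print(input_text)
```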
Step 5: Tokenize the Input
Tokenization converts the text into numeric IDs that the model can understand. Because the model processes numbers rather than raw words, this step transforms the text into a format suitable for computation.
Step 6: Generate the Summary
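max_length: Sets the maximum length of the summary, measured in tokens.
min_length: Sets the minimum length in tokens, which avoids very short summaries.
num_beams=4: Uses beam search with four candidate sequences to improve output quality.
length_penalty: Controls how sequence length affects beam scores. Values above 0 favor longer outputs and values below 0 favor shorter ones.
early_stopping=True: Stops beam search as soon as num_beams complete candidate sequences have been found.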
Step 7: Decoding the Output
The model generates numeric token IDs, and decoding converts them back into readable text so you can see the final summary.
Output:
Summary: AI systems are automating processes, analyzing data, and improving decision-making. organizations investing heavily in AI research, though ethical and privacy challenges remain.