BeautifulSoup, a powerful Python library for web scraping, simplifies the process of parsing HTML and XML documents. One common task is to find an HTML tag that contains specific text. In this article, we'll explore how to achieve this using BeautifulSoup, providing a step-by-step guide.
Required Python Package
pip install beautifulsoup4
We'll show how to pull specific pieces of text out of a web page, working through each step with BeautifulSoup on a sample HTML page.
Below is the HTML file that we have used to find an HTML tag that contains certain text using BeautifulSoup.
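The original listing of gfg.html is not reproduced here, so the following is only a hypothetical reconstruction: the tags are inferred from the outputs shown later in this article, and the surrounding structure (the ul and table wrappers, the tag order) is an assumption. The sketch writes that stand-in page to disk with Python:

```python
from pathlib import Path

# Hypothetical stand-in for gfg.html. The individual tags are inferred from
# the outputs in this article; the layout around them is an assumption.
SAMPLE_HTML = """<!DOCTYPE html>
<html>
<body>
<h1>Python Program</h1>
<a href="https://www.geeksforgeeks.org/">Geeks For Geeks</a>
<a href="Dummy Check Text">Geeks For Geeks</a>
<a href="Dummywebsite.com">Dummy Text</a>
<span class="true">Geeks For Geeks</span>
<span class="false">Geeks For Geeks</span>
<ul>
<li class="1">Python Program</li>
</ul>
<table>
<tr>GFG Website</tr>
</table>
</body>
</html>
"""

# Write the sample page to the current directory as gfg.html.
Path("gfg.html").write_text(SAMPLE_HTML, encoding="utf-8")
```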
In this example, we use BeautifulSoup to parse the content of an HTML file named gfg.html. Specifically, we search for an anchor tag (<a>) in this file that contains the text "Geeks For Geeks". Once the tag is found, it is printed to the console.
Methods Used
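A minimal sketch of the anchor-tag search described above. The markup is inlined as an assumed excerpt of gfg.html so the snippet runs on its own, and the variable names are illustrative:

```python
from bs4 import BeautifulSoup

# Assumed excerpt standing in for the contents of gfg.html.
html = """
<h1>Python Program</h1>
<a href="https://www.geeksforgeeks.org/">Geeks For Geeks</a>
<a href="Dummywebsite.com">Dummy Text</a>
"""

soup = BeautifulSoup(html, "html.parser")

# Find the first <a> tag whose text is exactly "Geeks For Geeks".
tag = soup.find("a", string="Geeks For Geeks")
print(tag)
```

The string argument was introduced in Beautiful Soup 4.4.0; older code passes the same value as text.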
Output:
<a href="https://www.geeksforgeeks.org/">Geeks For Geeks</a>
In this example, we are utilizing BeautifulSoup's find method to search for any HTML tag within the gfg.html content that contains the text "Geeks For Geeks". Once the tag is located, it is printed to the console.
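One way to do this without naming a tag, sketched under the assumption that the markup below resembles gfg.html: find(string=...) returns the matching text node rather than a tag, so the enclosing tag is taken from its parent attribute.

```python
from bs4 import BeautifulSoup

# Assumed excerpt standing in for the contents of gfg.html.
html = """
<h1>Python Program</h1>
<a href="https://www.geeksforgeeks.org/">Geeks For Geeks</a>
<span class="true">Geeks For Geeks</span>
"""

soup = BeautifulSoup(html, "html.parser")

# Without a tag name, find() returns the first matching text node.
text_node = soup.find(string="Geeks For Geeks")

# The tag that directly contains that text is the node's parent.
tag = text_node.parent
print(tag)
```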
Output:
<a href="https://www.geeksforgeeks.org/">Geeks For Geeks</a>
In this example, we are using BeautifulSoup's find_all method to locate the first three anchor tags (<a>) within the gfg.html content. The limit=3 parameter ensures that only the first three tags are retrieved. Subsequently, each of these tags is printed to the console.
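A short sketch of the limit parameter, using inlined markup as an assumed stand-in for gfg.html. A fourth, hypothetical anchor is added so the effect of limit=3 is visible:

```python
from bs4 import BeautifulSoup

# Assumed excerpt standing in for gfg.html. The last link is hypothetical,
# added only to show that limit=3 stops before it.
html = """
<a href="https://www.geeksforgeeks.org/">Geeks For Geeks</a>
<a href="Dummy Check Text">Geeks For Geeks</a>
<a href="Dummywebsite.com">Dummy Text</a>
<a href="extra.example">Extra Link</a>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all stops searching once it has collected three <a> tags.
tags = soup.find_all("a", limit=3)
for tag in tags:
    print(tag)
```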
Output:
<a href="https://www.geeksforgeeks.org/">Geeks For Geeks</a>
<a href="Dummy Check Text">Geeks For Geeks</a>
<a href="Dummywebsite.com">Dummy Text</a>
In this example, BeautifulSoup searches gfg.html for given text in several kinds of tags (<a>, <span>, <h1>, <li>, and <tr>), and the list of matching tags for each search is printed to the console.
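The searches might look like the sketch below. The markup is an assumed reconstruction of gfg.html, and the choice of exact strings for some tags and regular expressions for others is illustrative, not taken from the original code:

```python
import re

from bs4 import BeautifulSoup

# Assumed reconstruction of gfg.html, inferred from the outputs in this article.
html = """
<h1>Python Program</h1>
<a href="https://www.geeksforgeeks.org/">Geeks For Geeks</a>
<a href="Dummy Check Text">Geeks For Geeks</a>
<a href="Dummywebsite.com">Dummy Text</a>
<span class="true">Geeks For Geeks</span>
<span class="false">Geeks For Geeks</span>
<ul><li class="1">Python Program</li></ul>
<table><tr>GFG Website</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# An exact string matches tags whose whole text equals it.
anchors = soup.find_all("a", string="Geeks For Geeks")
spans = soup.find_all("span", string="Geeks For Geeks")

# A compiled regular expression matches tags whose text contains the pattern.
headings = soup.find_all("h1", string=re.compile("Python"))
items = soup.find_all("li", string=re.compile("Python"))
rows = soup.find_all("tr", string=re.compile("GFG"))

for found in (anchors, spans, headings, items, rows):
    print(found)
```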
Output:
[<a href="https://www.geeksforgeeks.org/">Geeks For Geeks</a>, <a href="Dummy Check Text">Geeks For Geeks</a>]
[<span class="true">Geeks For Geeks</span>, <span class="false">Geeks For Geeks</span>]
[<h1>Python Program</h1>]
[<li class="1">Python Program</li>]
[<tr>GFG Website</tr>]