BeautifulSoup - Error Handling

Last Updated : 31 Jul, 2025

When scraping data from websites, we often face different types of errors. Some are caused by incorrect URLs, server issues or incorrect usage of scraping libraries like requests and BeautifulSoup.

In this tutorial, we’ll explore some common exceptions encountered during web scraping and how to handle them.

1. HTTPError:

An HTTPError occurs when the server responds with an HTTP error status code, such as 404 (Not Found) or 500 (Internal Server Error).

Example 1 (Valid URL):
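A minimal sketch of this case, assuming https://www.geeksforgeeks.org/ as the valid URL and a hypothetical helper named check_url (neither appears in the original text):

```python
import requests


def check_url(url):
    """Return a short status message for url (hypothetical helper)."""
    try:
        response = requests.get(url, timeout=10)
        # Raises requests.exceptions.HTTPError for 4xx and 5xx responses
        response.raise_for_status()
        return "Request successful"
    except requests.exceptions.HTTPError as err:
        return f"HTTP Error: {err}"
    except requests.exceptions.RequestException as err:
        # Covers connection failures, timeouts and other request problems
        return f"Request failed: {err}"


if __name__ == "__main__":
    # Assumed example URL; any page that answers with a 2xx status behaves the same
    print(check_url("https://www.geeksforgeeks.org/"))
```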

Output:

Request successful

Explanation:

raise_for_status() raises an HTTPError if the response status code is in the 4xx (client error) or 5xx (server error) range; for any other status code it does nothing.
Since the URL exists, the request succeeds.

Example 2 (Invalid URL triggering HTTPError):
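A minimal sketch of the failing case, using the URL that appears in the output shown for this example; the helper name fetch_status is hypothetical:

```python
import requests

# URL taken from the output shown for this example
URL = "https://www.geeksforgeeks.org/page-that-does-not-exist/"


def fetch_status(url):
    """Return a message describing how the request for url ended."""
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # a 404 response raises HTTPError here
    except requests.exceptions.HTTPError as err:
        # str(err) carries the status code, reason and URL
        return f"HTTP Error: {err}"
    except requests.exceptions.RequestException as err:
        # Any other request failure, such as no network access
        return f"Request failed: {err}"
    return "Request successful"


if __name__ == "__main__":
    print(fetch_status(URL))
```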

Output:

HTTP Error: 404 Client Error: Not Found for url: https://www.geeksforgeeks.org/page-that-does-not-exist/

Explanation: This URL does not exist, so a 404 Not Found error is raised.

2. URLError:

URLError (urllib.error.URLError) is raised by Python's standard urllib library when a URL cannot be opened, typically because the URL is invalid or there is a network connection issue.

Note: The requests module does not raise URLError. For connection failures it raises requests.exceptions.ConnectionError instead.

Example:
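A minimal sketch, using the host name that appears in the output shown for this example; the helper name try_connect is hypothetical:

```python
import requests


def try_connect(url):
    """Return a message describing whether url could be reached."""
    try:
        requests.get(url, timeout=10)
        return "Connected"
    except requests.exceptions.ConnectionError as err:
        # Raised for DNS failures, refused connections and similar problems
        return f"Connection Error: {err}"
    except requests.exceptions.Timeout as err:
        # Raised when the server does not answer within the timeout
        return f"Timeout: {err}"


if __name__ == "__main__":
    # Host name taken from the output shown for this example
    print(try_connect("https://thiswebsitedoesnotexist123456789.com"))
```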

Output:

Connection Error: HTTPSConnectionPool(host='thiswebsitedoesnotexist123456789.com', port=443): Max retries exceeded with url: / (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x783ef8211210>: Failed to resolve 'thiswebsitedoesnotexist123456789.com' ([Errno -2] Name or service not known)"))

Explanation:

If the domain name is incorrect or unreachable, ConnectionError is raised.
Always handle connection-related exceptions when scraping.

3. AttributeError (BeautifulSoup specific)

AttributeError is a built-in Python exception raised when an attribute reference or assignment fails. With BeautifulSoup it most often appears after looking up a tag that is not present in the page: the lookup itself does not raise an error but returns None, and the AttributeError occurs only when we then try to access an attribute or call a method on that None result.

Example:
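A minimal sketch that reproduces the error offline, assuming a small inline HTML string in place of a downloaded page:

```python
from bs4 import BeautifulSoup

# Assumed sample markup; it contains no NonExistingTag element
html = "<html><body><h1>Title</h1></body></html>"
soup = BeautifulSoup(html, "html.parser")

try:
    # soup.NonExistingTag evaluates to None because no matching tag exists,
    # so the chained .SomeTag lookup is attempted on None and fails
    print(soup.NonExistingTag.SomeTag)
except AttributeError as err:
    message = f"AttributeError: {err}"
    print(message)
```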

Output:

AttributeError: 'NoneType' object has no attribute 'SomeTag'

Explanation:

If NonExistingTag does not exist, soup.NonExistingTag returns None.
Trying to access SomeTag on None triggers AttributeError.

Safer way to avoid AttributeError:
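One possible approach, sketched here with a hypothetical helper named tag_text and an assumed inline HTML string, is to look the tag up with find() and check for None before using the result:

```python
from bs4 import BeautifulSoup

# Assumed sample markup for illustration
html = "<html><body><h1>Title</h1></body></html>"
soup = BeautifulSoup(html, "html.parser")


def tag_text(soup, name):
    """Return the text of the first tag called name, or None if it is absent."""
    tag = soup.find(name)
    if tag is None:
        # The tag is missing, so stop here instead of touching its attributes
        return None
    return tag.get_text(strip=True)


print(tag_text(soup, "h1"))
print(tag_text(soup, "NonExistingTag"))
```

Returning None keeps the decision with the caller, who can skip the page, log the gap or substitute a default value.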

4. XMLParserError (Parsing Errors)

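When parsing invalid or incomplete XML data with BeautifulSoup, you might face parsing errors or get None or empty results when using find() or find_all(). Note that XMLParserError is used here as a descriptive label for these problems; it is not the name of an exception class in BeautifulSoup.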

Syntax:

soup = bs4.BeautifulSoup(response, 'xml')
or
soup = bs4.BeautifulSoup(response, 'lxml-xml')

In most cases these problems do not raise an exception. When the element passed to find() or find_all() is missing from the document, find() returns None and find_all() returns an empty list [].

Example:
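A minimal sketch, assuming a small inline XML string (the catalog data is made up for illustration) and an element name, author, that is absent from it. The 'xml' parser depends on the lxml package, so the sketch also catches bs4.FeatureNotFound, which BeautifulSoup raises when the requested parser is not installed; the fallback to html.parser is only there to keep the demonstration running:

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Made-up sample data; note that it has no <author> element
xml_data = """<catalog>
  <book><title>Python Basics</title></book>
</catalog>"""

try:
    soup = BeautifulSoup(xml_data, "xml")  # needs the lxml package
except FeatureNotFound:
    # lxml is not installed; fall back to the built-in parser for this demo
    soup = BeautifulSoup(xml_data, "html.parser")

# find() returns None when the element is missing
missing = soup.find("author")
print(missing)
```

Calling soup.find_all("author") on the same document returns an empty list rather than None, so the two results need different checks.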

Output:

None


URL: https://www.geeksforgeeks.org/python/beautifulsoup-error-handling/