URL: https://apify.com/service-paradis/w3c-html-reporter

W3C Html Reporter · Apify


Pricing

from $2.00 / 1,000 results

Get HTML validity reports from various web pages using W3C HTML validator.

Rating

0.0

(0)

Developer

Alexandre Paradis

Maintained by Community

Actor stats

  • Bookmarked: 2
  • Total users: 11
  • Monthly active users: 0
  • Last modified: 4 months ago

W3C HTML Validity Reporter

The W3C HTML Validity Reporter is an Apify actor that checks the HTML of the given web pages against the W3C HTML Validator. The actor takes web page URLs as input and produces a report with detailed information on the validity of each page's HTML.

Input

The actor takes the following input:

  • startUrls (Array, required): The URLs of the web pages to validate.
  • proxy (Object): Proxy configuration. You can edit this to use Apify proxy, or provide your own proxy servers. Default value is { "useApifyProxy": false }.
  • debug (Boolean): See detailed logs when activated. Default value is false.

Output

The actor generates a JSON report on the validity of each web page's HTML. The report includes:

  • A list of messages given by the validator

Usage

To use the actor, you'll need an Apify account. If you don't have one, sign up for free on the Apify website.

Once you have an account, you can run the actor by creating a new task with the following configuration:

{
  "startUrls": [
    {
      "url": "https://example.com"
    }
  ],
  "proxy": {
    "useApifyProxy": false
  },
  "debug": false
}

Replace "https://example.com" with the URL of the webpage you want to validate.

Please note that the W3C validator uses Cloudflare to protect its website against bots. You may need to use Apify proxy in order to use this crawler.
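The configuration above can also be generated programmatically. The following is a minimal sketch, assuming only the three input fields documented in the Input section. The helper name `build_input` and its parameters are hypothetical and not part of the actor. It builds the same JSON object and lets you switch `useApifyProxy` on when Cloudflare blocks direct requests.

```python
import json


def build_input(urls, use_apify_proxy=False, debug=False):
    """Build an actor input object in the shape shown in the configuration above.

    `build_input` is a hypothetical helper for illustration; only the keys
    `startUrls`, `proxy`, and `debug` come from the actor's documented input.
    """
    if not urls:
        # startUrls is documented as required, so reject an empty list early.
        raise ValueError("startUrls is required: pass at least one URL")
    return {
        "startUrls": [{"url": url} for url in urls],
        "proxy": {"useApifyProxy": use_apify_proxy},
        "debug": debug,
    }


if __name__ == "__main__":
    # Enable Apify proxy, as the Cloudflare note suggests may be needed.
    run_input = build_input(["https://example.com"], use_apify_proxy=True)
    print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the task configuration as-is.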

Results example

The output from scraping the W3C validator is stored in the dataset. Each message is stored as a separate item in the dataset. After the run has finished, you can download the scraped data to your computer or export it to any web app in various data formats (JSON, CSV, XML, RSS, HTML Table). Here are a few examples of the output items you can get:

{
  "url": "https://apify.com",
  "language": "en",
  "severity": "info",
  "lastLine": 10,
  "firstColumn": 301,
  "lastColumn": 357,
  "message": "Trailing slash on void elements has no effect and interacts badly with unquoted attribute values.",
  "markup": "rowser.\"/><meta name=\"twitter:card\" content=\"summary_large_image\"/><meta ",
  "highlightIndex": 10,
  "highlightLength": 57
}
{
  "url": "https://apify.com",
  "language": "en",
  "severity": "warning",
  "firstLine": 614,
  "lastLine": 614,
  "firstColumn": 5684,
  "lastColumn": 5721,
  "message": "Section lacks heading. Consider using “h2”-“h6” elements to add identifying headings to all sections, or else use a “div” element instead for any cases where no heading is needed.",
  "markup": "-0 wwExY\"><section class=\"sc-1913faef-1 jYOdxN\"><div c",
  "highlightIndex": 10,
  "highlightLength": 38
}
{
  "url": "https://apify.com",
  "language": "en",
  "severity": "error",
  "lastLine": 10,
  "firstColumn": 1210,
  "lastColumn": 1272,
  "message": "A “meta” element with an “http-equiv” attribute whose value is “X-UA-Compatible” must have a “content” attribute with the value “IE=edge”.",
  "markup": "ent=\"24\"/><meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge,chrome=1\"/><meta ",
  "highlightIndex": 10,
  "highlightLength": 63
}
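Once the dataset has been exported as JSON, the items can be summarized with a few lines of code. The sketch below assumes the items have the fields shown in the examples above. The helper names `count_by_severity` and `format_location` are hypothetical, and `SAMPLE_ITEMS` is a trimmed copy of the three example items rather than real run output. It also assumes that an item without `firstLine` starts on its `lastLine`, which matches the first and third examples but is not stated by the actor.

```python
from collections import Counter

# Trimmed copies of the three example items shown above (illustration only).
SAMPLE_ITEMS = [
    {"url": "https://apify.com", "severity": "info",
     "lastLine": 10, "firstColumn": 301, "lastColumn": 357},
    {"url": "https://apify.com", "severity": "warning",
     "firstLine": 614, "lastLine": 614, "firstColumn": 5684, "lastColumn": 5721},
    {"url": "https://apify.com", "severity": "error",
     "lastLine": 10, "firstColumn": 1210, "lastColumn": 1272},
]


def count_by_severity(items):
    """Count dataset items per severity ("info", "warning", "error")."""
    return Counter(item.get("severity", "unknown") for item in items)


def format_location(item):
    """Return "url:line:column" for the start of the reported span.

    Assumption: when "firstLine" is absent, the span starts on "lastLine".
    """
    first_line = item.get("firstLine", item.get("lastLine"))
    return f'{item["url"]}:{first_line}:{item.get("firstColumn")}'


if __name__ == "__main__":
    for severity, count in count_by_severity(SAMPLE_ITEMS).items():
        print(severity, count)
    for item in SAMPLE_ITEMS:
        if item["severity"] == "error":
            print(format_location(item))
```

Filtering on `severity` before reporting keeps informational messages from drowning out the errors.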
