Heading Extractor
Heading Extractor pulls a page's title, meta description, and h1–h6 heading structure in a single pass. It presents the page's information design and SEO heading hierarchy in a form that is easy to scan.
Each URL is fetched server-side and its HTML is parsed there. Connections to private IP addresses or localhost are rejected.
Each fetch times out after 8 seconds and reads only the first 2 MB of the HTML body. Tag filters affect only the on-screen view and the CSV export; SEO diagnostics are evaluated on the unfiltered page state.
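As an illustration of the private-address rule, here is a minimal Python sketch of how a server-side fetcher might screen a URL before connecting. The function names (`is_blocked_address`, `host_is_allowed`, `url_is_allowed`) and the exact set of blocked ranges are assumptions made for this example, not the tool's actual implementation.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_blocked_address(address: str) -> bool:
    """Return True for loopback, private, link-local, and similar non-public IPs."""
    # Drop an IPv6 zone suffix such as "%eth0" before parsing.
    ip = ipaddress.ip_address(address.split("%", 1)[0])
    # Treat IPv4-mapped IPv6 addresses (::ffff:a.b.c.d) as their IPv4 form.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def host_is_allowed(host: str) -> bool:
    """Resolve the host and allow it only if every resolved address is public."""
    try:
        infos = socket.getaddrinfo(host, None)
    except OSError:
        return False  # unresolvable hosts are rejected
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and not any(is_blocked_address(a) for a in addresses)


def url_is_allowed(url: str) -> bool:
    """Accept only http(s) URLs whose host resolves to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    return host_is_allowed(parts.hostname)
```

A check like this only covers the moment of resolution. A production fetcher would also need to re-validate redirect targets and connect to the address it actually checked, which this sketch does not attempt.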
About Heading Extractor
Process up to 10 URLs per run and download the results as CSV (UTF-8 with BOM). Common SEO problems are diagnosed automatically: "multiple H1s", "level skipped (H2 followed by H4)", "title over 60 characters", "description over 160 characters", and more.
Useful for auditing a site's information design, taking stock of the current state before a rewrite, and studying competitors' heading structures.
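The four checks named above can be sketched in a few lines of Python. This is an illustrative reading of those rules, not the tool's own code: the `diagnose` function, its signature, and the representation of headings as `(level, text)` tuples are assumptions, and a level skip is interpreted here as any heading that is more than one level deeper than the heading before it.

```python
from typing import List, Optional, Tuple

TITLE_MAX_CHARS = 60         # threshold quoted in the description above
DESCRIPTION_MAX_CHARS = 160  # threshold quoted in the description above


def diagnose(
    title: Optional[str],
    description: Optional[str],
    headings: List[Tuple[int, str]],
) -> List[str]:
    """Return human-readable issue labels for one page."""
    issues: List[str] = []

    # More than one H1 on the page.
    if sum(1 for level, _ in headings if level == 1) > 1:
        issues.append("multiple H1s")

    # A heading that jumps more than one level deeper than its predecessor.
    previous_level: Optional[int] = None
    for level, _ in headings:
        if previous_level is not None and level > previous_level + 1:
            issues.append(f"level skipped (H{previous_level} followed by H{level})")
        previous_level = level

    # Length checks on title and description.
    if title and len(title) > TITLE_MAX_CHARS:
        issues.append(f"title over {TITLE_MAX_CHARS} characters")
    if description and len(description) > DESCRIPTION_MAX_CHARS:
        issues.append(f"description over {DESCRIPTION_MAX_CHARS} characters")

    return issues


if __name__ == "__main__":
    page_headings = [(1, "Welcome"), (2, "Features"), (4, "Details"), (1, "Contact")]
    for issue in diagnose("Example page", "A short description.", page_headings):
        print(issue)
```

Moving back up the hierarchy (for example H4 followed by H1) is not treated as a skip in this sketch, since only jumps to a deeper level leave a gap in the outline.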
How to use
- Paste one URL per line into the input (up to 10).
- Click "Extract headings". Each URL is analyzed and the results are shown.
- The top "SEO diagnosis" section flags H1 issues, level skips, and title / description length.
- Use the H1 / H2 / H3 / … checkboxes to filter which heading levels are shown.
- Click "Download CSV" to save results for analysis in Excel or Google Sheets.
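For readers who want to reproduce the export step in their own scripts, the sketch below shows one way to apply a heading-level filter and produce a UTF-8 CSV with a byte order mark, which helps Excel detect the encoding. The column names (`url`, `level`, `text`) and the function name `headings_to_csv_bytes` are hypothetical; the tool's real CSV layout may differ.

```python
import csv
import io
from typing import Iterable, Set, Tuple


def headings_to_csv_bytes(
    rows: Iterable[Tuple[str, int, str]],
    visible_levels: Set[int],
) -> bytes:
    """Serialize (url, level, text) rows for the selected heading levels only."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["url", "level", "text"])  # assumed column layout
    for url, level, text in rows:
        if level in visible_levels:
            writer.writerow([url, f"h{level}", text])
    # "utf-8-sig" prepends the BOM (EF BB BF) to the encoded output.
    return buffer.getvalue().encode("utf-8-sig")


if __name__ == "__main__":
    sample = [
        ("https://example.com/", 1, "Welcome"),
        ("https://example.com/", 2, "Features"),
        ("https://example.com/", 3, "Pricing, plans"),
    ]
    with open("headings.csv", "wb") as handle:
        handle.write(headings_to_csv_bytes(sample, visible_levels={1, 2}))
```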
Use cases
- Web producers auditing every key page's heading hierarchy before a refresh.
- SEO leads and marketers studying competitor heading structures.
- Editors and writers listing the current headings of a target article when drafting a rewrite outline.
- Operators auditing for pages with multiple H1s or skipped heading levels.
- Web directors building TOC / sitemap-style deliverables from extracted headings.
Notes
- Up to 10 URLs per request.
- Connections to private IP addresses or localhost are refused for safety.
- Each URL has an 8-second fetch timeout, so slow servers may return an error.
- Only the first 2 MB of HTML is read; headings that appear after that point on very large pages are not captured.
- Headings inserted by JavaScript (for example in SPAs) are not captured unless they are present in the initial HTML.
- SEO diagnosis runs on the full page (including headings hidden by the checkbox filter).
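The limits in these notes follow from how a static fetch-and-parse pipeline works, which the Python sketch below illustrates. It is an assumption-laden example rather than the tool's implementation: `HeadingCollector`, `extract_headings`, and `fetch_headings` are hypothetical names, decoding is simplified to UTF-8, and the parser is the standard library's `html.parser`. Because only the bytes returned by the server are parsed and no scripts run, headings added later by JavaScript never reach the parser, and anything past the byte cap is never seen.

```python
import io
import urllib.request
from html.parser import HTMLParser
from typing import BinaryIO, List, Optional, Tuple

FETCH_TIMEOUT_SECONDS = 8         # timeout quoted in the notes
MAX_BODY_BYTES = 2 * 1024 * 1024  # first 2 MB only
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for h1-h6 elements in document order."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.headings: List[Tuple[int, str]] = []
        self._level: Optional[int] = None
        self._parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS and self._level is None:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            # Collapse runs of whitespace inside the heading text.
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def extract_headings(stream: BinaryIO, limit: int = MAX_BODY_BYTES) -> List[Tuple[int, str]]:
    """Read at most `limit` bytes from the stream and parse headings from them."""
    body = stream.read(limit)
    parser = HeadingCollector()
    parser.feed(body.decode("utf-8", errors="replace"))
    parser.close()
    return parser.headings


def fetch_headings(url: str) -> List[Tuple[int, str]]:
    """Fetch a URL and extract headings from the capped response body."""
    # Note: urllib applies this timeout to each blocking socket operation,
    # not to the request as a whole.
    with urllib.request.urlopen(url, timeout=FETCH_TIMEOUT_SECONDS) as response:
        return extract_headings(response)


if __name__ == "__main__":
    html = b"<h1>Main <em>title</em></h1><h2>Section</h2>"
    print(extract_headings(io.BytesIO(html)))
```

Passing a smaller `limit` to `extract_headings` shows the truncation effect directly: a heading that starts after the cutoff is simply absent from the result.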