EU Government Website Scans
This project catalogues how European (and allied) government websites use social media, whether their URLs are reachable, and which technology platforms power them, including the third-party JavaScript services they rely on.
Current Scan Progress
Progress as of 2026-04-09 01:04 UTC
| Scan Type | Pages Scanned | Coverage |
|---|---|---|
| Combined Reachability | 77,380 confirmed reachable | ██████████████████░░ 93.6% |
| Social Media | 82,714 scanned (77,380 reachable) | ████████████████████ 100.0% |
| URL Validation | 13,771 validated (11,960 valid) | ███░░░░░░░░░░░░░░░░░ 16.6% |
| Technology | 10,376 scanned | ██░░░░░░░░░░░░░░░░░░ 12.5% |
| Accessibility Statements | 72,735 scanned | █████████████████░░░ 87.9% |
31 countries with scan data · 77,380 of 82,714 available pages confirmed reachable. See the Scan Progress Report for full details.
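The coverage figures in the table are consistent with dividing each count by the 82,714 available pages and rounding the 20-character bar down. A minimal sketch of that arithmetic follows; the function name `coverage_bar` and the round-down rule are assumptions inferred from the numbers shown, not the project's actual report code.

```python
def coverage_bar(done: int, total: int, width: int = 20) -> str:
    """Render a text progress bar followed by a one-decimal percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    fraction = min(done / total, 1.0)
    # Round down (assumption), so the bar is only completely full at 100%.
    filled = int(fraction * width)
    return "█" * filled + "░" * (width - filled) + f" {fraction * 100:.1f}%"


# Combined Reachability row: 77,380 of 82,714 pages.
print(coverage_bar(77_380, 82_714))  # ██████████████████░░ 93.6%
```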
Latest Scan Results
- Scan Progress Report — Up-to-date social media, URL validation, accessibility, and technology scan coverage across all countries, including the date range each country was scanned.
- Social Media — Detailed breakdown of social platform usage across government sites, with per-country platform counts (Twitter, X, Bluesky, Mastodon).
- Accessibility Statements — Per-country tracking of accessibility statement links as required by the EU Web Accessibility Directive.
- Technology Scanning — Technologies detected on government sites (CMS, web server, analytics, and more).
- Third-Party JavaScript — Externally hosted scripts, analytics tags, consent tools, and other third-party services loaded by government sites.
- Government Domains — Full listing of all ~36,000 government domains tracked across 31 countries.
Accessing Scan Artifacts
Each GitHub Actions scan run uploads its results as a downloadable artifact:
- Go to GitHub Actions
- Click the relevant workflow (e.g. Scan Social Media Links)
- Open a completed run and scroll to the Artifacts section
- Download the artifact (e.g. `social-scan-<run_number>`) to inspect:
  - `data/metadata.db`: the full SQLite results database
  - `*_social.toon` / `*_tech.toon`: annotated TOON files
  - Scan output logs
The Scan Progress Report is regenerated automatically after every scan and committed to this site, so you can always see the latest aggregated results here without downloading artifacts.
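If you do download an artifact, one way to get oriented in `data/metadata.db` is to list its tables and row counts. The sketch below uses only Python's standard `sqlite3` module and makes no assumption about the schema, which is not documented here; the helper name `table_row_counts` is hypothetical.

```python
import sqlite3
from pathlib import Path
from typing import Dict


def table_row_counts(db_path: str) -> Dict[str, int]:
    """Return {table_name: row_count} for every user table in a SQLite file."""
    # Open read-only so that inspecting an artifact never modifies it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Table names come from the file itself; quote them as identifiers.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
    finally:
        conn.close()
```

Calling `table_row_counts("data/metadata.db")` from the unpacked artifact directory should give a quick overview before you write more specific queries.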
What We Track
Social Media Presence
We check every government URL for links to:
| Platform | Includes |
|---|---|
| Twitter / X | twitter.com, x.com |
| Bluesky | bsky.app, bsky.social |
| Mastodon / Fediverse | 40+ known instances + /@user pattern detection |
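The matching rules in the table can be sketched as a small link classifier. This is an illustration under stated assumptions, not the scanner's code: the function name `classify_link`, the platform labels, and the one-entry `MASTODON_HOSTS` set are placeholders (the project's list of 40+ instances is not reproduced here), and the `/@user` rule is applied naively to any host.

```python
import re
from typing import Iterable, Optional
from urllib.parse import urlsplit

TWITTER_HOSTS = {"twitter.com", "x.com"}
BLUESKY_HOSTS = {"bsky.app", "bsky.social"}
# Placeholder subset; the real instance list is an assumption left out here.
MASTODON_HOSTS = {"mastodon.social"}
# Fediverse-style profile path such as /@press
HANDLE_PATH = re.compile(r"^/@[A-Za-z0-9_]+/?$")


def _host_in(host: str, known: Iterable[str]) -> bool:
    """True if host equals a known domain or is a subdomain of one."""
    return any(host == h or host.endswith("." + h) for h in known)


def classify_link(url: str) -> Optional[str]:
    """Return 'twitter', 'bluesky', 'mastodon', or None for a single link."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if not host:
        return None  # relative links carry no platform information
    if _host_in(host, TWITTER_HOSTS):
        return "twitter"
    if _host_in(host, BLUESKY_HOSTS):
        return "bluesky"
    if _host_in(host, MASTODON_HOSTS) or HANDLE_PATH.match(parts.path):
        return "mastodon"
    return None
```

Because other services also use `/@user` paths, a production scanner would likely need an exclusion list for the pattern rule; that refinement is omitted here.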
Each scanned page is classified into one of the following tiers:
| Tier | Meaning |
|---|---|
| `no_social` | Page is reachable but contains no social media links |
| `twitter_only` | Only links to Twitter / X (legacy platform) |
| `modern_only` | Only links to Bluesky or Mastodon (modern / open platforms) |
| `mixed` | Links to both Twitter/X and at least one modern platform |
| `unreachable` | Page could not be fetched |
See the Social Media page for full details.
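The tier rules can be expressed as a short decision function. This is a sketch of the logic as described in the table; the function name `classify_tier` and the platform labels `twitter`, `bluesky`, and `mastodon` are assumed for illustration.

```python
from typing import Iterable

LEGACY = {"twitter"}              # Twitter / X
MODERN = {"bluesky", "mastodon"}  # modern / open platforms


def classify_tier(reachable: bool, platforms: Iterable[str]) -> str:
    """Map a page's reachability and detected platforms to a tier name."""
    if not reachable:
        return "unreachable"
    found = set(platforms)
    has_legacy = bool(found & LEGACY)
    has_modern = bool(found & MODERN)
    if has_legacy and has_modern:
        return "mixed"
    if has_legacy:
        return "twitter_only"
    if has_modern:
        return "modern_only"
    return "no_social"
```

For example, a reachable page linking to both x.com and a Mastodon profile would be classified as `mixed`, while a page that cannot be fetched is `unreachable` regardless of any previously known links.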
URL Validation
We validate each URL and track:
- HTTP status codes and redirect chains
- Persistent failures (a URL is removed after 2 consecutive failures)
- Final redirect destinations (updated for future scans)
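The bookkeeping behind these rules might look like the following sketch. The `UrlRecord` class and `record_result` function are hypothetical, and treating any missing response or status of 400 and above as a failure is an assumption; only the threshold of 2 consecutive failures and the redirect update come from the list above.

```python
from dataclasses import dataclass
from typing import Optional

MAX_CONSECUTIVE_FAILURES = 2  # removal threshold stated above


@dataclass
class UrlRecord:
    url: str
    consecutive_failures: int = 0
    removed: bool = False


def record_result(
    rec: UrlRecord, status: Optional[int], final_url: Optional[str] = None
) -> UrlRecord:
    """Update a record after one validation attempt.

    status is the final HTTP status code, or None if the request failed
    outright (DNS error, timeout, and so on).
    """
    ok = status is not None and 200 <= status < 400  # assumed success range
    if ok:
        rec.consecutive_failures = 0  # a success breaks the failure streak
        if final_url and final_url != rec.url:
            rec.url = final_url  # use the redirect destination in future scans
    else:
        rec.consecutive_failures += 1
        if rec.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
            rec.removed = True
    return rec
```

Resetting the counter on success means a URL that fails once, recovers, and fails again later is not removed; only back-to-back failures count.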
Technology Detection
We detect the CMS, framework, and analytics platforms used by each government site.
Third-Party JavaScript
We also track externally hosted scripts loaded by government pages, including:
- Analytics and tag managers
- Cookie consent tools
- CDNs and shared JavaScript libraries
- Customer-support and monitoring widgets
See Third-Party JavaScript for the EU-wide breakdown.
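As a rough illustration of what "externally hosted" means here, the sketch below collects the `src` of every `<script>` tag in a page and keeps the hosts that differ from the page's own host. It uses only the standard library; the names `ScriptCollector` and `third_party_scripts` are hypothetical, and the exact-host comparison is a simplifying assumption (a real scanner may treat sibling subdomains as first-party).

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlsplit


class ScriptCollector(HTMLParser):
    """Collect the src attribute of every <script> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def third_party_scripts(page_url: str, html: str) -> List[str]:
    """Return the sorted hosts of scripts served from a different host."""
    page_host = (urlsplit(page_url).hostname or "").lower()
    collector = ScriptCollector()
    collector.feed(html)
    hosts = set()
    for src in collector.sources:
        # urljoin resolves relative and protocol-relative (//host/x.js) URLs.
        host = (urlsplit(urljoin(page_url, src)).hostname or "").lower()
        if host and host != page_host:
            hosts.add(host)
    return sorted(hosts)
```

Scripts injected at runtime by other scripts are invisible to this kind of static parse, so it gives a lower bound on what a browser would actually load.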
Lighthouse Audits
We run Google Lighthouse on each government page and record five quality scores: performance, accessibility, best practices, SEO, and PWA compliance (0–100 scale).
See Lighthouse Scanning for full details.
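Lighthouse's JSON report stores each category score as a fraction between 0 and 1 (or null when it could not be computed), so recording scores on a 0 to 100 scale involves a small conversion. The sketch below shows one way to do that; `extract_scores` is a hypothetical helper, and the `pwa` category may be absent from a report depending on the Lighthouse version.

```python
from typing import Dict, Optional

# Category ids as they appear under "categories" in a Lighthouse JSON report.
CATEGORY_IDS = ("performance", "accessibility", "best-practices", "seo", "pwa")


def extract_scores(report: dict) -> Dict[str, Optional[int]]:
    """Convert Lighthouse category scores (0 to 1) to integers on a 0 to 100 scale.

    Missing categories and null scores are returned as None.
    """
    categories = report.get("categories", {})
    scores: Dict[str, Optional[int]] = {}
    for cat_id in CATEGORY_IDS:
        raw = (categories.get(cat_id) or {}).get("score")
        scores[cat_id] = None if raw is None else round(raw * 100)
    return scores
```

A report saved with Lighthouse's JSON output can be loaded with `json.load` and passed straight to `extract_scores`.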
Countries Covered
The dataset covers all EU member states plus selected allied nations: United Kingdom, Switzerland, Iceland, Norway, and Canada.
See Government Domains for the full domain listing per country.
How the Scans Work
Scans run automatically on a schedule via GitHub Actions:
| Scan | Schedule | Priority |
|---|---|---|
| Social Media | Every 3 hours | Highest — confirms reachability and collects social-link data in one pass |
| Technology Detection | On demand | Medium — run manually for new countries |
| URL Validation | Every 12 hours | Lowest — lightweight redirect/404 check; skipped for pages already confirmed reachable within 30 days |
| Lighthouse Audits | Weekly (Sundays 04:00 UTC) | Medium — slow per-URL (~5 s); weekly cadence keeps data fresh without overloading servers |
| Scan Progress Report | After every scan | — |
After each scan run, this site is automatically updated with the latest results.
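The skip rule for URL validation in the table above (pages confirmed reachable within 30 days are not re-validated) can be sketched as a simple freshness check. The function name `needs_url_validation` and the handling of the exact 30-day boundary are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

REACHABLE_TTL = timedelta(days=30)  # window stated in the schedule table


def needs_url_validation(
    last_confirmed_reachable: Optional[datetime],
    now: Optional[datetime] = None,
) -> bool:
    """Return True if a page should be included in the next validation run."""
    now = now or datetime.now(timezone.utc)
    if last_confirmed_reachable is None:
        return True  # never confirmed reachable, so it must be validated
    # Skip pages whose last confirmation is still inside the window.
    return now - last_confirmed_reachable > REACHABLE_TTL
```

Since the social media scan confirms reachability every 3 hours, most pages would stay inside the window and the validation run only has to cover the remainder. Both datetimes should be timezone-aware so the subtraction is well defined.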
Source Code & Data
Scan data is collected by automated workflows and stored as GitHub Actions artifacts. The progress report is regenerated after every scan and committed directly to this site.