This query is marked as a draft
This query has been published by MarioGom.
This query lists the TLDs used in external links from the main namespace of Spanish Wikipedia. For each TLD it reports the total number of links that use it and the number of articles containing at least one link with that TLD. Only links matching the query's filter (^https?://) are counted, that is, http and https URLs. Other schemes such as ftp://, irc://, mailto: or svn://, as well as protocol-relative (//) links, are excluded. WARNING: This query does not support IDN TLDs.
SQL
USE eswiki_p;

SELECT
    tld,
    COUNT(DISTINCT page_id) AS article_count,
    COUNT(*) AS total_count
FROM (
    -- Extract all (page_id, tld) tuples
    SELECT
        page_id,
        -- Extract tld from domain.
        SUBSTRING_INDEX(domain, '.', -1) AS tld
    FROM (
        -- Extract all (page_id, domain) tuples.
        SELECT
            el_from AS page_id,
            SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(
                CAST(el_to AS CHAR(255) CHARACTER SET utf8),
                '/', 3), '://', -1), '/', 1), '?', 1), ':', 1) AS domain
        FROM externallinks
        WHERE
            -- Only links from main namespace articles.
            el_from_namespace = 0
            -- Get only fairly well-formed URLs for http and https and exclude IPv4.
            AND el_to REGEXP '^https?://[[:alnum:]]+\..*[[:alpha:]].*'
    ) AS domains
) AS main_domains
GROUP BY tld
ORDER BY article_count DESC, total_count DESC;
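The nested SUBSTRING_INDEX calls in the query can be hard to follow, so here is a minimal sketch in Python of what they appear to do, for experimenting outside the replica database. The helper `substring_index` is only an approximation of the SQL function for the simple delimiters used here. The sample rows are made up for illustration and are not data from eswiki. The REGEXP filter is not reproduced.

```python
from collections import Counter, defaultdict


def substring_index(s, delim, count):
    """Approximate SQL SUBSTRING_INDEX(s, delim, count).

    count > 0: text before the count-th delimiter, counting from the left.
    count < 0: text after the |count|-th delimiter, counting from the right.
    If there are fewer delimiters than |count|, the whole string is returned.
    """
    if count == 0:
        return ""
    parts = s.split(delim)
    return delim.join(parts[:count] if count > 0 else parts[count:])


def extract_tld(url):
    """Apply the same chain of cuts as the query, innermost call first."""
    domain = url
    for delim, count in (
        ("/", 3),     # keep scheme://host[:port], drop the path
        ("://", -1),  # drop the scheme
        ("/", 1),     # drop anything after a remaining slash
        ("?", 1),     # drop a query string attached directly to the host
        (":", 1),     # drop the port
    ):
        domain = substring_index(domain, delim, count)
    # The TLD is whatever follows the last dot of the domain.
    return substring_index(domain, ".", -1)


# Hypothetical (page_id, url) rows standing in for externallinks.
rows = [
    (1, "https://example.org:8080/path?x=1"),
    (1, "http://example.org/other"),
    (2, "http://example.com?q=a.b"),
]

# total_count: number of links per TLD, like COUNT(*).
total_count = Counter(extract_tld(url) for _, url in rows)

# article_count: number of distinct pages per TLD, like COUNT(DISTINCT page_id).
pages_by_tld = defaultdict(set)
for page_id, url in rows:
    pages_by_tld[extract_tld(url)].add(page_id)
article_count = {tld: len(ids) for tld, ids in pages_by_tld.items()}

print(total_count, article_count)
```

In this sketch page 1 links to example.org twice, so "org" gets a total count of 2 but an article count of 1, which is the distinction the two columns of the query are meant to capture.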
All SQL code is licensed under the CC0 License.