This query has been published by MarioGom.
This query lists the TLDs used in external links from the main namespace of Spanish Wikipedia. For each TLD it reports the total number of links using it and the number of articles containing at least one such link. Only links with an http or https scheme, or protocol-relative links (//), are considered; other schemes such as ftp://, irc://, mailto: and svn:// are excluded. WARNING: This query does not support IDN TLDs.
SQL
USE eswiki_p;

SELECT *
FROM (
    SELECT
        tld,
        COUNT(DISTINCT page_id) AS article_count,
        COUNT(*) AS total_count
    FROM (
        -- Extract all (page_id, tld) tuples.
        SELECT
            page_id,
            LOWER(SUBSTRING_INDEX(domain, '.', -1)) AS tld
        FROM (
            -- Extract all (page_id, domain) tuples.
            SELECT
                el_from AS page_id,
                REGEXP_REPLACE(el_to, '^(?:[^:/]+:)?//((?:[_0-9A-Za-z]+\\.)+[A-Za-z]+)(?:[/:?].*)?$', '\\1') AS domain
            FROM externallinks
            WHERE
                -- Only links from main namespace articles.
                el_from_namespace = 0
                -- Keep only fairly well-formed http, https or protocol-relative URLs, and exclude IPv4 addresses.
                AND el_to REGEXP '^(https?:)?//(?:[_0-9A-Za-z]+\\.)+[A-Za-z]+(?:[/:?].*)?$'
        ) AS domains
    ) AS tlds
    GROUP BY tld
) AS results
ORDER BY article_count DESC, total_count DESC;
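The two regular expressions in the query can be checked offline before the query is run against the replicas. The following is a minimal Python sketch, not part of the published query: it reuses the two patterns from the SQL unchanged and mirrors the filter, domain extraction, and grouping steps on a handful of made-up links. The function names `extract_tld` and `tld_counts` and the sample URLs are hypothetical, and Python's `re` module may differ from the database's regex engine in edge cases.

```python
import re
from collections import Counter

# Same patterns as in the SQL, written as Python raw strings.
# FILTER_RE corresponds to the REGEXP condition in the WHERE clause.
FILTER_RE = re.compile(r'^(https?:)?//(?:[_0-9A-Za-z]+\.)+[A-Za-z]+(?:[/:?].*)?$')
# DOMAIN_RE corresponds to the REGEXP_REPLACE call that isolates the host name.
DOMAIN_RE = re.compile(r'^(?:[^:/]+:)?//((?:[_0-9A-Za-z]+\.)+[A-Za-z]+)(?:[/:?].*)?$')


def extract_tld(url):
    """Return the lowercased TLD of url, or None if the filter rejects it."""
    if not FILTER_RE.match(url):
        return None
    domain = DOMAIN_RE.sub(r'\1', url)
    # Equivalent of LOWER(SUBSTRING_INDEX(domain, '.', -1)).
    return domain.rsplit('.', 1)[-1].lower()


def tld_counts(links):
    """Aggregate (page_id, url) pairs into (tld, article_count, total_count) rows."""
    total = Counter()
    pages = {}
    for page_id, url in links:
        tld = extract_tld(url)
        if tld is None:
            continue
        total[tld] += 1
        pages.setdefault(tld, set()).add(page_id)
    rows = [(tld, len(pages[tld]), total[tld]) for tld in total]
    # ORDER BY article_count DESC, total_count DESC
    rows.sort(key=lambda row: (-row[1], -row[2]))
    return rows


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        (1, "https://es.wikipedia.org/wiki/Portada"),
        (1, "https://www.example.org/"),
        (2, "//example.org/a"),
        (2, "http://example.com"),
        (3, "ftp://ftp.example.org/file"),  # rejected: scheme is not http(s)
        (3, "http://192.168.0.1/admin"),    # rejected: last label is numeric
    ]
    for row in tld_counts(sample):
        print(row)
```

Running the sketch on candidate URLs also shows which links the filter skips. For example, a host name containing a hyphen does not match the character class `[_0-9A-Za-z]`, so `extract_tld("https://my-site.example.com/")` returns `None`.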
All SQL code is licensed under the CC0 License.