Potentially extraneous homepage links (dewiki <= M)
by HaeB
Goal: find extraneous external links, as in "[http://example.com/page foo] on [http://example.com/ example site]", without parsing dumps but with reasonably few false positives.
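One way such candidates might be located is sketched below. This is not necessarily the approach the author takes; it is a minimal self-join, assuming that a bare homepage URL is stored with el_to_path = '/' and that a "deep" link is any other path on the same domain index and the same page. The aliases home and deep and the LIMIT are illustrative choices.

```sql
-- Sketch only: pages that link both to a site's homepage and to a deeper
-- page on the same domain. Assumes homepage links have el_to_path = '/'.
SELECT home.el_from            AS page_id,
       home.el_to_domain_index AS domain_index,
       deep.el_to_path         AS deep_path
FROM externallinks AS home
JOIN externallinks AS deep
  ON  deep.el_from = home.el_from
  AND deep.el_to_domain_index = home.el_to_domain_index
  AND deep.el_id <> home.el_id
WHERE home.el_to_path = '/'
  AND deep.el_to_path <> '/'
LIMIT 20;
```

A result of this kind would still contain false positives (for example, a homepage link in an infobox and an unrelated citation to the same site), so further filtering would likely be needed.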
SQL
USE dewiki_p;
SET @language = 'de';

DESCRIBE externallinks;

# https://quarry.wmcloud.org/query/77235 :
SELECT *,
       CONCAT(
         REGEXP_REPLACE(
           el_to_domain_index,
           '^(.*?://)(?:([^.]+)\\.)([^.]+\\.)?([^.]+\\.)?([^.]+\\.)?([^.]+\\.)?([^.]+\\.)?([^.]+\\.)?([^.]+\\.)$',
           '\\1\\9\\8\\7\\6\\5\\4\\3\\2'
         ),
         el_to_path
       ) AS url
FROM externallinks
LIMIT 5;
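To see what the REGEXP_REPLACE call does in isolation, it can be run against a literal string instead of a table column. The snippet below is a small sketch; the @sample value is a made-up example of the reversed-host form stored in el_to_domain_index, not a row taken from the database.

```sql
-- Hypothetical sample value in the reversed-host form used by el_to_domain_index.
SET @sample = 'https://org.wikipedia.de.';

-- Same pattern and replacement as in the query above: capture the scheme,
-- then the dot-separated host labels, and emit the labels in reverse order.
SELECT REGEXP_REPLACE(
         @sample,
         '^(.*?://)(?:([^.]+)\\.)([^.]+\\.)?([^.]+\\.)?([^.]+\\.)?([^.]+\\.)?([^.]+\\.)?([^.]+\\.)?([^.]+\\.)$',
         '\\1\\9\\8\\7\\6\\5\\4\\3\\2'
       ) AS host;
```

For this sample the result should be https://de.wikipedia.org. Because the pattern has one mandatory first label, six optional labels, and one mandatory last label, it covers hosts with two to eight labels; values outside that range do not match and are returned unchanged.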
All SQL code is licensed under the CC0 License.