Fork of
Likely partially disambiguated titles (1)
by Certes
This query is marked as a draft
This query has been published
by Certes.
SQL
x
SELECT REGEXP_REPLACE(page_title, "(.*?\\().*?([^_\\W]+\\))", "\\1\\2") AS PT,
COUNT(*) AS N,
MIN(CASE WHEN page_title REGEXP "\\([^_\\W]+\\)$" THEN NULL ELSE page_title END) AS title1,
CASE WHEN COUNT(*)>1 THEN MAX(CASE WHEN page_title REGEXP "\\([^_\\W]+\\)$" THEN NULL ELSE page_title END) ELSE NULL END AS title2,
SUBSTRING_INDEX(page_title, '_(', 1) AS basename,
REGEXP_REPLACE(page_title, ".*?\\(.*?([^_\\W]+)\\)", "\\1") AS qualifier
FROM page
LEFT JOIN categorylinks ON cl_from=page_id AND cl_type="page" AND cl_to="All_disambiguation_pages"
WHERE page_namespace=0 AND page_title LIKE "%(%)" AND page_is_redirect=0 AND cl_from IS NULL
GROUP BY basename, qualifier
HAVING SUM(page_title = PT)=1 AND N>1
By running queries you agree to the Cloud Services Terms of Use and you irrevocably agree to release your SQL under CC0 License.
All SQL code is licensed under CC0 License.