WikiProject to page category cross-reference, 5% scan
by The Anome
Part of a plan to assign articles to WikiProjects using Naive Bayes. The query detects only pages whose talk pages have already been assigned to WikiProjects. Still being worked on.

Timings from successive runs:

Older version of this query covering 1/100 of the article space: executed in 78.19 seconds as of Wed, 15 Mar 2023 14:13:29 UTC. Resultset: 61075 rows.

1/10 of the article space: executed in 634.59 seconds as of Wed, 15 Mar 2023 14:16:55 UTC. Resultset: 609869 rows. This suggests about 2 hours for the whole article space, so let's try it.

Whole article space: executed in 5683.80 seconds as of Wed, 15 Mar 2023 14:34:20 UTC. Resultset: 6102031 rows.

Note that loading the 6102031-row resultset into this web page severely taxes your web browser.
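The "about 2 hours" estimate can be checked with a short Python sketch that scales each sampled run linearly up to the full article space. The only inputs are the timings and row counts quoted in the description; the helper name `extrapolate` and the assumption that runtime grows linearly with the sampled fraction are illustrative, not part of the original query.

```python
# Measurements quoted in the description:
# (fraction of article space scanned, runtime in seconds, rows returned).
runs = [
    (0.01, 78.19, 61075),
    (0.10, 634.59, 609869),
    (1.00, 5683.80, 6102031),
]


def extrapolate(fraction, value):
    """Scale a measurement taken on a sample up to the whole article space.

    Assumes cost grows linearly with the sampled fraction (an assumption,
    not something the query page states).
    """
    return value / fraction


for fraction, seconds, rows in runs:
    est_hours = extrapolate(fraction, seconds) / 3600
    est_rows = extrapolate(fraction, rows)
    print(f"{fraction:.0%} sample: about {est_hours:.2f} h "
          f"and {est_rows:,.0f} rows for a full scan")
```

Under that linear assumption the 1/100 run projects to a little over 2 hours and the 1/10 run to somewhat under 2 hours; the full run reported above finished in roughly 1.6 hours.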
SQL
SELECT DISTINCT article.page_title, article.page_id, talk.page_id
-- REGEXP_EXTRACT(talkcats.cl_to, '(?WikiProject_|.*_importance_|.*_priority_|Unassessed)(.*)'), -- does not work
FROM page AS article
INNER JOIN page AS talk ON talk.page_title = article.page_title
INNER JOIN categorylinks AS pagecats ON pagecats.cl_from = article.page_id
INNER JOIN categorylinks AS talkcats ON talkcats.cl_from = talk.page_id
WHERE
  -- article.page_id % 10 = 0 AND
  article.page_namespace = 0
  AND talk.page_namespace = 1
  AND article.page_is_redirect = 0
  AND talk.page_is_redirect = 0
  AND (
       talkcats.cl_to LIKE "%WikiProject_%"
    OR talkcats.cl_to LIKE "%-Class_%_articles"
    OR talkcats.cl_to LIKE "%-importance_%_articles"
    OR talkcats.cl_to LIKE "%-priority_%_articles"
    OR talkcats.cl_to LIKE "Unassessed_%_articles"
  )
  AND NOT (
       talkcats.cl_to LIKE "%vital%"
    OR talkcats.cl_to LIKE "%Version%"
  )
  AND NOT (
       pagecats.cl_to LIKE "%Disambig%"
    OR pagecats.cl_to LIKE "%disambig%"
    OR pagecats.cl_to LIKE "%set_index%"
    OR pagecats.cl_to LIKE "Set_index%"
  )
  AND NOT (
       talkcats.cl_to LIKE "%Disambig%"
    OR talkcats.cl_to LIKE "%disambig%"
  )
  AND NOT (
       pagecats.cl_to LIKE "Short_description%"
    OR pagecats.cl_to LIKE "%_errors%"
    OR pagecats.cl_to LIKE "CS1_%"
    OR pagecats.cl_to LIKE "%short_description%"
    OR pagecats.cl_to LIKE "%articles%"
    OR pagecats.cl_to LIKE "Articles%"
    OR pagecats.cl_to LIKE "%pages%"
    OR pagecats.cl_to LIKE "%disputes%"
    OR pagecats.cl_to LIKE "Pages%"
    OR pagecats.cl_to LIKE "Use_dmy_date%"
    OR pagecats.cl_to LIKE "%Wikipedia%"
    OR pagecats.cl_to LIKE "%articles%"
    OR pagecats.cl_to LIKE "%Articles%"
    OR pagecats.cl_to LIKE "%Wikidata%"
    OR pagecats.cl_to LIKE "Webarchive%"
  )
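To see which talk-page categories the LIKE filters in the query let through, the following Python sketch re-implements the `talkcats.cl_to` conditions outside the database. It relies on two standard LIKE rules: `%` matches any run of characters and `_` matches exactly one character (so `_` also matches a literal underscore). The function names and the sample category names are hypothetical, and the case-sensitive matching is an assumption about how `cl_to` is compared on the replicas.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a compiled regex for whole-string matching.

    '%' becomes '.*' and '_' becomes '.'; every other character is literal.
    Matching is case-sensitive (assumed to mirror how cl_to is compared).
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def like(value, pattern):
    """Return True if value matches the LIKE pattern as a whole string."""
    return like_to_regex(pattern).fullmatch(value) is not None


# Patterns copied from the talkcats conditions of the query.
INCLUDE = [
    "%WikiProject_%",
    "%-Class_%_articles",
    "%-importance_%_articles",
    "%-priority_%_articles",
    "Unassessed_%_articles",
]
EXCLUDE = ["%vital%", "%Version%", "%Disambig%", "%disambig%"]


def keeps_talk_category(cl_to):
    """Mirror the per-row talkcats filter: any include pattern, no exclude pattern."""
    included = any(like(cl_to, p) for p in INCLUDE)
    excluded = any(like(cl_to, p) for p in EXCLUDE)
    return included and not excluded


# Hypothetical category names, for illustration only.
for name in [
    "Start-Class_Physics_articles",
    "Low-importance_Physics_articles",
    "All_WikiProject_Disambiguation_pages",
    "Living_people",
]:
    print(name, keeps_talk_category(name))
```

Because `_` is a wildcard, a pattern such as `%WikiProject_%` also matches names like `WikiProjects`, where the character after `WikiProject` is not an underscore. If that matters, the underscore would need to be escaped in the SQL pattern.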
All SQL code is licensed under CC0 License.