Fork of "Random 7.5k sample of Wikidata items across different size strata" by EpochFail
This query has been published by Glorian WD.
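Generates a stratified random sample of Wikidata items (pages in namespace 0), with strata defined by the byte length of each item's JSON blob (page_len). Five size strata contribute up to 5,500 randomly chosen items: 1,500 from each of the three smallest strata and 500 from each of the two largest. The query additionally returns the 2,000 items with the lowest page_id values, labelled "low-qid", for a total of up to 7,500 rows. Redirects are excluded from every stratum.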
SQL
USE wikidatawiki_p;

(SELECT page_latest AS rev_id, page_title, page_len, "1024" AS strata
 FROM page
 WHERE page_namespace = 0 AND page_len BETWEEN 0 AND 1024 AND page_is_redirect = 0
 ORDER BY RAND() LIMIT 1500)
UNION ALL
(SELECT page_latest AS rev_id, page_title, page_len, "8192" AS strata
 FROM page
 WHERE page_namespace = 0 AND page_len BETWEEN 1025 AND 8192 AND page_is_redirect = 0
 ORDER BY RAND() LIMIT 1500)
UNION ALL
(SELECT page_latest AS rev_id, page_title, page_len, "131072" AS strata
 FROM page
 WHERE page_namespace = 0 AND page_len BETWEEN 8193 AND 131072 AND page_is_redirect = 0
 ORDER BY RAND() LIMIT 1500)
UNION ALL
(SELECT page_latest AS rev_id, page_title, page_len, "262144" AS strata
 FROM page
 WHERE page_namespace = 0 AND page_len BETWEEN 131073 AND 262144 AND page_is_redirect = 0
 ORDER BY RAND() LIMIT 500)
UNION ALL
(SELECT page_latest AS rev_id, page_title, page_len, "inf" AS strata
 FROM page
 WHERE page_namespace = 0 AND page_len >= 262145 AND page_is_redirect = 0
 ORDER BY RAND() LIMIT 500)
UNION ALL
(SELECT page_latest AS rev_id, page_title, page_len, "low-qid" AS strata
 FROM page
 WHERE page_namespace = 0 AND page_is_redirect = 0
 ORDER BY page_id LIMIT 2000);
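When post-processing the result set, it can help to restate the size boundaries outside SQL. The following is a minimal Python sketch, not part of the original query: the names `STRATA`, `LOW_QID_LIMIT`, `stratum_for`, and `max_rows` are hypothetical, and the bounds and limits are copied from the SQL above. It maps a `page_len` value to the strata label the query would assign and computes the largest possible row count.

```python
from typing import List, Optional, Tuple

# (inclusive upper bound in bytes, strata label, LIMIT), copied from the query.
# None marks the open-ended top stratum (page_len >= 262145).
STRATA: List[Tuple[Optional[int], str, int]] = [
    (1024, "1024", 1500),
    (8192, "8192", 1500),
    (131072, "131072", 1500),
    (262144, "262144", 500),
    (None, "inf", 500),
]

# LIMIT of the separate "low-qid" branch, which is ordered by page_id.
LOW_QID_LIMIT = 2000


def stratum_for(page_len: int) -> str:
    """Return the size-stratum label for a page of the given byte length."""
    if page_len < 0:
        raise ValueError("page_len must be non-negative")
    for upper, label, _limit in STRATA:
        if upper is None or page_len <= upper:
            return label
    raise AssertionError("unreachable: the last stratum is open-ended")


def max_rows() -> int:
    """Upper bound on rows returned: all size strata plus the low-qid branch."""
    return sum(limit for _upper, _label, limit in STRATA) + LOW_QID_LIMIT
```

Because the branches are combined with UNION ALL, an item with a low page_id can appear twice: once in its size stratum and once under "low-qid". Deduplicate on page_title if each item should be counted only once.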
All SQL code is licensed under the CC0 License.