This query is marked as a draft
This query has been published
by Zar2gar1.
This query performs basic filtering on English Wikipedia pages, then gathers data such as the human-readable article name. It is largely a demo and starting point for later joins, which may be handled in a ToolDB user image. Anything that doesn't play well with the database schema or that needs an analytical tool (like page views) belongs elsewhere.
SQL
# Set a query timeout for politeness
SET STATEMENT max_statement_time = 300 FOR
# Indices don't allow for an efficient join ...
# ... so we'll just grab page IDs for now
SELECT
    page_id AS "Article ID",
    page_title AS "Article Name",
    page_is_redirect AS "Redirecting?",
    page_latest AS "Latest Revision ID",
    page_len AS "Article Size (B)"
FROM page
WHERE (
    # We're only interested in articles (main namespace)
    page_namespace = 0
    AND
    # Keep redirects for later joins with revision history
    # Stubs, however, are another matter ...
    # ... 2500 B threshold based on 500 words * ~5 chars / word
    (page_is_redirect = 0 AND page_len > 2500)
)
# We can't select article pages without an efficient join ...
# ... so we'll just oversample for now
# TESTING: Start with a small limit to gauge time
LIMIT 100000;
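The filter can be tried locally before spending time on the replicas. The following is a minimal sketch, not the author's method: it assumes an in-memory SQLite table that mimics only the five `page` columns the query reads, and the toy rows, the `filter_pages` helper, and the `ORDER BY` (added only to make the output deterministic) are all hypothetical. As written in the query, the condition `page_is_redirect = 0` excludes redirects, and the strict `>` excludes a page of exactly 2500 B.

```python
import sqlite3

# Stub threshold from the query's comment: 500 words * ~5 chars / word
STUB_THRESHOLD_BYTES = 500 * 5

# Hypothetical toy rows, not real Wikipedia data:
# (page_id, page_namespace, page_title, page_is_redirect, page_latest, page_len)
TOY_PAGES = [
    (1, 0, "Long_article", 0, 101, 12000),
    (2, 0, "Short_stub", 0, 102, 800),
    (3, 0, "Some_redirect", 1, 103, 60),
    (4, 1, "Long_article", 0, 104, 9000),  # namespace 1 (talk), not an article
    (5, 0, "Borderline", 0, 105, 2500),    # exactly at the threshold
]

# Same WHERE logic as the query above; ORDER BY is an addition for repeatability.
FILTER_SQL = """
SELECT page_id, page_title, page_is_redirect, page_latest, page_len
FROM page
WHERE page_namespace = 0
  AND page_is_redirect = 0
  AND page_len > ?
ORDER BY page_id
LIMIT ?
"""


def filter_pages(rows, threshold=STUB_THRESHOLD_BYTES, limit=100000):
    """Load rows into a throwaway SQLite table and apply the article filter."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE page ("
            "page_id INTEGER PRIMARY KEY, page_namespace INTEGER, "
            "page_title TEXT, page_is_redirect INTEGER, "
            "page_latest INTEGER, page_len INTEGER)"
        )
        conn.executemany("INSERT INTO page VALUES (?, ?, ?, ?, ?, ?)", rows)
        return conn.execute(FILTER_SQL, (threshold, limit)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in filter_pages(TOY_PAGES):
        print(row)
```

Lowering `threshold` or `limit` in the call is a cheap way to see how each condition changes the result set before adjusting the real query.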
All SQL code is licensed under the CC0 License.