Query by Robertsky.
SQL
-- Identify potential cut-and-paste move candidates (limited to 10 for testing)
WITH ArticlePairs AS (
    SELECT
        r1.article_id AS source_article,
        r2.article_id AS destination_article,
        r1.rev_id AS source_rev_id,
        r2.rev_id AS destination_rev_id,
        r1.timestamp AS source_timestamp,
        r2.timestamp AS destination_timestamp,
        r1.content AS source_content,
        r2.content AS destination_content,
        SIMILARITY(r1.content, r2.content) AS diff_score
    FROM revisions r1
    JOIN revisions r2
        ON r1.article_id != r2.article_id -- Compare different articles
    WHERE SIMILARITY(r1.content, r2.content) > 0.85 -- Set a threshold for similarity
      AND ABS(EXTRACT(EPOCH FROM (r1.timestamp - r2.timestamp))) < 3600 -- Within 1 hour
      AND r1.timestamp < r2.timestamp -- Source precedes destination; avoids listing each pair twice
)
SELECT
    ROW_NUMBER() OVER (ORDER BY ap.source_timestamp DESC) AS idx, -- Index for the report, in output order
    a1.article_name AS source, -- Source article name
    ap.source_rev_id AS PreID, -- Source revision ID
    TO_CHAR(ap.source_timestamp, 'YYYY/MM/DD HH24:MI:SS') AS Predate, -- Source timestamp
    a2.article_name AS destination, -- Destination article name
    ap.destination_rev_id AS PostID, -- Destination revision ID
    TO_CHAR(ap.destination_timestamp, 'YYYY/MM/DD HH24:MI:SS') AS Postdate, -- Destination timestamp
    ap.diff_score AS DiffScore -- Similarity score between the two revisions (0 to 1)
FROM ArticlePairs ap
JOIN articles a1 ON ap.source_article = a1.article_id
JOIN articles a2 ON ap.destination_article = a2.article_id
ORDER BY ap.source_timestamp DESC
LIMIT 10; -- Limit the results to 10 pairs for testing
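The query relies on `SIMILARITY`, `EXTRACT(EPOCH FROM ...)` and `TO_CHAR`, which suggests PostgreSQL with the `pg_trgm` extension, and on two tables named `articles` and `revisions`. The following is a minimal sketch of a setup it could be tried against. The column types and the sample rows are assumptions inferred from the names the query references; they are not taken from any real database schema.

```sql
-- Hypothetical test setup (PostgreSQL). Table layouts are guessed from the
-- columns the query uses; adjust them to match the real data source.
CREATE EXTENSION IF NOT EXISTS pg_trgm;  -- provides similarity(text, text)

CREATE TABLE articles (
    article_id   integer PRIMARY KEY,
    article_name text NOT NULL
);

CREATE TABLE revisions (
    rev_id      integer PRIMARY KEY,
    article_id  integer NOT NULL REFERENCES articles (article_id),
    "timestamp" timestamp NOT NULL,
    content     text NOT NULL
);

-- Two different articles whose revisions carry identical text 20 minutes
-- apart, so the pair passes both the similarity and the one-hour filter.
INSERT INTO articles (article_id, article_name) VALUES
    (1, 'Old title'),
    (2, 'New title');

INSERT INTO revisions (rev_id, article_id, "timestamp", content) VALUES
    (10, 1, '2024-01-01 12:00:00', 'The quick brown fox jumps over the lazy dog near the river bank.'),
    (11, 2, '2024-01-01 12:20:00', 'The quick brown fox jumps over the lazy dog near the river bank.');
```

Because the query joins every revision against every other revision, its cost grows quadratically with the size of `revisions`, so a small sample like this is probably the safest place to test it.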
All SQL code is licensed under the CC0 License.