This query has been published by Isaac (WMF).
The img_metadata column is stored in one of two formats: JSON or a PHP-serialized string. This query handles both. The metadata key that holds the duration depends on the file's MIME subtype (ogg, mpeg, or webm): ogg files use length, the others use playtime_seconds. All three subtypes are handled. Files with no parseable duration are counted with a length of 0 seconds. Audio files with a reported length of one day or more are excluded as outliers or likely bad metadata.
Data details: https://www.mediawiki.org/wiki/Manual:Image_table
Additional statistics: https://commons.wikimedia.org/wiki/Special:MediaStatistics
SQL
WITH audio_length AS (
    SELECT
        img_minor_mime,
        IF(img_minor_mime = 'ogg',
            ROUND(COALESCE(
                JSON_EXTRACT(img_metadata, '$.data.length'),
                SUBSTR(REGEXP_SUBSTR(img_metadata, '(s:6:"length";d:)[0-9]*\.?[0-9]*'), 16),
                0), 3),
            ROUND(COALESCE(
                JSON_EXTRACT(img_metadata, '$.data.playtime_seconds'),
                SUBSTR(REGEXP_SUBSTR(img_metadata, '(s:16:"playtime_seconds";d:)[0-9]*\.?[0-9]*'), 27),
                0), 3)
        ) AS audio_length_seconds
    FROM image
    WHERE img_media_type = 'AUDIO'
)
SELECT
    img_minor_mime,
    COUNT(1) AS num_files,
    SUM(audio_length_seconds) AS total_audio_length_seconds
FROM audio_length
WHERE audio_length_seconds < (60 * 60 * 24)
GROUP BY img_minor_mime
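The extraction logic in the query can be hard to follow inline, so here is a minimal Python sketch of the same idea: try the JSON path first, fall back to a regular expression over the PHP-serialized string, default to 0, then drop anything of one day or more before aggregating. The function names and the sample metadata strings are hypothetical and only illustrate the two formats; the key names (length, playtime_seconds) and the one-day cutoff are taken from the query above. It is not a replacement for running the query against the replicas.

```python
import json
import re
from collections import defaultdict

ONE_DAY_SECONDS = 60 * 60 * 24  # same cutoff as the SQL WHERE clause


def audio_length_seconds(minor_mime, metadata):
    """Return the duration in seconds, or 0.0 if none can be parsed."""
    # As in the query: ogg metadata uses 'length', the rest 'playtime_seconds'.
    key = "length" if minor_mime == "ogg" else "playtime_seconds"

    # Format 1: JSON, with the value under data.<key>.
    try:
        return round(float(json.loads(metadata)["data"][key]), 3)
    except (ValueError, KeyError, TypeError):
        pass  # not JSON, or the key is missing: try the serialized form

    # Format 2: PHP-serialized string, e.g. s:6:"length";d:7.5;
    pattern = r's:%d:"%s";d:([0-9]*\.?[0-9]*)' % (len(key), key)
    match = re.search(pattern, metadata)
    if match and match.group(1) not in ("", "."):
        return round(float(match.group(1)), 3)

    return 0.0  # mirrors the final 0 in COALESCE


def summarize(rows):
    """Group (minor_mime, metadata) rows into {mime: (num_files, total_seconds)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for minor_mime, metadata in rows:
        seconds = audio_length_seconds(minor_mime, metadata)
        if seconds < ONE_DAY_SECONDS:  # exclude outliers / bad metadata
            totals[minor_mime][0] += 1
            totals[minor_mime][1] += seconds
    return {mime: (count, total) for mime, (count, total) in totals.items()}


if __name__ == "__main__":
    # Hypothetical sample rows, one per format and one outlier.
    sample = [
        ("ogg", '{"data": {"length": 12.3456}}'),
        ("ogg", 'a:2:{s:6:"length";d:7.5;s:4:"rate";i:44100;}'),
        ("mpeg", '{"data": {"playtime_seconds": 100000}}'),
        ("mpeg", 'a:1:{s:16:"playtime_seconds";d:3.25;}'),
    ]
    print(summarize(sample))
```

Unlike the SQL pattern, the Python pattern is a raw string, so `\.` is guaranteed to match a literal decimal point; how the backslash is treated inside the SQL string literal depends on the server's SQL mode.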
All SQL code is licensed under the CC0 License.