FMA: A 106K-Track Dataset for MIR Tasks
The FMA dataset offers 106,574 CC-licensed tracks from the Free Music Archive, with metadata, precomputed features, and audio subsets for MIR tasks such as genre recognition across 161 genres.
Dataset Structure Enables Scalable MIR Experiments
FMA compiles 106,574 tracks (343 days of audio, 917 GiB in total) from 16,341 artists across 14,854 albums, organized in a 161-genre hierarchy. Metadata in tracks.csv covers ID, title, artist, genres, tags, and play counts; genres.csv defines the genre hierarchy; features.csv holds librosa-extracted acoustic features; echonest.csv adds EchoNest (now Spotify) metrics for 13,129 tracks. Audio ships in four subsets: fma_small (8,000 30-second clips, 8 balanced genres, 7.2 GiB), fma_medium (25,000 clips, 16 genres, 22 GiB), fma_large (106,574 clips, 161 genres, 93 GiB), and fma_full (untrimmed tracks, 879 GiB). Train/validation/test splits are proposed in the paper; verify downloads against SHA1 checksums such as f0df49ffe5f2a6008d7dc83c6915b31835dfe733 for metadata.zip.
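Checksum verification amounts to hashing the downloaded archive and comparing against the published digest. A minimal sketch using only the standard library (the file path and expected digest are the user's own):

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Stream a file in chunks and return its SHA1 hex digest."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


# Usage against a downloaded archive (hypothetical local path):
# assert sha1_of_file("metadata.zip") == "f0df49ffe5f2a6008d7dc83c6915b31835dfe733"
```

Streaming in chunks keeps memory flat, which matters for the 93 GiB and 879 GiB audio archives.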
Extract Features and Run Genre Baselines
Use features.py to compute the spectral and temporal features from raw audio that match features.csv. baselines.ipynb trains genre classifiers: MFCCs yield 0.45 accuracy on the small subset (8 genres), the full acoustic feature set reaches 0.55, and EchoNest features reach 0.60. Scale to the full dataset for end-to-end learning. analysis.ipynb generates statistics and figures; webapi.ipynb queries the FMA API for updates; creation.py scrapes and processes the originals.
Quickstart Reproducible Workflow
Clone the repo, create a conda/mamba environment with Python 3.6+, and run pip install -r requirements.txt (resampy workaround: pip install cython before resampy). Set AUDIO_DIR in .env to the path of the decompressed audio. Run usage.ipynb to load the CSVs and train models; Binder launches it instantly. Code is MIT-licensed and metadata CC BY 4.0; cite the ISMIR 2017 paper. The repo has 2.6k stars, 456 forks, and 100+ citing papers (e.g., zero-shot classification, graph NNs), plus challenges like the WWW 2018 genre contest.
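Once AUDIO_DIR points at a decompressed subset, each track's MP3 is located by its zero-padded 6-digit ID, grouped in folders named after the ID's first three digits. A small helper sketching that layout (assuming the directory convention used by the FMA subsets; the repo's utils module provides its own equivalent):

```python
import os


def get_audio_path(audio_dir, track_id):
    """Build the path to a track's MP3 under an FMA subset directory.

    Assumes the FMA layout: files named by zero-padded 6-digit track ID,
    grouped into folders named after the ID's first three digits.
    """
    tid = f"{track_id:06d}"
    return os.path.join(audio_dir, tid[:3], tid + ".mp3")


# e.g. track 2 under fma_small -> fma_small/000/000002.mp3
print(get_audio_path("fma_small", 2))
```

This bucketing keeps any single directory to at most 1,000 files, which stays manageable even for the 106K-track fma_large subset.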