This page has links to the MusicNet dataset and Python tutorials showing how to load and use the dataset.
Direct download links to the MusicNet dataset are available below. MusicNet is available in two formats: native Python 2.7 and HDF5. The Python and HDF5 releases contain the same data; the Python version takes advantage of native Python data structures for ease of use. The Python dataset is distributed as a NumPy npz file. This format has three dependencies:
For non-Python users, we distribute MusicNet in the HDF5 format. To use the HDF5 release, you will need an HDF5 library for your language of choice. The data is organized into 330 groups, one for each recording, named "/id_<MusicNet ID>". Each group contains a "data" dataset (a CArray containing the audio signal) and a "labels" dataset (a Table of labels).
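As a rough illustration of the layout described above, the sketch below reads one recording from the HDF5 release. It uses Python with the h5py library purely as an example of an HDF5 reader. The file name musicnet.h5 and the helper names list_recording_ids and read_recording are assumptions made for this example, not part of the release. The columns of the "labels" table are not listed on this page, so the code returns the table as-is and leaves inspecting its fields to the caller.

```python
import h5py


def list_recording_ids(path):
    """Return the MusicNet ids found as top-level 'id_<MusicNet ID>' groups."""
    prefix = "id_"
    with h5py.File(path, "r") as f:
        return sorted(name[len(prefix):] for name in f.keys()
                      if name.startswith(prefix))


def read_recording(path, musicnet_id):
    """Read the audio signal and the label table for one recording."""
    with h5py.File(path, "r") as f:
        group = f["id_{}".format(musicnet_id)]
        audio = group["data"][...]     # audio signal as a NumPy array
        labels = group["labels"][...]  # structured array, one row per label
    return audio, labels


# Hypothetical usage, assuming the release was saved as "musicnet.h5":
# ids = list_recording_ids("musicnet.h5")
# audio, labels = read_recording("musicnet.h5", ids[0])
# print(audio.shape, labels.dtype.names)
```

Printing labels.dtype.names is a simple way to discover the label fields without relying on names that are not documented here.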
We also provide metadata for the recordings in MusicNet, distributed in CSV format. The id column of the metadata file is cross-indexed with the MusicNet ids in the data files.
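The cross-indexing can be sketched with Python's standard csv module. This is a minimal example under stated assumptions: it assumes Python 3 and a UTF-8 encoded file, and the helper names load_metadata and metadata_for_group are hypothetical. The only column it relies on is id, which is the one named above.

```python
import csv


def load_metadata(csv_path):
    """Map each MusicNet id (kept as a string) to its metadata row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return {row["id"]: row for row in csv.DictReader(f)}


def metadata_for_group(metadata, group_name):
    """Look up metadata by an HDF5 group name ('id_<MusicNet ID>') or a bare id.

    Returns None when the id has no row in the metadata file.
    """
    prefix = "id_"
    if group_name.startswith(prefix):
        musicnet_id = group_name[len(prefix):]
    else:
        musicnet_id = group_name
    return metadata.get(musicnet_id)
```

Ids are kept as strings so they can be compared directly with the "id_<MusicNet ID>" group names. Convert them to integers if your data keys are numeric.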
Here are some tutorials for getting started with MusicNet. You can browse each tutorial using the HTML viewer, or download the notebook and run it yourself using Jupyter. Some of the tutorials additionally depend on TensorFlow, scikit-learn, and matplotlib.