Getting Started


This page has links to the MusicNet dataset and Python tutorials showing how to load and use the dataset.

Downloading MusicNet

Direct download links to the MusicNet dataset are available below. MusicNet is available in two formats: native Python 2.7 and HDF5. The Python and HDF5 releases contain the same data; the Python version takes advantage of native data structures for ease of use. The Python dataset is distributed as a NumPy npz file. This format has three dependencies:

  • Python 2.7 - This version of MusicNet is distributed as a Python object.
  • NumPy - The MusicNet features are stored in NumPy arrays.
  • intervaltree - The MusicNet labels are stored in an IntervalTree.
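
As a rough sketch of how the Python release might be opened: the snippet below assumes that each archive entry is keyed by the recording's MusicNet ID as a string, and that each entry unpacks into an audio array and a label object. The function name `load_recording` and the file name `musicnet.npz` are hypothetical, and none of these layout details are stated above, so check them against the tutorials. Because the release was pickled under Python 2.7, reading it from Python 3 is assumed to need `allow_pickle=True` and `encoding="latin1"`.

```python
import numpy as np


def load_recording(npz_path, recording_id):
    """Return (audio, labels) for one recording in an npz archive.

    Assumption: each entry is a 2-element object array holding the audio
    signal first and its labels second. Unpickling the real labels requires
    the intervaltree package to be installed.
    """
    with np.load(npz_path, allow_pickle=True, encoding="latin1") as archive:
        key = str(recording_id)
        if key not in archive.files:
            raise KeyError("no recording with id {}".format(key))
        audio, labels = archive[key]
    return audio, labels
```

A call such as `audio, labels = load_recording("musicnet.npz", some_id)` would then give one recording at a time, without unpacking the other entries of the archive.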

For non-Python users, we distribute MusicNet in the HDF5 format. For the HDF5 release, you will need to download an HDF5 parser for your language of choice. The data is organized into 330 groups, one for each song, under headings "/id_<MusicNet ID>". Each group contains a "data" dataset (a CArray containing the audio signal) and a "labels" dataset (a Table of labels).

  • HDF5 - This is the official webpage for HDF5.
  • Parsers - Wikipedia maintains an extensive list of HDF5 interfaces for various languages.
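
For readers who do use Python but prefer the HDF5 release, the group layout described above could be read roughly as follows. This sketch uses the `h5py` package, which is an assumption on our part (any HDF5 interface should work), and the function names are hypothetical. It makes no assumption about the column names of the "labels" table and simply returns it as a NumPy structured array.

```python
import h5py


def read_recording(h5_path, musicnet_id):
    """Return (audio, labels) stored under the group /id_<musicnet_id>."""
    with h5py.File(h5_path, "r") as f:
        group = f["id_{}".format(musicnet_id)]
        audio = group["data"][:]     # the audio signal as a NumPy array
        labels = group["labels"][:]  # the label table as a structured array
    return audio, labels


def list_recording_ids(h5_path):
    """Return the MusicNet IDs found in the file, as sorted strings."""
    prefix = "id_"
    with h5py.File(h5_path, "r") as f:
        return sorted(name[len(prefix):] for name in f.keys()
                      if name.startswith(prefix))
```

Inspecting `labels.dtype.names` on the returned table is one way to discover which label columns are present.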

We also provide metadata for recordings in MusicNet. This metadata is distributed in csv format. The id column of the metadata file is cross-indexed with the MusicNet ids in the data files.
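
The cross-indexing can be sketched with the standard library alone. The snippet below relies only on the `id` column mentioned above; the function name `load_metadata` is hypothetical, and every other column is passed through untouched because its name is not specified here.

```python
import csv


def load_metadata(csv_path):
    """Map each MusicNet id (as a string) to its full metadata row."""
    with open(csv_path, newline="") as handle:
        return {row["id"]: row for row in csv.DictReader(handle)}
```

The resulting keys can be matched against the IDs used in the data files, for example the `<MusicNet ID>` part of an HDF5 group name.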

Download Links
  • MusicNet metadata - High level information about recordings in MusicNet.
  • MusicNet (Python) - NumPy distribution of MusicNet (11GB). Updated 12/1/16: The earlier version mistakenly omitted the instrument and metrical labels.
  • MusicNet (HDF5) - HDF5 distribution of MusicNet (7.1GB).


Here are some tutorials for getting started with MusicNet. You can browse each tutorial in the HTML viewer, or download the notebook and run it yourself using Jupyter. Some of the tutorials also depend on TensorFlow, scikit-learn, and matplotlib.

Each tutorial is available both as a Jupyter Notebook and in the HTML Viewer:
  • Introduction
  • Spectrograms
  • Linear Model
  • Multi-Layer Perceptron