MuseNet is a deep neural network created by OpenAI that can generate 4-minute musical compositions with up to 10 different instruments. It can combine styles as different as country, Mozart, and the Beatles. It is based on the same general-purpose unsupervised technology as GPT-2, a large-scale transformer model trained to predict the next token in a sequence, whether audio or text.
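The model is trained on sequential data: given a sequence of notes, it is asked to predict the note that comes next. OpenAI experimented with several ways of encoding MIDI files as tokens, including a chordwise encoding, which considers every combination of notes sounding at one time as an individual ‘chord’ and assigns a token to each chord. The encoding it settled on combines pitch, volume, and instrument information into a single token.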
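The next-note training objective can be illustrated with a minimal sketch. The token IDs, the function name next_token_pairs, and the small context length are assumptions made for illustration, not details of MuseNet's actual training code.

```python
from typing import List, Tuple


def next_token_pairs(tokens: List[int], context_len: int) -> List[Tuple[List[int], int]]:
    """Split a token sequence into (context, target) pairs.

    Each target is the token that follows its context, which is the
    quantity a next-token model is trained to predict.
    """
    pairs = []
    for i in range(1, len(tokens)):
        start = max(0, i - context_len)  # keep at most context_len tokens
        pairs.append((tokens[start:i], tokens[i]))
    return pairs


# Hypothetical token IDs standing in for encoded notes.
sequence = [60, 64, 67, 72, 67]
pairs = next_token_pairs(sequence, context_len=3)
for context, target in pairs:
    print(context, "->", target)
```

Every position in a piece yields one training example, so a single MIDI file contributes many prediction targets.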
In addition, composer and instrumentation tokens give more control over the kinds of samples MuseNet generates. The model can generate music that blends different styles and instruments while still remembering long-term structure in a piece. It was trained on MIDI files collected from various sources, including Classical Archives and BitMidi, as well as the MAESTRO dataset.
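One way to picture composer and instrumentation tokens is as extra tokens placed in front of the note sequence, so that every later prediction can take them into account. The sketch below assumes a made-up token format ("<composer:...>", "<instrument:...>") and a hypothetical helper build_prompt; the real token vocabulary is not described in this text.

```python
from typing import List


def build_prompt(composer: str, instruments: List[str], notes: List[str]) -> List[str]:
    """Prepend conditioning tokens to a note sequence.

    The token spellings are invented for this example.
    """
    conditioning = [f"<composer:{composer}>"]
    conditioning += [f"<instrument:{name}>" for name in instruments]
    return conditioning + list(notes)


# Hypothetical note tokens; the names carry no special meaning.
prompt = build_prompt("chopin", ["piano", "drums"], ["note_60", "note_64"])
print(prompt)
```

Because the conditioning tokens sit at the start of the sequence, changing them changes the context for every generated note without altering the notes already supplied.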
More details about MuseNet
How does MuseNet remember long-term structure in music?
MuseNet remembers long-term structure in music by building on the Sparse Transformer. This lets it attend over a context of 4,096 tokens, so it can recall and repeat structures that span long stretches of a piece.
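The role of the context size can be sketched with a toy generation loop in which the model only ever sees the most recent tokens. The names generate, predict_next, and echo_oldest are hypothetical, and the stand-in predictor is not a real model; only the 4,096 figure comes from the text above.

```python
from typing import Callable, List

CONTEXT_LEN = 4096  # context size quoted in the text


def generate(
    prompt: List[int],
    predict_next: Callable[[List[int]], int],
    n_new: int,
    context_len: int = CONTEXT_LEN,
) -> List[int]:
    """Append n_new tokens, showing the predictor only the last context_len tokens."""
    tokens = list(prompt)
    for _ in range(n_new):
        window = tokens[-context_len:]  # anything older is invisible to the model
        tokens.append(predict_next(window))
    return tokens


def echo_oldest(window: List[int]) -> int:
    """Stand-in for a model: repeat the oldest token still in view."""
    return window[0]


# A tiny window makes the effect easy to see.
result = generate([1, 2, 3], echo_oldest, n_new=3, context_len=2)
print(result)
```

With a window of two tokens, the first token of the prompt never influences the output; a larger window is what allows material from much earlier in a piece to be picked up again.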
What are some limitations of MuseNet?
MuseNet has a few limitations, including unpredictable instrument selection and difficulty with odd pairings of styles and instruments. The instruments a user asks for act as strong suggestions rather than requirements: each note is sampled from a probability distribution, so there is always a chance the model picks an instrument that was not requested. It also struggles with unusual pairings, such as a Chopin piece with bass and drums, because it leans towards the instruments typical of a given genre or style.
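Why a requested instrument is not guaranteed can be shown with a small sampling sketch. It assumes, purely for illustration, that a request acts like a fixed boost added to that instrument's score before a softmax; the function instrument_probabilities, the boost value, and the scores are all hypothetical and do not describe how MuseNet computes its probabilities.

```python
import math
import random
from typing import Dict, Set


def instrument_probabilities(
    scores: Dict[str, float], requested: Set[str], boost: float = 2.0
) -> Dict[str, float]:
    """Softmax over instrument scores, with requested instruments boosted.

    The boost is an illustrative assumption, not MuseNet's mechanism.
    """
    adjusted = {
        name: value + (boost if name in requested else 0.0)
        for name, value in scores.items()
    }
    peak = max(adjusted.values())  # subtract the max for numerical stability
    weights = {name: math.exp(value - peak) for name, value in adjusted.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Made-up scores: the style favours piano, but the user asks for drums.
probs = instrument_probabilities(
    {"piano": 2.0, "drums": 0.5, "bass": 0.0}, requested={"drums"}
)
choice = random.choices(list(probs), weights=list(probs.values()), k=1)[0]
print(probs, choice)
```

The boost makes drums more likely than they would otherwise be, yet piano and bass keep a nonzero probability, so any single sample may still land on an instrument the user did not request.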
How does MuseNet blend different musical styles?
MuseNet blends different musical styles by generating notes sequentially while conditioned on a mix of style-specific tokens. For example, the first few notes could be in the style of Mozart and the notes that follow could be in the style of the Beatles. Because each new note is predicted from everything that came before it, MuseNet can move between these styles smoothly, creating a unique blend.
What is MuseNet?
MuseNet is a deep neural network developed by OpenAI that can generate four-minute musical compositions involving up to ten different instruments. It can compose music in a variety of styles, including country, classical in the manner of Mozart, and pop in the manner of the Beatles. It was not explicitly programmed with an understanding of music; instead, it learned patterns of harmony, rhythm, and style by predicting the next token in hundreds of thousands of MIDI files.