Deep Learning Assisted Music Composition

Ranjbar, Sorush

Much headway has been made over the last 70 years in the field of deep learning, opening new doors for music composers. This thesis reviews the potential of deep learning models as assistive tools rather than end-to-end music composers. Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Deep Convolutional Generative Adversarial Network (DC-GAN) models are trained on symbolic music data with the goal of producing segments of interesting music. The resulting musical motifs are compared qualitatively to identify which model could best help mitigate creative blocks. The model that generates the most interesting harmonic content is chosen as the starting point for a progressive rock composition. All three models are trained on Musical Instrument Digital Interface (MIDI) files, a format that represents musical notes digitally as integers. The main challenge arose during training, where the recurrent models overfit. This challenge can be overcome by reducing model complexity and ensuring that the training and validation data are drawn from the same pool. Despite overfitting, the LSTM output did not mimic the training data and contained many unique phrases. The DC-GAN-generated sample inspired the creation of a 2-minute song in less than 20 minutes. Using deep learning tools to inspire original music composition is achievable and can help remedy the temporary creative blocks that musicians face.
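
To illustrate what representing notes as integers means in practice, the following minimal Python sketch maps MIDI note numbers (0 to 127) to note names and frequencies. It is not taken from the thesis: the helper names `midi_to_name` and `midi_to_frequency` and the example motif are hypothetical, and it assumes the common convention that note 60 is middle C (written C4) and note 69 is A4 tuned to 440 Hz. Some manufacturers label note 60 as C3 instead.

```python
# Illustrative sketch only; names and the example motif are assumptions,
# not part of the thesis.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_name(note: int) -> str:
    """Return a note name such as 'C4' for a MIDI note number (0-127)."""
    if not 0 <= note <= 127:
        raise ValueError(f"MIDI note number out of range: {note}")
    octave = note // 12 - 1  # assumes the convention where note 60 is C4
    return f"{NOTE_NAMES[note % 12]}{octave}"


def midi_to_frequency(note: int, a4_hz: float = 440.0) -> float:
    """Return the equal-temperament frequency in Hz, with note 69 as A4."""
    return a4_hz * 2 ** ((note - 69) / 12)


# A short motif expressed as integers, the kind of sequence a model could
# be trained on once the notes are extracted from a MIDI file.
motif = [60, 62, 64, 67]
names = [midi_to_name(n) for n in motif]
frequencies = [round(midi_to_frequency(n), 2) for n in motif]
```

Because each note is a plain integer, a melody becomes a sequence of integers, which is the kind of input that sequence models such as LSTM and GRU networks consume.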

This item is available only to currently enrolled UTSA students, faculty or staff.
Deep Learning, Generative Adversarial Networks, Long Short Term Memory, Machine Learning, Music Information Technology
Electrical and Computer Engineering