Hanslip, Mark ORCID: https://orcid.org/0000-0003-2106-5375 (2023) Applications of Discriminative, Generative and Predictive Deep Learning Processes to Solo Saxophone Practice. PhD thesis, University of York.
Abstract
Modelling of audio data through deep learning provides a means of creating novel sounds, processes, ideas and tools for musical creativity, yet its actual usefulness is relatively under-explored. Only a handful of researcher-practitioners are using AI models in their musical works, and artistic research into applications of deep learning modelling to instrumental practice and improvisation currently occupies an even smaller niche.
The research presented in this thesis and accompanying portfolio is an examination of potential creative applications of statistical modelling of audio data, through deep learning processes, to instrumental music practice; these processes are classification of a live input, generation of raw audio samples and sequential prediction of pitch. The goal of this work is, through the development of processes and creation of musical works, to generate knowledge concerning the practicality of modelling the systematic aspects of an instrumental improvised practice, the creative usefulness of such models to the practitioner, and the musical and technical ‘behaviours’ of specific classes of deep learning architecture with respect to the data on which the models are trained.
These concerns are addressed through a practice-based research methodology consisting of multiple steps: recording original audio datasets; pre-processing audio data as appropriate to model architecture and task; training statistical models; artistic experimentation and development of software, resulting in novel processes for musical creativity; and creation of artistic outputs, resulting in a portfolio of recordings and notated scores.
This project finds that deep learning can play useful roles in both technical and creative musical processes: classification can not only form the basis of interactive systems for improvisation but also be suggestive of new compositional structures; outputs of generative models of raw audio not only return valuable information about the training data but also generate useful source material for technical instrumental practice, improvisation and composition; notated outputs from symbolic-domain predictive models can also be richly suggestive of compositional ideas and structures for electroacoustic improvisation. The rich diversity of applications found positions AI as a creative assistant, a teacher and a deeply personalised tool for the instrumental practitioner.
When considering the utility of this work to others, there will be specific variances not covered by this project: appropriate choices of data representations, data pre-processing techniques, model architectures and their training parameters will vary according to task, instrument, genre and taste, as will, of course, the character of others’ creative outputs. However, the abundance of affordances and future directions this work uncovers gives confidence in its utility for other instrumental practitioners and researchers.
Given the pace of ongoing development of deep learning methods for modelling of audio and their still-limited adoption by creative practitioners, I hope that this thesis will motivate further explorations of the unique creative potential of these technologies by instrumental practitioners, improvisers and practice-based researchers in the wider field of AI for musical creativity.
Metadata
Supervisors: | Reuben, Federico |
---|---|
Keywords: | music;improvisation;composition;instrumental practice;deep learning;machine learning;data science;computational creativity;NIME |
Awarding institution: | University of York |
Academic Units: | The University of York > School of Arts and Creative Technologies (York) |
Depositing User: | Mr Mark Hanslip |
Date Deposited: | 27 Jun 2024 14:14 |
Last Modified: | 27 Jun 2024 14:14 |
Open Archives Initiative ID (OAI ID): | oai:etheses.whiterose.ac.uk:35184 |
Downloads
Examined Thesis (PDF)
Filename: Hanslip_205034757_CorrectedThesisFinal.pdf
Licence:
This work is licensed under a Creative Commons Attribution NonCommercial NoDerivatives 4.0 International License
Supplementary Material
Filename: Portfolio.zip
Description: Creative Portfolio
Licence:
This work is licensed under a Creative Commons Attribution NonCommercial NoDerivatives 4.0 International License
Supplementary Material
Filename: Code.zip
Description: Code
Licence:
This work is licensed under a Creative Commons Attribution NonCommercial NoDerivatives 4.0 International License
Supplementary Material
Filename: Datasets.zip
Description: Raw Audio Datasets
Licence:
This work is licensed under a Creative Commons Attribution NonCommercial NoDerivatives 4.0 International License