Finding temporal structure in music: blues improvisation with LSTM recurrent networks
Abstract
We consider the problem of extracting essential ingredients of music signals, such as a well-defined global temporal structure in the form of nested periodicities (or meter). We investigate whether we can construct an adaptive signal processing device that learns by example how to generate new instances of a given musical style. Because recurrent neural networks (RNNs) can, in principle, learn the temporal structure of a signal, they are good candidates for such a task. Unfortunately, music composed by standard RNNs often lacks global coherence. The reason for this failure seems to be that RNNs cannot keep track of temporally distant events that indicate global music structure. Long short-term memory (LSTM) has succeeded in similar domains where other RNNs have failed, such as timing and counting and the learning of context-sensitive languages. We show that LSTM is also a good mechanism for learning to compose music. We present experimental results showing that LSTM successfully learns a form of blues music and is able to compose novel (and we believe pleasing) melodies in that style. Remarkably, once the network has found the relevant structure, it does not drift from it: LSTM is able to play the blues with good timing and proper structure as long as one is willing to listen.
Keywords
#Intelligent networks #Multiple signal classification #Recurrent neural networks #Timing #Adaptive signal processing #Signal generators #Signal processing #Machine learning #Bars #Learning systems