Publications

Conference Papers and Journal Articles

  • Florian Stimberg, Alex Narest, Alessio Bazzica, Lennart Kolmodin, Pablo Barrera González, Olga Sharonova, Henrik Lundin, Thomas C. Walters (2020) “WaveNetEQ - Packet Loss Concealment with WaveRNN,” IEEE Asilomar Conference on Signals, Systems, and Computers 1-5 November 2020.
  • Archit Gupta, Brendan Shillingford, Yannis Assael, Thomas C. Walters (2019) “Speech bandwidth extension with WaveNet,” IEEE WASPAA 2019. arXiv:1907.04927
  • Cristina Garbacea, Aaron van den Oord, Yazhe Li, Felicia Lim, Alejandro Luebs, Oriol Vinyals, Thomas C. Walters (2019) “Low Bit-rate Speech Coding with VQ-VAE and a WaveNet Decoder,” ICASSP 2019.
  • Aaron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis C. Cobo, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alex Graves, Helen King, Tom Walters, Dan Belov, Demis Hassabis (2018). “Parallel WaveNet: Fast High-Fidelity Speech Synthesis,” Proceedings of the 35th International Conference on Machine Learning, PMLR 80:3918-3926.
  • W. Bastiaan Kleijn, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Florian Stimberg, Quan Wang, Thomas C. Walters (2018). “Wavenet based low rate speech coding,” ICASSP 2018. arXiv:1712.01120
  • Thomas C. Walters, David A. Ross, Richard F. Lyon (2013). “The Intervalgram: An Audio Feature for Large-Scale Cover-Song Recognition”, From Sounds to Music and Emotions: 9th International Symposium, CMMR 2012, London, UK, June 19-22, 2012, Revised Selected Papers, Springer Berlin Heidelberg, pp. 197-213.
  • Roy D. Patterson, Timothy Ives, Thomas C. Walters, Richard F. Lyon (2012). “Modelling the Distortion Produced by Cochlear Compression.” 16th International Symposium on Hearing
  • Richard F. Lyon, Martin Rehn, Samy Bengio, Thomas C. Walters and Gal Chechik (2010). “Sound retrieval and ranking using auditory sparse-code representations,” Neural Computation, 22, 2390-2416.
  • Roy D. Patterson, Thomas C. Walters, Jessica J. M. Monaghan, Etienne Gaudrain (2010). “Reviewing the Definition of Timbre as It Pertains to the Perception of Speech and Musical Sounds,” In: The Neurophysiological Bases of Auditory Perception; proceedings of the 15th International Symposium on Hearing, edited by Lopez-Poveda E.A., Palmer A.R., Meddis R. (Springer, New York) pp. 223–233.
  • Roy D. Patterson, Thomas C. Walters, Jessica Monaghan, Christian Feldbauer, Toshio Irino (2010). “Auditory Speech Processing for Scale-Shift Covariance and its Evaluation in Automatic Speech Recognition”, IEEE International Symposium on Circuits and Systems, Paris, France. DOI:10.1109/ISCAS.2010.5537725
  • Martin Rehn, Richard F. Lyon, Samy Bengio, Thomas C. Walters and Gal Chechik (2009). “Sound Ranking Using Auditory Sparse-Code Representations”. International Conference on Machine Learning 2009, Workshop: Sparse Methods for Music Audio, Montréal, Canada
  • Richard E. Turner, Thomas C. Walters, Jessica J. M. Monaghan, and Roy D. Patterson (2009). “A statistical formant-pattern model for segregating vowel type and vocal-tract length in developmental formant data,” Journal of the Acoustical Society of America 125(4), 2374-2386
  • Thomas C. Walters, Phil A. Gomersall, Richard E. Turner and Roy D. Patterson (2008). “Comparison of relative and absolute judgments of speaker size based on vowel sounds,” Proceedings of Meetings on Acoustics 1, 050003
  • Roy D. Patterson, Thomas C. Walters and Jessica J.M. Monaghan (2008). “Revising the definition of timbre to make it useful for speech and musical sounds”. British Society of Audiology short papers meeting, York, UK
  • Jessica J. M. Monaghan, Christian Feldbauer, Thomas C. Walters and Roy D. Patterson (2008). “Low-Dimensional, Auditory Feature Vectors that Improve VTL Normalization in Automatic Speech Recognition”. Journal of the Acoustical Society of America 123, 3066
  • Toshio Irino, Tom C. Walters and Roy D. Patterson (2007). “A computational auditory model with a nonlinear cochlea and acoustic scale normalization,” 19th International Congress on Acoustics, Madrid
  • David R. R. Smith, Thomas C. Walters and Roy D. Patterson (2007). “Discrimination of speaker sex and size when glottal-pulse rate and vocal-tract length are controlled,” Journal of the Acoustical Society of America 122(6), 3628-3639
  • David R. R. Smith, Thomas C. Walters and Roy D. Patterson (2007). “Judging sex and age: Effect of glottal-pulse rate, vocal-tract length and original talker,” 19th International Congress on Acoustics, Madrid
  • Thomas C. Walters, Phil A. Gomersall, Richard Turner and Roy D. Patterson (2007). “Comparison of relative and absolute judgments of speaker size based on vowel sounds,” The Journal of the Acoustical Society of America 121(5), 3119
  • Roy D. Patterson, Thomas C. Walters and Toshio Irino (2005). “Extracting a carrier-independent version of the syllabic message: The principles,” The Journal of the Acoustical Society of America 117(4), 2373

Patents (applied for and/or granted)

Book Chapters

  • Steven R. Ness, Thomas Walters and Richard F. Lyon (2011). “Auditory Sparse Coding,” In: Music Data Mining, CRC Press/Chapman Hall.
  • Roy D. Patterson, Etienne Gaudrain, and Thomas C. Walters (2010). “The perception of family and register in musical tones,” In: Music Perception. Jones, M. R., Popper, A. N., and Fay, R. R. (Eds). Springer-Verlag, New York. DOI:10.1007/978-1-4419-6114-3_2
  • Roy D. Patterson, David R. R. Smith, Ralph van Dinther and Thomas C. Walters (2007). “Size Information in the Production and Perception of Communication Sounds,” In: Auditory Perception of Sound Sources. Yost, W. A., Popper, A. N., and Fay, R. R. (Eds). Springer Science+Business Media, LLC, New York.

PhD Thesis