Natural Language Speech and Audio Processing

Domain
Natural Language Speech and Audio Processing
Domain - extra
machine learning, signal processing
Year
2010
Starting
autumn 2010
Status
Open
Subject
Automatic indexing of singing voice
Thesis advisor
ADDA-DECKER Martine
Co-advisors
Jean-Luc Rouas, LIMSI/CNRS
Laboratory
EXT
Collaborations
CREM (Centre de Recherche d'Ethnomusicologie),
LAM (Lutheries - Acoustique - Musique), Institut Jean Le Rond d'Alembert, Paris
Abstract
The automatic indexing of multimedia documents on the Web raises the question of the status of the singing voice: should such acoustic segments be labelled as music, as speech, or more specifically as singing voice? If they can be automatically identified as singing voice, is it possible to automatically identify the language of the lyrics, or to transcribe some of the lyrics?
The proposed subject aims to adapt the methods and techniques available for processing speech and music to the singing voice. Two separate investigations are proposed. The first uses a corpus of singing voice from a high-quality classical repertoire, in order to improve the characterisation of the singing voice with respect to the spoken voice and to propose methods for the automatic (language) identification of the singing voice.
The second investigation will use a large collection of ethnomusicological recordings that mix speech and singing voice.
Context
Automatic indexing of singing voice is a relatively new research topic. The historical ethnomusicological collection provides a unique testbed for a wide range of experimental work and for benchmarking new methods and approaches.
Objectives
The thesis aims to contribute to our knowledge of the spoken and singing voice, to the characterisation of different singing techniques, and to their improved automatic processing. Automatic indexing of speech/song/music audio documents enables new applications and uses of archives as well as of the Web.
Work program
Singing-voice-only audio corpus:
  • acoustic characterisation of sung phonemes as opposed to spoken phonemes in different languages
  • model estimation and automatic language identification
Ethnomusicological recordings:
  • voice/music/noise classification
  • voice: speech/singing classification
  • model estimation and automatic language identification
Extra information
Prerequisite
Details
Expected funding
Research contract
Status of funding
Expected
Candidates
User
martine.adda-decker
Created
Friday 26 February 2010 19:12:54 CET
Last modified
Friday 26 February 2010 19:12:54 CET

Attached files

filename | created | hits | filesize
SujetTheseJLR.pdf | 26 Feb 2010 19:13 | 35734.87 Kb


Ecole Doctorale Informatique Paris-Sud


Director
Nicole Bidoit
Assistant
Stéphanie Druetta
Thesis counsellor
Dominique Gouyou-Beauchamps

ED 427 - Université Paris-Sud
UFR Sciences Orsay
Building 650 - north wing - 417
Tel: 01 69 15 63 19
Fax: 01 69 15 63 87
Email: ed-info at lri.fr