Conference Paper (published)
Details
Citation
Abel A, Marxer R, Hussain A, Barker J, Watt R, Whitmer B & Derleth P (2016) A Data Driven Approach to Audiovisual Speech Mapping. In: Liu C, Hussain A, Luo B, Tan K, Zeng Y & Zhang Z (eds.) Advances in Brain Inspired Cognitive Systems. Lecture Notes in Computer Science, 10023. BICS 2016: International Conference on Brain Inspired Cognitive Systems, Beijing, China, 28.11.2016-30.11.2016. Cham, Switzerland: Springer, pp. 331-342. https://doi.org/10.1007/978-3-319-49685-6_30
Abstract
Using visual information as part of audio speech processing has attracted significant recent interest. This paper presents a data-driven approach that estimates audio speech acoustics using only temporal visual information, without considering linguistic features such as phonemes and visemes. Audio (log filterbank) and visual (2D-DCT) features are extracted, and various MLP configurations and datasets are evaluated to identify the best-performing setup, showing that a reasonably accurate estimate of the corresponding audio frame can be mapped from a sequence of prior visual frames.
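As a rough illustration of the visual side of the pipeline described in the abstract, the sketch below computes 2D-DCT features for a grayscale frame and stacks a window of prior frames into a single input vector, which is the kind of row an MLP could map to an audio (log filterbank) frame. This is not the authors' implementation: the function names, the `keep` block size, and the `context` window length are all hypothetical choices made for the example.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m


def dct2_features(frame, keep=4):
    """2D-DCT of one grayscale frame (hypothetical helper).

    Keeps only the top-left keep x keep block of low-frequency
    coefficients and flattens it into a feature vector.
    """
    rows = dct_matrix(frame.shape[0])
    cols = dct_matrix(frame.shape[1])
    coeffs = rows @ frame @ cols.T
    return coeffs[:keep, :keep].ravel()


def stack_prior_frames(features, context):
    """Concatenate each frame with its context-1 predecessors.

    features has shape (n_frames, dim); row j of the result holds
    frames j .. j+context-1, so it can be paired with the audio
    frame at time j+context-1 (assumed alignment).
    """
    n_frames = features.shape[0]
    windows = [
        features[t - context + 1 : t + 1].ravel()
        for t in range(context - 1, n_frames)
    ]
    return np.stack(windows)
```

Each row returned by `stack_prior_frames` would serve as one MLP input, with the matching audio feature frame as the regression target. The network configuration itself is left out, since the record does not specify it.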
Keywords
Audiovisual; Speech processing; Speech mapping; ANNs
| Field | Value |
|---|---|
| Status | Published |
| Funders | Engineering and Physical Sciences Research Council |
| Title of series | Lecture Notes in Computer Science |
| Number in series | 10023 |
| Publication date | 31/12/2016 |
| Publication date online | 30/11/2016 |
| URL | http://hdl.handle.net/1893/24710 |
| Publisher | Springer |
| Place of publication | Cham, Switzerland |
| ISSN of series | 0302-9743 |
| ISBN | 978-3-319-49685-6 |
| Conference | BICS 2016: International Conference on Brain Inspired Cognitive Systems |
| Conference location | Beijing, China |
| Dates | 28.11.2016-30.11.2016 |