The MAVA corpus (MARCS Auditory-Visual Australian recordings of IEEE sentences) is a collection of high-quality audiovisual recordings of 205 phonetically balanced sentences from the IEEE sentence database, spoken by a female native talker of Australian English. The audio channel is annotated at the word and phoneme levels. For the video channel, frame-by-frame X and Y coordinates of the lip contour are also provided. The center of the lip region serves as the reference point for deriving four video regions: full face, upper face, lower face and lips. All files are freely available for download under the Creative Commons BY-NC-SA licence.
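
As an illustration of how a lip center could be used as a reference point for cropping a video region, here is a minimal Python sketch. It is not the procedure used to build the corpus: the definition of the center (midpoint of the contour's bounding box), the function names `lip_center` and `crop_box`, and the region and frame sizes are all assumptions made for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def lip_center(contour: List[Point]) -> Point:
    """Midpoint of the bounding box around one frame's lip contour points.

    Assumption: the corpus may define the lip center differently.
    """
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)


def crop_box(center: Point, width: int, height: int,
             frame_width: int, frame_height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a width x height box centered
    on `center`, shifted if needed so that it stays inside the frame."""
    cx, cy = center
    left = int(round(cx - width / 2))
    top = int(round(cy - height / 2))
    # Clamp the top-left corner so the whole box fits in the frame.
    left = max(0, min(left, frame_width - width))
    top = max(0, min(top, frame_height - height))
    return (left, top, left + width, top + height)


# Hypothetical lip contour for a single frame (pixel coordinates).
contour = [(300, 400), (340, 390), (380, 400), (340, 420)]
center = lip_center(contour)

# Hypothetical 120 x 80 "lips" region in a hypothetical 720 x 576 frame.
lips_box = crop_box(center, 120, 80, 720, 576)
print(center, lips_box)
```

The other regions (full face, upper face, lower face) could be sketched the same way by changing the box size or offsetting the center, again with values that would need to be taken from the corpus documentation.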

The MAVA corpus

To cite MAVA, please use:

Aubanel, V., Davis, C., Kim, J. (2017). The MAVA corpus. [online resource]. DOI:

The research leading to these results was partly funded by the Australian Research Council under grant agreement DP130104447. V. A. also acknowledges support from the European Research Council under the European Community’s Seventh Framework Program (FP7/2007-2013), Grant Agreement 339152.