Lip tracking and quick mouth shape extraction
It has been shown that automatic speech recognition systems can benefit from combining audio and visual inputs. In this thesis we therefore propose a system for lip tracking and fast mouth shape extraction, which can be considered a step towards improving speech recognition systems.
Our system starts by extracting the face, the region of interest, from the image background so that all subsequent computations are confined to that region. Locating the mouth is easier when the position of another facial feature is already known, so we next locate the nostrils and use them as the main reference. We then place six landmark points that make lip tracking and mouth shape extraction possible. These landmark points are located using the jumping snake algorithm in conjunction with a data fitting technique. Finally, we track the landmark points and extract the shape of the mouth across a sequence of video frames.
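To illustrate the kind of data fitting step mentioned above, the following minimal sketch fits a parabola to a handful of contour points by least squares. This is an illustration only: the quadratic model, the function name `fit_parabola`, and the sample points are assumptions made for this example and are not taken from the system described here.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def det3(m: List[List[float]]) -> float:
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def fit_parabola(points: List[Point]) -> Tuple[float, float, float]:
    """Least-squares fit of y = a*x^2 + b*x + c (hypothetical lip contour model)."""
    # Sums that appear in the normal equations of the quadratic fit.
    n = float(len(points))
    sx = sum(x for x, _ in points)
    sx2 = sum(x ** 2 for x, _ in points)
    sx3 = sum(x ** 3 for x, _ in points)
    sx4 = sum(x ** 4 for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sx2y = sum(x ** 2 * y for x, y in points)

    # Normal equations: matrix * [a, b, c]^T = rhs.
    matrix = [
        [sx4, sx3, sx2],
        [sx3, sx2, sx],
        [sx2, sx, n],
    ]
    rhs = [sx2y, sxy, sy]

    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("need at least three points with distinct x values")

    # Cramer's rule: replace one column at a time with the right-hand side.
    coeffs = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for row in range(3):
            replaced[row][col] = rhs[row]
        coeffs.append(det3(replaced) / d)
    return coeffs[0], coeffs[1], coeffs[2]


# Made-up points lying on y = 0.5*x^2 - 2*x + 3, standing in for contour samples.
sample_points = [(float(x), 0.5 * x * x - 2.0 * x + 3.0) for x in range(-2, 3)]
a, b, c = fit_parabola(sample_points)
```

A low-order curve of this kind can only capture the general outline of a lip, not fine detail at the lip boundary.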
The proposed system is fast. However, it is suitable only for applications that do not require high precision, because it extracts only the general shape of the mouth, with low accuracy at the boundary between the lips and the surrounding skin.