Deep learning methods for processing endoscopic high-speed video and laryngeal parameter estimation
More about the book
Deep learning methods have had a tremendous impact on computer vision, image processing, and all related fields. This dissertation explores the application of these methods to the enhancement and processing of endoscopic high-speed video (HSV). HSV is one of the main techniques used in voice research, as the small-scale, rapid oscillation of the vocal folds requires sophisticated recording techniques. As voice disorders have been shown to have a tremendous negative impact on the quality of life of those affected and on society in general, a new generation of more objective diagnostic techniques is required. This dissertation features several contributions towards this goal:

- An innovative method to enhance low-light HSV using an improved U-Net convolutional neural network
- A robust and fast deep-learning-based automatic method for the segmentation of the glottis in HSV data
- Development of an improved two-mass model of the vocal folds
- Proof of concept of estimating ex-vivo subglottal pressure validated on experimental data
- Proof of concept of estimating subglottal pressure with a recurrent neural network trained on a numerical model

After a thorough introduction to the field of voice research and deep learning, the dissertation describes the developed methods and results in detail. It reports significant improvements in low-light image enhancement, automatic glottis segmentation, and physical voice parameter inference.