Available online
THESES AND DISSERTATIONS
Iwan, Lukman Hakim
Melbourne, Australia, 2015
Archival performance videos are typically filmed in a single shot, lengthy, affected by camera operation, and drawn from various video formats. To be useful, a video of a whole performance needs to be segmented into discrete acts, each an individual clip within the total performance; however, this is not a simple task because of the characteristics of the video content.
The Circus Oz video collection is an existing performance video archive that comprises over 1,074 videos totaling over 1,000 hours of viewing time. To deliver this collection to users, a prototype of the Circus Oz performance video archive system has been developed, including its system architecture and database schema.
For video segmentation, we identify the specific clues that indicate where a performance video is likely to be segmented: an applause sound detected in combination with one or more other clues, such as black frames and image changes.
We propose an applause detection technique for multiple applause classes. To evaluate its performance, we developed an audio data set, together with applause ground truth data, from a sample of the Circus Oz performance videos. This applause data set contains three applause classes: less clap, more clap, and pure clap.
The proposed applause detection technique uses both characteristic-based and classification-based approaches. Our experiments show that minimum applause strength and minimum applause duration are the two essential threshold values for improving the precision of applause detection with the classification-based approach. For this approach, we also found the optimum combination of several audio features. In our applause classification experiment, we achieved correct classification rates of 83%, 94%, and 100% for quaternary, ternary, and binary classification, respectively.
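The role of the two thresholds can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `Segment` type, the `strength` score, and the threshold values are hypothetical and chosen only for illustration. It assumes a classifier has already produced candidate applause segments, and it discards those that are too weak or too short.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A candidate applause segment (hypothetical representation)."""
    start: float     # start time in seconds
    end: float       # end time in seconds
    strength: float  # assumed classifier score in [0, 1]


def filter_applause(segments, min_strength, min_duration):
    """Keep only segments that meet both the strength and duration thresholds."""
    return [
        s for s in segments
        if s.strength >= min_strength and (s.end - s.start) >= min_duration
    ]


# Illustrative values only; these are not taken from the thesis.
candidates = [
    Segment(10.0, 10.4, 0.9),   # strong but too short
    Segment(55.0, 61.0, 0.3),   # long but too weak
    Segment(120.0, 127.5, 0.8), # passes both thresholds
]
kept = filter_applause(candidates, min_strength=0.5, min_duration=2.0)
```

Raising either threshold removes more false detections, at the risk of also removing short or quiet genuine applause.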
Using the clues we identified, we propose a method for detecting the end of an act that combines applause sound detection, black frame detection, and image comparison. Our experiment shows that the precision and recall of the end-of-act detection method are 49% and 92%, respectively, which makes the task of manual annotators much more productive. [author summary]
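As a rough illustration of how such clues might be combined and evaluated, the following Python sketch marks an end-of-act candidate wherever an applause time falls close to a black frame or an image change, and then computes precision and recall against annotated act boundaries. The function names, the `window` and `tolerance` parameters, and all timestamps are assumptions made for this example; the abstract does not specify how the clues are actually matched.

```python
def detect_end_of_act(applause_times, black_frame_times, image_change_times,
                      window=5.0):
    """Return applause times that have another clue within `window` seconds."""
    other_clues = list(black_frame_times) + list(image_change_times)
    return [
        t for t in applause_times
        if any(abs(t - o) <= window for o in other_clues)
    ]


def precision_recall(detected, truth, tolerance=5.0):
    """Precision: share of detections near a true boundary.
    Recall: share of true boundaries near a detection."""
    hits = sum(1 for d in detected
               if any(abs(d - t) <= tolerance for t in truth))
    found = sum(1 for t in truth
                if any(abs(d - t) <= tolerance for d in detected))
    precision = hits / len(detected) if detected else 0.0
    recall = found / len(truth) if truth else 0.0
    return precision, recall


# Illustrative timestamps in seconds; not data from the thesis.
applause = [100.0, 300.0, 500.0]
black_frames = [102.0]
image_changes = [498.0, 700.0]
detected = detect_end_of_act(applause, black_frames, image_changes)

ground_truth = [101.0, 499.0, 900.0]
precision, recall = precision_recall(detected, ground_truth)
```

With high recall and moderate precision, as reported above, annotators mainly review and reject false candidates rather than search the whole video for missed act boundaries.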