Automated choreographic notation for ballet using computer vision

Margaux Bowditch

Over recent decades, significant advancements in the computer vision field have been driven by factors such as deep learning techniques, hardware acceleration, and the availability of large-scale datasets. The field also continues to extend its reach across application domains, rapidly fuelling the emergence of novel applications. Classical ballet, a captivating art form characterised by its principles of grace, precision, and narrative conveyed through movement, presents an especially intriguing application domain. Yet there remains room for the utilisation of computer vision technology in ballet, particularly in the notation of choreography. Recording ballet choreography in a dance notation format effectively protects choreographers’ original works and accurately preserves the dance heritage of past and future generations. This thesis proposes an approach for the automated notation of ballet choreography using computer vision techniques. A novel video dataset, AnnChor, is presented first to address the need for a high-quality annotated dataset for ballet. A baseline study is conducted to evaluate the dataset for the task of temporal action localisation using Coarse-Fine Networks and TriDet models. A choreographic ontology and a digital bit vector approach are then created as the basis for an appropriate intermediate representation of dance notation for computer vision. Furthermore, a rule-based approach built on pose-estimated data and the developed bit vector representation is used to generate ground-truth digital dance notation data. The digital bit vector representation is inspected for distinct groupings of different actions using t-distributed stochastic neighbour embedding. Finally, all the components of the study are assembled to construct computer vision models for automated choreographic notation.
Accordingly, encoder-decoder-based sequence models for predicting dance notation from pose-estimated data are implemented and evaluated. The results of the benchmark performed on the final sequence models reveal that the study accomplishes the overall aim of automating the notation of ballet choreography using computer vision. An ablation study and the key results show that the models achieve promising performance. The top-performing model achieves low error on metrics including the mean squared error (0.01), mean absolute error (0.02), root mean squared error (0.12) and mean absolute percentage error (2.19 %). Additionally, the top-performing model's predictions demonstrate high correlation with the ground-truth data on metrics including the coefficient of determination (R²) (0.87), Spearman correlation (0.6), Pearson correlation (0.93) and Matthews correlation coefficient (0.93). Further key findings indicate that ballet movements are intricate and that certain positions of the body, which involve subtle differences in foot orientation or high variance in arm positions, may contribute to more error. Future work includes exploring alternative model architectures to improve on the baseline results in light of the error variance revealed in the study. The overall significance of this research lies in the fact that it ventures into unexplored territory, marking a first step in demonstrating the feasibility of fine-grained temporal action localisation and automated notation translation in ballet. This research therefore provides choreographers, choreologists, and dancers with a valuable tool that enables the preservation of their dance heritage and legacy.
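For readers unfamiliar with the four error metrics reported above, the following is a minimal sketch of their standard definitions. It is not the thesis's evaluation code: the function name `error_metrics`, the flat-list inputs, and the choice to skip zero-valued targets when computing the percentage error are all assumptions made for illustration.

```python
import math


def error_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE and MAPE (in percent) for two equal-length sequences.

    Hypothetical helper for illustration only; the thesis may define or
    aggregate these metrics differently.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [p - t for t, p in zip(y_true, y_pred)]
    n = len(errors)

    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(mse)

    # Assumption: MAPE is undefined where the target is zero, so those
    # entries are skipped rather than smoothed with a small constant.
    ratios = [abs(e / t) for t, e in zip(y_true, errors) if t != 0]
    mape = 100.0 * sum(ratios) / len(ratios) if ratios else float("nan")

    return {"mse": mse, "mae": mae, "rmse": rmse, "mape": mape}


if __name__ == "__main__":
    # Made-up numbers, unrelated to the AnnChor dataset.
    print(error_metrics([1.0, 2.0, 4.0], [1.0, 2.5, 3.0]))
```

Because the root mean squared error is the square root of the mean squared error, the two reported values are related: 0.12 squared is 0.0144, which is 0.01 when rounded to two decimal places.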
