January 10, 2025
Group activity recognition: A dataset with detailed annotation and rich semantics
Group activity recognition (GAR), which aims to identify activities performed collectively in videos, has gained significant attention recently. Existing GAR datasets typically annotate only a single group activity (GA) instance per sample, carefully selected from the original videos.
This approach, while precise, diverges significantly from real-world scenarios, which often involve multiple GA instances. Moreover, single word-level annotations are insufficient to capture the complex semantic information in GA, which constrains the expansion and evaluation of other GA-related tasks.
To mitigate these limitations, a research team led by Wang Yun-Hong (Beihang University, China) published their research on 15 December 2024 in Frontiers of Computer Science.
The team proposed FIFAWC, a novel dataset for GAR characterized by three notable distinctions:
- Comprehensive annotation: They thoroughly annotate all included GAs in each sample and retain the original frame count, unlike earlier datasets that focus on a single GA annotation and uniform frame normalization. This increases the dataset's complexity and its practical application potential for advanced research.
- Semantic description: Each clip in FIFAWC is accompanied by a detailed caption from sports commentators, ensuring content accuracy and professionalism. This positions FIFAWC as a data foundation for a variety of tasks, such as video captioning and retrieval.
- New scenario: FIFAWC departs from earlier datasets by featuring soccer match footage. The expansive spatial areas and rapid movements characteristic of soccer introduce new challenges, such as dynamic camera movements and smaller targets in frames, significantly raising the complexity and difficulty of GAR.
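To make the three distinctions concrete, here is a minimal sketch of what one annotated sample could look like. The `Clip` class, its field names, and the example records are hypothetical illustrations, not the dataset's actual file format, which the article does not describe.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Clip:
    """One annotated sample (hypothetical schema, not the real FIFAWC format)."""
    clip_id: str
    num_frames: int  # original frame count, not normalized to a fixed length
    activities: List[str] = field(default_factory=list)  # every GA in the clip
    caption: str = ""  # commentator-style description of the clip

    def is_multi_activity(self) -> bool:
        # True when the clip contains more than one distinct group activity.
        return len(set(self.activities)) > 1


# Illustrative records only: labels, frame counts and captions are made up.
clips = [
    Clip("clip_0001", 187, ["corner", "shot"],
         "A corner is swung in and headed toward goal."),
    Clip("clip_0002", 64, ["pass"],
         "The midfield keeps the ball moving."),
]

# Clips that a single-label dataset could not represent faithfully.
multi = [c.clip_id for c in clips if c.is_multi_activity()]
```

A single-label dataset with uniform frame normalization would reduce each record to one label and a fixed `num_frames`. Keeping a list of activities, a variable length, and a caption is what lets the same clips serve recognition, captioning, and retrieval.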
In the evaluation, they benchmark FIFAWC on two tasks: traditional GAR and innovative GA video captioning. For GAR, they evaluate the classical detector-based approach ARG and the state-of-the-art detector-free DFWSGAR.
The results reveal high accuracy at the class level but low accuracy at the sample level because of the multiple GAs per sample, reflecting the complexity and challenge of FIFAWC. Compared with the strong performance (25.87 in terms of CIDEr) of PDVC on the ActivityNet dataset, the poor performance on FIFAWC indicates that further research is necessary for GA video captioning.
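The gap between class-level and sample-level accuracy can be illustrated with a small sketch. It assumes one common pair of multi-label definitions: class-level accuracy as per-class binary accuracy averaged over classes, and sample-level accuracy as exact match of the full label set. The paper's exact metric definitions may differ, and the labels and predictions below are invented for illustration.

```python
def class_level_accuracy(y_true, y_pred, classes):
    """Mean over classes of per-class binary accuracy (assumed definition)."""
    per_class = []
    for c in classes:
        # A sample counts as correct for class c when presence/absence matches.
        correct = sum((c in t) == (c in p) for t, p in zip(y_true, y_pred))
        per_class.append(correct / len(y_true))
    return sum(per_class) / len(per_class)


def sample_level_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label set matches exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels and predictions for four clips.
classes = ["pass", "shot", "corner", "foul"]
y_true = [{"pass", "shot"}, {"corner"}, {"pass", "foul"}, {"shot"}]
y_pred = [{"pass"}, {"corner"}, {"pass", "foul", "shot"}, {"shot"}]

class_acc = class_level_accuracy(y_true, y_pred, classes)   # 0.875
sample_acc = sample_level_accuracy(y_true, y_pred)          # 0.5
```

In this toy case a single missed or spurious label leaves most per-class decisions correct but makes the whole sample count as wrong, so samples with several GAs pull sample-level accuracy down much faster than class-level accuracy.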
More information: Duoxuan Pei et al, FIFAWC: a dataset with detailed annotation and rich semantics for group activity recognition, Frontiers of Computer Science (2024). DOI: 10.1007/s11704-024-40027-3
Provided by Higher Education Press
Citation: Group activity recognition: A dataset with detailed annotation and rich semantics (2025, January 10), retrieved 10 January 2025 from https://techxplore.com/information/2025-01-group-recognition-dataset-annotation-rich.html