Rui Hou, Zhao Wang
Dec 17, 2021
Skeleton sequences are widely used for action recognition task due to its lightweight and compact characteristics. Recent graph convolutional network (GCN) approaches have achieved great success for skeleton-based action recognition since its grateful modeling ability of non-Euclidean data. GCN is able to utilize the short-range joint dependencies while lack to directly model the distant joints relations that are vital to distinguishing various actions. Thus, many GCN approaches try to employ hierarchical mechanism to aggregate wider-range neighborhood information. We propose a novel self-attention based skeleton-anchor proposal (SAP) module to comprehensively model the internal relations of a human body for motion feature learning. The proposed SAP module aims to explore inherent relationship within human body using a triplet representation via encoding high order angle information rather than the fixed pair-wise bone connection used in the existing hierarchical GCN approaches. A Self-attention based anchor selection method is designed in the proposed SAP module for extracting the root point of encoding angular information. By coupling proposed SAP module with popular spatial-temporal graph neural networks, e.g. MSG3D, it achieves new state-of-the-art accuracy on challenging benchmark datasets. Further ablation study have shown the effectiveness of our proposed SAP module, which is able to obviously improve the performance of many popular skeleton-based action recognition methods.