Gesture of Interest: Gesture Search for Multi-Person, Multi-Perspective TV Footage

Parian, Mahnaz; Walzer, Claire; Rossetto, Luca; Heller, Silvan; Dupont, Stéphane; Schuldt, Heiko

doi:10.1109/CBMI50038.2021.9461887

Article (Scientific journals)

2021 • In International Conference on Content-Based Multimedia Indexing

Peer reviewed

Permalink
https://hdl.handle.net/20.500.12907/32045

Files

Full Text

CBMI_Submission.pdf

Publisher postprint (1.49 MB)

Details

Research center :

CRTI - Centre de Recherche en Technologie de l'Information

Disciplines :

Library & information sciences

Author, co-author :

Parian, Mahnaz

Walzer, Claire

Rossetto, Luca

Heller, Silvan

Dupont, Stéphane ; Université de Mons > Faculté Polytechnique > Service Information, Signal et Intelligence artificielle ; Université de Mons > Faculté des Sciences > Service d'Intelligence Artificielle

Schuldt, Heiko

Language :

English

Title :

Gesture of Interest: Gesture Search for Multi-Person, Multi-Perspective TV Footage

Publication date :

28 June 2021

Journal title :

International Conference on Content-Based Multimedia Indexing

ISSN :

1949-3991

Peer reviewed :

Peer reviewed

Research unit :

F105 - Information, Signal et Intelligence artificielle
S841 - Artificial Intelligence

Research institute :

R300 - Institut de Recherche en Technologies de l'Information et Sciences de l'Informatique

Available on ORBi UMONS :

since 18 January 2022

Statistics

Number of views

61 (3 by UMONS)

Number of downloads

0 (0 by UMONS)
