CRTI - Centre de Recherche en Technologie de l'Information
Disciplines :
Library & information sciences
Author, co-author :
Parian, Mahnaz
Walzer, Claire
Rossetto, Luca
Heller, Silvan
Dupont, Stéphane ; Université de Mons > Faculté Polytechnique > Service Information, Signal et Intelligence artificielle ; Université de Mons > Faculté des Sciences > Service d'Intelligence Artificielle
Schuldt, Heiko
Language :
English
Title :
Gesture of Interest: Gesture Search for Multi-Person, Multi-Perspective TV Footage
Publication date :
28 June 2021
Journal title :
International Conference on Content-Based Multimedia Indexing
ISSN :
1949-3991
Peer reviewed :
Peer reviewed
Research unit :
F105 - Information, Signal et Intelligence artificielle
S841 - Artificial Intelligence
Research institute :
R300 - Institut de Recherche en Technologies de l'Information et Sciences de l'Informatique
J. Joo, F. F. Steen, and M. Turner, "Red hen lab: Dataset and tools for multimodal human communication research," Künstliche Intelligenz, vol. 31, no. 4, 2017.
L. Rossetto, R. Gasser, J. Lokoc, W. Bailer, K. Schoeffmann, B. Muenzer, T. Soucek, P. A. Nguyen, P. Bolettieri, A. Leibetseder et al., "Interactive video retrieval in the age of deep learning - detailed evaluation of vbs 2019," IEEE Transactions on Multimedia, 2020.
L. Rossetto, I. Giangreco, C. Tanase, and H. Schuldt, "vitrivr: A flexible retrieval stack supporting multiple query modes for searching in multimedia collections," in Proceedings of the 24th ACM international conference on Multimedia, 2016.
K. He, G. Gkioxari, P. Dollár, and R. B. Girshick, "Mask R-CNN," in IEEE International Conference on Computer Vision, ICCV, Italy, 2017.
S. Tripathi, M. Collins, M. Brown, and S. J. Belongie, "Pose2instance: Harnessing keypoints for person instance segmentation," CoRR, vol. abs/1704.01152, 2017.
S. Zhang, R. Li, X. Dong, P. L. Rosin, Z. Cai, X. Han, D. Yang, H. Huang, and S. Hu, "Pose2seg: Detection free human instance segmentation," in IEEE Conference on Computer Vision and Pattern Recognition, CVPR, USA, 2019.
D. Zhou and Q. He, "Poseg: Pose-aware refinement network for human instance segmentation," IEEE Access, vol. 8, 2020.
K. Lin, L. Wang, K. Luo, Y. Chen, Z. Liu, and M.-T. Sun, "Cross-domain complementary learning using pose for multi-person part segmentation," IEEE Transactions on Circuits and Systems for Video Technology, 2020.
F. Xia, P. Wang, X. Chen, and A. L. Yuille, "Joint multi-person pose estimation and semantic part segmentation," in IEEE Conference on Computer Vision and Pattern Recognition, CVPR, USA, 2017.
Y. Chen, Z. Wang, Y. Peng, Z. Zhang, G. Yu, and J. Sun, "Cascaded pyramid network for multi-person pose estimation," in IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, USA, 2018.
Z. Cao, T. Simon, S. Wei, and Y. Sheikh, "Realtime multi-person 2d pose estimation using part affinity fields," in IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, USA, 2017.
P. Dollár, C. Wojek, B. Schiele, and P. Perona, "Pedestrian detection: A benchmark," in IEEE Computer Society Conference on Computer Vision and Pattern Recognition CVPR, USA, 2009.
S. Paisitkriangkrai, C. Shen, and A. Van Den Hengel, "Learning to rank in person re-identification with metric ensembles," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
Y. Shen, W. Lin, J. Yan, M. Xu, J. Wu, and J. Wang, "Person re-identification with correspondence structure learning," in Proceedings of the IEEE international conference on computer vision, 2015.
S. Ding, L. Lin, G. Wang, and H. Chao, "Deep feature learning with relative distance comparison for person re-identification," Pattern Recognit., vol. 48, no. 10, 2015.
E. Ristani and C. Tomasi, "Features for multi-target multi-camera tracking and re-identification," in IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, USA, 2018.
T. Xiao, S. Li, B. Wang, L. Lin, and X. Wang, "Joint detection and identification feature learning for person search," in IEEE Conference on Computer Vision and Pattern Recognition, CVPR, USA, 2017.
A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and F. Li, "Large-scale video classification with convolutional neural networks," in IEEE Conference on Computer Vision and Pattern Recognition CVPR, USA, 2014.
S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Comput., vol. 9, no. 8, 1997.
K. Simonyan and A. Zisserman, "Two-stream convolutional networks for action recognition in videos," in Advances in Neural Information Processing Systems, Canada, 2014.
S. Ji, W. Xu, M. Yang, and K. Yu, "3d convolutional neural networks for human action recognition," IEEE Trans. Pattern Anal. Mach. Intell., 2013.
L. Zhang, G. Zhu, P. Shen, and J. Song, "Learning spatiotemporal features using 3DCNN and convolutional LSTM for gesture recognition," in IEEE International Conference on Computer Vision, Italy, 2017.
J. Carreira and A. Zisserman, "Quo vadis, action recognition? A new model and the kinetics dataset," in IEEE Conference on Computer Vision and Pattern Recognition, CVPR, USA, 2017.
S. Sharma, R. Kiros, and R. Salakhutdinov, "Action recognition using visual attention," CoRR, vol. abs/1511.04119, 2015.
Z. Li, K. Gavrilyuk, E. Gavves, M. Jain, and C. G. M. Snoek, "Videolstm convolves, attends and flows for action recognition," Comput. Vis. Image Underst., vol. 166, 2018.
R. Girdhar, J. Carreira, C. Doersch, and A. Zisserman, "Video action transformer network," in IEEE Conference on Computer Vision and Pattern Recognition, CVPR, USA, 2019.
M. Jaderberg, K. Simonyan, A. Zisserman, and K. Kavukcuoglu, "Spatial transformer networks," in Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems, Montreal, Quebec, Canada, 2015.
C. Cao, Y. Zhang, C. Zhang, and H. Lu, "Body joint guided 3D deep convolutional descriptors for action recognition," IEEE Trans. Cybern., vol. 48, no. 3, 2018.
Y. Du, W. Wang, and L. Wang, "Hierarchical recurrent neural network for skeleton based action recognition," in IEEE Conference on Computer Vision and Pattern Recognition, CVPR, USA, 2015.
W. Du, Y. Wang, and Y. Qiao, "RPAN: An end-to-end recurrent pose-attention network for action recognition in videos," in Proceedings of the IEEE International Conference on Computer Vision, 2017.
V. Ferrari, M. J. Marín-Jiménez, and A. Zisserman, "Pose search: Retrieving people using their pose," in IEEE Computer Society Conference on Computer Vision and Pattern Recognition CVPR, USA, 2009.
S. Yousefi and H. Li, "3D hand gesture analysis through a real-time gesture search engine," International Journal of Advanced Robotic Systems, vol. 12, no. 6, 2015.
M. A. Parian, L. Rossetto, H. Schuldt, and S. Dupont, "Are you watching closely? content-based retrieval of hand gestures," in Proceedings of the International Conference on Multimedia Retrieval, Ireland, 2020.
K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in IEEE Conference on Computer Vision and Pattern Recognition, CVPR, USA, 2016.
L. Wang, Y. Xiong, Z. Wang, Y. Qiao, D. Lin, X. Tang, and L. Van Gool, "Temporal segment networks: Towards good practices for deep action recognition," Lecture Notes in Computer Science, 2016.
F. Schroff, D. Kalenichenko, and J. Philbin, "Facenet: A unified embedding for face recognition and clustering," in IEEE Conference on Computer Vision and Pattern Recognition, CVPR, USA, 2015.
J. Wan, S. Z. Li, Y. Zhao, S. Zhou, I. Guyon, and S. Escalera, "Chalearn looking at people RGB-D isolated and continuous datasets for gesture recognition," in IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPR, USA, 2016.