Depth prediction from 2D images: A taxonomy and an evaluation study

Moreau, Ambroise; Mancas, Matei; Dutoit, Thierry

Article (Scientific journals)

2019 • In Image and Vision Computing

Peer Reviewed verified by ORBi

Permalink
https://hdl.handle.net/20.500.12907/42251

Files

Full Text

1-s2.0-S0262885619304184-main.pdf

Author postprint (3.02 MB)

All documents in ORBi UMONS are protected by a user license.

Details

Author, co-author:
Moreau, Ambroise; Université de Mons > Faculté Polytechnique > Service Information, Signal et Intelligence artificielle
Mancas, Matei; Université de Mons > Faculté Polytechnique > Information, Signal et Intelligence artificielle
Dutoit, Thierry; Université de Mons > Faculté Polytechnique > Service Information, Signal et Intelligence artificielle

Language: English
Title: Depth prediction from 2D images: A taxonomy and an evaluation study
Publication date: 10 November 2019
Journal title: Image and Vision Computing
ISSN: 0262-8856
Publisher: Elsevier, Netherlands
Peer reviewed: Peer Reviewed verified by ORBi
Research unit: F105 - Information, Signal et Intelligence artificielle
Research institute: R450 - Institut NUMEDIART pour les Technologies des Arts Numériques
Available on ORBi UMONS: since 29 November 2019
