Gandhi, T., Trivedi, M.M., Pedestrian collision avoidance systems: a survey of computer vision based recent studies. 2006 IEEE Intelligent Transportation Systems Conference, 2006, IEEE, 976–981, 10.1109/ITSC.2006.1706871.
Lenz, I., Lee, H., Saxena, A., Deep learning for detecting robotic grasps. Int. J. Robot. Res. 34:4-5 (2015), 705–724, 10.1177/0278364914549607.
Michels, J., Saxena, A., Ng, A.Y., High speed obstacle avoidance using monocular vision and reinforcement learning. Proceedings of the 22nd International Conference on Machine Learning, 2005, ACM, 593–600, 10.1145/1102351.1102426.
Fehn, C., Kauff, P., De Beeck, M.O., Ernst, F., Ijsselsteijn, W., Pollefeys, M., Van Gool, L., Ofek, E., Sexton, I., An evolutionary and optimised approach on 3D-TV. Proc. of IBC, 2, 2002, 357–365.
Xie, J., Girshick, R., Farhadi, A., Deep3D: fully automatic 2D-to-3D video conversion with deep convolutional neural networks. European Conference on Computer Vision, 2016, Springer, 842–857, 10.1007/978-3-319-46493-0_51.
Troscianko, T., Montagnon, R., Le Clerc, J., Malbert, E., Chanteau, P.-L., The role of colour as a monocular depth cue. Vis. Res. 31:11 (1991), 1923–1929, 10.1016/0042-6989(91)90187-A.
O'Shea, R.P., Blackburn, S.G., Ono, H., Contrast as a depth cue. Vis. Res. 34:12 (1994), 1595–1604, 10.1016/0042-6989(94)90116-3.
Saxena, A., Chung, S.H., Ng, A.Y., Learning depth from single monocular images. Advances in Neural Information Processing Systems, 2006, 1161–1168.
Saxena, A., Schulte, J., Ng, A.Y., Depth estimation using monocular and stereo cues. IJCAI, 7, 2007, 2197–2203.
Hartley, R.I., Zisserman, A., Multiple View Geometry in Computer Vision. second ed., 2004, Cambridge University Press, 10.1017/CBO9780511811685, ISBN: 0521540518.
Scharstein, D., Szeliski, R., A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int. J. Comput. Vis. 47:1-3 (2002), 7–42, 10.1023/A:1014573219977.
Ullman, S., The interpretation of structure from motion. Proc. R. Soc. Lond. Series B. Biol. Sci. 203:1153 (1979), 405–426, 10.1098/rspb.1979.0006.
Faugeras, O.D., Lustman, F., Motion and structure from motion in a piecewise planar environment. Int. J. Pattern Recognit. Artif. Intell. 2:03 (1988), 485–508, 10.1142/S0218001488000285.
Koenderink, J.J., Van Doorn, A.J., Affine structure from motion. JOSA A 8:2 (1991), 377–385, 10.1364/JOSAA.8.000377.
Schönberger, J.L., Frahm, J.-M., Structure-from-motion revisited. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, 4104–4113, 10.1109/CVPR.2016.445.
Özyeşil, O., Voroninski, V., Basri, R., Singer, A., A survey of structure from motion. Acta Numer. 26 (2017), 305–364.
Eigen, D., Puhrsch, C., Fergus, R., Depth map prediction from a single image using a multi-scale deep network. Advances in Neural Information Processing Systems, 2014, 2366–2374.
Eigen, D., Fergus, R., Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. Proceedings of the IEEE International Conference on Computer Vision, 2015, 2650–2658, 10.1109/ICCV.2015.304.
Liu, F., Shen, C., Lin, G., Deep convolutional neural fields for depth estimation from a single image. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, 5162–5170, 10.1109/CVPR.2015.7299152.
Liu, F., Shen, C., Lin, G., Reid, I., Learning depth from single monocular images using deep convolutional neural fields. IEEE Trans. Pattern Anal. Mach. Intell. 38:10 (2015), 2024–2039, 10.1109/TPAMI.2015.2505283.
Garg, R., BG, V.K., Carneiro, G., Reid, I., Unsupervised CNN for single view depth estimation: geometry to the rescue. European Conference on Computer Vision, 2016, Springer, 740–756, 10.1007/978-3-319-46484-8_45.
Mayer, N., Ilg, E., Häusser, P., Fischer, P., Cremers, D., Dosovitskiy, A., Brox, T., A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, 4040–4048, 10.1109/CVPR.2016.438.
Ummenhofer, B., Zhou, H., Uhrig, J., Mayer, N., Ilg, E., Dosovitskiy, A., Brox, T., DeMoN: depth and motion network for learning monocular stereo. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, 5038–5047, 10.1109/CVPR.2017.596.
Vijayanarasimhan, S., Ricco, S., Schmid, C., Sukthankar, R., Fragkiadaki, K., SfM-Net: learning of structure and motion from video. arXiv preprint arXiv:1704.07804, 2017.
Zhou, T., Brown, M., Snavely, N., Lowe, D.G., Unsupervised learning of depth and ego-motion from video. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, 1851–1858, 10.1109/CVPR.2017.700.
Godard, C., Mac Aodha, O., Brostow, G.J., Unsupervised monocular depth estimation with left-right consistency. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, 270–279, 10.1109/CVPR.2017.699.
Kendall, A., Martirosyan, H., Dasgupta, S., Henry, P., Kennedy, R., Bachrach, A., Bry, A., End-to-end learning of geometry and context for deep stereo regression. Proceedings of the IEEE International Conference on Computer Vision, 2017, 66–75, 10.1109/ICCV.2017.17.
Godard, C., Mac Aodha, O., Firman, M., Brostow, G., Digging into self-supervised monocular depth estimation. arXiv preprint arXiv:1806.01260, 2018.
Mahjourian, R., Wicke, M., Angelova, A., Unsupervised learning of depth and ego-motion from monocular video using 3D geometric constraints. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, 5667–5675, 10.1109/CVPR.2018.00594.
Wang, C., Miguel Buenaposada, J., Zhu, R., Lucey, S., Learning depth from monocular videos using direct methods. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, 2022–2030, 10.1109/CVPR.2018.00216.
Kendall, A., Gal, Y., Cipolla, R., Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, 7482–7491, 10.1109/CVPR.2018.00781.
Howard, I.P., Perceiving in Depth, Volume 1: Basic Mechanisms. 2012, Oxford University Press, 10.1093/acprof:oso/9780199764143.001.0001.
Hirschmüller, H., Scharstein, D., Evaluation of cost functions for stereo matching. 2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007, IEEE, 1–8, 10.1109/CVPR.2007.383248.
Marr, D., Poggio, T., Cooperative computation of stereo disparity. Science 194:4262 (1976), 283–287, 10.1126/science.968482.
Fua, P., A parallel stereo algorithm that produces dense depth maps and preserves image features. Mach. Vis. Appl. 6:1 (1993), 35–49, 10.1007/BF01212430.
Yang, Q., Engels, C., Akbarzadeh, A., Near real-time stereo for weakly-textured scenes. BMVC, 2008, 1–10, 10.5244/C.22.72.
Bianco, S., Ciocca, G., Marelli, D., Evaluating the performance of structure from motion pipelines. J. Imaging 4:8 (2018), 98, 10.3390/jimaging4080098.
Kanade, T., Amidi, O., Ke, Q., Real-time and 3D vision for autonomous small and micro air vehicles. 2004 43rd IEEE Conference on Decision and Control (CDC), 2, 2004, IEEE, 1655–1662, 10.1109/CDC.2004.1430282.
Hoiem, D., Efros, A.A., Hebert, M., Automatic photo pop-up. ACM Trans. Graph. 24 (2005), 577–584, 10.1145/1186822.1073232.
Saxena, A., Sun, M., Ng, A.Y., Make3D: learning 3D scene structure from a single still image. IEEE Trans. Pattern Anal. Mach. Intell. 31:5 (2008), 824–840, 10.1109/TPAMI.2008.132.
Liu, B., Gould, S., Koller, D., Single image depth estimation from predicted semantic labels. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010, IEEE, 1253–1260.
Karsch, K., Liu, C., Kang, S.B., Depth transfer: depth extraction from video using non-parametric sampling. IEEE Trans. Pattern Anal. Mach. Intell. 36:11 (2014), 2144–2158, 10.1109/TPAMI.2014.2316835.
Lafferty, J., McCallum, A., Pereira, F.C., Conditional random fields: probabilistic models for segmenting and labeling sequence data. Proceedings of the 18th International Conference on Machine Learning, 2001, 282–289.
Dosovitskiy, A., Fischer, P., Ilg, E., Häusser, P., Hazirbas, C., Golkov, V., Van Der Smagt, P., Cremers, D., Brox, T., FlowNet: learning optical flow with convolutional networks. Proceedings of the IEEE International Conference on Computer Vision, 2015, 2758–2766, 10.1109/ICCV.2015.316.
Luo, W., Schwing, A.G., Urtasun, R., Efficient deep learning for stereo matching. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, 5695–5703, 10.1109/CVPR.2016.614.
Geiger, A., Lenz, P., Urtasun, R., Are we ready for autonomous driving? The KITTI vision benchmark suite. 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012, IEEE, 3354–3361, 10.1109/CVPR.2012.6248074.
Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., Schiele, B., The Cityscapes dataset for semantic urban scene understanding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, 3213–3223, 10.1109/CVPR.2016.350.
Riche, N., Duvinage, M., Mancas, M., Gosselin, B., Dutoit, T., Saliency and human fixations: state-of-the-art and study of comparison metrics. Proceedings of the IEEE International Conference on Computer Vision, 2013, 1153–1160, 10.1109/ICCV.2013.147.
Howell, D.C., Statistical methods for psychology. 2009, Cengage Learning.