University of Bonn, 2017. 224 p.
As cameras are becoming ubiquitous and internet storage abundant, the need for computers to understand images is growing rapidly. This thesis is concerned with two computer vision tasks: recognizing objects and their locations, and segmenting images according to object classes. We focus on deep learning approaches, which in recent years have had a tremendous influence on machine learning in general and computer vision in particular. The thesis presents our research into deep learning models and algorithms. It is divided into three parts. The first part describes our GPU deep learning framework. Its hierarchical structure allows transparent use of the GPU, facilitates the specification and inspection of complex models, and constitutes the implementation basis of the later chapters. Components of this framework were used in a real-time GPU library for random forests, which we present and evaluate. In the second part, we investigate greedy learning techniques for semi-supervised object recognition. We improve the feature learning capabilities of restricted Boltzmann machines (RBMs) with lateral interactions and of auto-encoders with additional hidden layers, and offer empirical insight into the evaluation of RBM learning algorithms. The third part of this thesis focuses on object class segmentation. Here, we incrementally introduce novel neural network models and training algorithms, successively improving the state of the art on multiple datasets. Our novel methods include supervised pre-training, histogram-of-oriented-gradients DNN inputs, depth normalization, and recurrence. All contribute towards improving segmentation performance beyond what is possible with competitive baseline methods. We further demonstrate that pixelwise labeling combined with a structured loss function can be utilized to localize objects. Finally, we show how transfer learning in combination with object-centered depth colorization can be used to identify objects.
We evaluate our proposed methods on the publicly available MNIST, MSRC, INRIA Graz-02, NYU-Depth, Pascal VOC, and Washington RGB-D Objects datasets.