Technical University of Berlin, 2013. 153 p.

This thesis deals with the recognition of visual concepts in images using statistical machine learning. Recognition is treated here as a classification task with continuous predictions. The continuous predictions can be used to generate a ranking of images and are therefore often evaluated in a ranking setting: for a given visual concept, all test images are sorted by prediction score in descending order, and the resulting order is evaluated using a ranking measure. This dissertation treats the general case of visual concepts, in which concepts are defined explicitly by a set of images. The aim is multi-label classification, in which all concepts present in an image are to be predicted. The challenge, compared to highly specialized tasks such as face recognition, is the ability to deal with a generic set of visual concepts that are defined by the training data.

In the first part of the dissertation, models are considered that are capable of minimizing hierarchical loss functions induced by taxonomies over the set of all visual concepts. The idea is that a taxonomy defines a prioritization of classification and ranking errors. The goal is to avoid errors that originate from confusing concepts which are distant under the given taxonomy. One example is an image annotation system that, when asked for dogs, returns images of cats rather than images of cars if no dog images are available or if it makes an error. In contrast to preceding publications, the focus lies not on speed at test time but on improved classification and ranking performance under the hierarchical loss. The developed model aggregates the votes of all edges in the taxonomy, not only those of the locally best or shortest path. Furthermore, the hierarchical models are generalized so that they can predict multiple labels for multi-label ranking problems, in which each image can have more than one visual concept.
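The dogs, cats and cars example can be made concrete with a small sketch. It assumes a hypothetical three-concept taxonomy and uses the number of edges on the tree path between two concepts as the loss. This is only one possible taxonomy-induced loss, chosen for illustration; the abstract does not state which loss the thesis actually uses, and the names `PARENT`, `path_to_root` and `taxonomy_loss` are invented here.

```python
# Hypothetical toy taxonomy (child -> parent); names are illustrative only.
PARENT = {
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
    "animal": "root",
    "vehicle": "root",
}


def path_to_root(concept):
    """Return the nodes from `concept` up to the root, inclusive."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def taxonomy_loss(true_concept, predicted_concept):
    """Number of edges on the tree path between the two concepts.

    Assumed loss for illustration: confusions between distant
    concepts cost more than confusions between nearby ones.
    """
    up_true = path_to_root(true_concept)
    up_pred = path_to_root(predicted_concept)
    steps_in_pred = {node: i for i, node in enumerate(up_pred)}
    # The first shared node on the way up is the lowest common ancestor.
    for steps_from_true, node in enumerate(up_true):
        if node in steps_in_pred:
            return steps_from_true + steps_in_pred[node]
    raise ValueError("concepts do not share a root")
```

Under this assumed loss, mistaking a dog for a cat costs 2 edges, while mistaking it for a car costs 4, so a model minimizing it would prefer the first kind of error.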
Previous approaches based on greedy walks along the edges of the hierarchy are able to predict only the most likely concept. In the context of multi-label ranking, we also define a ranking measure that incorporates taxonomical information. The developed model is compared against one-versus-all and structured prediction baselines.

In the second part of the dissertation, non-sparse multiple kernel learning (MKL) is analyzed for multi-label ranking of images. It is compared against average kernel support vector machines (SVMs) and sparse $\ell_1$-norm MKL. For the empirical part, the performance of these methods is evaluated on the Pascal VOC2009 Classification and ImageCLEF2010 Photo Annotation datasets. It is shown that, when using model selection in a practical setup, non-sparse MKL yields equal or better results compared to the average kernel SVM, which does not learn feature combinations, whereas sparse $\ell_1$-norm MKL yields worse results. For the theoretical part, factors that limit or promote the performance gains of non-sparse MKL over the other methods are identified.
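The difference between an average kernel and a weighted kernel combination can be illustrated with a minimal sketch. It assumes two hypothetical 2x2 kernel matrices (`K_color`, `K_shape`) and hand-picked weights. In actual MKL the weights are learned jointly with the SVM, which this sketch does not do; it only shows how a kernel is combined once weights are given, and how weights can be scaled to unit $\ell_p$ norm.

```python
def normalize_weights(weights, p):
    """Scale non-negative weights so that their l_p norm equals 1."""
    norm = sum(w ** p for w in weights) ** (1.0 / p)
    return [w / norm for w in weights]


def combine_kernels(kernels, weights):
    """Return the weighted sum of same-sized square kernel matrices."""
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Two hypothetical kernel matrices, e.g. from two different image features.
K_color = [[1.0, 0.2], [0.2, 1.0]]
K_shape = [[1.0, 0.8], [0.8, 1.0]]

# Average kernel: uniform weights, no feature combination is learned.
K_avg = combine_kernels([K_color, K_shape], [0.5, 0.5])

# Non-sparse combination: every kernel keeps a positive weight.
# The raw weights are made up; here they are scaled to unit l_2 norm.
beta = normalize_weights([1.0, 3.0], p=2)
K_weighted = combine_kernels([K_color, K_shape], beta)
```

A sparse $\ell_1$-norm solution would tend to drive some weights to exactly zero and drop those kernels entirely, whereas the non-sparse variant keeps all of them with unequal weights.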