IGI Global, 2011. 555 p.

Machine audition is the study of algorithms and systems for the automatic analysis and understanding of sound by machine. It plays an important role in many applications, such as automatic audio indexing for internet search, robust speech recognition in uncontrolled natural environments, untethered audio communication in an intelligent office, and speech enhancement for hearing aids and cochlear implants. It has recently attracted increasing interest in several research communities, including signal processing, machine learning, auditory modelling, perception and cognition, psychology, pattern recognition, and artificial intelligence. However, the developments made so far are fragmented across these disciplines, with few connections between them and potentially overlapping research activities. This book intends to bring together recent algorithmic advances, bridge the gaps between the methodologies adopted by the various disciplines, and look ahead to future directions in the subject. It aims to present algorithmic developments, theoretical frameworks, and empirical and experimental research findings in machine audition. It could be useful for professionals who want to better understand how to design algorithms for the automatic analysis of audio signals, how to construct a computing system that can understand the sound sources around us, and how to build advanced human-computer interactive systems. The book covers existing and emerging algorithms and frameworks for processing sound mixtures, practical approaches to implementing machine audition systems, and the relationship between human and machine audition.
It will give professionals, academic researchers, students, consultants and practitioners a good overview of how sound might be understood by a machine through algorithmic operations, and how machine audition approaches might help solve practical engineering problems in daily life. The book is the first of its kind to describe the theoretical, algorithmic and systematic results from the area of machine audition. It intends to promote machine audition as a subject area as attractive as the popular subject of computer vision. The book treats audition in the context of general audio, rather than for specific data such as speech, as in some existing literature. It contains many new approaches and algorithms, along with recent numerical and experimental results, which could foster a better understanding of the state of the art and ultimately motivate novel ideas and thinking in the research communities. A distinctive characteristic of the book is that it brings together fragmented research findings in machine audition from several disciplines, which could promote cutting-edge research in this subject area. The contents are expected to appeal to professionals, researchers, students and practitioners working in machine audition, audio engineering and signal processing. Researchers in computer science, information technology and psychology are also part of the intended audience. The book will be a valuable reference for readers who wish to understand the subject better, contribute to research in it, implement their new ideas, or provide technical consultancy in the field.
The potential uses of the book include library reference, upper-level course supplement, resource for instructors, reference for researchers, reference book for policy makers, reference book for business people, study material for undergraduate or postgraduate students, resource for practitioners, and resource for consultants.

Section 1 Audio Scene Analysis, Recognition and Modeling
Unstructured Environmental Audio: Representation, Classification and Modeling
Modeling Grouping Cues for Auditory Scene Analysis using a Spectral Clustering Formulation
Cocktail Party Problem: Source Separation Issues and Computational Methods
Audition: From Sound to Sounds

Section 2 Audio Signal Separation, Extraction and Localization
A Multimodal Solution to Blind Source Separation of Moving Sources
Sound Source Localization: Conventional Methods and Intensity Vector Direction Exploitation
Probabilistic Modeling Paradigms for Audio Source Separation
Tensor Factorization with Application to Convolutive Blind Source Separation of Speech
Multi-Channel Source Separation: Overview and Comparison of Mask-Based and Linear Separation Algorithms
Audio Source Separation Using Sparse Representations

Section 3 Audio Transcription, Mining and Information Retrieval
Itakura-Saito Nonnegative Factorizations of the Power Spectrogram for Music Signal Decomposition
Music Onset Detection
On the Inherent Segment Length in Music
Automatic Tagging of Audio: The State-of-the-Art
Instantaneous vs. Convolutive Non-Negative Matrix Factorization: Models, Algorithms and Applications to Audio Pattern Separation

Section 4 Audio Cognition, Modeling and Affective Computing
Musical Information Dynamics as Models of Auditory Anticipation
Multimodal Emotion Recognition
Machine Audition of Acoustics: Acoustic Channel Modeling and Room Acoustic Parameter Estimation
Neuromorphic Speech Processing: Objectives and Methods