Pure Sciences

Statistical model based multi-microphone speech processing: Toward overcoming mismatch problem

In this thesis, a joint optimal method for clean speech estimation and automatic speech recognition (ASR) in a mismatched condition will be described with a unified speech model under a generalized expectation maximization (GEM) scheme. From this perspective, multi-microphone optimal speech estimation can be interpreted as pre-processing to increase the reliability of feature components before the actual speech recognition or model-based speech estimation is performed. Also, ideal binary mask (IBM) estimation from the context of the statistical model for ASR can be regarded as an initialization step to exclude the unreliable portion for ASR and to increase the estimation accuracy based only on the reliable components and the trained speech process model. Optimal multi-microphone speech processing is performed in the short-time Fourier transform (STFT) domain, since the atomic speech information can be meaningfully represented with a series of 10 to 30 ms short frames. Convolution in the time domain is formulated as filtering via a feed-forward network in the STFT domain, and is shown to be an appropriate representation under the overlap-add framework. With this structure in mind, sufficient statistics for estimating target speech from the multi-microphone measurements are formulated, and realistic relaxations for them are discussed, since we need to estimate not only the target speech information but also the room impulse responses (RIRs), which have unavoidable uncertainty due to the movement of speakers. Firstly, reverberant speech mixture separation with typical background noise is tackled. Standard adaptive independent component analysis (ICA) implemented with the natural gradient method is extended into the STFT domain with regularized feed-forward ICA (RFFICA) and post-processing based on direction-per-frequency. This method showed up to almost an order of magnitude performance improvement (29 dB in C-weighting) compared with the state-of-the-art methods.
Secondly, we try to update the filters fast enough, with a smaller amount of measured data sharing the same directional information about target and interference location. Expectation maximization beamforming (EMB) followed by minimum mean squared error (MMSE) post-filtering is proposed to reduce the number of filter taps to update. Because we can obtain generative model based information about the target speech presence probability for each frequency bin and each frame with enhanced robust direction-of-arrival (DOA) estimation capability, EMB can also be used to replace the direction-per-frequency based post-processing, which has been applied independently after RFFICA. Thirdly, the DOA-only beamforming is extended to early response based beamforming. We estimate the RIRs from target and interference speech given robust DOA estimates and construct linearly constrained minimum variance (LCMV) beamforming, which can be easily extended with the EMB framework. Because we perform a two-step approach, estimating the RIRs first and then applying a demixing filter, without introducing more taps in the frame for adaptation purposes, we can obtain good demixing or dereverberation results. Finally, IBM estimation and ASR are jointly formulated under a GEM framework. Even with the optimal front-end pre-processing, there always exists a portion that is mismatched with the statistical speech process model used for ASR. Therefore, identifying the corrupted portions and removing them from the perspective of ASR itself is a necessary procedure. The cepstral domain ASR models are transformed into the spectral domain without loss of information through the global tying process. The proposed algorithm achieved much higher absolute ASR accuracy, ranging from 14.69% at 0 dB signal-to-noise ratio (SNR) to 40.10% at 15 dB SNR, than a normal ASR method with an optimal front-end processing in a highly non-stationary mismatch environment.
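
To make the notion of an ideal binary mask concrete, the following is a minimal sketch of the textbook oracle definition: a time-frequency cell is kept when its local SNR exceeds a threshold. It assumes the clean speech and noise powers are known per cell, which is only true in simulation; the thesis instead estimates the mask jointly with ASR under GEM, and that procedure is not reproduced here. The function name `ideal_binary_mask`, the parameter `lc_db`, and the toy values are illustrative assumptions.

```python
import math


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Oracle IBM: 1 where the local SNR in dB exceeds lc_db, else 0.

    speech_power and noise_power are lists of frames, each frame a list of
    per-frequency-bin powers (hypothetical inputs for illustration).
    """
    mask = []
    for s_frame, n_frame in zip(speech_power, noise_power):
        row = []
        for s, n in zip(s_frame, n_frame):
            if s <= 0.0:
                row.append(0)            # no speech energy: unreliable cell
            elif n <= 0.0:
                row.append(1)            # speech present, no noise: reliable cell
            else:
                snr_db = 10.0 * math.log10(s / n)
                row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


# Toy example: 2 frames x 2 frequency bins.
speech = [[4.0, 0.1], [0.5, 9.0]]
noise = [[1.0, 1.0], [1.0, 1.0]]
mask = ideal_binary_mask(speech, noise)
```

Cells marked 0 correspond to the unreliable portion that would be excluded before recognition; the 0 dB threshold is a common default rather than a value taken from the thesis.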

Perhaps You will be interested in these papers

Probabilistic inference via sum-product algorithms on binary pairwise Gibbs random fields with applications to multiple fault diagnosis

In this dissertation, we consider probabilistic inference problems on binary pairwise Gibbs random fields (BPW-GRFs), which belong to a class of Markov random fields with applications to a large variety of systems, including computer vision, statistical mechanics, modeling of neural functions, and others. In particular, we study the application of iterative heuristic sum-product algorithms (SPAs) to the underlying graphs for solving the marginal problem on BPW-GRFs. These algorithms operate on the BPW-GRF graph by propagating messages along the edges and by using them to update the beliefs at each node of the graph; these beliefs then serve as suboptimal solutions to the marginal problem. SPAs offer several advantages, such as complexity that is polynomial in the number of nodes and edges in the graph and the ability to operate in a distributed fashion (determined by the structure of the underlying graph). In general, the analysis of SPAs can be categorized into i) finding conditions under which the SPAs converge, and ii) determining the correctness of the marginal solutions provided by the SPAs with respect to the true marginals. In this dissertation, we consider both problems. For each problem, we first review existing results and then present our specific contribution within the class of BPW-GRFs. Finally, we extend our analysis of SPAs on BPW-GRFs to the application of multiple fault diagnosis (note that the equivalent GRFs for fault diagnosis systems are typically non-binary). In particular, we establish tighter bounds over previous results, and show that fault diagnosis (using SPA beliefs as suboptimal solutions to the true marginals) can detect multiple faults with very high accuracy.
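
The message-passing scheme described above can be sketched as follows. This is a generic parallel sum-product update on a binary pairwise model, not the dissertation's specific analysis; the function names, the potential tables, and the three-node chain are illustrative assumptions. The example graph is a tree, where the beliefs are known to equal the true marginals, so a brute-force enumeration serves as a check. On graphs with cycles the same updates yield only approximate beliefs and may not converge.

```python
import itertools


def sum_product(n, unary, pairwise, iters=20):
    """Parallel sum-product on a binary pairwise model.

    unary[i] = [phi_i(0), phi_i(1)]
    pairwise[(i, j)] = 2x2 table psi_ij[x_i][x_j], one entry per undirected edge.
    Returns the normalized belief [b_i(0), b_i(1)] for each node.
    """
    nbrs = {i: [] for i in range(n)}
    msgs = {}
    for (i, j) in pairwise:
        nbrs[i].append(j)
        nbrs[j].append(i)
        msgs[(i, j)] = [0.5, 0.5]   # message from i to j, indexed by x_j
        msgs[(j, i)] = [0.5, 0.5]

    def psi(i, j, xi, xj):
        if (i, j) in pairwise:
            return pairwise[(i, j)][xi][xj]
        return pairwise[(j, i)][xj][xi]

    for _ in range(iters):
        new = {}
        for (i, j) in msgs:
            out = []
            for xj in (0, 1):
                total = 0.0
                for xi in (0, 1):
                    prod = unary[i][xi] * psi(i, j, xi, xj)
                    for k in nbrs[i]:
                        if k != j:          # exclude the recipient's own message
                            prod *= msgs[(k, i)][xi]
                    total += prod
                out.append(total)
            z = out[0] + out[1]
            new[(i, j)] = [out[0] / z, out[1] / z]
        msgs = new

    beliefs = []
    for i in range(n):
        b = [unary[i][0], unary[i][1]]
        for k in nbrs[i]:
            b[0] *= msgs[(k, i)][0]
            b[1] *= msgs[(k, i)][1]
        z = b[0] + b[1]
        beliefs.append([b[0] / z, b[1] / z])
    return beliefs


def brute_force_marginals(n, unary, pairwise):
    """Exact marginals by enumerating all 2^n configurations."""
    marg = [[0.0, 0.0] for _ in range(n)]
    for x in itertools.product((0, 1), repeat=n):
        w = 1.0
        for i in range(n):
            w *= unary[i][x[i]]
        for (i, j), table in pairwise.items():
            w *= table[x[i]][x[j]]
        for i in range(n):
            marg[i][x[i]] += w
    z = marg[0][0] + marg[0][1]
    return [[m[0] / z, m[1] / z] for m in marg]


# Three-node chain 0 - 1 - 2 with strictly positive potentials.
unary = [[1.0, 2.0], [1.0, 1.0], [3.0, 1.0]]
pairwise = {(0, 1): [[2.0, 1.0], [1.0, 2.0]],
            (1, 2): [[2.0, 1.0], [1.0, 2.0]]}
beliefs = sum_product(3, unary, pairwise)
exact = brute_force_marginals(3, unary, pairwise)
```

Each iteration touches every directed edge once, which is the source of the polynomial per-iteration cost noted in the abstract.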

Bayesian Approaches to Trajectory Estimation in Maritime Surveillance

In maritime surveillance, multisensor data differ to a great extent in their temporal resolution. Additionally, due to multi-level security and information management processing, many contact reports arrive hours after observations. This makes the contact report data usually available for batch processing. The dissimilar multi-source information environment results in contact reports with heteroscedastic and correlated errors (i.e., measurement errors characterized by normal probability distributions with non-constant and nondiagonal covariance matrices), while the obtained measurement errors may be relatively large. Hence, the appropriate choice of a trajectory estimation algorithm, which addresses the aforementioned issues of the surveillance data, will significantly contribute to increased awareness in the maritime domain. This thesis presents two novel batch single ship trajectory estimation algorithms employing Bayesian approaches to estimation: 1) a stochastic linear filtering algorithm and 2) a curve fitting algorithm which employs Bayesian statistical inference for nonparametric regression. The stochastic linear filtering algorithm employs a combination of two stochastic processes, namely the Integrated Ornstein-Uhlenbeck (IOU) process and the random walk (RW) process, to describe the ship's motion. The assumptions on linear modeling and bivariate Gaussian distribution of measurement errors allow for the use of Kalman filtering and Rauch-Tung-Striebel optimal smoothing. In the curve fitting algorithm, the trajectory is considered to be in the form of a cubic spline with an unknown number of knots in the two-dimensional Euclidean plane of longitude and latitude. The function estimate is determined from the data, which are assumed to be Gaussian distributed. A fully Bayesian approach is adopted by defining the prior distributions on all unknown parameters: the spline coefficients and the number and locations of knots.
The calculation of the posterior distributions is performed using Markov Chain Monte Carlo (MCMC) and reversible jump Markov sampling due to the varying dimensions of subspaces where the searches are performed. Both algorithms assume no knowledge of the ship motion model but do assume standard ship maneuvers. The quality of the estimated trajectories obtained by both algorithms is assessed using several simulated scenarios and evaluated statistically. The positional measurements, received at irregular time intervals, are assumed to have heteroscedastic and correlated errors and to be available in batches. The performance evaluation includes a comparison of both algorithms with another batch stochastic optimization algorithm for trajectory estimation, i.e., the genetic algorithm (GA). The sensitivity analysis is carried out with respect to perturbations in the parameters of the algorithms. The results show similar performance between the linear stochastic filtering algorithm and the Bayesian spline regression algorithm, while both algorithms show superiority over the GA-based trajectory fitting with respect to tracking accuracy because they fully account for uncertainty. The batch data processing approach is confirmed to be more suitable in maritime surveillance than standard recursive approaches. The thesis demonstrates that for accurate trajectory estimation it is crucial to completely account for the uncertainty of measurements, especially if the measurements are characterized by heteroscedastic and correlated errors. The results of this thesis are useful as they facilitate selecting the appropriate approach to data processing in maritime surveillance applications, and hence contribute to increased maritime domain awareness. They can also serve for selecting appropriate methods for data processing in dissimilar sensor and other environments in which data have large and heteroscedastic measurement errors.
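
The batch filtering idea (a forward Kalman pass followed by a backward Rauch-Tung-Striebel pass) can be illustrated with a deliberately simplified sketch. It assumes a scalar random-walk state only, not the combined IOU and RW model or the bivariate correlated errors used in the thesis; heteroscedasticity is represented by a separate variance per measurement and irregular sampling by a per-step interval. The function name `kalman_rts`, the parameter names, and all numbers are illustrative assumptions.

```python
def kalman_rts(zs, rs, dts, q, x0, p0):
    """Scalar Kalman filter plus RTS smoother for a random-walk state.

    zs  : measurements
    rs  : measurement variance for each measurement (heteroscedastic)
    dts : time elapsed before each measurement (irregular sampling)
    q   : random-walk variance growth per unit time
    x0, p0 : prior mean and variance
    Returns (filtered means, filtered variances, smoothed means, smoothed variances).
    """
    x_pred, p_pred, x_filt, p_filt = [], [], [], []
    x, p = x0, p0
    for z, r, dt in zip(zs, rs, dts):
        # Predict: the mean is unchanged, the variance grows with elapsed time.
        xp, pp = x, p + q * dt
        # Update with this measurement's own variance.
        gain = pp / (pp + r)
        x = xp + gain * (z - xp)
        p = (1.0 - gain) * pp
        x_pred.append(xp)
        p_pred.append(pp)
        x_filt.append(x)
        p_filt.append(p)

    # Backward pass: refine each estimate using all later measurements.
    x_smooth, p_smooth = x_filt[:], p_filt[:]
    for k in range(len(zs) - 2, -1, -1):
        c = p_filt[k] / p_pred[k + 1]
        x_smooth[k] = x_filt[k] + c * (x_smooth[k + 1] - x_pred[k + 1])
        p_smooth[k] = p_filt[k] + c * c * (p_smooth[k + 1] - p_pred[k + 1])
    return x_filt, p_filt, x_smooth, p_smooth


# Toy batch: five position reports with unequal noise and spacing.
zs = [1.0, 1.4, 0.9, 2.1, 2.0]
rs = [0.5, 2.0, 0.5, 4.0, 1.0]
dts = [1.0, 0.5, 3.0, 1.0, 2.0]
xf, pf, xs, ps = kalman_rts(zs, rs, dts, q=0.1, x0=0.0, p0=100.0)
```

Because the smoother conditions every estimate on the whole batch, its variances are never larger than the filtered ones, which is one way to see why batch processing suits delayed contact reports.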

A computational model for biochemical pathways with applications to metabolic processes

Statement of the problem. The development of computational models allows one to easily reveal and quantify certain behaviors of the biochemical network of interest. The existing models for chemical reactions based on the Michaelis-Menten approach are highly nonlinear and could cause computational instability for a network of reactions. The objectives of this thesis are 1) to develop a model for chemical reactions that is robust and stable for simulating biochemical pathways, 2) to design customized software and a graphical user interface (GUI) for easy implementation of model-based simulations, and 3) to apply the modeling and simulation system to a metabolic process, namely glycolysis, the metabolism of glucose. Methods. The modeling and simulation system was implemented in the C++ programming language using a cross-platform development tool, wxWidgets. A set of rate equations was used to describe the dynamics of a system with a compartment-like physical model. This allows for the reduction of complexities that can restrict the model's overall stability. A novel approach to modeling the driving force of chemical reactions that comprise biochemical pathways has been established. This method models the kinetics of the reaction by relating the mechanisms of the reaction to a common physical model. Using this approach, the differential equations or state equations that characterize the dynamics of the chemical reactions were established. These equations were then solved numerically with a second-order Runge-Kutta method. After the validity of the modeling methodology was established, a GUI was designed and developed for data entry, parameter storage and retrieval, model simulation, display of the concentration time-series curves, and for linking reactions to form a pathway. Results. The modeling and simulation system was successfully implemented. The computational part of the customized C++ program was validated against the results from MATLAB.
Selected steps from the 10-step glycolysis process were simulated and produced results consistent with those reported in the literature. The model's reliability was tested by reversing the reaction's driving potential. The model was robust with respect to the initial conditions and to inconsistent and/or incomplete data for assigning the model parameters. It has been concluded that this modeling approach gives logical and consistent results. For future work, the software developed in this thesis will be used to study the complete pathway of glycolysis and other biochemical pathways.
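
As an illustration of solving reaction rate equations with a second-order Runge-Kutta scheme, here is a minimal sketch. It assumes the midpoint variant of the method and simple mass-action kinetics for a single reversible reaction A <-> B; the thesis's own driving-force model and its C++ implementation are not reproduced. The function names and rate constants are hypothetical.

```python
def rk2_step(f, t, y, h):
    """One midpoint-rule (second-order Runge-Kutta) step for y' = f(t, y)."""
    k1 = f(t, y)
    y_mid = [yi + 0.5 * h * ki for yi, ki in zip(y, k1)]
    k2 = f(t + 0.5 * h, y_mid)
    return [yi + h * ki for yi, ki in zip(y, k2)]


def reversible_reaction(kf, kr):
    """Rate equations for A <-> B under assumed mass-action kinetics."""
    def f(t, y):
        a, b = y
        net = kf * a - kr * b       # net forward rate
        return [-net, net]          # d[A]/dt, d[B]/dt
    return f


def simulate(f, y0, t_end, h):
    """Integrate from t = 0 to t_end and return the final concentrations."""
    t, y = 0.0, list(y0)
    for _ in range(int(round(t_end / h))):
        y = rk2_step(f, t, y, h)
        t += h
    return y


# Hypothetical rate constants; equilibrium is [A] = kr / (kf + kr) = 1/3.
a_end, b_end = simulate(reversible_reaction(kf=2.0, kr=1.0), [1.0, 0.0],
                        t_end=10.0, h=0.01)
```

Swapping `kf` and `kr` reverses the direction in which the system relaxes, a simple analogue of the driving-potential reversal test described above.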

On Learning in Problems with Geometric Constraints

Geometric structure plays an important role in modern signal processing problems. Often the variability in natural data sets can be described with fewer degrees of freedom than are suggested by the dimensionality of the data. When expressed in low dimensional intrinsic coordinates, the data may be better analyzed or visualized. We develop an efficient algorithm for nonlinear dimensionality reduction based on Laplacian eigenmaps to find these informative coordinates and examine its properties experimentally on speech and image data. We also consider the problem of estimation of random processes on manifolds and develop an algorithm for Bayesian filtering on a particular manifold of interest—the Stiefel manifold. As a point on Stiefel represents a basis for a linear subspace of a particular dimension, this manifold occurs naturally in many signal processing problems. The goal of this work is not only to advance the state of the art in dimensionality reduction and nonlinear filtering but also to serve as an introduction or guide to practitioners interested in geometry-based, nonlinear approaches—particularly in speech. Chapters 1 and 2 introduce the reader to problems in which geometry plays a fundamental role and lay out the mathematical foundations upon which the algorithms are based. Chapter 3 describes the regime where the underlying geometry is known a priori and may be effectively incorporated into a model—estimation on manifolds. Chapter 4 focuses on the regime where the geometry is unknown and must be estimated from the data—manifold learning. Some initial experiments are carried out here to probe the properties of our algorithm. Chapter 5 applies the algorithm to speech data to examine its structure in novel ways. The discussion throughout is bound together by the common thread of geometry and its contribution to understanding and solving contemporary learning problems.

Solar cells based on cadmium tellurium thin film and composite of organic and inorganic nano-scale materials

In recent years there has been much interest in solar energy conversion because of oil price hikes and the environmental issues of burning fossil fuels. Photovoltaic solar cells are a promising alternative energy source. In this work, to start with, CdTe solar cells were fabricated and tested. CdTe was grown by e-beam evaporation followed by post annealing. A 12% energy conversion efficiency was achieved in first efforts. To explore materials for tandem solar cells, nanowires were studied. PbSe nanowires were grown by magnetron sputtering, and their crystal structure and stoichiometry as well as their optical properties were characterized. Closely packed PbSe nanowires with diameters of approximately 100 nm were grown. In spite of their relatively large size, these wires showed a large blue shift in the luminescence and absorption, and hence in their energy band gap, compared to the bulk crystal, demonstrating quantum confinement. This has been attributed to pinning of the Fermi level due to surface states, band bending and a strong depletion layer which confines the carrier states. PbSe nanowires with different diameters are promising candidates for a new tandem cell. We also investigated enhancement of light harvesting in photosynthesis by integration of nanocrystalline quantum dots (NQDs) and photosystem I (PSI). We show strong evidence of energy transfer from CdSe NQDs to PSI by photoluminescence (PL) and transient absorption measurements. Experimental data indicate that the energy of the excited charge carriers in CdSe NQDs was transferred to PSI by means of radiative emission, Förster resonance energy transfer (FRET), and electron/hole transfer between the inorganic and organic systems. This exciting breakthrough provides a basis for the design of novel energy harvesting and other electronic devices based on photosynthesis. By applying tandem structures and incorporating nanostructures, paths exist toward solar cells with higher efficiency.

The role of cross-linking in surface roughening of polymers during plasma etching

Cross-linking has been suggested as one of the dominant degradation mechanisms in the surface modified layer during plasma etching of polymer materials. In this study, we investigate its role in surface roughening. Polystyrene (PS) and polymethyl-methacrylate (PMMA) are our primary focus, motivated by a need to selectively remove PMMA for PS-b-PMMA block copolymer lithography. In the first part of this study, the effects of ion energy and different gas mixtures, including O2, Ar/O2, Ar, CF4, CHF3/O2, Ar/SF6, Ar/H2, Ar/F2 and Ar/H2, on etch selectivity and surface/sidewall roughness were characterized to direct PMMA removal etch process development. Results show Ar/H2 produced optimal surface roughness and etch selectivity for the PMMA removal process. In Ar/O2 and O2 plasmas, opposite trends were observed for PS and PMMA: roughness decreases with increasing ion energy for PS and increases for PMMA. The second part of the thesis examines surface roughening mechanisms for PS and PMMA, with a focus on the role of cross-linking. The evolution of the size and morphology of roughness features during the initial stages of formation is consistent with etch rate non-uniformity associated with heterogeneous cross-linking. The diffusion depth of oxygen atoms, associated with cross-linking, was examined using surface chemical analysis (XPS and NEXAFS) and ellipsometry, showing a direct correlation with surface roughness in Ar/O2 plasma etched PS. This correlation leads to the conclusion that enhanced cross-linking suppresses surface roughening. Ar/H2 and Ar/F2 etching of PS and PMMA resulted in low surface roughness, attributed to reduced cross-linking due to termination of dangling bonds in the polymer by H and F atoms. Data support a proposed mechanism in which surface roughness is caused by polymer aggregation associated with cross-linking induced by energetic ion bombardment that leads to etch rate non-uniformity.
In this mechanism, RMS roughness peaks when cross-linking rates are comparable to chain scissioning rates, and drops to negligible levels for either very low or very high rates of cross-linking. Etch rate non-uniformity, and thus surface roughness, is low under two extreme cross-linking conditions: 1) very low rates and 2) rates sufficiently high that a continuous cross-linked surface layer creates a homogeneous surface, while intermediate rates produce a heterogeneous surface, non-uniform etch rate and subsequent roughness.

Dynamic Modeling of Electrochemical Cells With Application to Proximal Three-Terminal Electrolysis

Electrolyzing water in a neutral solution is typically less efficient than in an acidic or basic solution. This is because in the absence of abundant H+ or OH- ions, water molecules must be decomposed at both electrodes. Water is oxidized at the anode, producing oxygen gas and H+ ions, and reduced at the cathode, producing hydrogen gas and OH- ions. The energy to drive both these reactions is provided by a potential applied to the electrodes equal to at least the sum of the two respective standard potentials. The excess energy delivered by this potential is converted to heat when OH- ions recombine with H+ ions in solution. The work described here circumvents this problem by taking advantage of highly proximal working electrodes, fabricated with semiconductor processing techniques, coupled with the introduction of a third electrode used to drive alternate reactions. Applying an alternating potential to this third electrode, the gate electrode, enables selective driving of the two decomposition reactions in alternate time periods. In this way the product ions of the half reactions, H+ and OH-, can diffuse through the solution in the relative absence of the other, increasing the probability of collection at the opposite electrode. It is found that such collection does occur in devices where the spacing between working electrodes is on the scale of a few microns. It is also found that efficiency improves when such a proximal effect occurs. To aid in understanding the behavior of the gated electrolysis cell, a network model was developed, which is presented in detail. The modeling method uses linked chemical and electrical domains, similar to models developed for batteries, but differs in how it handles the widely changing ion concentrations which exist in an electrolytic cell with AC excitation. As such, the method has broad application to driven electrolytic cells. As a further validation of the method, a model for an electrochemical DNA sensor is also presented.

Computable Performance Analysis of Recovering Signals with Low-dimensional Structures

The last decade witnessed burgeoning development in the reconstruction of signals by exploiting their low-dimensional structures, particularly sparsity, block-sparsity, low-rankness, and the low-dimensional manifold structures of general nonlinear data sets. The reconstruction performance of these signals relies heavily on the structure of the sensing matrix/operator. In many applications, there is flexibility to select the optimal sensing matrix from a class of candidates. A prerequisite for optimal sensing matrix design is the computability of the performance for different recovery algorithms. I present a computational framework for analyzing the recovery performance of signals with low-dimensional structures. I define a family of goodness measures for arbitrary sensing matrices as the optimal values of a set of optimization problems. As one of the primary contributions of this work, I associate the goodness measures with the fixed points of functions defined by a series of linear programs, second-order cone programs, or semidefinite programs, depending on the specific problem. This relation with the fixed-point theory, together with a bisection search implementation, yields efficient algorithms to compute the goodness measures with global convergence guarantees. As a by-product, we implement efficient algorithms to verify sufficient conditions for exact signal recovery in the noise-free case. The implementations perform orders of magnitude faster than the state-of-the-art techniques. The utility of these goodness measures lies in their relation with the reconstruction performance. I derive bounds on the recovery errors of convex relaxation algorithms in terms of these goodness measures.
Using tools from empirical processes and generic chaining, I analytically demonstrate that as long as the number of measurements is relatively large, these goodness measures are bounded away from zero for a large class of random sensing matrices, a result parallel to the probabilistic analysis of the restricted isometry property. Numerical experiments show that, compared with the restricted isometry based performance bounds, our error bounds apply to a wider range of problems and are tighter when the sparsity levels of the signals are relatively low. I expect that computable performance bounds would open doors for wide applications in compressive sensing, sensor arrays, radar, MRI, image processing, computer vision, collaborative filtering, control, and many other areas where low-dimensional signal structures arise naturally.

Multidimensional signal processing in spatial-spectral holographic media

In this thesis I present the analyses, simulations and demonstrations of a number of novel optical signal processing systems, which are designed to explore the large bandwidths (10s to 100s of GHz), time-bandwidth products (10^5 and greater) and massive spatial parallelism that spatial-spectral holography and photon-echo (PE) processing can provide. The systems investigated include RF spectrum analyzers, a time-integrating correlator, an RF-array multibeam imager, and a high-bandwidth LIDAR range-Doppler processor, all of which were built around a Tm3+:YAG crystal as the spatial-spectral holographic (SSH) medium. The time-integrating correlator (TIC) is the first SSH experiment that illustrates spatial coherence across parallel channels of PE processors. In this experiment, ∼150 SSH gratings with linearly increasing time-delays are recorded in the SSH, which, when read out, result in the scanned output required for the TIC. In the high-bandwidth RF spectrum analyzer, the spectral components from an RF signal are modulated onto an optical carrier and burned into the spectrally selective absorption band of the SSH. This altered absorption profile is then recovered by a frequency-swept source and a high dynamic range, low bandwidth detector. This is the first experimental SSH system to process RF signals with bandwidths in excess of 10 GHz, and was enabled by a novel linearized readout technique. In the LIDAR experiment, the Doppler and range information of targets is encoded in the position and spectral period of sinusoidal SSH gratings. These gratings (spanning ∼16 GHz) are snapped out with the linearized readout technique and post processed to recover the Doppler and range of the targets. The required experimental infrastructure and the spectrally-linearized chirped readout laser are discussed in detail.
