Pure Sciences

Pure Sciences Paper For Sale

Identifying gene-gene interactions and transcription regulators via dimension reduction methods

The advent of whole-system approaches, such as DNA chips and high-throughput sequencing, has created opportunities for exploring new computational dimension reduction algorithms in modern genetic analysis. In this dissertation, I have applied dimension reduction methods to solve two problems in the field of genetics: first, detecting gene-gene interactions (or epistasis), and second, identifying candidate transcription regulatory genes (or transcription factors). For the first problem, I proposed two combinatorial statistical methods: MCSM and CPMDR. MCSM (Multivariate Combinatorial Searching Method) is designed to identify a set of loci that are associated with multiple traits. It can take into account multiple phenotypes at one time, and utilizes various feature selection techniques to search for a set of disease-susceptibility genes that may interact. By applying MCSM to the GAW16 (Genetic Analysis Workshop 16) rheumatoid arthritis data, we have identified a significant gene-gene interaction between two genes, PTPN22 and TRAF1-C5. CPMDR is a novel likelihood-based combinatorial method to locate interplaying genes using only cases and their parents. It utilizes a score of the conditional likelihood for each nuclear family (parents and diseased children) to partition the multi-locus genotypes into high and low risk classes. Our simulation results showed that CPMDR performed uniformly better at detecting underlying interactions than other popular methods in a variety of scenarios. As to the second problem, I have designed an automated algorithm that combines adaptive sparse canonical correlation analysis (ASCCA) with k-means clustering analysis for recognizing transcription factors (TFs) involved in a biological process using pooled gene expression data from publicly available resources. This algorithm is demonstrated to be highly efficient in ranking known transcription factors or inferring novel ones, and multifunctional TFs can also be identified by intersecting the gene lists involved in different biological processes.
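The idea of partitioning multi-locus genotypes into high and low risk classes can be illustrated with a minimal sketch. Note that this is the generic case-control form of multifactor dimensionality reduction, which labels a genotype cell by its case:control ratio; it is not CPMDR itself, whose family-based conditional likelihood score is not specified in the abstract. The function name `partition_genotypes` and the genotype coding are assumptions made for illustration only.

```python
from collections import defaultdict


def partition_genotypes(genotypes, is_case, threshold=None):
    """Label each observed multi-locus genotype as 'high' or 'low' risk.

    genotypes: list of tuples, one tuple of per-locus genotype codes per subject
    is_case:   list of bools, True for affected subjects
    threshold: case:control ratio cutoff; defaults to the overall ratio
               (assumes the sample contains at least one control)
    """
    n_cases = sum(is_case)
    n_controls = len(is_case) - n_cases
    if threshold is None:
        threshold = n_cases / n_controls

    # Count cases and controls in every multi-locus genotype cell.
    counts = defaultdict(lambda: [0, 0])  # genotype -> [cases, controls]
    for genotype, case in zip(genotypes, is_case):
        counts[genotype][0 if case else 1] += 1

    labels = {}
    for genotype, (cases, controls) in counts.items():
        # A cell containing only cases has an infinite ratio and is high risk.
        ratio = float("inf") if controls == 0 else cases / controls
        labels[genotype] = "high" if ratio > threshold else "low"
    return labels
```

Collapsing the cells into two classes reduces a table with many genotype combinations to a single binary attribute, which is the dimension reduction step that the combinatorial search then evaluates for each candidate set of loci.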

Perhaps You will be interested in these papers

Biomonitoring for exposure to trace elements in utero: Analysis of the human placenta

The placenta forms in eutherian mammals and is responsible for the nutrition of the developing fetus. However, maternal exposure to environmental pollutants both before and during pregnancy may result in the passage of toxins through the placental barrier and into fetal tissues. The placenta is the only organ derived from both maternal and fetal tissues, and establishes a link between the fetus and the environmental exposures of the mother. The analysis of placentae for the presence of environmental pollutants offers the possibility of exposure measurements in both the mother and the developing fetus. Specifically, trace element determination in human placentae may reveal fetal nutritional requirements, as well as identify potential indicators of negative health effects in both the mother and fetus. The principal goal of this project is to analyze approximately 160 archived human term placental tissues (placenta body, placenta membrane, and umbilical cord) for essential trace elements, such as copper, zinc, and selenium; nonessential trace elements, including mercury, lead, and cadmium; and the rare earth elements of the lanthanide series. Sample preparation procedures focus on trace element homogeneity within the placenta and on contamination prevention. Analytical methodologies based on inductively coupled plasma mass spectrometry (ICP-MS) and electrothermal atomic absorption spectrometry (ETAAS) are developed and validated for all analytes of interest, using a variety of quality control materials. Subsequently, total element concentrations in each tissue component are measured and compared. Possible inter-element correlations within each tissue component are identified, as well as potential associations between analytical measurements and selected demographic and obstetric variables collected from the population studied. Results indicate that the placenta largely accumulates cadmium, an element strongly correlated with maternal smoking behavior. Lead and mercury are easily transported into fetal tissues. Placental concentrations of rare earth elements followed the typical abundance pattern of these elements in Earth's crust, suggesting natural sources of exposure. Inter-element correlations were found for rubidium and cesium, as well as for aluminum and other bone-seeking elements, such as lead, lanthanum, uranium, barium, and strontium. Manganese in the placenta was positively correlated with infant growth variables, including birth weight, length, and head and chest circumference.

Bayesian hierarchical modeling for adaptive incorporation of historical information in clinical trials

Bayesian clinical trial designs offer the possibility of a substantially reduced sample size, increased statistical power, and reductions in cost and ethical hazard. However, when prior and current information conflict, Bayesian methods can lead to higher than expected Type I error, as well as the possibility of a costlier and lengthier trial. We develop several models in which the commensurability of the information in the historical and current data determines how much historical information is used. First, we propose methods for univariate Gaussian data and provide an example analysis of data from two successive colon cancer trials that illustrates a linear models extension of our adaptive borrowing approach. Next, we extend the general method to linear and linear mixed models as well as generalized linear and generalized linear mixed models. We also provide two more sample analyses using the colon cancer data. Finally, we consider the effective historical sample size of our adaptive method for the case when historical data are available only for the concurrent control arm, and propose “optimal” use of new patients in the current trial using an adaptive randomization scheme that is balanced with respect to the amount of incorporated historical information. The approach is then demonstrated using data from a trial comparing antiretroviral strategies in HIV-1-infected persons. Throughout the thesis we present simulation studies that compare frequentist operating characteristics and highlight the advantages of our adaptive borrowing methods.
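To make the notion of partial borrowing and effective historical sample size concrete, the following sketch uses a fixed-weight power prior for a Gaussian mean with known variance and a flat initial prior. This is a simpler stand-in, not the adaptive commensurate approach of the thesis, where the amount of borrowing is learned from the agreement between historical and current data. The function name `borrow_gaussian` and the weight `a0` are assumptions made for illustration.

```python
def borrow_gaussian(ybar, n, ybar0, n0, sigma2, a0):
    """Posterior for a Gaussian mean when historical data are down-weighted.

    ybar, n:   sample mean and size of the current trial
    ybar0, n0: sample mean and size of the historical trial
    sigma2:    known sampling variance, assumed common to both trials
    a0:        fixed borrowing weight in [0, 1]; 0 ignores the historical
               data and 1 pools it fully (hypothetical tuning parameter)
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    effective_n0 = a0 * n0            # effective historical sample size
    total_n = effective_n0 + n
    post_mean = (effective_n0 * ybar0 + n * ybar) / total_n
    post_var = sigma2 / total_n
    return post_mean, post_var, effective_n0
```

With `a0 = 0.5`, a historical trial of 100 patients contributes as much information as 50 current patients. When the two data sources conflict, a large fixed `a0` pulls the posterior mean toward the historical value, which is the source of the inflated Type I error that adaptive borrowing is meant to control.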

Enzymology and medicinal chemistry of N5-carboxyaminoimidazole ribonucleotide synthetase: A novel antibacterial target

N5-Carboxyaminoimidazole ribonucleotide synthetase (N5-CAIR synthetase), a key enzyme in microbial de novo purine biosynthesis, catalyzes the conversion of aminoimidazole ribonucleotide (AIR) to N5-CAIR. To date, this enzyme has been observed only in microorganisms, and thus it represents an ideal target for antimicrobial drug development. Here, we report structural and functional studies on the Aspergillus clavatus N5-CAIR synthetase and the identification of inhibitors of the enzyme. In collaboration with Dr. Hazel Holden of the University of Wisconsin, the three-dimensional structure of Aspergillus clavatus N5-CAIR synthetase was solved in the presence of either Mg2ATP or MgADP and AIR. These structures, determined to 2.1 and 2.0 Å resolution, respectively, revealed that AIR binds in a pocket analogous to that observed for other ATP-grasp enzymes involved in purine metabolism. On the basis of these models, a site-directed mutagenesis study was subsequently conducted that focused on five amino acid residues located in the active site region of the enzyme. These investigations demonstrated that Asp153 and Lys353 play critical roles in catalysis without affecting substrate binding. All other mutations affected substrate binding and, in some instances, catalysis as well. Taken together, the structural and kinetic data presented here suggest a catalytic mechanism whereby Mg2ATP and bicarbonate first react to form the unstable intermediate carboxyphosphate. This intermediate subsequently decarboxylates to CO2 and inorganic phosphate, and the amino group of AIR, through general base assistance by Asp153, attacks CO2 to form N5-CAIR. To identify inhibitors of this enzyme, we conducted high-throughput screening (HTS) against Escherichia coli N5-CAIR synthetase using a highly reproducible phosphate assay. HTS of 48,000 compounds identified 14 compounds that inhibited the enzyme. The hits could be classified into three classes based on chemical structure. Class I contains compounds with an indenedione core. Class II contains an indolinedione group, and class III contains compounds that are structurally unrelated to other inhibitors in the group. We determined the Michaelis-Menten kinetics for five compounds representing each of the classes. Examination of compounds belonging to class I indicates that these compounds do not follow normal Michaelis-Menten kinetics. Instead, these compounds inhibit N5-CAIR synthetase by reacting with the substrate AIR. Kinetic analysis indicates that the class II family of compounds is non-competitive with both AIR and ATP. One compound in class III is competitive with AIR but uncompetitive with ATP, whereas the other is non-competitive with both substrates. Finally, these compounds display no inhibition of human AIR carboxylase, indicating that these agents are selective inhibitors of N5-CAIR synthetase. Given the importance of the class II, non-competitive inhibitors, we developed a diazirine-based photocrosslinking agent to identify the binding site of these inhibitors. These studies revealed that the isatin core of class II inhibitors is capable of undergoing photochemical conversion to isatoic anhydride. Once formed, the anhydride is capable of reacting with the protein. Treatment of N5-CAIR synthetase with the photoreactive agent led to the dimerization of two monomers of the synthetase. Proteomic analysis of the crosslinked protein identified serine 227 as a possible site of modification. These studies also revealed two peptides that were missing in the dimerized protein sample. These two peptides were located near serine 227. While compelling, the location of the missing peptides and serine 227 is 20 Å away from the dimerization interface observed in the crystal structure. Thus, our photocrosslinking studies suggest that N5-CAIR synthetase may exist in multiple dimer conformations.
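The distinction between competitive, uncompetitive, and non-competitive inhibition can be summarized with the textbook rate law v = Vmax·S / (Km·(1 + I/Kic) + S·(1 + I/Kiu)). The sketch below is a generic single-substrate illustration, assuming one substrate is varied while the other is held fixed; the parameter names `kic` and `kiu` and all numeric values are hypothetical and do not come from the thesis.

```python
import math


def inhibited_rate(s, i, vmax, km, kic=math.inf, kiu=math.inf):
    """Initial rate under the general (mixed) inhibition rate law.

    s, i:  substrate and inhibitor concentrations
    kic:   inhibition constant for binding to free enzyme
    kiu:   inhibition constant for binding to the enzyme-substrate complex

    Competitive:          finite kic, kiu = inf
    Uncompetitive:        kic = inf, finite kiu
    Pure non-competitive: kic == kiu
    """
    slope_factor = 1.0 + i / kic       # raises the apparent Km
    intercept_factor = 1.0 + i / kiu   # lowers the apparent Vmax
    return vmax * s / (km * slope_factor + s * intercept_factor)
```

A practical consequence is that saturating substrate overcomes a competitive inhibitor but not a non-competitive one, which is one way the two patterns are told apart when the assay is repeated with each substrate varied in turn.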

Semiparametric models for joint analysis of longitudinal data and counting processes

In this dissertation, we study statistical methodology for joint modeling that correctly controls for the interplay among longitudinal and counting processes and makes the most efficient use of data. Three types of joint modeling approaches are proposed, each motivated by a different study purpose. In the first topic, we develop a method for joint modeling of longitudinal data and recurrent events in the presence of an informative terminal event. We focus on data from patients who experience the same type of event multiple times, such as multiple infection episodes or recurrent strokes, have longitudinal biomarkers, and may be subject to an event, for example death, that makes further observations impossible. To analyze such complicated data, we propose joint models based on a likelihood approach. A broad class of transformation models for the cumulative intensity of recurrent events and the cumulative hazard of the terminal event is considered. We propose to estimate all the parameters using nonparametric maximum likelihood estimators (NPMLEs), and we provide computationally efficient EM algorithms to implement the proposed inference procedure. The estimators are shown to be asymptotically normal and semiparametrically efficient. Finally, we evaluate the performance of the proposed method through extensive simulations and application to real data. In the second topic, we develop a method for joint modeling of longitudinal and cure-survival data. By cure-survival data, we mean time-to-event data in which a certain proportion of patients never have any event during a sufficiently long follow-up period. These patients are believed to have been cured by treatment, such as radiation therapy or an initial surgery, and are often the source of heavy tail probabilities in survival curves. To take into account the possibility of patients being cured, we propose to model time-to-event through a transformed promotion time cure model, jointly with a linear mixed effects model for longitudinal data. Because of the transformations applied to the promotion time cure model, the proposed method can be used in cases where the proportionality assumption does not hold. All the parameters are estimated using NPMLEs, and inference procedures are implemented via a simple EM algorithm. Asymptotic properties of the proposed NPMLEs are derived based on empirical process theory. Simulation studies are conducted to demonstrate the small-sample performance of the proposed method, and the method is applied to the ARIC data. In the third topic, we develop a partially linear model for longitudinal data with informative censoring, where the main interest is in making inferences about an individual's trajectory of longitudinal responses, which may be informatively censored. Since a fully parameterized mean structure may be insufficient to capture the underlying patterns of longitudinal and event processes, we propose to use a partially linear model for longitudinal responses, where an unspecified underlying function is formulated along with linear covariate effects, and a transformation model is used for informative censoring times. We employ sieve estimation for the nonparametric trajectory of longitudinal responses, where the unknown trajectory is approximated by cubic B-spline basis functions. All parameters are estimated based on a likelihood approach, and inference procedures are implemented via the EM algorithm. We also investigate a reliable way to select the number of knots and the best transformation. Through empirical process theory, the proposed estimators are shown to have desirable asymptotic properties. The validity of the proposed method is confirmed by simulated and real data examples.
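The cubic B-spline approximation mentioned in the third topic can be illustrated by evaluating the basis functions themselves. The sketch below uses the standard Cox-de Boor recursion; the clamped knot vector in the usage note and the helper name `basis_row` are assumptions for illustration, not the knot selection procedure studied in the dissertation.

```python
def bspline_basis(i, degree, knots, x):
    """Value at x of the i-th B-spline basis function (Cox-de Boor recursion)."""
    if degree == 0:
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left = 0.0
    span = knots[i + degree] - knots[i]
    if span > 0:  # repeated knots give zero-width spans, which contribute nothing
        left = (x - knots[i]) / span * bspline_basis(i, degree - 1, knots, x)
    right = 0.0
    span = knots[i + degree + 1] - knots[i + 1]
    if span > 0:
        right = ((knots[i + degree + 1] - x) / span
                 * bspline_basis(i + 1, degree - 1, knots, x))
    return left + right


def basis_row(knots, degree, x):
    """All basis function values at x: one row of the spline design matrix."""
    n_basis = len(knots) - degree - 1
    return [bspline_basis(i, degree, knots, x) for i in range(n_basis)]
```

For example, `basis_row([0, 0, 0, 0, 1, 2, 3, 3, 3, 3], 3, 1.7)` returns six non-negative weights that sum to one. An unknown trajectory is then approximated as a linear combination of these columns, so adding interior knots enlarges the sieve and the nonparametric part is estimated alongside the linear covariate effects.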

Biomedical applications of cobalt-spinel ferrite nanoparticles for cancer cell extraction and drug delivery

In this presentation it is demonstrated that the unique magnetic properties of superparamagnetic cobalt-spinel ferrite nanoparticles can be employed in several novel applications. A method to selectively capture and remove pathogens from infected organisms to improve longevity is presented. Evidence is provided to show that automated methods using modified forms of hemofiltration or peritoneal dialysis could be used to eliminate the particle/pathogen or particle/infected cell conjugates from the organism postoperatively. It is shown that disparately functionalized nanoparticles can be used in concert as drug carrier and release mechanisms. Lastly, we provide preliminary evidence to support the use of magnetic nanoparticles for controlling reaction kinetics.

Statistical designs and algorithms for mapping cancer genes

The identification of genes that are directly involved in tumor initiation and maintenance is instrumental for understanding the phenotypic variation of cancer and ultimately designing crucial therapeutic drugs to treat this disease. In recent years, the completed genome sequences of humans and cancers have markedly enhanced cancer gene identification. The overall goal of this dissertation is to develop a warehouse of statistical tools for identifying cancer genes from rapidly growing volumes of sequence data. These tools are founded on the latest discoveries about the genetic and developmental roots of cancer formation, including somatic mutations, aneuploid induction, epigenetic modifications, transgenerational imprinting, copy number variants, and host-tumor genetic interactions. New statistical methods and algorithms are developed to integrate each of these discoveries. By comparing the difference in DNA structure and sequence between the human and cancer genomes, a disequilibrium model has been formulated to identify and test the genetic mutations or “drivers” that cause cancer. A quantitative model is derived to unravel the aneuploidy control of cancer and estimate the genetic effects of aneuploid loci on cancer risk. Using a common three-generation design, a two-stage hierarchical model is developed to estimate and test the transgenerational alteration of genetic effects and identify genetic imprinting effects due to different parental origins of the same allele. This hierarchical model allows the characterization of genetic interactions between additive and dominant effects and imprinting effects over generations. Cancer susceptibility may be controlled not only by host genes and mutated genes in cancer cells, but also by the epistatic interactions between genes from the host and cancer genomes. A model was derived to estimate genome-genome interactions of host DNA and cancer DNA. Models for cancer gene identification require the solution of missing data problems, given that cancer genes and their incidence in a natural population cannot be observed directly. For this reason, I have built the models within the mixture model framework. Maximum likelihood approaches, implemented with the EM algorithm, have been derived to provide estimates of genetic parameters related to mutation rates, chromosome duplication rates, genetic imprinting, genetic interactions, and haplotype frequencies. I have performed various sets of computer simulations to investigate the statistical properties of the new models in terms of power, estimation precision, and false positive rates. A series of practical computational issues, including convergence rates and choices of initial values, are discussed. I have also formulated various testable hypotheses about the frequencies of genetic mutations and the effects of host genes, cancer genes, and their interactions on cancer susceptibility. This dissertation provides what is, thus far, the most complete set of statistical models for cancer gene identification in the literature. The biological relevance and statistical sophistication of these models will make them practically useful for unlocking the genetic secrets of cancer.
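The role of the EM algorithm in a mixture model with unobserved class membership can be shown with a deliberately simple stand-in: a two-component Gaussian mixture with unit variances. This is not one of the genetic models of the dissertation, whose likelihoods are not given in the abstract; the function name `em_two_gaussians` and the starting values are assumptions for illustration.

```python
import math


def em_two_gaussians(data, mu_a, mu_b, weight_a=0.5, n_iter=50):
    """EM for a two-component Gaussian mixture with unit variances.

    The unobserved component label of each observation plays the role of
    the missing data. Assumes moderate data values so that the component
    densities do not both underflow to zero.
    """
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component a.
        resp = []
        for x in data:
            dens_a = weight_a * math.exp(-0.5 * (x - mu_a) ** 2)
            dens_b = (1.0 - weight_a) * math.exp(-0.5 * (x - mu_b) ** 2)
            resp.append(dens_a / (dens_a + dens_b))
        # M-step: responsibility-weighted updates of the means and the weight.
        total_a = sum(resp)
        total_b = len(data) - total_a
        mu_a = sum(r * x for r, x in zip(resp, data)) / total_a
        mu_b = sum((1.0 - r) * x for r, x in zip(resp, data)) / total_b
        weight_a = total_a / len(data)
    return mu_a, mu_b, weight_a
```

The result depends on the starting values `mu_a` and `mu_b`, which is why the choice of initial values and the convergence rate are treated as practical issues in their own right.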

Detector development for positron emission based real-time tumor tracking

Tumor motion limits the accuracy of radiation therapy. Positron emission tracking (PeTrack) is a technique that can track tumors through detecting annihilation gammas from implanted positron emission markers. This thesis focuses on detector development for PeTrack. Due to the intense scattered x rays from a Linac, scintillator (BGO) afterglow and detector gating, i.e. turning off the detector during the intense x-ray pulse, need to be addressed. The evaluation of BGO showed very low afterglow. A gating circuit was designed, optimized and tested. Energy resolution of the detector is better than 25% (FWHM) with optimal gating parameters. A data acquisition system for PeTrack was set up and calibrated successfully. The first PeTrack prototype was developed and evaluated. The prototype was able to localize two positron emitting markers with an average precision of 0.16-0.21 mm, and an average accuracy of 0.6 mm on distance between the two markers.

Structural Insights into the Adaptability and Specificity of Near Germline Monoclonal Antibodies

All jawed vertebrates have humoral immune systems that are capable of recognizing a nearly limitless number of potential antigens, yet an individual organism's genome contains a limited number of genes. In order to achieve this stunning adaptability, this limited number of germline gene segments is combined to form the primary antibody repertoire. Further expansion of affinity and specificity is achieved by somatic hypermutation, a process which usually requires T cell help. Carbohydrate antigens are customarily unable to elicit such T cell help and are thus more dependent on the primary germline gene repertoire. Carbohydrate-specific antibodies thus make an excellent model to scrutinize how these germline gene segments are capable of balancing the need for specificity while maintaining the capability of adapting to new antigenic challenges. This thesis explores the structural basis of this adaptability and specificity using carbohydrate-specific antibodies derived from two model systems. The first model system exploits antibodies generated against Chlamydial inner core lipopolysaccharide (LPS) carbohydrates, while the second model system is an antibody specific for the Tn antigen (Thr/Ser-GalNAc). The crystallization of several homologous Chlamydia-specific antibodies that differ in their fine specificity revealed a conserved Kdo binding pocket and an adaptive binding groove. Small changes in the sequences of CDR H2 and CDR H3 within the binding groove enable these remarkable antibodies to distinguish among varying Chlamydial LPS epitopes. The crucial role of CDR H3 in defining antibody specificity was highlighted by the examination of this series of antibodies of generally similar sequence but with varying CDR H3. The structure of CDR H3 was found to play an important role in defining antibody promiscuity or specificity in binding, and was found to code for both redundant and differential antigen recognition. The structures of these antibodies revealed how specificity is maintained by coding for a conserved Kdo binding pocket, while D and J gene rearrangements combined with somatic hypermutation allow differential recognition of Chlamydial epitopes, thus providing an important means of antigen adaptability. The structure of the Tn antigen-specific 237mAb revealed a unique mechanism of imposing specificity by having the combining site require the interaction of both a sugar moiety (in a binding pocket composed of germline gene residues) and a peptide moiety (in a long surface groove). This structure revealed how the immune system can create a highly specific antibody by forcing different regions of the combining site to both contribute to binding, thus preventing cross-reactivity with similar epitopes. Determination of the structures of antibodies from these two models demonstrates how a binding pocket that provides the base specificity of the germline gene segments, combined with sequence variation in CDR H3, allows for adaptability to modifications on the core epitopes. In addition to examining adaptability and specificity in mAbs, this thesis examined the carbohydrate specificity of the toxin aerolysin. Aerolysin is a bacterial channel-forming toxin produced by Aeromonas species. The toxin and its inactive precursor proaerolysin both bind to the conserved glycan core of glycosylphosphatidylinositol (GPI)-anchored proteins with high affinity. Here, I report the high resolution structure of proaerolysin in complex with mannose-6-phosphate, which is a component of the GPI anchor core glycan. The structure reveals unambiguous electron density for the monosaccharide and for the residues involved in binding. Trp-127, Arg-323, Trp-324 and Arg-336 all form hydrogen bonds to mannose-6-phosphate, and there are electrostatic interactions between the phosphate moiety and the side chain of Arg-336. Trp-127 forms hydrophobic stacking interactions with the mannose ring, which result in the expulsion of water from the binding site. Examination of the carbohydrate specificity of the toxin by surface plasmon resonance (SPR) revealed that the toxin did not bind to all GPI anchor structures equally. SPR also suggested that mannose-6-phosphate is unlikely to represent the true binding determinant of the toxin, but rather only a partial low affinity ligand.

Phagosome maturation: Aging with pH, lysosome-associated membrane proteins, and cholesterol, while staying young with Burkholderia cenocepacia

Phagocytosis is an innate immune response that is paramount in the clearance of pathogenic particles. Recognition of target particles by phagocytic receptors expressed on phagocytes induces modifications in the underlying actin cytoskeleton to form pseudopods that encircle and internalize the target particle into a membrane-bound organelle called the phagosome. The nascent phagosome undergoes a maturation sequence that is characterized by substantial remodeling of the membrane and its luminal contents through interactions with components of the endocytic pathway, culminating in an acidic and hydrolytic organelle capable of digesting and eliminating pathogens. Phagosome maturation is a complicated pathway that involves many protein and lipid signaling molecules. Several factors that influence phagosome maturation were explored, particularly the participation of pH, lysosome-associated membrane proteins-1 and -2 (LAMPs), and cholesterol, in addition to the survival and escape mechanisms used by Burkholderia cenocepacia. All three factors are essential for phagosome maturation, although each has different mechanistic consequences. Acidification alters Rab5 activation, while ablation of LAMPs and accumulation of cholesterol interfere with various aspects of Rab7 turnover in phagosome and/or endosome membranes. Moreover, Burkholderia cenocepacia, an intracellular pathogen, inactivates Rab7 on phagosome membranes from within the vacuole lumen. Herein, mechanisms that govern phagosome maturation are explored and several molecules are added to the long list of essential players in this complicated pathway.
