Pure Sciences

Pure Sciences Paper For Sale

Space-time forecasting and evaluation of wind speed with statistical tests for comparing accuracy of spatial predictions

High-quality short-term forecasts of wind speed are vital to making wind power a more reliable energy source. Gneiting et al. (2006) introduced a model for the average wind speed two hours ahead based on both spatial and temporal information. The forecasts produced by this model are accurate, and subject to accuracy, the predictive distribution is sharp, i.e., highly concentrated around its center. However, this model is split into nonunique regimes based on the wind direction at an off-site location. This work both generalizes and improves upon this model by treating wind direction as a circular variable and including it in the model. The resulting model is robust in many experiments, such as predicting at new locations. It is compared with the more common approach of modeling wind speeds and directions in Cartesian space and using a skew-t distribution for the errors. The quality of the predictions from all of these models can be more realistically assessed with a loss measure that depends upon the power curve relating wind speed to power output. This proposed loss measure yields more insight into the true value of each model's predictions. One method of evaluating time series forecasts, such as wind speed forecasts, is to test the null hypothesis of no difference in the accuracy of two competing sets of forecasts. Diebold and Mariano (1995) proposed a test in this setting that has been extended and widely applied. It allows the researcher to specify a wide variety of loss functions, and the forecast errors can be non-Gaussian, nonzero mean, serially correlated, and contemporaneously correlated. In this work, a similar unconditional test of forecast accuracy for spatial data is proposed. The forecast errors are no longer potentially serially correlated but spatially correlated. Simulations will illustrate the properties of this test, and an example with daily average wind speeds measured at over 100 locations in Oklahoma will demonstrate its use.
This test is compared with a wavelet-based method introduced by Shen et al. (2002) in which the presence of a spatial signal at each location in the dataset is tested.
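
For readers unfamiliar with the time-series test that motivates the spatial one, the following is a minimal sketch of the Diebold and Mariano statistic. It is not the spatial test proposed in this work. The function name, the squared-error default loss, and the rectangular-kernel variance estimate are illustrative assumptions.

```python
import math
import numpy as np

def diebold_mariano(errors_a, errors_b, loss=np.square, horizon=1):
    """Return the Diebold-Mariano statistic and a two-sided normal p-value.

    errors_a, errors_b: forecast errors of two competing methods (same length).
    loss: elementwise loss function (assumed here: squared error).
    horizon: forecast horizon h; autocovariances up to lag h-1 are included.
    """
    # Loss differential series d_t = L(e_a,t) - L(e_b,t)
    d = loss(np.asarray(errors_a, float)) - loss(np.asarray(errors_b, float))
    n = d.size
    d_bar = d.mean()
    centered = d - d_bar
    # Long-run variance with a rectangular kernel; it can be non-positive
    # in small samples when horizon > 1, which this sketch does not handle.
    lrv = centered @ centered / n
    for lag in range(1, horizon):
        lrv += 2.0 * (centered[lag:] @ centered[:-lag]) / n
    stat = d_bar / math.sqrt(lrv / n)
    # Two-sided p-value under the standard normal reference distribution
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value
```

A negative statistic indicates a smaller average loss for the first set of forecasts. A loss based on a power curve could be passed through the `loss` argument in place of squared error.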

Perhaps You will be interested in these papers

New developments in planning accelerated life tests

Accelerated life tests (ALTs) are often used to make timely assessments of the lifetime distribution of materials and components. The goal of many ALTs is estimation of a quantile of a log-location failure time distribution. Much of the previous work on planning accelerated life tests has focused on deriving test-planning methods under a specific log-location distribution. This thesis presents a new approach for computing approximate large-sample variances of maximum likelihood estimators of a quantile of a general log-location distribution with censoring and time-varying stress based on a cumulative exposure model. This thesis also presents a strategy to develop useful test plans using a small number of test units. We provide an approach to find optimum step-stress accelerated life test plans by using the large-sample approximate variance of the maximum likelihood estimator of a quantile of the failure time distribution at use conditions from a step-stress accelerated life test. In Chapter 2, we show this approach allows for multi-step stress changes and censoring for general log-location-scale distributions. As an application of this approach, the optimum variance is studied as a function of shape parameter for both Weibull and lognormal distributions. Graphical comparisons among test plans using step-up, step-down, and constant-stress patterns are also presented. The results show that, depending on the values of the model parameters and quantile of interest, each of the three test plans can be preferable in terms of optimum variance. In Chapter 3, using sample data from a published paper describing optimum ramp-stress test plans, we show that our approach and the one used in the previous work give the same variance-covariance matrix of the quantile estimator.
Then, as an application of this approach, we extend the previous work to a new optimum ramp-stress test plan obtained by simultaneously adjusting the ramp rate and the lower start level of stress. We find that the new optimum test plan can have a smaller variance than the optimum ramp-stress test plan previously obtained by adjusting only the ramp rate. We also compare optimum ramp-stress test plans with the more commonly used constant-stress accelerated life test plans. Previous work on planning accelerated life tests has been based on large-sample approximations to evaluate test plan properties. In Chapter 4, we use more accurate simulation methods to investigate the properties of accelerated life tests with small sample sizes where large-sample approximations might not be expected to be adequate. These properties include the simulated bias and variance for quantiles of the failure-time distribution at use conditions. We focus on using these methods to find practical compromise test plans that use three levels of stress. We also study the effects of not having any failures at test conditions and the effect of using incorrect planning values. We note that the large-sample approximate variance is far from adequate when the probability of zero failures at certain test conditions is not negligible. We suggest a strategy to develop useful test plans using a small number of test units while meeting constraints on the estimation precision and on the probability that there will be zero failures at one or more of the test stress levels.
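
The zero-failure constraint mentioned above can be illustrated with a small calculation. This is a sketch under the assumption that units fail independently and that each stress level has a known probability of failure before the end of the test. The function names and the example numbers are hypothetical and are not taken from the thesis.

```python
def prob_zero_failures(n_units, p_fail):
    """Probability that none of n_units fails, assuming independent units
    that each fail before the censoring time with probability p_fail."""
    return (1.0 - p_fail) ** n_units

def prob_any_level_zero(allocations, p_fails):
    """Probability of zero failures at one or more stress levels,
    assuming the levels are tested independently of each other."""
    p_every_level_has_failure = 1.0
    for n_units, p_fail in zip(allocations, p_fails):
        p_every_level_has_failure *= 1.0 - prob_zero_failures(n_units, p_fail)
    return 1.0 - p_every_level_has_failure
```

For example, with 10 units at a level where each unit fails with probability 0.1, `prob_zero_failures(10, 0.1)` is 0.9 to the tenth power, roughly 0.349, which is far from negligible for a small test.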


Nonparametric function smoothing: Fiducial inference of free knot splines and ecological applications

Nonparametric function estimation has proven to be a useful tool for applied statisticians. Classic techniques such as locally weighted regression and smoothing splines are being used in a variety of circumstances to address questions at the forefront of ecological theory. We first examine an ecological threshold problem and define a threshold as where the derivative of the estimated function changes state (negative, possibly zero, or positive) and present a graphical method that examines the state changes across a wide interval of smoothing levels. We apply this method to macroinvertebrate data from the Arkansas River. Next we investigate a measurement error model and a generalization of the commonly used regression calibration method whereby a nonparametric function is used instead of a linear function. We present a simulation study to assess the effectiveness of the method and apply the method to a water quality monitoring data set. The possibility of defining thresholds as knot point locations in smoothing splines led to the investigation of the fiducial distribution of free-knot splines. After introducing the theory behind fiducial inference, we derive conditions sufficient for asymptotic normality of the multivariate fiducial density. We then derive the fiducial density for an arbitrary degree spline with an arbitrary number of knot points. We then show that free-knot splines of degree 3 or greater satisfy the asymptotic normality conditions. Finally we conduct a simulation study to assess the quality of the fiducial solution compared to three other commonly used methods.


Robust network inference with multivariate t-distributions

Graphical Gaussian models have proven to be useful tools for exploring network structures based on multivariate data. Applications to studies of gene expression have generated substantial interest in these models. Recent progress includes the development of fitting methodology involving penalization of the likelihood function and Bayesian approaches putting explicit priors on graphs. In this paper we advocate the use of multivariate t-distributions for more robust inference of graphs. We demonstrate that penalized likelihood inference combined with an application of the EM algorithm provides a computationally efficient approach to model selection in the t-distribution case. We consider two versions of multivariate t distributions, the Classical multivariate t distribution and the more flexible Alternative t distribution which requires the use of approximation techniques. For this distribution, we describe a Markov chain Monte Carlo EM algorithm based on a Gibbs sampler as well as a simple variational approximation that makes the resulting method feasible in large problems. We also show how Bayesian approaches based on Gaussian distributions can be extended to t distributions. The main challenge here is to develop extensions that are computationally efficient. We introduce a third t distribution model involving Dirichlet process priors that maintains much of the flexibility of the Alternative model, while reducing computational costs.
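
To give a feel for why the t-distribution yields robustness, here is a minimal sketch of one EM update for a classical multivariate t model with fixed degrees of freedom. It is the standard unpenalized update, not the penalized procedure of the paper: the penalization step on the covariance or precision matrix is omitted. The function name and the choice to hold `nu` fixed are assumptions made for illustration.

```python
import numpy as np

def t_em_step(X, mu, Sigma, nu):
    """One unpenalized EM update for a classical multivariate t model.

    X: (n, p) data matrix; mu: (p,) location; Sigma: (p, p) scale matrix;
    nu: degrees of freedom, treated as known in this sketch.
    """
    n, p = X.shape
    diff = X - mu
    # Squared Mahalanobis distance of each observation from mu
    delta = np.einsum("ij,ij->i", diff @ np.linalg.inv(Sigma), diff)
    # E-step: observations far from mu receive small weights
    w = (nu + p) / (nu + delta)
    # M-step: weighted mean and weighted scatter matrix
    mu_new = (w[:, None] * X).sum(axis=0) / w.sum()
    diff_new = X - mu_new
    Sigma_new = (w[:, None] * diff_new).T @ diff_new / n
    return w, mu_new, Sigma_new
```

Outlying observations are downweighted in both updates, which is the source of the robustness relative to a Gaussian fit, where every weight would equal one.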


Higher order asymptotics: Applications to mixed models and bioassay

Likelihood based tests and confidence intervals typically require large sample sizes for their validity. For small sample problems, higher order asymptotics (or small sample asymptotics) appear to be an attractive option to handle parametric inference problems. In the thesis, we will discuss applications of a class of small sample asymptotic procedures, namely, modified signed log-likelihood ratio test (MSLRT) procedures, for the following problems: (i) interval estimation of the consensus mean in inter-laboratory studies, (ii) construction of tolerance intervals in general mixed and random effects models and (iii) inference problems in the combination of multivariate bioassays. In inter-laboratory studies, a fundamental problem of interest is inference concerning the consensus mean, when the measurements are made by several laboratories, which may exhibit different within-laboratory variances, apart from the between-laboratory variability. A heteroscedastic one-way random model is very often used to model this scenario. Under such a model, an MSLRT procedure is developed for the interval estimation of the common mean. Furthermore, simulation results are presented to show the accuracy of the proposed confidence interval, especially for small samples. The results are illustrated using an example. The computation of tolerance intervals in mixed and random effects models has not been addressed at all in a general setting, when the data are unbalanced. We derive satisfactory one-sided and two-sided tolerance intervals in such a general scenario, by applying small sample asymptotic procedures. In the case of one-sided tolerance limits, the problem reduces to the interval estimation of a percentile, and accurate confidence intervals are derived using the MSLRT procedure.
In the case of two-sided tolerance intervals, the problem does not reduce to an interval estimation problem; however, it is possible to derive an approximate margin of error statistic that is an upper confidence limit for a linear combination of the variance components. For the latter problem, the MSLRT procedures can once again be used in order to arrive at an accurate upper confidence limit. Here, balanced and unbalanced data situations are treated separately, and computational issues are addressed in detail. Bioassays are frequently carried out for the purpose of estimating the relative potency of a drug or material (i.e., test treatment) by comparing its effects with those of a standard treatment on a culture of living cells or a test organism. Data on the relative potencies can sometimes be obtained from several independent experiments, performed at different laboratories or locations. When this is the case, statistical inference problems of interest include testing the homogeneity of the relative potencies, and, if accepted, the interval estimation of the common relative potency. The available literature on the problem assumes a common covariance matrix for the data from the different laboratories or locations. Here we will address the problems in the setup of a MANOVA model for the multivariate bioassay data from the different studies, allowing for different covariance matrices, once again applying the MSLRT procedure. The accuracy of the proposed solutions is assessed based on simulations. Extensive numerical results show that the procedures derived based on higher order asymptotics exhibit satisfactory performance in all three problems regardless of the sample size. The results are illustrated using several examples. The overall conclusion is that the application of higher order asymptotics will result in accurate inference procedures for mixed models and for bioassay problems.


Residuals for the general linear model with patterned covariance matrices

Gelman et al. (2003) state that, “Checking the model is crucial to statistical analysis”. Residuals have long been used to perform model assessment. However, properties of residuals for normal linear models of correlated observations have not been assessed. In this study, various studentized residuals are examined in terms of their ability to produce approximate standard normal distributions under correct model specification. This assessment includes both frequentist and Bayesian modeling approaches. Not only is there interest in assessing the properties of such residuals, but there is also interest in obtaining suitable diagnostic plots and tests to check violations of model assumptions when a fitted model is misspecified. This includes the ability of the residuals to identify misspecification of the covariance structure. Residuals that have been recommended in the literature are considered as well as new residuals based upon replicated observations. Particular emphasis is given to models with patterned covariance matrices in order to examine the effects of covariance matrix complexity and the number of subjects on the properties of the proposed residuals and diagnostics.


Methods for estimating mediation effect in survival analysis: Does weight loss mediate the undernutrition-mortality relationship in the older adults

The influence of undernutrition on mortality and other adverse outcomes through the mechanism of unintentional weight loss in older adults is often assumed, but the analytic methods to test these mediation mechanisms are not well-developed, and the need for methodological advances in this area motivated this program of research. We first examined the test-retest reliability and predictive validity of self-reported caloric intake as a measure of undernutrition. Acceptable reliability was observed, consistent with previous reports, but the evidence for predictive validity was inconsistent and self-reported caloric intake deficiency was not found to be related to observed weight loss. We then extended the existing mediation methods in survival analysis by conducting a simulation study to further investigate the properties of two mediation effect calculation methods when censored data are present. Our examination of the product of coefficients method did not show a clear pattern in terms of bias across different specifications of hazard rate, mortality rate and amount of censoring. However, we did find that point estimates under an increasing hazard rate showed the smallest standard error and mean square error, followed by those under constant and decreasing hazard rates. The comparison between the two mediation effect methods showed that they can lead to substantially different estimates and inferential conclusions depending on the hazard rate, mortality rate and amount of censoring. Generally speaking, the product of coefficients method performs better than the other under most scenarios with moderate sample size, and the two methods become less distinguishable when the sample size increases to 1000. Further, we applied this improved method to a population of older adults, and our findings indicated that the causal relationship between certain risk factors and mortality is mediated by weight loss.
Finally, we have contributed a novel input to the research on mediation effects in the context of survival analysis with censored data and have made recommendations regarding the choice between the two methods under different scenarios.
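
As a rough illustration of the product of coefficients idea, the sketch below combines two fitted coefficients into a mediated effect with a first-order (Sobel) standard error. The abstract does not state which regression models or which standard error the study used, so the interpretation of `a` and `b`, the function name, and the Sobel formula are all assumptions.

```python
import math

def product_of_coefficients(a, se_a, b, se_b):
    """Mediated effect a*b with a first-order (Sobel) standard error.

    a, se_a: estimated effect of the exposure on the mediator and its
             standard error (for example, undernutrition on weight loss).
    b, se_b: estimated effect of the mediator on the outcome, adjusted for
             the exposure, and its standard error (for example, a
             log hazard ratio from a survival model).
    """
    effect = a * b
    # Delta-method variance, ignoring the second-order term se_a^2 * se_b^2
    se = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    z = effect / se
    return effect, se, z
```

The returned `z` can be compared with a standard normal reference. In small samples that approximation is known to be rough, which is one reason simulation studies of the kind described above are useful.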


Estimating teacher effects using value-added models

Value-added modeling is an alternative approach to test-based accountability systems based on the proportions of students scoring at or above pre-determined proficiency levels. Value-added modeling techniques provide opportunities to estimate an individual teacher's effect on student learning, while allowing for the possibility to control for the effect of non-educational factors beyond a school system's control, such as socioeconomic status. However, numerous considerations exist when using value-added models to estimate teacher effects and defining what the teacher effects really describe. Chapter 2 provides an introduction to value-added methodology by describing several value-added models available for estimating teacher effects and their respective advantages and disadvantages. Modeling variations and their impact on estimated teacher effects are also discussed in addition to the various statistical and psychometric issues associated with estimating value-added teacher effects. Because value-added analyses require high-quality longitudinal data that are often not available, Chapters 3 and 4 propose methodology for analyzing less-than-ideal assessment data. Chapter 3 proposes value-added methodology for analyzing longitudinal student achievement data not on a single developmental scale and addresses issues arising when using a layered, longitudinal mixed model to analyze gains in standardized scores. The chapter also discusses methods for estimating teacher effects on student learning before and after entering professional development programs and applies these methods of analysis to achievement data. Chapter 4 describes the use of curve-of-factors methodology to analyze longitudinal achievement data collected from two differently scaled assessments in a single year and subject, such as mathematics.
Assuming data come from a curve-of-factors model structure, a simulation study evaluates the performance of the proposed curve-of-factors model in its ability to accurately rank teachers in the presence of either complete or missing test data and compares it to the performance of the Z-score methodology proposed in Chapter 3.


Bayesian nonparametric models for ranked set sampling

Ranked Set Sampling (RSS) is a data collection technique that combines measurement with judgment ranking for statistical inference. After a brief review of the basics of RSS, this dissertation lays out a formal and natural Bayesian framework for RSS that is analogous to its frequentist justification, and that does not require the assumption of perfect ranking or use of any imperfect ranking models. Prior beliefs about the judgment order statistic distributions and their interdependence are embodied by a nonparametric prior distribution. Posterior inference is carried out by means of Markov Chain Monte Carlo (MCMC) techniques, and yields estimators of the judgment order statistic distributions (and of functionals of those distributions). Because of non-conjugacy, different MCMC algorithms are used for continuous and discrete data. Judgment post-stratification is introduced to answer questions about handling information from multiple rankers, the quality of judgment ranking, and the role of set size. Finally, a more specific model is proposed for RSS with judgment ranking via a concomitant variable.
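
The basic RSS design can be sketched as follows. This is an illustration of the sampling scheme only, not of the Bayesian framework described above. It assumes perfect ranking (units are ranked by their true values), which the dissertation's framework explicitly does not require. The function name and the `draw` callback are hypothetical.

```python
import numpy as np

def ranked_set_sample(draw, set_size, cycles, rng):
    """Draw a balanced ranked set sample under perfect ranking (an assumption).

    draw: callable (rng, k) -> array of k independent population values.
    set_size: number of units ranked in each set.
    cycles: number of times the full design is repeated.
    Returns an array of shape (cycles, set_size); column r holds the
    measured units that were ranked (r + 1)-th within their own set.
    """
    sample = np.empty((cycles, set_size))
    for c in range(cycles):
        for r in range(set_size):
            # A fresh set is drawn for every measured unit; only the unit
            # with rank r + 1 in that set is actually measured.
            candidates = np.sort(draw(rng, set_size))
            sample[c, r] = candidates[r]
    return sample
```

Each column estimates one judgment order statistic distribution. With imperfect ranking the sort would be replaced by a judgment or concomitant-based ordering, which is where the modeling questions addressed in the dissertation arise.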


The single-index hazards model

We first propose the single-index hazards model for right censored survival data. As an extension of the Cox model, this model allows nonparametric modeling of covariate effects in a parsimonious way via a single-index. In addition, the relative importance of covariates can be assessed via this model. We consider the conventional profile-kernel method based on the local likelihood for model estimation. It is shown that this method may give consistent estimation under certain restrictive conditions, but in general it can yield biased estimation. Simulation studies are conducted to demonstrate the bias phenomena. The existence and nature of the failure of this commonly used approach is somewhat surprising. The interpretation of covariate effects in the aforementioned single-index hazards model is difficult. Thus, we further propose the partly proportional single-index hazards model in which the effect of covariates of primary interest is represented by the regression parameter while “nuisance” covariates can have nonparametric effect on the survival time. We again consider the conventional profile-kernel method and it leads to biased estimation as well. A bias correction method is then proposed and the corrected profile local likelihood estimators are shown to be consistent, asymptotically normal and semiparametrically efficient. We evaluate the finite-sample properties of our estimators through simulation studies and illustrate the proposed model and method with an application to a dataset from the Multicenter AIDS Cohort Study (MACS). Besides the profile-kernel method, we also study the profile stratified likelihood method based on stratification of the single-index. In the single-index hazards model, this method may give consistent estimation under the restrictive “independent censoring” condition, but in general it can yield biased estimation.
Simulation studies are conducted to demonstrate the situations in which the bias phenomena do (or do not) exist. In the partly proportional single-index hazards model, we demonstrate numerically the existence of the bias and then propose a bias correction method. The estimators from the corrected profile stratified likelihood method are shown to be consistent. Their finite-sample properties are evaluated through simulation studies. The corrected profile stratified method is applied to the aforementioned MACS study for illustration.
