Past Seminar Series - McGill Statistics Seminars
  • Multivariate extremal dependence: Estimation with bias correction

    Date: 2012-11-02

    Time: 14:30-15:30

    Location: BURN 1205

    Abstract:

    Estimating extreme risks in a multivariate framework is closely connected with the estimation of the extremal dependence structure. This structure can be described via the stable tail dependence function L, for which several estimators have been introduced. Asymptotic normality is available for empirical estimates of L, with rate of convergence k^(1/2), where k denotes the number of high order statistics used in the estimation. Choosing a larger k can improve the accuracy of the estimation, but may lead to an increased asymptotic bias. We provide a bias correction procedure for the estimation of L. Estimators of L are combined in such a way that the asymptotic bias term disappears. The new estimator of L is shown to allow more flexibility in the choice of k. Its asymptotic behavior is examined, and a simulation study is provided to assess its small sample behavior. This is joint work with Cécile Mercadier (Université Lyon 1) and Laurens de Haan (Erasmus University Rotterdam).
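
    As a rough illustration of the quantity discussed in this abstract, the sketch below computes the plain empirical (rank-based) estimator of L in the bivariate case. It is not the bias-corrected estimator presented in the talk. The function names `ranks` and `empirical_stdf` are hypothetical, and the data are assumed to have no ties.

```python
def ranks(values):
    """Return ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def empirical_stdf(xs, ys, k, x=1.0, y=1.0):
    """Empirical stable tail dependence function at (x, y).

    Counts the observations whose first coordinate is among the
    floor(k*x) largest or whose second coordinate is among the
    floor(k*y) largest, then divides the count by k.
    """
    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)
    threshold_x = n - int(k * x) + 1  # smallest rank counted as extreme
    threshold_y = n - int(k * y) + 1
    count = sum(
        1
        for rx, ry in zip(rank_x, rank_y)
        if rx >= threshold_x or ry >= threshold_y
    )
    return count / k


# Two boundary cases on toy data (illustrative values only).
sample = [float(i) for i in range(100)]
comonotone = empirical_stdf(sample, sample, k=10)
countermonotone = empirical_stdf(sample, [-v for v in sample], k=10)
```

    On this toy data the perfectly dependent pair gives 1.0 and the perfectly opposed pair gives 2.0, which match the known bounds max(x, y) <= L(x, y) <= x + y at (1, 1).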

  • Simulation model calibration and prediction using outputs from multi-fidelity simulators

    Date: 2012-10-26

    Time: 14:30-15:30

    Location: BURN 1205

    Abstract:

    Computer simulators are used widely to describe physical processes in lieu of physical observations. In some cases, more than one computer code can be used to explore the same physical system, each with a different degree of fidelity. In this work, we combine field observations and model runs from deterministic multi-fidelity computer simulators to build a predictive model for the real process. The resulting model can be used to perform sensitivity analysis for the system and make predictions with associated measures of uncertainty. Our approach is Bayesian and will be illustrated through a simple example, as well as a real application in predictive science at the Center for Radiative Shock Hydrodynamics at the University of Michigan.

  • Observational studies in healthcare: are they any good?

    Date: 2012-10-19

    Time: 14:30-15:30

    Location: UdeM

    Abstract:

    Observational healthcare data, such as administrative claims and electronic health records, play an increasingly prominent role in healthcare. Pharmacoepidemiologic studies in particular routinely estimate temporal associations between medical product exposure and subsequent health outcomes of interest, and such studies influence prescribing patterns and healthcare policy more generally. Some authors have questioned the reliability and accuracy of such studies, but few previous efforts have attempted to measure their performance.

  • Modeling operational risk using a Bayesian approach to EVT

    Date: 2012-10-12

    Time: 14:30-15:30

    Location: BURN 1205

    Abstract:

    Extreme Value Theory (EVT) has been widely used for assessing the risk of highly unusual events, using either block maxima or peaks over threshold (POT) methods. However, one of the main drawbacks of the POT method is the choice of a threshold, which plays an important role in the estimation since the parameter estimates depend strongly on this value. Bayesian inference offers an alternative that handles these difficulties: the threshold can be treated as another parameter in the estimation, avoiding the classical empirical approach. In addition, it is possible to incorporate internal and external observations in combination with expert opinion, providing a natural, probabilistic framework in which to evaluate risk models. In this talk, we analyze operational risk data using a mixture model which combines a parametric form for the center and a generalized Pareto distribution (GPD) for the tail of the distribution, using all observations for inference about the unknown parameters of both distributions, the threshold included. A Bayesian analysis is performed and inference is carried out through Markov Chain Monte Carlo (MCMC) methods in order to determine the minimum capital requirement for operational risk.
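
    To make the idea of treating the threshold as a parameter concrete, here is a minimal sketch of a log-likelihood for a bulk-plus-GPD-tail model. The abstract does not name the parametric form used for the center, so the exponential bulk here is purely an assumption chosen for brevity, and the function names are hypothetical. An MCMC sampler would combine this function with priors on all four parameters, which is not shown.

```python
import math


def gpd_logpdf(z, sigma, xi):
    """Log-density of the generalized Pareto distribution at excess z >= 0."""
    if sigma <= 0 or z < 0:
        return -math.inf
    if abs(xi) < 1e-12:  # xi -> 0 limit: exponential tail
        return -math.log(sigma) - z / sigma
    t = 1.0 + xi * z / sigma
    if t <= 0:  # z lies outside the support
        return -math.inf
    return -math.log(sigma) - (1.0 / xi + 1.0) * math.log(t)


def mixture_loglik(data, rate, u, sigma, xi):
    """Log-likelihood with an exponential bulk (assumed) below the
    threshold u and a GPD for the excesses above u.

    Observations above u contribute log P(X > u) under the bulk model
    plus the GPD log-density of the excess, so u enters the likelihood
    like any other parameter.
    """
    if rate <= 0:
        return -math.inf
    total = 0.0
    for x in data:
        if x <= u:
            total += math.log(rate) - rate * x
        else:
            total += -rate * u + gpd_logpdf(x - u, sigma, xi)
    return total


# Toy nonnegative losses (illustrative values only).
losses = [0.2, 0.5, 1.0, 2.5, 4.0]
value = mixture_loglik(losses, rate=2.0, u=1.0, sigma=0.5, xi=0.0)
```

    One sanity check on the construction: with xi = 0 and sigma = 1/rate, the model collapses to a plain exponential distribution whatever the value of u, so the log-likelihood should equal the exponential one.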

  • Markov switching regular vine copulas

    Date: 2012-10-05

    Time: 14:30-15:30

    Location: BURN 1205

    Abstract:

    Using only bivariate copulas as building blocks, regular vines (R-vines) constitute a flexible class of high-dimensional dependence models. In this talk we introduce a Markov switching R-vine copula model, combining the flexibility of general R-vine copulas with the possibility for dependence structures to change over time. Frequentist as well as Bayesian parameter estimation is discussed. Further, we apply the newly proposed model to examine the dependence of exchange rates as well as stock and stock index returns. We show that changes in dependence are usually closely interrelated with periods of market stress. In such times the Value at Risk of an asset portfolio is significantly underestimated when changes in the dependence structure are ignored.

  • The current state of Q-learning for personalized medicine

    Date: 2012-09-28

    Time: 14:30-15:30

    Location: BURN 1205

    Abstract:

    In this talk, I will provide an introduction to dynamic treatment regimes (DTRs) and an overview of the state of the art (and science) of Q-learning, a popular tool in reinforcement learning. The use of Q-learning and its variants in randomized and non-randomized studies will be discussed, as well as issues concerning inference, as the resulting estimators are not always regular. Current and future directions of interest will also be considered.
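
    The backward-induction idea behind Q-learning for a two-stage treatment rule can be sketched as follows. This is a simplified illustration, not the speaker's method: states and treatments are assumed discrete, and cell means stand in for the regression models normally used for the Q-functions. All names are hypothetical.

```python
from collections import defaultdict


def cell_means(rows, key, value):
    """Average value(row) within each cell defined by key(row)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for row in rows:
        sums[key(row)] += value(row)
        counts[key(row)] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


def best_action(q, history, actions):
    """Return (action, value) maximizing q over actions observed for history."""
    options = {a: q[history + (a,)] for a in actions if history + (a,) in q}
    action = max(options, key=options.get)
    return action, options[action]


def two_stage_q_learning(data, actions=(0, 1)):
    """data: list of tuples (s1, a1, s2, a2, y) with discrete entries."""
    # Stage 2: Q2(s1, a1, s2, a2) is the mean outcome in each cell.
    q2 = cell_means(data, key=lambda r: r[:4], value=lambda r: r[4])
    # Pseudo-outcome: the value of acting optimally at stage 2.
    pseudo = [(r[0], r[1], best_action(q2, r[:3], actions)[1]) for r in data]
    # Stage 1: Q1(s1, a1) is the mean pseudo-outcome in each cell.
    q1 = cell_means(pseudo, key=lambda r: r[:2], value=lambda r: r[2])
    rule2 = {h: best_action(q2, h, actions)[0] for h in {r[:3] for r in data}}
    rule1 = {(s,): best_action(q1, (s,), actions)[0] for s in {r[0] for r in data}}
    return rule1, rule2


# Toy data (illustrative values only): one stage-1 state, two treatments.
toy = [
    (0, 0, 0, 0, 1.0),
    (0, 0, 0, 1, 3.0),
    (0, 1, 1, 0, 5.0),
    (0, 1, 1, 1, 2.0),
]
rule1, rule2 = two_stage_q_learning(toy)
```

    On the toy data the estimated rule picks treatment 1 at stage 1, because the best attainable stage-2 outcome after that choice (5.0) exceeds the best after treatment 0 (3.0).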

  • Regularized semiparametric functional linear regression

    Date: 2012-09-21

    Time: 14:30-15:30

    Location: McGill, Burnside Hall 1214

    Abstract:

    In many scientific experiments we need to analyze functional data, where the observations are sampled from a random process, together with a potentially large number of non-functional covariates. The complex nature of functional data makes it difficult to directly apply existing methods to model selection and estimation. We propose and study a new class of penalized semiparametric functional linear regression to characterize the regression relation between a scalar response and multiple covariates, including both functional covariates and scalar covariates. The resulting method provides a unified and flexible framework to jointly model functional and non-functional predictors, identify important covariates, and improve efficiency and interpretability of the estimates. Featuring two types of regularization, shrinkage on the effects of scalar covariates and truncation on the principal components of the functional predictor, the new approach is flexible and effective in dimension reduction. One key contribution of this paper is to study theoretical properties of the regularized semiparametric functional linear model. We establish oracle and consistency properties under mild conditions, allowing a possibly diverging number of scalar covariates and simultaneously taking the infinite-dimensional functional predictor into account. We illustrate the new estimator with extensive simulation studies, and then apply it to an image data analysis.

  • Li: High-dimensional feature selection using hierarchical Bayesian logistic regression with heavy-tailed priors | Rao: Best predictive estimation for linear mixed models with applications to small area estimation

    Date: 2012-04-13

    Time: 14:00-16:30

    Location: MAASS 217

    Abstract:

    Li: The problem of selecting the most useful features from a great many (e.g., thousands) of candidates arises in many areas of modern science. An interesting problem from genomic research is that, from thousands of genes that are active (expressed) in certain tissue cells, we want to find the genes that can be used to separate tissues of different classes (e.g., cancer and normal). In this paper, we report a Bayesian logistic regression method based on heavy-tailed priors with moderately small degrees of freedom (such as 1) and small scale (such as 0.01), using Gibbs sampling to do the computation. We show that it can distinctively separate a couple of useful features from a large number of useless ones, and discriminate many redundant correlated features. We also show that this method is very stable to the choice of scale. We apply our method to a microarray data set related to prostate cancer, and identify only 3 genes out of 6033 candidates that can separate cancer and normal tissues very well in leave-one-out cross-validation.

  • Hypothesis testing in finite mixture models: from the likelihood ratio test to EM-test

    Date: 2012-04-05

    Time: 15:30-16:30

    Location: BURN 1205

    Abstract:

    In the presence of heterogeneity, a mixture model is the most natural way to characterize the random behavior of samples taken from such populations. This strategy has been widely employed in applications ranging from genetics, information technology, and marketing to finance. Studying the mixing structure behind a random sample from the population allows us to infer the degree of heterogeneity, with important implications in applications such as the presence of disease subgroups in genetics. The statistical problem is to test hypotheses on the order of the finite mixture models. There has been continued interest in the limiting behavior of the likelihood ratio tests. The non-regularity of the finite mixture models has provided statisticians ample examples of unusual limiting distributions. Yet many such results are not convenient for conducting hypothesis tests. Motivated by overcoming such difficulties, we have developed a number of strategies to obtain tests with high efficiency yet easy-to-use limiting distributions. The latest development is a class of EM-tests which are advantageous in many respects. Their limiting distributions are easier to derive mathematically, simple to implement in data analysis, and valid for a more general class of mixture models without restrictions on the space of the mixing distribution. The simulation indicates that the limiting distributions approximate the finite sample distributions with good precision in the examples investigated.

  • A matching-based approach to assessing the surrogate value of a biomarker

    Date: 2012-03-30

    Time: 15:30-16:30

    Location: BURN 1205

    Abstract:

    Statisticians have developed a number of frameworks which can be used to assess the surrogate value of a biomarker, i.e. establish whether treatment effects on a biological quantity measured shortly after administration of treatment predict treatment effects on the clinical endpoint of interest. The most commonly applied of these frameworks is due to Prentice (1989), who proposed a set of criteria which a surrogate marker should satisfy. However, verifying these criteria using observed data can be challenging due to the presence of unmeasured simultaneous predictors (i.e. confounders) which influence both the potential surrogate and the outcome. In this work, we adapt a technique proposed by Rosenbaum (2002) for observational studies, in which observations are matched and the odds of treatment within each matched pair are bounded. This yields a straightforward and interpretable sensitivity analysis which can be performed particularly efficiently for certain types of test statistics. In this talk, I will introduce the surrogate endpoint problem, discuss the details of my proposed technique for assessing surrogate value, and illustrate with some simulated examples inspired by the problem of identifying immune surrogates in HIV vaccine trials.
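
    The kind of bound described here can be illustrated with the classical sensitivity analysis for a sign test on matched pairs. This is a sketch of the general Rosenbaum-style bound only, not the surrogate-specific adaptation from the talk. If the odds of treatment within a pair differ by at most a factor gamma, then under the null hypothesis the chance that the treated unit has the higher outcome is at most gamma / (1 + gamma), which gives a worst-case one-sided p-value. The function name and the toy numbers are hypothetical.

```python
from math import comb


def sign_test_upper_pvalue(n_discordant, n_positive, gamma=1.0):
    """Worst-case one-sided sign-test p-value under sensitivity parameter gamma.

    n_discordant: number of matched pairs whose two outcomes differ.
    n_positive:   number of those pairs where the treated unit is higher.
    gamma:        bound (>= 1) on the within-pair odds ratio of treatment;
                  gamma = 1 corresponds to a randomized experiment.
    """
    if gamma < 1:
        raise ValueError("gamma must be at least 1")
    p = gamma / (1.0 + gamma)  # largest null probability of a positive pair
    return sum(
        comb(n_discordant, t) * p**t * (1.0 - p) ** (n_discordant - t)
        for t in range(n_positive, n_discordant + 1)
    )


# Toy example: 10 discordant pairs, all favouring treatment.
p_randomized = sign_test_upper_pvalue(10, 10, gamma=1.0)
p_sensitive = sign_test_upper_pvalue(10, 10, gamma=2.0)
```

    Here the p-value is 0.5^10 when gamma = 1 and rises to (2/3)^10 when gamma = 2. Repeating the calculation over a grid of gamma values shows how much hidden bias would be needed to overturn the conclusion.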