2013 Winter - McGill Statistics Seminars
  • Arup Bose: Consistency of large dimensional sample covariance matrix under weak dependence

    Date: 2013-04-12

    Time: 14:30-15:30

    Location: Concordia

    Abstract:

    The estimation of large-dimensional covariance matrices has been of interest recently. One model assumes that there are $p$-dimensional independent and identically distributed Gaussian observations $X_1, \ldots, X_n$ with dispersion matrix $\Sigma_p$, where $p$ grows much faster than $n$. Appropriate convergence rate results have been established in the literature for tapered and banded estimators of $\Sigma_p$, which are based on the sample variance-covariance matrix of the $n$ observations.
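
    As a rough illustration of the banded estimators mentioned above, the following sketch computes a sample covariance matrix and then zeroes every entry more than `k` positions away from the diagonal. The function names `sample_covariance` and `band`, the bandwidth `k`, and the toy data are assumptions made for this example; they are not taken from the talk.

```python
def sample_covariance(rows):
    """Sample covariance matrix (divisor n - 1) of n observations of dimension p."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]


def band(matrix, k):
    """Keep entries within k of the diagonal and set all others to zero."""
    return [
        [v if abs(i - j) <= k else 0.0 for j, v in enumerate(row)]
        for i, row in enumerate(matrix)
    ]


# Hypothetical toy data: n = 3 observations of dimension p = 3.
observations = [[1.0, 2.0, 3.0], [2.0, 4.0, 1.0], [3.0, 6.0, 5.0]]
banded = band(sample_covariance(observations), k=1)
```

    A tapered estimator follows the same pattern but replaces the 0/1 weights with weights that decay gradually away from the diagonal.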

  • Éric Marchand: On improved predictive density estimation with parametric constraints

    Date: 2013-04-05

    Time: 14:30-15:30

    Location: BURN 1205

    Abstract:

    We consider the problem of predictive density estimation under Kullback-Leibler loss when the parameter space is restricted to a convex subset. The principal situation analyzed relates to the estimation of an unknown predictive p-variate normal density based on an observation generated by another p-variate normal density. The means of the densities are assumed to coincide, and the covariance matrices are known multiples of the identity matrix. We obtain sharp results concerning plug-in estimators, we show that the best unrestricted invariant predictive density estimator is dominated by the Bayes estimator associated with a uniform prior on the restricted parameter space, and we obtain minimax results for cases where the parameter space is (i) a cone, and (ii) a ball. A key feature, which we will describe, is a correspondence between the predictive density estimation problem and a collection of point estimation problems. Finally, if time permits, we describe recent work concerning (i) non-normal models, and (ii) analysis relative to other loss functions such as reverse Kullback-Leibler and integrated L2.

  • Hélène Massam: The hyper Dirichlet revisited: a characterization

    Date: 2013-03-22

    Time: 14:30-15:30

    Location: BURN 107

    Abstract:

    We give a characterization of the hyper Dirichlet distribution hyper Markov with respect to a decomposable graph $G$ (or equivalently a moral directed acyclic graph). For $X=(X_1,\ldots,X_d)$ following the hyper Dirichlet distribution, our characterization is through the so-called “local and global independence properties” for a carefully designed family of orders of the variables $X_1,\ldots,X_d$.

    The hyper Dirichlet for general directed acyclic graphs was derived from a characterization of the Dirichlet distribution given by Geiger and Heckerman (1997). This characterization of the Dirichlet for $X=(X_1,\ldots,X_d)$ is obtained through a functional equation derived from the local and global independence properties for two different orders of the variables. These two orders are seemingly chosen haphazardly but, as our results show, this is not so. Our results generalize those of Geiger and Heckerman (1997) and are given without the assumption of existence of a positive density for $X$.

  • Jiahua Chen: Quantile and quantile function estimations under density ratio model

    Date: 2013-03-15

    Time: 14:30-15:30

    Location: BURN 1205

    Abstract:

    Joint work with Yukun Liu (East China Normal University)

    Population quantiles and their functions are important parameters in many applications. For example, the lower-level quantiles often serve as crucial quality indices of forestry products and others. In the presence of several independent samples from populations satisfying a density ratio model, we investigate the properties of empirical likelihood (EL) based inferences for quantiles and their functions. In this paper, we first establish the consistency and asymptotic normality of the estimators of parameters and cumulative distributions. The induced EL quantile estimators are then shown to admit a Bahadur representation. The results are used to construct asymptotically valid confidence intervals for functions of quantiles. In addition, we rigorously prove that the EL quantiles based on all samples are more efficient than the empirical quantiles, which can only utilize information from individual samples. A simulation study shows that the EL quantiles and their functions have superior performance both when the density ratio model assumption is satisfied and when it is mildly violated. An application example is used to demonstrate the new methods and potential cost savings.

  • Natalia Stepanova: On asymptotic efficiency of some nonparametric tests for testing multivariate independence

    Date: 2013-03-01

    Time: 14:30-15:30

    Location: BURN 1205

    Abstract:

    Some problems of statistics can be reduced to extremal problems of minimizing functionals of smooth functions defined on the cube $[0,1]^m$, $m\geq 2$. In this talk, we consider a class of extremal problems that is closely connected to the problem of testing multivariate independence. By solving the extremal problem, we provide a unified approach to establishing weak convergence for a wide class of empirical processes which emerge in connection with testing multivariate independence. The use of our result will also be illustrated by describing the domain of local asymptotic optimality of some nonparametric tests of independence.

  • Changbao Wu: Analysis of complex survey data with missing observations

    Date: 2013-02-22

    Time: 14:30-15:30

    Location: CRM, Université de Montréal, Pav. André-Aisenstadt, salle 1360

    Abstract:

    In this talk, we first provide an overview of issues arising from and methods dealing with complex survey data in the presence of missing observations, with a major focus on the estimating equation approach for analysis and imputation methods for missing data. We then propose a semiparametric fractional imputation method for handling item nonresponses, assuming certain baseline auxiliary variables can be observed for all units in the sample. The proposed strategy combines the strengths of conventional single imputation and multiple imputation methods, and is easy to implement even with a large number of auxiliary variables available, which is typically the case for large scale complex surveys. Simulation results and some general discussion on related issues will also be presented.

  • Eric Cormier: Data Driven Nonparametric Inference for Bivariate Extreme-Value Copulas

    Date: 2013-02-15

    Time: 14:30-15:30

    Location: BURN 1205

    Abstract:

    It is often crucial to know whether the dependence structure of a bivariate distribution belongs to the class of extreme-value copulas. In this talk, I will describe a graphical tool that allows judgment regarding the existence of extreme-value dependence. I will also present a data-driven nonparametric estimator of the Pickands dependence function. This estimator, which is constructed from constrained B-splines, is intrinsic and differentiable, thereby enabling sampling from the fitted model. I will illustrate its properties via simulation. This will lead me to highlight some of the limitations associated with currently available tests of extremeness.

  • Celia Greenwood: Multiple testing and region-based tests of rare genetic variation

    Date: 2013-02-08

    Time: 14:30-15:30

    Location: BURN 1205

    Abstract:

    In the context of univariate association tests between a trait of interest and common genetic variants (SNPs) across the whole genome, corrections for multiple testing have been well-studied. Due to the patterns of correlation (i.e. linkage disequilibrium), the number of independent tests remains close to 1 million, even when many more common genetic markers are available. With the advent of the DNA sequencing era, however, newly-identified genetic variants tend to be rare or even unique, and consequently single-variant tests of association have little power. As a result, region-based tests of association are being developed that examine associations between the trait and all the genetic variability in a small pre-defined region of the genome. However, coping with multiple testing in this situation has received little attention. I will discuss two aspects of multiple testing for region-based tests. First, I will describe a method for estimating the effective number of independent tests, and second, I will discuss an approach for controlling type I error that is based on stratified false discovery rates, where strata are defined by external information such as genomic annotation.

  • Daniela Witten: Structured learning of multiple Gaussian graphical models

    Date: 2013-02-01

    Time: 14:30-15:30

    Location: BURN 1205

    Abstract:

    I will consider the task of estimating high-dimensional Gaussian graphical models (or networks) corresponding to a single set of features under several distinct conditions. In other words, I wish to estimate several distinct but related networks. I assume that most aspects of the networks are shared, but that there are some structured differences between them. The goal is to exploit the similarity among the networks in order to obtain more accurate estimates of each individual network, as well as to identify the differences between the networks.

  • Mylène Bédard: On the empirical efficiency of local MCMC algorithms with pools of proposals

    Date: 2013-01-25

    Time: 14:30-15:30

    Location: BURN 1205

    Abstract:

    In an attempt to improve on the Metropolis algorithm, various MCMC methods with auxiliary variables, such as the multiple-try and delayed rejection Metropolis algorithms, have been proposed. These methods generate several candidates in a single iteration; accordingly they are computationally more intensive than the Metropolis algorithm. It is usually difficult to provide a general estimate for the computational cost of a method without being overly conservative; potentially efficient methods could thus be overlooked by relying on such estimates. In this talk, we describe three algorithms with auxiliary variables - the multiple-try Metropolis (MTM) algorithm, the multiple-try Metropolis hit-and-run (MTM-HR) algorithm, and the delayed rejection Metropolis algorithm with antithetic proposals (DR-A) - and investigate the net performance of these algorithms in various contexts. To allow for a fair comparison, the study is carried out under optimal mixing conditions for each of these algorithms. The DR-A algorithm, whose proposal scheme introduces correlation in the pool of candidates, seems particularly promising. The algorithms are used in the contexts of Bayesian logistic regressions and classical inference for a linear regression model. This talk is based on work in collaboration with M. Mireuta, E. Moulines, and R. Douc.
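
    To make the idea of a pool of candidates concrete, here is a minimal sketch of one common form of the multiple-try Metropolis step: a symmetric Gaussian random-walk proposal with candidate weights equal to the target density. It is not the implementation studied in the talk. The standard normal target, the pool size `k`, the proposal `scale`, and the function names are all assumptions chosen for illustration.

```python
import math
import random


def target(x):
    """Unnormalised standard normal density (an illustrative target)."""
    return math.exp(-0.5 * x * x)


def mtm_step(x, k, scale, rng):
    """One multiple-try Metropolis step with a symmetric Gaussian proposal."""
    # Draw a pool of k candidates around the current state.
    candidates = [rng.gauss(x, scale) for _ in range(k)]
    weights = [target(y) for y in candidates]
    total = sum(weights)
    if total == 0.0:
        return x  # every candidate has negligible density: stay put
    # Select one candidate with probability proportional to its weight.
    y = rng.choices(candidates, weights=weights)[0]
    # Reference set: k - 1 draws around the selected candidate, plus x itself.
    reference = [rng.gauss(y, scale) for _ in range(k - 1)] + [x]
    ref_total = sum(target(z) for z in reference)
    # Generalised Metropolis acceptance probability.
    if rng.random() < min(1.0, total / ref_total):
        return y
    return x


def run_chain(n, k=5, scale=2.0, seed=0):
    """Run n MTM iterations from x = 0 and return the visited states."""
    rng = random.Random(seed)
    x = 0.0
    draws = []
    for _ in range(n):
        x = mtm_step(x, k, scale, rng)
        draws.append(x)
    return draws
```

    With `k=1` the step reduces to the ordinary random-walk Metropolis algorithm. Each iteration with `k > 1` costs `2k - 1` proposal draws and about `2k` target evaluations in this sketch, which is the kind of extra computational cost the abstract refers to.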