CRM-SSC Prize Address - McGill Statistics Seminars
  • Full likelihood inference for abundance from capture-recapture data: semiparametric efficiency and EM-algorithm

    Date: 2022-09-30

    Time: 15:30-16:30 (Montreal time)

    Zoom link: https://us06web.zoom.us/j/84226701306?pwd=UEZ5NVPZAULLDW5QNU8VZZIVBEJXQT09

    Meeting ID: 842 2670 1306

    Passcode: 692788

    Abstract:

    Capture-recapture experiments are widely used to collect data needed to estimate the abundance of a closed population. To account for heterogeneity in the capture probabilities, Huggins (1989) and Alho (1990) proposed a semiparametric model in which the capture probabilities are modelled parametrically and the distribution of individual characteristics is left unspecified. A conditional likelihood method was then proposed to obtain point estimates and Wald-type confidence intervals for the abundance. Empirical studies show that the small-sample distribution of the maximum conditional likelihood estimator is strongly skewed to the right, which may produce Wald-type confidence intervals with lower limits that are less than the number of captured individuals or even negative.
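
    For readers who want a concrete picture of the estimator being discussed, the following is a minimal sketch (not the speaker's code) of a Huggins–Alho-style analysis: per-occasion capture probabilities follow a logistic model in an individual covariate, the parameters are estimated by conditional likelihood (conditioning on being captured at least once), and abundance is estimated by a Horvitz-Thompson-type sum of inverse capture probabilities with a naive Wald interval. The simulated data, the logistic specification and the variance formula are illustrative assumptions.

    ```python
    # Minimal sketch of a Huggins/Alho-style conditional likelihood analysis.
    import numpy as np
    from scipy.optimize import minimize
    from scipy.special import expit

    rng = np.random.default_rng(1)
    N_true, T = 1000, 5                      # true abundance, capture occasions
    x = rng.normal(size=N_true)              # individual covariate
    p = expit(-1.0 + 0.8 * x)                # per-occasion capture probability
    hist = rng.binomial(1, p[:, None], size=(N_true, T))
    caught = hist.sum(axis=1) > 0            # only captured individuals are observed
    y, xc = hist[caught], x[caught]

    def neg_cond_loglik(beta):
        pi_t = expit(beta[0] + beta[1] * xc)          # per-occasion probability
        pi_any = 1.0 - (1.0 - pi_t) ** T              # P(captured at least once)
        ll = (y * np.log(pi_t[:, None]) + (1 - y) * np.log1p(-pi_t[:, None])).sum()
        return -(ll - np.log(pi_any).sum())           # conditional likelihood

    fit = minimize(neg_cond_loglik, x0=np.zeros(2), method="BFGS")
    pi_any_hat = 1.0 - (1.0 - expit(fit.x[0] + fit.x[1] * xc)) ** T
    N_hat = (1.0 / pi_any_hat).sum()                  # Horvitz-Thompson-type estimate

    # Naive Wald interval; it ignores parameter-estimation variability, which a
    # full treatment (and the skewness issue raised in the talk) must address.
    se = np.sqrt(((1.0 - pi_any_hat) / pi_any_hat**2).sum())
    print(f"N_hat = {N_hat:.1f}, naive 95% Wald CI = "
          f"({N_hat - 1.96*se:.1f}, {N_hat + 1.96*se:.1f})")
    ```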

  • Tales of tails, tiles and ties in dependence modeling

    Date: 2019-10-04

    Time: 16:00-17:00

    Location: CRM, UdeM, Pav. André-Aisenstadt, 2920, ch. de la Tour, salle 1355

    Abstract:

    Modeling dependence between random variables is omnipresent in statistics. When rare events with high impact are involved, such as severe storms, floods or heat waves, the issue is both of great importance for risk management and theoretically challenging. Combining extreme-value theory with copula modeling and rank-based inference yields a particularly flexible and promising approach to this problem. I will present three recent advances in this area. The first will tackle the question of how to account for dependence between rare events in the medium regime, in which asymptotic extreme-value models are not suitable. The second will explore what can be done when a large number of variables is involved and how a hierarchical model structure can be learned from large-scale rank correlation matrices. Finally, I won’t resist giving you a glimpse of the notoriously intricate world of rank-based inference for discrete or mixed data.
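
    The talk is method-agnostic with respect to software, but as a rough illustration of the second theme, one can compute a rank correlation (Kendall's tau) matrix and pass a crude distance derived from it to an off-the-shelf hierarchical clustering routine. The simulated data, the tau-to-distance map and the linkage choice below are assumptions made for the sketch, not the speaker's method.

    ```python
    # Illustrative only: Kendall's tau matrix plus hierarchical clustering.
    import numpy as np
    from scipy.stats import kendalltau
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import squareform

    rng = np.random.default_rng(0)
    n, d = 500, 8
    # two blocks of dependent variables plus noise, just to create structure
    z1, z2 = rng.normal(size=(n, 1)), rng.normal(size=(n, 1))
    X = np.hstack([z1 + 0.5 * rng.normal(size=(n, 4)),
                   z2 + 0.5 * rng.normal(size=(n, 4))])

    tau = np.ones((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            t_ij, _ = kendalltau(X[:, i], X[:, j])
            tau[i, j] = tau[j, i] = t_ij

    dist = np.sqrt(np.clip(1.0 - tau, 0.0, None))    # crude tau-to-distance map
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    print("cluster labels:", fcluster(tree, t=2, criterion="maxclust"))
    ```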

  • Robust estimation in the presence of influential units for skewed finite and infinite populations

    Date: 2018-10-12

    Time: 16:00-

    Location: CRM, Université de Montréal, Pavillon André-Aisenstadt, salle 6254

    Abstract:

    Many variables encountered in practice (e.g., economic variables) have skewed distributions. The latter provide a conducive ground for the presence of influential observations, which are those that have a drastic impact on the estimates if they were to be excluded from the sample. We examine the problem of influential observations in a classical statistical setting as well as in a finite population setting that includes two main frameworks: the design-based framework and the model-based framework. Within each setting, classical estimators may be highly unstable in the presence of influential units. We propose a robust estimator of the population mean based on the concept of conditional bias of a unit, which is a measure of influence. The idea is to reduce the impact of the sample units that have a large conditional bias. The proposed estimator depends on a cut-off value. We suggest selecting the cut-off value that minimizes the maximum absolute estimated conditional bias with respect to the robust estimator. The properties of the proposed estimator will be discussed. Finally, the results of a simulation study comparing the performance of several estimators in terms of bias and mean square error will be presented.
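
    As a rough illustration of the conditional-bias idea in the design-based framework, the sketch below assumes the familiar form of the conditional bias of the Horvitz-Thompson estimator under Poisson sampling, B_i = (1/π_i − 1) y_i, and the min-max choice of cut-off, which amounts to shifting the estimate by minus half the sum of the smallest and largest estimated conditional biases. These formulas are stated here as assumptions for the illustration rather than as the exact estimator of the talk.

    ```python
    # Rough sketch of a conditional-bias-based robust estimate under Poisson sampling.
    import numpy as np

    rng = np.random.default_rng(7)
    N = 5000
    y = rng.lognormal(mean=1.0, sigma=1.2, size=N)      # skewed study variable
    pi = np.clip(0.02 + 0.3 * y / y.max(), 0.02, 1.0)   # inclusion probabilities
    sampled = rng.random(N) < pi                        # Poisson sampling
    ys, pis = y[sampled], pi[sampled]

    t_ht = np.sum(ys / pis)                             # Horvitz-Thompson total
    B = (1.0 / pis - 1.0) * ys                          # estimated conditional biases
    t_rob = t_ht - 0.5 * (B.min() + B.max())            # min-max cut-off shortcut

    print(f"true total      : {y.sum():.0f}")
    print(f"HT estimate     : {t_ht:.0f}")
    print(f"robust estimate : {t_rob:.0f}  (mean: {t_rob / N:.3f})")
    ```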

  • Back to the future: why I think REGRESSION is the new black in genetic association studies

    Date: 2018-01-26

    Time: 15:30-16:30

    Location: Room 6254, Pavillon André-Aisenstadt, 2920, UdeM

    Abstract:

    Linear regression remains an important framework in the era of big and complex data. In this talk I present some recent examples where we resort to the classical simple linear regression model and its celebrated extensions in novel settings. The Eureka moment came while reading Wu and Guan’s (2015) comments on our generalized Kruskal-Wallis (GKW) test (Elif Acar and Sun 2013, Biometrics). Wu and Guan presented an alternative "rank linear regression model and derived the proposed GKW statistic as a score test statistic", and astutely pointed out that "the linear model approach makes the derivation more straightforward and transparent, and leads to a simplified and unified approach to the general rank based multi-group comparison problem." More recently, we turned our attention to extending Levene’s variance test for data with group uncertainty and sample correlation. While a direct modification of the original statistic is indeed challenging, I will demonstrate that a two-stage regression framework makes the ensuing development quite straightforward, eventually leading to a generalized joint location-scale test (David Soave and Sun 2017, Biometrics). Finally, I will discuss ongoing work, with graduate student Lin Zhang, on developing an allele-based association test that is robust to the assumption of Hardy-Weinberg equilibrium and is generalizable to complex data structures. The crux of this work is, again, reformulating the problem as a regression!
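
    To make the two-stage idea concrete, here is a generic sketch (with simulated data and an illustrative Fisher combination of p-values, both assumptions of the sketch rather than the published test): stage 1 regresses the phenotype on the genotype (location), and stage 2 regresses the absolute stage-1 residuals on the genotype (scale), a regression form of Levene's test.

    ```python
    # Generic two-stage regression sketch of a joint location-scale analysis.
    import numpy as np
    from scipy import stats

    def slope_pvalue(x, y):
        """p-value for the slope in a simple linear regression of y on x."""
        return stats.linregress(x, y).pvalue

    rng = np.random.default_rng(2024)
    n = 800
    g = rng.binomial(2, 0.3, size=n)                          # genotype coded 0/1/2
    y = 0.2 * g + rng.normal(scale=1.0 + 0.15 * g, size=n)    # location and scale effects

    p_loc = slope_pvalue(g, y)                        # stage 1: location
    resid = y - np.poly1d(np.polyfit(g, y, 1))(g)     # stage-1 residuals
    p_scale = slope_pvalue(g, np.abs(resid))          # stage 2: scale (Levene-type)

    fisher = -2.0 * (np.log(p_loc) + np.log(p_scale))
    p_joint = stats.chi2.sf(fisher, df=4)             # Fisher combination (illustrative)
    print(p_loc, p_scale, p_joint)
    ```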

  • Bayesian inference for conditional copula models

    Date: 2017-01-27

    Time: 15:30-16:30

    Location: Room 6254, Pavillon André-Aisenstadt, 2920, UdeM

    Abstract:

    Conditional copula models describe dynamic changes in dependence and are useful in establishing high-dimensional dependence structures or in joint modelling of response vectors in regression settings. We describe some of the methods developed for estimating the calibration function when multiple predictors are needed and for resolving some of the model choice questions concerning the selection of copula families and the shape of the calibration function. This is joint work with Evgeny Levi, Avideh Sabeti and Mian Wei.
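
    As a concrete, simplified example of what a calibration function is, the sketch below uses a Clayton copula whose parameter theta(x) = exp(b0 + b1*x) varies with a single covariate and is fitted by maximum likelihood; the Clayton family, the log-linear calibration and the frequentist fit are all illustrative assumptions, whereas the talk concerns Bayesian inference and richer calibration shapes.

    ```python
    # Illustrative conditional Clayton copula with a covariate-dependent parameter.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(3)
    n = 1500
    x = rng.uniform(-1, 1, size=n)
    theta_true = np.exp(0.5 + 1.0 * x)                 # calibration on the log scale

    # simulate (U, V) from a Clayton copula via the conditional inverse method
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    v = ((w ** (-theta_true / (1 + theta_true)) - 1) * u ** (-theta_true) + 1) ** (-1 / theta_true)

    def neg_loglik(b):
        th = np.exp(b[0] + b[1] * x)                   # calibration function
        s = u ** (-th) + v ** (-th) - 1
        logc = (np.log1p(th) - (1 + th) * (np.log(u) + np.log(v))
                - (2 + 1 / th) * np.log(s))            # Clayton log-density
        return -logc.sum()

    fit = minimize(neg_loglik, x0=np.array([0.0, 0.0]), method="Nelder-Mead")
    print("estimated calibration coefficients:", fit.x)   # roughly (0.5, 1.0)
    ```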

  • Outlier detection for functional data using principal components

    Date: 2016-02-11

    Time: 16:00-17:00

    Location: CRM 6254 (U. de Montréal)

    Abstract:

    Principal components analysis is a widely used technique that provides an optimal lower-dimensional approximation to multivariate observations. In the functional case, a new characterization of elliptical distributions on separable Hilbert spaces allows us to obtain an equivalent stochastic optimality property for the principal component subspaces of random elements on separable Hilbert spaces. This property holds even when second moments do not exist. These lower-dimensional approximations can be very useful in identifying potential outliers among high-dimensional or functional observations. In this talk we propose a new class of robust estimators for principal components, which is consistent for elliptical random vectors, and Fisher-consistent for elliptically distributed random elements on arbitrary Hilbert spaces. We illustrate our method on two real functional data sets, where the robust estimator is able to discover atypical observations in the data that would have been missed otherwise. This talk is the result of recent collaborations with Graciela Boente (Buenos Aires, Argentina) and David Tyler (Rutgers, USA).
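
    As a generic illustration of how a principal component subspace can flag atypical curves, the sketch below projects discretized curves onto their leading components and flags those with an unusually large distance to that subspace. It uses a plain SVD for simplicity; the talk's contribution is precisely to replace this step with a robust estimator, which is not implemented here.

    ```python
    # Generic PCA-based outlier screening for discretized curves (non-robust SVD).
    import numpy as np

    rng = np.random.default_rng(5)
    n, m = 200, 50
    t = np.linspace(0, 1, m)
    scores = rng.normal(size=(n, 2)) * np.array([2.0, 1.0])
    X = scores[:, :1] * np.sin(2 * np.pi * t) + scores[:, 1:] * np.cos(2 * np.pi * t)
    X += 0.1 * rng.normal(size=(n, m))
    X[:5] += 3.0 * np.exp(-50 * (t - 0.5) ** 2)        # 5 contaminated curves

    Xc = X - np.median(X, axis=0)                      # center (median, mildly robust)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = 2
    proj = Xc @ Vt[:k].T @ Vt[:k]                      # projection on first k PCs
    resid = np.linalg.norm(Xc - proj, axis=1)          # distance to the PC subspace

    # flag curves whose residual exceeds a median + 3*MAD threshold
    mad = np.median(np.abs(resid - np.median(resid))) / 0.6745
    print("flagged curves:", np.where(resid > np.median(resid) + 3 * mad)[0])
    ```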

  • Functional data analysis and related topics

    Date: 2015-01-15

    Time: 16:00-17:00

    Location: CRM 1360 (U. de Montréal)

    Abstract:

    Functional data analysis (FDA) has received substantial attention, with applications arising from various disciplines, such as engineering, public health, finance, etc. In general, FDA approaches focus on nonparametric underlying models that assume the data are observed from realizations of stochastic processes satisfying some regularity conditions, e.g., smoothness constraints. The estimation and inference procedures usually do not depend on only a finite number of parameters, in contrast with parametric models, and exploit techniques, such as smoothing methods and dimension reduction, that allow the data to speak for themselves. In this talk, I will give an overview of FDA methods and related topics developed in recent years.
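
    As a minimal illustration of the smoothing step mentioned above, the sketch below projects noisy discretized curves onto a small Fourier basis by least squares; the basis, its size and the penalty-free fit are arbitrary choices made for the illustration.

    ```python
    # Minimal smoothing illustration: least-squares fit of a small Fourier basis.
    import numpy as np

    rng = np.random.default_rng(11)
    n, m = 30, 100
    t = np.linspace(0, 1, m)
    truth = np.sin(2 * np.pi * t) + 0.5 * np.cos(4 * np.pi * t)
    Y = truth + 0.3 * rng.normal(size=(n, m))           # n noisy discretized curves

    K = 7                                               # number of basis functions
    B = np.column_stack([np.ones(m)] +
                        [f(2 * np.pi * (j // 2 + 1) * t)
                         for j, f in zip(range(K - 1), [np.sin, np.cos] * K)])
    coef, *_ = np.linalg.lstsq(B, Y.T, rcond=None)      # basis coefficients per curve
    Y_smooth = (B @ coef).T                             # smoothed curves
    print(Y_smooth.shape, np.abs(Y_smooth - truth).mean())
    ```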

  • Changbao Wu: Analysis of complex survey data with missing observations

    Date: 2013-02-22

    Time: 14:30-15:30

    Location: CRM, Université de Montréal, Pav. André-Aisenstadt, salle 1360

    Abstract:

    In this talk, we first provide an overview of issues arising from and methods dealing with complex survey data in the presence of missing observations, with a major focus on the estimating equation approach for analysis and imputation methods for missing data. We then propose a semiparametric fractional imputation method for handling item nonresponse, assuming certain baseline auxiliary variables can be observed for all units in the sample. The proposed strategy combines the strengths of conventional single imputation and multiple imputation methods, and is easy to implement even with a large number of auxiliary variables available, which is typically the case for large-scale complex surveys. Simulation results and some general discussion on related issues will also be presented.
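
    As a generic illustration of the fractional imputation idea (not the semiparametric method proposed in the talk), each nonrespondent below receives several donor values from respondents with similar auxiliary values, each donor carrying a fractional weight, and the weighted completed data are then used to estimate the mean. The hot-deck donors, the imputation classes and the equal design weights are illustrative assumptions.

    ```python
    # Generic fractional hot-deck imputation sketch for item nonresponse.
    import numpy as np

    rng = np.random.default_rng(42)
    n, M = 2000, 5
    x = rng.normal(size=n)                          # auxiliary variable, always observed
    y = 2.0 + 1.5 * x + rng.normal(size=n)          # study variable
    d = np.full(n, 1.0)                             # design weights (equal, for simplicity)
    respond = rng.random(n) < 1 / (1 + np.exp(-(0.3 + 0.8 * x)))   # MAR response

    # imputation classes from auxiliary-variable quintiles
    cls = np.digitize(x, np.quantile(x, [0.2, 0.4, 0.6, 0.8]))
    num, den = 0.0, 0.0
    for i in range(n):
        if respond[i]:
            num += d[i] * y[i]
        else:
            donors = np.where(respond & (cls == cls[i]))[0]
            picks = rng.choice(donors, size=M, replace=True)
            num += (d[i] / M) * y[picks].sum()      # fractional weights d_i / M
        den += d[i]
    print("FI estimate of the mean:", num / den, " full-sample mean:", y.mean())
    ```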