CRM-Colloquium - McGill Statistics Seminars
  • Insurance company operations and dependence modeling

    Date: 2014-03-21

    Time: 15:30-16:30

    Location: BURN 107

    Abstract:

    Actuaries and other analysts have long been responsible for a range of financial functions in insurance company operations, including (i) ratemaking, the process of setting premiums; (ii) loss reserving, the process of predicting obligations that arise from policies; and (iii) claims management, including fraud detection. With the advent of modern computing capabilities and detailed, novel data sources, the opportunities to make an impact on insurance company operations are extensive.

  • ABC as the new empirical Bayes approach?

    Date: 2014-02-28

    Time: 13:30-14:30

    Location: UdM, Pav. Roger-Gaudry, Salle S-116

    Abstract:

    Approximate Bayesian computation (ABC) has now become an essential tool for the analysis of complex stochastic models when the likelihood function is unavailable. The approximation is seen as a nuisance from a computational statistics point of view, but we argue here that it is also a blessing from an inferential perspective. We illustrate this paradoxical stance in the case of dynamic models and population genetics models. There are also major inference difficulties, as detailed in the case of Bayesian model choice.
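
    To fix ideas, the basic ABC rejection scheme can be sketched in a few lines. The toy model below (a Gaussian with unknown mean) actually has a tractable likelihood, so ABC is not needed there; the model, prior, summary statistic, and tolerance are all illustrative choices, not the speaker's.

      import numpy as np

      rng = np.random.default_rng(0)

      # Toy setup: observed data from an unknown-mean Gaussian; in real ABC
      # the likelihood would be intractable and we could only simulate.
      observed = rng.normal(loc=2.0, scale=1.0, size=100)
      obs_summary = observed.mean()          # summary statistic s(y)

      def simulate(theta, n=100):
          """Draw a synthetic data set from the model given parameter theta."""
          return rng.normal(loc=theta, scale=1.0, size=n)

      # ABC rejection: keep prior draws whose simulated summaries land
      # within a tolerance eps of the observed summary.
      eps = 0.05
      accepted = []
      for _ in range(100_000):
          theta = rng.normal(0.0, 10.0)      # draw from the prior
          sim_summary = simulate(theta).mean()
          if abs(sim_summary - obs_summary) < eps:
              accepted.append(theta)

      # The accepted draws approximate the posterior p(theta | s(y)); the
      # tolerance eps and the choice of summary control the approximation
      # error that the abstract calls both a nuisance and a blessing.
      print(len(accepted), np.mean(accepted))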

  • Calibration of computer experiments with large data structures

    Date: 2014-01-24

    Time: 15:30-16:30

    Location: Salle 1355, pavillon André-Aisenstadt (CRM)

    Abstract:

    Statistical calibration of computer models is commonly done in a wide variety of scientific endeavours. In the end, this exercise amounts to solving an inverse problem and a form of regression. Gaussian process models are very convenient in this setting as non-parametric regression estimators and provide sensible inference properties. However, when the data structures are large, fitting the model becomes difficult. In this work, new methodology for calibrating large computer experiments is presented. We propose to perform the calibration exercise by modularizing a hierarchical statistical model with approximate emulation via local Gaussian processes. The approach is motivated by an application to radiative shock hydrodynamics.
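
    As a rough illustration of calibration-by-emulation (not the modularized local Gaussian process methodology of the talk, which targets much larger designs), the sketch below fits a single global GP emulator to simulator runs and then solves the inverse problem by matching field data; the simulator eta, the design, and the field observations are all hypothetical.

      import numpy as np
      from scipy.optimize import minimize_scalar
      from sklearn.gaussian_process import GaussianProcessRegressor
      from sklearn.gaussian_process.kernels import RBF

      rng = np.random.default_rng(1)

      # Hypothetical simulator eta(x, u): x is an observable input, u a
      # calibration parameter we cannot measure directly.
      def eta(x, u):
          return np.sin(x) + u * x

      # Design over (x, u) and simulator runs; the emulator stands in for
      # the (expensive) simulator, mirroring the emulation step above.
      X_design = rng.uniform([0, 0], [np.pi, 1], size=(200, 2))
      y_design = eta(X_design[:, 0], X_design[:, 1])
      emulator = GaussianProcessRegressor(kernel=RBF(length_scale=[1.0, 1.0]))
      emulator.fit(X_design, y_design)

      # Field observations generated at the "true" but unknown u = 0.4.
      x_field = np.linspace(0.1, 3.0, 15)
      y_field = eta(x_field, 0.4) + rng.normal(0, 0.01, size=x_field.size)

      # Calibration as an inverse problem: choose u so the emulator's
      # predictions best match the field data (here by least squares).
      def discrepancy(u):
          pred = emulator.predict(
              np.column_stack([x_field, np.full_like(x_field, u)]))
          return np.sum((y_field - pred) ** 2)

      u_hat = minimize_scalar(discrepancy, bounds=(0.0, 1.0),
                              method="bounded").x
      print(f"calibrated u = {u_hat:.3f}")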

  • Great probabilists publish posthumously

    Date: 2013-12-06

    Time: 15:30-16:30

    Location: UQAM Salle SH-3420

    Abstract:

    Jacob Bernoulli died in 1705. His great book Ars Conjectandi was published in 1713, 300 years ago. Thomas Bayes died in 1761. His great paper was read to the Royal Society of London in December 1763, 250 years ago, and published in 1764. These anniversaries are marked by discussing new evidence regarding the circumstances of publication, which in turn can lead to a better understanding of the works themselves. As to whether these examples of posthumous publication suggest a career move for any modern probabilist, that question is left to the audience.

  • Signal detection in high dimension: Testing sphericity against spiked alternatives

    Date: 2013-11-29

    Time: 15:30-16:30

    Location: Concordia MB-2.270

    Abstract:

    We consider the problem of testing the null hypothesis of sphericity for a high-dimensional covariance matrix against the alternative of a finite (unspecified) number of symmetry-breaking directions (multispiked alternatives) from the point of view of the asymptotic theory of statistical experiments. The region lying below the so-called phase transition or impossibility threshold is shown to be a contiguity region. Simple analytical expressions are derived for the asymptotic power envelope and the asymptotic powers of existing tests. These asymptotic powers are shown to lie very substantially below the power envelope; some of them even trivially coincide with the size of the test. In contrast, the asymptotic power of the likelihood ratio test is shown to be uniformly close to the power envelope.
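
    For orientation, the standard spiked-alternative formulation of this testing problem reads as follows (the notation here is assumed, not necessarily the speaker's):

      \[
      \mathcal{H}_0:\ \Sigma = \sigma^2 I_p
      \qquad \text{vs.} \qquad
      \mathcal{H}_1:\ \Sigma = \sigma^2 \Bigl( I_p + \sum_{i=1}^{k} h_i\, v_i v_i^{\top} \Bigr),
      \]

    with orthonormal symmetry-breaking directions $v_1,\ldots,v_k$ and spike strengths $h_1\ge\cdots\ge h_k>0$. With $p/n\to c>0$, the phase transition referred to above occurs at $h_i=\sqrt{c}$: spikes below this threshold cannot be detected consistently, which is what the contiguity result formalizes.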

  • XY - Basketball meets Big Data

    Date: 2013-10-25

    Time: 15:30-16:30

    Location: HEC Montréal Salle CIBC 1er étage

    Abstract:

    In this talk, I will explore the state of the art in the analysis and modeling of player tracking data in the NBA. In the past, player tracking data has been used primarily for visualization, such as understanding the spatial distribution of a player’s shooting characteristics, or to extract summary statistics, such as the distance traveled by a player in a given game. In this talk, I will present how we’re using advanced statistics and machine learning tools to answer previously unanswerable questions about the NBA. Examples include “How should teams configure their defensive matchups to minimize a player’s effectiveness?”, “Who are the best decision makers in the NBA?”, and “Who was responsible for the most points against in the NBA last season?”
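
    The kind of summary statistic mentioned above reduces to simple geometry on the tracked coordinates. A minimal sketch, assuming a hypothetical array of per-frame (x, y) court positions in feet:

      import numpy as np

      # Hypothetical tracking format: rows of (x, y) court coordinates in
      # feet (the NBA court is 94 x 50 ft), sampled at 25 Hz for one
      # player, here replaced by random toy positions.
      coords = np.random.default_rng(2).uniform([0, 0], [94, 50],
                                                size=(1000, 2))

      # Distance traveled = sum of Euclidean step lengths along the path.
      steps = np.diff(coords, axis=0)                  # (dx, dy) per frame
      distance_feet = np.sum(np.hypot(steps[:, 0], steps[:, 1]))
      print(f"distance traveled = {distance_feet:.0f} ft")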

  • Measurement error and variable selection in parametric and nonparametric models

    Date: 2013-09-27

    Time: 15:30-16:30

    Location: RPHYS 114

    Abstract:

    This talk will start with a discussion of the relationships between LASSO estimation, ridge regression, and attenuation due to measurement error as motivation for, and introduction to, a new generalizable approach to variable selection in parametric and nonparametric regression and discriminant analysis. The approach transcends the boundaries of parametric/nonparametric models. It will first be described in the familiar context of linear regression where its relationship to the LASSO will be described in detail. The latter part of the talk will focus on implementation of the approach to nonparametric modeling where sparse dependence on covariates is desired. Applications to two- and multi-category classification problems will be discussed in detail.
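
    One standard way to make the LASSO/ridge/attenuation triangle concrete (in assumed notation, not necessarily the speaker's): the two penalized estimators are

      \[
      \hat{\beta}_{\text{lasso}} = \arg\min_{\beta} \|y - X\beta\|_2^2 + \lambda \|\beta\|_1,
      \qquad
      \hat{\beta}_{\text{ridge}} = (X^{\top}X + \lambda I)^{-1} X^{\top} y,
      \]

    and if the covariates are observed with classical measurement error, $W = X + U$ with $\mathbb{E}[U]=0$ and rows of $U$ having covariance $\sigma_u^2 I$, then $\mathbb{E}[W^{\top}W] = X^{\top}X + n\sigma_u^2 I$ while $\mathbb{E}[W^{\top}y] = X^{\top}y$, so naive least squares on the error-prone $W$ behaves on average like ridge with $\lambda = n\sigma_u^2$: measurement error attenuates coefficients toward zero exactly as the ridge penalty shrinks them.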

  • Arup Bose: Consistency of large dimensional sample covariance matrix under weak dependence

    Date: 2013-04-12

    Time: 14:30-15:30

    Location: Concordia

    Abstract:

    Estimation of large-dimensional covariance matrices has been of considerable recent interest. One model assumes that there are $p$-dimensional independent identically distributed Gaussian observations $X_1, \ldots, X_n$ with dispersion matrix $\Sigma_p$, and that $p$ grows much faster than $n$. Appropriate convergence rate results have been established in the literature for tapered and banded estimators of $\Sigma_p$, which are based on the sample variance-covariance matrix of the $n$ observations.
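
    A banded estimator of the kind referred to (in the style of Bickel and Levina) simply zeroes out sample covariances far from the diagonal. A minimal sketch with a toy AR(1)-type truth, where dependence decays with $|i-j|$ and banding is therefore appropriate:

      import numpy as np

      def banded_cov(X, k):
          """Banded covariance estimator: keep entries of the sample
          covariance within k of the diagonal, zero out the rest."""
          S = np.cov(X, rowvar=False)              # p x p sample covariance
          p = S.shape[0]
          mask = np.abs(np.subtract.outer(np.arange(p),
                                          np.arange(p))) <= k
          return S * mask

      # Toy check: n = 50 observations in p = 200 dimensions.
      rng = np.random.default_rng(3)
      p, n, rho = 200, 50, 0.5
      Sigma = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
      X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
      S_banded = banded_cov(X, k=5)
      print(np.linalg.norm(S_banded - Sigma, ord=2))   # spectral-norm error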

  • Hélène Massam: The hyper Dirichlet revisited: a characterization

    Date: 2013-03-22

    Time: 14:30-15:30

    Location: BURN 107

    Abstract:

    We give a characterization of the hyper Dirichlet distribution, hyper Markov with respect to a decomposable graph $G$ (or, equivalently, a moral directed acyclic graph). For $X=(X_1,\ldots,X_d)$ following the hyper Dirichlet distribution, our characterization is through the so-called “local and global independence properties” for a carefully designed family of orders of the variables $X_1,\ldots,X_d$.

    The hyper Dirichlet for general directed acyclic graphs was derived from a characterization of the Dirichlet distribution given by Geiger and Heckerman (1997). This characterization of the Dirichlet for $X=(X_1,\ldots,X_d)$ is obtained through a functional equation derived from the local and global independence properties for two different orders of the variables. These two orders are seemingly chosen haphazardly but, as our results show, this is not so. Our results generalize those of Geiger and Heckerman (1997) and are given without the assumption of existence of a positive density for $X$.
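
    For orientation, one concrete instance of such independence properties is the classical complete neutrality of the ordinary Dirichlet (a standard fact, stated here in assumed notation):

      \[
      (p_1,\ldots,p_d) \sim \mathrm{Dir}(\alpha_1,\ldots,\alpha_d)
      \quad \Longrightarrow \quad
      p_1 \;\perp\; \Bigl( \frac{p_2}{1-p_1}, \ldots, \frac{p_d}{1-p_1} \Bigr),
      \]

    with $p_1 \sim \mathrm{Beta}\bigl(\alpha_1, \sum_{j=2}^{d}\alpha_j\bigr)$ and the rescaled remainder again Dirichlet. Requiring such factorizations simultaneously for several orders of the variables is what drives characterizations of this type.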

  • Victor Chernozhukov: Inference on treatment effects after selection amongst high-dimensional controls

    Date: 2013-01-18

    Time: 14:30-15:30

    Location: BURN 306

    Abstract:

    We propose robust methods for inference on the effect of a treatment variable on a scalar outcome in the presence of very many controls. Our setting is a partially linear model with possibly non-Gaussian and heteroscedastic disturbances. Our analysis allows the number of controls to be much larger than the sample size. To make informative inference feasible, we require the model to be approximately sparse; that is, we require that the effect of confounding factors can be controlled for up to a small approximation error by conditioning on a relatively small number of controls whose identities are unknown. The latter condition makes it possible to estimate the treatment effect by selecting approximately the right set of controls. We develop a novel estimation and uniformly valid inference method for the treatment effect in this setting, called the “post-double-selection” method. Our results apply to Lasso-type methods used for covariate selection as well as to any other model selection method that is able to find a sparse model with good approximation properties.
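
    The recipe itself is short enough to sketch on simulated data. This is a minimal illustration only: the simulated design, LassoCV as the selection device, and the omission of the uniformly valid standard errors are all simplifications relative to the paper.

      import numpy as np
      from sklearn.linear_model import LassoCV

      rng = np.random.default_rng(4)

      # Toy partially linear setup: outcome y, treatment d, many controls
      # X, only a few of which matter (approximate sparsity, as above).
      n, p = 200, 500
      X = rng.normal(size=(n, p))
      d = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)      # confounded
      y = 1.0 * d + X[:, 0] - X[:, 2] + rng.normal(size=n)  # true effect 1.0

      # Step 1: Lasso of y on the controls; Step 2: Lasso of d on the
      # controls.
      sel_y = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)
      sel_d = np.flatnonzero(LassoCV(cv=5).fit(X, d).coef_)

      # Step 3: OLS of y on d plus the UNION of selected controls; the
      # union guards against a confounder that either step misses.
      union = np.union1d(sel_y, sel_d)
      Z = np.column_stack([np.ones(n), d, X[:, union]])
      beta = np.linalg.lstsq(Z, y, rcond=None)[0]
      print(f"post-double-selection treatment effect: {beta[1]:.3f}")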