2014 Fall - McGill Statistics Seminars
  • Testing for structured Normal means

    Date: 2014-12-12 Time: 15:30-16:30 Location: BURN 1205 Abstract: We will discuss the detection of patterns in images and graphs from a high-dimensional Gaussian measurement. This problem is relevant to many applications, including detecting anomalies in sensor and computer networks, large-scale surveillance, co-expression in gene networks, and disease outbreaks. Beyond its wide applicability, structured Normal means detection serves as a case study in the difficulty of balancing computational complexity with statistical power.
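    The trade-off the abstract alludes to can be illustrated with a toy sketch (not from the talk; signal length, segment width, and lift are all illustrative assumptions): a coordinate-wise max statistic is cheap but ignores structure, while a scan statistic that averages over candidate segments pools the signal at the cost of searching over windows.

```python
import numpy as np

rng = np.random.default_rng(0)
n, width, lift = 1000, 20, 1.0

# Toy structured-means signal: one elevated segment of length `width`
mu = np.zeros(n)
mu[400:400 + width] = lift
y = mu + rng.standard_normal(n)

# Max statistic: powerful for a single large entry, weak for a spread-out signal
max_stat = np.max(np.abs(y))

# Scan statistic over all windows of the (assumed known) width:
# averaging over the segment boosts its mean by a factor sqrt(width)
windows = np.lib.stride_tricks.sliding_window_view(y, width)
scan_stat = np.max(windows.mean(axis=1)) * np.sqrt(width)

print(f"max statistic:  {max_stat:.2f}")
print(f"scan statistic: {scan_stat:.2f}")
```

    The scan statistic searches n - width + 1 windows, and richer pattern classes (paths in a graph, connected regions in an image) can make this search combinatorially hard, which is the computational/statistical tension the abstract refers to.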
  • Copula model selection: A statistical approach

    Date: 2014-12-05 Time: 15:30-16:30 Location: BURN 1205 Abstract: Copula model selection is an important problem because similar but differing copula models can offer different conclusions about the dependence structure of random variables. Chen & Fan (2005) proposed a model selection method based on a statistical hypothesis test. The test accounts for the finite-sample randomness of the AIC and other likelihood-based model selection criteria. The performance of the test, compared with the more common AIC approach, is illustrated in a series of simulations.
  • Model-based methods of classification with applications

    Date: 2014-11-28 Time: 15:30-16:30 Location: BURN 1205 Abstract: Model-based clustering via finite mixture models is a popular clustering method for finding hidden structures in data. The model is often assumed to be a finite mixture of multivariate normal distributions; however, flexible extensions have been developed over recent years. This talk demonstrates some methods employed in unsupervised, semi-supervised, and supervised classification that include skew-normal and skew-t mixture models. Both real and simulated data sets are used to demonstrate the efficacy of these techniques.
  • Estimating by solving nonconvex programs: Statistical and computational guarantees

    Date: 2014-11-21 Time: 15:30-16:30 Location: BURN 1205 Abstract: Many statistical estimators are based on solving nonconvex programs. Although the practical performance of such methods is often excellent, the associated theory is frequently incomplete, due to the potential gaps between global and local optima. In this talk, we present theoretical results that apply to all local optima of various regularized M-estimators, where both loss and penalty functions are allowed to be nonconvex.
  • High-dimensional phenomena in mathematical statistics and convex analysis

    Date: 2014-11-20 Time: 16:00-17:00 Location: CRM 1360 (U. de Montréal) Abstract: Statistical models in which the ambient dimension is of the same order as, or larger than, the sample size arise frequently in different areas of science and engineering. Although high-dimensional models of this type date back to the work of Kolmogorov, they have been the subject of intensive study over the past decade, and have interesting connections to many branches of mathematics (including concentration of measure, random matrix theory, convex geometry, and information theory).
  • Bridging the gap: A likelihood function approach for the analysis of ranking data

    Date: 2014-11-14 Time: 15:30-16:30 Location: BURN 1205 Abstract: In the parametric setting, the notion of a likelihood function forms the basis for the development of tests of hypotheses and estimation of parameters. Tests in connection with the analysis of variance stem entirely from considerations of the likelihood function. On the other hand, nonparametric procedures have generally been derived without any formal mechanism and are often the result of clever intuition.
  • Bayesian regression with B-splines under combinations of shape constraints and smoothness properties

    Date: 2014-11-07 Time: 15:30-16:30 Location: BURN 1205 Abstract: We approach the problem of shape-constrained regression from a Bayesian perspective. A B-spline basis is used to model the regression function. The smoothness of the regression function is controlled by the order of the B-splines, and the shape is controlled by that of an associated control polygon. Constraining the shape of the control polygon reduces to a set of inequality constraints on the spline coefficients.
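    The last point can be illustrated with a small sketch (an illustrative toy, not the speaker's model; the knot vector and degree below are assumptions): with a clamped knot vector, nondecreasing B-spline coefficients yield a nondecreasing spline, so a monotonicity constraint on the fit becomes the linear inequality constraints c[i] >= c[i-1] on the coefficients.

```python
import numpy as np
from scipy.interpolate import BSpline

k = 3  # cubic B-splines
interior = np.linspace(0.0, 1.0, 7)
t = np.r_[[0.0] * k, interior, [1.0] * k]  # clamped knot vector on [0, 1]
n_coef = len(t) - k - 1

# Shape constraint via the control polygon: sorting makes the coefficients
# nondecreasing, which guarantees a nondecreasing spline (the converse
# does not hold in general).
c = np.sort(np.random.default_rng(1).uniform(size=n_coef))
f = BSpline(t, c, k)

x = np.linspace(0.0, 1.0, 200)
vals = f(x)
print("monotone:", bool(np.all(np.diff(vals) >= -1e-9)))
```

    The guarantee follows from the B-spline derivative formula: f' is itself a spline whose coefficients are proportional to the differences c[i] - c[i-1], so nonnegative differences give a nonnegative derivative.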
  • A copula-based model for risk aggregation

    Date: 2014-10-31 Time: 15:30-16:30 Location: BURN 1205 Abstract: A flexible approach is proposed for risk aggregation. The model consists of a tree structure, bivariate copulas, and marginal distributions. The construction relies on a conditional independence assumption whose implications are studied. Selection of the tree structure, estimation, and model validation are illustrated using data from a Canadian property and casualty insurance company. Speaker: Marie-Pier Côté is a PhD student in the Department of Mathematics and Statistics at McGill University.
  • PREMIER: Probabilistic error-correction using Markov inference in error reads

    Date: 2014-10-24 Time: 15:30-16:30 Location: BURN 1205 Abstract: Next generation sequencing (NGS) is a technology revolutionizing genetics and biology. Compared with the old Sanger sequencing method, the throughput is astounding and has fostered a slew of innovative sequencing applications. Unfortunately, the error rates are also higher, complicating many downstream analyses. For example, de novo assembly of genomes is less accurate and slower when reads include many errors. We develop a probabilistic model for NGS reads that can detect and correct errors without a reference genome and while flexibly modeling and estimating the error properties of the sequencing machine.
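    As a toy illustration of the idea (a sketch under assumed inputs, not the PREMIER model itself), one can fit a first-order Markov chain over bases to a read set and flag improbable transitions as candidate sequencing errors; the reads and threshold below are made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical read set: three identical reads and one read with an error
reads = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGAAC", "ACGTACGTAC"]

# Estimate first-order transition counts between consecutive bases
counts = defaultdict(Counter)
for r in reads:
    for a, b in zip(r, r[1:]):
        counts[a][b] += 1

def transition_prob(a, b):
    total = sum(counts[a].values())
    return counts[a][b] / total if total else 0.0

# Flag transitions whose estimated probability falls below 15%
suspect = [(i, r) for r in reads for i, (a, b) in enumerate(zip(r, r[1:]))
           if transition_prob(a, b) < 0.15]
print(suspect)
```

    Because the error rate is low relative to the coverage, the low-probability transitions cluster around the erroneous base; a reference-free corrector would then replace it with the most probable base under the fitted model, which is the spirit (though not the machinery) of the approach described in the abstract.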
  • Patient privacy, big data, and specimen pooling: Using an old tool for new challenges

    Date: 2014-10-17 Time: 15:30-16:30 Location: BURN 1205 Abstract: In the recent past, electronic health records and distributed data networks have emerged as viable resources for medical and scientific research. As the use of confidential patient information from such sources becomes more common, maintaining patient privacy is of utmost importance. For a binary disease outcome of interest, we show that the technique of specimen pooling can be applied to the analysis of large and/or distributed data while respecting patient privacy.