McGill Statistics Seminar - McGill Statistics Seminars

- Feb 10, 2017
- post
Sparse envelope model: Efficient estimation and response variable selection in multivariate linear regression

Zhihua Su · Feb 10, 2017
Date: 2017-02-10

Time: 15:30-16:30

Location: BURN 1205

Abstract:

The envelope model is a method for efficient estimation in multivariate linear regression. In this article, we propose the sparse envelope model, which is motivated by applications where some response variables are invariant to changes of the predictors and have zero regression coefficients. The envelope estimator is consistent but not sparse, and in many situations it is important to identify the response variables for which the regression coefficients are zero. The sparse envelope model performs variable selection on the responses and preserves the efficiency gains offered by the envelope model. Response variable selection arises naturally in many applications, but has not been studied as thoroughly as predictor variable selection. In this article, we discuss response variable selection in both the standard multivariate linear regression and the envelope contexts. In response variable selection, even if a response has zero coefficients, it still should be retained to improve the estimation efficiency of the nonzero coefficients. This is different from the practice in predictor variable selection. We establish consistency, the oracle property and obtain the asymptotic distribution of the sparse envelope estimator.

Read More…
- Feb 3, 2017
- post
MM algorithms for variance component models

Hua Zhou · Feb 3, 2017
Date: 2017-02-03

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Variance components estimation and mixed model analysis are central themes in statistics with applications in numerous scientific disciplines. Despite the best efforts of generations of statisticians and numerical analysts, maximum likelihood estimation and restricted maximum likelihood estimation of variance component models remain numerically challenging. In this talk, we present a novel iterative algorithm for variance components estimation based on the minorization-maximization (MM) principle. MM algorithm is trivial to implement and competitive on large data problems. The algorithm readily extends to more complicated problems such as linear mixed models, multivariate response models possibly with missing data, maximum a posteriori estimation, and penalized estimation. We demonstrate, both numerically and theoretically, that it converges faster than the classical EM algorithm when the number of variance components is greater than two.

Read More…
- Jan 20, 2017
- post
Order selection in multidimensional finite mixture models

Tudor Manole · Jan 20, 2017
Date: 2017-01-20

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Finite mixture models provide a natural framework for analyzing data from heterogeneous populations. In practice, however, the number of hidden subpopulations in the data may be unknown. The problem of estimating the order of a mixture model, namely the number of subpopulations, is thus crucial for many applications. In this talk, we present a new penalized likelihood solution to this problem, which is applicable to models with a multidimensional parameter space. The order of the model is estimated by starting with a large number of mixture components, which are clustered and then merged via two penalty functions. Doing so estimates the unknown parameters of the mixture, at the same time as the order. We will present extensive simulation studies, showing our approach outperforms many of the most common methods for this problem, such as the Bayesian Information Criterion. Real data examples involving normal and multinomial mixtures further illustrate its performance.

Read More…
- Jan 13, 2017
- post
(Sparse) exchangeable graphs

Victor Veitch · Jan 13, 2017
Date: 2017-01-13

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Many popular statistical models for network valued datasets fall under the remit of the graphon framework, which (implicitly) assumes the networks are densely connected. However, this assumption rarely holds for the real-world networks of practical interest. We introduce a new class of models for random graphs that generalises the dense graphon models to the sparse graph regime, and we argue that this meets many of the desiderata one would demand of a model to serve as the foundation for a statistical analysis of real-world networks. The key insight is to define the models by way of a novel notion of exchangeability; this is analogous to the specification of conditionally i.i.d. models by way of de Finetti’s representation theorem. We further develop this model class by explaining the foundations of sampling and estimation of network models in this setting. The later result can be can be understood as the (sparse) graph analogue of estimation via the empirical distribution in the i.i.d. sequence setting.

Read More…
- Dec 2, 2016
- post
Modeling dependence in bivariate multi-state processes: A frailty approach

Andrea Giussani · Dec 2, 2016
Date: 2016-12-02

Time: 15:30-16:30

Location: BURN 1205

Abstract:

The aim of this talk is to present a statistical framework for the analysis of dependent bivariate multistate processes, allowing one to study the dependence both across subjects in a pair and among individual-specific events. As for the latter, copula- based models are employed, whereas dependence between multi-state models can be accomplished by means of frailties. The well known Marshall-Olkin Bivariate Exponential Distribution (MOBVE) is considered for the joint distribution of frailties. The reason is twofold: on the one hand, it allows one to model shocks that affect the two individual-specific frailties; on the other hand, the MOBVE is the only bivariate exponential distribution with exponential marginals, which allows for the modeling of each multi-state process as a shared frailty model. We first discuss a frailty bivariate survival model with some new results, and then move to the construction of the frailty bivariate multi-state model, with the corresponding observed data likelihood maximization estimating procedure in presence of right censoring. The last part of the talk will be dedicated to some open problems related to the modeling of multiple multi-state processes in presence of Marshall-Olkin type copulas.

Read More…
- Nov 25, 2016
- post
Spatio-temporal models for skewed processes

Alexandra Schmidt · Nov 25, 2016
Date: 2016-11-25

Time: 15:30-16:30

Location: BURN 1205

Abstract:

In the analysis of most spatio-temporal processes in environmental studies, observations present skewed distributions. Usually, a single transformation of the data is used to approximate normality, and stationary Gaussian processes are assumed to model the transformed data. The choice of transformation is key for spatial interpolation and temporal prediction. We propose a spatio-temporal model for skewed data that does not require the use of data transformation. The process is decomposed as the sum of a purely temporal structure with two independent components that are considered to be partial realizations from independent spatial Gaussian processes, for each time t. The model has an asymmetry parameter that might vary with location and time, and if this is equal to zero, the usual Gaussian model results. The inference procedure is performed under the Bayesian paradigm, and uncertainty about parameters estimation is naturally accounted for. We fit our model to different synthetic data and to monthly average temperature observed between 2001 and 2011 at monitoring locations located in the south of Brazil. Different model comparison criteria, and analysis of the posterior distribution of some parameters, suggest that the proposed model outperforms standard ones used in the literature. This is joint work with Kelly Gonçalves (UFRJ, Brazil) and Patricia L. Velozo (UFF, Brazil)

Read More…
- Nov 18, 2016
- post
Progress in theoretical understanding of deep learning

Yoshua Bengio · Nov 18, 2016
Date: 2016-11-18

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Deep learning has arisen around 2006 as a renewal of neural networks research allowing such models to have more layers. Theoretical investigations have shown that functions obtained as deep compositions of simpler functions (which includes both deep and recurrent nets) can express highly varying functions (with many ups and downs and different input regions that can be distinguished) much more efficiently (with fewer parameters) than otherwise, under a prior which seems to work well for artificial intelligence tasks. Empirical work in a variety of applications has demonstrated that, when well trained, such deep architectures can be highly successful, remarkably breaking through previous state-of-the-art in many areas, including speech recognition, object recognition, language models, machine translation and transfer learning. Although neural networks have long been considered lacking in theory and much remains to be done, theoretical advances have been made and will be discussed, to support distributed representations, depth of representation, the non-convexity of the training objective, and the probabilistic interpretation of learning algorithms (especially of the auto-encoder type, which were lacking one). The talk will focus on the intuitions behind these theoretical results.

Read More…
- Nov 11, 2016
- post
Tyler's M-estimator: Subspace recovery and high-dimensional regime

Teng Zhang · Nov 11, 2016
Date: 2016-11-11

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Given a data set, Tyler’s M-estimator is a widely used covariance matrix estimator with robustness to outliers or heavy-tailed distribution. We will discuss two recent results of this estimator. First, we show that when a certain percentage of the data points are sampled from a low-dimensional subspace, Tyler’s M-estimator can be used to recover the subspace exactly. Second, in the high-dimensional regime that the number of samples n and the dimension p both go to infinity, p/n converges to a constant y between 0 and 1, and when the data samples are identically and independently generated from the Gaussian distribution N(0,I), we showed that the difference between the sample covariance matrix and a scaled version of Tyler’s M-estimator tends to zero in spectral norm, and the empirical spectral densities of both estimators converge to the Marcenko-Pastur distribution. We also prove that when the data samples are generated from an elliptical distribution, the limiting distribution of Tyler’s M-estimator converges to a Marcenko-Pastur-Type distribution. The second part is joint work with Xiuyuan Cheng and Amit Singer.

Read More…
- Nov 4, 2016
- post
Lawlor: Time-varying mixtures of Markov chains: An application to traffic modeling Piché: Bayesian nonparametric modeling of heterogeneous groups of censored data

Sean Lawlor and Alexandre Piché · Nov 4, 2016
Date: 2016-11-04

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Piché: Analysis of survival data arising from different groups, whereby the data in each group is scarce, but abundant overall, is a common issue in applied statistics. Bayesian nonparametrics are tools of choice to handle such datasets given their ability to share information across groups. In this presentation, we will compare three popular Bayesian nonparametric methods on the modeling of survival functions coming from related heterogeneous groups. Specifically, we will first compare the modeling accuracy of the Dirichlet process, the hierarchical Dirichlet process, and the nested Dirichlet process on simulated datasets of different sizes, where groups differ in shape or in expectation, and finally we will compare the models on real world injury datasets.

Read More…
- Nov 2, 2016
- post
First talk: Bootstrap in practice | Second talk: Statistics and Big Data at Google

Tim Hesterberg · Nov 2, 2016
Date: 2016-11-02

Time: 15:00-16:00 17:35-18:25

Location: 1st: BURN 306 2nd: ADAMS AUD

Abstract:

First talk: This talk focuses on three practical aspects of resampling: communication, accuracy, and software. I’ll introduce the bootstrap and permutation tests, and discussed how they may be used to help clients understand statistical results. I’ll talk about accuracy – there are dramatic differences in how accurate different bootstrap methods are. Surprisingly, the most common bootstrap methods are less accurate than classical methods for small samples, and more accurate for larger samples. There are simple variations that dramatically improve the accuracy. Finally, I’ll compare two R packages, the the easy-to-use “resample” package, and the more-powerful “boot” package.

Read More…

Date: 2017-02-10

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Date: 2017-02-03

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Date: 2017-01-20

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Date: 2017-01-13

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Date: 2016-12-02

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Date: 2016-11-25

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Date: 2016-11-18

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Date: 2016-11-11

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Date: 2016-11-04

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Date: 2016-11-02

Time: 15:00-16:00 17:35-18:25

Location: 1st: BURN 306 2nd: ADAMS AUD

Abstract: