McGill Statistics Seminar - McGill Statistics Seminars

Asymptotic behavior of data driven empirical measures for testing multivariate regular variation

Benjamin Bobbia · Nov 22, 2024
Date: 2024-11-22

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/82125361063

Meeting ID: 821 2536 1063

Passcode: None

Abstract:

Nowadays, empirical processes are well-known objects. One reason that drives their study is that, in many models, estimators can be written as images of empirical measures. This work focuses on local empirical measures built over a sub-sample of data conditioned to lie in a certain region, which itself depends on the data. We present a general framework that allows one to derive asymptotic results for these empirical measures. This approach is then specialized to the framework of extreme value theory. As an application, we detail an asymptotic result from which a test procedure for multivariate regular variation can be derived.

Practical existence theorems for deep learning approximation in high dimensions

Simone Brugiapaglia · Nov 15, 2024
Date: 2024-11-15

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/89043936588

Meeting ID: 890 4393 6588

Passcode: None

Abstract:

Deep learning is having a profound impact on industry and scientific research. Yet, while this paradigm continues to show impressive performance in a wide variety of applications, its mathematical foundations are far from being well understood. Motivated by deep learning methods for scientific computing, I will present new practical existence theorems that aim at bridging the gap between theory and practice in this area. Combining universal approximation results for deep neural networks with sparse high-dimensional polynomial approximation theory, these theorems identify sufficient conditions on the network architecture, the training strategy, and the size of the training set able to guarantee a target accuracy. I will illustrate practical existence theorems in the contexts of high-dimensional function approximation via feedforward networks, reduced order modeling based on convolutional autoencoders, and physics-informed neural networks for high-dimensional PDEs.

A latent-vine factor-copula time series model for extreme flood insurance losses

Christian Genest · Nov 8, 2024
Date: 2024-11-08

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/89121567327

Meeting ID: 891 2156 7327

Passcode: None

Abstract:

Vines and factor copula models are handy tools for statistical inference in high dimension. However, their use for assessing and predicting the co-occurrence of rare events is subject to caution when multivariate extreme data are sparse. Motivated by the need to assess the risk of concurrent large insurance claims in the American National Flood Insurance Program (NFIP), I will describe a novel class of copula models that can account for spatio-temporal dependence within clustered sets of time series. This new class, which combines the advantages of vines and factor copula models, provides great flexibility in capturing tail dependence while maintaining interpretability through a parsimonious latent structure. Using NFIP data, I will show the value of this approach in evaluating the risks associated with extreme weather events.

On Mixture of Experts in Large-Scale Statistical Machine Learning Applications

Nhat Ho · Nov 1, 2024
Date: 2024-11-01

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/81284191962

Meeting ID: 812 8419 1962

Passcode: None

Abstract:

Mixtures of experts (MoEs) are a class of statistical machine learning models that combine multiple models, known as experts, to form more complex and accurate models. They have been incorporated into deep learning architectures to improve the ability of these architectures and AI models to capture the heterogeneity of the data, and to scale up these architectures without increasing the computational cost. In a mixture of experts, each expert specializes in a different aspect of the data, and the experts' outputs are then combined through a gating function to produce the final output. Therefore, parameter and expert estimates play a crucial role by enabling statisticians and data scientists to articulate and make sense of the diverse patterns present in the data. However, the statistical behavior of parameters and experts in a mixture of experts remains an open problem, owing to the complex interaction between the gating function and the expert parameters.
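
As a rough illustration of the gating mechanism described in the abstract (not the speaker's model), the sketch below combines two linear experts through a softmax gate on a scalar input. The function names `softmax` and `moe_predict`, the constants `GATE` and `EXPERTS`, and all parameter values are hypothetical choices made for this example.

```python
import math


def softmax(scores):
    """Turn real-valued scores into nonnegative weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_predict(x, gate_params, expert_params):
    """Return (prediction, gate_weights) for a scalar input x.

    gate_params:   list of (w, b); expert k receives gate score w * x + b
    expert_params: list of (a, c); expert k outputs a * x + c
    """
    scores = [w * x + b for (w, b) in gate_params]
    weights = softmax(scores)
    outputs = [a * x + c for (a, c) in expert_params]
    prediction = sum(g * f for g, f in zip(weights, outputs))
    return prediction, weights


# Hypothetical parameters: expert 0 models y = x and is favoured for x > 0,
# expert 1 models y = -x and is favoured for x < 0.
GATE = [(5.0, 0.0), (-5.0, 0.0)]
EXPERTS = [(1.0, 0.0), (-1.0, 0.0)]

if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        y, g = moe_predict(x, GATE, EXPERTS)
        print(f"x={x:+.1f}  prediction={y:.4f}  gate weights={g}")
```

With these illustrative values the gate hands each half of the real line to a different expert, so the mixture behaves roughly like the absolute value function. The gate weights depend on the input, and the prediction depends jointly on the gate and expert parameters, which is the interaction the abstract refers to.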

New Statistical Methods and Quality Control for Industry 4.0

Antonio Lepore · Oct 25, 2024
Date: 2024-10-25

Time: 15:30-16:30 (Montreal time)

Location: Online, retransmitted in Burnside 1104

https://mcgill.zoom.us/j/81908885431

Meeting ID: 819 0888 5431

Passcode: None

Abstract:

Statistical and machine learning techniques have emerged as powerful tools in Industry 4.0 to benefit from the increasing availability of sensors and data and enable informed decision-making and process optimization. This seminar will provide an overview of several industrial applications in high-dimensional settings developed by the Statistics for Engineering Research (SFERe) group (www.sfere.unina.it), affiliated with the Department of Industrial Engineering at the University of Naples Federico II. In these applications, the definition and use of novel statistical methods have been a competitive advantage for naval, automotive, and rail companies. The open-source R software packages implementing these methods will also be mentioned to highlight their accessibility and potential applicability in different industrial contexts.

A functional data approach for statistical shapes analysis

Issam Moindjié · Oct 11, 2024
Date: 2024-10-11

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/87824357176

Meeting ID: 878 2435 7176

Passcode: None

Abstract:

The shape $\tilde{\mathbf{X}}$ of a random planar curve, $\mathbf{X}$, is what remains when the deformation variables (scaling, rotation, translation, and reparametrization) are removed. Previous studies in statistical shape analysis have focused on analyzing $\tilde{\mathbf{X}}$ through discrete observations of $\mathbf{X}$. While this approach has some computational advantages, it overlooks the continuous nature of the variables $\tilde{\mathbf{X}}$ and $\mathbf{X}$, and it ignores the potential dependence of the deformation variables on each other and on $\tilde{\mathbf{X}}$, which results in a loss of information in the data structure. In this presentation, I will introduce a new framework for studying $\mathbf{X}$ based on functional data analysis. Basis expansion techniques are employed to find analytic solutions for deformation variables such as rotation and parametrization deformations. Then, the generative model of $\mathbf{X}$ is investigated using a joint-principal component analysis approach. Numerical experiments on synthetic and real datasets demonstrate how this new approach performs better at analyzing random planar curves than traditional functional data methods.

VCBART: Bayesian trees for varying coefficients

Ray Bai · Sep 27, 2024
Date: 2024-09-27

Time: 15:30-16:30 (Montreal time)

Location: Online, retransmitted in Burnside 1104

https://mcgill.zoom.us/j/88350756970

Meeting ID: 883 5075 6970

Passcode: None

Abstract:

The linear varying coefficient model posits a linear relationship between an outcome and covariates in which the covariate effects are modeled as functions of additional effect modifiers. Despite a long history of study and use in statistics and econometrics, state-of-the-art varying coefficient modeling methods cannot accommodate multivariate effect modifiers without imposing restrictive functional form assumptions or involving computationally intensive hyperparameter tuning. In response, we introduce VCBART, which flexibly estimates the covariate effects in a varying coefficient model using Bayesian Additive Regression Trees. With simple default settings, VCBART outperforms existing varying coefficient methods in terms of covariate effect estimation, uncertainty quantification, and outcome prediction. Theoretically, we show that the VCBART posterior contracts at the near-minimax optimal rate. Finally, we illustrate the utility of VCBART through simulation studies and a real data application examining how the association between later-life cognition and measures of socioeconomic position varies with respect to age and sociodemographics.
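
For orientation, a linear varying coefficient model is commonly written as below. The notation ($Y$, $X_j$, $Z$, $\beta_j$, $\varepsilon$, $p$) is chosen here for illustration and may differ from the speaker's.

$$Y = \beta_0(Z) + \sum_{j=1}^{p} \beta_j(Z)\, X_j + \varepsilon$$

Here $Y$ is the outcome, $X_1, \dots, X_p$ are the covariates, $Z$ collects the effect modifiers, $\varepsilon$ is a noise term, and each coefficient function $\beta_j$ is unknown. Per the abstract, VCBART estimates these covariate effects with Bayesian Additive Regression Trees rather than a fixed functional form.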

Variance reduction by occluding a Markov chain

Florian Maire · Sep 20, 2024
Date: 2024-09-20

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/88265323185

Meeting ID: 882 6532 3185

Passcode: None

Abstract:

Stochastic algorithms that simulate random variables/processes on a computer to estimate intractable quantities are ubiquitous in Statistics and elsewhere. One such method is Markov chain Monte Carlo, which, under mild conditions, offers asymptotic (in time) guarantees. In this talk, we define infinitely many stopping times at which an ergodic Markov chain is occluded by a (conditionally) independent process. The resulting process, called the occluded process, is not Markov, but provided that the stopping times/independent process are cleverly defined, we show that it is ergodic. One particularly powerful way to define the stopping times/independent process leverages the recent advances in ML regarding approximations of probability distributions (divergence minimization, normalizing flows, etc.). We discuss the variance reduction effect of the occluded process through some illustrations and (weak) theoretical results in some limiting regime.

Free energy fluctuations of spherical spin glasses near the critical temperature threshold

Elizabeth Collins-Woodfin · Apr 12, 2024
Date: 2024-04-12

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/86957985232

Meeting ID: 869 5798 5232

Passcode: None

Abstract:

One of the fascinating phenomena of spin glasses is the dramatic change in behavior that occurs between the high and low temperature regimes. In addition to its physical meaning, this phase transition corresponds to a detection threshold with respect to the signal-to-noise ratio in a spiked matrix model. The free energy of the spherical Sherrington-Kirkpatrick (SSK) model has Gaussian fluctuations at high temperature, but Tracy-Widom fluctuations at low temperature. A similar phenomenon holds for the bipartite SSK model, and we show that, when the temperature is within a small window around the critical temperature, the free energy fluctuations converge to an independent sum of Gaussian and Tracy-Widom random variables (joint work with Han Le). Our work follows two recent papers that proved similar results for the SSK model (by Landon and by Johnstone, Klochkov, Onatski, Pavlyshyn). From a statistical perspective, the free energies of SSK and bipartite SSK correspond to log-likelihood ratios for spiked Wigner and spiked Wishart matrices, respectively. Analyzing bipartite SSK at critical temperature requires a variety of tools including classical random matrix results, contour integral techniques, and a CLT for the log-characteristic polynomial of Wishart random matrices evaluated near the spectral edge.

Minimum Covariance Determinant: Spectral Embedding and Subset Size Determination

Qiang Heng · Mar 22, 2024
Date: 2024-03-22

Time: 15:30-16:30 (Montreal time)

Location: Online, retransmitted in Burnside 1104

https://mcgill.zoom.us/j/81895414756

Meeting ID: 818 9541 4756

Passcode: None

Abstract:

This paper introduces several enhancements to the minimum covariance determinant method of outlier detection and robust estimation of means and covariances. We leverage the principal component transform to achieve dimension reduction and ultimately better analyses. Our best subset selection algorithm strategically combines statistical depth and concentration steps. To ascertain the appropriate subset size and number of principal components, we introduce a bootstrap procedure that estimates the instability of the best subset algorithm. The parameter combination exhibiting minimal instability proves ideal for the purposes of outlier detection and robust estimation. Rigorous benchmarking against prominent MCD variants showcases our approach’s superior statistical performance and computational speed in high dimensions. Application to a fruit spectra data set and a cancer genomics data set illustrates our claims.
