Past Seminar Series - McGill Statistics Seminars

- Nov 7, 2025
Towards Efficient and Reliable Generative and Sampling Models

Tianshu Yu · Nov 7, 2025
Date: 2025-11-07

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/87181846336

Meeting ID: 871 8184 6336

Passcode: None

Abstract:

This talk presents a unified framework for enhancing the reliability and geometric fidelity of generative models. We first develop a diffusion mechanism defined intrinsically on the SE(3) manifold, enabling efficient sampling. To address the critical issue of mode collapse in energy-based samplers, we introduce a novel Importance Weighted Score Matching method that provably improves coverage of complex, multi-modal distributions. Finally, we extend these principles to infer underlying dynamical systems directly from incomplete and scattered training data. Collectively, this work bridges geometric consistency, statistical reliability, and learning from partial observations to advance the frontiers of generative and sampling models.

- Oct 24, 2025
Regularized Fine-Tuning for Representation Multi-Task Learning: Adaptivity, Minimaxity, and Robustness

Yang Feng · Oct 24, 2025
Date: 2025-10-24

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/81872329544

Meeting ID: 818 7232 9544

Passcode: None

Abstract:

We study multi-task linear regression for a collection of tasks that share a latent, low-dimensional structure. Each task’s regression vector belongs to a subspace whose dimension, called the intrinsic dimension, is much smaller than the ambient dimension. Unlike classical analyses that assume an identical subspace for every task, we allow each task’s subspace to drift from a single reference subspace by a controllable similarity radius, and we permit an unknown fraction of tasks to be outliers that violate the shared-structure assumption altogether. Our contributions are threefold. First, adaptivity: we design a penalized empirical-risk algorithm and a spectral method. Both algorithms automatically adjust to the unknown similarity radius and to the proportion of outliers. Second, minimaxity: we prove information-theoretic lower bounds on the best achievable prediction risk over this problem class and show that both algorithms attain these bounds up to constant factors; when no outliers are present, the spectral method is exactly minimax-optimal. Third, robustness: for every choice of similarity radius and outlier proportion, the proposed estimators never incur larger expected prediction error than independent single-task regression, while delivering strict improvements whenever tasks are even moderately similar and outliers are sparse. Additionally, we introduce a thresholding algorithm to adapt to an unknown intrinsic dimension. We conduct extensive numerical experiments to validate our theoretical findings.

- Oct 10, 2025
K-contact Distance for Noisy Nonhomogeneous Spatial Point Data and Application to Repeating Fast Radio Burst Sources

Amanda M. Cook · Oct 10, 2025
Date: 2025-10-10

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/81986712072

Meeting ID: 819 8671 2072

Passcode: None

Abstract:

In this talk, I’ll introduce an approach to analyzing nonhomogeneous Poisson processes (NHPPs) observed with noise, one that focuses on previously unstudied second-order characteristics of the noisy process. Using a hierarchical Bayesian model with noisy data, we first estimate the hyperparameters governing a physically motivated NHPP intensity. Leveraging the posterior distribution, we then infer the probability of detecting a certain number of events within a given radius, the $k$-contact distance. This methodology is demonstrated by its motivating application: observations of fast radio bursts (FRBs) detected by the Canadian Hydrogen Intensity Mapping Experiment’s FRB Project (CHIME/FRB). The approach allows us to identify repeating FRB sources by computing the probability of observing $k$ physically independent sources within some radius in the detection domain, or the probability of coincidence ($P_C$). Applied to the largest sample of previously classified observations, the new methodology improves the repeater detection $P_C$ in 86% of cases, with a median improvement factor (existing metric over $P_C$ from our methodology) of ~3000. Throughout the talk, I will provide the necessary astrophysical context to motivate the application and highlight some of the other active statistical problems in FRB science.

- Oct 3, 2025
Convergence Guarantees for Adversarially Robust Classifiers

Rachel Morris · Oct 3, 2025
Date: 2025-10-03

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/82469112499

Meeting ID: 824 6911 2499

Passcode: None

Abstract:

Neural networks can be trained to classify images and achieve high levels of accuracy. However, researchers have discovered that well-targeted perturbations of an image can completely fool a trained classifier, even in cases where the modified image is visually indistinguishable from the original. This has sparked many new approaches to classification which include an adversary in the training process: such an adversary can improve robustness and generalization properties at the cost of decreased accuracy and increased training time. In this presentation, I will explore the connection between a certain class of adversarial training problems and the Bayes classification problem for binary classification. In particular, robustness can be encouraged by adding a regularizing nonlocal perimeter term, providing a strong connection to classical studies of perimeter. Borrowing tools from geometric measure theory, I will show the Hausdorff convergence of adversarially robust classifiers to Bayes classifiers as the strength of the adversary decreases to 0. In this way, the theoretical results discussed in the presentation provide a rigorous comparison with the standard Bayes classification problem.

- Sep 26, 2025
Sparse Causal Learning: Challenges and Opportunities

Dingke Tang · Sep 26, 2025
Date: 2025-09-26

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/81200178578

Meeting ID: 812 0017 8578

Passcode: None

Abstract:

In many observational studies, researchers are interested in studying the effects of multiple exposures on a single outcome. Standard approaches for high-dimensional data, such as the lasso, assume that the associations between the exposures and the outcome are sparse. These methods, however, do not estimate the causal effects in the presence of unmeasured confounding. In this paper, we consider an alternative approach that assumes the causal effects in view are sparse. We show that with sparse causation, the causal effects are identifiable even with unmeasured confounding. At the core of our proposal is a novel device, called the synthetic instrument, that, in contrast to standard instrumental variables, can be constructed using the observed exposures directly. We show that under linear structural equation models, the problem of causal effect estimation can be formulated as an $\ell_0$-penalization problem and hence can be solved efficiently using off-the-shelf software. Simulations show that our approach outperforms state-of-the-art methods in both low-dimensional and high-dimensional settings. We further illustrate our method using a mouse obesity dataset.

- Sep 19, 2025
Optimal vintage factor analysis with deflation varimax

Xin Bing · Sep 19, 2025
Date: 2025-09-19

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/83914219181

Meeting ID: 839 1421 9181

Passcode: None

Abstract:

Vintage factor analysis is an important type of factor analysis that aims first to find a low-dimensional representation of the original data, and then to seek a rotation such that the rotated low-dimensional representation is scientifically meaningful. The most widely used form of vintage factor analysis is Principal Component Analysis (PCA) followed by the varimax rotation. Despite its popularity, few theoretical guarantees are available to date, mainly because the varimax rotation requires solving a non-convex optimization problem over the set of orthogonal matrices.

- Sep 12, 2025
Proper Correlation Coefficients for Nominal Random Variables

Lukas Wermuth · Sep 12, 2025
Date: 2025-09-12

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/88021402798

Meeting ID: 880 2140 2798

Passcode: None

Abstract:

This work develops an intuitive concept of perfect dependence between two variables, at least one of which has a nominal scale, that is attainable for all marginal distributions, and proposes a set of dependence measures that equal 1 if and only if this perfect dependence holds. The advantages of these dependence measures relative to classical dependence measures like contingency coefficients, Goodman-Kruskal’s lambda and tau, and the so-called uncertainty coefficient are twofold. Firstly, they are defined if one of the variables is real-valued and exhibits continuities. Secondly, they satisfy the property of attainability; that is, they can take all values in the interval [0,1] irrespective of the marginals involved. Neither property is shared by the classical dependence measures, which need two discrete marginal distributions and can in some situations yield values close to 0 even though the dependence is strong or even perfect. Additionally, this work provides a consistent estimator for one of the new dependence measures together with its asymptotic distribution under independence as well as in the general case. This allows the construction of confidence intervals and an independence test, whose finite-sample performance is subsequently examined in a simulation study. Finally, we illustrate the use of the new dependence measure in two applications on the dependence between the variables country and income or country and religion, respectively.

- May 23, 2025
GARCH copulas, v-transforms and D-vines

Alexander McNeil · May 23, 2025
Date: 2025-05-23

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/89626299031

Meeting ID: 896 2629 9031

Passcode: None

Abstract:

Stationary models from the GARCH class have proved to be extremely useful for forecasting volatility and measuring risk in financial time series. However, the nature of their implied copulas is opaque.

We analyse the serial dependence structure of first-order GARCH-type models in terms of the implied bivariate copulas that describe the dependence and partial dependence of pairs of variables at different lags. Our aim is to understand whether such dependence structures could be approximated with appropriately chosen bivariate copulas arranged in D-vines.

- Apr 4, 2025
Normalization effects on deep neural networks and deep learning for scientific problems

Konstantinos Spiliopoulos · Apr 4, 2025
Date: 2025-04-04

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/81100654212

Meeting ID: 811 0065 4212

Passcode: None

Abstract:

We study the effect of normalization on the layers of deep neural networks. A given layer $i$ with $N_{i}$ hidden units is normalized by $1/N_{i}^{\gamma_{i}}$ with $\gamma_{i}\in[1/2,1]$. We study the effect of the choice of the $\gamma_{i}$ on the statistical behavior of the neural network’s output (such as variance) as well as on the test accuracy and generalization properties of the architecture. We find that in terms of variance of the neural network’s output and test accuracy the best choice is to set the $\gamma_{i}$’s equal to one, which is the mean-field scaling. We also find that this is particularly true for the outer layer, in that the neural network’s behavior is more sensitive to the scaling of the outer layer than to the scaling of the inner layers. The mechanism for the mathematical analysis is an asymptotic expansion for the neural network’s output. An important practical consequence of the analysis is that it provides a systematic and mathematically informed way to choose the learning rate hyperparameters. Such a choice guarantees that the neural network behaves in a statistically robust way as the number of hidden units $N_i$ grows. Time permitting, I will discuss applications of these ideas to the design of deep learning algorithms for scientific problems, including solving high-dimensional partial differential equations (PDEs), closure of PDE models, and reinforcement learning, with applications to financial engineering, turbulence and more.
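As a rough illustration of the $1/N^{\gamma}$ normalization only, the sketch below draws a one-hidden-layer network $g(x) = N^{-\gamma}\sum_{i=1}^{N} c_i \tanh(w_i x)$ with independent standard normal $c_i$ and $w_i$ and estimates the variance of its output at initialization. The architecture, the tanh activation, the Gaussian weights, and every function name are assumptions made for this example; they are not taken from the talk, and the sketch does not reproduce the asymptotic expansion or the trained-network results described in the abstract. Under these assumptions the variance is $N^{1-2\gamma}\,\mathbb{E}[\tanh(wx)^2]$, so it stays of order one for $\gamma = 1/2$ and decays like $1/N$ for $\gamma = 1$.

```python
import math
import random


def network_output(x, n_hidden, gamma, rng):
    """One draw of g(x) = N^{-gamma} * sum_i c_i * tanh(w_i * x).

    The weights c_i and w_i are fresh standard normal draws (an assumption
    of this sketch, not something specified in the abstract).
    """
    total = 0.0
    for _ in range(n_hidden):
        c = rng.gauss(0.0, 1.0)
        w = rng.gauss(0.0, 1.0)
        total += c * math.tanh(w * x)
    return total / n_hidden ** gamma


def output_variance(x, n_hidden, gamma, reps=2000, seed=0):
    """Monte Carlo estimate of Var[g(x)] over random initializations."""
    rng = random.Random(seed)
    draws = [network_output(x, n_hidden, gamma, rng) for _ in range(reps)]
    mean = sum(draws) / reps
    return sum((d - mean) ** 2 for d in draws) / (reps - 1)


def variance_scale(n_hidden, gamma):
    """Theoretical N-dependence of the variance under the assumptions above."""
    return n_hidden ** (1.0 - 2.0 * gamma)


if __name__ == "__main__":
    for gamma in (0.5, 1.0):
        for n_hidden in (50, 800):
            v = output_variance(1.0, n_hidden, gamma)
            print(f"gamma={gamma}, N={n_hidden}: variance ~ {v:.4f}")
```

Comparing the printed values for $N = 50$ and $N = 800$ shows the two regimes: roughly constant variance for $\gamma = 1/2$ and a reduction by about a factor of 16 for $\gamma = 1$.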

- Mar 28, 2025
From the distribution of string counts in Bernoulli sequences to multivariate discrete models

Éric Marchand · Mar 28, 2025
Date: 2025-03-28

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/85849766730

Meeting ID: 858 4976 6730

Passcode: None

Abstract:

I will provide a personalized account of a sequence of problems that I have worked on over the years, beginning with string counts in Bernoulli sequences and transitioning to multivariate discrete models. As a starting point, we consider independent Bernoulli trials with varying success probabilities 1/k for the kth trial, the sum of the products of two consecutive occurrences, and the problem of establishing that the sum is Poisson distributed with mean equal to 1. We will explain how this finding connects to cycles in random permutations, records for continuous random variables, the Hoppe-Polya urn, and the classical Montmort matching problem.
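The starting point described above can be checked numerically. The sketch below simulates independent trials $X_k \sim \mathrm{Bernoulli}(1/k)$, forms the truncated sum $S_n = \sum_{k=1}^{n-1} X_k X_{k+1}$, and compares its empirical distribution with the Poisson(1) probabilities $e^{-1}/j!$. The truncation level `n`, the number of replications, and all function names are choices made for this illustration and do not come from the talk. The exact mean of the truncated sum is $\sum_{k=1}^{n-1} \frac{1}{k(k+1)} = 1 - \frac{1}{n}$, which tends to 1 as $n$ grows.

```python
import random
from fractions import Fraction
from math import exp, factorial


def exact_truncated_mean(n):
    """E[S_n] = sum_{k=1}^{n-1} 1/(k(k+1)), computed in exact arithmetic."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n))


def simulate_count(n, rng):
    """One draw of S_n = sum_{k=1}^{n-1} X_k X_{k+1}, X_k ~ Bernoulli(1/k)."""
    prev = 1  # X_1 succeeds with probability 1/1
    total = 0
    for k in range(2, n + 1):
        cur = 1 if rng.random() < 1.0 / k else 0
        total += prev * cur
        prev = cur
    return total


def empirical_pmf(n, reps, seed=0):
    """Empirical distribution of S_n over `reps` independent replications."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(reps):
        s = simulate_count(n, rng)
        counts[s] = counts.get(s, 0) + 1
    return {s: c / reps for s, c in sorted(counts.items())}


def poisson1_pmf(j):
    """P(Z = j) for Z ~ Poisson(1)."""
    return exp(-1.0) / factorial(j)


if __name__ == "__main__":
    pmf = empirical_pmf(n=200, reps=5000)
    for j, p in pmf.items():
        print(f"j={j}: empirical {p:.4f}, Poisson(1) {poisson1_pmf(j):.4f}")
```

Because the sum is truncated at `n`, the simulated distribution matches Poisson(1) only approximately; the discrepancy shrinks as `n` increases.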

