2025 Fall - McGill Statistics Seminars
  • K-contact Distance for Noisy Nonhomogeneous Spatial Point Data and Application to Repeating Fast Radio Burst Sources

    Date: 2025-10-10

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/81986712072

    Meeting ID: 819 8671 2072

    Passcode: None

    Abstract:

    In this talk, I’ll introduce an approach to analyzing nonhomogeneous Poisson processes (NHPPs) observed with noise, one that focuses on previously unstudied second-order characteristics of the noisy process. Using a hierarchical Bayesian model for the noisy data, we first estimate hyperparameters governing a physically motivated NHPP intensity. Leveraging the posterior distribution, we then infer the probability of detecting a certain number of events within a given radius, via the $k$-contact distance. This methodology is demonstrated on its motivating application: observations of fast radio bursts (FRBs) detected by the Canadian Hydrogen Intensity Mapping Experiment’s FRB Project (CHIME/FRB). The approach allows us to identify repeating FRB sources by computing the probability of observing $k$ physically independent sources within some radius in the detection domain, the probability of coincidence ($P_C$). When applied to the largest sample of previously classified observations, the new methodology improves the repeater-detection $P_C$ in 86% of cases, with a median improvement factor (existing metric over $P_C$ from our methodology) of ~3000. Throughout the talk, I will provide the necessary astrophysical context to motivate the application and highlight some of the other active statistical problems in FRB science.
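
    For intuition only, here is a minimal Python sketch (with a made-up intensity; the function names are illustrative, not from the talk) of the quantity behind the $k$-contact distance: the number of NHPP events falling in a disc is Poisson with mean equal to the integrated intensity, so the probability that the $k$-contact distance is at most $r$ is a Poisson upper tail.

        import numpy as np
        from scipy import integrate, stats

        def disc_mean_count(intensity, x0, y0, r):
            """Integrate an NHPP intensity over the disc of radius r at
            (x0, y0) using polar coordinates (illustrative helper)."""
            integrand = lambda rho, theta: rho * intensity(
                x0 + rho * np.cos(theta), y0 + rho * np.sin(theta))
            val, _ = integrate.dblquad(integrand, 0.0, 2.0 * np.pi, 0.0, r)
            return val

        def prob_k_contact_within(intensity, x0, y0, r, k):
            """P(k-contact distance <= r) = P(N(disc) >= k), a Poisson tail."""
            lam = disc_mean_count(intensity, x0, y0, r)
            return stats.poisson.sf(k - 1, lam)

        # Toy smooth intensity, purely illustrative.
        intensity = lambda x, y: 5.0 * np.exp(-(x**2 + y**2) / 2.0)
        print(prob_k_contact_within(intensity, 0.0, 0.0, 0.5, k=2))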

  • Convergence Guarantees for Adversarially Robust Classifiers

    Date: 2025-10-03

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/82469112499

    Meeting ID: 824 6911 2499

    Passcode: None

    Abstract:

    Neural networks can be trained to classify images and achieve high levels of accuracy. However, researchers have discovered that well-targeted perturbations of an image can completely fool a trained classifier, even when the modified image is visually indistinguishable from the original. This has sparked many new approaches to classification that include an adversary in the training process: such an adversary can improve robustness and generalization properties at the cost of decreased accuracy and increased training time. In this presentation, I will explore the connection between a certain class of adversarial training problems and the Bayes classification problem for binary classification. In particular, robustness can be encouraged by adding a regularizing nonlocal perimeter term, providing a strong connection to classical studies of perimeter. Borrowing tools from geometric measure theory, I will show the Hausdorff convergence of adversarially robust classifiers to Bayes classifiers as the strength of the adversary decreases to 0. In this way, the theoretical results discussed in the presentation provide a rigorous comparison with the standard Bayes classification problem.
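
    As a toy illustration of this convergence (a one-dimensional stand-in, not the geometric-measure-theoretic setting of the talk; all distributions and priors below are assumptions), the sketch lets an adversary shift each input by up to eps before a threshold rule is applied and minimizes the resulting adversarial risk; as eps decreases to 0, the robust threshold approaches the Bayes threshold.

        import numpy as np
        from scipy import optimize, stats

        # Two 1-D Gaussian classes with unequal priors (toy setup).
        p0, p1 = 0.3, 0.7
        F0 = stats.norm(loc=-1.0, scale=1.0).cdf   # class 0
        F1 = stats.norm(loc=+1.0, scale=1.0).cdf   # class 1

        def adversarial_risk(t, eps):
            """Risk of 'predict 1 iff x >= t' when an adversary may move
            each input by up to eps before classification."""
            err1 = F1(t + eps)          # true 1s pushable below threshold
            err0 = 1.0 - F0(t - eps)    # true 0s pushable above threshold
            return p1 * err1 + p0 * err0

        for eps in [0.5, 0.1, 0.01, 0.0]:
            res = optimize.minimize_scalar(adversarial_risk, args=(eps,),
                                           bounds=(-5, 5), method="bounded")
            print(f"eps={eps:5}: robust threshold = {res.x:+.4f}")
        # eps = 0 recovers the Bayes threshold, ln(p0/p1)/2 ~ -0.4236 here.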

  • Sparse Causal Learning: Challenges and Opportunities

    Date: 2025-09-26

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/81200178578

    Meeting ID: 812 0017 8578

    Passcode: None

    Abstract:

    In many observational studies, researchers are often interested in studying the effects of multiple exposures on a single outcome. Standard approaches for high-dimensional data, such as the lasso, assume the associations between the exposures and the outcome are sparse. These methods, however, do not estimate the causal effects in the presence of unmeasured confounding. In this paper, we consider an alternative approach that instead assumes the causal effects in view are sparse. We show that with sparse causation, the causal effects are identifiable even with unmeasured confounding. At the core of our proposal is a novel device, called the synthetic instrument, which, in contrast to standard instrumental variables, can be constructed using the observed exposures directly. We show that under linear structural equation models, the causal effect estimation problem can be formulated as an $\ell_0$-penalization problem and hence can be solved efficiently using off-the-shelf software. Simulations show that our approach outperforms state-of-the-art methods in both low-dimensional and high-dimensional settings. We further illustrate our method using a mouse obesity dataset.
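
    A minimal sketch of the $\ell_0$-penalized least-squares form (a brute-force best-subset search over a handful of exposures; it does not implement the synthetic-instrument device, so the confounding bias in the toy simulation below remains):

        import itertools
        import numpy as np

        def l0_penalized_ols(X, y, lam):
            """Minimize ||y - X b||^2 + lam * ||b||_0 by exhaustive search;
            a brute-force stand-in for off-the-shelf l0 solvers."""
            n, p = X.shape
            best_obj, best_beta = float(y @ y), np.zeros(p)  # empty model
            for size in range(1, p + 1):
                for S in itertools.combinations(range(p), size):
                    cols = list(S)
                    b, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
                    resid = y - X[:, cols] @ b
                    obj = resid @ resid + lam * size
                    if obj < best_obj:
                        best_obj = obj
                        best_beta = np.zeros(p)
                        best_beta[cols] = b
            return best_beta

        rng = np.random.default_rng(0)
        n, p = 500, 6
        U = rng.normal(size=n)                      # unmeasured confounder
        X = rng.normal(size=(n, p)) + U[:, None]    # confounded exposures
        y = 2.0 * X[:, 0] + U + rng.normal(size=n)  # sparse causal effect
        print(l0_penalized_ols(X, y, lam=10.0).round(2))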

  • Optimal vintage factor analysis with deflation varimax

    Date: 2025-09-19

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/83914219181

    Meeting ID: 839 1421 9181

    Passcode: None

    Abstract:

    Vintage factor analysis is an important type of factor analysis that aims first to find a low-dimensional representation of the original data, and then to seek a rotation such that the rotated low-dimensional representation is scientifically meaningful. The most widely used vintage factor analysis is Principal Component Analysis (PCA) followed by the varimax rotation. Despite its popularity, few theoretical guarantees have been available to date, mainly because the varimax rotation requires solving a non-convex optimization over the set of orthogonal matrices.
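
    A minimal sketch of the two-step pipeline, using the classical varimax iteration (the standard SVD-based updates on toy data; the deflation variant from the title is not implemented here):

        import numpy as np

        def varimax(L, tol=1e-6, max_iter=500):
            """Rotate a loading matrix L (p x k) by an orthogonal R that
            maximizes the variance of the squared loadings (varimax)."""
            p, k = L.shape
            R, d = np.eye(k), 0.0
            for _ in range(max_iter):
                d_old = d
                Lr = L @ R
                U, s, Vt = np.linalg.svd(
                    L.T @ (Lr**3 - Lr @ np.diag((Lr**2).sum(axis=0)) / p))
                R = U @ Vt
                d = s.sum()
                if d_old != 0 and d / d_old < 1 + tol:
                    break
            return L @ R, R

        # Step 1: PCA loadings; Step 2: varimax rotation (toy data).
        rng = np.random.default_rng(1)
        X = rng.normal(size=(200, 10))
        X -= X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        loadings = Vt[:3].T * s[:3] / np.sqrt(len(X))  # top-3 components
        rotated, R = varimax(loadings)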

  • Proper Correlation Coefficients for Nominal Random Variables

    Date: 2025-09-12

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/88021402798

    Meeting ID: 880 2140 2798

    Passcode: None

    Abstract:

    This work develops an intuitive concept of perfect dependence between two variables, of which at least one has a nominal scale, that is attainable for all marginal distributions, and proposes a set of dependence measures that are 1 if and only if this perfect dependence is satisfied. The advantages of these dependence measures relative to classical dependence measures like contingency coefficients, Goodman-Kruskal’s lambda and tau, and the so-called uncertainty coefficient are twofold. Firstly, they are defined even if one of the variables is real-valued and has a continuous distribution. Secondly, they satisfy the property of attainability: they can take all values in the interval [0,1] irrespective of the marginals involved. Neither property is shared by the classical dependence measures, which require two discrete marginal distributions and can in some situations yield values close to 0 even though the dependence is strong or even perfect. Additionally, this work provides a consistent estimator for one of the new dependence measures, together with its asymptotic distribution under independence as well as in the general case. This allows the construction of confidence intervals and an independence test, whose finite-sample performance is subsequently examined in a simulation study. Finally, we illustrate the use of the new dependence measure in two applications, on the dependence between the variables country and income, and country and religion, respectively.
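
    For contrast, the sketch below computes one of the classical measures named above, Goodman-Kruskal’s lambda, on a made-up contingency table that is clearly dependent yet yields lambda = 0, illustrating the attainability failure the abstract describes:

        import numpy as np

        def goodman_kruskal_lambda(table):
            """Goodman-Kruskal's lambda for predicting the column variable
            from the row variable of a contingency table."""
            t = np.asarray(table, dtype=float)
            n = t.sum()
            max_col_total = t.sum(axis=0).max()
            sum_row_maxes = t.max(axis=1).sum()
            return (sum_row_maxes - max_col_total) / (n - max_col_total)

        # The same column is modal in every row, so lambda = 0 even though
        # the two rows have visibly different column distributions.
        print(goodman_kruskal_lambda([[30, 20], [25, 25]]))  # 0.0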