2026 Winter - McGill Statistics Seminars
  • Hierarchical Clustering With Confidence

    Date: 2026-02-20

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/82441217734

    Meeting ID: 824 4121 7734

    Passcode: None

    Abstract:

    Hierarchical clustering is one of the most widely used approaches for exploring data. However, its greedy nature makes it highly sensitive to small perturbations, blurring the lines between genuine structure and spurious patterns. In this work, we show how randomizing hierarchical clustering can be useful not just for assessing clustering stability but also for designing valid hypothesis testing procedures based on clustering results. In particular, we propose a method for constructing a valid p-value at each node of the hierarchical clustering dendrogram that quantifies evidence against performing the greedy merge. Furthermore, we show how our p-values can be used to estimate the number of clusters, with a probabilistic guarantee on overestimation of the number of clusters. This is joint work with Di Wu and Snigdha Panigrahi.

  • Asymptotic Behavior, Risk Measures, and Simulation of Distorted Copulas

    Date: 2026-02-13

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/81379129957

    Meeting ID: 813 7912 9957

    Passcode: None

    Abstract:

    Distorting multivariate distributions is a useful approach for introducing flexibility and capturing model uncertainty. In particular, applying distortions to the copulas representing the underlying dependence structure allows one to generate new, flexible dependence models from existing ones. In this presentation, we investigate the extremal domain of attraction problem for Morillas-type distorted copulas. We establish conditions under which such copula-to-copula transformations alter the asymptotic behavior, and we discuss conditions under which the distorted copula remains in the same domain of attraction as the initial undistorted copula. Furthermore, we discuss the effect of these distortions on multivariate risk measures, such as the lower-orthant Value-at-Risk and Range-Value-at-Risk. Finally, we propose a simulation algorithm for Morillas-type distorted copulas, addressing a gap in the literature and providing the means to utilize these modified dependence structures in practice. We end the presentation with an application of distorted copula models to hail insurance.

  • Survival analysis of extreme events with missing observations

    Date: 2026-01-23

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/82195728045

    Meeting ID: 821 9572 8045

    Passcode: None

    Abstract:

    The analysis of extreme wave surge heights in Atlantic Canada is key in determining areas that are subject to flooding or at risk of severe damage from intense storms. One method for modelling extreme events is the block maxima approach, which divides a series of observations into equal-sized blocks and extracts the maximum of each block, after which inference is conducted on the generalized extreme value (GEV) distribution. When observations at the series level are missing, the observed block maxima may not correspond to the true block maxima. In this presentation, we introduce this missing data problem in the context of an extreme value analysis and explain how concepts from survival analysis can be used to improve inferences on the GEV distribution using the observed block maxima.
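
The missing data problem described in this abstract can be illustrated with a minimal sketch. It is not the speaker's method: the function name `block_maxima`, the encoding of missing values as `None`, and the surge heights are all hypothetical. It only shows that an observed block maximum can fall below the true one when the largest value in a block is missing.

```python
def block_maxima(series, block_size):
    """Return (observed_max, n_missing) for each complete block.

    Missing observations are encoded as None (an assumption of this sketch).
    A block with no observed values yields an observed_max of None.
    """
    results = []
    n_blocks = len(series) // block_size  # a trailing partial block is dropped
    for b in range(n_blocks):
        block = series[b * block_size:(b + 1) * block_size]
        observed = [x for x in block if x is not None]
        n_missing = len(block) - len(observed)
        observed_max = max(observed) if observed else None
        results.append((observed_max, n_missing))
    return results


# Hypothetical surge heights in metres: three blocks of four observations.
full = [0.4, 1.2, 0.9, 0.7, 2.1, 0.3, 0.8, 1.0, 0.6, 0.5, 1.7, 0.2]
# The same series with two values missing; one of them (2.1) is a true block maximum.
gappy = [0.4, 1.2, 0.9, None, None, 0.3, 0.8, 1.0, 0.6, 0.5, 1.7, 0.2]

true_maxima = [m for m, _ in block_maxima(full, 4)]
observed = block_maxima(gappy, 4)

for true_max, (obs_max, n_missing) in zip(true_maxima, observed):
    print(f"true={true_max} observed={obs_max} missing={n_missing}")
```

In the second block the observed maximum (1.0) understates the true maximum (2.1), so the observed value acts only as a lower bound whenever a block has missing observations. That bound is the kind of partial information that censoring ideas from survival analysis are designed to handle.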

  • A General Framework for Testing Clustering Significance and Variable-Level Inference in High-Dimensional Data

    Date: 2026-01-16

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/89692052783

    Meeting ID: 896 9205 2783

    Passcode: None

    Abstract:

    Clustering is a fundamental tool for uncovering heterogeneity in data, yet a longstanding challenge lies in assessing whether detected clusters represent genuine structure or arise from sampling variability, and in determining which variables drive the clustering structure. Statistical significance clustering (SigClust; Liu et al., 2008) addresses the first challenge by testing the cluster index under a Gaussian null, estimating its distribution via Monte Carlo simulation in high dimensions. We propose SigClust-DE, which builds on recent advances in high-dimensional covariance estimation to improve the accuracy of SigClust and extends it to variable-level inference. In particular, SigClust-DE unifies clustering significance testing and differential expression (DE) analysis, a central task in RNA-seq studies. By leveraging the Monte Carlo framework, our method controls type I error while maintaining high power for variable selection. Through extensive simulations and an application to RNA-seq data, we show that SigClust-DE achieves more accurate covariance estimation, effectively controls false discoveries, and substantially improves power in detecting differentially expressed variables. It thereby provides a general framework for clustering significance and variable-level inference in high-dimensional data.
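
The idea of testing a cluster index against a Gaussian null by Monte Carlo can be sketched in one dimension. This is a deliberately simplified illustration, not SigClust or SigClust-DE: it assumes the usual definition of the 2-means cluster index (within-cluster sum of squares divided by total sum of squares), uses a plug-in mean and standard deviation for the null, and ignores the high-dimensional covariance estimation that the abstract focuses on. The function names are hypothetical.

```python
import random
import statistics


def sum_sq(values):
    """Sum of squared deviations from the mean."""
    m = statistics.fmean(values)
    return sum((v - m) ** 2 for v in values)


def cluster_index(values):
    """2-means cluster index for 1-D data: minimal within-cluster SS / total SS.

    In one dimension the optimal 2-means partition is a contiguous split of
    the sorted data, so every split point can be checked exhaustively.
    """
    xs = sorted(values)
    total = sum_sq(xs)
    best = min(sum_sq(xs[:k]) + sum_sq(xs[k:]) for k in range(1, len(xs)))
    return best / total


def monte_carlo_p_value(values, n_sim=200, seed=0):
    """Estimate P(null cluster index <= observed) under a fitted Gaussian."""
    rng = random.Random(seed)
    observed = cluster_index(values)
    mu = statistics.fmean(values)
    sigma = statistics.stdev(values)
    hits = 0
    for _ in range(n_sim):
        null_sample = [rng.gauss(mu, sigma) for _ in values]
        if cluster_index(null_sample) <= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (1 + hits) / (1 + n_sim)


# Hypothetical data: two well-separated groups of ten points each.
data = [0.1 * i for i in range(10)] + [10 + 0.1 * i for i in range(10)]
ci, p = monte_carlo_p_value(data)
print(f"cluster index = {ci:.4f}, Monte Carlo p-value = {p:.4f}")
```

A small cluster index means the two-cluster split explains most of the variation, and a small p-value means a single Gaussian rarely produces a split that clean.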