Past Seminar Series - McGill Statistics Seminars

- Oct 20, 2023
Neural network architectures for functional data analysis

Cédric Beaulac · Oct 20, 2023
Date: 2023-10-20

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/89761165882

Meeting ID: 897 6116 5882

Passcode: None

Abstract:

Functional data is defined as any random variables that assume values in an infinite precision domain, such as time or space. In applications, this data is usually discretely observed at some regularly or irregularly-spaced points over the domain. In this talk, we discuss ways to adapt modern neural network architectures for the analysis of functional data. To do so, we design new neural network layers in order to process functional data either as input, output or both. First, we propose the functional output layer, which can be used to solve a multitude of function-on-scalar regression problems in a non-linear way. The proposed layer provides a smooth representation of the output and we demonstrate how to regularize such a layer during the network training phase. Second, we propose a concept for functional weights that project functional data to a scalar representation, leading to a novel formulation for a functional input layer. We demonstrate how to combine both of these proposed functional layers to create a functional autoencoder. This model takes as input the data in the form it is usually collected, as discrete points over the domain, and can be used for feature extraction and functional data smoothing. We demonstrate the benefits of the proposed architectures with various experiments on simulated data and real data applications. We conclude with a brief discussion of ongoing work in the design of a functional convolution layer that bridges the gap between the discrete convolution operation and its continuous counterpart.

- Oct 13, 2023
Distances on and between complex networks

Pierre Miasnikof · Oct 13, 2023
Date: 2023-10-13

Time: 15:30-16:30 (Montreal time)

Location: Online, retransmitted in Burnside 1104

https://mcgill.zoom.us/j/83477865796

Meeting ID: 834 7786 5796

Passcode: None

Abstract:

Distance plays a pivotal role in statistics. Meanwhile, recent technologies and social networks have yielded large complex network data sets, which require customized statistical tools. From a mathematical viewpoint, these complex networks are graphs with non-trivial structures (in contrast to Erdős-Rényi graphs, for example). These networks are models of systemic phenomena and cases where individual-level analyses are insufficient. Such models are not only used in the study of social networks, but are also widely employed in neurology, biology, telecommunication and finance, among many areas of application. Unfortunately, however, distances on graphs are not clearly defined.

- Sep 29, 2023
Doubly robust inference under possibly misspecified marginal structural Cox model

Ronghui (Lily) Xu · Sep 29, 2023
Date: 2023-09-29

Time: 15:30-16:30 (Montreal time)

Location: Online, retransmitted in Burnside 1104

https://mcgill.zoom.us/j/82440807026

Meeting ID: 824 4080 7026

Passcode: None

Abstract:

Doubly robust estimation under the marginal structural Cox model has been a challenge until recently due to the non-collapsibility of the Cox regression model. This is because the causal hazard ratio estimand assumes that the marginal structural Cox model holds, while the doubly robust estimating function requires the specification of an additional model for the conditional distribution of the time-to-event given treatment and covariates, and the two models are unlikely to hold simultaneously. It recently became possible to resolve this issue with the understanding of rate double robustness and machine learning or nonparametric approaches, although technical details are still to be spelt out to ensure root-n inference for the estimand. We describe our work, which considers both the observational study setting and the presence of covariate-induced informative censoring. An added benefit of our approach is the interpretation of the estimand, when the assumed marginal structural Cox model does not hold, as a time-averaged treatment effect. This allows meaningful estimation of treatment effects for general two-group comparison without the Cox model, or under alternative models such as the semiparametric proportional odds or transformation models for the potential time-to-event outcomes.

- Sep 22, 2023
Detection of Multiple Influential Observations on Variable Selection for High-dimensional Data: New Perspective with an Application to Neurologic Signature of Physical Pain

Dongliang Zhang · Sep 22, 2023
Date: 2023-09-22

Time: 15:30-16:30 (Montreal time)

Location: In person, Burnside 1104

https://mcgill.zoom.us/j/89374813252

Meeting ID: 893 7481 3252

Passcode: None

Abstract:

Influential diagnosis is an integral part of data analysis, yet most existing methodological frameworks presume a deterministic submodel and are designed for low-dimensional data (i.e., the number of predictors $p$ is smaller than the sample size $n$). However, the stochastic selection of a submodel from high-dimensional data where $p$ exceeds $n$ has become ubiquitous. Thus, methods for identifying observations that could exert undue influence on the choice of a submodel can play an important role in this setting. To date, discussion of this topic has been limited, falling short in two domains: (1) constrained ability to detect multiple influential points, and (2) applicability only in restrictive settings. In this talk, building on a recently proposed measure, we introduce a generalized version accommodating different model selectors, the asymptotic property of which is subsequently examined for large $p$. $K$-means clustering is incorporated into our scheme to detect multiple influential points. A simulation study is then conducted to assess the performance of various diagnostic approaches. The proposed procedure further demonstrates its value in improving predictive power when analyzing thermal-stimulated pain based on fMRI data. In addition, the latest development revolving around this newly proposed measure is also presented. This work is conducted under the joint supervision of Professors Masoud Asgharian and Martin Lindquist.

- Sep 15, 2023
Three Myths About Causal Mediation

Naftali Weinberger · Sep 15, 2023
Date: 2023-09-15

Time: 15:30-16:30 (Montreal time)

Location: Burnside 1104

https://mcgill.zoom.us/j/86404798712

Meeting ID: 864 0479 8712

Passcode: None

Abstract:

Causal mediation techniques are a means for identifying the degree to which a cause influences its effect along particular causal paths. For example, in a model where a cause influences its effect both indirectly via a mediator and directly via factors not included in the model, mediation techniques enable one to measure both direct and indirect effects. Although mediation techniques are widely employed, they are often misunderstood. This is in part due to the long-term influence of Baron and Kenny’s (1986) treatment of mediation, which applies only to linear models without interaction, and which leads one to develop intuitions about direct and indirect effects that do not generalize to non-parametric causal models. In my talk, I identify and reject three persistent myths about mediation. I argue that such methods: 1. Should not be understood as decomposing the total effect into additive components corresponding to the contributions of the paths; 2. Are not a means for eliminating latent heterogeneity; and 3. Do not require one to appeal to causal concepts other than the counterfactual causal ones built into structural causal models. These points are crucial for understanding mediation effects in any context in which they are studied, and have particular applications for studies of fairness and discrimination, in which such effects play an increasingly central role (Plečko and Bareinboim, 2022).

- Aug 17, 2023
Empirical Bayes Control of the False Discovery Exceedance

Pallavi Basu · Aug 17, 2023
Date: 2023-08-17

Time: 15:30-16:30 (Montreal time)

Hybrid: In person / Zoom

Location: Burnside Hall 1104

https://mcgill.zoom.us/j/89623344755?pwd=S1E0QWVjSm8wRHdIYU5IZzllSXNjUT09

Meeting ID: 896 2334 4755

Passcode: 287381

Abstract:

In sparse large-scale testing problems where the false discovery proportion (FDP) is highly variable, the false discovery exceedance (FDX) provides a valuable alternative to the widely used false discovery rate (FDR). We develop an empirical Bayes approach to controlling the FDX. We show that for independent hypotheses from a two-group model and dependent hypotheses from a Gaussian model fulfilling the exchangeability condition, an oracle decision rule based on ranking and thresholding the local false discovery rate (lfdr) is optimal in the sense that the power is maximized subject to an FDX constraint. We propose a data-driven FDX procedure that emulates the oracle via carefully designed computational shortcuts. We investigate the empirical performance of the proposed method using simulations and illustrate the merits of FDX control through an application for identifying abnormal stock trading strategies.

- Aug 14, 2023
Residual-based estimation of parametric copulas under regression

Yue Zhao · Aug 14, 2023
Date: 2023-08-14

Time: 15:30-16:30 (Montreal time)

Hybrid: In person / Zoom

Location: Burnside Hall 1104

https://mcgill.zoom.us/j/83436686293?pwd=b0RmWmlXRXE3OWR6NlNIcWF5d0dJQT09

Meeting ID: 834 3668 6293

Passcode: 12345

Abstract:

We study a multivariate response regression model where each coordinate is described by a location-scale regression, and where the dependence structure of the “noise” terms in the regression is described by a parametric copula. Our goal is to estimate the associated Euclidean copula parameter given a sample of the response and the covariate. In the absence of the copula sample, the oracle ranks in the usual pseudo-likelihood estimation procedure are no longer computable. Instead, we base our estimation on the residual ranks calculated from some preliminary estimators of the regression functions. We show that the residual-based estimators are asymptotically equivalent to their oracle counterparts, even when the dimension of the covariate in the regression is moderately diverging. Partially to serve this objective, we also study the weighted convergence of the residual empirical processes.

- Mar 24, 2023
Confidence sets for Causal Discovery

Mladen Kolar · Mar 24, 2023
Date: 2023-03-24

Time: 15:30-16:30 (Montreal time)

On Zoom only

https://mcgill.zoom.us/j/83436686293?pwd=b0RmWmlXRXE3OWR6NlNIcWF5d0dJQT09

Meeting ID: 834 3668 6293

Passcode: 12345

Abstract:

Causal discovery procedures are popular methods for discovering causal structure across the physical, biological, and social sciences. However, most procedures for causal discovery only output a single estimated causal model or single equivalence class of models. We propose a procedure for quantifying uncertainty in causal discovery. Specifically, we consider linear structural equation models with non-Gaussian errors and propose a procedure which returns a confidence set of causal orderings which are not ruled out by the data. We show that asymptotically, the true causal ordering will be contained in the returned set with some user-specified probability.

- Mar 17, 2023
Excursions in Statistical History: Highlights

James Hanley · Mar 17, 2023
Date: 2023-03-17

Time: 15:30-16:30 (Montreal time)

Hybrid: In person / Zoom

Location: Burnside Hall 1104

https://mcgill.zoom.us/j/83436686293?pwd=b0RmWmlXRXE3OWR6NlNIcWF5d0dJQT09

Meeting ID: 834 3668 6293

Passcode: 12345

Abstract:

Over the last 20 years, the speaker has delved into the origins of ‘regression’; the development of the ‘t’ and ‘Poisson’ distributions; forerunners of the ‘hazard’ function; and the statistical design and conduct of US Selective Service lotteries from 1917 onwards. This talk will recount the stories, data and simulations behind some of these, and provide some modern-day re-enactments.

- Mar 10, 2023
Heteroskedastic Sparse PCA in High Dimensions

Zhao Ren · Mar 10, 2023
Date: 2023-03-10

Time: 15:30-16:30 (Montreal time)

Hybrid: In person / Zoom

Location: Burnside Hall 1104

https://mcgill.zoom.us/j/83436686293?pwd=b0RmWmlXRXE3OWR6NlNIcWF5d0dJQT09

Meeting ID: 834 3668 6293

Passcode: 12345

Abstract:

Principal component analysis (PCA) is one of the most commonly used techniques for dimension reduction and feature extraction. Although high-dimensional sparse PCA has been well studied, little is known about the case where the noise is heteroskedastic, which turns out to be ubiquitous in many scenarios, like biological sequencing data and information network data. We propose an iterative algorithm for sparse PCA in the presence of heteroskedastic noise, which alternately updates the estimates of the sparse eigenvectors using the power method with adaptive thresholding in one step, and imputes the diagonal values of the sample covariance matrix to reduce the estimation bias due to heteroskedasticity in the other step. Our procedure is computationally fast and provably optimal under the generalized spiked covariance model, assuming the leading eigenvectors are sparse. A comprehensive simulation study demonstrates its robustness and effectiveness in various settings.

