Past Seminar Series - McGill Statistics Seminars
  • Normalization effects on deep neural networks and deep learning for scientific problems

    Date: 2025-04-04

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/81100654212

    Meeting ID: 811 0065 4212

    Passcode: None

    Abstract:

    We study the effect of normalization on the layers of deep neural networks. A given layer $i$ with $N_{i}$ hidden units is normalized by $1/N_{i}^{\gamma_{i}}$ with $\gamma_{i}\in[1/2,1]$. We study how the choice of the $\gamma_{i}$ affects the statistical behavior of the neural network's output (such as its variance) as well as the test accuracy and generalization properties of the architecture. We find that, in terms of the variance of the network's output and of test accuracy, the best choice is to set the $\gamma_{i}$ equal to one, which is the mean-field scaling. This is particularly true for the outer layer: the network's behavior is more sensitive to the scaling of the outer layer than to the scaling of the inner layers. The mechanism for the mathematical analysis is an asymptotic expansion of the neural network's output. An important practical consequence of the analysis is that it provides a systematic and mathematically informed way to choose the learning-rate hyperparameters; such a choice guarantees that the network behaves in a statistically robust way as the number of hidden units $N_i$ grows. Time permitting, I will discuss applications of these ideas to the design of deep learning algorithms for scientific problems, including solving high-dimensional partial differential equations (PDEs), closure of PDE models, and reinforcement learning, with applications to financial engineering, turbulence, and more.
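The scaling effect described in the abstract can be illustrated numerically. The following sketch (not from the talk; the one-hidden-layer architecture and all parameter values are illustrative) compares the variance of the network output under the $\gamma = 1/2$ scaling and the mean-field scaling $\gamma = 1$:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 512          # hidden units
d = 10           # input dimension
trials = 2000    # independent random initializations

x = rng.standard_normal(d)
outs = {0.5: [], 1.0: []}
for _ in range(trials):
    W = rng.standard_normal((N, d))   # inner-layer weights
    c = rng.standard_normal(N)        # outer-layer weights
    h = np.tanh(W @ x / np.sqrt(d))   # hidden activations
    for gamma in (0.5, 1.0):
        outs[gamma].append(c @ h / N**gamma)  # output scaled by 1/N^gamma

var_half = np.var(outs[0.5])   # gamma = 1/2: O(1) variance at initialization
var_one = np.var(outs[1.0])    # gamma = 1 (mean-field): variance smaller by a factor N
print(var_half, var_one)
```

The output under mean-field scaling differs from the $\gamma = 1/2$ output only by the factor $N^{-1/2}$, so its variance at initialization is exactly $N$ times smaller, consistent with the statistically robust behavior described above.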

  • From the distribution of string counts in Bernoulli sequences to multivariate discrete models

    Date: 2025-03-28

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/85849766730

    Meeting ID: 858 4976 6730

    Passcode: None

    Abstract:

    I will provide a personal account of a sequence of problems that I have worked on over the years, beginning with string counts in Bernoulli sequences and moving on to multivariate discrete models. As a starting point, we consider independent Bernoulli trials with varying success probabilities, 1/k for the kth trial, form the sum of the products of consecutive pairs of occurrences, and address the problem of establishing that this sum follows a Poisson distribution with mean 1. We will explain how this finding connects to cycles in random permutations, records for continuous random variables, the Hoppe-Pólya urn, and the classical Montmort matching problem.
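The starting point lends itself to a quick Monte Carlo check (a numerical sketch, not part of the talk): simulate sequences of independent Bernoulli trials with success probabilities 1/k, count the consecutive-pair products, and compare with the Poisson(1) law.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000       # trials per sequence
reps = 5000    # simulated sequences

# I_k = 1 with probability 1/k, independently (k = 1..n)
p = 1.0 / np.arange(1, n + 1)
I = rng.random((reps, n)) < p

# S = sum over k of I_k * I_{k+1}, the count of consecutive-pair occurrences
S = (I[:, :-1] & I[:, 1:]).sum(axis=1)

print(S.mean())                     # close to 1 (truncation bias is 1/n)
print((S == 0).mean(), np.exp(-1))  # both close to e^{-1} ~ 0.3679
```

Since $E[I_k I_{k+1}] = 1/(k(k+1)) = 1/k - 1/(k+1)$, the mean telescopes to $1 - 1/n$, matching the Poisson mean of 1 in the limit.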

  • A computational framework for linear inverse problems via the maximum entropy on the mean method

    Date: 2025-03-14

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/88555780651

    Meeting ID: 885 5578 0651

    Passcode: None

    Abstract:

    We present a framework for solving linear inverse problems that is computationally tractable and comes with mathematical certificates. To this end, we interpret the ground truth of a linear inverse problem as a random vector with unknown distribution. We solve for a distribution that is close to a prior P (guessed or data-driven), as measured by the KL divergence, while also having an expectation that yields high fidelity to the data defining the problem. After reformulation, this yields a strictly convex, finite-dimensional optimization problem whose regularizer, the MEM functional, is paired in duality with the log-moment-generating function of the prior P. We exploit this computationally via Fenchel-Rockafellar duality. When no obvious guess for P is available, we use data to generate an empirical prior. Using techniques from variational analysis and stochastic optimization, we show that, and at what rate, the solutions of the empirical problems converge (as the sample size grows) to the solution of the problem with known prior.
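As a concrete illustration (a minimal sketch under simplifying assumptions, not the authors' implementation), consider the special case of a Gaussian prior P = N(0, I) and an exact data constraint A E[x] = b. The log-moment-generating function of P is then quadratic, so the Fenchel-Rockafellar dual reduces to a linear system:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 20
A = rng.standard_normal((m, n))   # forward operator (underdetermined)
x_true = rng.standard_normal(n)
b = A @ x_true                    # noiseless data

prior_mean = np.zeros(n)          # assumed Gaussian prior N(0, I)

# Dual of the MEM problem with an exact data constraint:
#   max_lam  <lam, b> - log E_P[ exp(<A^T lam, x>) ]
# For P = N(mean, I) the log-MGF is <A^T lam, mean> + ||A^T lam||^2 / 2,
# so stationarity gives the linear system  A A^T lam = b - A mean.
lam = np.linalg.solve(A @ A.T, b - A @ prior_mean)
x_mem = prior_mean + A.T @ lam    # mean of the optimal distribution

print(np.allclose(A @ x_mem, b))  # the data constraint holds exactly
```

With a zero-mean Gaussian prior this recovers the minimum-norm solution, which shows how the choice of prior shapes the MEM reconstruction.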

  • How can mathematics contribute to AI?

    Date: 2025-02-28

    Time: 15:30-16:30 (Montreal time)

    Location: Online, retransmitted in Burnside 1104

    https://mcgill.zoom.us/j/89838224036

    Meeting ID: 898 3822 4036

    Passcode: None

    Abstract:

    Artificial intelligence is arguably the hottest topic in science. Computer science and engineering currently set the agenda in this field, sidelining mathematics to a large extent. This talk, however, will highlight that mathematics has a lot to offer. We will introduce mathematical guarantees that provide deep insights into the inner workings of AI, and we will show how statistical principles can make AI more efficient. More generally, we will discuss the role of mathematics, especially statistics, in AI and data science.

  • The empirical copula process on classes of non-rectangular sets

    Date: 2025-02-07

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/81032144286

    Meeting ID: 810 3214 4286

    Passcode: None

    Abstract:

    The copula of a random vector with unknown marginals can be estimated non-parametrically by the empirical copula, akin to the empirical distribution. However, the asymptotic analysis of the empirical copula is considerably more involved than that of the empirical distribution because of the use of pseudo-observations, which are built from the marginal empirical distribution functions. In particular, it is still unknown whether the empirical copula evaluated at a non-rectangular set is asymptotically normally distributed. In this work, sufficient conditions under which this is the case are identified. The result is extended to a Donsker theorem for the empirical copula indexed by an infinite collection of non-rectangular sets. Some aspects of the proof involving geometric measure theory will be discussed. Based on ongoing joint work with Axel Bücher, Johan Segers and Stanislav Volgushev.
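For readers unfamiliar with the objects involved, the following sketch (illustrative, not from the talk) builds pseudo-observations by rank transforms and evaluates the resulting empirical measure both at a rectangle, which gives the empirical copula at a point, and at a non-rectangular (triangular) set:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
z = rng.standard_normal((n, 2))
# positively dependent pair with standard normal margins (correlation 0.8)
x = np.column_stack([z[:, 0], 0.8 * z[:, 0] + 0.6 * z[:, 1]])

# pseudo-observations: marginal ranks scaled to (0, 1]
ranks = np.argsort(np.argsort(x, axis=0), axis=0) + 1
pseudo = ranks / n

# empirical copula at u = (0.5, 0.5): proportion in the rectangle [0, u]
rect = np.mean(np.all(pseudo <= 0.5, axis=1))
# empirical measure of a non-rectangular set: the triangle {u1 + u2 <= 1}
tri = np.mean(pseudo.sum(axis=1) <= 1.0)

print(rect)  # > 0.25, reflecting the positive dependence
print(tri)   # about 0.5 here, by the radial symmetry of the Gaussian copula
```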

  • Multivariate Extremes Generator by Statistical Learning

    Date: 2025-01-31

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/88929152266

    Meeting ID: 889 2915 2266

    Passcode: None

    Abstract:

    Generating realistic extremes from an observational dataset is crucial when trying to estimate the risks associated with the occurrence of future extremes, possibly of greater magnitude than those already observed. Generative approaches from the machine learning community are not applicable to extreme samples without careful adaptation. On the other hand, asymptotic results from extreme value theory provide a theoretical framework for modeling multivariate extreme events, through the notion of multivariate regular variation. Bridging these two fields, this presentation details a variational autoencoder approach for sampling multivariate distributions with heavy tails, i.e., distributions likely to exhibit extremes of particularly large intensities.
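As a univariate toy illustration of the heavy-tailed setting (not the autoencoder approach from the talk), a regularly varying distribution can be sampled by inverse transform, and its tail index recovered with the Hill estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 2.5      # tail index: P(X > x) = x^{-alpha} (regular variation)
n = 100000

# inverse-transform sampling: X = U^{-1/alpha} is Pareto(alpha)
x = rng.random(n) ** (-1.0 / alpha)

# Hill estimator of 1/alpha from the k largest order statistics
k = 2000
xs = np.sort(x)[::-1]
hill = np.mean(np.log(xs[:k]) - np.log(xs[k]))
print(1.0 / hill)  # close to alpha = 2.5
```

Generative models trained on such samples must reproduce exactly this polynomial tail decay, which is why off-the-shelf machine-learning generators need the careful adaptation mentioned above.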

  • Tree Pólya Splitting distributions for multivariate count data

    Date: 2025-01-17

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/82903352833

    Meeting ID: 829 0335 2833

    Passcode: None

    Abstract:

    The analysis of multivariate count data is fundamental in various fields. An appropriate model must be flexible enough to induce correlation, yet simple enough for inference and interpretation. One such model is the Pólya splitting model, which randomly decomposes the sum of a discrete vector into its components. This simple approach offers several compelling properties. However, it imposes the constraint that the dependence structure be identical across all components. To overcome this limitation, we propose a generalization of this model called Tree Pólya splitting. In this new model, the splitting process is represented by a tree structure, allowing for more flexibility. In this seminar, we will define the Tree Pólya splitting model and explore various properties, including marginal distributions, factorial moments, and the dependence structure.
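To make the splitting mechanism concrete, here is a minimal simulation sketch (the tree, parameters, and function names are illustrative and not the paper's notation): each internal node splits its count among its children via a Dirichlet-multinomial (Pólya) draw, and the leaves collect the component counts.

```python
import numpy as np

rng = np.random.default_rng(0)

def polya_split(total, alpha):
    """Dirichlet-multinomial (Polya) split of `total` into len(alpha) parts."""
    p = rng.dirichlet(alpha)
    return rng.multinomial(total, p)

def tree_polya_split(total, node):
    """Recursive split along a tree.
    node is ('leaf', name) or ('split', alpha, children)."""
    if node[0] == 'leaf':
        return {node[1]: total}
    _, alpha, children = node
    parts = polya_split(total, alpha)
    out = {}
    for part, child in zip(parts, children):
        out.update(tree_polya_split(part, child))
    return out

# toy tree: first split A vs {B, C}, then split B vs C,
# so the (B, C) pair can have its own dependence parameters
tree = ('split', [2.0, 3.0],
        [('leaf', 'A'),
         ('split', [1.0, 1.0], [('leaf', 'B'), ('leaf', 'C')])])

counts = tree_polya_split(100, tree)
print(counts, sum(counts.values()))  # the components sum to the original total
```

Because each internal node carries its own Pólya parameters, subgroups of components can exhibit different dependence, which is exactly the flexibility the tree structure adds over a single global split.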

  • Goodness-of-Fit Testing for the Wishart Distributions

    Date: 2024-12-13

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/81501161882

    Meeting ID: 815 0116 1882

    Passcode: None

    Abstract:

    The problem of testing that a random sample is drawn from a specific probability distribution is an old one, the most famous example perhaps being the problem of testing that a sequence of playing cards was drawn from a fairly shuffled deck. In recent years, random data consisting of positive definite (symmetric) matrices have appeared in areas of applied research such as factor analysis, diffusion tensor imaging, wireless communication systems, synthetic aperture radar, and models of financial volatility. Given a random sample of positive definite matrices, we develop a goodness-of-fit test for the Wishart distributions. We derive the asymptotic distribution of the test statistic in terms of a certain Gaussian random field, and we obtain an explicit formula for the corresponding covariance operator. The eigenfunctions of the covariance operator are determined explicitly, and the eigenvalues are shown to satisfy certain interlacing properties. As an application, we carry out a test that a financial data set has a Wishart distribution and, finally, we describe some recent research and open problems on related goodness-of-fit tests.
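As a reminder of the data objects involved (an illustrative sketch, not the goodness-of-fit test from the talk), a Wishart matrix with $df$ degrees of freedom and scale $\Sigma = I$ can be sampled as $GG^\top$ for a Gaussian matrix $G$, producing random positive definite matrices with known mean $df \cdot \Sigma$:

```python
import numpy as np

rng = np.random.default_rng(0)
p, df, reps = 3, 10, 5000
# sample W ~ Wishart_p(df, I) as W = G G^T, G a p x df standard normal matrix
samples = np.empty((reps, p, p))
for i in range(reps):
    G = rng.standard_normal((p, df))
    samples[i] = G @ G.T

mean_W = samples.mean(axis=0)
print(np.round(mean_W, 2))   # close to E[W] = df * Sigma = 10 * I
```

A goodness-of-fit test of the kind described above takes such a sample of positive definite matrices as input and asks whether the Wishart model is consistent with it.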

  • Conditional nonparametric variable screening via neural network factor regression

    Date: 2024-12-06

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/83785721810

    Meeting ID: 837 8572 1810

    Passcode: None

    Abstract:

    High-dimensional covariates often admit a linear factor structure. To effectively screen correlated covariates in high dimensions, we propose a conditional variable-screening test based on nonparametric regression with neural networks, chosen for their representation power. We ask whether individual covariates contribute additionally given the latent factors. Our test statistics are based on the estimated partial derivative of the regression function with respect to the candidate variable, using an observable proxy for the latent factors. The test therefore reveals how much a predictor contributes to the nonparametric regression beyond what is accounted for by the latent factors. Our derivative estimator is the convolution of a deep neural network regression estimator with a smoothing kernel. We show that when the network size diverges with the sample size, unlike when estimating the regression function itself, it is necessary to smooth the partial derivative of the neural network estimator to recover the desired convergence rate for the derivative. Moreover, our screening test achieves asymptotic normality under the null after a fine centering of the test statistics that makes the biases negligible, as well as consistency against local alternatives under mild conditions. We demonstrate the performance of the test in a simulation study and a real-world application.
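The need to smooth the derivative can be illustrated in one dimension (a toy sketch; the Gaussian kernel, the bandwidth, and the piecewise-constant stand-in for a fitted network are all illustrative): convolving a rough regression estimate with the derivative of a kernel yields a stable derivative estimate where naive differencing of the rough fit would fail.

```python
import numpy as np

def smoothed_derivative(f_vals, grid, h):
    """Estimate f' by convolving f with the derivative of a Gaussian kernel:
    (f * K_h)'(x) = integral of f(t) * K_h'(x - t) dt, on a fixed grid."""
    dx = grid[1] - grid[0]
    out = np.empty_like(grid)
    for i, x in enumerate(grid):
        u = (x - grid) / h
        kprime = -u * np.exp(-u**2 / 2) / (np.sqrt(2 * np.pi) * h**2)
        out[i] = np.sum(f_vals * kprime) * dx
    return out

grid = np.linspace(0, 2 * np.pi, 400)
# a rough, piecewise-constant surrogate for a fitted regression function:
# its pointwise derivative is 0 or undefined, so differencing it is useless
f_hat = np.round(np.sin(grid) * 20) / 20
d_est = smoothed_derivative(f_hat, grid, h=0.15)

interior = (grid > 1) & (grid < 2 * np.pi - 1)  # avoid boundary truncation
err = np.max(np.abs(d_est[interior] - np.cos(grid[interior])))
print(err)  # small: smoothing recovers the derivative despite the jumps
```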

  • Asymptotic behavior of data driven empirical measures for testing multivariate regular variation

    Date: 2024-11-22

    Time: 15:30-16:30 (Montreal time)

    Location: In person, Burnside 1104

    https://mcgill.zoom.us/j/82125361063

    Meeting ID: 821 2536 1063

    Passcode: None

    Abstract:

    Nowadays, empirical processes are well-understood objects. One reason for their continued study is that, in many models, estimators can be written as images of empirical measures. In this work, the interest lies in local empirical measures built from a sub-sample of the data conditioned to lie in a certain region, itself depending on the data. We present a general framework that allows asymptotic results to be derived for these empirical measures. This approach is then specialized to the setting of extreme value theory. As an application, we detail an asymptotic result that yields a test procedure for multivariate regular variation.
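A minimal sketch of such a data-dependent empirical measure (illustrative, not the paper's construction): keep the sub-sample of points whose radius exceeds an empirical order statistic, a region that depends on the sample itself, and form the empirical measure of their angular components.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50000
# bivariate sample with independent standard Pareto (tail index 1) margins
x = 1.0 / rng.random((n, 2))

r = np.linalg.norm(x, axis=1)
k = 500
thresh = np.sort(r)[-k]        # data-driven threshold: the k-th largest radius
keep = r >= thresh             # the conditioning region depends on the data
angles = x[keep, 0] / r[keep]  # angular components of the extreme sub-sample

# for independent margins, the empirical angular (spectral) measure
# concentrates near the axes (angles close to 0 or 1)
frac = np.mean((angles < 0.2) | (angles > 0.8))
print(frac)  # most angular mass lies near the axes
```

Functionals of this local empirical measure, such as the angular mass near the axes above, are the kind of statistics whose asymptotics the framework covers and on which a test for multivariate regular variation can be built.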