Past Seminar Series - McGill Statistics Seminars
  • Repulsiveness for integration (not my social program)

    Date: 2019-10-11

    Time: 15:30-16:30

    Location: BURN 1205

    Abstract:

    Integral estimation in any dimension is an extensive topic, widely treated in the literature, with a broad range of applications. Monte Carlo-type methods arise naturally when one seeks to quantify or control the error. Many methods have already been developed: MCMC, Poisson disk sampling, QMC (and randomized versions), Bayesian quadrature, etc. In this talk, I’ll consider a different approach, which consists in defining the quadrature nodes as the realization of a spatial point process. In particular, I’ll show that a very specific class of determinantal point processes, a class of repulsive point patterns, has excellent properties and can efficiently estimate integrals of non-differentiable functions with an explicit and faster rate of convergence than current methods.

  • Tales of tails, tiles and ties in dependence modeling

    Date: 2019-10-04

    Time: 16:00-17:00

    Location: CRM, UdeM, Pav. André-Aisenstadt, 2920, ch. de la Tour, salle 1355

    Abstract:

    Modeling dependence between random variables is omnipresent in statistics. When rare events with high impact are involved, such as severe storms, floods or heat waves, the issue is both of great importance for risk management and theoretically challenging. Combining extreme-value theory with copula modeling and rank-based inference yields a particularly flexible and promising approach to this problem. I will present three recent advances in this area. The first will tackle the question of how to account for dependence between rare events in the medium regime, in which asymptotic extreme-value models are not suitable. The second will explore what can be done when a large number of variables is involved and how a hierarchical model structure can be learned from large-scale rank correlation matrices. Finally, I won’t resist giving you a glimpse of the notoriously intricate world of rank-based inference for discrete or mixed data.

  • Regression Models for Spatial Images

    Date: 2019-09-27

    Time: 15:30-16:30

    Location: McIntyre Medical Building, Room 521

    Abstract:

    This work is motivated by a problem in describing forest nitrogen cycling, and a consequent goal of constructing regression models for spatial images. Specifically, I present a functional linear concurrent model (FLCM) with varying coefficients for two-dimensional spatial images. To address overparameterization issues, the parameter surfaces in this model are transformed into the wavelet domain, and sparse representations are then found using two different methods: LASSO and Bayesian variable selection. I will briefly discuss extensions to address missing data problems for colocated spatial images and the modeling of tree species in landscape ecology. In addition, I will discuss the use of the sextant in marine navigation.

  • Deep Representation Learning using Discrete Domain Symmetries

    Date: 2019-09-20

    Time: 15:30-16:30

    Location: BURN 1205

    Abstract:

    Symmetry has played a significant role in modern physics, in part by constraining the physical laws. I will discuss how it could play a fundamental role in AI by constraining the deep model design. In particular, I focus on discrete domain symmetries and through examples show how we can use this inductive bias as a principled means for constraining a feedforward layer and significantly improving its sample efficiency.

  • Integrative computational approach in genomics and healthcare

    Date: 2019-09-13

    Time: 15:30-16:30

    Location: BURN 1205

    Abstract:

    In the current era of multi-omics and digital healthcare, we are facing an unprecedented amount of data with tremendous opportunities to link molecular phenotypes with complex diseases. However, the lack of integrative statistical methods hinders system-level interrogation of relevant disease-related pathways and of the genetic implications for various healthcare outcomes.

    In this talk, I will present our current progress in mining genomics and healthcare data. In particular, I will cover two main topics: (1) a statistical approach to assess gene set enrichments using genetic and transcriptomic data; (2) a multimodal latent topic model for mining electronic healthcare and whole genome sequencing data from a small patient cohort.

  • MAPLE: Semiparametric Estimation and Variable Selection for Length-biased Data with Heavy Censoring

    Date: 2019-09-06

    Time: 15:30-16:30

    Location: BURN 1205

    Abstract:

    In this talk, we discuss two problems: semiparametric estimation and variable selection for length-biased data with heavy censoring. The common feature of the estimation procedures proposed in the literature is that they only put probability mass on failure times. Under length-biased sampling, however, censoring is informative, and failing to incorporate censored observations into estimation can lead to a substantial loss of efficiency. We propose two estimation procedures that compute the likelihood contribution of both uncensored and censored observations. For the variable selection problem, we introduce a unified penalized estimating function and use an optimization algorithm to solve it. We discuss the asymptotic properties of the resulting penalized estimators. The work is motivated by the International Stroke Trial dataset collected in Argentina, in which the survival times of about 88% of the 545 cases are censored.

  • Graph Representation Learning and Applications

    Date: 2019-04-26

    Time: 15:30-16:30

    Location: BURNSIDE 1205

    Abstract:

    Graphs, a general type of data structure for capturing interconnected objects, are ubiquitous in a variety of disciplines and domains, ranging from computational social science, recommender systems, medicine and bioinformatics to chemistry. Representative examples of real-world graphs include social networks, user-item networks, protein-protein interaction networks, and molecular structures. In this talk, I will introduce our work on learning effective representations of graphs, such as learning low-dimensional node representations of large graphs (e.g., social networks, protein-protein interaction graphs, and knowledge graphs) and learning representations of entire graphs (e.g., molecular structures).

  • Estimating Time-Varying Causal Excursion Effect in Mobile Health with Binary Outcomes

    Date: 2019-04-12

    Time: 15:30-16:30

    Location: BURNSIDE 1205

    Abstract:

    Advances in wearables and digital technology now make it possible to deliver behavioral mobile health interventions to individuals in their everyday lives. The micro-randomized trial (MRT) is increasingly used to provide data to inform the construction of these interventions. This work is motivated by multiple MRTs that have been conducted, or are currently in the field, in which the primary outcome is a longitudinal binary outcome. The first analysis in these trials, often called the primary analysis, is a marginal analysis that seeks to answer whether the data indicate that a particular intervention component has an effect on the longitudinal binary outcome. Under rather restrictive assumptions one can, based on existing literature, derive a semi-parametric, locally efficient estimator of the causal effect. In this talk, starting from this estimator, we develop multiple estimators that can be used as the basis of a primary analysis under more plausible assumptions. Simulation studies are conducted to compare the estimators. We illustrate the developed methods using data from the BariFit MRT, whose goal is to support weight maintenance for individuals who received bariatric surgery.

  • Bayesian Estimation of Individualized Treatment-Response Curves in Populations with Heterogeneous Treatment Effects

    Date: 2019-04-05

    Time: 15:30-16:30

    Location: BURNSIDE 1104

    Abstract:

    Estimating individual treatment effects is crucial for individualized or precision medicine. In reality, however, there is no way to obtain both the treated and untreated outcomes from the same person at the same time. An approximation can be obtained from randomized controlled trials (RCTs). RCTs have limitations, however: randomization is usually expensive, impractical or unethical, and even then the pre-specified variables may not fully incorporate all the relevant characteristics capturing individual heterogeneity in treatment response. In this work, we use non-experimental data; we model heterogeneous treatment effects in the studied population and provide a Bayesian estimator of the individual treatment response. More specifically, we develop a novel Bayesian nonparametric (BNP) method that leverages the G-computation formula to adjust for time-varying confounding in observational data and flexibly models sequential data to provide posterior inference over the treatment response at both the group level and the individual level. On a challenging dataset containing time series from patients admitted to the intensive care unit (ICU), our approach reveals that these patients have heterogeneous responses to the treatments used in managing kidney function. We also show that, on held-out data, the resulting predicted outcome in response to treatment (or no treatment) is more accurate than that of alternative approaches.

  • Introduction to Statistical Network Analysis

    Date: 2019-03-29

    Time: 13:00-16:30

    Location: McIntyre – Room 521

    Abstract:

    Classical statistics often makes assumptions about conditional independence in order to fit models, but in the modern world connectivity is key. Nowadays we need to account for many dependencies, and sometimes the associations and dependencies themselves are the key items of interest. For example: how do we predict conflict between countries? How can we use friendships between school children to choose the best groups for study tips/help? How does the pattern of needle-sharing among partners correlate with HIV transmission, and where can interventions best be made? Basically, in any type of study where we are interested in connections or associations between pairs of actors, be they people, companies, countries or anything else, we are looking at a network analysis. The methods falling under this area are collectively known as “Statistical Network Analysis” or sometimes “Social Network Analysis” (which can be a bit misleading, as we are not only talking about Facebook and the like). This workshop will give a general introduction to networks, their visualisation, summary measures and statistical models that can be used to analyse them. The practical component will be in R, and attendees will get the most benefit if they are able to bring a laptop along to work through examples.