McGill Statistics Seminar - McGill Statistics Seminars

- Mar 15, 2019
Hierarchical Bayesian Modelling for Wireless Cellular Networks

Deniz Ustebay · Mar 15, 2019
Date: 2019-03-15

Time: 15:30-16:30

Location: BURN 1205

Abstract:

With the recent advances in wireless technologies, base stations are becoming more sophisticated. Network operators are also able to collect more data to improve network performance and user experience. In this paper we concentrate on modeling the performance of wireless cells using a hierarchical Bayesian modeling framework. This framework provides a principled way to navigate the space between creating one model to represent all cells in a network and creating a separate model for each cell. The former option ignores the variation between cells (complete pooling), whereas the latter is overly noisy and ignores the patterns common to all cells (no pooling). The hierarchical Bayesian model strikes a trade-off between these two extremes and enables partial pooling of the data from all cells. This is done by estimating a parametric population distribution and assuming that each cell is a sample from this distribution. Because the model is fully Bayesian, it provides uncertainty intervals around each estimated parameter, which network operators can use when making network management decisions. We examine the performance of this method on a synthetic dataset and on a real dataset collected from a cellular network.
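As a rough illustration of partial pooling (not the speaker's model), the sketch below assumes a normal-normal hierarchy in which the within-cell variance `sigma2`, the population mean `mu`, and the population variance `tau2` are treated as known. All names and numbers are hypothetical; a fully Bayesian treatment would place priors on these quantities and report posterior intervals rather than point estimates.

```python
def partial_pool(cell_means, cell_sizes, sigma2, mu, tau2):
    """Posterior mean of each cell's parameter under a normal-normal model.

    Assumptions (hypothetical, for illustration only):
      sigma2 -- within-cell observation variance, taken as known
      mu, tau2 -- population mean and variance, taken as known
    """
    pooled = []
    for ybar, n in zip(cell_means, cell_sizes):
        data_precision = n / sigma2      # information from the cell's own data
        prior_precision = 1.0 / tau2     # information from the population
        weight = data_precision / (data_precision + prior_precision)
        # weight -> 1 recovers "no pooling"; weight -> 0 recovers "complete pooling"
        pooled.append(weight * ybar + (1.0 - weight) * mu)
    return pooled


# Two made-up cells: one with a single observation, one with a hundred.
cell_means = [10.0, 20.0]
cell_sizes = [1, 100]
estimates = partial_pool(cell_means, cell_sizes, sigma2=4.0, mu=15.0, tau2=4.0)
print(estimates)
```

The sparsely observed cell is pulled halfway toward the population mean, while the well-observed cell stays close to its own average, which is the trade-off the abstract describes.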

- Mar 1, 2019
Statistical Inference for partially observed branching processes, with application to hematopoietic lineage tracking

Jason Xu · Mar 1, 2019
Date: 2019-03-01

Time: 15:30-16:30

Location: BURN 1104

Abstract:

The likelihood function is central to many statistical procedures, but poses challenges in classical and modern data settings. Motivated by cell lineage tracking experiments to study hematopoiesis (the process of blood cell production), we present recent methodology enabling likelihood-based inference for partially observed data arising from continuous-time branching processes. These computational advances allow principled procedures such as maximum likelihood estimation, posterior inference, and expectation-maximization (EM) algorithms in previously intractable data settings. We then discuss limitations and alternatives when data are very large or generated from a hidden process, and potential ways forward using ideas from sparse optimization.

- Feb 22, 2019
Uniform, nonparametric, non-asymptotic confidence sequences

Aaditya Ramdas · Feb 22, 2019
Date: 2019-02-22

Time: 15:30-16:30

Location: BURN 1205

Abstract:

A confidence sequence is a sequence of confidence intervals that is uniformly valid over an unbounded time horizon. In this paper, we develop non-asymptotic confidence sequences under nonparametric conditions that achieve arbitrary precision. Our technique draws a connection between the classical Cramér-Chernoff method, the law of the iterated logarithm (LIL), and the sequential probability ratio test (SPRT): our confidence sequences extend the first to produce time-uniform concentration bounds, provide tight non-asymptotic characterizations of the second, and generalize the third to nonparametric settings, including sub-Gaussian and Bernstein conditions, self-normalized processes, and matrix martingales. We strengthen and generalize existing constructions of finite-time iterated logarithm (“finite LIL”) bounds. We illustrate the generality of our proof techniques by deriving an empirical-Bernstein finite LIL bound as well as a novel upper LIL bound for the maximum eigenvalue of a sum of random matrices. Finally, we demonstrate the utility of our approach with applications to covariance matrix estimation and to estimation of the sample average treatment effect under the Neyman-Rubin potential outcomes model, for which we give a non-asymptotic, sequential estimation strategy that handles adaptive treatment mechanisms such as Efron’s biased coin design.

- Feb 15, 2019
Causal Inference with Unmeasured Confounding: an Instrumental Variable Approach

Linbo Wang · Feb 15, 2019
Date: 2019-02-15

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Causal inference is a challenging problem because causation cannot be established from observational data alone. Researchers typically rely on additional sources of information to infer causation from association. Such information may come from powerful designs such as randomization, or background knowledge such as information on all confounders. However, perfect designs or background knowledge required for establishing causality may not always be available in practice. In this talk, I use novel causal identification results to show that the instrumental variable approach can be used to combine the power of design and background knowledge to draw causal conclusions. I also introduce novel estimation tools to construct estimators that are robust, efficient and enjoy good finite sample properties. These methods will be discussed in the context of a randomized encouragement design for a flu vaccine.

- Feb 8, 2019
Patient-Specific Finite Element Analysis of Human Heart: Mathematical and Statistical Opportunities and Challenges

Alireza Heidari · Feb 8, 2019
Date: 2019-02-08

Time: 15:30-16:30

Location: BURN 1104

Abstract:

Cardiovascular diseases (CVD) are the leading cause of death globally and rank second in Canada, costing the Canadian economy over $20 billion every year. Despite recent progress in CVD through prevention, lifestyle changes, and the use of biomedical treatments to improve survival rates and quality of life, computer-aided engineering (CAE) has not been well integrated into this field. Clinically, proposing cut-off values while taking patient-specific risk into consideration is of paramount importance for increasing the rate of survival and improving quality of life. Computational modeling has proved useful in determining parameters that cannot be assessed experimentally. The latest developments in computational modeling of the human heart are presented, and the constitutive equations, the key ingredient of these in-silico models of the human heart, are discussed. Finite element analysis of cardiac diseases provides a framework for generating synthetic data to develop statistical models when collecting real data requires invasive procedures. The idea of virtual personalized cardiology will be discussed.

- Jan 25, 2019
Modern Non-Problems in Optimization: Applications to Statistics and Machine Learning

Ying Cui · Jan 25, 2019
Date: 2019-01-25

Time: 16:00-17:00

Location: BURN 920

Abstract:

We have witnessed a lot of exciting development in data science in recent years. From the perspective of optimization, many modern data-science problems involve some basic “non”-properties that lack systematic treatment by current approaches, which set them aside for the sake of computational convenience. These non-properties include the coupling of non-convexity, non-differentiability, and non-determinism. In this talk, we present rigorous computational methods for solving two typical non-problems: piecewise linear regression and the feed-forward deep neural network. The algorithmic framework integrates the first-order non-convex majorization-minimization method with second-order non-smooth Newton methods. Numerical experiments demonstrate the effectiveness of our proposed approach. In contrast to existing methods for solving non-problems, which provide at best very weak guarantees on the solutions computed in practical implementations, our rigorous mathematical treatment aims to understand the properties of these computed solutions with reference to both empirical and population risk minimization.

- Jan 18, 2019
Singularities of the information matrix and longitudinal data with change points

Masoud Asgharian · Jan 18, 2019
Date: 2019-01-18

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Non-singularity of the information matrix plays a key role in model identification and the asymptotic theory of statistics. For many statistical models, however, this condition seems virtually impossible to verify. An example of such models is a class of mixture models associated with multi-path change-point problems (MCP) which can model longitudinal data with change points. The MCP models are similar in nature to mixture-of-experts models in machine learning. The question then arises as to how often the non-singularity assumption of the information matrix fails to hold. We show that

Read More…
- Jan 11, 2019
Magic Cross-Validation Theory for Large-Margin Classification

Boxiang Wang · Jan 11, 2019
Date: 2019-01-11

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Cross-validation (CV) is perhaps the most widely used tool for tuning supervised machine learning algorithms to achieve a better generalization error rate. In this paper, we focus on leave-one-out cross-validation (LOOCV) for the support vector machine (SVM) and related algorithms. We first address two widespread misconceptions about LOOCV. We show that LOOCV, ten-fold, and five-fold CV are actually well matched in estimating the generalization error, and that LOOCV is not necessarily slower to compute than ten-fold and five-fold CV. We further present a magic CV theory with a surprisingly simple recipe that allows users to tune the SVM very efficiently. We then apply the magic CV theory to demonstrate a straightforward way to prove the Bayes risk consistency of the SVM. We have implemented our algorithms in a publicly available R package, magicsvm, which is much faster than state-of-the-art SVM solvers. We demonstrate our methods on extensive simulations and benchmark examples.

- Nov 23, 2018
p-values vs Bayes factors: Is there a compromise?

David Wolfson · Nov 23, 2018
Date: 2018-11-23

Time: 15:30-16:30

Location: BURN 1104

Abstract:

This is not a research talk. Rather, the goal is to address the topic of the talk title through a 2017 multi-authored paper published in Nature Human Behaviour. The Nature article proposes that the standard cut-off significance level of .05 should be replaced by a cut-off level of .005 when new discoveries are being claimed. The authors attribute the high proportion of irreproducible results in the literature that accompany claimed new discoveries, in part, to the low-bar cut-off of .05. Their fix is built around the Bayes factor. I will begin with a brief presentation of the difference between the frequentist and Bayesian approaches to statistical inference, and lead into p-values vs Bayes factors for hypothesis testing before discussing the Nature article itself. It is hoped that the talk will provoke thought about the way we do statistics.

- Nov 16, 2018
Estimation of the Median Residual Lifetime Function for Length-Biased Failure Time Data

James Hugh McVittie · Nov 16, 2018
Date: 2018-11-16

Time: 15:30-16:30

Location: BURN 1104

Abstract:

The median residual lifetime function is a statistical quantity which describes the future point in time at which the probability of current survival has dropped by 50%. In deriving an estimator for the median residual lifetime function for length-biased data, the added features of left-truncation and right-censoring must be taken into account.

In this talk, we give a brief description of length-biased failure time data and show that, by using a particular non-parametric estimator for the survival function, it is possible to derive the asymptotically most efficient non-parametric estimator for the median residual lifetime function. We give some details on the proof of the asymptotic results and examine the performance of the estimator using simulated data. We also apply the proposed estimator to the Canadian Study of Health and Aging data set to study the median residual lifetime function of patients with dementia.
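For readers unfamiliar with the quantities named above, the standard textbook definitions can be written as follows. The notation is an assumption introduced here, not taken from the abstract: T is the failure time, S its survival function, f its density, and \mu its mean.

```latex
% Assumed notation (not from the abstract): S(t) = P(T > t), \mu = E[T].
% Median residual lifetime at time t: the additional time after which
% the conditional survival probability has fallen to one half.
\[
  m(t) \;=\; \inf\Bigl\{ x \ge 0 : \frac{S(t+x)}{S(t)} \le \tfrac{1}{2} \Bigr\},
  \qquad S(t) > 0 .
\]
% Length-biased sampling observes lifetimes with density
\[
  g(x) \;=\; \frac{x\, f(x)}{\mu}, \qquad x > 0 .
\]
```

As a quick check of the first definition, an exponential lifetime with rate \lambda has m(t) = (\log 2)/\lambda for every t, reflecting its memoryless property.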

