McGill Statistics Seminar - McGill Statistics Seminars

- Feb 26, 2016
- post
Aggregation methods for portfolios of dependent risks with Archimedean copulas

Etienne Marceau · Feb 26, 2016
Date: 2016-02-26

Time: 15:30-16:30

Location: BURN 1205

Abstract:

In this talk, we will consider a portfolio of dependent risks represented by a vector of dependent random variables whose joint cumulative distribution function (CDF) is defined with an Archimedean copula. Archimedean copulas are very popular and their extensions, nested Archimedean copulas, are well suited for vectors of random vectors in high dimension. I will describe a simple approach which makes it possible to compute the CDF of the sum or a variety of other functions of those random variables. In particular, I will derive the CDF and the TVaR of the sum of those risks using the Frank copula, the Shifted Negative Binomial copula, and the Ali-Mikhail-Haq (AMH) copula. The computation of the contribution of each risk under the TVaR-based allocation rule will also be illustrated. Finally, the links between the Clayton copula, the Shifted Negative Binomial copula, and the AMH copula will be discussed.
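For readers unfamiliar with the copulas named above, here is a minimal sketch of the bivariate Ali-Mikhail-Haq (AMH) copula in its standard textbook form. It is not the aggregation method presented in the talk. The function name `amh_copula` and the parameter name `theta` are illustrative choices.

```python
def amh_copula(u: float, v: float, theta: float) -> float:
    """Bivariate Ali-Mikhail-Haq copula C(u, v) for theta in [-1, 1).

    Standard textbook form; the function name is an illustrative assumption.
    """
    if not -1.0 <= theta < 1.0:
        raise ValueError("theta must lie in [-1, 1)")
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        raise ValueError("u and v must lie in [0, 1]")
    return u * v / (1.0 - theta * (1.0 - u) * (1.0 - v))


if __name__ == "__main__":
    # theta = 0 gives the independence copula C(u, v) = u * v.
    print(amh_copula(0.5, 0.5, 0.0))
    # A positive theta gives a larger joint probability at the same point.
    print(amh_copula(0.5, 0.5, 0.5))
```

Setting one argument to 1 returns the other argument unchanged, which is the uniform-margin property every copula must satisfy.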

- Feb 19, 2016
- post
An introduction to statistical lattice models and observables

James McVittie · Feb 19, 2016
Date: 2016-02-19

Time: 15:30-16:30

Location: BURN 1205

Abstract:

The study of convergence of random walks to well-defined curves is founded in the fields of complex analysis, probability theory, physics and combinatorics. The foundations of this subject were motivated by physicists interested in the properties of one-dimensional models that represented some form of physical phenomenon. By taking physical models and generalizing them into abstract mathematical terms, macroscopic properties of the model could be determined from the microscopic level. By using model-specific objects known as observables, random walks on particular lattice structures can be proven to converge to continuous curves such as Brownian motion or Stochastic Loewner Evolution as the size of the lattice step approaches 0. This seminar will introduce the field of statistical lattice models, the types of observables that can be used to prove convergence, as well as a proof for the q-state Potts model showing that local non-commutative matrix observables do not exist. No prior physics knowledge is required for this seminar.

- Feb 5, 2016
- post
The Bayesian causal effect estimation algorithm

Denis Talbot · Feb 5, 2016
Date: 2016-02-05

Time: 15:30-16:30

Location: BURN 1214

Abstract:

Estimating causal exposure effects in observational studies ideally requires the analyst to have a vast knowledge of the domain of application. Investigators often bypass difficulties related to the identification and selection of confounders through the use of fully adjusted outcome regression models. However, since such models likely contain more covariates than required, the variance of the regression coefficient for exposure may be unnecessarily large. Instead of using a fully adjusted model, model selection can be attempted. Most classical statistical model selection approaches, such as Bayesian model averaging, do not readily address causal effect estimation. We present a new model averaged approach to causal inference, Bayesian causal effect estimation (BCEE), which is motivated by the graphical framework for causal inference. BCEE aims to unbiasedly estimate the causal effect of a continuous exposure on a continuous outcome while being more efficient than a fully adjusted approach.

- Jan 29, 2016
- post
Estimating high-dimensional networks with hubs with an application to microbiome data

Annaliza McGillivray · Jan 29, 2016
Date: 2016-01-29

Time: 15:30-16:30

Location: BURN 1205

Abstract:

In this talk, we investigate the problem of estimating high-dimensional networks in which there are a few highly connected “hub” nodes. Methods based on L1-regularization have been widely used for performing sparse selection in the graphical modelling context. However, the L1 penalty penalizes every edge equally and independently of the others, without taking into account any structural information. We introduce a new method for estimating undirected graphical models with hubs, called the hubs weighted graphical lasso (HWGL). This is a two-step procedure with a hub screening step, followed by network reconstruction in the second step using a weighted lasso approach that incorporates the inferred network topology. Empirically, we show that the HWGL outperforms competing methods and illustrate the methodology with an application to microbiome data.

- Jan 22, 2016
- post
Robust estimation in the presence of influential units in surveys

David Haziza · Jan 22, 2016
Date: 2016-01-22

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Influential units are those which make classical estimators (e.g., the Horvitz-Thompson estimator or calibration estimators) very unstable. The problem of influential units is particularly important in business surveys, which collect economic variables whose distributions are highly skewed (heavy right tail). In this talk, we will attempt to answer the following questions:

(1) What is an influential value in surveys? (2) How can the influence of a unit be measured? (3) How can the impact of influential units be reduced at the estimation stage?
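As a rough illustration of why such units matter, the sketch below computes the standard Horvitz-Thompson estimator of a population total, the sum of each sampled value divided by its inclusion probability. The data and the function name `horvitz_thompson_total` are invented for illustration and are not taken from the talk.

```python
from typing import Sequence


def horvitz_thompson_total(values: Sequence[float],
                           inclusion_probs: Sequence[float]) -> float:
    """Horvitz-Thompson estimate of a total: sum of y_i / pi_i over the sample."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    if any(not 0.0 < p <= 1.0 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(y / p for y, p in zip(values, inclusion_probs))


if __name__ == "__main__":
    # Hypothetical sample: three ordinary units and one large unit that was
    # sampled with a small inclusion probability (so it gets a weight of 16).
    ordinary_y, ordinary_pi = [10.0, 12.0, 11.0], [0.5, 0.5, 0.5]
    print(horvitz_thompson_total(ordinary_y, ordinary_pi))
    print(horvitz_thompson_total(ordinary_y + [500.0], ordinary_pi + [0.0625]))
```

In this made-up sample the single large, heavily weighted unit accounts for almost all of the estimated total, so the estimate swings widely depending on whether that unit happens to be selected.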

- Nov 13, 2015
- post
Prevalent cohort studies: Length-biased sampling with right censoring

Masoud Asgharian · Nov 13, 2015
Date: 2015-11-13

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Logistical or other constraints often preclude the possibility of conducting incident cohort studies. A feasible alternative in such cases is to conduct a cross-sectional prevalent cohort study for which we recruit prevalent cases, i.e., subjects who have already experienced the initiating event, say the onset of a disease. When the interest lies in estimating the lifespan between the initiating event and a terminating event, say death, such subjects may be followed prospectively until the terminating event or loss to follow-up, whichever happens first. It is well known that prevalent cases have, on average, longer lifespans. As such, they do not form a representative random sample from the target population; they comprise a biased sample. If the initiating events are generated from a stationary Poisson process, the so-called stationarity assumption, this bias is called length bias. I present the basics of nonparametric inference using length-biased right-censored failure time data. I’ll then discuss some recent progress and current challenges. Our study is mainly motivated by challenges and questions raised in analyzing survival data collected on patients with dementia as part of a nationwide study in Canada, called the Canadian Study of Health and Aging (CSHA). I’ll use these data throughout the talk to discuss and motivate our methodology and its applications.
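The length-bias effect can be seen in a small exact calculation. The sketch below assumes a hypothetical discrete lifetime distribution and the usual length-biased weighting, in which a lifetime t is sampled with probability proportional to t times its population probability. The function names and the numbers are illustrative and not taken from the CSHA data.

```python
from fractions import Fraction
from typing import Sequence


def population_mean(lifetimes: Sequence[int],
                    probs: Sequence[Fraction]) -> Fraction:
    """Mean lifetime in the target (incident) population."""
    return sum(t * p for t, p in zip(lifetimes, probs))


def length_biased_mean(lifetimes: Sequence[int],
                       probs: Sequence[Fraction]) -> Fraction:
    """Mean lifetime when each lifetime t is sampled proportionally to t * p(t)."""
    weights = [t * p for t, p in zip(lifetimes, probs)]
    total = sum(weights)  # equals the population mean
    return sum(t * w for t, w in zip(lifetimes, weights)) / total


if __name__ == "__main__":
    # Hypothetical population: lifetimes of 1, 2 or 10 years, equally likely.
    lifetimes = [1, 2, 10]
    probs = [Fraction(1, 3)] * 3
    print(population_mean(lifetimes, probs))     # 13/3
    print(length_biased_mean(lifetimes, probs))  # 105/13
```

The length-biased mean equals E[T^2] / E[T], which is never smaller than E[T]. This is the sense in which prevalent cases over-represent long lifespans.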

- Nov 6, 2015
- post
Bayesian analysis of non-identifiable models, with an example from epidemiology and biostatistics

Lawrence McCandless · Nov 6, 2015
Date: 2015-11-06

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Most regression models in biostatistics assume identifiability, which means that each point in the parameter space corresponds to a unique likelihood function for the observable data. Recently there has been interest in Bayesian inference for non-identifiable models, which can better represent uncertainty in some contexts. One example is in the field of epidemiology, where the investigator is concerned with bias due to unmeasured confounders (omitted variables). In this talk, I will illustrate Bayesian analysis of a non-identifiable model from epidemiology using government administrative data from British Columbia. I will show how to use the software STAN, which is new software developed by Andrew Gelman and others in the USA. STAN allows the careful study of posterior distributions in a vast collection of Bayesian models, including non-identifiable models for bias in epidemiology, which are poorly suited to conventional Gibbs sampling.

- Oct 23, 2015
- post
Robust mixture regression and outlier detection via penalized likelihood

Weixin Yao · Oct 23, 2015
Date: 2015-10-23

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Finite mixture regression models have been widely used for modeling mixed regression relationships arising from a clustered and thus heterogeneous population. The classical normal mixture model, despite its simplicity and wide applicability, may fail dramatically in the presence of severe outliers. We propose a robust mixture regression approach based on a sparse, case-specific, and scale-dependent mean-shift parameterization, for simultaneously conducting outlier detection and robust parameter estimation. A penalized likelihood approach is adopted to induce sparsity among the mean-shift parameters so that the outliers are distinguished from the good observations, and a thresholding-embedded Expectation-Maximization (EM) algorithm is developed to enable stable and efficient computation. The proposed penalized estimation approach is shown to have strong connections with other robust methods, including the trimmed likelihood and the M-estimation methods. Compared with several existing methods, the proposed methods show outstanding performance in numerical studies.

- Oct 16, 2015
- post
Estimating high-dimensional multi-layered networks through penalized maximum likelihood

George Michailidis · Oct 16, 2015
Date: 2015-10-16

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Gaussian graphical models represent a good tool for capturing interactions between nodes that represent the underlying random variables. However, in many applications in biology one is interested in modeling associations both between and within molecular compartments (e.g., interactions between genes and proteins/metabolites). To this end, inferring multi-layered network structures from high-dimensional data provides insight into the conditional relationships among nodes within layers, after adjusting for and quantifying the effects of nodes from other layers. We propose an integrated algorithmic approach for estimating multi-layered networks that incorporates a screening step for significant variables, an optimization algorithm for estimating the key model parameters and a stability selection step for selecting the most stable effects. The proposed methodology offers an efficient way of estimating the edges within and across layers iteratively, by solving an optimization problem constructed based on penalized maximum likelihood (under a Gaussianity assumption). The optimization is solved on a reduced parameter space that is identified through screening, which remedies the instability in high dimensions. Theoretical properties are considered to ensure identifiability and consistent estimation of the parameters and convergence of the optimization algorithm, despite the lack of global convexity. The performance of the methodology is illustrated on synthetic data sets and on an application to gene and metabolic expression data for patients with renal disease.

- Oct 9, 2015
- post
Parameter estimation of partial differential equations over irregular domains

Michelle Carey · Oct 9, 2015
Date: 2015-10-09

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Spatio-temporal data are abundant in many scientific fields; examples include daily satellite images of the earth, hourly temperature readings from multiple weather stations, and the spread of an infectious disease over a particular region. In many instances the spatio-temporal data are accompanied by mathematical models expressed in terms of partial differential equations (PDEs). These PDEs determine the theoretical aspects of the behavior of the physical, chemical or biological phenomena considered. Azzimonti (2013) showed that including the associated PDE as a regularization term, as opposed to the conventional two-dimensional Laplacian, provides a considerable improvement in the estimation accuracy. The PDE parameters often have interesting interpretations, although they are typically unknown and must be inferred from expert knowledge of the phenomena considered. In this talk I will discuss extending the profiling with a parameter cascading procedure outlined in Ramsay et al. (2007) to incorporate PDE parameter estimation. I will also show how, following Sangalli et al. (2013), the estimation procedure can be extended to include finite-element methods (FEMs). This allows the proposed method to account for attributes of the geometry of the physical problem such as irregularly shaped domains, external and internal boundary features, as well as strong concavities. Thus this talk will introduce a methodology for data-driven estimates of the parameters of PDEs defined over irregular domains.

