Past Seminar Series - McGill Statistics Seminars

- Mar 9, 2012
- post
Jamshidian: Using tests of homoscedasticity to test missing completely at random | Chipman: Sequential optimization of a computer model and other Active Learning problems

Hugh Chipman and Mori Jamshidian · Mar 9, 2012
Date: 2012-03-09

Time: 14:00-16:30

Location: UQAM, 201 ave. du Président-Kennedy, salle 5115

Abstract:

Li: The problem of selecting the most useful features from a great many (e.g., thousands of) candidates arises in many areas of modern science. An interesting problem from genomic research is that, from thousands of genes that are active (expressed) in certain tissue cells, we want to find the genes that can be used to separate tissues of different classes (e.g., cancer and normal). In this paper, we report a Bayesian logistic regression method based on heavy-tailed priors with moderately small degrees of freedom (such as 1) and small scale (such as 0.01), using Gibbs sampling to do the computation. We show that it can distinctively separate a couple of useful features from a large number of useless ones, and discriminate many redundant correlated features. We also show that this method is very stable to the choice of scale. We apply our method to a microarray data set related to prostate cancer, and identify only 3 genes out of 6033 candidates that can separate cancer and normal tissues very well in leave-one-out cross-validation.

- Mar 2, 2012
- post
Estimating a variance-covariance surface for functional and longitudinal data

James O. Ramsay · Mar 2, 2012
Date: 2012-03-02

Time: 15:30-16:30

Location: BURN 1205

Abstract:

In functional data analysis, as in its multivariate counterpart, estimates of the bivariate covariance kernel σ(s,t) and its inverse are useful for many things, and we need the inverse of a covariance matrix or kernel especially often. However, the dimensionality of functional observations often exceeds the sample size available to estimate σ(s,t), and then the analogue S of the multivariate sample estimate is singular and non-invertible. Even when this is not the case, the high dimensionality of S often implies unacceptable sample variability and loss of degrees of freedom for model fitting. The common practice of employing low-dimensional principal component approximations to σ(s,t) to achieve invertibility also raises serious issues.

- Feb 17, 2012
- post
McGillivray: A penalized quasi-likelihood approach for estimating the number of states in a hidden Markov model | Best: Risk-set sampling and left truncation in survival analysis

Annaliza McGillivray and Ana Best · Feb 17, 2012
Date: 2012-02-17

Time: 15:30-16:30

Location: BURN 1205

Abstract:

McGillivray: In statistical applications of hidden Markov models (HMMs), one may have no knowledge of the number of hidden states (or order) of the model needed to accurately represent the underlying process of the data. The problem of estimating the number of hidden states of the HMM is thus brought to the forefront. In this talk, we present a penalized quasi-likelihood approach for order estimation in HMMs which makes use of the fact that the marginal distribution of the observations from an HMM is a finite mixture model. The method starts with an HMM with a large number of states and obtains a model of lower order by clustering and combining similar states of the model through two penalty functions. We assess the performance of the new method via extensive simulation studies for Normal and Poisson HMMs.
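
The finite-mixture fact mentioned in this abstract can be illustrated with a minimal sketch. It is not the speaker's method: it assumes a stationary two-state Poisson HMM, and the transition matrix, the rates, and all function names are hypothetical values chosen for illustration. Under stationarity, each observation has marginal pmf equal to a mixture of the state-dependent Poisson pmfs, weighted by the stationary distribution of the hidden chain.

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability mass function."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def stationary_two_state(p01, p10):
    """Stationary distribution of a two-state chain with
    off-diagonal transition probabilities p01 and p10."""
    total = p01 + p10
    return [p10 / total, p01 / total]

def step(dist, P):
    """Propagate a state distribution one step through transition matrix P."""
    return [sum(dist[i] * P[i][j] for i in range(len(dist)))
            for j in range(len(P[0]))]

def marginal_pmf(y, weights, rates):
    """Marginal pmf of one observation: a finite Poisson mixture."""
    return sum(w * poisson_pmf(y, lam) for w, lam in zip(weights, rates))

# Hypothetical parameters, chosen only for illustration.
P = [[0.9, 0.1],
     [0.2, 0.8]]
rates = [1.0, 5.0]

delta = stationary_two_state(P[0][1], P[1][0])
delta_next = step(delta, P)  # unchanged under stationarity
total_mass = sum(marginal_pmf(y, delta, rates) for y in range(60))
```

Because the state distribution is unchanged from one step to the next, every observation shares the same mixture marginal. The mixture weights do not depend on how the states are ordered in time.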

- Feb 10, 2012
- post
Stute: Principal component analysis of the Poisson Process | Blath: Longterm properties of the symbiotic branching model

Winfried Stute and Jochen Blath · Feb 10, 2012
Date: 2012-02-10

Time: 14:00-16:30

Location: Concordia

Abstract:

Stute: The Poisson Process constitutes a well-known model for describing random events over time. It has many applications in marketing research, insurance mathematics and finance. Though it has been studied for decades, not much is known about how to check (in a non-asymptotic way) the validity of the Poisson Process. In this talk we present the principal component decomposition of the Poisson Process, which enables us to derive finite sample properties of associated goodness-of-fit tests. In the first step we show that the Fourier transforms of the components contain Bessel and Struve functions. Inversion leads to densities which are modified arcsine distributions.

- Feb 3, 2012
- post
Du: Simultaneous fixed and random effects selection in finite mixtures of linear mixed-effects models | Harel: Measuring fatigue in systemic sclerosis: a comparison of the SF-36 vitality subscale and FACIT fatigue scale using item response theory

Yeting Du and Daphna Harel · Feb 3, 2012
Date: 2012-02-03

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Du: Linear mixed-effects (LME) models are frequently used for modeling longitudinal data. One complicating factor in the analysis of such data is that samples are sometimes obtained from a population with significant underlying heterogeneity, which would be hard to capture by a single LME model. Such problems may be addressed by a finite mixture of linear mixed-effects (FMLME) models, which segments the population into subpopulations and models each subpopulation by a distinct LME model. Often in the initial stage of a study, a large number of predictors are introduced. However, their associations to the response variable vary from one component to another of the FMLME model. To enhance predictability and to obtain a parsimonious model, it is of great practical interest to identify the important effects, both fixed and random, in the model. Traditional variable selection techniques such as stepwise deletion and subset selection are computationally expensive as the number of covariates and components in the mixture model increases. In this talk, we introduce a penalized likelihood approach and propose a nested EM algorithm for efficient numerical computations. Our estimators are shown to possess desirable properties such as consistency, sparsity and asymptotic normality. We illustrate the performance of our method through simulations and a systemic sclerosis data example.

- Jan 27, 2012
- post
Applying Kalman filtering to problems in causal inference

Sepideh Farsinezhad · Jan 27, 2012
Date: 2012-01-27

Time: 15:30-16:30

Location: BURN 1205

Abstract:

A common problem in observational studies is estimating the causal effect of a time-varying treatment in the presence of a time-varying confounder. When random assignment of subjects to comparison groups is not possible, time-varying confounders can cause bias in estimating causal effects even after standard regression adjustment if past treatment history is a predictor of future confounders. To eliminate the bias of standard methods for estimating the causal effect of time-varying treatment, Robins developed a number of innovative methods for discrete treatment levels, including G-computation, G-estimation, and marginal structural models (MSMs). However, there do not currently exist straightforward applications of G-estimation and MSMs for continuous treatment. In this talk, I will introduce an alternative to previous methods that utilizes the Kalman filter. The key advantage of the Kalman filter approach is that the model easily accommodates continuous levels of treatment.

- Jan 20, 2012
- post
A concave regularization technique for sparse mixture models

Martin Larsson · Jan 20, 2012
Date: 2012-01-20

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Latent variable mixture models are a powerful tool for exploring the structure in large datasets. A common challenge for interpreting such models is a desire to impose sparsity, the natural assumption that each data point contains only a few latent features. Since mixture distributions are constrained in their L1 norm, typical sparsity techniques based on L1 regularization become toothless, and concave regularization becomes necessary. Unfortunately, concave regularization typically results in EM algorithms that must perform problematic non-convex M-step optimization. In this work, we introduce a technique for circumventing this difficulty, using the so-called Mountain Pass Theorem to provide easily verifiable conditions under which the M-step is well-behaved despite the lack of convexity. We also develop a correspondence between logarithmic regularization and what we term the pseudo-Dirichlet distribution, a generalization of the ordinary Dirichlet distribution well-suited for inducing sparsity. We demonstrate our approach on a text corpus, inferring a sparse topic mixture model for 2,406 weblogs.
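
The point about L1 regularization can be seen in a small sketch. It is an illustration only, not the speaker's implementation: the penalty form `sum(log(eps + w))`, the constant `eps`, and the two weight vectors are assumptions made for the example. Any vector of mixture weights sums to one, so its L1 norm is the same whether it is dense or sparse, whereas a concave logarithmic penalty does tell the two apart.

```python
import math

def l1_norm(weights):
    """L1 norm of a weight vector."""
    return sum(abs(w) for w in weights)

def log_penalty(weights, eps=1e-3):
    """Concave logarithmic penalty; eps is a hypothetical smoothing
    constant that keeps log() finite at zero weights."""
    return sum(math.log(eps + w) for w in weights)

# Two hypothetical mixture-weight vectors on the probability simplex.
dense = [0.25, 0.25, 0.25, 0.25]
sparse = [0.97, 0.01, 0.01, 0.01]

l1_dense = l1_norm(dense)      # equals 1 up to rounding
l1_sparse = l1_norm(sparse)    # also equals 1 up to rounding
pen_dense = log_penalty(dense)
pen_sparse = log_penalty(sparse)
```

Minimizing the logarithmic penalty favors the sparse vector, since `pen_sparse` is smaller than `pen_dense`. The L1 norm cannot rank the two vectors at all.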

- Jan 13, 2012
- post
Bayesian approaches to evidence synthesis in clinical practice guideline development

Yulei He · Jan 13, 2012
Date: 2012-01-13

Time: 15:30-16:30

Location: Concordia, Library Building LB-921.04

Abstract:

The American College of Cardiology Foundation (ACCF) and the American Heart Association (AHA) have jointly engaged in the production of guidelines in the area of cardiovascular disease since 1980. The developed guidelines are intended to assist health care providers in clinical decision making by describing a range of generally acceptable approaches for the diagnosis, management, or prevention of specific diseases or conditions. This talk describes some of our work under a contract with ACCF/AHA for applying Bayesian methods to guideline recommendation development. In a demonstration example, we use Bayesian meta-analysis strategies to summarize evidence on the comparative effectiveness of percutaneous coronary intervention and coronary artery bypass grafting for patients with unprotected left main coronary artery disease. We show the usefulness and flexibility of Bayesian methods in handling data arising from studies with different designs (e.g., RCTs and observational studies), performing indirect comparison among treatments when studies with direct comparisons are unavailable, and accounting for historical data.

- Dec 9, 2011
- post
Detecting evolution in experimental ecology: Diagnostics for missing state variables

Giles Hooker · Dec 9, 2011
Date: 2011-12-09

Time: 15:30-16:30

Location: UQAM Salle 5115

Abstract:

This talk considers goodness of fit diagnostics for time-series data from processes approximately modeled by systems of nonlinear ordinary differential equations. In particular, we seek to determine three nested causes of lack of fit: (i) unmodeled stochastic forcing, (ii) mis-specified functional forms and (iii) mis-specified state variables. Testing lack of fit in differential equations is challenging since the model is expressed in terms of rates of change of the measured variables. Here, lack of fit is represented on the model scale via time-varying parameters. We develop tests for each of the three cases above through bootstrap and permutation methods.

- Dec 2, 2011
- post
Path-dependent estimation of a distribution under generalized censoring

Alberto Carabarin · Dec 2, 2011
Date: 2011-12-02

Time: 15:30-16:30

Location: BURN 1205

Abstract:

This talk focuses on the problem of estimating a distribution on an arbitrary complete separable metric space when the data points are subject to censoring by a general class of random sets. A path-dependent estimator for the distribution is proposed; among other properties, the estimator is sequential in the sense that it only uses data preceding any fixed point at which it is evaluated. If the censoring mechanism is totally ordered, the paths may be chosen in such a way that the estimate of the distribution defines a measure. In this case, we can prove a functional central limit theorem for the estimator when the underlying space is Euclidean. This is joint work with Gail Ivanoff (University of Ottawa).
