McGill Statistics Seminar - McGill Statistics Seminars

- Nov 15, 2013
Submodel selection and post estimation: Making sense or folly

Syed Ejaz Ahmed · Nov 15, 2013
Date: 2013-11-15

Time: 15:30-16:30

Location: BURN 1205

Abstract:

In this talk, we consider estimation in generalized linear models when there are many potential predictors and some of them may have no influence on the response of interest. In the context of two competing models, where one model includes all predictors and the other restricts variable coefficients to a candidate linear subspace based on subject matter or prior knowledge, we investigate the relative performance of Stein-type shrinkage, pretest, and penalty estimators (L1GLM, adaptive L1GLM, and SCAD) with respect to the full model estimator. The asymptotic properties of the pretest and shrinkage estimators, including the derivation of asymptotic distributional biases and risks, are established. A Monte Carlo simulation study shows that the mean squared error (MSE) of an adaptive shrinkage estimator is comparable to the MSE of the penalty estimators in many situations and that, in particular, the adaptive shrinkage estimator performs better than the penalty estimators when the model is sparse. A real data set analysis is also presented to compare the suggested methods.

- Nov 8, 2013
The inadequacy of the summed score (and how you can fix it!)

Daphna Harel · Nov 8, 2013
Date: 2013-11-08

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Health researchers often use patient and physician questionnaires to assess certain aspects of health status. Item Response Theory (IRT) provides a set of tools for examining the properties of the instrument and for estimating the latent trait for each individual. In my research, I critically examine the usefulness of the summed score over items and of a weighted summed score (using weights computed from the IRT model) as alternatives to both the empirical Bayes estimator and the maximum likelihood estimator for the Generalized Partial Credit Model. First, I will talk about two useful theoretical properties of the weighted summed score that I have proven as part of my work. Then I will relate the weighted summed score to other commonly used estimators of the latent trait. I will demonstrate the importance of these results in the context of both simulated and real data on the Center for Epidemiologic Studies Depression Scale.

- Nov 1, 2013
Bayesian latent variable modelling of longitudinal family data for genetic pleiotropy studies

Radu Craiu · Nov 1, 2013
Date: 2013-11-01

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Motivated by genetic association studies of pleiotropy, we propose a Bayesian latent variable approach to jointly study multiple outcomes or phenotypes. The proposed method models both continuous and binary phenotypes, and it accounts for serial and familial correlations when longitudinal and pedigree data have been collected. We present a Bayesian estimation method for the model parameters and we discuss some of the model misspecification effects. Central to the analysis is a novel MCMC algorithm that builds upon hierarchical centering and parameter expansion techniques to efficiently sample the posterior distribution. We discuss phenotype and model selection, and we study the performance of two selection strategies based on Bayes factors and spike-and-slab priors.

- Oct 18, 2013
Whole genome 3D architecture of chromatin and regulation

Shili Lin · Oct 18, 2013
Date: 2013-10-18

Time: 15:30-16:30

Location: BURN 1205

Abstract:

The expression of a gene is usually controlled by the regulatory elements in its promoter region. However, it has long been hypothesized that, in complex genomes such as the human genome, a gene may be controlled by distant enhancers and repressors. A recent molecular technique, 3C (chromosome conformation capture), which uses formaldehyde cross-linking and locus-specific PCR, was able to detect physical contacts between distant genomic loci. Such communication is achieved through spatial organization (looping) of chromosomes to bring genes and their regulatory elements into close proximity. Several adaptations of the 3C assay for studying genome-wide spatial interactions, including Hi-C and ChIA-PET, have been developed. The availability of such data makes it possible to reconstruct the underlying three-dimensional spatial chromatin structure. In this talk, I will first describe a Bayesian statistical model for building spatial estrogen receptor regulation, focusing on reducing false positive interactions. A random effect model, PRAM, will then be presented to make inference on the locations of genomic loci in a 3D Euclidean space. Results from ChIA-PET and Hi-C data will be visualized to illustrate the regulation and spatial proximity of genomic loci that are far apart in their linear chromosomal locations.

- Oct 4, 2013
Some recent developments in likelihood-based small area estimation

Farhad Shokoohi · Oct 4, 2013
Date: 2013-10-04

Time: 15:30-16:30

Location: BURN 1205

Abstract:

Mixed models are commonly used for the analysis of data in small area estimation. In particular, small area estimation has been extensively studied under linear mixed models. However, in practice there are many situations in which we have counts or proportions in small area estimation; for example, a (monthly) dataset on the number of incidences in small areas. Recently, small area estimation under the linear mixed model with a penalized spline model for the fixed part of the model was studied. In this talk, small area estimation under generalized linear mixed models that combine time series and cross-sectional data, with the extension of these models to include penalized spline regression models, is proposed. A likelihood-based approach is used to predict small area parameters and also to provide prediction intervals. The performance of the proposed models and approach is evaluated through simulation studies and on real datasets.

- Sep 20, 2013
Tests of independence for sparse contingency tables and beyond

Orla A. Murphy · Sep 20, 2013
Date: 2013-09-20

Time: 15:30-16:30

Location: BURN 1205

Abstract:

In this talk, a new and consistent statistic is proposed to test whether two discrete random variables are independent. The test is based on a statistic of the Cramér–von Mises type constructed from the so-called empirical checkerboard copula. The test can be used even for sparse contingency tables or tables whose dimension changes with the sample size. Because the limiting distribution of the test statistic is not tractable, a valid bootstrap procedure for the computation of p-values will be discussed. The new statistic is compared in a power study with standard procedures for testing independence, such as Pearson's chi-squared, the likelihood ratio, and the Zelterman statistics. The new test turns out to be considerably more powerful than all its competitors in all scenarios considered.

- Sep 13, 2013
Bayesian nonparametric density estimation under length bias sampling

Theodoros Nicoleris · Sep 13, 2013
Date: 2013-09-13

Time: 15:30-16:30

Location: BURN 1205

Abstract:

A new density estimation method in a Bayesian nonparametric framework is presented for the case where the recorded data do not come directly from the distribution of interest, but from a length-biased version of it. From a Bayesian perspective, efforts to computationally evaluate posterior quantities conditionally on length-biased data were hindered by the inability to circumvent the problem of a normalizing constant. In this talk, a novel Bayesian nonparametric approach to the length bias sampling problem is presented which circumvents the issue of the normalizing constant. Numerical illustrations as well as a real data example are presented, and the estimator is compared against its frequentist counterpart, the kernel density estimator for indirect data. This is joint work with Spyridon J. Hatjispyros (University of the Aegean, Greece) and Stephen G. Walker (University of Texas at Austin, U.S.A.).

- Apr 5, 2013
Éric Marchand: On improved predictive density estimation with parametric constraints

Éric Marchand · Apr 5, 2013
Date: 2013-04-05

Time: 14:30-15:30

Location: BURN 1205

Abstract:

We consider the problem of predictive density estimation under Kullback-Leibler loss when the parameter space is restricted to a convex subset. The principal situation analyzed relates to the estimation of an unknown predictive p-variate normal density based on an observation generated by another p-variate normal density. The means of the densities are assumed to coincide, and the covariance matrices are known multiples of the identity matrix. We obtain sharp results concerning plug-in estimators, we show that the best unrestricted invariant predictive density estimator is dominated by the Bayes estimator associated with a uniform prior on the restricted parameter space, and we obtain minimax results for cases where the parameter space is (i) a cone, and (ii) a ball. A key feature, which we will describe, is a correspondence between the predictive density estimation problem and a collection of point estimation problems. Finally, if time permits, we describe recent work concerning (i) non-normal models, and (ii) analysis relative to other loss functions such as reverse Kullback-Leibler and integrated L2.

- Mar 15, 2013
Jiahua Chen: Quantile and quantile function estimations under density ratio model

Jiahua Chen · Mar 15, 2013
Date: 2013-03-15

Time: 14:30-15:30

Location: BURN 1205

Abstract:

Joint work with Yukun Liu (East China Normal University)

Population quantiles and their functions are important parameters in many applications. For example, the lower level quantiles often serve as crucial quality indices of forestry products and others. In the presence of several independent samples from populations satisfying a density ratio model, we investigate the properties of the empirical likelihood (EL) based inferences of quantiles and their functions. In this paper, we first establish the consistency and asymptotic normality of the estimators of parameters and cumulative distributions. The induced EL quantile estimators are then shown to admit a Bahadur representation. The results are used to construct asymptotically valid confidence intervals for functions of quantiles. In addition, we rigorously prove that the EL quantiles based on all samples are more efficient than the empirical quantiles, which can only utilize information from individual samples. A simulation study shows that the EL quantiles and their functions have superior performance both when the density ratio model assumption is satisfied and when it is mildly violated. An application example is used to demonstrate the new methods and potential cost savings.

- Mar 1, 2013
Natalia Stepanova: On asymptotic efficiency of some nonparametric tests for testing multivariate independence

Natalia Stepanova · Mar 1, 2013
Date: 2013-03-01

Time: 14:30-15:30

Location: BURN 1205

Abstract:

Some problems of statistics can be reduced to extremal problems of minimizing functionals of smooth functions defined on the cube $[0,1]^m$, $m\geq 2$. In this talk, we consider a class of extremal problems that is closely connected to the problem of testing multivariate independence. By solving the extremal problem, we provide a unified approach to establishing weak convergence for a wide class of empirical processes which emerge in connection with testing multivariate independence. The use of our result will also be illustrated by describing the domain of local asymptotic optimality of some nonparametric tests of independence.
