Date: 2014-10-17

Time: 15:30-16:30

Location: BURN 1205

Abstract:

In the recent past, electronic health records and distributed data networks have emerged as viable resources for medical and scientific research. As the use of confidential patient information from such sources becomes more common, maintaining patient privacy is of utmost importance. For a binary disease outcome of interest, we show that specimen pooling techniques can be applied to the analysis of large and/or distributed data while respecting patient privacy. I will review the pooled analysis for a binary outcome and then show how it can be used for distributed data. Aggregate-level data are passed from the nodes of the network to the analysis center and can be used directly with logistic regression to estimate the disease odds ratios associated with a set of categorical or continuous covariates. The pooling approach allows for consistent estimation of the parameters of a logistic regression model that can include confounders. Additionally, since the individual covariate values can be accessed within a network, effect modifiers can be accommodated and consistently estimated. Since pooling effectively reduces the size of the dataset by creating pools, or sets of individuals, the resulting dataset can be analyzed much more quickly than an original dataset that is too large for the available computing environment.

Speaker:

Paramita S. Chaudhuri was recently hired as an Assistant Professor in the Department of Epidemiology, Biostatistics and Occupational Health at McGill.