2021 Fall - McGill Statistics Seminars
  • Adventures with Partial Identifications in Studies of Marked Individuals

    Date: 2021-11-26 Time: 15:30-16:30 (Montreal time) Zoom Link Meeting ID: 939 8331 3215 Passcode: 096952 Abstract: Monitoring marked individuals is a common strategy in studies of wild animals (referred to as mark-recapture or capture-recapture experiments) and hard-to-track human populations (referred to as multi-list methods or multiple-systems estimation). A standard assumption of these techniques is that individuals can be identified uniquely and without error, but this can be violated in many ways.
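The classical starting point for the mark-recapture experiments this abstract mentions is the two-sample Lincoln-Petersen estimator of population size (here in Chapman's bias-corrected form). This is a generic sketch of that textbook baseline, not the speaker's method:

```python
def chapman_estimate(n1, n2, m2):
    """Chapman's bias-corrected Lincoln-Petersen estimator of population size.

    n1: individuals marked in the first capture occasion
    n2: individuals caught in the second occasion
    m2: marked individuals recaptured in the second occasion
    """
    return (n1 + 1) * (n2 + 1) / (m2 + 1) - 1

# Toy example: 100 marked, 120 caught later, 24 of them marked.
print(chapman_estimate(100, 120, 24))  # about 487.8
```

Identification errors (e.g., misread marks) bias m2, which is exactly the kind of violation the talk addresses.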
  • Prediction of Bundled Insurance Risks with Dependence-aware Prediction using Pair Copula Construction

    Date: 2021-11-19 Time: 15:30-16:30 (Montreal time) https://mcgill.zoom.us/j/83436686293?pwd=b0RmWmlXRXE3OWR6NlNIcWF5d0dJQT09 Meeting ID: 834 3668 6293 Passcode: 12345 Abstract: We propose a dependence-aware predictive modeling framework for multivariate risks stemming from an insurance contract with bundling features – an important type of policy increasingly offered by major insurance companies. The bundling feature naturally leads to longitudinal measurements of multiple insurance risks. We build a novel predictive model that actively exploits the dependence among the evolution of multivariate repeated risk measurements.
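Pair copula constructions build multivariate dependence out of bivariate copulas. A minimal illustration of the basic building block (not the paper's model) is sampling dependent uniforms from a single bivariate Gaussian copula, which can then be pushed through arbitrary marginal loss distributions:

```python
import math
import random

def gaussian_copula_pair(rho, n, seed=0):
    """Draw n dependent Uniform(0,1) pairs from a bivariate Gaussian copula."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + math.sqrt(1 - rho**2) * rng.gauss(0, 1)
        # Probability integral transform: standard normal CDF via erf.
        u = 0.5 * (1 + math.erf(z1 / math.sqrt(2)))
        v = 0.5 * (1 + math.erf(z2 / math.sqrt(2)))
        out.append((u, v))
    return out

pairs = gaussian_copula_pair(0.8, 5000)
# Each margin is uniform, but the pair co-moves strongly.
```

A vine (pair copula) construction chains such bivariate links, conditioning step by step, to model higher-dimensional bundled risks.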
  • Variational Bayes for high-dimensional linear regression with sparse priors

    Date: 2021-11-12 Time: 15:30-16:30 (Montreal time) https://mcgill.zoom.us/j/83436686293?pwd=b0RmWmlXRXE3OWR6NlNIcWF5d0dJQT09 Meeting ID: 834 3668 6293 Passcode: 12345 Abstract: A core problem in Bayesian statistics is approximating difficult-to-compute posterior distributions. In variational Bayes (VB), a method from machine learning, one approximates the posterior through optimization, which is typically faster than Markov chain Monte Carlo. We study a mean-field (i.e. factorizable) VB approximation to Bayesian model selection priors, including the popular spike-and-slab prior, in sparse high-dimensional linear regression.
  • Opinionated practices for teaching reproducibility: motivation, guided instruction and practice

    Date: 2021-10-29 Time: 15:30-16:30 (Montreal time) Zoom Link Meeting ID: 939 8331 3215 Passcode: 096952 Abstract: In the data science courses at the University of British Columbia, we define data science as the study, development and practice of reproducible and auditable processes to obtain insight from data. While reproducibility is core to our definition, most data science learners enter the field with other aspects of data science in mind, for example predictive modelling, which is often one of the most interesting topics to novices.
  • Model-assisted analyses of cluster-randomized experiments

    Date: 2021-10-22 Time: 15:30-16:30 (Montreal time) https://mcgill.zoom.us/j/83436686293?pwd=b0RmWmlXRXE3OWR6NlNIcWF5d0dJQT09 Meeting ID: 834 3668 6293 Passcode: 12345 Abstract: Cluster-randomized experiments are widely used due to their logistical convenience and policy relevance. To analyze them properly, we must address the fact that the treatment is assigned at the cluster level instead of the individual level. Standard analytic strategies are regressions based on individual data, cluster averages, and cluster totals, which differ when the cluster sizes vary.
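The three standard strategies this abstract lists can be made concrete with a small simulation: with unequal cluster sizes, the individual-level difference in means, the difference in cluster averages, and the difference in (mean) cluster totals are genuinely different estimands. A generic sketch, not the speakers' analysis:

```python
import numpy as np

rng = np.random.default_rng(0)

# Clusters of unequal size with treatment assigned at the cluster level.
sizes = [2, 3, 10, 2, 3, 10]
treat = [1, 1, 1, 0, 0, 0]
clusters = [rng.normal(loc=1.0 * t, scale=1.0, size=m) for m, t in zip(sizes, treat)]

def diff(stats, treat):
    stats, treat = np.asarray(stats, dtype=float), np.asarray(treat)
    return stats[treat == 1].mean() - stats[treat == 0].mean()

# 1) Individual-level difference in means (implicitly weights clusters by size).
y = np.concatenate(clusters)
t_ind = np.repeat(treat, sizes)
est_individual = y[t_ind == 1].mean() - y[t_ind == 0].mean()

# 2) Unweighted difference of cluster averages.
est_avg = diff([c.mean() for c in clusters], treat)

# 3) Difference of mean cluster totals (unscaled version).
est_tot = diff([c.sum() for c in clusters], treat)

print(est_individual, est_avg, est_tot)  # three distinct numbers
```

With equal cluster sizes the first two coincide; varying sizes is what drives them apart.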
  • Imbalanced learning using actuarial modified loss function in tree-based models

    Date: 2021-10-08 Time: 15:30-16:30 (Montreal time) https://mcgill.zoom.us/j/83436686293?pwd=b0RmWmlXRXE3OWR6NlNIcWF5d0dJQT09 Meeting ID: 834 3668 6293 Passcode: 12345 Abstract: Tree-based models have gained momentum in insurance claim loss modeling; however, the point mass at zero and the heavy tail of the insurance loss distribution pose a challenge to applying conventional methods directly to claim loss modeling. With a simple illustrative dataset, we first demonstrate how the traditional tree-based algorithm's splitting function fails to cope with a large proportion of data with zero responses.
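The failure mode the abstract describes can be reproduced on a toy dataset: with a mass at zero plus a heavy tail, the standard CART squared-error split criterion chases the single extreme loss instead of separating zero from nonzero claims. A sketch of that diagnostic (the illustrative dataset below is invented, not the speakers'):

```python
import numpy as np

# Toy claim data: many zero losses plus a heavy right tail.
x = np.arange(10.0)
y = np.array([0, 0, 0, 0, 0, 0, 0, 5, 8, 400.0])

def sse(v):
    """Within-node sum of squared errors around the node mean."""
    return ((v - v.mean()) ** 2).sum()

# CART-style squared-error split search over thresholds on x.
gains = {}
for cut in x[:-1] + 0.5:
    left, right = y[x <= cut], y[x > cut]
    gains[cut] = sse(y) - sse(left) - sse(right)

best = max(gains, key=gains.get)
print(best)  # 8.5: the split isolates the one extreme loss, not the zeros
```

The split at 6.5 (zeros vs. positive claims) has far lower gain than isolating the outlier, which motivates modified loss functions.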
  • The HulC: Hull based Confidence Regions

    Date: 2021-10-01 Time: 15:30-16:30 (Montreal time) https://mcgill.zoom.us/j/83436686293?pwd=b0RmWmlXRXE3OWR6NlNIcWF5d0dJQT09 Meeting ID: 834 3668 6293 Passcode: 12345 Abstract: We develop and analyze the HulC, an intuitive and general method for constructing confidence sets using the convex hull of estimates constructed from subsets of the data. Unlike classical methods which are based on estimating the (limiting) distribution of an estimator, the HulC is often simpler to use and effectively bypasses this step. In comparison to the bootstrap, the HulC requires fewer regularity conditions and succeeds in many examples where the bootstrap provably fails.
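In its simplest univariate form, the construction the abstract describes splits the data into B folds, estimates the parameter on each fold, and reports the hull (min, max) of those estimates; for a median-unbiased estimator the miscoverage is at most 2**(1 - B). A sketch under those assumptions, not the authors' reference implementation:

```python
import math
import numpy as np

def hulc_interval(data, estimator, alpha=0.05, seed=0):
    """Hull-based interval: split data into B folds and take the range
    of the fold estimates. Smallest B with 2**(1 - B) <= alpha is used."""
    B = math.ceil(1 + math.log2(1 / alpha))
    rng = np.random.default_rng(seed)
    folds = np.array_split(np.asarray(data)[rng.permutation(len(data))], B)
    ests = [estimator(f) for f in folds]
    return min(ests), max(ests)

rng = np.random.default_rng(1)
sample = rng.normal(loc=3.0, size=600)
lo, hi = hulc_interval(sample, np.mean)
print(lo, hi)  # an interval near the true mean 3.0
```

Note that no limiting distribution or standard error is estimated anywhere, which is the point of the method.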
  • Deep down, everyone wants to be causal

    Date: 2021-09-24 Time: 15:00-16:00 (Montreal time) https://mcgill.zoom.us/j/9791073141 Meeting ID: 979 107 3141
  • On the Minimal Error of Empirical Risk Minimization

    Date: 2021-09-17 Time: 15:30-16:30 (Montreal time) https://mcgill.zoom.us/j/83436686293?pwd=b0RmWmlXRXE3OWR6NlNIcWF5d0dJQT09 Meeting ID: 834 3668 6293 Passcode: 12345 Abstract: In recent years, highly expressive machine learning models, i.e. models that can express rich classes of functions, have become more and more commonly used due to their success in both regression and classification tasks; such models include deep neural nets, kernel machines, and more. From the point of view of classical statistical theory (minimax theory), rich models tend to have a higher minimax rate.
  • Weighted empirical processes

    Date: 2021-09-10 Time: 15:30-16:30 (Montreal time) https://mcgill.zoom.us/j/83436686293?pwd=b0RmWmlXRXE3OWR6NlNIcWF5d0dJQT09 Meeting ID: 834 3668 6293 Passcode: 12345 Abstract: Empirical processes concern the uniform behavior of averaged sums over a sample of observations where the sums are indexed by a class of functions. Classical empirical processes typically study the empirical distribution function over the real line, while more modern empirical processes study much more general indexing function classes (e.g., Vapnik-Chervonenkis class, smoothness class); typical results include moment bounds and deviation inequalities.
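The classical case the abstract mentions, the empirical distribution function over the real line, already illustrates the two typical results (a uniform deviation and a bound for it). A minimal sketch using the Kolmogorov-Smirnov statistic and the Dvoretzky-Kiefer-Wolfowitz inequality, not material from the talk:

```python
import math
import random

# sup_t |F_n(t) - F(t)| for Uniform(0,1) data: the empirical process
# indexed by the class of indicator functions {1(x <= t)}.
rng = random.Random(0)
n = 2000
xs = sorted(rng.random() for _ in range(n))

# For sorted data, the supremum is attained at the jump points of F_n.
sup_dev = max(max(abs((i + 1) / n - x), abs(i / n - x)) for i, x in enumerate(xs))

# DKW inequality: P(sup_dev > eps) <= 2 * exp(-2 * n * eps**2),
# giving a 95% deviation bound of about 0.030 at n = 2000.
eps = math.sqrt(math.log(2 / 0.05) / (2 * n))
print(sup_dev, eps)
```

Modern empirical process theory replaces the indicator class with much richer function classes, controlled through entropy or VC-type conditions.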