Convergence Guarantees for Adversarially Robust Classifiers
Rachel Morris · Oct 3, 2025
Date: 2025-10-03
Time: 15:30-16:30 (Montreal time)
Location: In person, Burnside 1104
Zoom: https://mcgill.zoom.us/j/82469112499
Meeting ID: 824 6911 2499
Passcode: None
Abstract:
Neural networks can be trained to classify images with high accuracy. However, researchers have discovered that well-targeted perturbations of an image can completely fool a trained classifier, even when the modified image is visually indistinguishable from the original. This has sparked many new approaches to classification that include an adversary in the training process: such an adversary can improve robustness and generalization properties at the cost of decreased accuracy and increased training time. In this presentation, I will explore the connection between a certain class of adversarial training problems and the Bayes classification problem for binary classification. In particular, robustness can be encouraged by adding a regularizing nonlocal perimeter term, providing a strong connection to classical studies of perimeter. Borrowing tools from geometric measure theory, I will show the Hausdorff convergence of adversarially robust classifiers to Bayes classifiers as the strength of the adversary decreases to 0. In this way, the theoretical results discussed in the presentation provide a rigorous comparison with the standard Bayes classification problem.