Abstract:
This talk considers discrete-to-continuum limits of variational problems that arise from machine learning. Given a-priori labelled data points $\{x_i\}_{i=1}^N$ with labels $\{y_i\}_{i=1}^N$ we want to consistently extend labels to the unlabelled data points $\{x_i\}_{i=N+1}^n$. In this talk the geometry is represented by the random geometric graph model with connection radius r(n). The framework is to consider objective functions which reward the regularity of the estimator function and impose or reward the agreement with the training data, more specifically we will consider discrete p-Laplacian regularization.
The talk concerns the asymptotic behaviour in the limit where the number of unlabelled points increases while the number of training points remains fixed. The results are to uncover a delicate interplay between the regularizing nature of the functionals considered and the nonlocality inherent to the graph constructions. I will give almost optimal ranges on the scaling of r(n) for the asymptotic consistency to hold. For standard approaches used thus far there is a restrictive upper bound on how quickly r(n) must converge to zero as n goes to infinity. I will present a new model which overcomes this restriction. It is as simple as the standard models, but converges as soon as $r(n)\to 0$.
This is joint work with Dejan Slepcev (CMU). |