Distribution-free inference for regression
With Rina Foygel Barber, University of Chicago
Distribution-free inference for regression: discrete, continuous, and in between
In data analysis problems where we are not able to rely on distributional assumptions, what types of inference guarantees can still be obtained? Many popular methods, such as holdout methods, cross-validation methods, and conformal prediction, are able to provide distribution-free guarantees for predictive inference, but the problem of providing inference for the underlying regression function (for example, inference on the conditional mean E[Y|X]) is more challenging. If X takes only a small number of possible values, then inference on E[Y|X] is trivial to achieve. At the other extreme, if the features X are continuously distributed, we show that any confidence interval for E[Y|X] must have non-vanishing width, even as sample size tends to infinity – this is true regardless of smoothness properties or other desirable features of the underlying distribution. In between these two extremes, we find several distinct regimes – in particular, it is possible for distribution-free confidence intervals to have vanishing width if and only if the effective support size of the distribution of X is smaller than the square of the sample size.
This work is joint with Yonghoon Lee.
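For readers unfamiliar with the distribution-free predictive guarantees mentioned in the abstract, the following is a minimal sketch of split conformal prediction, one standard method of that kind. It is an illustration only, not the speaker's method: the function name `split_conformal_interval`, the linear base model, and the simulated heavy-tailed data are all assumptions made for this example. The resulting interval is a prediction interval for a new response Y, not a confidence interval for E[Y|X], which is the harder problem the talk addresses.

```python
import numpy as np


def split_conformal_interval(x_train, y_train, x_cal, y_cal, x_new, alpha=0.1):
    """Split conformal prediction interval around a least-squares line.

    Hypothetical helper for illustration. The base model is fitted on the
    training split only; the calibration split supplies the residual quantile.
    """
    # Fit a deliberately simple (and here misspecified) base model.
    slope, intercept = np.polyfit(x_train, y_train, deg=1)

    def predict(x):
        return slope * x + intercept

    # Absolute residuals on held-out calibration data.
    residuals = np.sort(np.abs(y_cal - predict(x_cal)))
    n = len(residuals)

    # Finite-sample corrected rank: the ceil((1 - alpha)(n + 1))-th smallest residual.
    k = int(np.ceil((1 - alpha) * (n + 1)))
    q = np.inf if k > n else residuals[k - 1]

    center = predict(np.asarray(x_new))
    return center - q, center + q


# Simulated data (an assumption for this sketch): nonlinear mean, heavy-tailed noise.
rng = np.random.default_rng(0)


def sample(n):
    x = rng.uniform(-2, 2, size=n)
    y = np.sin(2 * x) + 0.5 * rng.standard_t(df=3, size=n)
    return x, y


x_train, y_train = sample(500)
x_cal, y_cal = sample(500)
x_test, y_test = sample(2000)

lower, upper = split_conformal_interval(
    x_train, y_train, x_cal, y_cal, x_test, alpha=0.1
)
coverage = np.mean((lower <= y_test) & (y_test <= upper))
print(f"empirical coverage on test points: {coverage:.3f}")
```

The marginal coverage guarantee of at least 1 - alpha relies only on the calibration and test points being exchangeable, which is why the misspecified linear fit does not break it; a poor base model simply yields wider intervals.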
- Speaker: Rina Foygel Barber, University of Chicago
- Friday 28 May 2021, 16:00–17:00
- Venue: https://maths-cam-ac-uk.zoom.us/j/95871364531?pwd=aFZaV0loSWt6QmRDbm5ONWNjTTBjZz09.
- Series: Statistics; organiser: Dr Sergio Bacallado.