publications
2024
- Posterior Uncertainty Quantification in Neural Networks using Data Augmentation. Luhuan Wu and Sinead A. Williamson. In International Conference on Artificial Intelligence and Statistics, 2024.
In this paper, we approach the problem of uncertainty quantification in deep learning through a predictive framework, which captures uncertainty in model parameters by specifying our assumptions about the predictive distribution of unseen future data. Under this view, we show that deep ensembling (Lakshminarayanan et al., 2017) is a fundamentally mis-specified model class, since it assumes that future data are supported on existing observations only – a situation rarely encountered in practice. To address this limitation, we propose MixupMP, a method that constructs a more realistic predictive distribution using popular data augmentation techniques. MixupMP operates as a drop-in replacement for deep ensembles, where each ensemble member is trained on a random simulation from this predictive distribution. Grounded in the recently-proposed framework of Martingale posteriors (Fong et al., 2023), MixupMP returns samples from an implicitly defined Bayesian posterior. Our empirical analysis showcases that MixupMP achieves superior predictive performance and uncertainty quantification on various image classification datasets, when compared with existing Bayesian and non-Bayesian approaches.
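As a rough illustration of the data-augmentation idea described above (a minimal sketch under my own assumptions, not the paper's MixupMP implementation), each ensemble member could be trained on its own mixup resample of the observed data; the function names and numpy-only setup are mine:

```python
# Illustrative sketch only: draw one mixup "predictive resample" of a dataset;
# in a MixupMP-style ensemble, each member would be trained on an independent resample.
import numpy as np

def mixup_resample(X, y_onehot, alpha=1.0, rng=None):
    """Return len(X) convex combinations of randomly paired examples and labels."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(X)
    i = rng.integers(0, n, size=n)                 # first member of each random pair
    j = rng.integers(0, n, size=n)                 # second member of each random pair
    lam = rng.beta(alpha, alpha, size=n)[:, None]  # Beta(alpha, alpha) mixing weights
    return lam * X[i] + (1 - lam) * X[j], lam * y_onehot[i] + (1 - lam) * y_onehot[j]

# Toy usage: five independent resamples, one per (hypothetical) ensemble member.
X = np.random.randn(100, 8)
y = np.eye(3)[np.random.randint(0, 3, size=100)]
resamples = [mixup_resample(X, y, alpha=0.4) for _ in range(5)]
```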
2023
- Practical and Asymptotically Exact Conditional Sampling in Diffusion Models. Luhuan Wu, Brian Trippe, Christian Naesseth, David Blei, and John Cunningham. In Conference on Neural Information Processing Systems, 2023.
Diffusion models have been successful on a range of conditional generation tasks including molecular design and text-to-image generation. However, these achievements have primarily depended on task-specific conditional training or error-prone heuristic approximations. Ideally, a conditional generation method should provide exact samples for a broad range of conditional distributions without requiring task-specific training. To this end, we introduce the Twisted Diffusion Sampler, or TDS. TDS is a sequential Monte Carlo (SMC) algorithm that targets the conditional distributions of diffusion models. The main idea is to use twisting, an SMC technique that enjoys good computational efficiency, to incorporate heuristic approximations without compromising asymptotic exactness. We first find in simulation and on MNIST image inpainting and class-conditional generation tasks that TDS provides a computational statistical trade-off, yielding more accurate approximations with many particles but with empirical improvements over heuristics with as few as two particles. We then turn to motif-scaffolding, a core task in protein design, using a TDS extension to Riemannian diffusion models. On benchmark test cases, TDS allows flexible conditioning criteria and often outperforms the state of the art.
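To make the SMC structure concrete, here is a generic twisted-SMC skeleton (a hedged sketch of the general technique, not the TDS code): the user-supplied `log_incr_weight` is where the model transition, proposal correction, and ratio of twisting functions would enter, and the toy usage conditions a Gaussian random walk on a final observation.

```python
# Hedged sketch of a generic sequential Monte Carlo loop with twisting-style
# incremental weights; this is not the paper's implementation.
import numpy as np

def twisted_smc(init, propose, log_incr_weight, T, K, seed=0):
    rng = np.random.default_rng(seed)
    x = init(K, rng)                              # K initial particles
    logw = log_incr_weight(0, None, x)            # initial weights, e.g. twist psi_0(x_0)
    for t in range(1, T + 1):
        p = np.exp(logw - logw.max()); p /= p.sum()
        idx = rng.choice(K, size=K, p=p)          # multinomial resampling
        x_prev = x[idx]
        x = propose(t, x_prev, rng)               # e.g. one guided reverse-diffusion step
        logw = log_incr_weight(t, x_prev, x)      # twist ratio times transition / proposal
    return x, logw

# Toy usage: a Gaussian random walk conditioned on a final observation y_obs.
# The exact look-ahead likelihood plays the role of the twisting function here.
T, K, y_obs = 10, 512, 3.0
init = lambda K, rng: rng.normal(0.0, 1.0, size=K)
propose = lambda t, x, rng: x + rng.normal(0.0, 1.0, size=x.shape)
log_psi = lambda t, x: -0.5 * (y_obs - x) ** 2 / (T - t + 1.0)
log_incr = lambda t, xp, x: log_psi(t, x) - (log_psi(t - 1, xp) if t > 0 else 0.0)
particles, logw = twisted_smc(init, propose, log_incr, T, K)
```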
2022
- Variational Nearest Neighbor Gaussian Process. Luhuan Wu, Geoff Pleiss, and John Cunningham. In International Conference on Machine Learning, 2022.
Variational approximations to Gaussian processes (GPs) typically use a small set of inducing points to form a low-rank approximation to the covariance matrix. In this work, we instead exploit a sparse approximation of the precision matrix. We propose the variational nearest neighbor Gaussian process (VNNGP), which introduces a prior that only retains correlations within the K nearest-neighboring observations, thereby inducing sparse precision structure. Using the variational framework, VNNGP’s objective can be factorized over both observations and inducing points, enabling stochastic optimization with a time complexity of O(K^3). Hence, we can arbitrarily scale the number of inducing points, even to the point of putting inducing points at every observed location. We compare VNNGP to other scalable GPs through various experiments, and demonstrate that VNNGP (1) can dramatically outperform low-rank methods, and (2) is less prone to overfitting than other nearest neighbor methods.
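The nearest-neighbor prior can be illustrated with a small Vecchia-style construction (a sketch under my own assumptions, using an RBF kernel and an arbitrary ordering; not the VNNGP code): each point conditions on at most K previously ordered neighbors, and each conditional costs one K-by-K solve, matching the O(K^3) complexity mentioned above.

```python
# Rough illustration only: nearest-neighbor conditioning sets that induce a
# sparse precision structure. Kernel and ordering choices are my assumptions.
import numpy as np

def rbf(a, b, lengthscale=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def nn_conditionals(X, K=5, jitter=1e-6):
    """For each point i, return (neighbor indices, regression weights, conditional
    variance), so the prior factorizes as prod_i N(f_i | b_i^T f_ne(i), d_i)."""
    out = []
    for i in range(len(X)):
        ne = (np.argsort(((X[:i] - X[i]) ** 2).sum(-1))[:K]
              if i > 0 else np.array([], dtype=int))
        K_nn = rbf(X[ne], X[ne]) + jitter * np.eye(len(ne))
        k_in = rbf(X[ne], X[i:i + 1])[:, 0]
        b = np.linalg.solve(K_nn, k_in) if len(ne) else np.zeros(0)   # one K x K solve
        d = rbf(X[i:i + 1], X[i:i + 1])[0, 0] - k_in @ b
        out.append((ne, b, d))
    return out

conds = nn_conditionals(np.random.randn(200, 2), K=5)
```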
2021
- Bias-free Scalable Gaussian Processes via Randomized Truncations. Andres Potapczynski, Luhuan Wu, Dan Biderman, Geoff Pleiss, and John P. Cunningham. In International Conference on Machine Learning, 2021.
Scalable Gaussian Process methods are computationally attractive, yet introduce modeling biases that require rigorous study. This paper analyzes two common techniques: early truncated conjugate gradients (CG) and random Fourier features (RFF). We find that both methods introduce a systematic bias on the learned hyperparameters: CG tends to underfit while RFF tends to overfit. We address these issues using randomized truncation estimators that eliminate bias in exchange for increased variance. In the case of RFF, we show that the bias-to-variance conversion is indeed a trade-off: the additional variance proves detrimental to optimization. However, in the case of CG, our unbiased learning procedure meaningfully outperforms its biased counterpart with minimal additional computation. Our code is available at https://github.com/cunningham-lab/RTGPS.
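The bias-removal idea can be sketched generically (my own toy version of a randomized-truncation estimator, not the RTGPS code): a long sum is truncated at a random point and the surviving terms are reweighted by their inverse survival probabilities, which keeps the estimator unbiased at the cost of extra variance.

```python
# Toy Russian-roulette / randomized-truncation estimator; illustrative only.
import numpy as np

def randomized_truncation_sum(term, n_terms, p_stop=0.2, rng=None):
    """Unbiased estimate of sum_{j < n_terms} term(j) using a random truncation point."""
    rng = np.random.default_rng() if rng is None else rng
    total, reach_prob = 0.0, 1.0
    for j in range(n_terms):
        total += term(j) / reach_prob       # reweight by P(this term is reached)
        if rng.random() < p_stop:           # truncate here with probability p_stop
            break
        reach_prob *= 1.0 - p_stop
    return total

# Sanity check: the average over many draws approaches the exact partial sum.
term = lambda j: 0.5 ** j
exact = sum(term(j) for j in range(30))
approx = np.mean([randomized_truncation_sum(term, 30) for _ in range(20000)])
```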
- Hierarchical Inducing Point Gaussian Process for Inter-domain Observations. Luhuan Wu, Andrew Miller, Lauren Anderson, Geoff Pleiss, David Blei, and John Cunningham. In International Conference on Artificial Intelligence and Statistics, 2021.
We examine the general problem of inter-domain Gaussian Processes (GPs): problems where the GP realization and the noisy observations of that realization lie on different domains. When the mapping between those domains is linear, such as integration or differentiation, inference is still closed form. However, many of the scaling and approximation techniques that our community has developed do not apply to this setting. In this work, we introduce the hierarchical inducing point GP (HIP-GP), a scalable inter-domain GP inference method that enables us to improve the approximation accuracy by increasing the number of inducing points to the millions. HIP-GP, which relies on inducing points with grid structure and a stationary kernel assumption, is suitable for low-dimensional problems. In developing HIP-GP, we introduce (1) a fast whitening strategy, and (2) a novel preconditioner for conjugate gradients which can be helpful in general GP settings.
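Since HIP-GP relies on conjugate gradients with a preconditioner, here is the standard textbook preconditioned CG loop (a generic sketch, not the paper's specific preconditioner) showing where a preconditioner enters a GP-style linear solve:

```python
# Standard preconditioned conjugate gradients; the preconditioner is left abstract
# (identity by default), since the paper's particular choice is not reproduced here.
import numpy as np

def pcg(matvec, b, precond=lambda r: r, tol=1e-8, max_iter=200):
    """Solve A x = b given only matrix-vector products with a symmetric positive-definite A."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Toy usage on a small random SPD system with a simple diagonal (Jacobi) preconditioner.
A = np.random.randn(50, 50); A = A @ A.T + 50 * np.eye(50)
b = np.random.randn(50)
x = pcg(lambda v: A @ v, b, precond=lambda r: r / np.diag(A))
```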
2020
- Inverse Articulated-body Dynamics from Video via Variational Sequential Monte Carlo. Dan Biderman, Christian A. Naesseth, Luhuan Wu, Taiga Abe, Alice C. Mosberger, Leslie J. Sibener, Rui Costa, James Murray, and John P. Cunningham. In NeurIPS Workshop on Differentiable Vision, Graphics, and Physics in Machine Learning, 2020.
Convolutional neural networks for pose estimation are continuously improving in identifying joints of moving agents from video. However, state-of-the-art algorithms offer no insight into the underlying mechanics of articulated limbs. "Seeing" the mechanics of movement is of major importance for fields like neuroscience, studying how the brain controls movement, and engineering, e.g., using vision to correct for errors in the action of a robotic manipulator. In the pipeline proposed here, we use a convolutional network to track joint positions, and embed these as the joints of a linked robotic manipulator. We develop a probabilistic physical model whose states specify second-order rigid-body dynamics and the torques applied to each actuator. Observations are generated by mapping the joint angles through the forward kinematics function to Cartesian coordinates. For nonlinear state estimation and parameter learning, we build on variational Sequential Monte Carlo (SMC), a differentiable variant of the classical SMC method leveraging variational inference. We extend this approach with a distributed nested SMC algorithm, which, at inference time, wraps multiple independent SMC samplers within an outer-level importance sampler. We extract mechanical quantities from simulated data and newly acquired videos of mice and humans, offering a novel tool for studying e.g. biological motor control.
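To make the observation model concrete, a planar two-link forward kinematics map (a minimal sketch of the kind of function described above; the 2D setting and link lengths are my simplifying assumptions) takes joint angles to Cartesian joint positions:

```python
# Illustrative only: forward kinematics of a planar two-link arm, mapping joint
# angles to the Cartesian positions that an observation model could compare to
# tracked keypoints.
import numpy as np

def forward_kinematics_2link(theta1, theta2, l1=1.0, l2=1.0):
    """Return the elbow and end-effector positions of a planar two-link arm."""
    elbow = np.array([l1 * np.cos(theta1), l1 * np.sin(theta1)])
    tip = elbow + np.array([l2 * np.cos(theta1 + theta2),
                            l2 * np.sin(theta1 + theta2)])
    return elbow, tip

elbow, tip = forward_kinematics_2link(np.pi / 4, -np.pi / 6)
```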
2019
- Smoothing Nonlinear Variational Objectives with Sequential Monte Carlo. Antonio Moretti, Zizhao Wang, Luhuan Wu, and Itsik Pe’er. In ICLR Workshop on Deep Generative Models for Highly Structured Data, 2019.
The task of recovering nonlinear dynamics and latent structure from a population recording is a challenging problem in statistical neuroscience, motivating the development of novel techniques in time series analysis. Recent work has focused on connections between Variational Inference and Sequential Monte Carlo for performing inference and parameter estimation on sequential data. Inspired by this work, we present a framework to develop Smoothed Variational Objectives (SVOs) that condition proposal distributions on the full time-ordered sequence of observations. SVO maintains both expressiveness and tractability by sharing parameters of the transition function between the proposal and target. We apply the method to several dimensionality reduction/expansion tasks and examine the dynamics learned with a quantitative metric. SVO performs favorably against the state of the art.
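A toy sketch of the smoothing idea (mine, not the paper's architecture): the proposal at each step sees a backward-encoded summary of all future observations and mixes it with the prediction of a transition function shared with the generative model.

```python
# Illustrative only: a proposal mean that conditions on the full sequence by
# combining a shared transition prediction with a backward summary of future data.
import numpy as np

def backward_summaries(ys, decay=0.7):
    """Exponentially weighted summaries of the remaining observations, one per step."""
    h = np.zeros_like(ys[0])
    out = [None] * len(ys)
    for t in range(len(ys) - 1, -1, -1):
        h = decay * h + (1.0 - decay) * ys[t]
        out[t] = h.copy()
    return out

def proposal_mean(transition, x_prev, summary, gate=0.5):
    """Shared transition gives the model's prediction; the summary pulls it toward the data."""
    return (1.0 - gate) * transition(x_prev) + gate * summary

# Toy usage with a linear transition and a short 1-D observation sequence.
ys = [np.array([0.1]), np.array([0.4]), np.array([1.2]), np.array([0.9])]
summaries = backward_summaries(ys)
x_next_mean = proposal_mean(lambda x: 0.9 * x, np.array([0.0]), summaries[1])
```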