Description
Data scarcity is a fundamental challenge in astrophysics and astroparticle physics, especially in an era increasingly shaped by data-hungry deep learning models. Many cosmic phenomena, including fast transients, high-redshift galaxies, and multi-messenger events such as binary neutron star mergers and black hole–neutron star collisions, occur so rarely that available observational datasets are insufficient to train robust models or adequately capture astrophysical diversity. To address these limitations, data augmentation techniques have been developed, including image transformations, perturbations of spectral and time-series data, and deep generative models. These approaches expand the effective sample size while remaining constrained by physical assumptions. In this talk, I examine the implications of data scarcity and augmentation for the empirical basis of science, asking which forms of empiricism and scientific realism can provide an adequate framework for reasoning under conditions of limited empirical evidence.