In next-generation cosmology and exoplanet experiments, the data volumes are large but the signals of greatest importance are tiny. Examples include the baryon acoustic feature in the two-point correlation function of intergalactic absorption systems and the discovery of low-mass exoplanets in the astrometry of millions of parent stars. In this regime (large data but small signals), utmost care is required to transfer the relevant information from the raw data into the parameters of interest without loss. Information transfer is maximized when inference is performed by modeling the data, where the "model" accurately generates the mean (expectation) and noise in the data and also includes well-informed prior probability distributions over nuisance parameters. (In general, large numbers of nuisance parameters are required to perform precise modeling, and there are technical challenges to marginalizing them out.) I will present some examples of success in these areas, including discoveries of high-redshift quasars, brown-dwarf stars, and exoplanets, and relate them to Gaia, LSST, and other large projects.
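
As a schematic illustration of what marginalizing out nuisance parameters involves, one can write the inference in the following generic form. The symbols are assumptions introduced here for illustration, not notation taken from the talk: $\theta$ stands for the parameters of interest, $\alpha$ for the nuisance parameters, and $D$ for the raw data.

\[
p(\theta \mid D) \;\propto\; p(\theta) \int p(D \mid \theta, \alpha)\, p(\alpha \mid \theta)\, \mathrm{d}\alpha
\]

In this sketch the likelihood $p(D \mid \theta, \alpha)$ carries the model for the mean and the noise in the data, and $p(\alpha \mid \theta)$ is the prior over the nuisance parameters. The technical challenge is that when $\alpha$ has many dimensions the integral rarely has a closed form and generally has to be approximated numerically.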