Exploiting the Intrinsic Correlations Between Galaxy Properties: Redshift and Beyond

PI: Newman, Jeffrey, University of Pittsburgh
Wide-Field Science – Regular

Roman cosmology and galaxy evolution studies will rely on photometric information to estimate redshifts (photo-z’s) and other galaxy properties. Roman galaxy evolution studies will depend critically on measuring distributions and scaling relations of key properties such as luminosity, stellar mass, and specific star formation rate; the same quantities can also be used to enhance cosmology studies (for instance, by separating populations with different intrinsic alignments or clustering for weak lensing/LSS studies, characterizing host properties for SNe Ia, or mapping foreground shear and magnification for strong lens systems and SNe). However, training sets of objects with well-measured properties are very limited; we thus need new methods that can take maximum advantage of limited training sets.

In previous WFS-funded work, we developed a new framework to overcome the limitations of magnitude/redshift/color biases, sparse sampling of color space, sample/cosmic variance, and incorrect redshift and parameter measurements. We construct optimal training sets by applying the Uniform Manifold Approximation and Projection (UMAP) algorithm to encode the optical-IR spectral energy distributions (SEDs) of observed galaxies into their intrinsic low-dimensional color space, and then robustly interpolating between objects in that space with high-quality measurements to predict redshifts for a complete sample of photometric objects. In contrast to the Self-Organizing Maps (SOMs) often used to map observed galaxy SEDs onto a discrete, rectangular 2-D grid for photo-z applications, UMAP provides a continuous, topologically flexible, and robust low-dimensional representation of optical-IR color space, trained using photometric galaxy samples. We have found that galaxies intrinsically occupy a roughly 3-D manifold. To map from location in UMAP space to redshift, we have applied robust regression methods that can interpolate between sparse samples and ignore outliers, with excellent results.
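
As a rough illustration of the interpolation step described above, the following Python sketch predicts a redshift at a query position as the median redshift of its k nearest training objects in an embedding space. It is a stand-in, not the robust regression method used in this work: the function name `knn_median_predict`, the toy coordinates and redshifts, and the choice of a nearest-neighbor median are all assumptions made for illustration. In practice the coordinates would come from a UMAP embedding of galaxy colors rather than being typed in by hand.

```python
import math
import statistics


def knn_median_predict(train_pos, train_z, query, k=5):
    """Median redshift of the k training objects nearest to `query`.

    Hypothetical helper: a simple outlier-tolerant interpolator used here
    only to illustrate the idea of robust interpolation in a low-D space.
    """
    # Rank training objects by Euclidean distance to the query position.
    order = sorted(range(len(train_pos)),
                   key=lambda i: math.dist(train_pos[i], query))
    # The median is insensitive to a minority of wrong redshifts.
    return statistics.median(train_z[i] for i in order[:k])


# Toy 3-D embedding positions (assumed; stand-ins for UMAP coordinates).
train_pos = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.0),
    (0.05, 0.05, 0.1),                      # object with a wrong redshift
    (2.0, 2.0, 2.0), (2.1, 2.0, 2.0), (2.0, 2.1, 2.0),
]
train_z = [0.30, 0.32, 0.31, 0.29, 3.0, 1.10, 1.12, 1.08]

z_low = knn_median_predict(train_pos, train_z, (0.05, 0.05, 0.0), k=5)
z_high = knn_median_predict(train_pos, train_z, (2.03, 2.03, 2.0), k=3)
print(z_low, z_high)
```

In this toy case the first query has one catastrophically wrong redshift (z = 3.0) among its five neighbors; the median returns 0.31, whereas a plain mean of the same neighbors would be pulled to about 0.84.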

However, low-z and high-z galaxies are sometimes found near each other in an SED-based UMAP manifold. Our first proposed project is to address this problem by testing whether incorporating morphological information can break degeneracies and more cleanly separate populations, thereby improving interpolation results. This could utilize either basic catalog-level measurements or autoencoder representations of galaxy morphology provided by deep learning algorithms.

Our second proposed project is to implement and test frameworks for predicting galaxy properties beyond redshift. Position in the UMAP space also carries information on these properties; e.g., specific star formation rate (sSFR) varies in a direction roughly orthogonal to redshift in the map. The same frameworks we have used to interpolate redshift can therefore be applied to predict other galaxy properties alongside z. The interpolation can be trained using limited samples with property measurements from spectroscopy and/or many-band data (e.g., JWST measurements), but can make predictions for arbitrary positions in a manifold defined by fewer bands, enabling us to take advantage of the wealth of information in Roman deep fields to produce better training sets even for wider-area surveys.

By leveraging our mapping of color space, we will produce new, re-balanced and augmented training sets for redshift and other parameters. The resulting training sets will be ideal inputs for community development of machine-learning-based parameter measurement algorithms, increasing the power of Roman for studying both cosmology and galaxy evolution.
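
One simple way to picture re-balancing a training set against a photometric sample is sketched below in Python. The proposal does not specify how re-balancing is done, so everything here is an assumption for illustration: the helper names `cell_of` and `rebalance_weights`, the cubic-cell binning of the embedding space, and the toy positions. Each training object receives a weight equal to the fraction of photometric objects in its cell divided by the fraction of training objects in that cell, so under-sampled regions are weighted up.

```python
from collections import Counter


def cell_of(pos, cell_size=1.0):
    """Index of the cubic cell containing `pos` (hypothetical binning)."""
    return tuple(int(c // cell_size) for c in pos)


def rebalance_weights(train_pos, photo_pos, cell_size=1.0):
    """Per-object weights that match training density to photometric density."""
    train_counts = Counter(cell_of(p, cell_size) for p in train_pos)
    photo_counts = Counter(cell_of(p, cell_size) for p in photo_pos)
    n_train, n_photo = len(train_pos), len(photo_pos)
    weights = []
    for p in train_pos:
        c = cell_of(p, cell_size)
        photo_frac = photo_counts[c] / n_photo   # zero if no photometric objects
        train_frac = train_counts[c] / n_train   # always > 0 for a training object
        weights.append(photo_frac / train_frac)
    return weights


# Toy positions (assumed): training objects over-sample the first cell.
train_pos = [(0.2, 0.1, 0.3), (0.4, 0.5, 0.6), (0.7, 0.8, 0.1), (1.5, 0.2, 0.2)]
photo_pos = [(0.3, 0.3, 0.3), (0.6, 0.2, 0.9), (1.1, 0.4, 0.4), (1.8, 0.9, 0.5)]

weights = rebalance_weights(train_pos, photo_pos)
print(weights)
```

Here three of the four training objects fall in a cell that holds only half of the photometric sample, so they are down-weighted, while the lone training object in the other cell is weighted up by a factor of two.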