Extragalactic machine learning: in theory and in practice

Turner, S

Turner, S (2021) Extragalactic machine learning: in theory and in practice. Doctoral thesis, Liverpool John Moores University.

Text
2020turnerphd.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial.

Abstract

Galaxy evolution is complicated. Throughout their lifetimes, galaxies are subject to an amalgamation of astrophysical and cosmological processes that direct the growth of their stellar masses, the transformation of their morphologies, and the cessation of their star formation. The variable action of these processes begets a diverse population of galaxies, which exhibit a variety of brightnesses, colours, shapes, and sizes, among myriad other features. Many of these features are bimodally distributed, which has led to the general acceptance of a simple empirical paradigm of galaxy evolution. However, connecting this diversity among galaxies with the array of processes that are involved in their evolution, and constraining the relative influences of each of these processes, requires that several features are analysed simultaneously. This has been enabled by the recent introduction to astronomy and astrophysics of machine learning techniques, which are capable of extracting scientifically useful information from complicated, multi-dimensional datasets. Unsupervised machine learning techniques, free from the requirement for pre-labelled training data, are especially well suited to the exploration of the data structures of galaxy samples in multi-dimensional feature spaces. This thesis assesses the use of clustering, an unsupervised machine learning technique, for the research of galaxy evolution. Clustering is first tested on a well-characterised sample of galaxies from the GAMA survey. Galaxies are represented in five dimensions by a set of intrinsic astrophysical features. Use of a unique cluster evaluation framework enables the robust identification of reproducible and astrophysically meaningful clustering structures via the k-means method. Outcomes consisting of two, three, five, and six clusters are deemed stable, and form a hierarchical structure that agrees well with established notions of the galaxy bimodality.
The two- and three-cluster outcomes are dominated in their structures by the stellar masses, colours, and star formation activity of galaxies, with Sérsic indices and half-light radii becoming important for the five- and six-cluster outcomes. Clusters also exhibit broad correspondence with detailed morphological classifications, and it is suggested that the inclusion of additional morphological features might improve this correspondence further. The five- and six-cluster outcomes indicate the differential role of environment in the evolution of galaxies with intermediate colours. This cluster evaluation framework is then applied for the validation of the cosmological, hydrodynamical EAGLE simulations against the GAMA survey. Outcomes consisting of seven and five clusters respectively, determined using the same five features for both samples, are selected for analysis. These outcomes produce an agreement score of Vₐ = 0.76, indicating broad, overall agreement, but differences in their substructures. These differences include discrepancies in the growth of the central bulges of galaxies along the star-forming main sequence, an over-abundance of low-mass, bulge-dominated, star-forming galaxies in the EAGLE sample, and a subpopulation of high-mass, disc-dominated, star-forming galaxies in the EAGLE sample that is not present in the GAMA sample. These differences are attributed to the resolution of EAGLE, and to an active galactic nucleus feedback prescription that is not sufficiently effective in EAGLE. Finally, clustering is used to compare samples of galaxies at low (z ~ 0.06; GSWLC-2) and intermediate (z ~ 0.67; VIPERS) redshifts, in order to examine the evolution of subpopulations of galaxies. Galaxies are clustered in a nine-dimensional feature space defined by a series of ultraviolet-through-near-infrared colours using the Subspace Expectation-Maximisation algorithm, which includes iterative dimensionality reduction. 
The algorithm models both samples using seven clusters: four containing mostly star-forming galaxies, and three containing mostly passive galaxies. Both sets of star-forming clusters form clear morphological sequences, capturing the gradual internally-driven growth of galaxy bulges at both epochs. At high stellar masses, this growth is linked with quenching. However, it is only at low redshifts that additional, environmental processes appear to be involved in the evolution of low-mass passive galaxies. The results of this thesis demonstrate the utility of clustering as a method with which to analyse the large galaxy samples that are anticipated from next-generation surveys, and with which to facilitate the multi-dimensional comparison of cosmological galaxy simulations with observations. Clustering is robustly able to identify astrophysically meaningful substructures in complex, multi-dimensional feature spaces, and these substructures may readily be interpreted with respect to the evolutionary contexts of the galaxies that they encompass.
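
Illustrative sketch (not taken from the thesis): the abstract does not describe how the cluster evaluation framework decides that an outcome is stable, so the Python below assumes one generic notion of stability, namely the agreement between repeated k-means runs started from different random initialisations, scored with the adjusted Rand index. The function names kmeans, adjusted_rand_index, and stability are hypothetical, and the five-dimensional data are synthetic stand-ins rather than GAMA features.

```python
import math
import random
from collections import Counter
from itertools import combinations


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def adjusted_rand_index(a, b):
    """Chance-corrected agreement between two labelings of the same points."""
    def pairs(counts):
        return sum(math.comb(c, 2) for c in counts)

    index = pairs(Counter(zip(a, b)).values())
    sum_a, sum_b = pairs(Counter(a).values()), pairs(Counter(b).values())
    expected = sum_a * sum_b / math.comb(len(a), 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # both labelings are trivial partitions
        return 1.0
    return (index - expected) / (max_index - expected)


def stability(points, k, n_runs=5, seed=0):
    """Mean pairwise adjusted Rand index over runs with different seeds."""
    runs = [kmeans(points, k, random.Random(seed + r)) for r in range(n_runs)]
    scores = [adjusted_rand_index(x, y) for x, y in combinations(runs, 2)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Synthetic assumption: two Gaussian blobs in five dimensions.
    rng = random.Random(42)
    data = [
        tuple(rng.gauss(centre, 1.0) for _ in range(5))
        for centre in (0.0, 10.0)
        for _ in range(50)
    ]
    for k in range(2, 7):
        print(k, round(stability(data, k), 3))
```

Under this assumed criterion, values of k whose repeated runs agree closely (scores near 1) would be the candidates for "stable" outcomes; the thesis's own framework may use different criteria.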

Item Type:	Thesis (Doctoral)
Uncontrolled Keywords:	galaxy evolution; machine learning
Subjects:	Q Science > QB Astronomy; Q Science > QC Physics
Divisions:	Astrophysics Research Institute
Date of first compliant Open Access:	29 January 2021
Date Deposited:	29 Jan 2021 09:32
Last Modified:	29 Nov 2022 16:10
DOI or ID number:	10.24377/LJMU.t.00014348
Supervisors:	Longmore, S, Baldry, I and Lisboa, P
URI:	https://researchonline.ljmu.ac.uk/id/eprint/14348
