Merger or Not: Accounting for Human Biases in Identifying Galactic Merger Signatures

Date

2021-06-29

Citation of Original Publication

Lambrides, Erini et al.; Merger or Not: Accounting for Human Biases in Identifying Galactic Merger Signatures; Astrophysics of Galaxies, 29 June, 2021; https://arxiv.org/abs/2106.15618

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
Attribution 4.0 International (CC BY 4.0)

Abstract

Significant galaxy mergers throughout cosmic time play a fundamental role in theories of galaxy evolution. The widespread use of human classifiers to visually assess whether galaxies are in merging systems remains a fundamental component of many morphology studies. Studies that employ human classifiers usually construct a control sample and rely on the assumption that the bias introduced by using humans will be evenly applied to all samples. In this work, we test this assumption and develop methods to correct for it. Using the standard binomial statistical methods employed in many morphology studies, we find that the merger fraction, error, and the significance of the difference between two samples are dependent on the intrinsic merger fraction of any given sample. We propose a method of quantifying merger biases of individual human classifiers and incorporate these biases into a full probabilistic model to determine the merger fraction and the probability of an individual galaxy being in a merger. Using 14 simulated human responses and accuracies, we are able to correctly label a galaxy as "merger" or "isolated" to within 1% of the truth. Using 14 real human responses on a set of realistic mock galaxy simulation snapshots, our model is able to recover the pre-coalesced merger fraction to within 10%. Our method can not only increase the accuracy of studies probing the merger state of galaxies at cosmic noon, but can also be used to construct more accurate training sets in machine learning studies that use human-classified datasets.
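The idea of folding per-classifier biases into a probability for each galaxy can be illustrated with a minimal Python sketch. This is not the model from the paper, whose details are not given in the abstract; it is a simple naive-Bayes combination that assumes classifiers vote independently and that each has a known true-positive rate (probability of saying "merger" for a real merger) and false-positive rate (probability of saying "merger" for an isolated galaxy). The function names, parameters, and example rates are all hypothetical.

```python
import math


def expected_observed_fraction(true_fraction, tpr, fpr):
    """Merger fraction a single classifier would report, on average.

    Assumption (not from the paper): the classifier labels real mergers as
    "merger" with probability tpr and isolated galaxies as "merger" with
    probability fpr. The reported fraction then depends on the intrinsic
    fraction, so the bias is not the same for every sample.
    """
    return true_fraction * tpr + (1.0 - true_fraction) * fpr


def merger_posterior(votes, tprs, fprs, prior):
    """Posterior P(merger | votes) for one galaxy.

    votes: list of bools, True meaning "merger", one per classifier.
    tprs, fprs: per-classifier rates, each strictly between 0 and 1.
    prior: assumed intrinsic merger fraction, strictly between 0 and 1.
    Classifiers are assumed independent given the true state.
    """
    log_merger = math.log(prior)
    log_isolated = math.log(1.0 - prior)
    for vote, tpr, fpr in zip(votes, tprs, fprs):
        if vote:
            log_merger += math.log(tpr)
            log_isolated += math.log(fpr)
        else:
            log_merger += math.log(1.0 - tpr)
            log_isolated += math.log(1.0 - fpr)
    # Normalize in a numerically stable way.
    peak = max(log_merger, log_isolated)
    p_merger = math.exp(log_merger - peak)
    p_isolated = math.exp(log_isolated - peak)
    return p_merger / (p_merger + p_isolated)


if __name__ == "__main__":
    # Same hypothetical classifier, two samples with different intrinsic fractions.
    for f in (0.1, 0.5):
        print(f, expected_observed_fraction(f, tpr=0.8, fpr=0.2))
    # Three hypothetical classifiers voting on one galaxy.
    print(merger_posterior([True, True, False],
                           tprs=[0.8, 0.7, 0.9],
                           fprs=[0.2, 0.3, 0.1],
                           prior=0.3))
```

In this toy setting, the gap between the reported and intrinsic fractions changes with the intrinsic fraction itself, which is the kind of sample-dependent bias the abstract describes. Weighting each vote by the classifier's own rates, rather than counting votes equally, is one simple way to account for it.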