Bias in machine learning (and in AI more generally) is bias in data. Bias in data replicates the bias in ourselves, in our distributed social and cultural information-processing systems, and in the natural orientation of complex information- and energy-processing systems towards autonomous self-propagation through optimally concise self-representational abstractions. Machine learning applies mathematical heuristics to extract patterns in ways quite similar to those in which social, cultural and psychological systems seek organisational mnemonics, notwithstanding that many of these shortcuts are unjust, inequitable and profoundly unethical.
I suspect we need to interrogate the underlying ontologies and axiomatic assumptions we apply to complex system dynamics but, to be honest, there is a certain emotional (!) commitment to the ways things are done now, and shorter-term utility tends to overrule longer-term value in technical, technological and academic domains. We will not (like, ever) successfully disentangle the biases and ethical conundrums here unless we assess the whole context in which all of this occurs; and since such global assessments tend to fall quite dramatically outside the scope of language and demonstrable proof, we find ourselves in something of an intractable bind.
The bias inside data and ourselves represents the optimally concise methods by which social, cultural and psychological information-processing systems obtain sustainable continuity. We can no more remove the broken symmetry of bias from our world than we can rewrite the fundamental forces of nature, but we can, if we are clever, reshape and wilfully repurpose the inevitability of (some kind of emergent) bias in constructive ways.