
Context: What Is Synthetic Data?
…and yet, with this, the bootstrap into virtualised abstraction becomes all but complete. This is fascinating, and it clearly intersects with some of Jean Baudrillard’s observations on simulation, semiotics and reality. Training machine intelligence on data generated in virtual worlds solves some tricky problems, but at what cost? Where does the “real-world” complexity go? If issues of privacy and data bias are implicit and effectively irreducible properties of social, cultural and symbolic communication systems, does training machine intelligence in virtualised “utopias” introduce an unbridgeable discontinuity, one that re-emerges when those systems are applied to, in or alongside a real (and constitutively complex) non-virtual world?
The disconnect between what data, as an abbreviated representation, treats as significant and what is actually occurring in the world is an issue of complexity. My bet is that a useful approach to bias would be to simulate it, in order to determine what factors and influences shape its emergence in the first place. There are endemic properties of complex “real” information-processing systems that orient social, psychological and cultural systems towards inequitable biases. We can’t pretend that bias isn’t there.