Technological Triumph of Linguistic Mediocrity

Context: We could run out of data to train AI language programs 

We should be less worried about running out of data to train language models and more concerned about finding ourselves without anything interesting to say.

The profusion of large language models does not occur in a vacuum. Although our cognitive (as much as linguistic) reflex is to treat complex technological systems as relatively self-contained artefacts and entities, we do well to contemplate the larger form and flow of information, energy and communication of which they are only ever functional microcosms.

The uptake of these large language models occurs alongside, or soon after, the emergence of a near-ubiquitous linguistic transmission medium: social media. This is not necessarily a causal relationship, but I think we can view the former in terms of insights derived from the latter.

The arrival of a tidal wave of memetic tropes, idioms of political or commercial influence and abbreviated mnemonics indicates that the artefacts, entities and systems that tend to percolate to ascendance are precisely those that are most probable.

What is most probable in generative language production? Precisely those banalities and null statements that most easily combine with others.
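The point can be made concrete with a toy sketch. The next-word distribution below is entirely invented for illustration, but it shows why likelihood-maximising (greedy) decoding will always return the stock continuation, while only a flattened, higher-temperature sampling gives rarer, more surprising words a chance:

```python
import random

# Invented next-word distribution for a prompt like "At the end of the day, ..."
# (probabilities are illustrative, not taken from any real model).
next_word_probs = {
    "it": 0.40,           # leads to stock phrases like "it is what it is"
    "we": 0.30,
    "everything": 0.20,
    "entropy": 0.07,      # rarer, more surprising continuations...
    "iridescence": 0.03,
}

def greedy(probs):
    """Always pick the single most probable continuation: the banality."""
    return max(probs, key=probs.get)

def sample(probs, temperature=1.0):
    """Sample a continuation; higher temperature flattens the distribution,
    giving low-probability words more of a chance."""
    words = list(probs)
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(words, weights=weights, k=1)[0]

print(greedy(next_word_probs))       # the most probable word, every time
print(sample(next_word_probs, 2.0))  # occasionally something rarer
```

Greedy decoding is deterministic banality; sampling merely rations the surprise.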
