Unveiling the simulacrum: Data’s role in revealing reality

As data visualisation and augmented technology captivate the next generation, humans are likely to be passengers in a probabilistically engineered simulacrum defined by debatable data.

Harnessing a data mindset in a world increasingly built for speed and reasoning has become a critical competency. But blind belief in data storytelling can hinder our ability to discern truth from fiction in a progressive statistical simulacrum.

As humanity progresses at astounding levels of intelligence and interconnectedness, access to conventional wisdom is becoming ever more expeditious. Data is now available all around us in highly curated forms — from medical prognoses and climate forecasting, to socioeconomic analyses and even personalised tracking apps.

Yet, with such appetite for pervasive knowledge and instantaneous insights, outsourcing inference comes at a cognitive cost: over-reliance on curated data narratives, with less focus on intrinsic inquiry. To fully grasp the world around us, we need a multifocal perspective with a critical lens for discerning dodgy data.

Data’s simulacrum of reality

With augmented intelligence changing the world in algorithmic leaps and bounds, data has become a sacred topic that, in Kenneth Cukier’s words, leads us to ‘fetishise’ statistical figures, almost to the extent of blind belief. In The Economist’s latest course on Data Storytelling and Visualisation, its Executive Editor postulates: ‘data is only a simulacrum of reality, not the real thing’.

With this in mind, data has both the functional and dysfunctional power to shape collective opinion. Used with knowledge and humility, it informs society to change and progress in transformational ways. Applied without care and wisdom, however, data can be dangerous.

For instance, dodgy data, whether intentional or unintentional, has had serious consequences. Two examples are the Volkswagen emissions scandal in 2015, which involved the deliberate manipulation of data to mislead public opinion, and the Lancet MMR vaccine controversy in 1998, which led to vaccine hesitancy and outbreaks of preventable diseases.

Both of these cases signify the caution required in contemplating data representation. Even reputable publications such as The Economist have drawn erroneous visualisations from time to time, albeit unintentionally and with enlightening self-critique.

Degrees of Uncertainty by Andy Dollerson, Neil Halloran

Crucial to the topic of mindfulness, the inspection of any data perspective remains a vital intuition that requires a subconscious awareness of three elements: data representation, narrative and visualisation.

Data representation is circumstantial evidence depicted visually in the form of charts — simulating reality for objective reasoning.
Coherent data perspectives contain concise narratives that draw attention to specific information — distilling numerical complexity.
Visualisation techniques illustrate abstract concepts effectively — enhancing readability and comprehension.

Keeping in mind each of these aspects in any data story holds the key to unravelling the intended representation while objectively perceiving the purported findings. This heightened awareness can help us grasp the intent of seemingly complex visualisations, and approach data simulation with more confidence and critique.

Simulacrum — The ‘Warming Stripes’ graphic represents the change in average annual temperature over the past century. Ed Hawkins, UoR.

A rework of Radiohead’s ‘House of Cards’ song created using available LIDAR data. Brendan Dawes

Data representation & the distortion of reality

As deep learning and generative AI models propel the science of statistics and probability to new heights, data features prominently at the heart of almost every action we take. Your next online shopping session, a recent personality test, the fitness tracking app you monitor, or your doctor’s MRI scan of a perplexing ailment: each of these actions is influenced by representational data with varying degrees of accuracy, and it is this statistical representation that one must be cognisant of.

Absorbing the mind-boggling pace of innovation that is synonymous with big data and artificial intelligence can be entrancing, but it is crucial to keep in mind the representativeness of the data we are exposed to. Simply put, this means asking how accurately data portrays reality, and to what extent it exhibits limitations or manipulation.

While statistical methodologies (such as descriptive and inferential statistics) are used in data analysis to summarise or draw conclusions, no given data set can be assumed to be a genuinely random sample. In other words, reality is too complex to quantify holistically.

Indeed, longitudinal studies provide the most resolute form of empirical accuracy, but extraneous and confounding variables (such as age, gender, mannerisms, or mixed effects) can influence behavioural modification during controlled testing.

Moreover, and in the realm of generative AI, vast probability computation models are only as intelligent or effective as the data they are provided or trained on. While AI models are becoming more robust in transforming huge amounts of raw data into actionable information, the proxies that they rely on (due to limited data) make the evidence circumstantial. It is this inferential hypothesising that distorts reality.

Of course, modern statistics is evolving with scientific rigour and has profound benefits for societal progress. But, as data visualisation and augmented technology captivate the next generation, humans are likely to be passengers in a probabilistically engineered simulacrum defined by debatable data.

Pausing to consider the contentious field of psychometrics: standardised aptitude and personality frameworks such as MBTI, DiSC, The Big Five, and HEXACO have been widely used with dangerously low scientific validity. While such tests can be helpful as a preliminary screening tool to filter out ill-suited candidates in large groups, there is very little scientific consensus about the contextual predictability of psychometric practices.

The ‘Four Colour’ personality typology, for example, which interplays with psychometric deduction, is notoriously controversial. The New York Times dubbed personality tests ‘the astrology of the office’, while the Swedish Skeptics Society named author Thomas Erikson ‘fraudster of the year’ in 2018 for his bestselling book, Surrounded by Idiots.

Yet, amid concerning critique about the reliability of such erratic inference, global recruiting and coaching practices are dogmatically applying this pseudoscientific methodology, as is evident in the personality testing industry’s staggering 300% growth from approximately USD 500 million (2019) to USD 2 billion (2023).
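In the spirit of inspecting figures rather than accepting them, the growth claim above can be checked with a few lines of arithmetic. This is only a rough sketch based on the two approximate market sizes quoted in the text. The variable names are illustrative, and treating the 2019 to 2023 span as four annual compounding periods is an assumption.

```python
# Approximate market sizes as quoted in the text, in USD millions.
start_value = 500.0    # 2019
end_value = 2000.0     # 2023
years = 2023 - 2019    # assumption: four annual compounding periods

# Total growth: the increase relative to the starting value.
total_growth = (end_value - start_value) / start_value

# Compound annual growth rate implied by the same two figures.
cagr = (end_value / start_value) ** (1 / years) - 1

print(f"Total growth over the period: {total_growth:.0%}")
print(f"Implied compound annual growth: {cagr:.1%}")
```

Read this way, the 300% describes the whole four-year span, which works out to roughly 41% a year when compounded. Both numbers describe the same data, but they tell noticeably different stories.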

Where do we draw the line between circumstantial evidence and dodgy data?

Persona: The Dark Truth Behind Personality Tests

Uncovering America’s intriguing obsession with personality testing. Based on Merve Emre’s book, The Personality Brokers.

Astounding stories revealed by data simulation

Get inspired by these factual perspectives, told in an immersive way using a variety of high-quality visuals, from graphs and maps to illustrations and audiovisual components.

Wealth Inequality in America

Smart or scary? Tell-tale signs of an emerging AI-nxiety crisis

AI: A Journey into the heart of Deep Learning