With so much data out there, why is there so little trust?
Published 2022-03-23, updated 2023-02-06
Summary - In a world awash in data, why is so much disinformation out there? In this blog, Allan shares his knowledge about what data is, and how to interpret and assess it.
I live in Ottawa, and I work downtown. My office is located in the area occupied for several weeks this winter by angry protestors and their trucks.
My work – and my company’s – is all about data. It involves the collection, display and interpretation of facts and figures to tell stories about just about anything: retail sales, employee retention, website visits, you name it. In short, we develop business intelligence software.
As I watched the trucker occupation unfold, its anger fueled by blatantly false or misleading information, what had been an abstract concept became very personal to me. As someone who is passionate about data, someone who has built a livelihood on evidence-based facts, I found myself frustrated and even angry that so many people could believe in and act on a stream of falsehoods.
Why, I wondered, in a world awash in data, is so much disinformation out there? Why are lies and half-truths spreading like wildfire? Why are people not fact-checking? Why are intelligent people prepared to accept falsehoods when science-based truths are staring them in the face?
I don’t have the answers to those questions. (I wish I did!)
But what I do have – and what I want to share – is knowledge about what data is, and how to interpret and assess it. Because only by being open about and critical of data can we understand its true value.
Why data literacy is important
I think too many people don’t know how to validate data, or even that it’s important that they do so. Yet in today’s data-driven world, the ability to critically assess information is as fundamental as knowing that red means stop and green means go.
Why do I say that?
Because there are no gatekeepers anymore.
The advent of social media has killed them off. We are now flooded with livestreams and videos and photos and tweets from all manner of people (or are they bots?). Algorithms feed us more of the same, without worrying about whether what we’re being sent is a report from a respected news organization or a clever but totally fabricated tidbit from a conspiracy theorist who’s pitched it to appeal to emotions.
People use data to form opinions about what’s going on in the world.
Problems arise when they can’t or won’t distinguish between good data (verifiable facts) and bad data (errors or outright fabrications). Bad data includes both ‘misinformation’ and ‘disinformation.’ Misinformation is something that is unintentionally wrong, or spread without the intention to deceive. Disinformation is something that is deliberately created to deceive.
Either can be put out by organizations, governments, corporations, or individuals, though disinformation is generally created by someone with an axe to grind.
And at the far end of the mistrust spectrum, some will argue that good, evidence-based data is actually fabricated and can’t be trusted because it’s part of a larger conspiracy.
Data by itself is neutral. It becomes believable once you have taken certain steps to verify it and to add context. Only with scrutiny does trust in the data actually emerge.
Here’s how to be data literate.
Validate the information
The first rule is: Validate data by seeing whether you can get the same information from a variety of other sources.
For example, if you see a TikTok video warning that a new species of mosquito is about to invade, see if you can find other sources that say the same thing. And not just other TikTok videos, but different sources – a city health department, a respected scientific journal, a mainstream media outlet, a university study. The more confirmations you can get from a variety of sources, the more likely the information is true.
It stands to reason that those sources providing validation should be disinterested parties. A company selling mosquito repellent is not a neutral source; it stands to profit if people worry about a mosquito invasion, so it’s less credible than a city health department.
By the way, we are very privileged in our society to have access to a variety of sources. In some places (Russia right now, for example), it’s difficult if not impossible to check and validate data. Is Ukraine being invaded, or is it being liberated from a Nazi regime? In some places, non-official information is hard to access.
When it’s created by large, powerful organizations (governments, corporations, etc.), fabricated data can be hard to check, and it can appear in a range of sources that seem to confirm one another; that’s why it’s good to be skeptical at times.
Validate the source
Rudolph Giuliani is a former New York City mayor who is closely associated with former U.S. president Donald Trump.
During the trucker occupation of downtown Ottawa this past winter, he put out the following tweet:
How many people believed Giuliani’s tweet? How many retweeted it without considering that Giuliani has an agenda as a Trump supporter? And that as such, he had an interest in discrediting the Canadian government?
Some material put out there is so ridiculous (at least to me) that I am shocked that people believe it. So check your sources before sharing.
Here, by the way, is the Ottawa Humane Society response to the Giuliani tweet. And if you aren’t inclined to believe the CTV news report of their response, go right to the source, a blog posting by Humane Society President and CEO Bruce Roney.
Validate the context
Data is useless without context.
For example, suppose you hear that a local company hired 50 people last year.
It sounds good, right? But without knowing the context, it’s impossible to assess that snippet of information and understand whether it’s positive or negative.
What if those 50 people were hired in January, but in February the company laid off half of them?
What if those were 50 part-time jobs?
What if all 50 of those hires were remote workers in other countries rather than here at home?
When analyzing business data, checking context is crucial. It is typically done by looking at a range of metrics in order to provide the true picture. For example, profit alone does not tell the full story. A healthy profit one year could be the result of a big one-time opportunity, or layoffs that reduced expenses.
Only by understanding the context can you understand the data. To understand context you need to ask questions, be skeptical, and look for explanatory information.
Validate how and when the information was gathered
You’re hungry, and you see a sign outside a restaurant you’ve never tried. “Voted best restaurant,” the sign says. It sounds like a good endorsement, right?
Not so fast!
Do you know who voted? Did the restaurant owner canvass only family and friends, or was there a city-wide vote?
When was the vote held? Was it last month, or was it five years ago, when the restaurant was owned by someone else?
Was there only one survey? What if this restaurant was voted best restaurant 10 years ago, but failed to make the cut in any subsequent annual survey?
Who conducted the vote? Was it the restaurant itself, using leading questions in an online survey, or was the vote conducted by an independent body gathering data from across the city?
How information is gathered is hugely important. Look for the date, the time range, the sample size or number of data points, and how it was collected (survey response, direct observation, machine). Question and understand.
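For readers who like to see a checklist written down, here is a minimal Python sketch of the questions above. Everything in it is an assumption made for illustration: the `Claim` fields, the `open_questions` helper and the one-year `max_age_days` cutoff are hypothetical, not an established standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Claim:
    """A piece of information plus what we know about how it was gathered."""
    text: str
    collected_on: Optional[date] = None  # when the data was gathered
    sample_size: Optional[int] = None    # number of respondents or data points
    method: Optional[str] = None         # e.g. "survey", "observation", "machine"
    collected_by: Optional[str] = None   # who ran the vote or survey


def open_questions(claim: Claim, today: date, max_age_days: int = 365) -> List[str]:
    """List the questions still unanswered about how the data was gathered."""
    questions = []
    if claim.collected_on is None:
        questions.append("When was the data gathered?")
    elif (today - claim.collected_on).days > max_age_days:
        # The cutoff is arbitrary; pick one that suits the kind of data.
        questions.append(f"The data is older than {max_age_days} days; does it still hold?")
    if claim.sample_size is None:
        questions.append("How many people or data points are behind it?")
    if claim.method is None:
        questions.append("How was it collected?")
    if claim.collected_by is None:
        questions.append("Who collected it?")
    return questions


# The restaurant sign tells us nothing about how the vote was run.
sign = Claim(text="Voted best restaurant")
questions = open_questions(sign, today=date(2022, 3, 23))
for q in questions:
    print("-", q)
```

An empty list does not prove the claim is true; it only means the basic questions about how it was gathered have answers you can go and check.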
Validate the way the data is presented
Say a survey of 100 people shows 99 of them were satisfied with a soap; one person, however, disliked the soap because it caused a strong allergic reaction.
In that case, both of these headlines are valid:
“Soap overwhelmingly endorsed by users.”
“Soap the source of strong allergic reaction.”
Now imagine each of those headlines on an Instagram photo; in the first case, the soap would be perceived as positive, in the second it would be perceived as negative. Same information, different angle.
When you’re looking at charts and graphs, check whether the data is cumulative, understand the scale of the axes, and note whether any smoothing or trend lines have been applied. Any of these can visually alter the interpretation of the results.
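To make that concrete, here is a small Python sketch that takes one set of numbers and presents it three ways. The daily counts are made up for illustration, and the three-day window for smoothing is an arbitrary choice.

```python
from itertools import accumulate

# Hypothetical daily counts (say, new sign-ups per day); made-up numbers.
daily = [10, 12, 9, 7, 5, 4, 3]

# Cumulative view: a running total of non-negative counts never goes down,
# so the line keeps climbing even though the daily numbers are falling.
cumulative = list(accumulate(daily))


def moving_average(values, window):
    """Average each run of `window` consecutive values (simple smoothing)."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Smoothed view: the bump on day two disappears into the average.
smoothed = moving_average(daily, 3)

print("daily:     ", daily)
print("cumulative:", cumulative)
print("smoothed:  ", [round(v, 1) for v in smoothed])
```

Plotted, the cumulative series looks like steady growth, the daily series looks like a decline, and the smoothed series hides the short-lived rise. All three are honest renderings of the same data, which is why it pays to check which one you are looking at.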
It’s important, when understanding data, to look beyond the way the data is presented. Data by itself is neutral; it’s neither here nor there. The way it’s presented can change how it’s perceived.
So you’ve validated the “Voted Best Restaurant” place and discovered the vote was a legitimate, city-wide survey by a disinterested party. Fine. But revisit the data every so often.
Maybe, six months later, someone discovers the original survey was a fraud.
Maybe the restaurant loses its chef and ends up getting fined for health board violations.
Or maybe one day it gets a Michelin star.
In other words, for any kind of data, new information can appear that changes things. Errors are uncovered. Mistakes are corrected. Awards are given. So check and recheck to make sure that data you are using still means what you thought it meant.
This, by the way, is part of the scientific process. Mistakes point research in new directions. Changing your mind on the basis of new evidence is a very good thing - though all too often presenting updated information is seen as incompetence, or as proof of a cover-up or conspiracy.
Understand your own biases
“Cars to be banned from major downtown street” can be either good news (if you’re a pedestrian) or bad news (if you drive your car down that street every day).
When trying to understand data, we need to be aware of bias – our own, and that of the people who collected or presented the data.
Understanding one’s own confirmation biases is often very difficult. But to get the true value of data, we all have to be very aware of our ‘desired outcome’ and make sure it doesn’t interfere with our ability to read and understand the data.
Which brings me to sharing – one of the bases of social media. If you share data, it’s important to understand why you are doing so. Will it help, inform, harm? Sharing is not a neutral action, but a reflection of your own biases and desires.
Get out of your echo chamber
Click on apples and social media’s algorithms give you more apples. Click on oranges and you’ll get more oranges. It’s as simple as that. But after a while, the apple person sees only apples, and the orange person sees only oranges.
That’s not good.
To better understand data, you have to actively seek new perspectives.
That means looking at sources outside your comfort zone and assessing them openly and rationally. That is not always easy when they push your buttons!
Data and trust
I said at the beginning of this essay that I don’t have answers to the questions that anger and worry me right now. I am still grappling with why a growing number of people feel unheard and unappreciated, and why they prefer to stew in their own anger rather than check facts.
But one thing I do know is that data alone is not what resonates with people. Stories, true or false, are what people relate to, remember, and retell. And data becomes powerful when it is used to craft compelling stories.
The trucker occupation that got me thinking now seems like ancient history. The world is now on edge, worried about a far larger problem: the war in Ukraine and its implications for the future in a world where COVID has not quite gone away. Trust in data is central to our ability to act on all these issues, and many others that we must face.
We have reached a point where our survival – as individuals and as a society – will depend on which stories we believe, and which we discard. And if we don’t agree on our stories, how can we ever trust each other enough to live harmoniously together?
Allan Wille is a Co-Founder and CEO of Klipfolio. He’s also a designer, a cyclist, a father and a resolute optimist.