I recently watched the nearly two-hour Netflix documentary The Great Hack, which tells the story of the ill-fated data analysis and marketing firm Cambridge Analytica (CA). The CA story involves the unethical use of personal data, notably from Facebook, and CA’s role in the 2016 US presidential election and the UK Brexit vote. The film follows David Carroll, a teacher who filed a lawsuit in the UK to obtain the data CA used to profile him, as well as ex-CA staffers Chris Wylie and Brittany Kaiser, the whistleblowers who triggered the CA scandal.
The film raises ethical issues about the ability of digital platforms such as Google and Facebook to gather fine-grained data about people, usually without notice, and to use this information to influence people’s actions.
I think it’s important to understand that the fundamental concept behind Google, Facebook, and agencies like CA is “data mining” – or the ability to crunch and mine insights from granular data.
Let’s say I run an e-commerce portal which sells books (like Amazon), and for each purchase transaction I know two things: the age of the buyer, and the genre of the book. I could use this information in two ways: a) take a particular age group, say Millennials, and find their most preferred genre, say Science Fiction; or b) take a particular genre, say Cookbooks, and find that mostly Gen Xers buy them. I can then decide to email my Millennial contacts to promote Isaac Asimov novels, or offer a Cookbook discount to my Gen X clients who don’t have one yet. This is known as data-driven marketing and is quite commonplace today.
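To make the bookstore example concrete, here is a minimal sketch in Python of the two ways of reading the same purchase data. The purchase log and the function names are made up for illustration; a real retailer would work with far more records and far more fields.

```python
from collections import Counter, defaultdict

# Hypothetical purchase log: (buyer's age group, book genre).
purchases = [
    ("Millennial", "Science Fiction"),
    ("Millennial", "Science Fiction"),
    ("Millennial", "Cookbooks"),
    ("Gen X", "Cookbooks"),
    ("Gen X", "Cookbooks"),
    ("Gen X", "Science Fiction"),
    ("Gen X", "Cookbooks"),
]


def top_genre_by_age_group(rows):
    """View (a): for each age group, find the genre it buys most often."""
    counts = defaultdict(Counter)
    for age_group, genre in rows:
        counts[age_group][genre] += 1
    return {group: c.most_common(1)[0][0] for group, c in counts.items()}


def top_age_group_by_genre(rows):
    """View (b): for each genre, find the age group that buys it most often."""
    counts = defaultdict(Counter)
    for age_group, genre in rows:
        counts[genre][age_group] += 1
    return {genre: c.most_common(1)[0][0] for genre, c in counts.items()}


genre_for_group = top_genre_by_age_group(purchases)
group_for_genre = top_age_group_by_genre(purchases)
print(genre_for_group)
print(group_for_genre)
```

With this toy data, the first view points to a Science Fiction promotion for Millennials and the second to a Cookbook offer for Gen X, which is the same reasoning as the two campaigns described above.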
Two data points are enough to run a marketing campaign, but today our lives generate far more data than that. Each of us represents at least five types of data to any company:
• Demographic – who you are
• Geographic – where you’ve been
• Psychographic – how you think, your personality and beliefs
• Behavioral – what you do, what you buy, how you act
• Biometric – biologically identifiable traces, blood, DNA, photos, fingerprints
Surprised your life is practically a collection of data? It’s also important to note where we usually generate and store this information:
• Government institutions – the registry of deeds would have most of your demographics and anything with a title, such as land; ID-issuing institutions hold passports, driver’s licenses, and social security records
• Employers – school grades, past employers, health records, stored in 201 files kept by HR departments
• Financial institutions – credit cards, deposits, investments; insurance companies also have health records
• Medical institutions – medical history, your pharmacist has your prescriptions
• Internet/Social Media – stuff you like, pictures of your lunch, relationships, beliefs, psychographics any time you play games and quizzes, e-books, videos watched, comments, purchases and preferences from online retailers, and more…
None of this would really bother us, since few of these entities are able to talk to each other and piece together all the information, were it not for two common pieces of data that we always carry with us.
Emails and mobile numbers. Both are found on our smartphones, on which we do everything in our lives.
Most of us have had the same mobile number and email address for longer than we’ve had our passport and driver’s license numbers, which makes them the equivalent of a universal national ID! These identifiers allow any entity to glue together all the disparate information about us into one database. Cambridge Analytica was said to gather as many as 5,000 data points on every individual: big data that reveals what people do, think, and buy, and in the case of elections, who and what they would vote for.
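The “glue” described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Cambridge Analytica or any real company does it: the records, field names, and the link_records function are all hypothetical, and the matching rule is deliberately simple (records are merged when they share a normalized email or mobile number).

```python
def normalize_email(value):
    """Emails match regardless of letter case or stray spaces."""
    return value.strip().lower()


def normalize_mobile(value):
    """Keep digits only, so '0917-555-0100' matches '0917 555 0100'."""
    return "".join(ch for ch in value if ch.isdigit())


def identifiers(record):
    """Collect the normalized email and/or mobile number found in a record."""
    ids = set()
    if record.get("email"):
        ids.add(("email", normalize_email(record["email"])))
    if record.get("mobile"):
        ids.add(("mobile", normalize_mobile(record["mobile"])))
    return ids


def link_records(records):
    """Merge records that share any identifier into a single profile."""
    profiles = []  # each profile: {"ids": set of identifiers, "data": dict}
    for record in records:
        merged = {"ids": identifiers(record), "data": {}}
        untouched = []
        for profile in profiles:
            if profile["ids"] & merged["ids"]:
                # Shared email or mobile: absorb the existing profile.
                merged["ids"] |= profile["ids"]
                merged["data"].update(profile["data"])
            else:
                untouched.append(profile)
        merged["data"].update(
            {k: v for k, v in record.items() if k not in ("email", "mobile")}
        )
        profiles = untouched + [merged]
    return profiles


# Hypothetical records held by unrelated organizations.
records = [
    {"email": "Juan@Example.com", "credit_limit": 50000},            # bank
    {"email": "juan@example.com", "mobile": "0917-555-0100",
     "favorite_genre": "Science Fiction"},                           # retailer
    {"mobile": "0917 555 0100", "personality": "introvert"},         # quiz app
    {"email": "maria@example.com", "age_group": "Gen X"},            # someone else
]

profiles = link_records(records)
for profile in profiles:
    print(sorted(profile["ids"]), profile["data"])
```

In this toy example the bank and the quiz app never share a field directly, yet their records end up in one profile because the retailer’s record carries both the email and the mobile number.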
Patterns revealed by large-scale data mining can be used to develop algorithms that predict behavior and influence people with alarming accuracy. In 2017, Stanford behavioral scientist Michal Kosinski developed an algorithm that could predict a person’s sexual orientation from their photo with 91% accuracy. It outperformed humans, who could tell whether someone was gay from a photo with only 65% accuracy. Today any modern media or marketing agency will use data to tailor a message or image to appeal better to viewers (if you’re reading this article, it probably means stories like this already appeal to you).
Data-driven advertising messages contain calls-to-action to buy products, endorse candidates, or follow news. You will follow those messages because they were written with You in mind, and You will believe them because it’s Your personality to do so.
One of the ethical issues raised by The Great Hack is about free and honest elections, notably the Trump election and Brexit. When people are influenced at such a fine level by data mining algorithms without their knowledge, has their basic right to self-determination been compromised? In the case of Kosinski’s gay detection algorithm, the question wasn’t about the algorithm itself but about how long it would be before it became a tool for sexual discrimination.
Data freedom, data discrimination, and data ethics pose questions which are hard to answer but I think they do need to be discussed now more than ever.
In the meantime, we need to take stock of and secure our personal data, and that means securing our weakest links: our mobile phones and our email addresses. One piece of advice I learned from a reformed hacker is to “fragment” your emails, to make it harder for companies to use them to link your data. You need at least five email addresses: a personal email, one for the office, another for banking, one for social media, and one to “throw away” for one-off transactions. This way, if any of your emails gets compromised, you limit the damage and you know who could have done it.
Sounds like a hassle? The real insight from watching The Great Hack is discovering the identity of the great hacker of our lives: our own laziness.
Dominic Ligot is a data analyst, software developer, entrepreneur and technologist. He is a founding board member of the Analytics Association of the Philippines where he is an active advocate for data literacy and data ethics. He previously held executive roles in IT and banking that included roles in governance, risk management, fraud, surveillance, and cybersecurity.
Photographs from Netflix