I got pwned today - The PDL customer "enriched" data exposure


I got pwned today. No, not by the crypto market, but by an actual data breach. I received an email this morning informing me that my email address was involved in the PDL customer data exposure. It looks like this.


By the way, if you want to be informed of future breaches involving your email address, be sure to subscribe to the notifications at "Have I Been Pwned".

Over 620 million unique email addresses

As a cybersecurity professional, it upsets me to see my email address caught up in data breaches. This data exposure involves over 620 million unique email addresses. Considering that there are about 7 billion people in this world, that is roughly 1 out of every 11 people, if we count each address as one person. Given that not everyone is connected to the internet or has an email address, the chance that you are one of the victims is probably higher.
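To make the back-of-the-envelope math explicit, here is a small Python sketch. It uses the two figures from this post and treats every address as one person, which is a simplification since many people own several addresses. The `share_with_email` parameter and the 0.5 value passed to it are purely illustrative assumptions, not measured statistics.

```python
# Figures quoted in this post: ~620 million addresses, ~7 billion people.
EXPOSED_EMAILS = 620_000_000
WORLD_POPULATION = 7_000_000_000


def exposure_rate(exposed: int, population: int, share_with_email: float = 1.0) -> float:
    """Fraction of email users affected, counting each address as one person."""
    if not 0 < share_with_email <= 1:
        raise ValueError("share_with_email must be in (0, 1]")
    return exposed / (population * share_with_email)


# Everyone on earth counted as a potential victim.
everyone = exposure_rate(EXPOSED_EMAILS, WORLD_POPULATION)

# Hypothetical: only half the population has an email address.
# The 0.5 is a made-up illustration, not a real statistic.
half_online = exposure_rate(EXPOSED_EMAILS, WORLD_POPULATION, share_with_email=0.5)

print(f"whole population: 1 in {1 / everyone:.1f}")
print(f"if half have email: 1 in {1 / half_online:.1f}")
```

Against the whole population this works out to about 1 in 11.3, in the same ballpark as one in ten. Under the made-up 50% assumption it becomes about 1 in 5.6, which is the sense in which the real odds are "probably higher".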

Troy Hunt, creator of "Have I Been Pwned", graciously shared his own record from the exposed data. It included his name, phone number and work history.

Here is another article, from the researchers who found the exposed data, showing what the dataset looks like:


A separate article noted that:

...the data doesn't include sensitive information like passwords, credit card numbers, or Social Security numbers. It does, though, contain profiles of hundreds of millions of people that include home and cell phone numbers, associated social media profiles like Facebook, Twitter, LinkedIn, and Github, work histories seemingly scraped from LinkedIn, almost 50 million unique phone numbers, and 622 million unique email addresses.

With almost 50 million unique phone numbers exposed, I hope mine is not among them :)

What is this data used for?

Before this incident, I had never heard of a company called PDL. PDL is short for People Data Labs. The name sounds very much like a company in the business of infringing on personal privacy, and that is precisely what it does.

PDL is in the business of data "enrichment". Basically, the company collects massive amounts of data from various sources and sells it to customers who want to "enrich" their own data. Here is an explanation that I found quite concise:

For a very low price, data enrichment companies allow you to take a single piece of information on a person (such as a name or email address), and expand (or enrich) that user profile to include hundreds of additional new data points of information. As seen with the Exactis data breach, collected information on a single person can include information such as household sizes, finances and income, political and religious preferences, and even a person’s preferred social activities.

Each time a company chooses to “enrich” a user profile, they are also agreeing to provide what they know about the person to the enriching organization (thereby increasing the validity of the organization’s future results). Despite efforts from social media organizations like Facebook, the resulting data continues to be compounded, creating a situation with no oversight that ultimately allows all of a person’s social and personal information to be easily downloaded.
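The loop described in the quote can be sketched as a toy Python example. This is not PDL's API or any real vendor's interface. `vendor_db`, `enrich` and every record below are invented for illustration. The point is only to show how each lookup both returns a fuller profile and feeds the vendor's store.

```python
from typing import Dict

Profile = Dict[str, str]

# Hypothetical vendor store keyed by email. All records are invented.
vendor_db: Dict[str, Profile] = {
    "alice@example.com": {"name": "Alice Example", "employer": "Acme Corp"},
}


def enrich(email: str, known: Profile) -> Profile:
    """Merge what the customer knows into the vendor record and return the result."""
    record = vendor_db.setdefault(email, {})
    # The vendor absorbs whatever the customer already knew...
    record.update(known)
    # ...and the customer receives a copy of the combined profile.
    return dict(record)


# Customer A only knows a phone number, but walks away with name and employer too.
view_a = enrich("alice@example.com", {"phone": "+1-555-0100"})

# Customer B contributes a social handle and also receives A's phone number.
view_b = enrich("alice@example.com", {"twitter": "@alice_example"})

print(sorted(view_a))
print(sorted(view_b))
```

After two lookups the vendor holds four fields on one person, and the second customer received data that only the first customer had supplied. That is the compounding effect the quote warns about.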

Sounds exactly like they are selling your data for money, doesn't it? Such is the internet now: whenever you share your details with a centralized platform, you can never be sure what it will do with your data. Data "enrichment" is probably just one of the many uses for your personal info.

Anyway, the exposed data seems to have come from PDL, but it was not PDL that exposed it. Instead, it was likely a customer of PDL that had collected all this information and stored it in an insecure repository, leading to the breach.

Major problem with the internet and privacy

This data exposure once again highlights the broken state of privacy on the internet. Once your data is out on the internet, you can never be sure whose hands it will land in. Duplicating data is free. You cannot control how many copies are made or who they are shared with. In that sense it resembles the "double-spending" problem, and it is an innate flaw of centralization.

This data, while not payment details or passwords, is still sensitive and can be used to craft targeted messages that phish us or influence our thinking. Cambridge Analytica did much the same with the data it harvested from Facebook. This incident simply shows that not much has improved since then.



Decentralized blockchains may be able to help

When Satoshi Nakamoto released the Bitcoin whitepaper 11 years ago, it solved the "double-spending" problem with a public, decentralized ledger. I think blockchain can be used to solve part of the data ownership issue that we are facing right now. Here is a nice write-up on how blockchain gives us a chance to truly own our data.



A system that lets users control who they share with, what info they share and how many times it can be shared would be much better than the current one. Blockchain is just one small step toward that future. However, we have to recognize it, visualize it and yearn for that future in order for it to become a possible reality.
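To make those three controls concrete (who, what, how many times), here is a minimal in-memory Python sketch. `DataVault`, `Grant` and the sample data are hypothetical names invented for this post. This is not a blockchain and not any existing product, just an illustration of the access rules I have in mind.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Grant:
    fields: Set[str]   # which attributes the recipient may read
    uses_left: int     # how many reads remain


@dataclass
class DataVault:
    data: Dict[str, str]
    grants: Dict[str, Grant] = field(default_factory=dict)

    def allow(self, recipient: str, fields: Set[str], uses: int) -> None:
        """Owner decides who may read which fields, and how many times."""
        self.grants[recipient] = Grant(set(fields), uses)

    def read(self, recipient: str, name: str) -> str:
        grant = self.grants.get(recipient)
        if grant is None or name not in grant.fields or grant.uses_left <= 0:
            raise PermissionError(f"{recipient} may not read {name}")
        grant.uses_left -= 1  # each successful read consumes one use
        return self.data[name]


vault = DataVault({"email": "me@example.com", "phone": "+1-555-0100"})
vault.allow("newsletter", {"email"}, uses=1)

print(vault.read("newsletter", "email"))  # allowed once

try:
    vault.read("newsletter", "email")     # second read is refused
except PermissionError as err:
    print("denied:", err)
```

Note what this sketch cannot do: once the newsletter has read my email address, nothing here stops it from keeping a copy. Enforcing limits after the data has left the vault is the hard part, and it is where I hope decentralized systems can eventually help.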

I got this email today too 👀
I guess it is hard not to have your email on the list considering the scale of the exposure 😅

I have always thought "there are no free things". On today's internet, when an advertisement promises you something free, the payment is your contact information, or other information that the company behind the "gift" can monetize or turn into a marketable product.

Indeed. It may seem that you are getting to use a free platform, but it is your data that these platforms are after.

Well, when something is free, it is because you pay with your data ... I think that old phrase about "equivalent exchange" from the old texts of Hermes Trismegistus is still valid no matter what era we humans live in. :)