A massive data breach dubbed db8151dd has exposed the records of 22M people – including addresses, phone numbers, and social media links. But the source of the data is a mystery …

I got an email alert this morning from the haveibeenpwned.com site telling me that my details were included. The exposed data appears extensive.

However, Troy Hunt, who runs the site, said that nobody has been able to identify where the information came from.

That ‘interesting’ data appears to come from customer relationship management (CRM) systems, including things like:

Back in Feb, Dehashed reached out to me with a massive trove of data that had been left exposed on a major cloud provider via a publicly accessible Elasticsearch instance. It contained 103,150,616 rows in total […]

The global unique identifier beginning with “db8151dd” features heavily on these first lines hence the name I’ve given the breach. I’ve had to give it this name because frankly, I’ve absolutely no idea where it came from, nor does anyone else I’ve worked on with this […]

It’s mostly scrapable data from public sources, albeit with some key differences. Firstly, my phone number is not usually exposed and that was in there in full. Yes, there are many places that (obviously) have it, but this isn’t a scrape from, say, a public LinkedIn page. Next, my record was immediately next to someone else I’ve interacted with in the past as though the data source understood the association. I found that highly unusual as it wasn’t someone I’d expect to see a strong association with and I couldn’t see any other similar folks. But it’s the next class of data in there which makes this particularly interesting.

Best guess is it’s some kind of aggregated data from a number of sources, but as neither Hunt nor other information security professionals have been able to identify any of them despite attempts lasting almost three months, it appears the details of the privacy breach may remain a mystery.