What is a “breach” and where has the data come from?
A “breach” is an incident where data is inadvertently exposed in a vulnerable system, usually due to insufficient access controls or security weaknesses in the software.
The website https://haveibeenpwned.com/ aggregates breaches and enables people to assess where their personal data has been exposed.
Are user passwords stored in this site?
When email addresses from a data breach are loaded into the site, no corresponding passwords are loaded with them. Separately from the pwned address search feature, the Pwned Passwords service allows you to check if an individual password has previously been seen in a data breach. No password is stored next to any personally identifiable data (such as an email address) and every password is SHA-1 hashed (read why SHA-1 was chosen in the Pwned Passwords launch blog post).
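To make the hashing step concrete, here is a minimal sketch of computing a SHA-1 digest of a password in Python. It only illustrates what "SHA-1 hashed" means; the UTF-8 encoding, the uppercase hex output and the function name `sha1_hex` are assumptions made for this example, not a description of how the service itself is implemented.

```python
import hashlib


def sha1_hex(password: str) -> str:
    """Return the SHA-1 digest of a password as an uppercase hex string.

    Assumption: the password is encoded as UTF-8 before hashing.
    """
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


if __name__ == "__main__":
    # A SHA-1 digest is 160 bits, so the hex form is always 40 characters.
    print(sha1_hex("password"))
```

The digest identifies the password without the hash store needing to hold the plain text or anything about who used it.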
Can I send users their exposed passwords?
No. Any ability to send passwords to people puts both them and myself at greater risk. This topic is discussed at length in the blog post on all the reasons I don’t make passwords available via this service.
Is a list of everyone’s email address or username available?
The public search facility cannot return anything other than the results for a single user-provided email address or username at a time. Multiple breached accounts can be retrieved by the domain search feature but only after successfully verifying that the person performing the search is authorised to access assets on the domain.
What about breaches where passwords aren’t leaked?
Occasionally, a breach will be added to the system which doesn’t include credentials for an online service. This may occur when data about individuals is leaked without an accompanying username and password. However, this data still has a privacy impact; it is data that those impacted would not reasonably expect to be publicly released and as such they have a vested interest in being notified of it.
How is a breach verified as legitimate?
There are often “breaches” announced by attackers which in turn are exposed as hoaxes. There is a balance between making data searchable early and performing sufficient due diligence to establish the legitimacy of the breach. The following activities are usually performed in order to validate breach legitimacy:
- Has the impacted service publicly acknowledged the breach?
- Does the data in the breach turn up in a Google search (i.e. it’s just copied from another source)?
- Is the structure of the data consistent with what you’d expect to see in a breach?
- Have the attackers provided sufficient evidence to demonstrate the attack vector?
- Do the attackers have a track record of either reliably releasing breaches or falsifying them?
What is a “paste” and why include it on this site?
A “paste” is information that has been “pasted” to a publicly facing website designed to share content such as Pastebin. These services are favoured by hackers due to the ease of anonymously sharing information and they’re frequently the first place a breach appears.
HIBP searches through pastes that are broadcast by the @dumpmon Twitter account and reported as having emails that are a potential indicator of a breach. Finding an email address in a paste does not immediately mean it has been disclosed as the result of a breach. Review the paste, determine whether your account has been compromised, then take appropriate action such as changing passwords.
My email was reported as appearing in a paste but the paste now can’t be found
Pastes are often transient; they appear briefly and are then removed. HIBP usually indexes a new paste within 40 seconds of it appearing and stores the email addresses that appeared in the paste along with some metadata such as the date, title and author (if they exist). The paste itself is not stored and cannot be displayed if it no longer exists at the source.
My email was not found — does that mean I haven’t been pwned?
Whilst HIBP is kept up to date with as much data as possible, it contains but a small subset of all the records that have been breached over the years. Many breaches never result in the public release of data and indeed many breaches even go entirely undetected. “Absence of evidence is not evidence of absence” or in other words, just because your email address wasn’t found here doesn’t mean that it hasn’t been compromised in another breach.
How does HIBP handle “plus aliasing” in email addresses?
Some people choose to create accounts using a pattern known as “plus aliasing” in their email addresses. This allows them to express their email address with an additional piece of data in the alias, usually reflecting the site they’ve signed up to such as test+netflix@example.com or test+amazon@example.com. There is presently a UserVoice suggestion requesting support of this pattern in HIBP. However, as explained in that suggestion, usage of plus aliasing is extremely rare, appearing in approximately only 0.03% of addresses loaded into HIBP. Vote for the suggestion and follow its progress if this feature is important to you.
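To illustrate the pattern itself, the sketch below splits a plus-aliased address into its base address and the alias tag. It is not HIBP behaviour (as noted above, HIBP does not currently support this pattern); the function name `split_plus_alias` and the simplistic parsing are assumptions made for this example.

```python
from typing import Optional, Tuple


def split_plus_alias(address: str) -> Tuple[str, Optional[str]]:
    """Split an address into (base address, alias tag).

    The tag is None when the local part contains no "+".
    Assumption: the address has the simple form local@domain.
    """
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        raise ValueError(f"not a simple email address: {address!r}")
    # Everything after the first "+" in the local part is the alias tag.
    base, plus, tag = local.partition("+")
    return f"{base}@{domain}", (tag if plus else None)


if __name__ == "__main__":
    print(split_plus_alias("test+netflix@example.com"))
    print(split_plus_alias("test@example.com"))
```

Both example addresses from the text reduce to the same base address, which is what support for the pattern would need to recognise.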
How is the data stored?
The breached accounts sit in Windows Azure table storage which contains nothing more than the email address or username and a list of the breached sites it appeared on. If you’re interested in the details, it’s all described in Working with 154 million records on Azure Table Storage – the story of Have I Been Pwned.
Is anything logged when people search for an account?
Nothing is explicitly logged by the website. The only logging of any kind is via Google Analytics, Application Insights performance monitoring and any diagnostic data implicitly collected if an exception occurs in the system.
Why do I see my username as breached on a service I never signed up to?
When you search for a username that is not an email address, you may see that name appear against breaches of sites you never signed up to. Usually this is simply due to someone else electing to use the same username as you usually do. Even when your username appears very unique, the simple fact that there are several billion internet users worldwide means there’s a strong probability that most usernames have been used by other individuals at one time or another.
Why do I see my email address as breached on a service I never signed up to?
When you search for an email address, you may see that address appear against breaches of sites you don’t recall ever signing up to. There are many possible reasons for this including your data having been acquired by another service, the service rebranding itself as something else or someone else signing you up. For a more comprehensive overview, see Why am I in a data breach for a site I never signed up to?
Can I receive notifications for an email address I don’t have access to?
No. For privacy reasons, all notifications are sent to the address being monitored so you can’t monitor someone else’s address nor can you monitor an address you no longer have access to. You can always perform an on-demand search of an address, but sensitive breaches will not be returned.
Does the notification service store email addresses?
Yes, it has to in order to track who to contact should they be caught up in a subsequent data breach. Only the email address, the date they subscribed on and a random token for verification are stored.
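As a rough picture of how little that is, the sketch below models a subscription holding only those three values. The class and field names, the token length and the use of Python’s `secrets` module are all hypothetical choices for illustration; the actual storage format is not described here.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Subscription:
    """Hypothetical record holding the three stored values."""

    email: str
    subscribed_on: datetime
    token: str


def new_subscription(email: str) -> Subscription:
    """Create a record with the current UTC time and a random token."""
    return Subscription(
        email=email,
        subscribed_on=datetime.now(timezone.utc),
        # Assumption: 32 random bytes, URL-safe so it can sit in a link.
        token=secrets.token_urlsafe(32),
    )


if __name__ == "__main__":
    print(new_subscription("test@example.com"))
```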
What email address are notifications sent from?
All emails sent by HIBP come from noreply@haveibeenpwned.com. If you’re expecting an email (for example, the verification email sent when signing up for notifications) and it doesn’t arrive, try white-listing that address. 99.x% of the time that an email doesn’t arrive in someone’s inbox, it’s due to the destination mail server bouncing it.
How do I know the site isn’t just harvesting searched email addresses?
You don’t, but it’s not. The site is simply intended to be a free service for people to assess risk in relation to their account being caught up in a breach. As with any website, if you’re concerned about the intent or security, don’t use it.
Is it possible to “deep link” directly to the search for an account?
Sure, you can construct a link so that the search for a particular account happens automatically when it’s loaded. Just pass the name after the “account” path. Here’s an example:
https://haveibeenpwned.com/account/test@example.com
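If you are generating these links programmatically, a minimal sketch in Python might look like the following. The function name `deep_link` is made up for this example, and percent-encoding everything except “@” is an assumption chosen so that the output matches the example link above while still being safe for usernames containing spaces or slashes.

```python
from urllib.parse import quote

BASE_URL = "https://haveibeenpwned.com/account/"


def deep_link(account: str) -> str:
    """Build a deep link to the search for an email address or username.

    Assumption: "@" is left as-is (matching the example link) and any
    other reserved character is percent-encoded.
    """
    return BASE_URL + quote(account, safe="@")


if __name__ == "__main__":
    print(deep_link("test@example.com"))
    print(deep_link("john smith"))
```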
How can I submit a data breach?
If you’ve come across a data breach which you’d like to submit, get in touch with me. Check out what’s currently loaded into HIBP on the pwned websites page first if you’re not sure whether the breach is already in the system.
What is a “sensitive breach”?
HIBP enables you to discover if your account was exposed in most of the data breaches by directly searching the system. However, certain breaches are particularly sensitive in that someone’s presence in the breach may adversely impact them if others are able to find that they were a member of the site. These breaches are classed as “sensitive” and may not be publicly searched.
A sensitive data breach can only be searched by the verified owner of the email address being searched for. This is done via the notification system which involves sending a verification email to the address with a unique link. When that link is followed, the owner of the address will see all data breaches and pastes they appear in, including the sensitive ones.
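The rule described above can be sketched as a simple filter. This is an illustration only: the `Breach` class, the `is_sensitive` flag and the `owner_verified` parameter are hypothetical names for this example, and “Example Forum” is a made-up non-sensitive breach.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Breach:
    """Hypothetical breach record with a sensitivity flag."""

    name: str
    is_sensitive: bool


def visible_breaches(breaches: List[Breach], owner_verified: bool) -> List[Breach]:
    """Return the breaches a searcher is allowed to see.

    Sensitive breaches are only included when the searcher has been
    verified as the owner of the address (for example, by following
    the unique link in the verification email).
    """
    return [b for b in breaches if owner_verified or not b.is_sensitive]


if __name__ == "__main__":
    found = [Breach("Example Forum", False), Breach("Ashley Madison", True)]
    print([b.name for b in visible_breaches(found, owner_verified=False)])
    print([b.name for b in visible_breaches(found, owner_verified=True)])
```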
There are presently 25 sensitive breaches in the system including Adult Friend Finder, Adult-FanFiction.Org, Ashley Madison, Beautiful People, Bestialitysextaboo, Brazzers, CrimeAgency vBulletin Hacks, Fling, Florida Virtual School, Freedom Hosting II, Fridae, Fur Affinity, HongFire, HTH Studios, Mate1.com, Muslim Match, NapsGear, Naughty America, Non Nude Girls, Rosebutt Board and 5 more.
What is a “retired breach”?
After a security incident which results in the disclosure of account data, the breach may be loaded into HIBP where it then sends notifications to impacted subscribers and becomes searchable. In very rare circumstances, that breach may later be permanently removed from HIBP where it is then classed as a “retired breach”.
A retired breach is typically one where the data does not appear in other locations on the web, that is, it’s not being traded or redistributed. Deleting it from HIBP provides those impacted with assurance that their data can no longer be found in any remaining locations. For more background, read Have I Been Pwned, opting out, VTech and general privacy things.
There is presently 1 retired breach in the system which is VTech.
What is an “unverified” breach?
Some breaches may be flagged as “unverified”. In these cases, whilst there is legitimate data within the alleged breach, it may not have been possible to establish legitimacy beyond reasonable doubt. Unverified breaches are still included in the system because regardless of their legitimacy, they still contain personal information about individuals who want to understand their exposure on the web. Further background on unverified breaches can be found in the blog post titled Introducing unverified breaches to Have I Been Pwned.
What is a “fabricated” breach?
Some breaches may be flagged as “fabricated”. In these cases, it is highly unlikely that the breach contains legitimate data sourced from the alleged site but it may still be sold or traded under the auspices of legitimacy. Often these incidents are composed of data aggregated from other locations (or may be entirely fabricated), yet still contain actual email addresses unbeknownst to the account holder. Fabricated breaches are still included in the system because regardless of their legitimacy, they still contain personal information about individuals who want to understand their exposure on the web. Further background on fabricated breaches can be found in the blog post titled Introducing “fabricated” breaches to Have I Been Pwned.
What is a “spam list”?
Occasionally, large volumes of personal data are found being utilised for the purposes of sending targeted spam. This often includes many of the same attributes frequently found in data breaches such as names, addresses, phone numbers and dates of birth. The lists are often aggregated from multiple sources, frequently by eliciting personal information from people with the promise of a monetary reward. Whilst the data may not have been sourced from a breached system, the personal nature of the information and the fact that it’s redistributed in this fashion unbeknownst to the owners warrants inclusion here. Read more about spam lists in HIBP.
What does it mean if my password is in Pwned Passwords?
If a password is found in the Pwned Passwords service, it means it has previously appeared in a data breach. HIBP does not store any information about who the password belonged to, only that it has previously been exposed publicly and how many times it has been seen. A Pwned Password should no longer be used as its exposure puts it at higher risk of being used to log in to accounts using the now-exposed secret.
It’s a bit light on detail here, where can I get more info?
The design and build of this project has been extensively documented on troyhunt.com under the Have I Been Pwned tag. These blog posts explain much of the reasoning behind the various features and how they’ve been implemented on Microsoft’s Windows Azure cloud platform.