Building Trust on the Internet: How HTTPS Works
How often do you notice this friendly green lock in your web browser? Hopefully you are seeing it on just about every website you visit. In fact, there should be something similar in your URL bar right now.
This lock is very important. It indicates that your browser has used HTTPS to properly secure and authenticate your connection with a website. HTTPS has three main goals:
- Privacy: Encrypting data so that nothing in between your browser and the website can read your traffic.
- Integrity: Ensuring that the data received on either end has not been altered along the way without detection.
- Authentication: Proving that the website your browser is talking to is who they say they are.
That third goal, authentication, is surprisingly difficult to guarantee, and it is the focus of this post. We will look at why authentication is hard to achieve, and describe how websites, web browsers, and certificate authorities work together to achieve it.
Why Is Authentication a Problem?
There are a myriad of ways that you, or your browser, can be tricked into thinking you are connected to a familiar website, when in reality you are connected to an attacker trying to steal your credentials. Let’s say, for example, you are visiting Facebook.
- If your DNS is manipulated by an attacker, then your browser will connect to that attacker’s IP address thinking that it is Facebook.
- If there is an attacker in control of a router on the path between you and Facebook’s servers, the attacker may intercept your traffic, and respond to it themselves pretending to be Facebook. Since a real reply from Facebook would likely come through their router anyway, it can be very difficult to know you’ve received a forgery.
- If you see a link to Facebook and click it without realizing that it actually says, for example, “faceboook.com” (with an extra “o”), you may end up connected to an attacker who registered that slight-typo domain on purpose for malicious activities.
In any of these cases, the attacker’s web server can send you the exact same web content that the real Facebook web server would have. Your browser will render it, and it will look exactly like the Facebook you are familiar with. It is nearly impossible for you to tell, at this point, that you are not actually connected to Facebook. But when you try to log in, you are inadvertently sending your password to that attacker.
These are all impersonation attacks, and they happen every day. The look-alike domain trick in particular is a staple of “phishing.” Losing your Facebook account is one thing, but just imagine trying to recover from a stolen email or online banking account. When using such sensitive services online, you need to be certain the websites you connect to are who they say they are. (And this is another good reason not to reuse your passwords across multiple services!)
Before we dive into the mechanics of how HTTPS achieves authentication, we need to understand a couple of topics from cryptography. If you are already familiar with the ideas of public and private keys or digital signatures, then feel free to skip this section.
Public and Private Keys
Public and private keys are widely used cryptographic tools. They are designed to help you keep data private, and secure communications on the Internet. It is possible to generate a “key pair” composed of a “private key” and a “public key.” There are several algorithms for doing so; RSA is a popular one. When you generate a key pair, it is absolutely critical that you keep your private key to yourself (hence the name). Your public key, on the other hand, is meant to be distributed publicly.
Someone with your public key can use it to encrypt some data they want to send to you. Once encrypted, the only way to decrypt that data is using your private key. This way, someone can send you data over the Internet without worrying if anyone else is listening to the communications, because only you (or rather, the owner of your private key) can decrypt it.
The mathematics that enables key pairs to work is complicated, and beyond the scope of this post. But if you are interested, this video has a very good layman’s introduction to the concepts.
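For a feel of the arithmetic without the full theory, here is a minimal sketch of RSA-style encryption in Python. It assumes tiny textbook primes (61 and 53) and leaves out the padding that real RSA requires, so treat it as an illustration only. The names `encrypt` and `decrypt` are made up for this example.

```python
# Toy RSA with tiny textbook primes. Illustration only: real keys use
# primes hundreds of digits long, plus padding that is omitted here.
p, q = 61, 53                 # secret primes (assumed values for the demo)
n = p * q                     # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                        # public exponent
d = pow(e, -1, phi)           # private exponent: inverse of e mod phi (Python 3.8+)

public_key = (e, n)           # safe to hand out to anyone
private_key = (d, n)          # must never leave its owner


def encrypt(message: int, key: tuple) -> int:
    """Encrypt a number smaller than the modulus with the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: tuple) -> int:
    """Recover the original number with the matching private key."""
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


message = 65
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
print(ciphertext != message, recovered == message)
```

Anyone holding `public_key` can produce `ciphertext`, but turning it back into `message` requires `d`, which only the key’s owner knows.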
Using the private key from your key pair, it is possible to digitally sign some data. This works much in the way that a real-world pen and paper signature would, but is much more robust. Without your private key, forging a digital signature is computationally infeasible, which cannot be said of a pen and paper signature.
Once you sign some data with your private key, then anyone can verify the signature using your public key. If the recipient of your signed data successfully verifies the signature using your public key, they have two useful guarantees:
- Authenticity: The recipient is certain that the data came from the owner of the associated private key.
- Integrity: The recipient is certain that the data was not modified by a third party along the way.
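Both guarantees can be seen in a small Python sketch of signing and verifying. It assumes a toy RSA key built from two Mersenne primes (easy to write down, but unsuitable for real keys) and a simplified hash-then-sign scheme with no padding. The function names are invented for this example.

```python
import hashlib

# Toy RSA key pair. Mersenne primes keep the demo short; never use them for real keys.
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (Python 3.8+)


def digest(data: bytes) -> int:
    """SHA-256 of the data, reduced so it fits under the modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data: bytes, private_exponent: int) -> int:
    """Only the holder of the private exponent can compute this value."""
    return pow(digest(data), private_exponent, n)


def verify(data: bytes, signature: int, public_exponent: int) -> bool:
    """Anyone with the public exponent can check the signature."""
    return pow(signature, public_exponent, n) == digest(data)


document = b"I owe Alice 10 dollars"
tampered = b"I owe Alice 1000 dollars"
signature = sign(document, d)

print(verify(document, signature, e))   # the untouched document
print(verify(tampered, signature, e))   # the same signature on altered data
```

The signature checks out only for the exact bytes that were signed, which is where the integrity guarantee comes from.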
For more information on key pairs and digital signatures, Docusign has a solid FAQ written about them.
Now let’s look at how these two concepts tie into HTTPS.
HTTPS, Certificates, and Certificate Authorities
Continuing to use Facebook as an example, in order to ensure that we’re connected to the real Facebook, we need a way for that website to verify its identity in a way that no other website possibly could. If we can do this, then we’ll know if we have been tricked into connecting to an attacker’s website instead of the real one.
Luckily, using the cryptographic tools we talked about in the previous section, this can be done! Facebook needs a public key and a private key. They share the public key with everyone, and keep the private key secured on their official servers. Now, if you use Facebook’s public key to encrypt your web traffic, then you know only Facebook’s private key can decrypt it. If the web server you connect to can properly decrypt and respond to your traffic, then you can be confident that they have Facebook’s private key.
As long as Facebook (and every other site) takes great care to ensure that no one else gains access to their private key, we have solved the authentication problem! This is pretty great. Despite the limitations imposed by the nature of the Internet, we have devised a way to verify the identity of a website. In fact, the process we’ve just described is similar to what HTTPS really does. When we talk about a server’s “HTTPS Certificate,” we are mostly talking about a public key, packaged with some information about who it belongs to.
However, there is a major problem with the basic plan above, which HTTPS must take care of. That is: how can you get a public key for every website you want to use? Sure, if you have Facebook’s public key you can communicate with them securely. But if you also want to communicate securely with Google, you need Google’s public key. To communicate securely with Twitter, you need Twitter’s public key.
There are over a billion websites on the Internet, and the list is constantly changing. So it’s not possible to keep a list of public keys for every single website on your personal computer. Now, instead of keeping a copy of Facebook’s public key handy at all times, you could just ask them to send over their public key when you first connect. But then we’re back to square one of this problem: if you unknowingly connect to a malicious third party, they can just lie and give you their own public key.
At this point you may be frustrated. After all this business with the public keys and private keys, we’re actually back to exactly where we started: we need a way for Facebook to prove their identity before we can trust their public key, so what really was the point of the public key?
Certificates and Certificate Authorities
The problem is that no website can personally prove to every single visitor that its public key really belongs to it. That just doesn’t scale to the size of today’s Internet. To solve this problem, we use Certificate Authorities, or CAs.
For a CA to work properly, we all have to trust them not to lie to us. This may seem like a lot to ask, but in practice, it works pretty well. CAs rely on trust, so if they do something questionable, browsers can stop trusting them and they can go out of business. Even simple mistakes on the part of a CA can be enough to seriously damage its reputation.
Now, how does a CA solve our verification problem? A CA offers website owners the service of verifying their identities, often for a fee. DigiCert is a well-known CA, and they are the CA for Facebook, so we will use them as an example. Facebook wants to prove that their public key really is theirs, so people can communicate securely with their site. Facebook pays DigiCert to verify that they are actually Facebook. Once done, DigiCert knows which public key actually belongs to Facebook.
Now when you connect to Facebook, you don’t need Facebook to prove their identity to you. You need Facebook to prove that DigiCert already verified their identity. DigiCert does this for millions of websites. But you don’t need to figure out how to trust each of those websites; you just need to trust DigiCert. If you trust DigiCert, and DigiCert trusts Facebook, then in theory you can trust Facebook. We call this a chain of trust.
At a high level, that’s how it all works. But there are two final details we need to nail down before the description is complete.
- How do you know you’re dealing with DigiCert, and not someone impersonating DigiCert?
Here, it seems like we’re back to square one of this entire problem again. If we have DigiCert’s public key, then we can verify their identity. But we need to make sure, somehow, that we have DigiCert’s public key and not someone else’s. The solution here? The operating system or browser you’re using shipped with DigiCert’s public key already stored on it. There are quite a lot of CAs that we trust today, but they are well-known enough that modern operating systems will have them all. They’re stored in your computer’s “Trusted Root Certificate Authority Store.” It may seem like a surprising number of keys to keep, but this is much easier than storing one public key for every website on the Internet.
- How does Facebook prove to you that DigiCert verified them?
Once DigiCert has verified that Facebook is Facebook, they take Facebook’s public key and digitally sign it. This interaction happens purely between Facebook and DigiCert, long before you connect to Facebook. Now when you do connect to Facebook, Facebook sends you a public key that has been signed by DigiCert’s private key. You can verify that Facebook’s public key was signed by DigiCert using the public key shipped with your operating system.
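If you are curious which CAs your own machine trusts, the root store mentioned above can be inspected from Python’s standard `ssl` module. This is a rough sketch: what it finds depends on the operating system and on how Python was built, and on some systems the list comes back empty because certificates are only loaded when they are needed. The helper name `trusted_root_names` is made up for this example.

```python
import ssl


def trusted_root_names() -> list:
    """Return organization names of the CA certificates Python loaded by default."""
    context = ssl.create_default_context()   # loads the platform's default trusted CAs
    names = set()
    for cert in context.get_ca_certs():
        # "subject" is a tuple of groups, each holding (field, value) pairs.
        for group in cert.get("subject", ()):
            for field, value in group:
                if field == "organizationName":
                    names.add(value)
    return sorted(names)


roots = trusted_root_names()
print(len(roots), "trusted CA organizations found")
```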
Bringing it All Together
Now we’ve talked about every step involved in HTTPS authenticating the websites you visit. Let’s put it all together, step by step, continuing to use Facebook as our example website and DigiCert as our example CA.
- Facebook requests a certificate from DigiCert.
- DigiCert verifies that they are really talking to Facebook.
- Facebook sends DigiCert their public key.
- DigiCert uses their private key to digitally sign Facebook’s public key.
- DigiCert gives Facebook the signed public key. This is now Facebook’s SSL Certificate.
- You connect to Facebook’s website.
- Facebook sends you their SSL Certificate.
- Using the DigiCert public key in your root certificate store, you verify DigiCert’s signature on Facebook’s SSL Certificate.
- You generate a secret key, and use Facebook’s public key from their certificate to encrypt it.
- You send the encrypted secret key to Facebook.
- Facebook decrypts it with their private key, and holds on to it.
- You and Facebook use the shared secret key to encrypt your web traffic.
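The steps above can be sketched end to end in Python. This is a simulation under loud assumptions, not how a real browser is written: the keys are toy RSA pairs built from Mersenne primes, the certificate format is invented, and every function name is hypothetical. It also follows the simplified flow described in this post; modern TLS versions usually agree on the shared secret with a Diffie-Hellman exchange rather than encrypting it to the server’s public key.

```python
import hashlib
import secrets


def make_keypair(p: int, q: int, e: int = 65537):
    """Build a toy RSA key pair from two primes: returns (public, private)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))   # modular inverse, Python 3.8+
    return (e, n), (d, n)


def digest(data: bytes, n: int) -> int:
    """SHA-256 of the data, reduced so it fits under the modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data: bytes, private: tuple) -> int:
    d, n = private
    return pow(digest(data, n), d, n)


def verify(data: bytes, signature: int, public: tuple) -> bool:
    e, n = public
    return pow(signature, e, n) == digest(data, n)


def cert_body(subject: str, public: tuple) -> bytes:
    """Serialize the fields the CA signs (a made-up format for this sketch)."""
    return f"{subject}|{public[0]}|{public[1]}".encode()


# Mersenne primes are easy to write down but unsuitable for real keys.
ca_public, ca_private = make_keypair(2**61 - 1, 2**89 - 1)
site_public, site_private = make_keypair(2**107 - 1, 2**127 - 1)

# The CA signs the site's public key, producing the certificate.
certificate = {
    "subject": "facebook.com",
    "issuer": "DigiCert",
    "public_key": site_public,
    "signature": sign(cert_body("facebook.com", site_public), ca_private),
}

# The browser ships with the CA's public key in its root store.
root_store = {"DigiCert": ca_public}


def browser_trusts(cert: dict) -> bool:
    """Check the CA's signature on the certificate using the root store."""
    issuer_key = root_store.get(cert["issuer"])
    if issuer_key is None:
        return False
    body = cert_body(cert["subject"], cert["public_key"])
    return verify(body, cert["signature"], issuer_key)


# The browser encrypts a fresh secret with the public key from the certificate.
e, n = certificate["public_key"]
browser_secret = secrets.randbelow(2**128)
encrypted_secret = pow(browser_secret, e, n)

# The site recovers the secret with its private key.
d, _ = site_private
site_secret = pow(encrypted_secret, d, n)

# An impostor who swaps in their own key cannot produce the CA's signature.
fake_public, fake_private = make_keypair(2**89 - 1, 2**107 - 1)
forged = dict(
    certificate,
    public_key=fake_public,
    signature=sign(cert_body("facebook.com", fake_public), fake_private),
)

print(browser_trusts(certificate), site_secret == browser_secret, browser_trusts(forged))
```

The forged certificate fails because its signature was made with the impostor’s private key, and the browser only checks signatures against keys already in its root store.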
And now you’re done! We have achieved robust authentication. So long as DigiCert remains trustworthy, you can be confident that sites with their certificates are who they say they are.
HTTPS and Strongarm
Have you ever seen a warning like this in your browser?
This one is specific to Chrome, but others will look similar. Chances are you’ve seen this warning in the past, but what exactly does it mean in the context of HTTPS? The answer is right there in the error code, NET::ERR_CERT_AUTHORITY_INVALID. Your browser is saying “hey, you connected to test.com and they sent us a certificate, but it wasn’t signed by any of the certificate authorities you trust.” Or in other words, none of the CA public keys in your computer’s root certificate authority store were able to verify the signature. Therefore, you can’t be certain that this site is who they say they are.
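The same check can be reproduced outside the browser. Here is a minimal sketch using Python’s standard `ssl` and `socket` modules; the function name `check_certificate` and its return strings are made up for this example. By default, `ssl.create_default_context()` requires a certificate that chains to a trusted CA and matches the hostname, and raises `ssl.SSLCertVerificationError` when it does not.

```python
import socket
import ssl


def check_certificate(hostname: str, port: int = 443, timeout: float = 5.0) -> str:
    """Attempt a TLS handshake and report whether the certificate was accepted."""
    context = ssl.create_default_context()   # verifies the chain and the hostname
    try:
        with socket.create_connection((hostname, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=hostname):
                return "trusted"
    except ssl.SSLCertVerificationError as err:
        # No trusted CA vouches for this certificate, or the name does not match.
        return f"untrusted: {err.verify_message}"
    except OSError as err:
        # DNS failures, refused connections, timeouts, and other TLS errors.
        return f"could not connect: {err}"
```

Calling `check_certificate` with the name of a site that presents a self-signed certificate should come back as untrusted, which is the situation Chrome is reporting in the warning above.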
HTTPS Warnings with Strongarm
Here at Strongarm, we often get questions about HTTPS warnings. Let’s say, for example, that baddomain.test is a malicious/blocked domain. If you’re protected by Strongarm and you attempt to visit https://baddomain.test in your browser, your computer will communicate with Strongarm’s blackhole instead of actually communicating with the malicious/blocked domain. This is good, since it means Strongarm is protecting you. But when your computer connects to Strongarm and receives our HTTPS certificate, your browser will show you that warning. Strongarm doesn’t have a certificate for baddomain.test, let alone one that has been signed by a well-known certificate authority. We don’t own baddomain.test, so if we did have a proper certificate for it, then the CA who issued that certificate would have made a major mistake.
This is of course a bit of a usability concern. If your network is protected by Strongarm, your users may begin to see those SSL warnings when they attempt to visit a site you have blocked. If they choose to ignore the warning and proceed, they will see a page from Strongarm explaining that the site has been blocked. But typically you hope your users aren’t ignoring these types of warnings, and either way there is bound to be some confusion. We’ve decided that this is a reasonable trade-off: your network is protected, and these warnings are an important part of security on the Internet!
There are steps we could take to stop Strongarm customers from seeing SSL errors. Many competitors do this. But we believe that is a bad security practice with more risks than benefits. If you’re interested in learning more, we’ve actually written about this before.
HTTPS: A Key Aspect of Internet Safety
I hope you were able to learn something here about how HTTPS achieves authentication, and how important it is to security. It is an impressive solution that websites use to solve a very difficult problem. Your computer probably works through the HTTPS authentication process thousands of times each day, without you noticing or even thinking about it. Without it, securely using many of the online services we take for granted would not be possible. So it’s a great process to be aware of!