I don't know about you but, in my case, when I first tried to understand what JWT is, it felt too complex, as if I would never be able to use it. I started reading about it and the more I read, the more confused I got: JWT, JWS, JWE, JWKS... it was just too much to digest.
This post is going to be a very personal one. I'm going to walk you through the thought process I followed to fully grasp the concept of JWT. When I face an apparently complex concept, what I do is break it down into its terms:
JWT stands for JSON Web Token, although to be precise we should say JSON Web Token Specification.
Let's look at each term separately and then put them all together to make sense of it.
Let's start with the Token. What is a Token?
In general, a token is a string of data that represents something else. In the context of security, a token can represent an identity. Or, in other words, a token can be used as proof of authentication or authorization. In fact, if you've played a little bit with APIs, you've probably seen that some of them require us to include a (bearer) token, which is just a string of characters with apparently no meaning. The characters themselves carry no meaning, but the token does mean something to the authorization server that generated it.
With that meaningless long string of characters we can access all the resources that are authorized and linked to that token. Just adding that token to a request header is enough to show that you're allowed to access that API resource.
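To make this concrete, here's a minimal sketch of what "adding the token to a header" looks like, using only Python's standard library. The URL and the token value are made up for illustration, and nothing is actually sent over the network; the sketch only builds the request.

```python
import urllib.request

# Hypothetical values, for illustration only.
API_URL = "https://api.example.com/orders"
ACCESS_TOKEN = "mF_9.B5f-4.1JqM"  # opaque string issued by an authorization server


def build_authorized_request(url: str, token: str) -> urllib.request.Request:
    """Return a request that carries the token in the Authorization header."""
    request = urllib.request.Request(url)
    # The "Bearer" scheme means: whoever bears this token gets the access.
    request.add_header("Authorization", f"Bearer {token}")
    return request


request = build_authorized_request(API_URL, ACCESS_TOKEN)
print(request.get_header("Authorization"))  # Bearer mF_9.B5f-4.1JqM
```

The API never learns anything from the characters themselves; it has to ask the server that issued the token (or look it up) to know what it stands for.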
Web
The context in which we use these tokens is normally web applications and APIs.
JSON
What if, instead of using a string of characters with no meaning, we used a structured piece of information that represented not only proof of identity but also some attributes of that identity?
An everyday analogy helps here:
- A badge could be just a card, with a magnetic stripe or chip, that authorizes you to enter a building.
- A passport can also be considered a token. It's something that represents your identity and allows you, for example, to enter another country.
There are two differences between them worth noting:
- First, authentication. With the badge, the only way to verify who it belongs to is to ask at reception, which is who issued the badge. But whoever has that badge (token) is authorized to enter any room of the building.
- With the passport, on the other hand, it's easy to verify identity. Only you, the passport owner, can use that passport. I cannot enter another country with your passport.
- Second, the meaning of the token. The badge is just an object that tells us nothing about its owner. The passport, however, gives us details of the identity it represents: name, last name, nationality, etc.
- There are more differences we could extract from the analogy with JWT, but let's just stick with these two for now.
As we saw in this other post, JSON has become the de facto message exchange format for APIs. That's exactly what we do with tokens: we use them to communicate between systems and applications. That means we need to choose a format for those tokens that any system can understand. That's JSON.
Depending on the security context, in some cases simple web tokens will be enough and in other cases JWT tokens are required.
Specification
All right, we've got our three elements - JSON WEB TOKEN - We want to create tokens that represent identities in a meaningful way and in JSON. Easy, why don't we try something like this:
{
"name": "Gonzalo",
"last name": "Marcos",
"nationality": "Spanish"
}
A token like that immediately raises some problems:
- Authenticity: How do we prove that I'm the owner of that token? Who's the issuer of that token and how do we trust that issuer?
- Confidentiality: If we just add this JSON in plain text to the header of our API requests, anyone who intercepts the traffic can read the token.
- Integrity: For the same reason, anyone intercepting the traffic could modify that JSON and change my identity attributes.
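The integrity problem is easy to see in a few lines. The sketch below is only an illustration, not how JWT itself works: it tampers with the plain JSON token, and then shows how a keyed hash (HMAC-SHA256, my choice for this example) lets the receiver notice the change. The secret and the helper name `tag` are assumptions made up for the example.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret known to sender and receiver only.
SHARED_SECRET = b"demo-secret-not-for-production"


def tag(message: bytes, key: bytes) -> str:
    """Return an HMAC-SHA256 tag (hex) for the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


token = json.dumps(
    {"name": "Gonzalo", "last name": "Marcos", "nationality": "Spanish"}
).encode("utf-8")
token_tag = tag(token, SHARED_SECRET)  # sent along with the token

# Someone in the middle changes one attribute in transit.
tampered = token.replace(b"Spanish", b"French")

# The tampered token is still perfectly valid JSON: nothing reveals the change.
print(json.loads(tampered)["nationality"])  # French

# The receiver recomputes the tag and compares it with the one it received.
original_ok = hmac.compare_digest(tag(token, SHARED_SECRET), token_tag)
tampered_ok = hmac.compare_digest(tag(tampered, SHARED_SECRET), token_tag)
print(original_ok, tampered_ok)  # True False
```

Notice that this only works because both sides agreed beforehand on the hash algorithm, the secret, and how the tag travels with the token.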
The problem is: how do we make sure that the sender and receiver of the tokens use the same security mechanisms to encrypt/decrypt, encode/decode, hash... the tokens?
That's what a specification solves. That's what the JWT Specification solves. The JWT specification defines how we can create a token or, more precisely, a container for a token that can be transported securely between interested parties in JSON.
The JWT specification tells us what information is required, and in what form, to build this container, so that it includes both the token and the information that describes how the token has been secured. In its most common, signed form, the specification defines a JWT token as a container made up of three strings of characters, separated by periods (.).
Something like xxxxx.yyyyy.zzzzz. Each string of characters will represent:
- A header - describes how the token has been secured: whether it has been encrypted and where to find the encryption key, or whether it has been signed and, if so, how the signature has been calculated (which key and hash algorithm are required to verify the signature)
- A payload - the values or attributes of our token (claims)
- A signature - if required, the third element is the digital signature of the previous two elements.
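Here's a minimal sketch of how those three parts can be put together, using only Python's standard library. To keep it short I'm assuming the HS256 algorithm (HMAC-SHA256 with a shared secret) instead of a public/private key pair; the function names and the secret are made up for the example, and real projects would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT parts are encoded."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_jwt(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature for the given claims (HS256 only)."""
    header = {"alg": "HS256", "typ": "JWT"}
    encoded_header = b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
    encoded_payload = b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8"))
    # The signature covers the first two parts exactly as they are transmitted.
    signing_input = f"{encoded_header}.{encoded_payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{encoded_header}.{encoded_payload}.{b64url(signature)}"


# Hypothetical secret, for illustration only.
SECRET = b"demo-secret-not-for-production"
claims = {"name": "Gonzalo", "last name": "Marcos", "nationality": "Spanish"}

token = make_jwt(claims, SECRET)
parts = token.split(".")
print(len(parts))  # 3
print(parts[0])    # eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
```

One thing worth noticing: the header and payload are only encoded, not encrypted. Anyone can decode them, so a signed token like this one gives us integrity and authenticity, but not confidentiality.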
This is just a high-level overview of the JWT specification; we'll see the details in another post. The important thing to understand is that a JWT token must be built following a set of rules. By following those rules:
- Both sender and receiver make sure that the token has been secured following a valid security standard
- Both sender and receiver make sure that they understand how to encode and decode a JWT
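To see the receiver's side of those rules, here's a minimal verification sketch in Python's standard library. As before, it assumes HS256 with a shared secret, and all the names (`verify_jwt`, `sign`, the secret) are hypothetical; it skips things a real library would also check, such as expiry claims.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that the JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(signing_input: str, secret: bytes) -> str:
    digest = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return b64url_encode(digest)


def verify_jwt(token: str, secret: bytes) -> dict:
    """Return the claims if the token is well formed and correctly signed."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated parts")
    encoded_header, encoded_payload, signature = parts
    header = json.loads(b64url_decode(encoded_header))
    # Only accept the algorithm both parties agreed on.
    if header.get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = sign(f"{encoded_header}.{encoded_payload}", secret)
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        raise ValueError("signature mismatch")
    return json.loads(b64url_decode(encoded_payload))


# Hypothetical secret and claims, for illustration only.
SECRET = b"demo-secret-not-for-production"
header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
payload = b64url_encode(json.dumps({"name": "Gonzalo", "nationality": "Spanish"}).encode("utf-8"))
signing_input = f"{header}.{payload}"
token = f"{signing_input}.{sign(signing_input, SECRET)}"

claims = verify_jwt(token, SECRET)
print(claims["name"])  # Gonzalo

# A forged token reuses the old signature with a modified payload.
forged_payload = b64url_encode(json.dumps({"name": "Gonzalo", "nationality": "French"}).encode("utf-8"))
forged = f"{header}.{forged_payload}.{token.split('.')[2]}"
try:
    verify_jwt(forged, SECRET)
    forged_accepted = True
except ValueError:
    forged_accepted = False
print(forged_accepted)  # False
```

The receiver can only do this because the header tells it which algorithm was used and because both sides follow the same encoding rules, which is exactly what the specification gives us.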