Implementing Authorization on mobile can be tricky. Here are some recommendations to avoid common issues.
Originally published on the Just Eat Engineering Blog.
Overview
Modern mobile apps are more complicated than they used to be back in the early days and developers have to face a variety of interesting problems. While we've put in our two cents on some of them in previous articles, this one is about authorization and what we have learned by handling JWT on mobile at Just Eat.
When it comes to authorization, it's standard practice to rely on OAuth 2.0 and the companion JWT (JSON Web Token). We found this important topic was rarely discussed online while much attention was given to new proposed implementations of network stacks, maybe using recent language features or frameworks such as Combine.
We'll illustrate the problems we faced at Just Eat for JWT parsing, usage, and (most importantly) refreshing. You should be able to learn a few things on how to make your app more stable by reducing the chance of unauthorized requests, allowing your users to virtually always stay logged in.
What is JWT
JWT stands for JSON Web Token and is an open industry standard used to represent claims transferred between two parties. A signed JWT is known as a JWS (JSON Web Signature). In fact, a JWT has either to be JWS or JWE (JSON Web Encryption). RFC 7515, RFC 7516, and RFC 7519 describe the various fields and claims in detail. What is relevant for mobile developers is the following:
- A JWT is composed of 3 dot-separated parts: Header, Payload, Signature.
- The Payload is the only part relevant to the client. The Header identifies which algorithm is used to generate the signature, and there are reasons for not verifying the signature client-side, which makes the Signature part irrelevant too.
- A JWT has an expiration date. Expired tokens should be renewed/refreshed.
- A JWT can contain any amount of extra information specific to your service.
- It's common practice to store JWTs in the app keychain.
Here is a valid and very short token example, courtesy of jwt.io, which we recommend using to easily decode tokens for debugging purposes. It shows 3 fragments (base64url encoded) joined by dots.
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyLCJleHAiOjE1Nzc3NTA0MDB9.7hgBhNK_ZpiteB3GtLh07KJ486Vfe3WAdS-XoDksJCQ
The only field relevant to this document is exp (Expiration Time), part of the Payload (the second fragment). This claim identifies the time after which the JWT must not be accepted: to accept a JWT, the current date/time must be before the expiration time listed in the exp claim. It's accepted practice for implementers to allow for some small leeway, usually no more than a few minutes, to account for clock skew.
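As a minimal sketch of the check described above, the function below treats a token as usable only while the current time plus a leeway is still before exp. The name isTokenUsable and the 60-second default are assumptions made for illustration. Applying the leeway in the conservative direction (refreshing slightly early rather than slightly late) is also our assumption about what suits a client.

```swift
import Foundation

/// Returns true while the token can still be sent to the server.
/// `leeway` makes the client consider the token expired slightly early,
/// which absorbs small differences between client and server clocks.
func isTokenUsable(expiry: Date, now: Date = Date(), leeway: TimeInterval = 60) -> Bool {
    return now.addingTimeInterval(leeway) < expiry
}

// 'exp' of the sample token above, expressed as a Date.
let expiry = Date(timeIntervalSince1970: 1_577_750_400)
let fiveMinutesBefore = expiry.addingTimeInterval(-300)
let thirtySecondsBefore = expiry.addingTimeInterval(-30)

let usableEarly = isTokenUsable(expiry: expiry, now: fiveMinutesBefore)   // well before exp
let usableLate = isTokenUsable(expiry: expiry, now: thirtySecondsBefore)  // inside the leeway window
```

Passing `now` as a parameter keeps the check easy to unit test without touching the system clock.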
N.B. Some API calls might demand the user is logged in (user-authenticated calls), and others don't (non-user-authenticated calls). JWT can be used in both cases, marking a distinction between Client JWT and User JWT we will refer to later on.
The token refresh problem
By far the most significant problem we had in the past was the renewal of the token. This seems to be something taken for granted by the mobile community, but in reality, we found it to be quite a fragile part of the authentication flow. If not done right, it can easily cause your customers to end up being logged out, with the consequent frustration we all have experienced as app users.
The Just Eat app makes multiple API calls at startup: it fetches the order history to check for in-flight orders, fetches the most up-to-date consumer details, etc. If the token is expired when the user runs the app, a nasty race condition could cause the same refresh token to be used twice, causing the server to respond with a 401 and subsequently logging the user out on the app. This can also happen during normal execution when multiple API calls are performed very close to each other and the token expires prior to those.
It gets trickier if the client and server clocks are significantly out of sync: the client might believe it holds a valid token when the token has in fact already expired.
The following diagram should clarify the scenario.
Common misbehavior
I couldn't find a company (regardless of size) or indie developer who had implemented a reasonable token refresh mechanism. The common approach seems to be to refresh the token whenever an API call fails with 401 Unauthorized. This not only causes an extra call that could be avoided by checking locally whether the token has expired, but also opens the door to the race condition illustrated above.
Avoid race conditions when refreshing the token 🚦
We'll explain the solution with some technical details and code snippets, but what's more important is that the reader understands the root problem we are solving and why it should be given the proper attention.
The more we thought about it, the more we convinced ourselves that the best way to shield ourselves from race conditions is by using threading primitives when scheduling async requests to fetch a valid token. This means that all the calls are regulated via a filter that holds off subsequent calls until a valid token is retrieved, either from local storage or, if a refresh is needed, from the remote OAuth server.
We'll show examples for iOS, so we've chosen dispatch queues and semaphores (using GCD); fancier and more abstract ways of implementing the solution might exist - in particular by leveraging modern FRP techniques - but ultimately the same primitives are used.
For simplicity, let's assume that only user-authenticated API requests need to provide a JWT, commonly put in the Authorization header:
Authorization: Bearer <jwt-token>
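For illustration, here is a minimal sketch of attaching that header to a URLRequest. The helper name authorizedRequest and the api.example.com URL are hypothetical and not part of the Just Eat network client.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLRequest lives in this module on Linux
#endif

/// Builds a request carrying the JWT in the Authorization header.
/// Hypothetical helper, shown only to illustrate the header format.
func authorizedRequest(url: URL, bearerToken: String) -> URLRequest {
    var request = URLRequest(url: url)
    request.setValue("Bearer \(bearerToken)", forHTTPHeaderField: "Authorization")
    return request
}

let exampleRequest = authorizedRequest(
    url: URL(string: "https://api.example.com/orders")!, // placeholder URL
    bearerToken: "<jwt-token>"
)
```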
The code below implements the "Get valid JWT" box from the following flowchart. The logic within this section must run in mutual exclusion, which our solution achieves by combining a serial queue and a semaphore.
Here is just the minimum amount of code (Swift) needed to explain the solution.
import Foundation

typealias Token = String
typealias AuthorizationValue = String

struct UserAuthenticationInfo {
    let bearerToken: Token // the JWT
    let refreshToken: Token
    let expiryDate: Date // computed on creation from 'exp' claim

    var isValid: Bool {
        return expiryDate > Date()
    }
}

protocol TokenRefreshing {
    func refreshAccessToken(_ refreshToken: Token, completion: @escaping (Result<UserAuthenticationInfo, Error>) -> Void)
}

protocol AuthenticationInfoStorage {
    var userAuthenticationInfo: UserAuthenticationInfo? { get }
    func persistUserAuthenticationInfo(_ authenticationInfo: UserAuthenticationInfo?)
    func wipeUserAuthenticationInfo()
}

class AuthorizationValueProvider {
    private let authenticationInfoStore: AuthenticationInfoStorage
    private let tokenRefreshAPI: TokenRefreshing
    // Serial queue: requests for a valid token are handled one at a time.
    // Replace the placeholder with a unique label of your choice.
    private let queue = DispatchQueue(label: <#label#>,
                                      qos: .userInteractive)
    // Guards the critical section across the async refresh call.
    private let semaphore = DispatchSemaphore(value: 1)

    init(tokenRefreshAPI: TokenRefreshing,
         authenticationInfoStore: AuthenticationInfoStorage) {
        self.tokenRefreshAPI = tokenRefreshAPI
        self.authenticationInfoStore = authenticationInfoStore
    }

    func getValidUserAuthorization(completion: @escaping (Result<AuthorizationValue, Error>) -> Void) {
        queue.async {
            self.getValidUserAuthorizationInMutualExclusion(completion: completion)
        }
    }
}
Before performing any user-authenticated request, the network client asks an AuthorizationValueProvider instance to provide a valid user Authorization value (the JWT). It does so via the async method getValidUserAuthorization, which uses a serial queue to handle the requests. The chunky part is getValidUserAuthorizationInMutualExclusion.
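// Illustrative error type for the 'missing authorization' case; the name is hypothetical.
enum AuthorizationError: Error {
    case missingAuthorization
}

// 'isClientError' is assumed to be a helper defined on Error elsewhere in the
// codebase, returning true for 4xx responses; it is not shown in this article.
extension AuthorizationValueProvider {
    private func getValidUserAuthorizationInMutualExclusion(completion: @escaping (Result<AuthorizationValue, Error>) -> Void) {
        semaphore.wait() // enter the critical section

        guard let authenticationInfo = authenticationInfoStore.userAuthenticationInfo else {
            semaphore.signal()
            completion(.failure(AuthorizationError.missingAuthorization))
            return
        }

        // The stored token is still valid: return it without hitting the network.
        if authenticationInfo.isValid {
            semaphore.signal()
            completion(.success(authenticationInfo.bearerToken))
            return
        }

        // The token has expired: refresh it and leave the critical section
        // only once the outcome is known.
        tokenRefreshAPI.refreshAccessToken(authenticationInfo.refreshToken) { result in
            switch result {
            case .success(let authenticationInfo):
                self.authenticationInfoStore.persistUserAuthenticationInfo(authenticationInfo)
                self.semaphore.signal()
                completion(.success(authenticationInfo.bearerToken))
            case .failure(let error) where error.isClientError:
                // The server rejected the refresh token: the stored info is useless.
                self.authenticationInfoStore.wipeUserAuthenticationInfo()
                self.semaphore.signal()
                completion(.failure(error))
            case .failure(let error):
                // Server or connectivity error: keep the stored info.
                self.semaphore.signal()
                completion(.failure(error))
            }
        }
    }
}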
The method could fire off an async call to refresh the token, and this makes the usage of the semaphore crucial. Without it, the next request to AuthorizationValueProvider would be popped from the queue and executed before the remote refresh completes.
The semaphore is initialised with a value of 1, meaning that only one thread can access the critical section at a given time. We make sure to call wait at the beginning of the execution and to call signal only when we have a result and are therefore ready to leave the critical section.
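To see the effect in isolation, here is a small self-contained sketch of the same serial queue plus semaphore pattern, with the remote refresh replaced by a delayed block. DemoTokenSource, runDemo and the token strings are made up for this demo. The point is only that several near-simultaneous requests trigger a single refresh.

```swift
import Foundation

final class DemoTokenSource {
    private let queue = DispatchQueue(label: "demo.token-source") // serial by default
    private let semaphore = DispatchSemaphore(value: 1)
    private var cachedToken: String?   // only touched inside the critical section
    private(set) var refreshCount = 0  // number of simulated refreshes performed

    func getToken(completion: @escaping (String) -> Void) {
        queue.async {
            self.semaphore.wait() // enter the critical section
            if let token = self.cachedToken {
                self.semaphore.signal()
                completion(token)
                return
            }
            // Simulated remote refresh: completes later, on another queue.
            DispatchQueue.global().asyncAfter(deadline: .now() + 0.05) {
                self.refreshCount += 1
                let token = "token-\(self.refreshCount)"
                self.cachedToken = token
                self.semaphore.signal() // leave only once the result is stored
                completion(token)
            }
        }
    }
}

/// Fires `requests` calls back to back and waits for all of them to complete.
func runDemo(requests: Int) -> (refreshCount: Int, tokens: [String]) {
    let source = DemoTokenSource()
    let group = DispatchGroup()
    let lock = NSLock() // completions may arrive on different threads
    var tokens: [String] = []
    for _ in 0..<requests {
        group.enter()
        source.getToken { token in
            lock.lock(); tokens.append(token); lock.unlock()
            group.leave()
        }
    }
    group.wait()
    return (source.refreshCount, tokens)
}

let demoResult = runDemo(requests: 5)
```

Removing the wait/signal pair from this sketch lets the queued requests start while the first refresh is still in flight, so each of them triggers its own refresh.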
If the token found in the local store is still valid, we simply return it; otherwise, it's time to request a new one. In the latter case, if all goes well, we persist the token locally and allow the next request to access the method. In the case of an error, we should be careful and wipe the token only if the error is a legitimate client error (4xx range). This also includes the usage of a refresh token that is not valid anymore, which could happen, for instance, if the user resets the password on another platform/device.
It's critical to not delete the token from the local store in the case of any other error, such as 5xx or the common Foundation's NSURLErrorNotConnectedToInternet (-1009), or else the user would unexpectedly be logged out.
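The rule above can be sketched as a small classification function. HTTPStatusError and shouldWipeCredentials are hypothetical names introduced for this example. Treating the whole 4xx range as a reason to wipe follows the description above, and a real implementation might narrow it to specific status codes.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

/// Hypothetical error type carrying the HTTP status code of a failed response.
struct HTTPStatusError: Error {
    let statusCode: Int
}

/// Returns true only for client errors (4xx), i.e. when the server rejected the request.
/// Server errors (5xx) and transport errors leave the stored token untouched.
func shouldWipeCredentials(after error: Error) -> Bool {
    guard let httpError = error as? HTTPStatusError else {
        return false // connectivity problems, timeouts, and so on
    }
    return (400..<500).contains(httpError.statusCode)
}

let offline = URLError(.notConnectedToInternet) // code -1009
let wipeOnUnauthorized = shouldWipeCredentials(after: HTTPStatusError(statusCode: 401))
let wipeOnServerError = shouldWipeCredentials(after: HTTPStatusError(statusCode: 503))
let wipeWhenOffline = shouldWipeCredentials(after: offline)
```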
It's also important to note that the same AuthorizationValueProvider instance must be used by all the calls: using different ones would mean using different queues, making the entire solution ineffective.
It seemed clear that the network client we developed in-house had to embrace JWT refresh logic at its core so that all the API calls, even new ones added in the future, would make use of the same authentication flow.
General recommendations
Here are a couple more (minor) suggestions we thought are worth sharing since they might save you implementation time or influence the design of your solution.
Correctly parse the Payload
Another problem, quite trivial but seemingly not discussed much, is that parsing the JWT can fail in some cases. In our case, this was related to the base64 decoding function and the need to "adjust" the base64 payload so that it parses correctly. Some base64 implementations don't need the padding character for decoding, since the number of missing bytes can be calculated, but in Foundation's implementation it is mandatory. This caused us some head-scratching and this StackOverflow answer helped us.
The solution is - more officially - stated in RFC 7515 - Appendix C and here is the corresponding Swift code:
func base64String(_ input: String) -> String {
    // Map the URL-safe alphabet back to the standard base64 alphabet.
    var base64 = input
        .replacingOccurrences(of: "-", with: "+")
        .replacingOccurrences(of: "_", with: "/")
    // Restore the padding that base64url encoding omits.
    switch base64.count % 4 {
    case 2:
        base64 = base64.appending("==")
    case 3:
        base64 = base64.appending("=")
    default:
        break
    }
    return base64
}
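Building on the padding rule, here is a minimal sketch of extracting the exp claim from a token. ExpirationClaim, base64Padded and expiryDate(fromJWT:) are names chosen for this example (base64Padded repeats the logic above so the snippet is self-contained). The sample is the jwt.io token shown earlier in the article.

```swift
import Foundation

/// Only the claim this article cares about; other payload fields are ignored.
struct ExpirationClaim: Decodable {
    let exp: TimeInterval // seconds since 1970-01-01T00:00:00Z
}

/// Converts base64url to padded base64, as required by Foundation's decoder.
func base64Padded(_ input: String) -> String {
    var base64 = input
        .replacingOccurrences(of: "-", with: "+")
        .replacingOccurrences(of: "_", with: "/")
    switch base64.count % 4 {
    case 2: base64 += "=="
    case 3: base64 += "="
    default: break
    }
    return base64
}

/// Returns the expiry date found in the Payload, or nil if the token is malformed.
/// The signature is not verified here.
func expiryDate(fromJWT jwt: String) -> Date? {
    let parts = jwt.split(separator: ".", omittingEmptySubsequences: false)
    guard parts.count == 3,
          let payload = Data(base64Encoded: base64Padded(String(parts[1]))),
          let claim = try? JSONDecoder().decode(ExpirationClaim.self, from: payload)
    else { return nil }
    return Date(timeIntervalSince1970: claim.exp)
}

let sampleJWT = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyLCJleHAiOjE1Nzc3NTA0MDB9.7hgBhNK_ZpiteB3GtLh07KJ486Vfe3WAdS-XoDksJCQ"
let sampleExpiry = expiryDate(fromJWT: sampleJWT)
```

A value obtained this way could populate the expiryDate property of a type like UserAuthenticationInfo when the token is first stored.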
The majority of the developers rely on external libraries to ease the parsing of the token, but as we often do, we have implemented our solution from scratch, without relying on a third-party library. Nonetheless, we feel JSONWebToken by Kyle Fuller is a very good one and it seems to implement JWT faithfully to the RFC, clearly including the necessary base64 decode function.
Handle multiple JWT for multiple app states
As previously stated, when JWT is also used as the authentication method for non-user-authenticated calls, we need to cater for at least 3 states, shown in the following enum:
enum AuthenticationStatus {
    case notAuthenticated
    case clientAuthenticated
    case userAuthenticated
}
On a fresh install, we can expect to be in the .notAuthenticated state, but as soon as the first API call is ready to be performed, a valid Client JWT has to be fetched and stored locally (at this stage, other authentication mechanisms are used, most likely Basic Auth), moving to the .clientAuthenticated state. Once the user completes the login or signup procedure, a User JWT is retrieved and stored locally (but separately from the Client JWT), entering the .userAuthenticated state, so that in the case of a logout we are left with a (hopefully still valid) Client JWT.
In this scenario, almost all transitions are possible:
A couple of recommendations here:
- if the user is logged in, it is important to use the User JWT also for the non-user-authenticated calls, as the server may personalise the response (e.g. the list of restaurants in the Just Eat app)
- store both Client and User JWT, so that if the user logs out, the app is left with the Client JWT ready to be used to perform non-user-authenticated requests, saving an unnecessary call to fetch a new token
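Both recommendations can be sketched with a small value type. StoredTokens and its members are hypothetical names, and the tokens are plain strings to keep the example short. The AuthenticationStatus enum is repeated so the snippet compiles on its own.

```swift
enum AuthenticationStatus {
    case notAuthenticated
    case clientAuthenticated
    case userAuthenticated
}

/// Keeps the Client JWT and the User JWT separately.
struct StoredTokens {
    var clientJWT: String?
    var userJWT: String?

    /// The status is derived from which tokens are currently stored.
    var status: AuthenticationStatus {
        if userJWT != nil { return .userAuthenticated }
        if clientJWT != nil { return .clientAuthenticated }
        return .notAuthenticated
    }

    /// Prefers the User JWT so the server can personalise the response,
    /// and falls back to the Client JWT when no user is logged in.
    var tokenForNonUserAuthenticatedCalls: String? {
        return userJWT ?? clientJWT
    }

    /// Logging out drops only the User JWT; the Client JWT stays available.
    mutating func logOut() {
        userJWT = nil
    }
}

var storedTokens = StoredTokens(clientJWT: "client-jwt", userJWT: "user-jwt")
let statusWhileLoggedIn = storedTokens.status
let tokenWhileLoggedIn = storedTokens.tokenForNonUserAuthenticatedCalls
storedTokens.logOut()
let statusAfterLogout = storedTokens.status
let tokenAfterLogout = storedTokens.tokenForNonUserAuthenticatedCalls
```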
Conclusion
In this article, we've shared some learnings from handling JWT on mobile that are not commonly discussed within the community.
As a good practice, it's always best to hide complexity and implementation details. Baking the refresh logic described above into your API client is a great way to spare developers from dealing with complex logic to provide authorization, and it ensures all the API calls undergo the same authentication mechanism. Consumers of an API client should not have the ability to obtain the JWT, as it’s not their concern to use it or to fiddle with it.
We hope this article helps to raise awareness on how to better handle the usage of JWT on mobile applications, in particular making sure we always do our best to avoid accidental logouts to provide a better user experience.