C#, .Net and Azure

Cloud native authentication with Azure AD using MSI

2021/01/16

Azure provides cloud native authentication and authorization mechanisms through Azure AD. MSI adds increased trust and automatic secret rotation, which makes the authentication experience even more convenient.

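In this post I want to focus on the MSI scenarios I find myself implementing over and over: service-to-service communication using MSI, connecting to a service when debugging locally and on-behalf-of authentication.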

First we need to distinguish the various authentication mechanisms and define some terminology that I have found is often not well understood:

Terminology

1. App registration

In short: App registrations define the surface of your application access (aka the “what”).

App registrations are used when you need any kind of identity management for applications. Examples would be frontend and backend services that can have different roles and scopes associated with them.

The app registration lets you define the redirect URLs, scopes and roles that are valid for the specific application (e.g. you could define read, write & admin roles). Later, users can be assigned to the roles using the underlying service principal.

Each app registration will automatically create a service principal in the same Azure AD. This service principal is often also referred to as an enterprise application.

2. Enterprise application/service principal

Here it already gets a bit confusing as Enterprise application sounds similar but is actually a different component.

In short: Enterprise applications/Service principals define the access to the application (“who” and “how”).

On each enterprise application you can define some global settings (can users see the application/is login generally allowed/..) as well as which users and groups get to have access to the application (alongside the roles they can have).

The reason for this split into two components becomes clear when you think about the Azure AD multi-tenancy feature:

With multi-tenancy you as the application author can create a single app registration that exists only in your tenant and is responsible for securing your product (let’s say it’s a SaaS service): You as the author get to define the app roles, scopes & redirect URIs on the app registration once.

Meanwhile each consumer of your SaaS service (= each company purchasing your SaaS service) gets an enterprise application added to their Azure AD tenant. This enterprise application is linked to your app registration and allows your customers to select which users within their tenant should have access to the service (and which respective roles they should have).

You can continue to work on your product and update the app registration (add new roles as features are added, ..) whereas the customer is responsible for controlling which of their users get access to your application (and which roles they each have).

Obviously you still need to implement some kind of per-user billing, but the bulk of the user management work is offloaded to the respective customer quite easily thanks to Azure AD providing the necessary features.

An alternative if you don’t want to force Azure AD on your customers would be Azure AD B2C, which allows connecting to all sorts of social and custom providers (Google, Facebook, GitHub, ..). However, it comes at the cost of additional complexity and I don’t want to get into these details in this post.

3. Managed identity

Managed identities (short: MSI, from their former name Managed Service Identity) are identities that are tied to individual service instances in Azure. They can be enabled on any kind of service that supports them (VMs, App Services, Function apps, etc.).

MSIs are powerful for multiple reasons:

With service principals anyone (user or service) who has both the client id (also known as application id) and the client secret can authenticate as the service principal and gain access to whichever resources the service principal has been granted access to.

With MSI only the instance in Azure can authenticate as that MSI. This is achieved by a token issuer that runs locally on the individual service instance (reachable only via a local address such as 127.0.0.1) and cannot be accessed remotely.

This means that you can connect many services without having to worry about secrets:

[Image: Managed identities]
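To make the local token issuer a bit more tangible, here is a minimal sketch of the request an App Service instance would send to it. It assumes the IDENTITY_ENDPOINT and IDENTITY_HEADER environment variables and the 2019-08-01 API version that App Service documents for managed identities; VMs use a different endpoint. The ManagedIdentityRequest helper is a made-up name for this illustration, and in practice a library such as AzureServiceTokenProvider does all of this for you.

```csharp
using System;
using System.Net.Http;

public static class ManagedIdentityRequest
{
    // Builds the GET request for the local managed identity endpoint of an App Service.
    // endpoint:       value of the IDENTITY_ENDPOINT environment variable (assumption: App Service)
    // identityHeader: value of the IDENTITY_HEADER environment variable (assumption: App Service)
    // resource:       guid or Application ID URI of the app registration you want a token for
    public static HttpRequestMessage Create(string endpoint, string identityHeader, string resource)
    {
        var uri = $"{endpoint}?resource={Uri.EscapeDataString(resource)}&api-version=2019-08-01";
        var request = new HttpRequestMessage(HttpMethod.Get, uri);

        // The header value is only known to code running on the instance itself,
        // which is why the endpoint is of no use to anyone outside of it.
        request.Headers.Add("X-IDENTITY-HEADER", identityHeader);
        return request;
    }
}
```

On the instance you would read both values via Environment.GetEnvironmentVariable, send the request with an HttpClient and read the access_token field from the JSON response. Outside of Azure the two environment variables simply don't exist, so there is nothing to call.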

4. User-assigned identity

User-assigned identities work just like regular MSIs (automatic secret rotation) but are not tied to a particular service.

They allow you to reuse the identity across multiple services while taking advantage of the secret rotation.

A use case could be having multiple build servers that all need access to certain resources during build/release (e.g. Key Vault, web app).

Instead of granting access to the managed identity of each individual build VM (and subsequently having to update all resources when a VM is added/removed) you can simply grant a single user-assigned identity access to all resources and then attach that identity to each build server.

Scenarios

Now that the terminology is set here are the details regarding the most common scenarios:

Service-to-service communication using MSI

Let’s assume you have a web app that needs to communicate with another backend (e.g. a notification system that you need to query for pending notifications).

The notification system is secured with an app registration and custom roles (e.g. notification.read, notification.create, ..).

In order to communicate with the notification service your backend needs to authenticate itself and needs to be in the notification.read role.

This approach is pretty typical (see also the MSDN documentation).

The neat trick with MSI is that you don’t need an additional app registration for your backend. Instead you can simply enable the MSI switch on your backend and then add the MSI to the notification.read role on the notification system service principal (the MSI name will be the web app name from azure).

Note: If you use slot-based deployments with App Services you also need to enable MSI on each slot and assign each slot's MSI to the role, because every slot has its own MSI that stays with the slot during swap operations (the slot's MSI has the same name as the root web app with the suffix /slots/<slot name>).

After you have added the MSIs to the respective roles you can use this small piece of code in your web app to fetch a token:

// using nuget package: Microsoft.Azure.Services.AppAuthentication
var tokenProvider = new AzureServiceTokenProvider();
var token = await tokenProvider.GetAccessTokenAsync("https://app-id-or-guid");

You can always use the application id (a guid). However, when using multiple applications you will quickly find that it can be quite complex to keep track of all the guids, so another approach is to use the Application ID URI:

On each app registration one can simply set an Application ID URI (on the Expose an API tab). This URI can then be used to address the app registration in the same way as the guid.
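Once you have the token, calling the notification system is just a matter of sending it as a bearer token. Here is a minimal sketch of that step. The NotificationClient helper, the api://notification-system URI and the example URL are all made up for illustration, and the token lookup is passed in as a delegate so the snippet does not depend on any particular token library.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class NotificationClient
{
    // getToken: any function that returns an access token for a resource, e.g.
    //           resource => tokenProvider.GetAccessTokenAsync(resource)
    // resource: guid or Application ID URI of the target app registration
    // url:      endpoint of the target service (hypothetical in this sketch)
    public static async Task<HttpRequestMessage> CreateRequestAsync(
        Func<string, Task<string>> getToken, string resource, string url)
    {
        var token = await getToken(resource);

        var request = new HttpRequestMessage(HttpMethod.Get, url);
        // The target service validates this token and reads the roles from it.
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", token);
        return request;
    }
}
```

With the tokenProvider from the snippet above, usage would look roughly like: var request = await NotificationClient.CreateRequestAsync(r => tokenProvider.GetAccessTokenAsync(r), "api://notification-system", "https://notifications.example.com/api/pending"); followed by httpClient.SendAsync(request).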

Gotchas

  1. MSI tokens are cached within azure for up to 24h and there is no way to clear the cache.

If you requested a token before assigning a particular role to the MSI (role assignments may take up to 5 minutes to complete) your token will not contain the role and you will have to wait 24 hours for it to reset.

Workaround: The Azure cache key appears to be case-sensitive while the resource identifier itself is not. So you can change individual letters to upper/lowercase in your guid or app ID URI to circumvent the cache (obviously the new value is also cached for 24 hours). It’s not ideal, but when debugging/adding new roles it saves you from waiting 24 hours until the cache expires and there are enough upper/lowercase combinations in a guid to work around this issue. ;)

  2. Debugging the token is difficult (can’t see the actual token locally)

MSI tokens are only issued on the actual azure server so the behaviour might be slightly different locally (see also the section on local debugging) but there is a neat trick:

You can wrap the code I posted above into a small console application and simply pass in the parameter of GetAccessTokenAsync as an argument and print the token on the commandline.

Commandline example:

// using nuget package: Microsoft.Azure.Services.AppAuthentication
using Microsoft.Azure.Services.AppAuthentication;
using System;
using System.Threading.Tasks;

public static class Program
{
  // Prints a token for the resource (guid or app ID URI) passed as the only argument.
  public static async Task Main(string[] args)
  {
    if (args.Length != 1)
    {
      Console.WriteLine("usage: getToken <guid or app ID URI>");
      return;
    }
    var tokenProvider = new AzureServiceTokenProvider();
    var token = await tokenProvider.GetAccessTokenAsync(args[0]);
    Console.WriteLine("Token: " + token);
  }
}

On VMs you can just upload it and for App Services there is Kudu. Simply upload the executable into any folder in Kudu and then use the built-in command line to execute it on the App Service:

getToken.exe 22dbd01d-6365-4878-af44-f1ab82bcbd11
Token: ey...

Plug the token into jwt.ms and you can see the actual value as generated for the MSI. This makes it really easy to see if the appropriate roles are set in the MSI (and busting the azure cache is as easy as changing the casing on a single letter: “getToken.exe 22Dbd01d-6365-4878-af44-f1ab82bcbd11”)
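If you would rather not paste tokens into a website, the roles can also be read directly in the console application. Here is a minimal sketch, assuming the token is a regular three-part JWT and that the app roles show up in the roles claim (which is where Azure AD puts them). The JwtInspector name is made up, and the signature is deliberately not validated, so this is for debugging only.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

public static class JwtInspector
{
    // Returns the values of the "roles" claim of a JWT (empty list if there is none).
    // Debugging helper only: the signature is NOT validated.
    public static IReadOnlyList<string> ReadRoles(string jwt)
    {
        var parts = jwt.Split('.');
        if (parts.Length != 3)
            throw new ArgumentException("Expected a JWT with three dot-separated parts.", nameof(jwt));

        // JWT segments are base64url encoded without padding; convert to regular base64.
        var payload = parts[1].Replace('-', '+').Replace('_', '/');
        payload = payload.PadRight(payload.Length + (4 - payload.Length % 4) % 4, '=');

        var json = Encoding.UTF8.GetString(Convert.FromBase64String(payload));
        using var document = JsonDocument.Parse(json);

        var roles = new List<string>();
        if (document.RootElement.TryGetProperty("roles", out var rolesElement)
            && rolesElement.ValueKind == JsonValueKind.Array)
        {
            foreach (var role in rolesElement.EnumerateArray())
                roles.Add(role.GetString());
        }
        return roles;
    }
}
```

In the console application above you could then print string.Join(", ", JwtInspector.ReadRoles(token)) next to the token itself and see right away whether notification.read made it into the token.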

Note that when running the same code locally you will get an error because locally the MSI provider detects that it is not running on an azure resource and falls back to user authentication (using Visual Studio and/or az cli) and since you yourself have not been added to the notification.read role the access is denied. (See the next section for more details on this exact scenario).

There isn’t much more to say about MSI authentication: It’s convenient, secrets are rotated automatically and it works out of the box for many azure services!

Connection to a service when debugging locally

The previous section already described service-to-service communication and in fact most of it also applies when trying to connect from your local debugging session to another Azure resource:

AzureServiceTokenProvider is smart enough to provide a few fallbacks that can be used when no MSI token endpoint is detected (i.e. the code is not executing within an azure datacenter).

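In this case the fallbacks for developers are the account you are signed in with in Visual Studio and the account you are signed in with in the az cli (az login).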

Overall this is great as the fallbacks allow you to test the same code locally without needing to make any changes when deploying to azure.

You obviously need to make one additional change to the service principal of the target resource:

Add your own Azure AD user account to the notification.read role of the notification service.

Because the token is issued in your name you also need to have the same permissions as the backend service requesting service-to-service communication would have (and there is no 24 hour cache for local tokens - AzureServiceTokenProvider only has a memory cache so restarting the app is enough. New roles should appear within a few minutes).

Using groups to delegate developer access makes it even easier as you can easily add new developers to the respective group and reuse the group for all relevant backend systems (or have multiple groups: read permission/admin permission/etc. to distinguish between various roles in your team).

For an example of how managed identity can be used to access key vaults check out this post and for a how-to on connecting to SQL Server using MSI check out this post.

Gotchas

Even though local authentication should be seamless (and most often it is) there are situations where that’s not the case.

1. Authentication randomly stops working after a while

Usually your Visual Studio account requires reauthentication (but for some reason won’t tell you).

Simply navigate to the account settings (by clicking on your profile in the top right in Visual Studio) and click the refresh credentials option; that is usually enough (most of the time you don’t even have to enter the credentials).

If this doesn’t help there is another scenario that I encountered a few times:

Go to Tools -> Options -> Azure Service Authentication and make sure that the right account is selected (especially important when you are logged into multiple accounts).

Important: Even if it appears selected, reopen the dropdown and click on your account again (I could reproduce this bug many times where the account appeared selected but tokens couldn’t be issued; once the account was manually reselected AzureServiceTokenProvider started working again).

Alternatively signing out of the account and back in is a bit more crude but also does the trick.

2. Enterprise environments

For my private subscriptions the tricks above have always been enough, however I have also encountered enterprise environments where Azure AD configuration was locked down a lot.

In some cases Visual Studio/az cli authentication won’t work out of the box because AD admins have restricted the users' ability to consent to applications. (User consent is necessary when using applications for the first time in Azure AD; by default an admin can consent on behalf of all users and users can consent for themselves. Many times I have encountered that the latter was disabled by admins to prevent users from consenting to various applications and potentially giving away access to their accounts.)

The workaround is to go to the app registration for which you want the token issued (in the examples above: the notification system) and add these guids as authorized client applications in the Expose an API tab:

Note: You need to add at least one scope before you can add authorized clients (adding a dummy scope user_impersonation is enough; you don’t have to use the scope).

What this does is let the token issuer know that these applications are trusted by default and no user consent is necessary. (Normally you would see a consent screen when you request a token for the very first time for a particular user/app combination but if AD admins deny users consenting to applications then AzureServiceTokenProvider simply won’t work. By adding the app ids of Visual Studio/az cli you can circumvent this consent dialog by confirming them as trusted applications for all users).

Some of these gotchas and fixes can also be found in this GitHub issue (and similar ones).

3. Locking down the target resource

When creating new service principals the default settings allow users to sign in and do not require user assignment.

This means any user in your tenant can navigate to your application (or request a token via the token endpoint) and will be granted access.

With big enterprises that’s usually a lot of users and if you are not using roles (and also checking in all controllers that a user has at least one role) then all employees (and possibly external contractors) have full access to the application (hence one of the reasons why AD admins often disable the “user can consent” option: without consent they get to safeguard against this scenario globally).

If you are planning on using roles and want every user to have at least one role, you can simply flip this switch in the service principal properties section to true:

[Image: User assignment required]

Users will then be prevented from signing in unless they have been added to at least one role on the application.

If you have built a separate permission system and don’t rely on Azure AD to block users then this obviously doesn’t apply to you but otherwise these settings should not be left as the default!
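For the "check in all controllers that a user has at least one role" part, here is a minimal sketch of such a check. It assumes the roles arrive either in the raw roles claim or mapped to ClaimTypes.Role (which one you get depends on how your authentication middleware is configured), and RoleGuard is a made-up helper name. In ASP.NET Core you would normally reach for [Authorize(Roles = "...")] instead; the helper just shows what that check boils down to.

```csharp
using System;
using System.Linq;
using System.Security.Claims;

public static class RoleGuard
{
    // Returns true if the user is authenticated and has at least one of the allowed roles.
    // Assumption: roles are found in the "roles" claim or in ClaimTypes.Role.
    public static bool HasAnyRole(ClaimsPrincipal user, params string[] allowedRoles)
    {
        if (user?.Identity == null || !user.Identity.IsAuthenticated)
            return false;

        return user.Claims
            .Where(c => c.Type == "roles" || c.Type == ClaimTypes.Role)
            .Any(c => allowedRoles.Contains(c.Value, StringComparer.Ordinal));
    }
}
```

A controller action could then start with something like: if (!RoleGuard.HasAnyRole(User, "notification.read", "notification.create")) return Forbid();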

On-behalf-of authentication

The final mechanism I want to focus on is the on-behalf-of authentication.

It basically allows you to build an application that has no access permissions by itself and in turn relies on user permissions.

As an example one could build a custom version of the azure portal (with a reduced featureset) for a support team:

The support team would be able to perform simple tasks (look at logs, restart app services when issues are found) but generally wouldn’t be allowed to create or delete Azure resources.

Of course one could also train them to use the azure portal (although they’d have to ignore 99% of features) but for this scenario let’s assume it is a better trade-off to build a simple portal.

To build this portal you can make use of the on-behalf-of flow: The application backend doesn’t have access to any Azure resources but instead relies on the user's RBAC permissions as set in Azure.

This has two advantages:

Relying on the integrated RBAC system is thus the better approach and the code involved is also straightforward:

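using System.Threading.Tasks;
using Microsoft.AspNetCore.Authentication; // HttpContext.GetTokenAsync extension
using Microsoft.AspNetCore.Http;
using Microsoft.Identity.Client;

// AuthenticationProviderBase is not shown in this post; it is assumed to expose
// the scopes passed to its constructor via the Scopes property.
public class OnBehalfOfAuthenticationProvider : AuthenticationProviderBase
{
    private readonly IConfidentialClientApplication _clientApplication;
    private readonly IHttpContextAccessor _httpContextAccessor;

    public OnBehalfOfAuthenticationProvider(
        IConfidentialClientApplication clientApplication,
        string[] scopes,
        IHttpContextAccessor httpContextAccessor)
        : base(scopes)
    {
        _clientApplication = clientApplication;
        _httpContextAccessor = httpContextAccessor;
    }

    protected override async Task<AuthenticationResult> GetTokenAsync()
    {
        // Take the access token the user sent with the current request ...
        var token = await _httpContextAccessor.HttpContext.GetTokenAsync("Bearer", "access_token");
        var userAssertion = new UserAssertion(token);

        // ... and exchange it for a token that acts on behalf of that user.
        var authenticationResult = await _clientApplication.AcquireTokenOnBehalfOf(Scopes, userAssertion).ExecuteAsync();
        return authenticationResult;
    }
}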

The user needs to authenticate against your API anyway so you can simply take the token they sent and reuse it for the respective on-behalf-of request.

If the user has no permission the action is denied and the frontend simply needs to respond accordingly.


There are of course many more scenarios and features that I might describe in the future but these are the most common ones I encounter.