ABC - Always Be Coding

Sunday, January 4, 2026

PendingModelChangesWarning Exception When Applying EF Migrations that Involve ASP.NET Core Identity

I have a couple of personal projects that use ASP.NET Identity for user management, and both of them ran into the same problem when running Entity Framework migrations with a dedicated project as shown in the Aspire examples. That is, a Worker Service project that runs the migrations in a BackgroundService.

The error looked like this:

System.InvalidOperationException: 'An error was generated for warning 'Microsoft.EntityFrameworkCore.Migrations.PendingModelChangesWarning': The model for context 'MyContext' has pending changes. Add a new migration before updating the database. See https://aka.ms/efcore-docs-pending-changes. This exception can be suppressed or logged by passing event ID 'RelationalEventId.PendingModelChangesWarning' to the 'ConfigureWarnings' method in 'DbContext.OnConfiguring' or 'AddDbContext'.'

As suggested, I tried generating a new migration, but it was empty. After much AI-assisted debugging, it seemed to be an issue with the length of the columns on a couple of built-in tables/classes:

IdentityUserToken

LoginProvider
Name

IdentityUserLogin

LoginProvider
ProviderKey

Some searching suggested that the default lengths of these have changed over the years and that they are adjustable for that reason. I have never specified the lengths in my code, so I don't know how they are being resolved to values different from what's in my DB. And this is only a problem when I run the migrations with the Worker Service project. When run from the command line tool, everything is fine.

As the exception mentions, there is a way to ignore this error, but then it would be ignored for all such future problems where you really forgot to add a migration.

The solution I found is to specify the lengths of the fields in OnModelCreating of my context:

builder.Entity<IdentityUserToken<string>>(entity =>
{
    entity.Property(m => m.LoginProvider).HasMaxLength(128);
    entity.Property(m => m.Name).HasMaxLength(128);
});
 
builder.Entity<IdentityUserLogin<string>>(entity =>
{
    entity.Property(m => m.LoginProvider).HasMaxLength(128);
    entity.Property(m => m.ProviderKey).HasMaxLength(128);
});

Thursday, January 1, 2026

Using Azure Functions with Managed Identity and SQL, Blob, Queue, and Event Grid Triggers

Like a good coding nerd, I spent (a surprising amount of) time figuring out how to use Azure Functions with the following trigger types, all while using a managed identity:

SQL Table
Blob
Blob using Event Grid
Queue

Microsoft's documentation and logging in Azure are spread out. Twitter's Grok AI did a great job of helping pull it together when things weren't working.

To start, I created the 2 usual Aspire projects using Visual Studio 2022 and .NET 9, the ServiceDefaults and AppHost projects.

I then created my Functions project in Visual Studio 2022 using .NET 9 and the Functions project template, choosing the queue trigger template. This brings me to my first 2 complaints.

Complaint #1: The Functions template does not support the use of central package management. If your solution currently does, you will immediately have a broken project until you adjust the project file and Directory.Packages.props. Knowing this, I used a project that does not use CPM.

Complaint #2: Azure Functions tooling is available 3 different ways on Windows, none of which I consider standard.

There's a command line tool "func" that you install with an MSI called Azure Functions Core Tools. I would have expected templates that you install with the dotnet CLI.
There's something in Visual Studio that you need to update from the Options menu. It doesn't update by itself as far as I know. It's not just an extension, you have to hunt for it in the menus. Click that button and there's no progress meter, just an ephemeral message in the bottom right corner of VS. I would have expected an extension or something that updates with VS every couple of weeks like VS does.
The Azure Developer CLI, azd. The docs say it requires the Azure Functions Core Tools, so maybe azd just calls func.

Which one is the latest? There's a GitHub repo for func, so maybe that one. I don't know where the VS tooling lives on the internet.

Here is my AppHost.cs. I started with using named connections for the queue and storage connections, but when it came to configuring them in Azure it started getting very complex, so it's using the defaults. The SQL connection is for the SQL Table trigger. More details below.

var builder = DistributedApplication.CreateBuilder(args);
 
var db = builder.AddConnectionString("db");
 
var migrations = builder.AddProject<Projects.MigrationService>("migrationservice")
    .WithReference(db)
    .WaitFor(db);
 
var storage = builder.AddAzureStorage("storage")
    .RunAsEmulator(az => az.WithLifetime(ContainerLifetime.Persistent))
;
 
builder.AddAzureFunctionsProject<Projects.QueueTriggerFunction>("queuetriggerfunction")
    .WithHostStorage(storage)
    .WaitFor(storage)
    .WithReference(db)
    .WaitFor(db)
    .WaitForCompletion(migrations)
    ;
 
builder.Build().Run();

Over the years, a couple of Function defaults have driven me to distraction leading to my next 2 complaints.

Complaint #3: local.settings.json file is claimed to contain some secret values, so it is not included for commit by default in .gitignore. However, one critical value that everyone who works with the repo needs is "FUNCTIONS_WORKER_RUNTIME": "dotnet-isolated".

Complaint #4: something about the default logging setup makes sure only warning and above logging makes it to the Azure output. The Azure console relies SOLELY on this output to determine if your function ran, so logging something is critical. (Never mind that some other part of Azure knows it ran your function.)

I want local.settings.json committed, and I want to use the facility .NET Core already has for per-dev settings, secrets.json. Plus, using Aspire means the Function's project never needs settings in a config file anyway.

I also found some code that fixes the default exclusion of Information and below logging.

This is my Function project's Program.cs. I added a simple config class and put a value for it in secrets.json to make sure the code uses it.


using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Builder;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using QueueTriggerFunction;
using System.Reflection;
using Microsoft.Extensions.Logging;
using Data;
using Microsoft.EntityFrameworkCore;
 
var builder = FunctionsApplication.CreateBuilder(args);
 
builder.AddServiceDefaults();
 
builder.ConfigureFunctionsWebApplication();
 
builder.Services
    .AddApplicationInsightsTelemetryWorkerService()
    .ConfigureFunctionsApplicationInsights();
 
builder.Logging.Services.Configure<LoggerFilterOptions>(options =>
{
    // The Application Insights SDK adds a default logging filter that instructs ILogger to capture only Warning and more severe logs. Application Insights requires an explicit override.
    // Log levels can also be configured using appsettings.json. For more information, see /azure-monitor/app/worker-service#ilogger-logs
    LoggerFilterRule? defaultRule = options.Rules.FirstOrDefault(rule => rule.ProviderName
        == "Microsoft.Extensions.Logging.ApplicationInsights.ApplicationInsightsLoggerProvider");
    if (defaultRule is not null)
    {
        options.Rules.Remove(defaultRule);
    }
 
    // Add a new rule to capture Information and above for AI
    options.AddFilter("Microsoft.Extensions.Logging.ApplicationInsights.ApplicationInsightsLoggerProvider",
        LogLevel.Information);
    options.MinLevel = LogLevel.Information;
});
 
builder.Services.AddOptions<MyConfigurationSecrets>()
    .Configure<IConfiguration>((settings, configuration) =>
    {
        configuration.GetSection("MyConfigurationSecrets").Bind(settings);
    });
 
builder.Configuration
       .SetBasePath(Environment.CurrentDirectory)
       .AddJsonFile("local.settings.json", optional: true)
       .AddUserSecrets(Assembly.GetExecutingAssembly(), optional: true)
       .AddEnvironmentVariables();
 
builder.Services.AddDbContext<DataContext>(optionsBuilder =>
{
    optionsBuilder.UseSqlServer(
        builder.Configuration.GetConnectionString("db"),
        b =>
        {
            b.MigrationsAssembly("Data");
            b.EnableRetryOnFailure();
        });
});
 
builder.Build().Run();

There's a Migrations WebJob project called Data like the ones seen in Aspire examples that creates the DB and a simple table. How the table gets created isn't important, so I'm not including that here. I will show the steps for enabling table change tracking later, though.

A word on NuGet versions as of December 2025:

Aspire 13 libraries
Latest Functions libraries
Latest .NET 9 libraries, except for Microsoft.Extensions.Configuration.UserSecrets, which is the latest 10.

I let VS publish the function to Azure in a .NET 9, Linux App Service. It created most of the environment variables that start with AzureWebJobsStorage, but I don't remember exactly which ones. I think it populated the storage account to use, but I don't remember how it knew. I also don't remember if I had already configured the managed identity to use, which helped it fill in the values. Anyway, here is the complete configuration:

The client ID value is the one with that name from the managed identity's properties:

The managed identity has the following role assignments for the storage account. They are probably more than is needed for just reading messages and blobs:

Storage Blob Data Owner
Storage Blob Data Contributor
Storage Queue Data Contributor
Storage Queue Data Message Processor
Storage Account Contributor

Something also assigned it the Monitoring Metrics Publisher role on the Application Insights resource that the function resource uses.

Some of these allow the managed user to list blobs/messages, and certain other ones allow it to read individual blobs/messages. It is not as clear as AWS's explicit list of actions. Again, Grok helped fill in my knowledge. At one point it showed me how to enable diagnostic logging on the storage account and look up the authorization failure in a Log Analytics Workspace.

Queue Trigger

The queue trigger code is almost the stock template code.


using Azure.Storage.Queues.Models;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
 
namespace QueueTriggerFunction;
 
public class QueueTriggerFunction
{
    private readonly MyConfigurationSecrets _myConfigurationSecrets;
    private readonly ILogger<QueueTriggerFunction> _logger;
 
    public QueueTriggerFunction(ILogger<QueueTriggerFunction> logger, IOptions<MyConfigurationSecrets> myConfigurationSecrets)
    {
        _logger = logger;
        _myConfigurationSecrets = myConfigurationSecrets.Value;
    }
 
    [Function(nameof(QueueTriggerFunction))]
    public void Run([QueueTrigger("myqueue-items")] QueueMessage message)
    {
        _logger.LogInformation("Using secret: {Secret}", _myConfigurationSecrets.Secret);
        _logger.LogInformation("C# Queue trigger function processed: {messageText}", message.MessageText);
    }
}

At this point, if you create a queue called myqueue-items and add a text message, the trigger should fire. I did this using the Azure portal. For some reason, even though I'm an Owner on the account, I had to add the following rights for myself to use the storage explorer in Azure:

Reader
Storage Blob Data Contributor
Storage Queue Data Contributor

File Trigger

The file trigger code is also almost as simple as the template:


using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;
 
namespace QueueTriggerFunction;
 
public class FileTriggerFunction
{
    private readonly ILogger<FileTriggerFunction> _logger;
 
    public FileTriggerFunction(ILogger<FileTriggerFunction> logger)
    {
        _logger = logger;
    }
 
    [Function(nameof(FileTriggerFunction))]
    public async Task Run([BlobTrigger("uploads/{name}")] Stream stream, string name)
    {
        using var blobStreamReader = new StreamReader(stream);
        var content = await blobStreamReader.ReadToEndAsync();
        _logger.LogInformation("C# Blob trigger function Processed blob\n Name: {name} \n Data: {content}", name, content);
    }
}

At this point, everything is already set up to allow the file trigger to run when you upload a text file to a blob container called uploads.

SQL Trigger

The SQL trigger code:


using Data;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Extensions.Sql;
using Microsoft.Extensions.Logging;
using Newtonsoft.Json;
 
namespace QueueTriggerFunction;
 
public class SqlTriggerFunction
{
    private readonly ILogger _logger;
 
    public SqlTriggerFunction(ILoggerFactory loggerFactory)
    {
        _logger = loggerFactory.CreateLogger<SqlTriggerFunction>();
    }
 
    // Visit https://aka.ms/sqltrigger to learn how to use this trigger binding
    [Function("SqlTriggerFunction")]
    public void Run(
        [SqlTrigger("[dbo].[TodoItems]", "db")] IReadOnlyList<SqlChange<ToDoItem>> changes,
            FunctionContext context)
    {
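        // Pass the serialized payload as a structured logging argument rather than concatenating it.
        _logger.LogInformation("SQL Changes: {Changes}", JsonConvert.SerializeObject(changes));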
    }
}

As I mentioned, I created the DB and its single table with a WebJob that ran an EF migration. We need to do a few things to make the SQL trigger work:

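Add the managed identity to the DB and give it the DB roles it needs. As far as I can tell, the trigger keeps its own bookkeeping tables in the DB, which is why it gets the write and DDL roles as well as the read one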

CREATE USER [queue-trigger-user] FROM EXTERNAL PROVIDER;
ALTER ROLE db_datareader ADD MEMBER [queue-trigger-user];
ALTER ROLE db_datawriter ADD MEMBER [queue-trigger-user];
ALTER ROLE db_ddladmin ADD MEMBER [queue-trigger-user];

Enable change tracking on the DB

ALTER DATABASE [function-test]
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

Enable change tracking on the table

ALTER TABLE [dbo].[ToDoItems]
ENABLE CHANGE_TRACKING;

Grant the DB identity the ability to see change tracking events on the table

GRANT VIEW CHANGE TRACKING ON OBJECT::dbo.ToDoItems TO [queue-trigger-user];

We also need to configure the "db" connection string. None of the examples on the SQL DB connection string page in Azure are correct. The closest one is the "ADO.NET (Microsoft Entra integrated authentication)" one, except with the managed identity's Client ID in place of the User ID, and Authentication = "Active Directory Managed Identity".
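As a rough sketch of the shape described above, this builds such a connection string in C#. The server name and client ID are made-up placeholders, and the exact set of extra options (port, Encrypt) is my assumption rather than something taken from the portal:

```csharp
using System;

// Hypothetical placeholder values; substitute your own server, database, and client ID.
string server = "myserver.database.windows.net";
string database = "function-test";
string managedIdentityClientId = "00000000-0000-0000-0000-000000000000";

// With a user-assigned managed identity, the User Id field carries the identity's Client ID.
string connectionString =
    $"Server=tcp:{server},1433;" +
    $"Initial Catalog={database};" +
    $"User Id={managedIdentityClientId};" +
    "Authentication=Active Directory Managed Identity;" +
    "Encrypt=True;";

Console.WriteLine(connectionString);
```

There is no password anywhere in the string, which is the point of using the managed identity.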

If things are not working, one place I found errors is in the traces in the Live Metrics view of Application Insights. Nothing showed in the Failures view when the SQL trigger wasn't firing.

Complaint #5: Logging in Azure for Functions, and Azure in general, is in so many places, unlike AWS and its CloudWatch system. There seems to be a lot more that needs to be enabled manually, like when diagnosing access problems to storage. Also, inconsistent time display. Some allow switching between local and UTC, some are only local, some are only UTC. Long delays in some cases e.g. 5 minutes between a function running and it showing in the console.

Complaint #6: The ways to allow resources in Azure to access other resources are too many. And once you've chosen a method, there can be many different ways to configure that one method. E.g. providing a config entry for each storage service (blob/table/queue) vs. a different one that points to the overall storage account. In the storage account case it tries one approach, then another. If you've accidentally provided a config that's partially one and partially the other, you'll have a hard time trying to understand why it's unhappy.

At this point the SQL trigger was working properly.

Event Grid-Based Blob Trigger

I read somewhere that at scale the direct blob trigger is not the best choice and that you should use an Event Grid-based blob trigger. The code looks like this:


using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;
 
namespace QueueTriggerFunction;
 
public class FileTriggerEventGridFunction
{
    private readonly ILogger<FileTriggerEventGridFunction> _logger;
 
    public FileTriggerEventGridFunction(ILogger<FileTriggerEventGridFunction> logger)
    {
        _logger = logger;
    }
 
    [Function(nameof(FileTriggerEventGridFunction))]
    public async Task Run([BlobTrigger("uploads-eventgrid/{name}", Source = BlobTriggerSource.EventGrid)] Stream stream, string name)
    {
        using var blobStreamReader = new StreamReader(stream);
        var content = await blobStreamReader.ReadToEndAsync();
        _logger.LogInformation("C# Blob trigger function via event grid Processed blob\n Name: {name} \n Data: {content}", name, content);
    }
}

Almost the same as the blob trigger, but the Source is set to Event Grid. This is not the same as a pure Event Grid trigger, which would receive an object that describes the event instead of a ready-to-use Stream.

You wire the storage event to the function as a web hook, so you have to build a funky URL to call. This page details how to build it. Overall it's

https://<FUNCTION_APP_NAME>.azurewebsites.net/runtime/webhooks/blobs?functionName=Host.Functions.<FunctionName>&code=<BLOB_EXTENSION_KEY>

The BLOB_EXTENSION_KEY comes from the App keys section of your function resource.
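If you would rather build the URL in code than by hand, a small helper along these lines should do it. The helper name and the sample values are hypothetical; the only thing taken from the format above is the URL shape itself. Escaping the key is my own precaution in case it contains characters like = or +:

```csharp
using System;

// Builds the webhook endpoint for an Event Grid-sourced blob trigger.
// All three arguments are values you supply yourself.
string BuildBlobWebhookUrl(string functionAppName, string functionName, string blobExtensionKey)
{
    return $"https://{functionAppName}.azurewebsites.net/runtime/webhooks/blobs" +
           $"?functionName=Host.Functions.{Uri.EscapeDataString(functionName)}" +
           $"&code={Uri.EscapeDataString(blobExtensionKey)}";
}

// Placeholder app name and key, for illustration only.
Console.WriteLine(BuildBlobWebhookUrl("my-function-app", "FileTriggerEventGridFunction", "example-key"));
```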

In my case I wanted only uploads in a specific container to fire events so I had to specify a filter on the event e.g.

/blobServices/default/containers/uploads-eventgrid/blobs/

After all that, my Event Grid function started firing when I uploaded to uploads-eventgrid.

Summary

I do not find Azure easy to work with. Granted, I've only ever deployed my own little personal projects, while I have built an actual production-quality project in AWS with a team. But the difference in the consistency of experience across services in Azure vs. AWS is glaring. There are fewer ways to do things in AWS, and that's probably due to the history and organization of each company. Azure is starting to introduce AI helpers directly in Azure, but I haven't found them nearly as helpful as pasting errors into Grok. As long as you can find those errors.

Friday, July 5, 2024

Invalid Signature on AWS SNS Message

Amazon Web Services has a service called Simple Notification Service, which does what it says on the tin: it sends notification messages. In my case, I was sending SNS messages to an Azure Function via HTTP calls when an email is undeliverable.

These messages have a cryptographic signature and a link to the certificate that AWS used to sign the message. You can use these to ensure the message came from AWS unchanged. Amazon has libraries to do this in various languages. I was using the .NET one.

I captured the message by having the Node-based Azure function log it to the console e.g.

console.log(request.body);

I copied the output from the Invocation list in Azure Functions and put it in a .json file.

I wrote some C# code to load the file from disk, then loaded it into an AWS SNS Message object with its ParseMessage method. Finally, I called IsMessageSignatureValid on the message. It returned false. Huh?

I suspected that something about logging it to the Azure UI or saving it to disk had somehow mangled it. Maybe something got double escaped/encoded.

To double check what AWS was sending, I subscribed an AWS SQS queue to it so that I could see what that captured. I knew you could look at the messages in the AWS UI.

Looks the same doesn't it? Other than the cut off UnsubscribeURL, which isn't where the problem is.

I saved it to disk and looked through it character by character. I finally noticed the one in the AWS UI had 2 spaces instead of 1 between these 2 bits:

multipart/alternative; boundary

The AWS UI puts the contents in a <textarea> tag, which preserves all whitespace. The Azure UI uses a <span>, which does the usual HTML thing of collapsing consecutive spaces to 1, unless they are &nbsp; characters.

I put an extra space in the file I'd created from Azure's UI and IsMessageSignatureValid finally returned true.
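To see why one missing space is enough, here is a small sketch. It is not the AWS verification code; it uses a made-up header fragment and a plain SHA-256 digest as a stand-in for the signature check, on the assumption that any signature is computed over the exact bytes of the text:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;

// Illustrative fragment only, not the real SNS payload: note the two spaces after the semicolon.
string asSent = "Content-Type: multipart/alternative;  boundary";

// Mimic copying from normally rendered HTML: each run of whitespace becomes a single space.
string asCopied = Regex.Replace(asSent, @"\s+", " ");

// Hex-encoded SHA-256 of the UTF-8 bytes of the text.
string Digest(string text) =>
    Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(text)));

Console.WriteLine(asSent == asCopied);                  // False
Console.WriteLine(Digest(asSent) == Digest(asCopied));  // False
```

The two strings look identical at a glance, but the digests differ, so anything signed over the original will not verify against the copy.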

Sunday, July 9, 2017

Dependency Inversion and Microsoft Web Technologies

When I started learning C# and ASP.NET about 10 or 15 years ago (!!), we rendered HTML with Web Forms and published services with ASMX Web Services. A little while later we moved on to WCF services.

One of the limitations of those technologies was that the framework would only call the default, parameterless constructor of the page and service classes. To allow unit testing, I used "poor man's" dependency injection (DI), which I learned from Jean-Paul Boodhoo's videos on DnRTV. This is the approach where you have 2 constructors: a default one that calls the other one, which takes instances of all needed dependencies. The default constructor created instances of concrete classes that implemented interfaces. E.g.
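A minimal sketch of that two-constructor pattern, using hypothetical names (CustomerPage, ICustomerRepository and so on) rather than anything from the original code:

```csharp
using System;

// Hypothetical names throughout; this only illustrates the two-constructor pattern.
public interface ICustomerRepository
{
    string GetName(int id);
}

// The concrete class that ends up being used in production.
public class SqlCustomerRepository : ICustomerRepository
{
    public string GetName(int id) => throw new NotImplementedException("Would query the database here.");
}

public class CustomerPage
{
    private readonly ICustomerRepository _repository;

    // The framework only knows how to call this one. It picks the concrete class.
    public CustomerPage() : this(new SqlCustomerRepository())
    {
    }

    // Tests call this one and pass in whatever implementation they like.
    public CustomerPage(ICustomerRepository repository)
    {
        _repository = repository;
    }

    public string RenderGreeting(int customerId) => "Hello, " + _repository.GetName(customerId);
}

// A hand-written fake of the kind a unit test would supply.
public class FakeCustomerRepository : ICustomerRepository
{
    public string GetName(int id) => "Test Customer";
}
```

A test would call new CustomerPage(new FakeCustomerRepository()).RenderGreeting(1) and never touch the database, while the framework keeps using the parameterless constructor.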

I don't remember using, or even looking into, mock frameworks. I think we created custom mocks by creating test classes that implement the interfaces and allowed customizing responses. I'm not sure which inversion of control (IoC) containers existed for C# then, or mock frameworks. If they existed, they were probably open source, which was frowned on at the bank at the time, even if just for testing. If they were allowed it involved some paperwork, anyway. (Banks like to have big companies they can call when they need support and who will be around a while.) Our pages and unit tests looked something like this:

This is fine if all you want to do is unit test. But what if you want to do integration testing, where you mock out the layer where you cross into other people's code e.g. the database or an external service? You'd have to do something like this:

This wasn't something we tried to do at the time. We were a pretty inexperienced bunch. (Programming at a bank vs. programming at a software company is like practicing law at a bank vs. at a law firm.)

If I'd have known more about the dependency inversion principle then, I would have been very skeptical about putting up with this limitation. I would have immediately gone searching for ways to insert something into the web request pipeline to control page and service creation. If I need to replace an implementation at the bottom of a dependency graph, it should be as easy as replacing one interface registration in an IoC container.

Searching the web now, it looks like it was possible, but not in a way that made you feel good. The approaches look like hacks, or the domain of .NET experts. If you wanted to stick to the SOLID principles, though, it's what you should have done.

I do find it surprising that Microsoft gave us frameworks based primarily on object-oriented languages (C#, VB.NET) that didn't let you observe object-oriented (OO) practices. Every example from that time involved creating instances of concrete classes in the page or service classes themselves. I even remember one example that told developers to drag and drop a new SqlConnection object onto every page. Not very maintainable. Perhaps Microsoft didn't think the existing MS web developers of the day could embrace these concepts, so many being ASP/VB6 devs.

Whatever the reason, these limitations led to some particularly gnarly code, completely lacking in abstractions and injection points. Take, for example, the WCF service I'm currently tearing apart to allow mocking out the bottom-most layer. It goes something like this:

The WCF class calls static methods. The first time one of them is called, it initializes a static instance, that is, a singleton, of an object that has instances of database and service classes. If you replace one of those instances during a test, you must set it back to null for reinitialization when you're done, or have every test set it before it runs. This is what you get when you ignore the SOLID principles.
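Roughly, the shape is something like the following sketch. Every name in it is hypothetical, and the real service is considerably bigger; the sketch only shows the lazily created static instance and why tests have to clean up after themselves:

```csharp
using System;

// Hypothetical names; a sketch of the shape described above, not the real service.
public interface IAccountStore
{
    decimal GetBalance(int accountId);
}

public class SqlAccountStore : IAccountStore
{
    public decimal GetBalance(int accountId) => throw new NotImplementedException("Would hit the database here.");
}

// Bundles the database and service objects that the static methods need.
public class ServiceDependencies
{
    public IAccountStore Accounts { get; set; } = new SqlAccountStore();
}

public static class AccountOperations
{
    // Shared by every caller. Null means "build the real thing on next use".
    public static ServiceDependencies? Instance;

    // Creates the shared instance the first time any static method needs it.
    private static ServiceDependencies Current => Instance ??= new ServiceDependencies();

    public static decimal GetBalance(int accountId) => Current.Accounts.GetBalance(accountId);
}

// What a test swaps in for the bottom-most layer.
public class FakeAccountStore : IAccountStore
{
    public decimal GetBalance(int accountId) => 42m;
}
```

A test has to assign AccountOperations.Instance to a ServiceDependencies holding the fake before it runs, and set it back to null afterwards. Forget either step and the fake leaks into whichever test runs next.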

It wasn't until ASP.NET MVC 3 that Microsoft built IoC into ASP.NET MVC from the start, allowing controllers to take dependencies in constructors. Until then, people used IoC containers that implemented the complex looking code that allowed using DI in ASP.NET. I'm still surprised that it took so long for them to bake this in.

The next time I look at a technology that is based primarily on an OO language, I'll be looking for the injection points, no matter how complex they are to use. If they don't exist, the technology will need a very compelling reason for me to use it.

Saturday, February 18, 2017

WebAuthenticationBroker and OAuth2 UserCancel Error

Windows Phone 8.1 programming can be a little... opaque. Yes, I was working on a Phone 8.1 project. My phone can't be upgraded to 10. I'm in the less than 1% of phone users.

I wanted to build an app that can do OAuth2 authentication, so I started with an example from IdentityServer3.Samples on GitHub. It uses WebAuthenticationBroker to present the OAuth server's UI to the user. A client with an ID of "implicitclient" was missing from Clients.cs, but I just copied one of the JavaScript implicit flow clients. The sample worked - I could get both an ID token and an access token at the same time.

I created a similar Client.cs in my existing project and pretty much copied the sample WinPhone example. I could get an ID token from my server and I could get an access token. But not both at the same time.

The WebAuthenticationResult.ResponseStatus value was WebAuthenticationStatus.UserCancel, a very generic error that has many sources. The ResponseErrorDetail property had a more specific error number, 2148270093. I couldn't find many references to this number on the web, but in its hex form, 0x800C000D, I found results. It's a "URL moniker code" produced by IE meaning INET_E_UNKNOWN_PROTOCOL. The description is "The protocol is not known and no pluggable protocols have been entered that match." Still not much to go on.
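The decimal-to-hex conversion is a one-liner if you want to check it yourself. The variable names here are just for illustration:

```csharp
using System;

// The value reported by ResponseErrorDetail, as an unsigned 32-bit integer.
uint errorDetail = 2148270093;

// Format as eight uppercase hex digits with the usual 0x prefix.
string hex = "0x" + errorDetail.ToString("X8");

Console.WriteLine(hex); // 0x800C000D
```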

The callback URL for Phone 8.1 apps starts with ms-app://, which I thought maybe wasn't being recognized. I pointed my Phone app at the sample server and I could get both tokens at the same time.

I debugged the Phone app and grabbed the URL from both my auth server and the sample one. The sample server's callback URL was quite a bit shorter. It started to dawn on me that Phone 8.1 uses IE11 and that it might have a fairly conservative URL size limit. It turns out it's 2083 characters, which is not long enough to hold both of my server's tokens. My signing certificate's key is twice the size of the sample server's, making the token signature twice as long.

So, how to shorten the URL?

I was needlessly including some claims, so I cut them out. I read that elliptic curve keys are shorter, which makes for shorter signatures. IdentityServer3 doesn't have support for EC certificates out of the box, so that would have been some work.

Then I finally stumbled across the idea of reference tokens. It turns out that they are the typical way to shorten an OAuth2 callback URL. Instead of the entire access token the URL contains a short identifier. Clients send the identifier to the server and the server looks it up from its database.

After 3 or 4 days of beating my head against the wall, problem solved. Now I can log in.

Wednesday, December 14, 2016

Don't be Afraid of SingleOrDefault - Much Worse, Performance Problem Edition

I found another example at work of someone not taking advantage of SingleOrDefault when making Linq-to-Sql calls, but in a much worse way. I previously mentioned the use of Any() and First() in https://jamesmclachlan.blogspot.ca/2014/05/dont-be-afraid-of-singleordefault.html.

The new example constructs a query and calls Any() to test for the presence of at least 1 result. Any() is very efficient on its own because it uses SQL's EXISTS to just check for at least 1 record without reading anything about the record. This isn't a problem on its own.

Unfortunately, this was followed by a call to ToList() and then [0] to get the first item, instead of First(). The effect of ToList()[0] is to run the query and pull every one of the matching records into memory and then take the first item. First() at least tells SQL to only return the TOP 1 item.
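The same effect is easy to demonstrate in memory. This sketch uses LINQ to Objects with a counting sequence as a stand-in for the database, so it is only an analogy for what LINQ to SQL does, and the Records function is made up for the demo:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int produced = 0;

// Stand-in for a query result: counts how many records are actually materialized.
IEnumerable<int> Records()
{
    for (int i = 0; i < 10_000; i++)
    {
        produced++;
        yield return i;
    }
}

// ToList() pulls every record before the indexer picks the first one.
produced = 0;
int viaToList = Records().ToList()[0];
int producedByToList = produced;   // 10000

// First() stops as soon as it has one record.
produced = 0;
int viaFirst = Records().First();
int producedByFirst = produced;    // 1

Console.WriteLine($"ToList()[0] read {producedByToList} records; First() read {producedByFirst}.");
```

Both calls return the same item. The difference is only in how much work happens to get it.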

Even worse is that because of faulty logic the code fails to add any query parameters, loading ALL of the records in the system of a particular type. Production has many tens of thousands of such records. Luckily, it only does this in a very specific case. If anyone has ever seen a problem they haven't reported it.

So it's not enough to hunt down uses of First() and FirstOrDefault(). We also need to look for ToList()[0], or perhaps all uses of All().

Friday, October 14, 2016

Handling User Data Securely - Really, Really Securely

Imagine you're building a web app that requires a password from a user for encryption. This password is not the same as their login one, it's used only for encrypting and decrypting their data in the browser. The user enters the password, the app uses it to generate an encryption key that the browser can use and then encrypts and uploads data. Later, the user types the password in again so that they can work with the encrypted data.

One of the things your app promises is that it never stores the password on the server in any form that another party could intercept or crack. And if someone compromises the user's computer or it gets stolen or seized the app promises that there is no trace of the password left behind.

This means never including the password or a hash of it on any posts to the server, neither in form submissions nor AJAX calls nor in something like ASP.NET Webforms view state. Similarly, the app must not save the password in any form to a persistent store like the browser's local storage or one of the database systems that some browsers support.

What options do you have for building such a system with a typical web system that involves form posts like ASP.NET MVC or WebForms?

There are 3 ways a web app typically persists data that the user enters from one page to another:

cookies written to disk by the browser and sent back to the server with the page request
a view state system that includes bits of data in hidden form fields
session storage in the browser to leave data behind for the resulting page to read

The first 2 obviously send data to the server, so they are not solutions. The third one doesn't send data to the server, and it might seem as good as being only in memory. However, different browsers treat it differently. They don't all clear it when you close the browser, and if the browser dies or the power goes out it may be left on disk.

You might be tempted to encrypt the value before writing it to view state or session storage, but who controls that encryption key? If you do, then you have to manage it and may have to reveal it under court order. If you generate a key in the user's browser, then you have the same problem of storing something between page views.

You conclude that you can only hold the user's password in memory, meaning a JavaScript variable. The problem is that variables disappear every time you visit another page. The easiest solution to program is for the app to ask the user to enter their password on every page that needs it. Not exactly a delightful experience.

Obviously you want to minimize the number of times the user enters their password. One approach would be for the app to have a field that collects the password, perhaps at the top. The rest of the page performs AJAX-y form posts that submit encrypted data, receives HTML from the server and updates the DOM with the response. Something like WebForms' UpdatePanel or Ajax.BeginForm in MVC.

If the user navigates to another part of the app that doesn't need the password, though, they have to reenter it when they return. This might be an advantage because the app only asks for the password in applicable contexts. Users would learn to do everything they need to before navigating, or they'd use 2 browser windows or tabs.

You could get the number of times the user needs to enter the password down to once per use session by using a single page app (SPA). If the app is all JavaScript and AJAX calls for data then it can keep the variable holding the password alive for the entire duration of use. If the user closes the window the password evaporates. (Well, it might still be sitting in RAM or on disk in a virtual memory file, but this seems beyond our control in a browser.)

SPAs have an entirely different development process that you may have to learn, especially if you are used to server side web development (ask me how I know :). Putting your time in to learn will be worth it, though, if you want to give the user the best possible experience of entering a secure value exactly once.