Focused Apps Need Focused Databases for .NET
Microsoft has put a lot of effort into making .NET a fully cross-platform framework appropriate for projects of any size. .NET Core, .NET 5, and the upcoming .NET 6 have all allowed for not only more sophisticated applications but ones that can run on our phones, in our web browsers, along with a greater variety of desktop environments.
The additional platforms and the new applications possible with .NET Core has made us re-examine our own software, VistaDB. We needed to make VistaDB a better database for more environments, more applications, and we found the best way to do that was to write a group of new modes for our database engine: the single process modes. Overall, these modes improve the query performance of our database performance substantially (up to 20% over the multi process mode) and expand the use cases for VistaDB.
Here, we’ll discuss the broader .NET trends that led to us adding the single process mode and some specifics on how it works.
Doing Less to… Do More?
Helping .NET run in limited environments has been a considerable focus of Microsoft’s .NET development. It’s why many of the biggest features have been mobile optimizations and stripping away what it takes to run a .NET application. Just look at the Themes of .NET website’s listed goals:
- “.NET apps work well in constrained environments” like mobile phones, tablets, containers, and more, which don’t have the resources of traditional windows desktops.
- “Xamarin developers can upgrade to and use the latest .NET SDKs in their existing applications” so Xamerin users (mobile app devs) can leverage the features and performance improvements in .NET 5 and 6.
- “Developers can safely trim ASP.NET apps to reduce their deployment payloads” meaning remove assemblies from .NET in self-contained apps. A measure taken to reduce the resources needed to run a .NET application.
I could reference more examples from that list alone, but the overwhelming trend is Microsoft wants .NET capable of running on any platform. In some cases, even if it means developers can pick .NET apart to do specific things better, like with trimming. Trimming allows developers to deploy with only specific components, making the application easier to fit into constrained environments.
Reducing the footprint, and in some scenarios the function, of .NET is a fascinating aspect of the development of .NET Core. Microsoft is making .NET a better framework by embracing minimalism!
It’s counterintuitive that a minimal .NET can allow for more usage, but it’s a principle that works across multiple industries. For example, let’s discuss the room pictured above.
It’s pretty simple, and that’s one of the appealing features. If I tried to make the room too “feature-packed,” it would probably make it worse. It doesn’t need to match the feature set of an arcade. It only needs to be a nice place to sit, and adding a skeeball table would make it worse for that goal.
The same broad principles apply to many .NET mobile and web applications. They don’t do the most advanced computing possible, or complete every function a company needs. These applications only need to provide an excellent experience for a few activities. Carrying the extra weight of unneeded frameworks and resources does not help anyone, so Microsoft has taken steps to make sure .NET can scale down appropriately.
If you want an example of how .NET scales down to make focused applications, just look at the applications featured in the Xamarin customer showcase. The applications described on the page complete relatively simple tasks (ordering food, tracking packages, reading the news, etc.). But they do so across a wide variety of platforms and devices. Removing unnecessary assemblies and features lets applications do the few necessary tasks better.
So, does any of this matter for the database?
It does, as the database can be a heavy performance problem for mobile and web applications. To make VistaDB the best embedded database to pair with .NET, we need to consider the smaller-scale, focused applications. .NET is now an excellent platform for making those applications, and we want to be as well.
The goal was always to bring a complete SQL Server experience to an embedded database that’s easier to deploy and fits on more devices. This way, more developers can get the full-fat SQL Server functionality they need for whatever application they want to build. But when an app needs to focus on one task, we want VistaDB to be an appropriate tool for the job.
So, let’s take a look at a feature we added in VistaDB 6 and continue to optimize, which helps it work for more focused applications across various mobile and web platforms: the single process modes.
Optimizing for Smaller, Focused Apps: Single Process Database Usage
The development of .NET Core and .NET 5 has changed how we viewed our .NET product, VistaDB. It’s not that we didn’t consider mobile and web applications before. We had Xamarin support, and knew our database could work for mobile app development. But the larger shift in focus in .NET showed us it was time to see if we could do more.
Previously, the primary use case was for multi-process desktop applications (aka, multiple processes could access the database file simultaneously), and we optimized our engine for those scenarios. But mobile and web applications often only use a single process to access the database at any given time. The goals for these applications are stripped down compared to desktop counterparts, and only need one process to run. In these use cases, many of our optimizations targeted for multi-process applications were not impactful.
So, starting with VistaDB 6 and VistaDB 6.1 (and continuing with VistaDB 6.2), we developed and optimized the single process modes in our engine. In short, the single process modes reduce the query time by approximately 10% - 20% compared to the multi process mode. There is enough of a difference that it matters for mobile and web applications, which need every edge they can get. Let’s go into why the single process modes provide a performance improvement.
Single Process Locks In Memory
If you are unfamiliar with why file locking is essential for embedded databases in the first place, I wrote about it here. But in short, locking is necessary because it prevents:
-
Lost Data: if multiple processes can access the file at the same time, one may overwrite the data written by the other. That means you will only have the changes made by one process stored on the database, which means any unique data from the other process is gone.
-
Incorrect Data: If two processes access the database simultaneously, they have no chance to reference changes made by each other. This may result in inaccurate data if what’s written by one process would impact the other.
-
File Corruption: If two processes make significant meta-data changes that overwrite each other, the database has a high chance of corruption, meaning it will require considerable effort to recover data.
We rely on file-system locks for multi-process applications to negotiate access to the database file, and it’s effective. Multi-process mode runs very safely thanks to the file-system locks. But, this locking method means we are restrained to the speed of the file system itself, which can get pretty slow, especially on lower spec devices. Many applications only use a single process though, including many mobile and web applications. For these users, it was worth designing a locking method that could perform better on more hardware.
So we did. We added two single-process modes in VistaDB 6: Single Process Read-Only and Single Process Read-Write. There’s an obvious downside with both of these modes: you can only use one process. But adding the single process modes let us change the locking method entirely.
These modes complete all the file locking in memory instead of using the file system itself. A single process only needs to negotiate for its own connections, so a file-system lock isn’t necessary. This allows for controlling locks within memory instead of on disk, which gives a nice performance boost to these applications.
But while it helped, we didn’t see as much performance gain as we expected. The code we were sharing with the other locking modes was masking most of the performance gains we had hoped to get. Ultimately, we had to make further changes to the locking implementation to help the single process modes perform optimally, which we’re delivering in VistaDB 6.2. Let’s take a look at some results from our performance tests:
These logs show a select query on a table with 297,305 rows. It’s a much larger table than average, but some applications will have a table or two of similar size, and we want to test for those scenarios. With VistaDB 6.1, this query takes a little over eight seconds. In version 6.2, the query takes just a little over four seconds, a 49.44% time difference between the two:
The baseline runs are on VistaDB 6.1, while the comparison is with VistaDB 6.2, with the query from the log data being the first result in the above screenshot. Overall, through all of our tests (85 queries per test run), we see a 25% -50% performance improvement per query! How we got these results involved two improvements:
- We have sped up multiple concurrent access
- We have reduced lock contention time between multiple processes
Specifically, lock contention has been a challenge since VistaDB 6.0 and has required some iterations along the way.
Negotiating Multiple Connections for a Single Process
Even if only one process is running, most applications will have multiple connections within that process. This is why a single process still needs to implement a locking method. The same problems caused by multiple processes accessing a database at the same time can happen with multiple connections from a single process.
Working with customers, we saw that even when locks were acquired quickly, we were still getting lock timeouts.
Digging into the problem, we discovered that the way we were selecting the next connection to get a lock when it was released
was designed for speed (great!) but didn’t guarantee every connection would get a chance at the lock.
Under high contention, some connections would get passed over repeatedly, causing them to eventually time out.
So we added a fairly simple solution: a fair queue.
- If there are one or two queries, we optimize for throughput.
- If there are many queries, we optimize for fairness. First come, first serve.
The queue adds a small amount of time to the negotiation process, but significantly improves application reliability. To make the single-process modes as good as our multi-process modes, we need them to be just as sturdy. While we try to avoid performance tradeoffs whenever possible, this was a worthwhile one to make. The performance hit is relatively small and worth it for the improved consistency. Thanks to changes made in VistaDB 6.2, we have reduced the performance impact of the queue as well, and wrote tests to keep tabs on this issue.
Focus More on Your App, We’ll Focus on the Database
The work we have done creating and optimizing the single process modes for the VistaDB engine allows our users to focus on making their applications great. And it’s work we are continuing. Performance optimization is one of the most important aspects of any database engine, especially when resources are limited. So, we’ll continue to sweat shaving seconds off transaction times, ensuring a sturdy database, and let our users focus on making the best .NET applications possible. No matter where they run.
Interested in try VistaDB for your application? It’s easy to get started with our 30-day free trial. You can learn more about VistaDB here, or join the trial now with the link below.