is one of the most popular ORM frameworks in the .NET world, and for good reason. It’s generally quick and easy to get started doing basic CRUD operations, performance is acceptable for most use cases, and at version 6.3 for the full .NET Framework, it is very mature and stable. However, for more complex queries, it is common to run into some performance issues.
In this blog post, I will detail the steps I took to reduce the time to complete a query using Entity Framework, and then ultimately how I used another .NET ORM framework called Dapper to further increase query performance.
Recently I worked on a project for users to track hundreds of data points about their clients, and then create complex reports using that data. Clients would have demographic information such as age, gender, household type, and others spread across normalized database tables. They would also have multiple “assessment” collections tied to them, where users would input data about their clients’ health, income, and other factors.
The model would look something like this:
…only with many more assessment types (each of which contained their own nested object properties).
Since most of the reports the users needed to generate would reference all the various assessments and other nested objects within the Client, I decided the best way to handle this would be to query the database once and populate all these nested properties, then pass this collection to the report service where the calculations are performed.
This looked something like this:
…only with many more . Include() statements added.
When Entity Framework builds this query, it will use SQL “JOIN”s to get the one-to-one mappings such as “HouseholdType”. However, when it gets the collections, it will perform a “UNION ALL” across all the assessments and their various properties. Depending on how many collections you need to include, and how many properties/nested objects they contain, this can create one huge ugly query that is not very performant. With hundreds of clients, containing dozens of assessments, this query would take a few minutes to complete in our testing.
Finding the Solution
The first thing I tried to remedy this issue was to drop the .Include() statements for the nested collections, and instead loop through the clients and manually call “Load” on them.
That looked like this:
This method will make a call to the database for each collection. In our testing, it did shave about 30 seconds off the total query time, but that wasn’t quite enough for my liking. I had heard of the performance benefits of Dapper, so at this point, I had decided to give it a try.
Dapper to the Rescue!
With Dapper, you write the SQL query yourself and write code to tell it how to map the results of your query to your models. To do this with collections, I used the QueryMultiple extension, as shown below:
To my surprise, loading my nested objects this way reduced our total query time from 2-3 minutes down to 6 seconds!
As you can see, it is easy to include Dapper alongside an existing Entity Framework project. All I had to do was install the Dapper NuGet package and then add “using Dapper;” to my namespace imports at the beginning of my code.