I recently worked through a situation where a bulk upsert (update if a matching record already exists; otherwise insert a new record) to SQL Server implemented using Entity Framework 6 was producing the correct results, but running very slowly.
I was able to improve the performance of the bulk upsert by about 600x (!!!) by replacing the Entity Framework implementation with a an new approach based on a stored procedure taking a table-valued parameter.
Entity Framework implementation – very slow
The original code, using Entity Framework 6 (altered to use a table / object type of “Employee” instead of the actual type I was working with):
DateTime now = DateTime.Now;
using (MyCustomContext context = new MyCustomContext())
{
foreach (Employee employee in employeeData)
{
// Get the matching Employee record from the database, if there is one.
Employee employeeRecord = context.Employees
.Where(e => e.EmployeeID == employee.EmployeeID)
.SingleOrDefault();
bool isNewRecord = false;
if (employeeRecord == null)
{
// We don't have a record in the database for this employee yet,
// so we'll add a new one.
isNewRecord = true;
employeeRecord = new Employee()
{
EmployeeID = employee.EmployeeID,
CreationDate = now
};
}
employeeRecord.ModificationDate = now;
employeeRecord.FirstName = employee.FirstName;
// (Set the remaining attributes...)
if (isNewRecord)
{
context.Employees.Add(employeeRecord);
}
context.SaveChanges();
}
}
That approach was taking around 300 seconds per 1000 records inserted. It was bogging down my application in situations where I needed to insert/update tens of thousands of records at once.
Stored Procedure + Table-Valued Parameter implementation – fast!
In my testing, the revised approach below, replacing the Entity Framework implementation with a stored procedure taking a table-valued parameter, was able to upsert the same 1000 records in 0.5 second – a huge improvement!
I needed to add a custom table type with columns matching the Employees table:
CREATE TYPE EmployeesTableType AS TABLE
(
[EmployeeID] [int] NOT NULL,
[CreationDate] [datetime] NOT NULL,
[FirstName] [nvarchar(100)] NOT NULL,
/* More fields here... */
);
Next, the stored procedure to perform the upsert, taking input of an instance of the custom EmployeesTableType, and using the SQL Server MERGE command:
CREATE PROCEDURE dbo.UpsertEmployees
@UpdateRecords dbo.EmployeesTableType READONLY
AS
BEGIN
MERGE INTO Employees AS Target
USING @UpdateRecords AS Source
ON Target.EmployeeID = Source.EmployeeID
WHEN MATCHED THEN
UPDATE SET Target.EmployeeID = Source.EmployeeID,
Target.FirstName = Source.FirstName,
/* More fields here... */
WHEN NOT MATCHED THEN
INSERT (EmployeeID,
FirstName,
/* More field names here... */
)
VALUES (Source.EmployeeID,
Source.FirstName,
/* More field values here... */
);
END
Finally, here’s the C# code I used to execute the stored procedure:
public static void UpsertEmployeeData(List<Employee> employees)
{
string connectionString = MyCustomMethodToGetConnectionString();
using(SqlConnection connection = new SqlConnection(connectionString))
{
connection.Open();
using (SqlCommand command = connection.CreateCommand())
{
command.CommandText = "dbo.UpsertEmployees";
command.CommandType = CommandType.StoredProcedure;
SqlParameter parameter = command.Parameters.AddWithValue("@UpdateRecords",
CreateUpdateRecordsSqlParam(employees)); // See implementation below
parameter.SqlDbType = SqlDbType.Structured;
parameter.TypeName = "dbo.EmployeesTableType";
command.ExecuteNonQuery();
}
}
}
private static DataTable CreateUpdateRecordsSqlParam(IEnumerable employees)
{
DataTable table = new DataTable();
table.Columns.Add("EmployeeID", typeof(Int32));
table.Columns.Add("FirstName", typeof(string));
// More fields here...
foreach (Employee employee in employees)
{
table.Rows.Add(employee.EmployeeID,
employee.FirstName,
// More fields here...
}
return table;
}
Ideally, this post will serve as a useful reference the next time I, or you, need to code up a bulk upsert into SQL Server!
Credit to this StackOverflow answer by Ryan Prechel which was the primary inspiration for this approach.
Thanks Jon. This article has been very helpful.
ReplyDelete