Tuesday, August 30, 2016

Fix: SSL Handshaking Error in Windows Server 2008 R2 / 2012 R2

I recently did troubleshooting for, and managed to successfully fix, an issue where HTTPS connections to a specific remote server were failing to be made successfully.  The client computers affected by the issue were a pair of servers, running Windows 2012 R2 and Windows 2008 R2, respectively.

For the purposes of this post, I’ll use https://tls.example.com as the URL of the remote server.

The Problem

Symptom 1: In a C# program, an attempt to establish an HTTPS (SSL / TLS) connection to https://tls.example.com failed.  Error message: “The request was aborted: Could not create SSL/TLS secure channel.”

  • The program did work fine to make connections to all other HTTPS URLs that we had tried.
  • The exact same C# program worked fine when I ran it from my local workstation as the client PC (connecting to the same https://tls.example.com remote server).

Symptom 2: In Internet Explorer 11, attempting to connect to https://tls.example.com failed.  Error message: “Turn on TLS 1.0, TLS 1.1, and TLS 1.2 in Advanced settings and try connecting to again. If this error persists, contact your site administrator.”

IE11_TLS_Error

  • However, connecting to https://tls.example.com using the Chrome browser from that same client PC worked fine.
  • Connecting to https://tls.example.com from my local workstation using Internet Explorer 11 also worked fine.

The Solution

Note: This solution will only help if the remote server is configured with an SSL key that has an ECDSA (not RSA) signature, but all of the the cipher_suites that the client PC is configured to support are RSA (not ECDSA).

Note 2: If you’re reading this post after August 2016, check and make sure the new cipher_suites value that you add is one that’s still cryptographically valid. These things tend to change over time!

Note 3: Don’t use Registry Editor (as suggested here) unless you know what you’re doing. It can permanently damage your PC.

In my case, the problem was caused by there being no match between the set of cipher_suites supported by the client, and the set of values that the server was able to accept.  Specifically, in my case, the server had an SSL key signed with ECDSA (not RSA), and my problematic client PCs were configured to use only ECDSA (not RSA) cipher_suites.  This caused SSL handshaking to fail after the initial “Client Hello” step.

I was able to fix this by adding a ECDSA value to my client PC’s set of cipher_suites:

On the client PC:

  1. Open the Registry Editor.
  2. Navigate to HKLM/SOFTWARE/Policies/Microsoft/Cryptography/Configuration/SSL/0010002
  3. Edit the existing comma-separated value, and add a new value to the end that’s supported by the client OS, is cryptographically secure, and works with a key with an ECDSA signature. The value I used: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256_P256
  4. Reboot.

Investigation Details

The remainder of this post details the investigation that led me to the above solution.

SSL / TLS protocol mismatch?

I’ve run into SSL handshaking problems before caused by a protocol mismatch. For example, the client specified that it would only connect using SSL 3.0 or TLS 1.0, but the server would only accept TLS 1.2.  However, that did not seem to be the cause of the issue here (despite the Internet Explorer error message):

  • In my C# program, I was specifying that the client accept any of TLS 1.2 | TLS 1.1 | TLS 1.0. 
  • In Internet Explorer’s Advanced Options dialog, I confirmed that the checkboxes for TLS 1.2, TLS 1.1, and TLS 1.0 were all already checked (again, despite the error message).
  • In Firefox, by clicking on the green lock icon in the address bar after successfully connecting to the remote website, I confirmed that the connection was secured using TLS 1.2.

Firefox_Security_Info

As far as I could tell, both the client and server should be agreeing on the use of TLS 1.2.  Thus, probably not a protocol mismatch issue.

SSL certificate trust chain issue?

When I asked myself the question “So what’s different between my local PC (where things work fine) and my server PCs (not working)?”, the first answer I came up with was, maybe the installed trusted SSL root certificates?

However, that theory turned out to be a dead end in this case.  When I used the “Manage server certificates” / “certlm” tool to look at the installed certificates on my PCs at Certificates > Trusted Root Certification Authorities, although there were some differences between the root certs on my local Windows 10 PC versus the root certs installed on the Windows Server 2012 R2 PC, that didn’t turn out to be the cause of the problem.

Additional symptom: System event log error

My first clue to the actual problem was a Windows System event log error that I noticed would be logged whenever I reproduced the HTTPS connection failure in Internet Explorer or my custom C# program:

“A fatal alert was received from the remote endpoint. The TLS protocol defined fatal alert code is 40.”

A helpful MDSN blog post defined that error code of 40 as “handshake_failure”.

Network traffic sniffing using Microsoft Message Analyzer

As suggested by another very helpful Microsoft blog post, I installed Microsoft Message Analyzer.  (It turns out that I needed to install the 64-bit version of Analyzer to match my OS, even though as far as I know, browsers typically run as 32-bit processes.)

Using Message Analyzer turned out to be easy.  I just did the following:

    1. In Analyzer, hit the “New Session” button;
    2. Selected “Local Network Interfaces”;
    3. Hit Start;
    4. Switched windows to my C# program, and reproduced the issue;
    5. Switched back to Analyzer, and hit the Stop button.

Analyzer_NewSession

I filtered out all irrelevant events captured while my session was running by applying this filter:

(*Source == "www.example.com" or *Destination == "www.example.com") and *Summary contains "Handshake"

(Where both instances of “www.example.com” were replaced with the actual host to which I was connecting.)

On my local PC where the HTTPS connection was working, the Message Analyzer results included a “Handshake: [Client Hello]” message originating from my local PC, followed by a “Handshake: [Server Hello]” originating from the server.

However, on the Windows Server 2012 R2 machine where the the connection was failing, I could see the “Handshake: [Client Hello]” from the local machine was followed by an “Alert” reply from the server!

Doing a right-click | Show Details on the Alert reply, I could see that it contained a body message of “Level 2, Description 40”.  This reply must have been what the System Event Log was picking up to generate that message that I’d noticed earlier.

Comparing the successful and unsuccessful Client Hello messages

At this point, I’d narrowed down the difference between the succeeding and failing environments to the differing server replies to the initial “Client Hello” step of SSL handshake. 

Still in Message Analyzer, I did another Show Details to compare the contents of the “Client Hello” on my Windows 10 PC (working) and my Windows Server 2012 R2 machine (not working). 

The significant difference turned out to be the cipher_suites parameter in the body of each PC’s “Client Hello” message. 

Analyzer

As I learned, the cipher_suites parameter contains the list of encryption settings which the PC sending the message is able to handle. The idea is that the server picks the one from that list that it prefers, sends a “Server Hello” reply that includes the selected cipher suite, and the two sides use that to securely communicate.

It turns out that while my Windows 10 PC (working) was sending a selection of 33 cipher_suites values that it was able to support, the Server 2012 R2 PC (not working) was sending only 11 cipher_suites values!

Each cipher_suites value, while it appears in the raw message body as an integer, “translates” to a descriptive string value like: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA256. (Message Analyzer helpfully performs this translation when displaying the values in the cipher_suites value under the “body” value, as is mostly visible in the screenshot above.)

The Microsoft article Cipher Suites in TLS/SSL provides a very helpful picture of what the parts of those cipher_suites values mean, which I’ll borrow and display here:

IC858360

Taking a closer look, the 33 cipher_suites values from the Client Hello message Windows 10 PC (working) included a mix of cipher_suites values contained a mix of RSA, DHE, and ECDSA as the Signature value.  The 11 values from the Server 2012 R2 PC (not working) all had RSA as the Signature value!

A Certificate Signing Algorithm Mismatch?

Discovering that the not-working Server 2012 R2 PC was effectively saying that it would only support RSA as the cert signing method immediately suggested a new likely theory: If the server cert was signed with something other than RSA, the SSL handshaking would fail.

Sure enough, drilling further down into the cert details in Firefox showed that the cert was signed with not RSA, but ECDSA:

Cert_Details

In essence, the failing SSL handshaking conversation was going like this:

  • Client [Client Hello]: Hey, let’s talk securely, using any of these methods (…), as long as you’ve got an RSA-signed cert!
  • Server [Alert]: Sorry, nope, I can’t do business along those parameters. Bye!

Getting the Server 2012 PC to accept an ECDSA certificate

A great blog post by Nartac Software on how their IIS Crypto tool works pointed me to the solution. A Windows registry key mentioned in that article contained the same set of cipher_suites values that I was seeing in the problem PC’s Client Hello SSL handshake message:

HKLM\SOFTWARE\Policies\Microsoft\Cryptography\Configuration\SSL\00010002

In the Server Hello SSL handshake message on my working Windows 10 PC, I could see that the cipher_suites value that the server had selected to successfully communicate with was:

TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256

From that same article, another registry location has the list of supported cipher suites on the server:

HKLM\SYSTEM\CurrentControlSet\Control\Cryptography\Configuration\Local\SSL\00010002

Looking in that registry location on the Server 2012 R2 PC, I saw that one of the supported values was

TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256_P256

With the cipher suite portion of that key being a match for the accepted value that had been accepted by the server in the SSL handshake from my Windows 10 PC, I edited the comma-separated list of cipher suite values from the first 00010002 registry key above to include this additional cipher key value. Finally, I rebooted the Server 2012 R2 PC (since a reboot is required to make the change take effect).

After the reboot, the problems were solved! Internet Explorer was successfully able to connect to the target website, and my C# app was also able to successfully establish an HTTPS connection.

So how had this happened?

I posed the question to the failing client PCs’ hosting provider: Are Windows Server 2008 R2 and Windows Server 2012 R2 machines configured by default to only accept RSA SSL certs, or is this something that the hosting provider configures in their “default” images?

The answer, it turned out, was neither of the above. Instead, the missing non-RSA cipher suite values had been intentionally removed in an “server hardening” task performed some time in the past.  This probably made sense originally, under an assumption that these servers would never themselves be acting as the client side of an HTTPS connection, and therefore for the sake of reducing attack surface, could have cipher suites with signature types other than the servers’ own cert signatures disabled.

Wednesday, August 10, 2016

Bulk upsert into SQL Server

I recently worked through a situation where a bulk upsert (update if a matching record already exists; otherwise insert a new record) to SQL Server implemented using Entity Framework 6 was producing the correct results, but running very slowly. 

I was able to improve the performance of the bulk upsert by about 600x (!!!) by replacing the Entity Framework implementation with a an new approach based on a stored procedure taking a table-valued parameter.

Entity Framework implementation – very slow

The original code, using Entity Framework 6 (altered to use a table / object type of “Employee” instead of the actual type I was working with):

DateTime now = DateTime.Now;
using (MyCustomContext context = new MyCustomContext())
{
    foreach (Employee employee in employeeData)
    {
        // Get the matching Employee record from the database, if there is one.
        Employee employeeRecord = context.Employees
            .Where(e => e.EmployeeID == employee.EmployeeID)
            .SingleOrDefault();

        bool isNewRecord = false;
        if (employeeRecord == null)
        {
            // We don't have a record in the database for this employee yet, 
            // so we'll add a new one.
            isNewRecord = true;
            employeeRecord = new Employee()
            {
                EmployeeID = employee.EmployeeID,
                CreationDate = now
            };
        }

        employeeRecord.ModificationDate = now;
        employeeRecord.FirstName = employee.FirstName;
        // (Set the remaining attributes...)

        if (isNewRecord)
        {
            context.Employees.Add(employeeRecord);
        }

        context.SaveChanges();
    }
}

That approach was taking around 300 seconds per 1000 records inserted.  It was bogging down my application in situations where I needed to insert/update tens of thousands of records at once.

Stored Procedure + Table-Valued Parameter implementation – fast!

In my testing, the revised approach below, replacing the Entity Framework implementation with a stored procedure taking a table-valued parameter, was able to upsert the same 1000 records in 0.5 second – a huge improvement!

I needed to add a custom table type with columns matching the Employees table:

CREATE TYPE EmployeesTableType AS TABLE
(
    [EmployeeID] [int] NOT NULL,
    [CreationDate] [datetime] NOT NULL,
    [FirstName] [nvarchar(100)] NOT NULL,
    /* More fields here... */
);

Next, the stored procedure to perform the upsert, taking input of an instance of the custom EmployeesTableType, and using the SQL Server MERGE command:

CREATE PROCEDURE dbo.UpsertEmployees
    @UpdateRecords dbo.EmployeesTableType READONLY
AS
BEGIN
    MERGE INTO Employees AS Target
    USING @UpdateRecords AS Source
    ON Target.EmployeeID = Source.EmployeeID
    WHEN MATCHED THEN
        UPDATE SET Target.EmployeeID = Source.EmployeeID,
            Target.FirstName = Source.FirstName,
            /* More fields here... */
    WHEN NOT MATCHED THEN           
        INSERT (EmployeeID,
            FirstName,
            /* More field names here... */
            )
        VALUES (Source.EmployeeID,
            Source.FirstName,
            /* More field values here... */
            );
END

Finally, here’s the C# code I used to execute the stored procedure:

public static void UpsertEmployeeData(List<Employee> employees)
{
    string connectionString = MyCustomMethodToGetConnectionString();
    using(SqlConnection connection = new SqlConnection(connectionString))
    {
        connection.Open();

        using (SqlCommand command = connection.CreateCommand())
        {
            command.CommandText = "dbo.UpsertEmployees";
            command.CommandType = CommandType.StoredProcedure;

            SqlParameter parameter = command.Parameters.AddWithValue("@UpdateRecords",
                CreateUpdateRecordsSqlParam(employees));  // See implementation below
            parameter.SqlDbType = SqlDbType.Structured;
            parameter.TypeName = "dbo.EmployeesTableType";

            command.ExecuteNonQuery();
        }
    }
}

private static DataTable CreateUpdateRecordsSqlParam(IEnumerable employees)
{
    DataTable table = new DataTable();
    table.Columns.Add("EmployeeID", typeof(Int32));
    table.Columns.Add("FirstName", typeof(string));
    // More fields here...

    foreach (Employee employee in employees)
    {
        table.Rows.Add(employee.EmployeeID,
            employee.FirstName,
            // More fields here...
    }

    return table;
}

Ideally, this post will serve as a useful reference the next time I, or you, need to code up a bulk upsert into SQL Server!

Credit to this StackOverflow answer by Ryan Prechel which was the primary inspiration for this approach.