I recently troubleshot and fixed an issue where HTTPS connections to a specific remote server were failing. The affected client computers were a pair of servers, running Windows Server 2012 R2 and Windows Server 2008 R2, respectively.
For the purposes of this post, I’ll use https://tls.example.com as the URL of the remote server.
The Problem
Symptom 1: In a C# program, an attempt to establish an HTTPS (SSL / TLS) connection to https://tls.example.com failed. Error message: “The request was aborted: Could not create SSL/TLS secure channel.”
- The program connected without problems to every other HTTPS URL that we tried.
- The exact same C# program worked fine when I ran it from my local workstation as the client PC (connecting to the same https://tls.example.com remote server).
Symptom 2: In Internet Explorer 11, attempting to connect to https://tls.example.com failed. Error message: “Turn on TLS 1.0, TLS 1.1, and TLS 1.2 in Advanced settings and try connecting to https://tls.example.com again. If this error persists, contact your site administrator.”
- However, connecting to https://tls.example.com using the Chrome browser from that same client PC worked fine.
- Connecting to https://tls.example.com from my local workstation using Internet Explorer 11 also worked fine.
The Solution
Note: This solution will only help if the remote server is configured with an SSL key that has an ECDSA (not RSA) signature, but all of the cipher_suites that the client PC is configured to support are RSA (not ECDSA).
Note 2: If you’re reading this post after August 2016, check and make sure the new cipher_suites value that you add is one that’s still cryptographically valid. These things tend to change over time!
Note 3: Don’t use Registry Editor (as suggested here) unless you know what you’re doing. It can permanently damage your PC.
In my case, the problem was that the set of cipher_suites offered by the client had no overlap with the set that the server was able to accept. Specifically, the server had an SSL key signed with ECDSA (not RSA), and my problematic client PCs were configured to use only RSA (not ECDSA) cipher_suites. This caused SSL handshaking to fail right after the initial “Client Hello” step.
I was able to fix this by adding an ECDSA value to my client PC’s set of cipher_suites:
On the client PC:
- Open the Registry Editor.
- Navigate to HKLM\SOFTWARE\Policies\Microsoft\Cryptography\Configuration\SSL\00010002
- Edit the existing comma-separated value, and add a new value to the end that’s supported by the client OS, is cryptographically secure, and works with a key with an ECDSA signature. The value I used: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256_P256
- Reboot.
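The list edit in the steps above can be sketched as a small helper. This is only an illustration in Python, not the procedure I actually followed (I edited the value by hand in Registry Editor). The function name add_cipher_suite is hypothetical, and reading or writing the registry is deliberately left out.

```python
def add_cipher_suite(existing: str, new_suite: str) -> str:
    """Return the comma-separated suite list with new_suite appended once.

    Whitespace around entries is dropped and empty entries are ignored.
    If new_suite is already present, the list is returned unchanged
    (apart from that normalization).
    """
    suites = [s.strip() for s in existing.split(",") if s.strip()]
    if new_suite not in suites:
        suites.append(new_suite)
    return ",".join(suites)
```

Appending to the end keeps the existing preference order intact, so suites that were already negotiated with other servers are still offered first.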
Investigation Details
The remainder of this post details the investigation that led me to the above solution.
SSL / TLS protocol mismatch?
I’ve run into SSL handshaking problems before caused by a protocol mismatch. For example, the client specified that it would only connect using SSL 3.0 or TLS 1.0, but the server would only accept TLS 1.2. However, that did not seem to be the cause of the issue here (despite the Internet Explorer error message):
- In my C# program, I was specifying that the client accept any of TLS 1.2 | TLS 1.1 | TLS 1.0.
- In Internet Explorer’s Advanced Options dialog, I confirmed that the checkboxes for TLS 1.2, TLS 1.1, and TLS 1.0 were all already checked (again, despite the error message).
- In Firefox, by clicking on the green lock icon in the address bar after successfully connecting to the remote website, I confirmed that the connection was secured using TLS 1.2.
As far as I could tell, both the client and server should be agreeing on the use of TLS 1.2. Thus, probably not a protocol mismatch issue.
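One way to double-check the server side of that conclusion is to connect with a client that offers TLS 1.2 and nothing else. The sketch below is an assumption-laden illustration, not something I ran during the investigation: the function names are hypothetical, and Python's ssl module uses OpenSSL rather than the Windows Schannel settings that Internet Explorer and .NET rely on, so it only probes what the server will accept, not how the Windows client is configured.

```python
import socket
import ssl


def tls12_only_context() -> ssl.SSLContext:
    """Build a client context that offers TLS 1.2 and nothing else."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    context.maximum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_protocol(host: str, port: int = 443, timeout: float = 10.0):
    """Connect to host and return (protocol version, cipher suite name).

    Raises ssl.SSLError if the handshake fails.
    """
    context = tls12_only_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

If negotiated_protocol("tls.example.com") returns without raising, the server is willing to speak TLS 1.2, which points away from a protocol version mismatch.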
SSL certificate trust chain issue?
When I asked myself the question “So what’s different between my local PC (where things work fine) and my server PCs (not working)?”, the first answer I came up with was, maybe the installed trusted SSL root certificates?
However, that theory turned out to be a dead end in this case. I used the “Manage computer certificates” / “certlm” tool to look at the installed certificates on my PCs at Certificates > Trusted Root Certification Authorities. Although there were some differences between the root certs on my local Windows 10 PC and the root certs installed on the Windows Server 2012 R2 PC, they didn’t turn out to be the cause of the problem.
Additional symptom: System event log error
My first clue to the actual problem was a Windows System event log error that I noticed would be logged whenever I reproduced the HTTPS connection failure in Internet Explorer or my custom C# program:
“A fatal alert was received from the remote endpoint. The TLS protocol defined fatal alert code is 40.”
A helpful MSDN blog post defined that error code of 40 as “handshake_failure”.
Network traffic sniffing using Microsoft Message Analyzer
As suggested by another very helpful Microsoft blog post, I installed Microsoft Message Analyzer. (It turns out that I needed to install the 64-bit version of Analyzer to match my OS, even though as far as I know, browsers typically run as 32-bit processes.)
Using Message Analyzer turned out to be easy. I just did the following:
- In Analyzer, hit the “New Session” button;
- Selected “Local Network Interfaces”;
- Hit Start;
- Switched windows to my C# program, and reproduced the issue;
- Switched back to Analyzer, and hit the Stop button.
I filtered out all irrelevant events captured while my session was running by applying this filter:
(*Source == "www.example.com" or *Destination == "www.example.com") and *Summary contains "Handshake"
(Where both instances of “www.example.com” were replaced with the actual host to which I was connecting.)
On my local PC where the HTTPS connection was working, the Message Analyzer results included a “Handshake: [Client Hello]” message originating from my local PC, followed by a “Handshake: [Server Hello]” originating from the server.
However, on the Windows Server 2012 R2 machine where the connection was failing, I could see the “Handshake: [Client Hello]” from the local machine was followed by an “Alert” reply from the server!
Doing a right-click | Show Details on the Alert reply, I could see that it contained a body message of “Level 2, Description 40”. This reply must have been what the System Event Log was picking up to generate that message that I’d noticed earlier.
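To make “Level 2, Description 40” concrete, here is a minimal sketch of decoding a plaintext TLS alert record in Python. The level and description numbers come from the TLS specification (only a handful of descriptions are listed); the function names are my own, and the example bytes in the note below are constructed by hand rather than taken from my capture.

```python
ALERT_LEVELS = {1: "warning", 2: "fatal"}

# A small subset of the alert descriptions defined by the TLS specification.
ALERT_DESCRIPTIONS = {
    0: "close_notify",
    10: "unexpected_message",
    20: "bad_record_mac",
    40: "handshake_failure",
    42: "bad_certificate",
    47: "illegal_parameter",
    48: "unknown_ca",
    70: "protocol_version",
    71: "insufficient_security",
    80: "internal_error",
}


def describe_alert(level: int, description: int) -> str:
    """Translate numeric alert fields into readable names."""
    level_name = ALERT_LEVELS.get(level, f"unknown level {level}")
    desc_name = ALERT_DESCRIPTIONS.get(description, f"unknown alert {description}")
    return f"{level_name}: {desc_name}"


def parse_alert_record(record: bytes) -> str:
    """Decode an unencrypted TLS alert record.

    Layout: content type (1 byte, 21 = alert), protocol version (2 bytes),
    length (2 bytes), then the alert level and description (1 byte each).
    """
    if len(record) < 7 or record[0] != 21:
        raise ValueError("not a plaintext TLS alert record")
    return describe_alert(record[5], record[6])
```

For example, parse_alert_record(bytes([21, 3, 3, 0, 2, 2, 40])) decodes a TLS 1.2 alert record carrying level 2 and description 40, which is the fatal handshake_failure seen here.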
Comparing the successful and unsuccessful Client Hello messages
At this point, I’d narrowed down the difference between the succeeding and failing environments to the differing server replies to the initial “Client Hello” step of the SSL handshake.
Still in Message Analyzer, I did another Show Details to compare the contents of the “Client Hello” on my Windows 10 PC (working) and my Windows Server 2012 R2 machine (not working).
The significant difference turned out to be the cipher_suites parameter in the body of each PC’s “Client Hello” message.
As I learned, the cipher_suites parameter contains the list of encryption settings which the PC sending the message is able to handle. The idea is that the server picks the one from that list that it prefers, sends a “Server Hello” reply that includes the selected cipher suite, and the two sides use that to securely communicate.
It turns out that while my Windows 10 PC (working) was sending a selection of 33 cipher_suites values that it was able to support, the Server 2012 R2 PC (not working) was sending only 11 cipher_suites values!
Each cipher_suites value, while it appears in the raw message body as an integer, “translates” to a descriptive string value like: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384. (Message Analyzer helpfully performs this translation when displaying the cipher_suites values under the “body” node, as is mostly visible in the screenshot above.)
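The integer-to-name translation can be sketched as follows. In the Client Hello, the cipher_suites field is a 2-byte length followed by a sequence of 2-byte suite identifiers. The lookup table below holds only a few of the registered identifiers, and the function name decode_cipher_suites is hypothetical; Message Analyzer does all of this for you.

```python
import struct

# A few registered TLS cipher suite identifiers (far from the full list).
CIPHER_SUITE_NAMES = {
    0x002F: "TLS_RSA_WITH_AES_128_CBC_SHA",
    0x0035: "TLS_RSA_WITH_AES_256_CBC_SHA",
    0x009C: "TLS_RSA_WITH_AES_128_GCM_SHA256",
    0x009D: "TLS_RSA_WITH_AES_256_GCM_SHA384",
    0xC02B: "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
    0xC02C: "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
    0xC02F: "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
    0xC030: "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
}


def decode_cipher_suites(field: bytes) -> list[str]:
    """Decode a Client Hello cipher_suites field into suite names.

    Identifiers missing from the table are shown as hex, e.g. "0x1234".
    """
    if len(field) < 2:
        raise ValueError("field too short for a length prefix")
    (length,) = struct.unpack_from("!H", field, 0)
    body = field[2:2 + length]
    if len(body) != length or length % 2:
        raise ValueError("malformed cipher_suites field")
    return [
        CIPHER_SUITE_NAMES.get(suite_id, f"0x{suite_id:04X}")
        for (suite_id,) in struct.iter_unpack("!H", body)
    ]
```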
The Microsoft article Cipher Suites in TLS/SSL provides a very helpful picture of what the parts of those cipher_suites values mean, which I’ll borrow and display here:
Taking a closer look, the 33 cipher_suites values from the Client Hello message of the Windows 10 PC (working) contained a mix of RSA, DHE, and ECDSA as the Signature value. The 11 values from the Server 2012 R2 PC (not working) all had RSA as the Signature value!
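Splitting the suite names into their parts makes this kind of comparison mechanical. The sketch below is a simplified parser of my own, not an official grammar: it assumes names of the form TLS_<key exchange>[_<signature>]_WITH_<cipher and hash>, treats plain RSA suites as using RSA for both key exchange and signature, and recognizes the curve suffix (such as _P256) that Windows appends to some names.

```python
from collections import Counter
from typing import NamedTuple, Optional


class SuiteParts(NamedTuple):
    key_exchange: str
    signature: str
    cipher_and_hash: str
    curve: Optional[str]


def parse_suite(name: str) -> SuiteParts:
    """Split a cipher suite name into its main parts (simplified)."""
    if not name.startswith("TLS_") or "_WITH_" not in name:
        raise ValueError(f"unrecognized cipher suite name: {name}")
    head, tail = name[len("TLS_"):].split("_WITH_", 1)
    head_parts = head.split("_")
    key_exchange = head_parts[0]
    # Plain "RSA" suites use RSA for both key exchange and signature.
    signature = head_parts[1] if len(head_parts) > 1 else head_parts[0]
    tail_parts = tail.split("_")
    curve = None
    if tail_parts[-1] in ("P256", "P384", "P521"):
        curve = tail_parts.pop()
    return SuiteParts(key_exchange, signature, "_".join(tail_parts), curve)


def signature_counts(names) -> Counter:
    """Count how many suites in names use each signature type."""
    return Counter(parse_suite(n).signature for n in names)
```

Running signature_counts over the list from each Client Hello would show at a glance whether a client offers any suite whose signature type is not RSA.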
A Certificate Signing Algorithm Mismatch?
Discovering that the not-working Server 2012 R2 PC was effectively saying that it would only support RSA as the cert signing method immediately suggested a new likely theory: If the server cert was signed with something other than RSA, the SSL handshaking would fail.
Sure enough, drilling further down into the cert details in Firefox showed that the cert was signed with not RSA, but ECDSA:
In essence, the failing SSL handshaking conversation was going like this:
- Client [Client Hello]: Hey, let’s talk securely, using any of these methods (…), as long as you’ve got an RSA-signed cert!
- Server [Alert]: Sorry, nope, I can’t do business along those parameters. Bye!
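That conversation can be modeled with a toy negotiation function. This is a deliberately simplified sketch of the selection step only, with hypothetical names throughout; a real TLS server also weighs protocol versions, extensions, and supported curves.

```python
class HandshakeFailure(Exception):
    """Raised when no mutually acceptable cipher suite exists."""

    def __init__(self):
        super().__init__("fatal alert 40: handshake_failure")
        self.alert_code = 40


def suite_signature(name: str) -> str:
    """Return the signature part of a suite name.

    "TLS_ECDHE_ECDSA_WITH_..." gives "ECDSA"; "TLS_RSA_WITH_..." gives "RSA".
    """
    head = name[len("TLS_"):].split("_WITH_", 1)[0]
    return head.split("_")[-1]


def negotiate(client_suites, server_suites, cert_signature: str) -> str:
    """Pick the server's most preferred suite that the client also offers
    and that works with the server certificate's signature type."""
    for suite in server_suites:  # server_suites is in server preference order
        if suite in client_suites and suite_signature(suite) == cert_signature:
            return suite
    raise HandshakeFailure()
```

With an RSA-only client list and cert_signature set to "ECDSA", negotiate raises HandshakeFailure, mirroring the Alert reply above. Adding TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 to the client list lets the same call return that suite.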
Getting the Server 2012 PC to accept an ECDSA certificate
A great blog post by Nartac Software on how their IIS Crypto tool works pointed me to the solution. A Windows registry key mentioned in that article contained the same set of cipher_suites values that I was seeing in the problem PC’s Client Hello SSL handshake message:
HKLM\SOFTWARE\Policies\Microsoft\Cryptography\Configuration\SSL\00010002
In the Server Hello SSL handshake message on my working Windows 10 PC, I could see that the cipher_suites value that the server had selected to successfully communicate with was:
TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
From that same article, another registry location has the list of cipher suites that the local machine (here, the Server 2012 R2 PC) supports:
HKLM\SYSTEM\CurrentControlSet\Control\Cryptography\Configuration\Local\SSL\00010002
Looking in that registry location on the Server 2012 R2 PC, I saw that one of the supported values was
TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256_P256
The cipher suite portion of that value matched the suite that the server had accepted in the SSL handshake from my Windows 10 PC, so I edited the comma-separated list of cipher suite values in the first 00010002 registry key above (the one under Policies) to include this additional value. Finally, I rebooted the Server 2012 R2 PC (since a reboot is required to make the change take effect).
After the reboot, the problems were solved! Internet Explorer was successfully able to connect to the target website, and my C# app was also able to successfully establish an HTTPS connection.
So how had this happened?
I posed the question to the failing client PCs’ hosting provider: Are Windows Server 2008 R2 and Windows Server 2012 R2 machines configured by default to only accept RSA SSL certs, or is this something that the hosting provider configures in their “default” images?
The answer, it turned out, was neither of the above. Instead, the missing non-RSA cipher suite values had been intentionally removed in a “server hardening” task performed some time in the past. This probably made sense originally, under an assumption that these servers would never themselves be acting as the client side of an HTTPS connection, and therefore, for the sake of reducing attack surface, could have cipher suites with signature types other than the servers’ own cert signatures disabled.