Introduction to Digital Certificates
Digital certificates are the cornerstone of modern e-commerce and of secure internet communications in general. A digital certificate is a means to establish trust, one-way or mutual, between two parties before a transaction takes place between them. That trust underpins the protection of the transaction’s contents and integrity from eavesdroppers, who might otherwise read it for their own gain or tamper with it. A digital certificate proves the ownership of a “Cryptographic Key”. Let’s look at this in a bit more detail.
Need for Security in Communications
The need to secure communications between two people, say, Alice and Bob, separated by an untrusted communication medium is perhaps as old as the history of human civilization. Why communicate? Because that is what we do. Maybe Alice wants Bob to bring eggs home on the way back from work, for tomorrow’s breakfast.
What is the communication medium in between? A horse rider, a note-carrying pigeon, a smoke signal, postal mail or, in the modern era, a phone call, text message or email.
Why is the communication medium untrusted? Because Alice and Bob’s arch-nemesis Eve can’t stand to see them happy and wants to tamper with the message so that Bob brings home milk instead. They already have plenty of milk, and Alice will be very unhappy if he brings more.
The need for security in communication is evident not just in relationships as above, but also in business dealings, financial or non-financial transactions, or simply for privacy. In the era of computers, there are a few ways to secure digital communications. The security technique needs to protect the message contents so that a middleman cannot read them; this is called Encryption. It needs to protect the message’s health, so that any tampering that would make it useless is detected; this is called Integrity. Finally, the receiver needs to be able to decode the message to retrieve the actual contents; this is called Decryption. The multitude of techniques used to secure digital communication constitutes the science of Cryptography. Broadly, these techniques can be divided into two categories:
Symmetric-key Cryptography: These are cryptographic algorithms that use the same cryptographic key for both encryption and decryption of the message. The problem with this method is that the Key needs to be a shared secret between the parties before communicating over the untrusted medium, which could be a challenge. Examples of symmetric-key cryptography include DES, Blowfish, AES.
Asymmetric-key Cryptography: These are cryptographic algorithms that use different cryptographic keys for encryption and decryption of the message. The advantage of this method is that the sender and receiver have different keys for the communication and no shared-secret mechanism is needed prior to communication. This kind of cryptography is also known as “Public-key Cryptography”. Examples of asymmetric-key cryptography include RSA, DSA.
In public-key cryptography, the sender and receiver have different keys for the communication. Together the two keys are known as a cryptographic key pair. A message is encrypted with the receiver’s public-key before transmission. The encrypted message is gibberish to the middleman and extremely difficult to decode. The only practical way to decrypt the message is with the associated private-key in the cryptographic pair.
In a physical-world analogy, it would be almost magic if you could lock a lock with one physical key but had to use a second key, bearing no resemblance to the first, to unlock it. What asymmetric-key cryptography does through a public-key / private-key pair is nothing short of magic. It is a simple enough mathematical process, and how it works is fully transparent to an observer. But undoing what it does is a diabolical task for a middleman who does not know the private-key. With a key of adequate size, a brute-force search is expected to take far longer than a human lifetime.
A key is essentially just a very long number. One key is aptly called the Public-key, while the other is known as the Private-key. Together, the public-key and private-key are used to secure communications and belong to a single owner, either a sender or a receiver. The owner is allowed to freely distribute the public-key while keeping the private-key very private. For communication in the reverse direction, from the original receiver to the sender, another key-pair is generated by the original sender. Again, only the public-key is made public. For bi-directional communications, there are two public-keys known to the world and two private-keys known only to their respective owners. The figure below shows one-directional communications – from Alice to Bob.
Now an interesting question. How are others (Alice) going to make sure the public-key belongs to the actual owner they intend to communicate with (Bob) and was not generated by a malicious adversary (Eve) in the middle impersonating the actual owner? Well, that is what the “Certificate” bit does in “Public-key Certificate”. A public-key certificate is also known as a “Digital Certificate”.
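To make the two-key idea concrete, here is a minimal sketch of the arithmetic behind RSA, using two tiny textbook primes. The primes, the exponent and the sample message are illustrative assumptions, not values from any real system; real keys use primes hundreds of digits long together with a padding scheme, and this toy version offers no security at all.

```python
# Toy RSA key pair built from two tiny primes. Illustrative only:
# real keys use enormous primes plus padding.
p, q = 61, 53
n = p * q                   # modulus, part of both keys (3233)
phi = (p - 1) * (q - 1)     # 3120
e = 17                      # public exponent, coprime with phi
d = pow(e, -1, phi)         # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)         # the owner publishes this
private_key = (d, n)        # the owner keeps this secret


def apply_key(value, key):
    """Raise value to the key's exponent modulo n.

    With the public key this encrypts; with the private key it decrypts.
    """
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65                                   # stand-in for Alice's note
ciphertext = apply_key(message, public_key)    # what Eve sees on the wire
recovered = apply_key(ciphertext, private_key)
print(ciphertext, recovered)                   # 2790 65
```

Anyone holding the public pair (e, n) can produce the ciphertext, but recovering the message requires d, and finding d from the public values means factoring n. That is trivial for 3233 and believed to be infeasible for properly sized keys.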
Certificate Authority – CA
A digital certificate and a public-key certificate are synonyms. A digital certificate contains the owner’s public-key and identity information packaged together as a single unit. It is called a certificate because it proves the ownership of the public-key contained in it. But who provides this proof? It turns out you have to trust someone as a starting point. There has to be a trust point from which others can derive their trust. This starting point is called a Certification Authority, also written Certificate Authority, or CA for short. The job of the CA is to certify that a public-key belongs to its rightful owner, through a process called signing the certificate. The signed certificate contains the public-key and the owner’s identity information. It has an issue date and an expiry date, outside which it is deemed unsuitable for communication. In return for its service, a commercial certificate authority typically charges a fee for the certificates it issues. Popular certificate authority services include DigiCert, Comodo, GoDaddy etc. There is also a popular alternative, Let’s Encrypt, that offers a free, automated and open CA service.
The CA’s own public-key / private-key pair and certificate are no different in structure from the owner’s. Likewise, the CA’s public-key is made public and its private-key is protected through extreme physical and digital security measures. Every one of the 1’s and 0’s in the CA’s private-key is, figuratively, worth more than gold. The CA is the root of the trust hierarchy. If its private-key is compromised, everything it has signed, potentially thousands or millions of certificates, can no longer be trusted.
What does the owner own that the certificate authority certifies? It can be a domain name like “teranetics.net”. The certificate vouches that the owner of the public-key / private-key pair really controls the domain “teranetics.net”. Once this trust is established, when others see the digital certificate for this website, they know they are talking to the legitimate “teranetics.net” and not to some middleman.
How does the signing process work? It starts with the owner generating a public-key / private-key pair and a Certificate Signing Request (CSR) through a software application such as OpenSSL. The CSR contains the public-key and the owner’s relevant information, such as the website domain name “teranetics.net”, organization, city and country. The CSR is sent to the certificate authority for signing. The CA has a cryptographic public-key / private-key pair of its own. Once it is satisfied with the request, all the certificate authority does is sign the contents of the owner’s CSR with its private-key. The resulting output is a digital certificate for the owner.
On the worldwide internet, when others want to validate the owner’s digital certificate, to check that the owner really is who the certificate claims, they simply use the certificate authority’s publicly known public-key. The certificate authority originally signed the owner’s public-key plus identity info using its private-key, so this time the certificate authority’s public-key is used to check that signature. Only the holder of the CA’s private-key could have produced a signature that passes this check.
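The sign-then-verify flow described above can be sketched with the same kind of toy arithmetic. Everything here is an assumption made for illustration: the CA key is built from two small, publicly known Mersenne primes, the CSR is a made-up byte string rather than a real encoded request, and the hash is simply reduced into the toy modulus instead of being padded the way real signature schemes require.

```python
import hashlib

# Toy CA key pair from two known Mersenne primes. Far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
CA_N = P * Q
CA_E = 65537                               # CA public exponent (published)
CA_D = pow(CA_E, -1, (P - 1) * (Q - 1))    # CA private exponent (secret)


def digest(data: bytes) -> int:
    """Hash the data and reduce it into the toy modulus range."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % CA_N


def ca_sign(csr: bytes) -> int:
    """CA side: sign the CSR contents with the CA private key."""
    return pow(digest(csr), CA_D, CA_N)


def verify(cert_body: bytes, signature: int) -> bool:
    """Anyone: check the signature using only the CA public key."""
    return pow(signature, CA_E, CA_N) == digest(cert_body)


# Hypothetical CSR contents: owner identity plus the owner's public key.
csr = b"CN=teranetics.net;O=Example Org;public_key=(17,3233)"
signature = ca_sign(csr)

print(verify(csr, signature))              # True

# Eve swaps in her own identity but cannot produce a matching signature.
forged = csr.replace(b"teranetics.net", b"evil.example")
print(verify(forged, signature))           # False
```

Note that verification never touches CA_D. This is why the CA can publish its public-key widely while keeping the private-key offline.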
Public CA and Private CA
The certificate authority (CA) doesn’t need to be globally known or serve global clients. If it is, it is known as a Public CA. If it is not, it is known as a Private CA. Examples of public CAs include DigiCert, Comodo, GoDaddy and Let’s Encrypt.
A private certificate authority is maintained privately by an individual or, more commonly, by an organization. It serves the same purpose as a public CA, issuing digital certificates, but to internal requesting entities only. This establishes trust among internal entities communicating with each other within the organizational boundaries. Private CAs can be free to create. Using OpenSSL, you can create as many private CAs as you like. A popular option for an enterprise is to deploy a private CA service on a Microsoft Windows Server platform.
Root CA and Subordinate CA
The Root CA is the point of origin for trust relationships. It trusts no external entity (more precisely, it trusts only itself), but everyone is supposed to trust it. A Subordinate CA, on the other hand, trusts a Root CA or another Subordinate CA. This forms a hierarchy of CAs in the shape of an inverted tree, where the Root CA is at the top and each downstream branch trusts its upstream parent. The last subordinate CA, the one that actually issues digital certificates to requesting entities, is also known as the Issuing CA. Root CAs and Subordinate CAs can be either private or public.
The hierarchical CA design is for security purposes. The root CA, being the single most critical service, is separated from requesting clients by one or more levels. A common practice is a two-tier CA design with one root CA, which is mostly kept off the network and/or powered down, and two subordinate CAs as shown in the figure above. The subordinate CAs in a two-tier design are also the Issuing CAs, as they process the incoming signing requests. In a three-tier CA design, the middle tier is purely a subordinate CA tier while the third tier is the issuing CA tier.
A client-server communication is simply two parties talking to each other, where one party requests something from the other and the other fulfills the request. Examples include a patient going to a doctor, a customer going to a consultant, or a user (using a web browser) visiting a website (hosted on a web server). A typical secure client-server communication has the following properties:
Identity – The verification of the identity of the communicating parties to each other, that is, client to server and server to client. This ensures we are talking to a legitimate peer and not some middleman impersonating the peer. In practice, this is mostly done through asymmetric-key (public-key) cryptography, and in a typical client-server communication only the server authenticates to the client. That is, in a typical consumer-producer relationship, the consumer (client) generally has the right to know the producer (server) before requesting a service, whether medical treatment, consultancy or a webpage. The producer, however, doesn’t need to know all about its consumers.
Confidentiality – The encryption/decryption algorithm used to secure communication. Both symmetric-key and asymmetric-key cryptography can be used. In practice, this is mostly done through symmetric-key cryptography. What about the problem of exposing the shared secret to a middleman? Well, the shared-secret exchange happens inside the “Identity” verification step above, so it is protected against eavesdropping by the same public-key cryptography mechanism that ensures the identity of the peers in the first place. Asymmetric-key encryption of every message exchanged would be highly processor intensive. By contrast, symmetric-key (shared-secret) encryption is not that taxing on the CPU and can be applied on a per-message basis. During the course of a long communication session, the symmetric-key can be refreshed after a certain time or data threshold.
Integrity – The integrity of the messages exchanged. Although the messages are encrypted, a middleman can still alter them to make them unusable, or replay them several times. A replay attack is particularly interesting because the middleman (Eve) doesn’t need to know what the message (“J1bb3r1$h”) means. Without integrity checks, if it is replayed tomorrow it will still be received and decrypted by the receiver (Bob) as a valid message (“Remember Eggs”). Bob will again bring home eggs, two days in a row, although this time Alice may have asked for bread instead. A typical way to ensure integrity is to add a timestamp to the message and calculate a checksum over the final message contents using a one-way cryptographic hash function such as MD5, SHA1 or SHA2 (of these, only the SHA2 family is still considered secure). The resulting checksum + final message is encrypted as usual. If the encrypted message is replayed (by Eve), it will carry an old timestamp and be discarded by the receiver (Bob). If the same message is sent again by the sender (Alice needs more eggs), it will have a different timestamp and hence a different checksum in the final message. To an eavesdropper (Eve), the encrypted message will look totally different from the previous one, although the real message contents are exactly the same (“Remember Eggs”).
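The timestamp-plus-checksum idea can be sketched as follows. Two liberties are taken and should be read as assumptions: a keyed hash (HMAC-SHA256) is swapped in for the plain checksum so that the tag cannot be recomputed by someone without the shared key, and the encryption step is left out to keep the focus on integrity. The key, the timestamps and the 60-second freshness window are made-up example values.

```python
import hashlib
import hmac

MAX_AGE_SECONDS = 60      # assumed freshness window
TAG_LEN = 32              # size of an HMAC-SHA256 tag in bytes


def protect(key: bytes, message: bytes, timestamp: int) -> bytes:
    """Prefix an 8-byte timestamp and append a tag computed over both."""
    body = timestamp.to_bytes(8, "big") + message
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return body + tag


def accept(key: bytes, packet: bytes, now: int):
    """Return the message if the tag matches and the timestamp is fresh, else None."""
    if len(packet) < 8 + TAG_LEN:
        return None
    body, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None                       # altered in transit
    timestamp = int.from_bytes(body[:8], "big")
    if abs(now - timestamp) > MAX_AGE_SECONDS:
        return None                       # stale, most likely a replay
    return body[8:]


key = b"shared-session-key"               # hypothetical shared secret
monday = 1_700_000_000                    # arbitrary example timestamps
tuesday = monday + 86_400

packet = protect(key, b"Remember Eggs", monday)
print(accept(key, packet, now=monday + 5))       # b'Remember Eggs'
print(accept(key, packet, now=tuesday))          # None (replayed a day later)

# Eve rewrites the 13-byte message but cannot recompute the tag.
tampered = packet[:8] + b"Remember Milk" + packet[21:]
print(accept(key, tampered, now=monday + 5))     # None (tag mismatch)
```

Sending the same text on Tuesday with a fresh timestamp yields a different tag, so the two packets differ even though the message is identical.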
A digital certificate can come in various encoding formats. Tools such as OpenSSL can convert a certificate from one format to another.
PEM – It is an RFC standard format that stores the certificate in ASCII text. It can store public-key, private-key or root-certificates. If you can open a certificate in Notepad and see letters and numbers between two lines containing “BEGIN CERTIFICATE…” and “END CERTIFICATE”, it is in PEM format.
DER – The certificate is stored in binary format. Opening in Notepad would display all sorts of displayable / non-displayable characters. It can store public-key or private-key.
PKCS7 or P7B – An RFC standard popular on Windows and Java, where the encoded certificate is stored in ASCII text in a single container. The contents are placed inside lines containing “BEGIN PKCS…”, “END PKCS…”. This file format can contain public-key and certificate chains but not private-key.
PKCS12 or PFX or P12 – Another RFC standard that acts as a password-protected container for a public-key / private-key pair and certificate chains. It is quite popular on Windows. The contents are encrypted using a password and stored in binary format. This format can be used to transfer a private-key, certificate and certificate chain as a single encrypted package.
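The relationship between the first two formats is simple enough to show in a few lines: PEM is the DER bytes encoded as Base64 and wrapped between the BEGIN and END lines. The sketch below uses the two conversion helpers from Python’s standard ssl module. The input bytes are a placeholder assumption standing in for a real DER certificate, which would normally be read from a file.

```python
import ssl

# Placeholder bytes standing in for a DER-encoded certificate.
# A real certificate would be read from a .der or .cer file instead.
der_bytes = bytes(range(48))

pem_text = ssl.DER_cert_to_PEM_cert(der_bytes)    # binary -> ASCII text
round_trip = ssl.PEM_cert_to_DER_cert(pem_text)   # ASCII text -> binary

print(pem_text.splitlines()[0])     # -----BEGIN CERTIFICATE-----
print(round_trip == der_bytes)      # True
```

These helpers only change the encoding. They do not parse or validate the certificate, which is why placeholder bytes survive the round trip.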
Hands-on Certificates using OpenSSL
Now that we have a basic understanding of digital certificates, in the next post Digital Certificates using OpenSSL, we are going to perform some basic certificate operations using OpenSSL.