Disaster recovery

Under the covers

One of the important innovations in the Mesh is that disaster recovery is neither an afterthought nor a 'premium feature'. Mechanisms to recover from almost any type of disaster are built into its design.

Support for recovery from multiple types of disaster is the main reason that a personal Mesh profile contains significantly more keys than any existing PKI in widespread use. It is also the main reason for developing the Mesh as a completely new PKI rather than attempting to build on PKIX, OpenPGP or SAML.

Just as every problem in computer science can be solved by adding another layer of indirection, almost every disaster recovery scenario can be addressed by introducing an additional layer of keys.

Such an approach would have been (rightly) rejected as wildly impractical in 1995 when PGP and PKIX were being developed. The machines of the day were slow and could barely perform one public key operation without introducing unacceptable delays. Modern computers using modern algorithms can perform hundreds or even thousands of public key operations in the same time.

The chief risk in introducing additional layers of keys is the additional complexity they bring. To avoid this problem, the Mesh offers as few options as possible. There are certainly situations where escrow of encryption keys is undesirable, so use of key escrow is optional for Mesh-enabled applications. But every Mesh personal master profile is required to contain a master escrow key so that support for escrow is always available to every user.

Information Risks

Like any other security control, the purpose of disaster recovery is to mitigate risk. The standard information security concerns are considered:

  1. Confidentiality / Disclosure
  2. Integrity / Impersonation
  3. Availability / Data loss

While the CIA mnemonic is easy to remember, the most important risk for most users by far is the risk of losing their data. Most users care much more about losing irreplaceable pictures of their children at 5 years old than about someone revealing their bank balance.

Even in the military, integrity is in most cases a much greater concern than confidentiality. Disclosing the secret plans for an attack is bad. But having the enemy impersonate a commander and order a bombing run on the commander's own side is worse.

Preparing for recovery

Every disaster recovery plan requires some infrastructure whose survival is assured with near certainty. We call this the 'Survival Core'.

In an enterprise environment this survival core is typically one or more disaster recovery sites that are separate from the main production site both logically and physically.

The survival core of the Mesh consists of two parts:

  1. An offline escrow record stored in the CryptoMesh that contains the user's master signature and escrow keys encrypted under the master recovery key.
  2. A set of master key recovery shares from which the master recovery key can be reconstructed.

Computer hardware is liable to be lost, stolen or broken. For this reason, the recovery shares should ideally be written down on paper and stored in a secure location.

Once a set of master key recovery shares has been created, the user is strongly advised to delete the master private keys from their machines and rely on the encrypted backup and offline recovery shares. This minimizes the risk of compromise of the master private keys.
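The text above does not say how the recovery shares are constructed. As a minimal sketch, assuming a threshold scheme in the style of Shamir's secret sharing, the following shows how a key could be split so that any sufficient subset of shares recovers it. The names `split_secret`, `recover_secret` and `PRIME` are illustrative and are not taken from the Mesh itself.

```python
import secrets

# Assumption: all arithmetic is done modulo a prime larger than the secret.
# 2**521 - 1 is a Mersenne prime, comfortably larger than a 256-bit key.
PRIME = 2**521 - 1


def split_secret(secret: int, threshold: int, shares: int) -> list[tuple[int, int]]:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for c in reversed(coeffs):  # Horner's rule
            result = (result * x + c) % PRIME
        return result

    # Each share is a point (x, f(x)) with a distinct non-zero x.
    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points: list[tuple[int, int]]) -> int:
    """Recover the constant term by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Usage: a hypothetical 256-bit master recovery key split 3-of-5.
master_recovery_key = secrets.randbits(256)
recovery_shares = split_secret(master_recovery_key, threshold=3, shares=5)
```

With a 3-of-5 split, the user could lose any two of the written-down shares and still recover the key, while anyone holding fewer than three learns nothing about it.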

Recovering encrypted data

The most important information risk introduced through the use of strong encryption is the risk of data loss resulting from the loss of an encryption key.

The Mesh addresses the need to recover encrypted stored data without compromise to other security functions as follows:

  • All Mesh applications separate keys used for encrypting stored data from keys used for all other purposes. This ensures that the ability to recover the encryption keys does not compromise authentication, non-repudiation or transport encryption controls.
  • Unless the user explicitly requests otherwise, all keys used to encrypt stored data in an application are escrowed under an application escrow key, which is in turn escrowed under the master escrow key of the user's profile.

These precautions ensure that stored data encrypted under an application key can be recovered by recourse to the application escrow key or, in extremis, to the master escrow key through the recovery shares.
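The two-level escrow chain described above can be sketched as follows. This is an illustration of the recovery path only. A real deployment would presumably wrap keys with public key encryption. The `toy_wrap` function here is a deliberately insecure symmetric stand-in, and every name in the block is hypothetical.

```python
import hashlib
import secrets


def toy_wrap(wrapping_key: bytes, payload: bytes) -> bytes:
    """Toy stand-in for key wrapping: XOR with a pad derived from the key.

    NOT secure (the pad repeats for every payload); it only models
    'payload is recoverable by whoever holds wrapping_key'.
    """
    pad = hashlib.shake_256(wrapping_key).digest(len(payload))
    return bytes(a ^ b for a, b in zip(payload, pad))


# XOR with the same pad undoes the wrap.
toy_unwrap = toy_wrap

# Three layers of keys: master escrow -> application escrow -> stored data.
master_escrow_key = secrets.token_bytes(32)
app_escrow_key = secrets.token_bytes(32)
data_key = secrets.token_bytes(32)

# Escrow records that would be kept alongside the profile and the data.
escrowed_app_key = toy_wrap(master_escrow_key, app_escrow_key)
escrowed_data_key = toy_wrap(app_escrow_key, data_key)


def recover_data_key(master_key: bytes, wrapped_app_key: bytes,
                     wrapped_data_key: bytes) -> bytes:
    """In-extremis path: master escrow key -> application key -> data key."""
    app_key = toy_unwrap(master_key, wrapped_app_key)
    return toy_unwrap(app_key, wrapped_data_key)
```

The normal recovery path needs only the application escrow key. The master escrow key is needed only when the application escrow key itself has been lost.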

Disclosure of administration key

Since the administration keys must be used each time a device is added to or removed from the user's profile, these keys are at risk of disclosure. Ideally, this risk should be minimized through the use of trustworthy hardware that resists extraction of the private key.

Should an administration key be compromised, a user can regain control of their personal security environment by creating a new personal profile from the valid portions of the old one and signing it with their master signature key.

Personal catastrophe

In 2016, the UNHCR estimated that over 21.3 million people were refugees. In previous crises, very few of these refugees would have used computers, email or social media. Today, the majority of refugees used the Internet before they were displaced, and their need for communications only increases afterwards.

The Mesh disaster recovery capabilities allow a user to recover all the credentials controlling access to their personal digital assets provided that they have a sufficient number of recovery shares.

Confidentiality breach

Once confidential information has leaked, there is little that can be done to prevent it being released or used to damage the user it belongs to. Thus the options for disaster recovery are limited to preventing further disclosures.

The best tool for mitigating the extent of disclosure is to limit the scope of use of encryption keys by time, by application domain, or both. The Mesh key management tool automatically generates separate encryption keys for each application and cryptographic function and provides the ability to rotate encryption keys automatically. It is thus practical to limit use of a key to a month or less.
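As a rough sketch of what limiting a key's scope by application, cryptographic function and time might look like, the following hands out one independent random key per (application, function, calendar month). The class `RotatingKeyStore` and its method names are hypothetical and are not the Mesh key management tool's interface.

```python
import secrets
from datetime import date


class RotatingKeyStore:
    """Hands out one random key per (application, function, calendar month)."""

    def __init__(self) -> None:
        self._keys: dict[tuple[str, str, int, int], bytes] = {}

    def key_for(self, application: str, function: str, day: date) -> bytes:
        # The scope tuple is what bounds the damage if one key is disclosed.
        scope = (application, function, day.year, day.month)
        if scope not in self._keys:
            # Each scope gets a fresh, independent key rather than one
            # derived from a long-lived secret.
            self._keys[scope] = secrets.token_bytes(32)
        return self._keys[scope]


store = RotatingKeyStore()
january_mail_key = store.key_for("mail", "encrypt", date(2024, 1, 5))
february_mail_key = store.key_for("mail", "encrypt", date(2024, 2, 5))
```

Because the keys are independent, disclosure of `january_mail_key` in this sketch would expose only January's mail and nothing encrypted for another month or another application.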

Although the Mesh attempts to mitigate the consequences of a confidentiality breach to the maximum extent possible, the ability to do so is limited when using legacy encryption applications such as S/MIME and OpenPGP. The use of Proxy Re-Encryption provides a powerful technical capability for addressing these concerns more effectively. Further consideration of these concerns is thus left to development of Mesh/Recrypt.

Death or Incapacitation

It is perhaps surprising that almost no Internet security applications consider what should happen in the case of the user's death. This is, after all, the one disaster every user is certain to suffer at some point.

After death, the user is gone but their information assets survive. Most users will have some information they want to be available to their successors and some information they definitely want to keep secret. As one person put it, 'I want my family to know where I buried Aunt Agatha's jewels but not where I buried Aunt Agatha'. There are thus three cases of interest:

  • Information to be released on death or incapacitation, e.g. the subject's financial and medical records.
  • Information to be released on death, e.g. contact details for the subject's other wives, mistresses and illegitimate children.
  • Information that must never be released, e.g. client confidential material.

The chief technical challenge in meeting these requirements is providing users with an acceptable mechanism for specifying into which category a particular piece of information falls.

These concerns form part of the requirements that motivate development of Mesh/Recrypt.