יום שישי, 27 בדצמבר 2013

Information splitting in Cloud Storage Services

Introduction

The use of cloud computing services is expanding rapidly in recent years as it enables scalability, quick adaptation to dynamic changes in business requirements and total cost of ownership reduction. However, these services create challenges regarding information confidentiality and availability, where the cloud service provider is solely responsible for managing the computing infrastructure and information security.

There are many ways to maintain information confidentiality, by using Steganography (messages hiding), privileges mechanisms or sophisticated encryption techniques. Nevertheless, in all those solutions the information is out there, stored in a cloud service provider repository, in one form or another, available for a determined attacker with the appropriate resources to find and expose it.

Beyond confidentiality, a major concern for organizations that stores their data in cloud services is degraded information availability due to hardware, software or communication failures. Although cloud services use multiple geographically dispersed data centers for enhanced service reliability, all those data centers belong to the same administrative domain and are susceptible to the same risks. One can of course encrypt and duplicate the information by using different cloud storage services (a typical data recovery procedure), but that method will increase the total cost of ownership, the number of attack vectors and also the overall information risk.

Encryption of information requires protection of the encryption keys and here is a dilemma: If these keys are stored in cloud storage service unauthorized access to the data is enabled by personnel within the service provider, which eliminates the need for encryption in the first place. If encryption keys remain in the user desktops, information availability will be lost in case of a local failure.

Information splitting - Secret Sharing

So where to store the encryption key or the data itself? One magical solution to these challenges is splitting and distributing the encrypted data to multiple private or public cloud storage services. This solution can reduce the dependency on the availability or reliability properties of a single cloud storage service or on applied data security controls.

One can randomly split the secret into several shares and store each share in a different cloud storage service, so that no single cloud service can recover the secret without the assistance of others. In order to recover the secret, all or a part of the different cloud storage services must collaborate and integrate their various shares into a restore function. That is, even if an attacker could get some random information stored in one cloud storage, he could not disclose the secret.

One simple and scalable method to implement data splitting is to perform a XOR operation on the secret number which you want to keep and on a random number, store the result in one place and the random number in other place (and do not forget to delete the original secret). To restore the number, a XOR operation must be done on the two stored numbers.

In the previous example if one share is lost or cannot be exposed (as in the case of a degraded availability of one cloud service) the information cannot be recovered at all. A more sophisticated method, which enhances data availability, enables recovery of data using combination of only a small group of distributed shares. For example, one splits the data into four shares, where only two of them should be integrated to restore the secret.

Professor Adi Shamir (one of the inventors of the RSA algorithm) proposed in 1979 a simple method to split secrets (secret sharing), which basically was based on graphs. Two points are sufficient to draw a straight line, three points are sufficient to draw a parabola, and so on. One can draw a line in any order (a polynomial) using random characteristics (except the secret one desires to protect), and distribute points (x and y values) on the line to every participant that one desire to share the secret with. Because each participant receives values of only one point, it has no knowledge of the secret itself. The polynomial order defines the number of shares that one will be required to combine together in order to recover the original secret. For example, one can draw a straight line with a slope defined randomly where the secret is the encounter of the line with the y-axis, then distribute four different point values on the line to four different participants, but only two of them, each possible pair of the four, will be required to combine their point values to recover the secret.

In general, this algorithm is called “threshold scheme (k, n)”. The secret is splited into n random shares with the size of the secret itself. Combination of k shares (a number less than n) or higher can allow the recovery of the secret. Combination of less than k shares does not allow disclosure of any information about the secret. Thus, the loss of any share does not degrade the secret availability.
                                             Figure 1 - secret splitting into shares using a parabola

This method is defined “unconditionally secure” because it is secured even if the attacker uses unbounded resources after it exposes one share. Disclosure of the sensitive information will require hacking into each one of the cloud storages in the group which keep the random secret shares.

Moreover, even if one cloud storage service is compromised and one secret share is disclosed, the method allows the creation of a new series of random secret shares which is distributed to the cloud storages that were not compromised, whiteout changing the secret itself, in a way that obviates the secret share that was exposed, which will be useless to the attacker.

Efficient information splitting

The secret sharing improves the confidentiality and availability of the secret data. The only disadvantage of using secret sharing algorithm is its low efficiency. Transferring and storing secret shares requires bandwidth and storage space which are equal to the product of the secret size and the number of shares. If one stores a large file, for example, 1GB, using 5 shares, than a storage size of 5GB will be required. Therefore, this method, which enhances confidentiality, is mostly suitable for small secrets (eg encryption keys), and less suitable for large secrets (such as files or databases).

There is also a method called "secret sharing made ​​short" (SSMS) which uses a three phases process: encryption of information, use of information dispersal algorithm (IDA - developed by Professor Michael Oser Rabin in 1989) which is designed to split the data using erasure coding in a very efficient manner and splitting also the encryption key itself using secret sharing algorithm. In this solution each cloud storage service locally stores a share of the encrypted data and a share of the encryption key. In SSMS the information encryption and secret sharing algorithm methods enable information confidentiality and the IDA improves both information availability and confidentiality in a very efficient way. The downside of the SSMS is that it is not immune against unbounded resource in the hands of an attacker, such as the secret sharing method. Nevertheless, in order to recover the secret one would have to penetrate a number of cloud storage services and recover both the encrypted information and the encryption key which is also splitted.

Another method that enables sharing of a secret in an efficient way while preserving confidentiality is the All-or-Nothing-Transform with Reed-Solomon (AONT-RS), which integrates the AONT that was proposed by Professor Ronald Rivest (another inventor of the RSA algorithm) and erasure coding. This method first encrypts and transforms the information and the encryption key into blocks in a way that the information cannot be recovered without using all the blocks, and second it uses the IDA to split the blocks into shares to be stored in multiple cloud storage services.
Figure 2 - information dispersal to multiple cloud storage services

Summary

Secret splitting mechanisms (split key, split knowledge) were mainly incorporated to protect sensitive information within organizations or securing the access to encryption keys. An example of this would be the use of dual control mechanisms or performing sensitive operations within a HSM. In recent years a number of registered patents and technological solutions use SSMS or ANOT-RS algorithms, which are based on secret splitting concepts, as a basis for efficient Information Assurance in cloud storage services. Those solutions are incorporated in desktops software or in cloud storage gateways and enable secure storage of data in multiple private and public cloud storage services. Secured information dispersal algorithms can effectively and efficiently improve the overall security of many services, such as big data storage, cloud data centers, data archiving, data backup and file synchronization. 

This arlticle was also published here.