Redshift four-tier key-based encryption architecture explained simply
The simplest explanation of the Redshift four-tier key-based encryption architecture?
Redshift uses a four-tier key-based architecture for encryption. Why four? Is this really enough? If four tiers are good, surely five would be better :-? It's tricky to find other references to four-tier encryption. Is this only an AWS thing? There is a paper on a four-tier architecture.
Oh, and I can't believe that one of the certification exam questions is how many tiers the Redshift encryption architecture uses (1, 2, 3, 4)! Talk about pub trivia quiz night. Maybe they should ask: Why does it matter how many tiers Redshift uses for encryption?
I think this is closely related to multiple encryption, although strictly only the keys are wrapped repeatedly (envelope encryption) while the data itself is encrypted once. It may be better to explain it in this context.
And another question: if a four-tier key-based architecture for encryption is good enough for Redshift, why isn't it used for other Amazon databases? Maybe the answer is that this is just the logical encryption architecture that follows from the Redshift cluster MPP (Massively Parallel Processing) architecture (see above picture from the docs)? Yes: that is the answer given in the following paper.
This is a good paper motivating the Redshift architecture. It doesn't mention the four-tier security aspect by name, but it does explain it well.
Anurag Gupta, Deepak Agarwal, Derek Tan, Jakub Kulesza, Rahul Pathak, Stefano Stefani, and Vidhya Srinivasan. 2015. Amazon Redshift and the Case for Simpler Data Warehouses. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD '15). ACM, New York, NY, USA, 1917-1923. DOI: http://dx.doi.org/10.1145/2723372.2742795
Their explanation (my tier numbers):
Encryption is similarly straightforward. Enabling encryption requires setting a checkbox in our console and, optionally, specifying a key provider such as a hardware security module (HSM).
(Tier-4) Under the covers, we generate block-specific encryption keys (to avoid injection attacks from one block to another),
(Tier-3) wrap these with cluster-specific keys (to avoid injection attacks from one cluster to another),
(Tier-2) and further wrap these with a master key (TODO Why? There's only one master so it's not to prevent master to master attacks, or maybe it is to prevent attacks on other masters?),
(Tier-1) stored by us off-network or via the customer-specified HSM.
All user data, including backups, is encrypted. Key rotation is straightforward as it only involves re-encrypting block keys or cluster keys, not the entire database. Repudiation is equally straightforward, as it only involves losing access to the customer’s key or re-encrypting all remaining valid cluster keys with a new master. We also benefit from security features in the core AWS platform. For example, we use Amazon VPC to provide network isolation of the compute nodes providing cluster storage, isolating them from general-purpose access from the leader node, which is accessible from the customer’s VPC.
So, the 4-tier architecture is really just: (Tier-1) the master key is stored off-network, (Tier-2) the master key wraps the cluster keys, (Tier-3) the cluster keys wrap the block keys, and (Tier-4) the block keys encrypt the data. The design is to prevent "attacks" within tiers (e.g. block to block, cluster to cluster).
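To make the wrapping order concrete, here is a minimal sketch of the hierarchy in Python. Everything in it is my own illustration, not Redshift's implementation: `wrap` and `unwrap` are toy stand-ins (a SHA-256 keystream plus an HMAC tag) for whatever real cipher and key-wrap scheme Redshift uses, and the key sizes and block contents are made up. It only aims to show the two properties claimed in the paper: a key from one block cannot open another block, and rotating the master key touches the wrapped cluster key but not the data.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from key and nonce (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def wrap(key: bytes, plaintext: bytes) -> bytes:
    """Toy stand-in for real encryption: returns nonce || ciphertext || tag."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    ct = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag


def unwrap(key: bytes, blob: bytes) -> bytes:
    """Reverse wrap(); raises ValueError on a wrong key or an altered blob."""
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered blob")
    stream = _keystream(key, nonce, len(ct))
    return bytes(c ^ s for c, s in zip(ct, stream))


# Tier-1: the master key, held off-network or in an HSM (here just a variable).
master_key = os.urandom(32)

# Tier-2: the master key wraps the cluster key.
cluster_key = os.urandom(32)
wrapped_cluster_key = wrap(master_key, cluster_key)

# Tier-3 and Tier-4: each block gets its own key, wrapped by the cluster key,
# and that block key encrypts the block's data.
blocks = [b"block 0: rows 1-1000", b"block 1: rows 1001-2000"]
stored = []
for data in blocks:
    block_key = os.urandom(32)
    stored.append((wrap(cluster_key, block_key), wrap(block_key, data)))


def read_block(master: bytes, wrapped_ck: bytes, index: int) -> bytes:
    """Walk the hierarchy top-down: master -> cluster key -> block key -> data."""
    ck = unwrap(master, wrapped_ck)
    wrapped_bk, ciphertext = stored[index]
    return unwrap(unwrap(ck, wrapped_bk), ciphertext)


print(read_block(master_key, wrapped_cluster_key, 1))

# Within-tier isolation: block 0's key does not open block 1's ciphertext.
block0_key = unwrap(cluster_key, stored[0][0])
try:
    unwrap(block0_key, stored[1][1])
except ValueError as err:
    print("cross-block unwrap rejected:", err)

# Master key rotation: only the wrapped cluster key is rewritten.
stored_before_rotation = list(stored)
new_master_key = os.urandom(32)
rewrapped_cluster_key = wrap(new_master_key, unwrap(master_key, wrapped_cluster_key))
print(read_block(new_master_key, rewrapped_cluster_key, 0))
```

The point of the design shows up in the last three lines: after rotation `stored` is untouched, so the cost of changing the master key is one small re-wrap rather than re-encrypting every block.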
How about between tiers (e.g. master to cluster, cluster to blocks)? Is this perhaps why the master key is used?
Now, in theory, other AWS databases should have n-tier encryption too, based on their architectures?
I also wish that Amazon had a list somewhere of the papers published at academic conferences by AWS staff, as most of the time the original explanations/motivations make more sense than the current documentation.
I understand the above explanation better than the current documentation, which explains it as follows.
Finally the four-tier encryption details:
About Database Encryption for Amazon Redshift Using AWS KMS
When you choose AWS KMS for key management with Amazon Redshift, there is a four-tier hierarchy of encryption keys.
These keys, in hierarchical order, are the master key, a cluster encryption key (CEK), a database encryption key (DEK), and data encryption keys.
When you launch your cluster, Amazon Redshift returns a list of the customer master keys (CMKs) that your AWS account has created or has permission to use in AWS KMS. You select a CMK to use as your master key in the encryption hierarchy.
By default, Amazon Redshift selects your default key as the master key. Your default key is an AWS-managed key that is created for your AWS account to use in Amazon Redshift. AWS KMS creates this key the first time you launch an encrypted cluster in a region and choose the default key.
If you don’t want to use the default key, you must have (or create) a customer-managed CMK separately in AWS KMS before you launch your cluster in Amazon Redshift. Customer-managed CMKs give you more flexibility, including the ability to create, rotate, disable, define access control for, and audit the encryption keys used to help protect your data. For more information about creating CMKs, go to Creating Keys in the AWS Key Management Service Developer Guide.
If you want to use an AWS KMS key from another AWS account, you must have permission to use the key and specify its ARN in Amazon Redshift. For more information about access to keys in AWS KMS, go to Controlling Access to Your Keys in the AWS Key Management Service Developer Guide.
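As a rough illustration of the choice described above (default key versus a specific CMK given by ARN), here is a small helper that builds the arguments for launching an encrypted cluster. To my knowledge the parameter names match those accepted by boto3's Redshift `create_cluster` call, but treat that as an assumption to check against the boto3 documentation. The helper name, the cluster identifier, the node type and the ARN are all hypothetical placeholders of mine.

```python
from typing import Any, Dict, Optional

# Hypothetical ARN, for illustration only; substitute a key you may use.
EXAMPLE_KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"


def encrypted_cluster_params(
    cluster_id: str,
    node_type: str,
    username: str,
    password: str,
    kms_key_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for launching an encrypted cluster.

    If kms_key_arn is None, no key is named and Redshift falls back to the
    account's default key, as the documentation above describes.
    """
    params: Dict[str, Any] = {
        "ClusterIdentifier": cluster_id,
        "ClusterType": "single-node",
        "NodeType": node_type,
        "MasterUsername": username,
        "MasterUserPassword": password,
        "Encrypted": True,
    }
    if kms_key_arn is not None:
        # Customer-managed or cross-account key, named by its ARN.
        params["KmsKeyId"] = kms_key_arn
    return params


params = encrypted_cluster_params(
    "demo-cluster", "dc2.large", "admin", "REPLACE-with-a-real-password", EXAMPLE_KEY_ARN
)
print(sorted(params))

# With AWS credentials and boto3 installed, the launch itself would look like:
#   import boto3
#   boto3.client("redshift").create_cluster(**params)
```

Nothing here talks to AWS; the actual call is left in a comment because it needs credentials and creates a billable cluster.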
After you choose a master key, Amazon Redshift requests that AWS KMS generate a data key and encrypt it using the selected master key. This data key is used as the CEK in Amazon Redshift. AWS KMS exports the encrypted CEK to Amazon Redshift, where it is stored internally on disk in a separate network from the cluster along with the grant to the CMK and the encryption context for the CEK. Only the encrypted CEK is exported to Amazon Redshift; the CMK remains in AWS KMS. Amazon Redshift also passes the encrypted CEK over a secure channel to the cluster and loads it into memory.
Then, Amazon Redshift calls AWS KMS to decrypt the CEK and loads the decrypted CEK into memory. For more information about grants, encryption context, and other AWS KMS-related concepts, go to Concepts in the AWS Key Management Service Developer Guide.
Next, Amazon Redshift randomly generates a key to use as the DEK and loads it into memory in the cluster.
The decrypted CEK is used to encrypt the DEK, which is then passed over a secure channel from the cluster to be stored internally by Amazon Redshift on disk in a separate network from the cluster. Like the CEK, both the encrypted and decrypted versions of the DEK are loaded into memory in the cluster. The decrypted version of the DEK is then used to encrypt the individual encryption keys that are randomly generated for each data block in the database. (the data encryption keys?)
When the cluster reboots, Amazon Redshift starts with the internally stored, encrypted versions of the CEK and DEK, reloads them into memory, and then calls AWS KMS to decrypt the CEK with the CMK again so it can be loaded into memory. The decrypted CEK is then used to decrypt the DEK again, and the decrypted DEK is loaded into memory and used to encrypt and decrypt the data block keys as needed.
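The launch and reboot sequence above is easier to follow as code, so here is a hedged sketch of it. `FakeKMS` is a hypothetical in-memory stand-in that I wrote for this post; it only mimics the shape of the KMS "generate data key" and "decrypt" operations and is not the AWS API. `wrap` and `unwrap` are a toy XOR construction for 32-byte keys with no integrity check, used purely so the example runs with the standard library.

```python
import hashlib
import os
from typing import Dict, Tuple


def wrap(key: bytes, secret: bytes) -> bytes:
    """Toy wrap for 32-byte secrets: nonce || (secret XOR SHA-256(key || nonce))."""
    nonce = os.urandom(16)
    pad = hashlib.sha256(key + nonce).digest()
    return nonce + bytes(s ^ p for s, p in zip(secret, pad))


def unwrap(key: bytes, blob: bytes) -> bytes:
    """Reverse wrap(). No integrity check: a wrong key just yields garbage."""
    nonce, body = blob[:16], blob[16:]
    pad = hashlib.sha256(key + nonce).digest()
    return bytes(b ^ p for b, p in zip(body, pad))


class FakeKMS:
    """Hypothetical stand-in for AWS KMS; the master key (CMK) never leaves it."""

    def __init__(self) -> None:
        self._cmk = os.urandom(32)

    def generate_data_key(self) -> Tuple[bytes, bytes]:
        """Return (plaintext key, the same key encrypted under the CMK)."""
        plaintext = os.urandom(32)
        return plaintext, wrap(self._cmk, plaintext)

    def decrypt(self, blob: bytes) -> bytes:
        return unwrap(self._cmk, blob)


kms = FakeKMS()

# Launch: KMS generates the CEK and exports only the encrypted copy.
_, encrypted_cek = kms.generate_data_key()
# The cluster asks KMS to decrypt the CEK and keeps the result in memory.
cek = kms.decrypt(encrypted_cek)
# The cluster generates the DEK itself and encrypts it with the CEK.
dek = os.urandom(32)
encrypted_dek = wrap(cek, dek)
# The DEK encrypts each randomly generated per-block key.
block_key = os.urandom(32)
encrypted_block_key = wrap(dek, block_key)

# Only encrypted key material is stored outside the cluster's memory.
persisted: Dict[str, bytes] = {"cek": encrypted_cek, "dek": encrypted_dek}


def reboot(kms_client: FakeKMS, stored_keys: Dict[str, bytes]) -> bytes:
    """Rebuild the in-memory DEK from the stored, encrypted CEK and DEK."""
    recovered_cek = kms_client.decrypt(stored_keys["cek"])
    return unwrap(recovered_cek, stored_keys["dek"])


recovered_dek = reboot(kms, persisted)
print(recovered_dek == dek)
print(unwrap(recovered_dek, encrypted_block_key) == block_key)
```

Note what `persisted` contains: nothing in it is usable without a call to KMS, which is why losing or disabling the master key is enough to make the whole cluster unreadable.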
Simple, isn't it!
That documentation page has the only diagram of the Redshift 4-tier key architecture that I can find anywhere.
PS Not directly related...
The last piece of the blog is the one that I have read to answer my question: How does RedShift encrypts data. Thank you.