5.22.1. Encrypt (High-Speed clip0090 action)

 

Icon: ANATEL~4_img393

 

Property window:
 

ANATEL~4_img394 ANATEL~4_img395

 

Short description:

Encrypt/Decrypt some fields.

 

Long Description:

Encryption and decryption are based on a symmetric key (i.e. one single key is used for both encryption and decryption). The key is saved inside a “Key File”. You need this “Key File” to encrypt or decrypt your data.

 

To select a “Key File” (using the “Browse” button) or to create a new “Key File” (using the “Create New Key File” button), you need to switch to “Expert-user-mode”. To switch to expert-user-mode: Click the ANATEL~4_img396 button in the main toolbar of the application! Once you are in “expert-user-mode”, the “Browse” button and the “Create New Key File” button are enabled.

 

When you click the “Create New Key File” button, Anatella opens this window:
 

ANATEL~4_img397

 

 

By moving your mouse randomly inside this window, you generate random numbers that are used to create a 100% random key. This encryption key is saved inside the “Key File”.

 
 

ANATEL~4_img129

NOTE:

Never lose your “Key File”. If you lose your key file, you’ll never be able to decrypt your data later.

 
 
ANATEL~4_img129

NOTE:

Never send your “Key File” to third parties.

 

 

ANATEL~4_img129

NOTE:

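The encryption algorithm that is used is DES (for the short keys) and 3DES (for the long keys). Both are well-studied encryption algorithms. Note that single DES uses a 56-bit key, which is short enough to be found by brute-force search on modern hardware, so long keys (3DES) are preferable for sensitive data.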

 

 
The encryption algorithm used inside Anatella is symmetric and reversible: every encrypted value can be decrypted back to exactly one original value. This guarantees that there will never be any “collisions”. For example, assume that you are encrypting many MSISDN (i.e. many phone numbers): because there are no collisions, the number of distinct MSISDN before and after encryption is the same. Two different un-encrypted MSISDN are never “mapped” to the same encrypted MSISDN.
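To illustrate why a reversible encryption cannot produce collisions, here is a minimal Python sketch. It is not Anatella's implementation: it uses a toy Feistel network (the general construction that DES is built on) over 64-bit blocks, and the names `toy_encrypt`, `toy_decrypt` and the key `demo-key` are hypothetical. Because every encrypted value decrypts back to its original, two different inputs can never share the same output.

```python
import hashlib

MASK32 = 0xFFFFFFFF

def _round(half: int, key: bytes, i: int) -> int:
    """Key-dependent 32-bit round function (it does not need to be invertible)."""
    data = key + bytes([i]) + half.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")

def toy_encrypt(block: int, key: bytes, rounds: int = 4) -> int:
    """Encrypt a 64-bit integer with a toy Feistel network (illustration only)."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for i in range(rounds):
        left, right = right, left ^ _round(right, key, i)
    return (left << 32) | right

def toy_decrypt(block: int, key: bytes, rounds: int = 4) -> int:
    """Undo toy_encrypt by running the rounds in reverse order with the same key."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for i in reversed(range(rounds)):
        left, right = right ^ _round(left, key, i), left
    return (left << 32) | right

# Hypothetical sample data: 10,000 distinct phone numbers stored as integers.
key = b"demo-key"
numbers = [32475000000 + i for i in range(10_000)]
encrypted = [toy_encrypt(n, key) for n in numbers]

distinct_before = len(set(numbers))
distinct_after = len(set(encrypted))
round_trip_ok = all(toy_decrypt(c, key) == n for n, c in zip(numbers, encrypted))
print(distinct_before, distinct_after, round_trip_ok)
```

The number of distinct values is the same before and after encryption, and the same key recovers every original value.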

 

Since there are no collisions, you can safely use the ANATEL~4_img393 encrypt Action to anonymize your datasets. However, when anonymizing datasets containing MSISDN numbers, you lose, after encryption, some precious information about each MSISDN. The lost information is:
 

Is it a “short” phone number? (e.g. like the voice-mail number)

Is it an international call?

 
These pieces of information are *very* important when analyzing communication-graphs using SNA (Social Network Analysis) algorithms. You can use:
 

The “Extract Original Prefixes” option to keep the first few digits of the MSISDN un-encrypted (this makes it possible to detect international calls).

The “Extract Original Lengths” option to save the length of the un-encrypted MSISDN (this makes it possible to detect “short” phone numbers such as the voice-mail number).
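The idea behind these two options can be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout, the function name `anonymize`, the parameter `prefix_digits` and the stand-in `fake_encrypt` are all hypothetical, and the exact columns that Anatella produces may differ.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class AnonymizedMsisdn:
    encrypted: str         # the encrypted phone number
    original_prefix: str   # first digits kept un-encrypted
    original_length: int   # length of the un-encrypted number

def anonymize(msisdn: str,
              encrypt: Callable[[str], str],
              prefix_digits: int = 2) -> AnonymizedMsisdn:
    """Encrypt a phone number while keeping its prefix and its length."""
    return AnonymizedMsisdn(
        encrypted=encrypt(msisdn),
        original_prefix=msisdn[:prefix_digits],
        original_length=len(msisdn),
    )

def fake_encrypt(value: str) -> str:
    """Stand-in for a real cipher (NOT secure): simply reverses the digits."""
    return value[::-1]

voice_mail = anonymize("1234", fake_encrypt)
international = anonymize("0032475123456", fake_encrypt)

# A short original length reveals a "short" number; a "00" prefix reveals
# an international call, even though the number itself is hidden.
is_short_number = voice_mail.original_length <= 4
is_international = international.original_prefix == "00"
print(is_short_number, is_international)
```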

 

 

ANATEL~4_img129

Anonymizing a dataset using a non-symmetric encoding (i.e. a one-way hash such as MD5) can lead to some “collisions”. Non-symmetric encodings (such as MD5) are thus bad and dangerous alternatives when anonymizing datasets.

Let’s take an example. Assume that you are anonymizing 2 million different MSISDN using a 5-character MD5 code. A 5-character MD5 code can only have, at maximum, about 1 million different values (16^5 = 1,048,576). Since there are more MSISDN than available codes, at least 951,424 of the 2 million MSISDN necessarily share their code with another MSISDN. This means that you will have a *catastrophic* number of collisions that will make your anonymized dataset completely useless (actually, even if you use, on the same population, a 6-character MD5 code, there is a 99% chance that you’ll still have so many collisions that your anonymized dataset is also useless).
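The size of the problem can be estimated with a short Python sketch. It assumes that an "n-character MD5 code" means the first n hexadecimal characters of the MD5 digest and that the codes are spread uniformly; the function names and the sample phone numbers are hypothetical.

```python
import hashlib

def truncated_md5(value: str, chars: int) -> str:
    """First `chars` hexadecimal characters of the MD5 digest of `value`."""
    return hashlib.md5(value.encode("ascii")).hexdigest()[:chars]

def expected_distinct_codes(n_items: int, chars: int) -> float:
    """Expected number of distinct codes when n_items values are hashed
    uniformly into 16**chars possible codes."""
    n_codes = 16 ** chars
    return n_codes * (1.0 - (1.0 - 1.0 / n_codes) ** n_items)

population = 2_000_000
for chars in (5, 6):
    distinct = expected_distinct_codes(population, chars)
    lost = population - distinct
    print(f"{chars} characters: about {distinct:,.0f} distinct codes, "
          f"about {lost:,.0f} MSISDN hidden by collisions")

# Small empirical check: 20,000 numbers hashed to 3 hexadecimal characters
# can never give more than 16**3 = 4,096 distinct codes.
numbers = [f"32475{i:06d}" for i in range(20_000)]
codes = {truncated_md5(n, 3) for n in numbers}
print(len(numbers), "numbers ->", len(codes), "distinct codes")
```

Under these assumptions, each collision merges two or more subscribers into one anonymized identity, which is what corrupts any later analysis of the communication graph.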