5.1.2. Data Types


 

There are 3 basic data-types available inside Anatella:
 

1. Unknown type (or String type)
2. Float type (also named “double”)
3. Key type

 

 
Dates and times are a special case: they can be represented using either the Unknown/String type or the Key type.

 
Use the ChangeDataType Action to convert from one data-type to another.

 

 
Each different data-type has specific properties:

 

| Data-Type | Unknown type (or String type) | Float type (double) | Key type |
|---|---|---|---|
| Memory consumption for the storage of one cell/value in RAM | Depends on the string size s (see the second table) | 8 bytes | 4 bytes |
| Space consumption for the storage on the hard drive: compression with the .gel_anatella file format | ★★★ | ★★ | [image] |
| Space consumption for the storage on the hard drive: compression with the .cgel_anatella file format | ★★★ [image] | ★★ [image] | ★★★★★★★ |
| Admissible values | NULL value, empty string, any string | NULL value, NaN (Not A Number), +Inf, -Inf, any number | NULL value, any positive integer number less than 4,294,967,294 |
| Computation speed (efficiency in manipulating variables of this type) | Moderate | High | Very high |

Memory consumption in RAM of one Unknown/String cell, as a function of the string size s:

| String size = s | Bytes |
|---|---|
| NULL value | 1 |
| s < 128 chars | 1 + 2 s |
| s < 16384 chars | 2 + 2 s |
| s < 2097152 chars | 3 + 2 s |
| otherwise | 4 + 2 s |
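The RAM figures in the table can be turned into a small calculation. The following Python sketch is only an illustration of the table, not part of Anatella; the function name `string_cell_bytes` and the constants are hypothetical names chosen for this example.

```python
from typing import Optional

# Fixed-size types, as listed in the table above.
FLOAT_BYTES = 8  # one Float (double) cell
KEY_BYTES = 4    # one Key cell


def string_cell_bytes(s: Optional[int]) -> int:
    """RAM bytes for one Unknown/String cell holding s characters.

    Pass None for a NULL value. The size tiers are copied from the table;
    this helper itself is hypothetical and not an Anatella function.
    """
    if s is None:
        return 1  # NULL value
    if s < 0:
        raise ValueError("string size cannot be negative")
    if s < 128:
        header = 1
    elif s < 16384:
        header = 2
    elif s < 2097152:
        header = 3
    else:
        header = 4
    return header + 2 * s


print(string_cell_bytes(1))    # 3 (one-character string)
print(string_cell_bytes(128))  # 258
print(KEY_BYTES, FLOAT_BYTES)  # 4 8
```

According to these figures, a String cell becomes larger than a Key cell as soon as it holds two characters (5 bytes versus 4 bytes).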

 

 
In general, the most efficient data-type for storage is the “Key” Type because:
 

1. One “Key” value consumes only 4 bytes of RAM.
2. One “Key” value uses very little space on the hard drive, especially with the “.cgel_anatella” file format, where the compression is very efficient.

 

 
A one-character String consumes only 3 bytes and is thus slightly more efficient, in terms of RAM consumption, than the “Key” type (which uses 4 bytes). However, if you intend to run many numeric computations on a one-character String, the conversion from String to Float can be CPU intensive (especially for large tables). Furthermore, the “Key” type compresses *a lot* better than the “Unknown/String” type, so it uses far less hard-drive space. To summarize: even for numbers with only one digit, it’s nearly always better to use the “Key” type.
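To decide whether a column qualifies for the “Key” type before converting it, you can check its values against the admissible ranges listed in the table. The Python sketch below works outside Anatella, on raw text cells; `suggest_type` and `KEY_LIMIT` are hypothetical names. It assumes that 0 is accepted as a Key value, which the table (“positive integer”) does not state explicitly.

```python
from typing import Iterable, Optional

KEY_LIMIT = 4_294_967_294  # Key values must be less than this (from the table)


def suggest_type(values: Iterable[Optional[str]]) -> str:
    """Suggest "Key", "Float" or "String" for a column of raw text cells.

    None stands for a NULL value. Hypothetical helper, not an Anatella API.
    Assumption: 0 is treated as an admissible Key value.
    """
    fits_key = True
    fits_float = True
    for value in values:
        if value is None:
            continue  # NULL is admissible in all three types
        text = value.strip()
        # Key: plain ASCII digits only, and below the limit.
        if not (text.isascii() and text.isdigit() and int(text) < KEY_LIMIT):
            fits_key = False
        # Float: anything Python can parse as a float, including NaN and +/-Inf.
        try:
            float(text)
        except ValueError:
            fits_float = False
    if fits_key:
        return "Key"
    if fits_float:
        return "Float"
    return "String"


print(suggest_type(["1", "7", None]))   # Key
print(suggest_type(["1", "2.5"]))       # Float
print(suggest_type(["1", ""]))          # String (the empty string is only a String)
```

Python's `float()` is more permissive than a strict numeric parser (it accepts, for example, “infinity”), so treat the result as a hint and check it against the actual data.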

 

 
Using a more efficient Data-type will allow you to:
 

1. Consume less disk space for your “.gel_anatella” or “.cgel_anatella” files.
2. Increase processing speed in many Actions, e.g. join operations (with the “MultiJoin” Action), variable creation (with the “Calculator” Action), and filtering (with the “FilterRows” Action).
3. Handle larger slave tables in the “MultiJoin” Action. The “MultiJoin” Action starts by loading all the slave tables into RAM. If these tables consume less memory (because of a more efficient data-type), you’ll be able to compute MultiJoins on slave tables containing a larger number of rows.
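As a rough illustration of the effect on slave tables, the Python sketch below estimates the RAM used by a single column under each data-type, using only the per-cell sizes given earlier in this section. The row count and the 9-character string length are hypothetical, `column_ram_gb` is a name invented for this example, and any per-table overhead is ignored.

```python
def column_ram_gb(rows: int, bytes_per_cell: int) -> float:
    """Approximate RAM, in GB (10^9 bytes), for one column of `rows` cells."""
    return rows * bytes_per_cell / 1e9


ROWS = 100_000_000  # hypothetical slave-table size

# Per-cell sizes from this section; a 9-character string has s < 128,
# so it costs 1 + 2 * s bytes.
per_cell = {
    "Unknown/String (9 chars)": 1 + 2 * 9,
    "Float": 8,
    "Key": 4,
}

for name, nbytes in per_cell.items():
    print(f"{name:<26} {column_ram_gb(ROWS, nbytes):.1f} GB")
# Unknown/String (9 chars)   1.9 GB
# Float                      0.8 GB
# Key                        0.4 GB
```

Under these assumptions, storing the column as “Key” rather than as a 9-character string leaves room for nearly five times as many rows in the same amount of RAM.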

 

 
You can easily see the data-type of each column in the Data-preview window:

 

[Screenshot: Data-preview window]

 

 

 


When a cell appears empty, it contains either an empty string or a NULL value. To find out which, look at the background color of the cell: if it is red, the cell contains the NULL value; otherwise it contains the empty string “”.