5.5.2.2. Faster Alternatives to the “simple” Sort


Sorting is one of the slowest operations that you can perform inside any ETL tool because it involves writing all the data to the hard drive (in tape files) and, just after, reading all the same data back from the hard drive (during the “Merge Sort”). All these I/O operations (writing to and reading from the hard drive) take a considerable amount of time (especially for large input tables). Thus, when optimizing your Anatella graph for speed, you should avoid using any Sort Action. There exist many different ways to obtain a sorted table without using a plain Sort Action. I strongly suggest that you use instead (if possible):

1. a MergeSort Action.

2. a MergeSortInput Action.

3. a partitionedSort Action.

These 3 Actions are incomparably faster than a plain Sort Action.
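To make the I/O cost described above concrete, here is a minimal Python sketch of a generic external sort: every row is written to disk once (as sorted runs, playing the role of the “tape files”) and read back once (during the merge). This is not Anatella's implementation; the function names `external_sort` and `_read_run` and the `chunk_size` parameter are hypothetical, and rows are assumed to be strings without newline characters.

```python
import heapq
import os
import tempfile


def _read_run(path):
    # Stream one sorted run back from disk, one row at a time.
    with open(path) as f:
        for line in f:
            yield line.rstrip("\n")


def external_sort(rows, chunk_size):
    """Sort an iterable of strings using temporary run files on disk."""
    run_paths = []

    def flush(chunk):
        # Pass 1: sort one chunk in memory and write it out as a sorted run.
        fd, path = tempfile.mkstemp(suffix=".run", text=True)
        with os.fdopen(fd, "w") as f:
            f.writelines(row + "\n" for row in sorted(chunk))
        run_paths.append(path)

    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) >= chunk_size:
            flush(chunk)
            chunk = []
    if chunk:
        flush(chunk)

    # Pass 2: read every run back and merge them into one sorted stream.
    try:
        yield from heapq.merge(*(_read_run(p) for p in run_paths))
    finally:
        for p in run_paths:
            os.remove(p)
```

When the inputs are already sorted, only the merge pass is needed (for example `heapq.merge(sorted_a, sorted_b)`), with no run files written at all. This is presumably why merge-based Actions can be so much cheaper than a full sort.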

 

Sorting is required for the simple Join Action to work properly. If possible, replace the simple Join Action with a MultiJoin Action, to avoid sorting all the data.
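As an illustration of how a join can avoid sorting, here is a minimal Python sketch of a hash-based inner join: one table is indexed in memory by its key, and the other is streamed against that index in whatever order it arrives. Whether the MultiJoin Action works this way internally is an assumption; the function name `hash_join` and the dictionary-based row format are hypothetical.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Inner-join two lists of dicts on `key` without sorting either one."""
    # Build an in-memory index of the right-hand table (assumed to fit in RAM).
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    # Stream the left-hand table in its original order and look up matches.
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**match, **row})
    return joined
```

The trade-off is memory: the indexed table must fit in RAM, whereas a join on sorted inputs can stream both tables.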

 

Sorting is also required for the “out-of-memory” mode of the Aggregate Action to work properly. If possible, replace the “out-of-memory” mode with the “in-memory” mode, to avoid sorting all the data.
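The difference between the two modes can be sketched in Python as follows. A streaming aggregation keeps only one group in memory at a time, but it is correct only when rows with the same key are adjacent (that is, when the input is sorted); a dictionary-based aggregation accepts rows in any order but keeps one entry per distinct key in memory. This is a generic sketch, not the Aggregate Action's actual code; the function names and the `(key, value)` row format are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted(rows):
    """Streaming sum per key; correct only if rows are already sorted by key."""
    for key, group in groupby(rows, key=itemgetter(0)):
        yield key, sum(value for _, value in group)


def aggregate_in_memory(rows):
    """Sum per key using a dict; input order does not matter."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals
```

Feeding unsorted rows to `aggregate_sorted` splits a key into several partial groups, which is why the streaming approach needs a sort upstream.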