<< Click to Display Table of Contents >> Navigation: 5. Detailed description of the Actions > 5.4. Join Tables > 5.4.2. Join (High-Speed action) |
Icon:
Function: Join
Property window:
Short description:
Join several tables.
All the joins computed by the Join Action are using the same unique KEY column inside the MASTER table (Use the MultiJoin action if you have many different keys inside the MASTER table).
Long Description:
Pre-requisite
All the input tables must be sorted on the "Key" columns using the SAME sorting algorithm (all "numeric sort" or all "alpha-numeric sort").
This Action joins several tables on a key. Some definition:
•The first input pin contains the MASTER table.
•The other input pins (second, third, etc.) contain the SLAVE tables.
During the join, the SLAVE tables are added to (i.e. “joined with”) the MASTER table.
Please refer to the section “5.1.7. Composite Primary Keys”, to know how to handle composite primary keys.
The Join Action checks that all the input tables are sorted in the same way (i.e. using the same sorting algorithm). Sorting all the input table is required for the Join Action to work (in opposition to the MultiJoin Action that does not require any sort at all).
Because the input tables of the Join Action MUST be sorted for the join to work, this produces an error:
..but this will work (provided that the two Sort Actions are properly parameterized):
When you use the "sort action", it modifies the meta-data of the table, so that it now includes the information "this table is now sorted in a specific way". Using this meta-data, the "Join operator" is able to check if the input tables are correctly sorted.
This will work if the two .gel_anatella files are properly sorted:
This means that, if you intend to do several join with a specific table, it’s better to SORT and thereafter save the table. In this way, you can easily compute many joins because the table is already sorted in the .gel_anatella file (and, thus, you don’t need to sort it, again and again, anymore).
All the input tables used inside a Join Action are read simultaneously (in opposition to the MultiJoin Action that, first, reads completely all the SLAVE tables and then, once it’s finished, starts reading the MASTER table).
The RAM Memory consumption of the Join Action is negligible (i.e. very small) (in opposition to the MultiJoin Action that, stores completely all the SLAVE tables in RAM).
NOTE:
You can only define one key inside the MASTER TABLE. If you want to define several different keys inside the MASTER TABLE, use the MultiJoin Action.
Let’s give some examples: Here are the input tables:
+------------------+ +-------------------+ +-------------------+
| MASTER TABLE(A) | | SLAVE TABLE 1(B1) | | SLAVE TABLE 2(B2) |
+----+------+------+ +-----+------+------+ +-----+------+------+
|KeyA|Field1|field2| |KeyB1|Field1|field2| |KeyB2|Field1|field2|
+----+------+------+ +-----+------+------+ +-----+------+------+
| 1| A| B| | 1| C| D| | 1| E| F|
| 2| AA| BB| | 3| CC| DD| | 2| EE| FF|
| 3| AAA| BBB| +-----+------+------+ | 4| EEE| FFF|
| 3| AAAA| BBBB| +-----+------+------+
| 5| AAAAA| BBBBB|
+----+------+------+
For the “Left Outer Join” (represented by the or icon), we obtain:
+--------------------------------------------------------------+
| OUTPUT TABLE |
+----+------+------+----------+----------+----------+----------+
|KeyM|Field1|field2|T01.Field1|T01.field2|T02.Field1|T02.field2|
+----+------+------+----------+----------+----------+----------+
| 1| A| B| C| D| E| F|
| 2| AA| BB| | | EE| FF|
| 3| AAA| BBB| CC| DD| | |
| 3| AAAA| BBBB| CC| DD| | |
| 5| AAAAA| BBBBB| | | | |
+----+------+------+----------+----------+----------+----------+
For the “Full Outer Join” (represented by the icon) between Table A and Table B2, we obtain:
+----------------------------------------+
| OUTPUT TABLE |
+----+------+------+----------+----------+
|KeyM|Field1|field2|T02.Field1|T02.field2|
+----+------+------+----------+----------+
| 1| A| B| E| F|
| 2| AA| BB| EE| FF|
| 3| AAA| BBB| | |
| 3| AAAA| BBBB| | |
| 4| | | EEE| FFF|
| 5| AAAAA| BBBBB| | |
+----+------+------+----------+----------+
For the “Inner Join” (represented by the icon) between Table A and Table B1, we obtain:
+----------------------------------------+
| OUTPUT TABLE |
+----+------+------+----------+----------+
|KeyM|Field1|field2|T01.Field1|T01.field2|
+----+------+------+----------+----------+
| 1| A| B| C| D|
| 3| AAA| BBB| CC| DD|
| 3| AAAA| BBBB| CC| DD|
+----+------+------+----------+----------+
For the “A minus B Join” (represented by the icon) between Table A and Table B2, we obtain:
+------------------+
| OUTPUT TABLE |
+----+------+------+
|KeyM|Field1|field2|
+----+------+------+
| 3| AAA| BBB|
| 3| AAAA| BBBB|
| 5| AAAAA| BBBBB|
+----+------+------+
For the “A xor B Join” (represented by the icon) between Table A and Table B2, we obtain:
+----------------------------------------+
| OUTPUT TABLE |
+----+------+------+----------+----------+
|KeyM|Field1|field2|T02.Field1|T02.field2|
+----+------+------+----------+----------+
| 3| AAA| BBB| | |
| 3| AAAA| BBBB| | |
| 4| | | EEE| FFF|
| 5| AAAAA| BBBBB| | |
+----+------+------+----------+----------+
NOTE:
For the join, no row of the MASTER TABLE will ever be duplicated.
NOTE:
This Action is actually a good example of meta-data management: the "Join Action" looks at the "meta-data" of the input tables and refuses to join the input tables if the meta-data of the input tables says that "these table are not sorted properly".
This operator is an example of clever meta-data management. Most "Anatella-Action" are able to work without using any meta-data at all about the columns of the different tables (i.e. usually the actions don't need to know, for example, if the columns contain numbers or characters). ...but this does NOT mean that Anatella is not able to manipulate and manage meta-data information. On the contrary! For example: when the "Join operator" wants to test the meta-data of the input tables (to see if these input tables are properly sorted), the Anatella framework is providing ALL the required functionalities to do so. Most of the time, with Anatella, you don't need to do any meta-data management but when you really need to do it, it's really easy and powerful.