What’s the total RAM memory consumption of the above graph?
Each Sort Action consumes 5 GB of RAM. Since these two Sort Actions are running in parallel, the total RAM memory consumption of this graph is 5 GB + 5 GB = 10 GB. The same graph executed on 1 CPU (i.e. without the Multithread Actions) consumes only 5 GB of RAM.
When you execute an Anatella graph in parallel, the total RAM memory consumption usually increases substantially (compared to the “1-CPU-execution”). In the above example, the memory consumption for the “parallel” execution rises to 10 GB, compared to only 5 GB for the simple, “sequential” execution.
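The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of Anatella: the function name `peak_ram_gb` is hypothetical, and the sketch assumes that, in a sequential run, each Action releases its RAM before the next one starts (so the peak is the largest single Action), while in a parallel run all the Actions hold their RAM at the same time (so the peak is the sum).

```python
def peak_ram_gb(action_ram_gb, parallel):
    """Rough estimate of the peak RAM (in GB) used by a set of Actions.

    Assumption (not from the Anatella documentation): sequential
    Actions free their RAM before the next one starts, parallel
    Actions all hold their RAM simultaneously.
    """
    if not action_ram_gb:
        return 0.0
    return sum(action_ram_gb) if parallel else max(action_ram_gb)


# The two Sort Actions of the example, 5 GB each.
sorts = [5.0, 5.0]
print(peak_ram_gb(sorts, parallel=True))   # 10.0
print(peak_ram_gb(sorts, parallel=False))  # 5.0
```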
If the total RAM memory consumption exceeds the amount of physical RAM inside your computer, then you are in trouble because MSWindows will be forced to “swap” (also called “paging”). When MSWindows “swaps”, processing speed is divided by 100 or more: the computation becomes so slow that it’s better to stop it and re-design your data-transformation-graph to use less RAM.
You can check the total physical RAM memory available to you, and the memory consumption of your Anatella graphs, by looking at the MSWindows “Task Manager”.
To run the MSWindows “Task Manager”, right-click the MSWindows Taskbar and select “Start Task Manager”:
The amount of total physical RAM memory available to run your graphs is visible here:
In the above example, I still have 7572 MB ≈ 7.5 GB of Available Physical RAM memory to run my Anatella graphs. When you start your data-transformation-graph, Anatella starts consuming some RAM memory and the amount of Available Physical RAM memory decreases. When this amount drops to zero, MSWindows will start “swapping” and all computations (on the whole computer) nearly stop. You should avoid that!
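As a quick sanity check, you can compare the Available Physical RAM memory reported by the Task Manager with the RAM that your graph is expected to need. The small Python sketch below does this subtraction. The function name `ram_headroom_mb` is hypothetical, the sketch assumes 1 GB = 1024 MB, and the values are typed in by hand (they are not read from MSWindows).

```python
MB_PER_GB = 1024  # assumption: binary gigabytes


def ram_headroom_mb(available_mb, required_gb):
    """Return the RAM (in MB) left over once the graph is running.

    A negative result means that the graph needs more RAM than is
    currently available, so MSWindows would have to swap.
    """
    return available_mb - required_gb * MB_PER_GB


available_mb = 7572  # value read by hand in the Task Manager

print(ram_headroom_mb(available_mb, 5))   # 2452  (sequential run fits)
print(ram_headroom_mb(available_mb, 10))  # -2668 (parallel run would swap)
```

With 7572 MB available, the “sequential” 5 GB execution fits, but the “parallel” 10 GB execution does not.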
The Actions that consume a great quantity of RAM memory are:
•The MultiJoin Action: This Action starts by loading into RAM memory all the Slave tables (on pin 1 and above). If there are many large slave tables, the MultiJoin Action will consume a great quantity of RAM.
•The FilterOnKey Action: This Action works in the same way as the MultiJoin Action: It starts by loading into RAM memory the table on the second pin. If this table is large, then it will consume a large quantity of RAM.
•The Sort Action: To sort a large table, you need a large “tape size”. This means creating a large buffer in RAM memory (that contains the data of one tape).
•The Aggregate Action (when the option “Use in-RAM algorithm for small output tables” is checked): All the output tables (i.e. all the “aggregates”) must fit into RAM memory.
•Most of the Actions for “Graph Mining” load the whole graph to analyze into RAM memory. If this graph is large, then it will consume a large quantity of RAM. These Actions include the CommunityDetection Action, the NodeAnalysis Action, the SignificanceLevel Action and the Leadership Action.
•Most of the Actions for “Operational Research” load the whole table to analyze into RAM memory. If this table is large, then it will consume a large quantity of RAM. These Actions include the KNN Action, the DatasetReduction Action and the AssignmentSolver Action.
You should pay extra attention to RAM memory consumption when the above Actions are running in parallel. If you exceed the total amount of Available Physical RAM memory of your computer, all computations will slow down radically.
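One simple way to apply this rule is to estimate, before you add Multithread Actions, how many RAM-hungry Actions can safely run at the same time. The Python sketch below is only a back-of-the-envelope helper, not an Anatella feature: the function name `max_parallel_actions` and the `reserve_mb` parameter are hypothetical, and the sketch assumes that every parallel Action needs the same amount of RAM.

```python
def max_parallel_actions(available_mb, ram_per_action_mb, reserve_mb=0):
    """Number of identical Actions that fit in the available RAM.

    available_mb      : Available Physical RAM read in the Task Manager
    ram_per_action_mb : RAM needed by one Action (assumed identical for all)
    reserve_mb        : RAM kept free for the rest of the system (optional)
    """
    if ram_per_action_mb <= 0:
        raise ValueError("ram_per_action_mb must be positive")
    usable_mb = available_mb - reserve_mb
    return max(0, usable_mb // ram_per_action_mb)


# With 7572 MB available and Sort Actions of 5 GB (5 * 1024 MB) each,
# only one Sort Action can run at a time without swapping.
print(max_parallel_actions(7572, 5 * 1024))  # 1
```

If the result is smaller than the number of Actions that you planned to run in parallel, run some of them sequentially or reduce their RAM consumption (for example, with a smaller “tape size” for the Sort Action).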