About Spark and the Cloud
Here you’ll find two YouTube videos that explain:
* Amdahl’s Law and the “incompressible time” of distributed computation engines.
* why you shouldn’t use Spark for ETL processes.
* why it’s better to avoid using “cloud solutions” (Amazon, Azure) for “data science” projects.
(Subtitles are available in English and French.)
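For readers who want a quick feel for Amdahl’s Law before watching, here is a minimal sketch in Python. It assumes that the “incompressible time” corresponds to the serial fraction of a job (the part that cannot be parallelized), which is an interpretation made for this example and not a definition taken from the videos. The function names and sample values are hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Theoretical speedup of a job according to Amdahl's Law.

    parallel_fraction: share of the single-machine run time that can be
                       parallelized (between 0.0 and 1.0).
    workers:           number of machines or cores working in parallel.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0.0 and 1.0")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Assumption for this sketch: the serial part plays the role of the
    # "incompressible time" and does not shrink when workers are added.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def max_speedup(parallel_fraction: float) -> float:
    """Upper bound on the speedup as the number of workers grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    if serial_fraction == 0.0:
        return float("inf")
    return 1.0 / serial_fraction


if __name__ == "__main__":
    # Hypothetical job in which 90% of the run time is parallelizable.
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2))
    print("limit:", round(max_speedup(0.9), 2))
```

Under these assumptions, a job that is 90% parallelizable can never run more than about 10 times faster, however many workers are added.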
The presentation used in the two videos:
A quick one-page executive summary of the two videos:
A white paper that summarizes the findings explained in the two videos:
The video by Mr. Frédéric Pierucci:
The GitHub repository with the Anatella graphs and the Scala code used in the video: