About Spark and the Cloud

You’ll find here two YouTube videos that explain:
* Amdahl’s Law and the “incompressible time” of distributed computation engines.
* why you shouldn’t use Spark for ETL processes.
* why you shouldn’t use any “cloud solutions” (Amazon, Azure) for “data science” projects.

Subtitles are available in English and French.
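For readers who want a quick feel for Amdahl’s Law before watching, here is a minimal Python sketch. It is not taken from the videos: the function names are hypothetical, and it assumes the “incompressible time” can be read as the serial fraction of a job, the part that does not shrink when workers are added.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Theoretical speedup under Amdahl's Law.

    parallel_fraction: share of the single-worker runtime that can be
    spread across workers (assumption: the rest is "incompressible").
    workers: number of workers sharing the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is paid in full; only the parallel part is divided.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def speedup_limit(parallel_fraction: float) -> float:
    """Upper bound on the speedup as the number of workers grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    if serial_fraction == 0.0:
        return float("inf")
    return 1.0 / serial_fraction


if __name__ == "__main__":
    # Illustrative figure only: a job assumed to be 90% parallelizable.
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2))
    print("limit", round(speedup_limit(0.9), 2))
```

With 90% of the work parallelizable, 10 workers give a speedup of roughly 5.3, and no number of workers can push it past 10.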
Data Vaulting

Data vaulting: from a bad idea to inefficient implementations

An efficient data management mechanism should have two main characteristics: operational efficiency (it must run faster and with fewer resources than the mechanisms it aims to replace) and structural clarity (it must be straightforward to access, understand, and query). As an IT data manager, you know you sometimes