cloud

About Spark and the Cloud

You’ll find here two YouTube videos that explain:
* Amdahl’s Law and the “incompressible time” of distributed computation engines.
* why you shouldn’t use Spark for ETL processes.
* why it’s better to avoid using “cloud solutions” (Amazon, Azure) for “data science” projects.
(subtitles in English and French are
Data Vaulting

Data vaulting: from a bad idea to inefficient implementations

An efficient data management mechanism should have two main characteristics: operational efficiency (it must run faster and with fewer resources than the mechanisms it aims to replace) and structural clarity (it must be straightforward to access, understand, and query). As an IT data manager, you know you sometimes