Software Engineering Daily

Apache Arrow with Uwe Korn

In a typical data analytics system, a variety of technologies interact: HDFS for storing files, Spark for distributed machine learning, pandas for data analysis in Python. Each of these technologies represents data in a different format. Serialization and deserialization between these formats cause significant latency across the overall system. Apache Arrow is a tool for improving the performance of in-memory analytics systems, and today's

Continue reading...

Next Episodes

Economics of Software with Russ Roberts @ Software Engineering Daily

📆 2016-07-15 06:14 / ⌛ 01:03:52


IoT Analytics with Jean-Christophe Cimono @ Software Engineering Daily

📆 2016-07-14 01:08 / ⌛ 00:48:58


Cassandra Data Modeling with Jon Haddad @ Software Engineering Daily

📆 2016-07-13 07:16 / ⌛ 00:56:11


Salary Negotiation with Haseeb Qureshi @ Software Engineering Daily

📆 2016-07-12 05:00 / ⌛ 01:34:26


Platforms with Bridget Kromhout @ Software Engineering Daily

📆 2016-07-11 07:16 / ⌛ 00:57:23