Tag: dataset

Introduction to Dataset API
Date: March 12, 2018 | Categories: Apache Spark

Apache Spark introduced the Dataset API, which unifies the programming experience, improves performance, and reduces the learning curve for Spark developers. This is a great link to get familiar with Dataset. If the link doesn't work when you are reading this post, Google is your friend. I want to save time and get…

Migrating to Spark 2.0
Date: October 4, 2017 | Categories: Apache Spark

Spark 2.0 provides a more mature ecosystem, a unified data abstraction API, and new performance benchmarks, along with some non-backward-compatible changes. Here, we look at some important things to learn and remember before migrating existing Spark projects to Spark 2.0. The following is not a complete list of points but presents…
