Tag: sqlcontext

Migrating to Spark 2.0
Date: October 4, 2017 Categories: Apache Spark

Spark 2.0 provides a more mature ecosystem and a unified data abstraction API, and it sets new performance benchmarks, along with some non-backward-compatible changes. Here we look at some important things to learn and remember before migrating existing Spark projects to Spark 2.0. The following is not a complete list of points but presents…

Apache Spark Transformation – DataFrame
Date: August 7, 2017 Categories: Apache Spark

A DataFrame can be created from any structured dataset, such as JSON, a relational table, Parquet, or an existing RDD with a defined schema. The following program creates a DataFrame and queries it using SQL. Here is the JSON we will use to play with; copy the following lines into a file and save it in <SPARK_HOME>/bin…
