Tag: mappartitions

Processing multiline JSON file – Apache Spark
Date: August 31, 2017 Categories: Apache Spark

Apache Spark is great for processing JSON files: you can create DataFrames right away and start issuing SQL queries against them by registering them as temporary tables. This works very well when the file is line-delimited, that is, when each line holds one complete JSON object. In such a happy path JSON can be…
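
To make the "one JSON object per line" assumption concrete, here is a minimal sketch in plain Python. It is not Spark code and not the post's own method; the helper name `parse_json_lines` and the sample records are made up for illustration. It only shows why a reader that parses each line independently handles line-delimited input but fails on a pretty-printed object that spans several lines.

```python
import json


def parse_json_lines(text):
    """Parse text in which every non-empty line is one complete JSON object.

    Hypothetical helper for illustration; it mimics a line-by-line reader.
    """
    records = []
    for line in text.splitlines():
        if line.strip():
            # Each line is parsed on its own, with no knowledge of its neighbours.
            records.append(json.loads(line))
    return records


# Happy path: one complete object per line (sample data, assumed).
line_delimited = '{"name": "a", "value": 1}\n{"name": "b", "value": 2}\n'

# The same kind of object pretty-printed across several lines.
multiline = '{\n  "name": "a",\n  "value": 1\n}\n'

records = parse_json_lines(line_delimited)

try:
    parse_json_lines(multiline)
    multiline_ok = True
except json.JSONDecodeError:
    # The first line is just "{", which is not valid JSON by itself.
    multiline_ok = False
```

In this sketch the line-delimited input yields two records, while the multiline input fails on its first line because no single line is a complete JSON document.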
