Download and configure on Mac/Windows
Download the version of Spark that you want to work with from here. Copy the downloaded tgz file to the folder where you want it to reside. Either double-click the package or run the command tar -xvzf /path/to/yourfile.tgz, which extracts the Spark package. Navigate to the bin folder and start ./spark-shell…
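The extract-and-launch steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: so that it runs without a real download, it first builds a tiny stand-in archive (the hypothetical spark-demo.tgz, containing a placeholder bin/spark-shell) in a temporary directory. With a real Spark download you would skip that part and point ARCHIVE at your own tgz file.

```shell
#!/bin/sh
set -eu

# Hypothetical locations, used only for this sketch.
WORK_DIR="$(mktemp -d)"
ARCHIVE="$WORK_DIR/spark-demo.tgz"   # stand-in for the downloaded Spark tgz
INSTALL_DIR="$WORK_DIR/install"      # folder where the package should reside

# Build a stand-in archive with a placeholder bin/spark-shell.
# This is NOT Spark; it only lets the sketch run without a download.
mkdir -p "$WORK_DIR/src/spark-demo/bin"
printf '#!/bin/sh\necho "spark-shell placeholder"\n' > "$WORK_DIR/src/spark-demo/bin/spark-shell"
chmod +x "$WORK_DIR/src/spark-demo/bin/spark-shell"
tar -czf "$ARCHIVE" -C "$WORK_DIR/src" spark-demo

# Extract the package into the target folder (same flags as in the text).
mkdir -p "$INSTALL_DIR"
tar -xvzf "$ARCHIVE" -C "$INSTALL_DIR"

# Navigate to the bin folder and start the shell.
cd "$INSTALL_DIR/spark-demo/bin"
./spark-shell
```

With an actual Spark package, the extracted folder name depends on the version you downloaded, so the cd path will differ from the spark-demo name assumed here.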
If you haven’t read the previous article about MapReduce, I’d highly recommend reading it, because it lays a good foundation for appreciating why Spark exists.
Apache Spark – Introduction
I want to get to the practical exercises quickly, and I think there are enough resources on the internet that explain the theory behind the framework…
MapReduce – Quick Intro
If you are reading this page, I assume you have heard of MapReduce. Let us go over the MR framework quickly, since understanding it is needed to appreciate Apache Spark. MapReduce is the core, de facto data processing framework of Apache Hadoop. The beauty of this framework was…