2020-07-09 13:26:50 +00:00

Spark Boilerplate

This is a boilerplate project for Apache Spark. The related blog post can be found at https://www.barrelsofdata.com/spark-boilerplate-using-scala

Build instructions

From the root of the project, run the following commands:

  • To clear all compiled classes, build and log directories
./gradlew clean
  • To run tests
./gradlew test
  • To build the jar
./gradlew shadowJar
  • All combined
./gradlew clean test shadowJar

Run

spark-submit --master yarn --deploy-mode cluster build/libs/spark-boilerplate-1.0.jar

Description

This project shows how to write a word count program using Spark Structured Streaming, along with unit test cases.
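
For orientation, here is a minimal sketch of what a Structured Streaming word count can look like. It is not the project's actual code: the object name WordCountSketch, the helper countWords, and the socket source on localhost:9999 are assumptions made for illustration. The counting logic is kept in a separate function that takes and returns a DataFrame, so that it can be exercised in a unit test with a static DataFrame instead of a live stream.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, split}

// Hypothetical object name, used only for this sketch.
object WordCountSketch {

  // Splits each line in the "value" column on whitespace and counts
  // occurrences per word. Works on static and streaming DataFrames alike.
  def countWords(lines: DataFrame): DataFrame =
    lines
      .select(explode(split(col("value"), "\\s+")).as("word"))
      .filter(col("word") =!= "") // drop empty tokens from leading whitespace
      .groupBy("word")
      .count()

  def main(args: Array[String]): Unit = {
    // The master is expected to be supplied by spark-submit.
    val spark = SparkSession.builder().appName("WordCountSketch").getOrCreate()

    // Assumed source: a text socket, e.g. fed by `nc -lk 9999`.
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")
      .option("port", "9999")
      .load()

    // "complete" mode re-emits the full table of counts on every trigger.
    val query = countWords(lines).writeStream
      .outputMode("complete")
      .format("console")
      .start()

    query.awaitTermination()
  }
}
```

A unit test under these assumptions could build a local SparkSession, pass a small static DataFrame with a single "value" column to countWords, and compare the collected rows with the expected counts.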