A boilerplate template for Apache Spark projects
karthik 201d80b49d
Add tests action
2023-03-22 19:04:04 +01:00
.gitea/workflows Add tests action 2023-03-22 19:04:04 +01:00
gradle/wrapper Update spark and gradle versions, fix deprecated features 2023-03-16 12:32:45 +01:00
src Update spark and gradle versions, fix deprecated features 2023-03-16 12:32:45 +01:00
.gitignore Initial commit 2020-06-21 09:55:22 -04:00
build.gradle Fix typography 2023-03-16 16:04:15 +01:00
gradle.properties Fix typography 2023-03-16 16:04:15 +01:00
gradlew Update spark and gradle versions, fix deprecated features 2023-03-16 12:32:45 +01:00
gradlew.bat Update spark and gradle versions, fix deprecated features 2023-03-16 12:32:45 +01:00
README.md - Use native gradle build to generate jar with dependencies, removed shadowJar plugin 2020-11-25 02:35:56 -05:00
settings.gradle Initial commit 2020-06-21 09:55:22 -04:00

Spark Boilerplate

This is a boilerplate project for Apache Spark. The related blog post can be found at https://www.barrelsofdata.com/spark-boilerplate-using-scala

Build instructions

From the root of the project, run the following commands:

  • To clear all compiled classes, build and log directories
./gradlew clean
  • To run tests
./gradlew test
  • To build the jar
./gradlew build
  • All combined
./gradlew clean test build

Run

spark-submit --master yarn --deploy-mode cluster build/libs/spark-boilerplate-1.0.jar
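
The jar path in the command above follows the pattern build/libs/<name>-<version>.jar. As a minimal sketch, the command could be assembled by a small helper script so that the master and deploy mode can be swapped without retyping it. The function names and the MASTER and DEPLOY_MODE environment variables are hypothetical and not part of this project. The script only prints the command and does not run it.

```shell
#!/usr/bin/env sh
# Sketch: assemble the spark-submit command for this project's jar.
# MASTER and DEPLOY_MODE are hypothetical overrides, not defined by the project.

# Build the jar path from a project name and version,
# following the build/libs/<name>-<version>.jar pattern.
jar_path() {
  printf 'build/libs/%s-%s.jar' "$1" "$2"
}

# Print (do not execute) the spark-submit command
# for a given master, deploy mode and jar path.
submit_command() {
  printf 'spark-submit --master %s --deploy-mode %s %s' "$1" "$2" "$3"
}

JAR="$(jar_path spark-boilerplate 1.0)"

# Defaults mirror the command shown in this README.
submit_command "${MASTER:-yarn}" "${DEPLOY_MODE:-cluster}" "$JAR"
echo
```

With no overrides the script prints the same command as above. Setting MASTER=local and DEPLOY_MODE=client before running it would print a variant intended for a single machine, assuming Spark is installed locally.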