generated from barrelsofdata/spark-boilerplate
karthik
409ade0326
Spark Word Count with Unit Tests
This project shows how to write a word count program in Apache Spark, along with unit tests for it. The related blog post can be found at https://barrelsofdata.com/spark-word-count-with-unit-tests
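As a rough illustration of the idea, the sketch below shows the word count logic and a unit test for it in plain Python. It is not the code from this repository and does not use Spark. The names `count_words` and `test_count_words` are hypothetical, and lowercasing and whitespace splitting are assumptions about how words are normalized. In a Spark job the same steps would typically be expressed as a split into words followed by a per-word aggregation.

```python
from collections import Counter
from typing import Dict, Iterable


def count_words(lines: Iterable[str]) -> Dict[str, int]:
    """Count word occurrences across lines (hypothetical helper, not the project's API)."""
    counts: Counter = Counter()
    for line in lines:
        # Assumption: words are separated by whitespace and compared case-insensitively.
        for word in line.lower().split():
            counts[word] += 1
    return dict(counts)


def test_count_words() -> None:
    # A small in-memory input stands in for the input file.
    lines = ["Spark counts words", "spark COUNTS", ""]
    assert count_words(lines) == {"spark": 2, "counts": 2, "words": 1}


if __name__ == "__main__":
    test_count_words()
    print(count_words(["to be or not to be"]))
```

Keeping the counting logic in a function that takes lines and returns counts is what makes it testable with a small in-memory input, without reading from HDFS.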
Build instructions
From the root of the project, run the following commands:
- To clear all compiled classes and the build and log directories
./gradlew clean
- To run tests
./gradlew test
- To build the jar
./gradlew build
Run
spark-submit --master yarn --deploy-mode cluster build/libs/spark-wordcount-1.0.0.jar hdfs://path/to/input/file.txt hdfs://path/to/output/directory