Latest commit: 409ade0326 by karthik, "Typo fix", 2023-10-07 13:15:53 +02:00

Spark Word Count with Unit Tests

This project shows how to write a word count program in Apache Spark, along with unit test cases. The related blog post is at https://barrelsofdata.com/spark-word-count-with-unit-tests

Build instructions

From the root of the project, run the following commands:

  • To remove all compiled classes and the build and log directories:
./gradlew clean
  • To run the tests:
./gradlew test
  • To build the jar:
./gradlew build

Run

spark-submit --master yarn --deploy-mode cluster build/libs/spark-wordcount-1.0.0.jar hdfs://path/to/input/file.txt hdfs://path/to/output/directory
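
To sanity-check the job's output on a small input, the same counts can be computed locally with standard Unix tools. The sketch below is not part of the project: `word_count` is a hypothetical helper, and it assumes words are separated by whitespace and are case-sensitive, which may differ from how the Spark job tokenizes and formats its results.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): reads text on stdin
# and prints "word count" pairs, one per line, sorted by word.
# Assumption: words are whitespace-separated and case-sensitive.
word_count() {
  tr -s '[:space:]' '\n' |   # put each word on its own line
    sed '/^$/d' |            # drop empty lines left by leading whitespace
    LC_ALL=C sort |          # group identical words together
    uniq -c |                # prefix each distinct word with its count
    awk '{print $2, $1}'     # reorder to "word count"
}

printf 'to be or not to be\n' | word_count
# be 2
# not 1
# or 1
# to 2
```

For a small file fetched from HDFS, the result of `word_count < file.txt` could be compared by hand against the job's output directory, keeping in mind that ordering and formatting there may differ.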