Apache Spark proposal
Project
We will grant permissions to submit the proposal
Name: Apache Spark
Desired Initial Maturity Level (Sandbox, Incubating, Graduated): Sandbox
Problem Statement (i.e. problem you want to solve): To my knowledge, Big Bang has no package that supports large-scale, distributed data processing. Such workloads include data pipelines, machine-learning models, and similar jobs. Apache Spark is a great open-source tool for these tasks.
Description: Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
Initial Members:
- Lucas Rodriguez @lucas.rodriguez