Apache Wayang is a system designed to fully support cross-platform data processing: it enables users to run data analytics over multiple data processing platforms. To do so, it provides an abstraction on top of existing platforms so that a data analytics task can run on any set of them. As a result, users can focus on the logic of their applications rather than on the intricacies of the underlying platforms.

Turning shadows into a show

Read more on how Apache Wayang turns the light and shadows of data processing platforms into amazing theatre for you.


How we move the strings for you


Run a single data analytics task on top of any set of data processing platforms.


Apache Wayang selects the best available data processing platform for each incoming query.


User-defined functions (UDFs) are first-class citizens, enabling extensibility and adaptability.


A simple interface that allows developers to focus only on the logic of their application.

Cost Saving

Fast development of data analytics applications.

Open Source

All code is on GitHub under the Apache License.

Why is Apache Wayang faster than other modern frameworks?

Apache Wayang uses internal optimization patterns to find the best possible combination of computation and nodes. Simply adding more nodes to a cluster does not guarantee more speed: each additional node brings trade-offs, such as shuffle or communication bottlenecks. Apache Wayang understands the UDFs and optimizes each function for the underlying processing platform. It also uses small JVM instances to reduce operational overhead when processing a small number of data points.
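
To make the trade-off concrete, here is a minimal sketch of cost-based platform selection. It is not Apache Wayang's optimizer or API: the `Platform` class, the `pick_platform` helper, the cost formula, and all numbers are assumptions made up for illustration. It only shows why a small local JVM can beat a cluster on small inputs, while the cluster wins once the per-record work outweighs its fixed overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Hypothetical cost profile of a processing platform (not Wayang's model)."""
    name: str
    startup_ms: float      # fixed overhead: job submission, scheduling, shuffle setup
    per_record_ms: float   # single-worker processing time per record
    parallelism: int       # number of workers that share the records

    def estimated_ms(self, num_records: int) -> float:
        # Fixed overhead plus the per-record work divided across the workers.
        return self.startup_ms + num_records * self.per_record_ms / self.parallelism


def pick_platform(platforms, num_records):
    """Return the platform with the lowest estimated runtime for this input size."""
    return min(platforms, key=lambda p: p.estimated_ms(num_records))


# Made-up numbers, for illustration only.
LOCAL_JVM = Platform("local-jvm", startup_ms=50, per_record_ms=0.01, parallelism=1)
CLUSTER = Platform("cluster", startup_ms=5000, per_record_ms=0.01, parallelism=20)
PLATFORMS = [LOCAL_JVM, CLUSTER]

if __name__ == "__main__":
    for n in (1_000, 100_000, 10_000_000):
        print(n, pick_platform(PLATFORMS, n).name)
```

With these assumed numbers, the local JVM is chosen for 1,000 and 100,000 records and the cluster for 10,000,000 records. A real optimizer would also have to account for data movement between platforms and for the cost of each operator, which this sketch ignores.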