Bayu Dwiyan Satria's Projects
Community health files
Apache Hadoop environment and configuration files for single-node (standalone) or cluster deployments
Apache installation and configuration
Apache Spark environment and configuration files for single-node (standalone) or cluster deployments
Java library parent for dependency and plugin management
Big Data Environment
BusyBox combines common UNIX utilities into a single small executable
Helm Charts
Curated applications for Kubernetes
Common Security Library
Platform and Environment System
Full project dependencies of the Master's degree library
Apache Hadoop
IBM Platform LSF Environment For Computing
Templates for infrastructure setup
Java common libs
Official Java client library for Kubernetes
Java Libraries Catalogue
Apache Hadoop. Apache Hadoop is a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Originally designed for computer clusters built from commodity hardware, which remains the common deployment, it has also been used on clusters of higher-end hardware. All the modules in Hadoop are designed with the fundamental assumption that hardware failures are common occurrences and should be handled automatically by the framework.
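As a minimal sketch of the MapReduce model described above, the canonical word-count job below is written against the standard org.apache.hadoop.mapreduce API. The WordCount class name and the command-line input/output paths are illustrative and not taken from these repositories.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every token in the input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the emitted counts for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // combiner pre-aggregates map output locally
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Because the framework assumes hardware failures are routine, failed map or reduce tasks are simply rescheduled on other nodes; the job code itself needs no failure handling.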
Apache Spark Libraries. Apache Spark has as its architectural foundation the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines that is maintained in a fault-tolerant way. The DataFrame API was released as an abstraction on top of the RDD, followed by the Dataset API. In Spark 1.x, the RDD was the primary application programming interface (API), but as of Spark 2.x use of the Dataset API is encouraged even though the RDD API is not deprecated. The RDD technology still underlies the Dataset API.
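To make the RDD-versus-Dataset distinction concrete, here is a minimal Java sketch that touches both APIs through one SparkSession. The input paths and the eventType column name are assumptions for illustration only.

```java
import java.util.Arrays;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class RddVsDataset {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("rdd-vs-dataset")
        .master("local[*]") // local mode for testing; omit when using spark-submit
        .getOrCreate();

    // Spark 1.x style: the low-level RDD API, transforming partitions directly.
    JavaRDD<String> lines = spark.read()
        .textFile("hdfs:///data/input.txt") // illustrative path
        .javaRDD();
    long wordCount = lines
        .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
        .count();
    System.out.println("words: " + wordCount);

    // Spark 2.x style: the Dataset/DataFrame API, which still executes on RDDs
    // underneath but lets the optimizer plan the aggregation.
    Dataset<Row> events = spark.read().json("hdfs:///data/events.json"); // illustrative path
    events.groupBy("eventType").count().show(); // "eventType" is an assumed column

    spark.stop();
  }
}
```

The two halves do equivalent kinds of work; the Dataset version simply declares the aggregation and leaves the physical execution plan to Spark, which is why the 2.x documentation steers new code toward it.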