An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: hadoop-docker

big-data-europe/docker-hadoop

Apache Hadoop docker image

Language: Shell - Size: 109 KB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 2,252 - Forks: 1,355

gathecageorge/hadoop

Contains docker files to build a hadoop container image

Language: Shell - Size: 51.8 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

Mgosi/Big-Data-Analysis-using-MapReduce-in-Hadoop

We explore data using Big Data analysis and visualization skills, performing three main operations: (i) data aggregation from different sources, (ii) Big Data analysis using MapReduce, and (iii) visualization through Tableau. Data analysis is critical to understanding the data and what we can do with it. Small datasets are easy to process, but big companies need to identify trends in order to decide what changes to make, so we introduce Big Data analysis to address this. In this lab, we collect close to 20,000 tweets, 500 articles from the New York Times, and 500 articles from Common Crawl data about Entertainment, which is our main topic. We preprocess this data and feed it to MapReduce to find the word count and word co-occurrence, then use the results to find trends in the collected data. We used Python to perform the data analysis.

Language: Jupyter Notebook - Size: 16.8 MB - Last synced at: 5 months ago - Pushed at: over 5 years ago - Stars: 6 - Forks: 3
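
As a rough illustration of the word-count and word co-occurrence steps mentioned in the description above, here is a minimal in-memory sketch in Python. It is not taken from the repository: the function names (`tokenize`, `map_word_count`, `map_cooccurrence`, `reduce_counts`, `run`) and the sample lines are assumptions made for illustration, and it defines co-occurrence as two distinct words appearing in the same line, which may differ from the project's definition. A real Hadoop job would distribute the map and reduce phases instead of running them in one process.

```python
from collections import Counter
from itertools import combinations


def tokenize(line):
    """Lowercase a line and split it on whitespace (a deliberately simple preprocessing step)."""
    return line.lower().split()


def map_word_count(line):
    """Mapper: emit (word, 1) for every token in the line."""
    for word in tokenize(line):
        yield word, 1


def map_cooccurrence(line):
    """Mapper: emit ((a, b), 1) for every unordered pair of distinct words in the line."""
    words = sorted(set(tokenize(line)))
    for pair in combinations(words, 2):
        yield pair, 1


def reduce_counts(pairs):
    """Reducer: sum the values emitted for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def run(mapper, lines):
    """Apply a mapper to every line and reduce the emitted pairs in memory."""
    emitted = (pair for line in lines for pair in mapper(line))
    return reduce_counts(emitted)


# Hypothetical sample input standing in for the collected tweets and articles.
SAMPLE_LINES = [
    "new movie release",
    "movie review",
    "new music release",
]

word_counts = run(map_word_count, SAMPLE_LINES)
pair_counts = run(map_cooccurrence, SAMPLE_LINES)
```

Sorting each line's distinct words before pairing means a pair such as `("new", "release")` is always keyed the same way regardless of word order in the text.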

Wittline/apache-spark-docker

Dockerizing an Apache Spark Standalone Cluster

Language: VBA - Size: 63.7 MB - Last synced at: 9 days ago - Pushed at: almost 3 years ago - Stars: 43 - Forks: 27

jbw/hadoop-docker-cluster

Hadoop cluster on Docker (single host)

Language: Shell - Size: 159 KB - Last synced at: 11 months ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

jbw/build-hadoop

Build Hadoop with Docker for Ubuntu. See releases for different architectures such as armv7l

Language: Dockerfile - Size: 5.86 KB - Last synced at: 11 months ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

vietanh85/hadoop-docker

Apache Hadoop Cluster Docker images

Language: Shell - Size: 103 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 3 - Forks: 0

adisve/hadoop-spark-cluster

A Spark/Hadoop-Docker Cluster template for working with Big Data

Language: Python - Size: 12.5 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

Rohit9314/my-hadoop

Set up a Hadoop cluster manually and automatically

Language: Python - Size: 23.4 KB - Last synced at: about 1 year ago - Pushed at: almost 8 years ago - Stars: 2 - Forks: 0

hyeonsangjeon/dataplatform

Hadoop 3.2 in single-node or cluster mode with the gotty web terminal, Spark, Jupyter PySpark, Hive, and other ecosystem tools

Language: Shell - Size: 549 KB - Last synced at: 23 days ago - Pushed at: over 5 years ago - Stars: 11 - Forks: 1

mjaglan/docker-hadoop-distributed-mode

Run Apache Hadoop 2.7 inside docker container in Multi-Node Cluster mode

Language: Shell - Size: 15.6 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 2

mjaglan/docker-hadoop-pseudo-distributed-mode

Run Apache Hadoop 2.7 inside docker container in pseudo-distributed mode

Language: Shell - Size: 16.6 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 1

ruoyu-chen/hadoop-docker

A Hadoop development and test environment built on Docker, including Hadoop, Hive, HBase, and Spark

Language: Shell - Size: 3.72 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 280 - Forks: 161

HoangNV2001/Docker-Hadoop-Hive-Spark-Zeppelin-Hue-Superset

Big data stack with Hadoop + Hive + Spark + Zeppelin + Hue + Superset

Language: Python - Size: 6.52 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

JuanCasado/Hadoop-Docker

Hadoop deployment on docker and Docker Swarm

Language: TSQL - Size: 88.8 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 1

lyingbo/hadoop-cluster-docker

Run Hadoop Cluster within Docker Containers

Language: Shell - Size: 32.2 KB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 8 - Forks: 3

mr-ravin/Smart-Hadoop-Cluster-SMHACL 📦

An automated Hadoop cluster building tool that uses distributed computing to create the cluster over the network. Implemented in Python 2.7.

Language: Python - Size: 1.64 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 1

waltherg/distributable_docker_sql_on_hadoop

Toy Hadoop cluster combining various SQL-on-Hadoop variants

Language: Shell - Size: 88.9 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 11 - Forks: 4

SharpData/docker-hive3

Hive 3 In Docker container

Language: Shell - Size: 2.3 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

MengmSun/hadoop-in-docker

Hadoop in docker cluster, created by docker-compose. Create Hadoop cluster in less than 5mins.

Language: Shell - Size: 6.13 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 4 - Forks: 0

Sidl419/hadoop_streaming

Building a recommender system based on a collaborative filtering algorithm and Hadoop Streaming

Language: Python - Size: 1.94 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

alex-ber/docker-hive (fork of ops-guru/docker-hive)

EMR 5.25.0 cluster single node Hadoop docker image. With Amazon Linux, Hadoop 2.8.5 and Hive 2.3.5

Language: Shell - Size: 45.9 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 1

fredrikhgrelland/docker-hadoop

Language: Dockerfile - Size: 18.6 KB - Last synced at: 8 days ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 1

simonprewo/vagrantbox-hadoop-containerexecutor

Size: 10.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

younthu/docker-hadoop (fork of big-data-europe/docker-hadoop)

Apache Hadoop docker image cluster

Language: Shell - Size: 759 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

kevin85421/docker-compile-hadoop

Compile hadoop in docker container

Language: Dockerfile - Size: 6.84 KB - Last synced at: 19 days ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

tertiarycourses/ApacheHadoop

Exercise files for Apache Hadoop Big Data Training

Size: 63.5 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

codito/hadoop-expt

Experiments with Hadoop cluster setups in Docker

Size: 1.95 KB - Last synced at: 19 days ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 3

vladimir-kazan/hadoop

Developer's environment for Hadoop

Language: Dockerfile - Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

marycboardman/Assessment-Attempts

Data processing using docker containers, kafka, spark, and hadoop

Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 1

imdeepanshugpt/Hadoop

Hadoop-Cluster

Language: Python - Size: 887 KB - Last synced at: about 1 month ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

maiktheknife/hadoop-docker (fork of SingularitiesCR/hadoop-docker)

Apache Hadoop Docker Image

Language: Shell - Size: 133 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0