GitHub topics: hadoop-docker
big-data-europe/docker-hadoop
Apache Hadoop docker image
Language: Shell - Size: 109 KB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 2,252 - Forks: 1,355

gathecageorge/hadoop
Contains docker files to build a hadoop container image
Language: Shell - Size: 51.8 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

Mgosi/Big-Data-Analysis-using-MapReduce-in-Hadoop
We explore data using Big Data analysis and visualization skills in three main operations: i) data aggregation from different sources, ii) Big Data analysis using MapReduce, and iii) visualization with Tableau. Data analysis is critical to understanding the data and what can be done with it. Small datasets are easy to process, but big companies need to identify trends across large volumes of data to decide what changes to make, and Big Data analysis addresses this problem. In this lab, we collect close to 20000 tweets, 500 New York Times articles, and 500 Common Crawl articles about Entertainment, our main topic of discussion. We preprocess this data and feed it to MapReduce to compute Word Count and Word Co-Occurrence, which we use to find trends in the collected data. The data analysis is done in Python.
Language: Jupyter Notebook - Size: 16.8 MB - Last synced at: 5 months ago - Pushed at: over 5 years ago - Stars: 6 - Forks: 3

Wittline/apache-spark-docker
Dockerizing an Apache Spark Standalone Cluster
Language: VBA - Size: 63.7 MB - Last synced at: 9 days ago - Pushed at: almost 3 years ago - Stars: 43 - Forks: 27

jbw/hadoop-docker-cluster
Hadoop cluster on Docker (single host)
Language: Shell - Size: 159 KB - Last synced at: 11 months ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

jbw/build-hadoop
Build Hadoop with Docker for Ubuntu. See releases for different architectures such as armv7l
Language: Dockerfile - Size: 5.86 KB - Last synced at: 11 months ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

vietanh85/hadoop-docker
Apache Hadoop Cluster Docker images
Language: Shell - Size: 103 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 3 - Forks: 0

adisve/hadoop-spark-cluster
A Spark/Hadoop-Docker Cluster template for working with Big Data
Language: Python - Size: 12.5 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

Rohit9314/my-hadoop
Set up a Hadoop cluster manually and automatically
Language: Python - Size: 23.4 KB - Last synced at: about 1 year ago - Pushed at: almost 8 years ago - Stars: 2 - Forks: 0

hyeonsangjeon/dataplatform
Hadoop 3.2 in single or cluster mode with the gotty web terminal, Spark, Jupyter PySpark, Hive, and other ecosystem tools.
Language: Shell - Size: 549 KB - Last synced at: 23 days ago - Pushed at: over 5 years ago - Stars: 11 - Forks: 1

mjaglan/docker-hadoop-distributed-mode
Run Apache Hadoop 2.7 inside docker container in Multi-Node Cluster mode
Language: Shell - Size: 15.6 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 2

mjaglan/docker-hadoop-pseudo-distributed-mode
Run Apache Hadoop 2.7 inside docker container in pseudo-distributed mode
Language: Shell - Size: 16.6 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 1

ruoyu-chen/hadoop-docker
A Docker-based Hadoop development and test environment, including Hadoop, Hive, HBase, and Spark
Language: Shell - Size: 3.72 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 280 - Forks: 161

HoangNV2001/Docker-Hadoop-Hive-Spark-Zeppelin-Hue-Superset
Big data stack with Hadoop + Hive + Spark + Zeppelin + Hue + Superset
Language: Python - Size: 6.52 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

JuanCasado/Hadoop-Docker
Hadoop deployment on docker and Docker Swarm
Language: TSQL - Size: 88.8 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 1

lyingbo/hadoop-cluster-docker
Run Hadoop Cluster within Docker Containers
Language: Shell - Size: 32.2 KB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 8 - Forks: 3

mr-ravin/Smart-Hadoop-Cluster-SMHACL 📦
An automated Hadoop cluster building tool that uses distributed computing to create the cluster over the network. Implemented in Python 2.7.
Language: Python - Size: 1.64 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 1

waltherg/distributable_docker_sql_on_hadoop
Toy Hadoop cluster combining various SQL-on-Hadoop variants
Language: Shell - Size: 88.9 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 11 - Forks: 4

SharpData/docker-hive3
Hive 3 In Docker container
Language: Shell - Size: 2.3 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

MengmSun/hadoop-in-docker
Hadoop cluster in Docker, created with docker-compose. Create a Hadoop cluster in less than 5 minutes.
Language: Shell - Size: 6.13 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 4 - Forks: 0

Sidl419/hadoop_streaming
Building a recommender system based on the collaborative filtering algorithm and Hadoop Streaming
Language: Python - Size: 1.94 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

alex-ber/docker-hive Fork of ops-guru/docker-hive
EMR 5.25.0 cluster single node Hadoop docker image. With Amazon Linux, Hadoop 2.8.5 and Hive 2.3.5
Language: Shell - Size: 45.9 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 1

fredrikhgrelland/docker-hadoop
Language: Dockerfile - Size: 18.6 KB - Last synced at: 8 days ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 1

simonprewo/vagrantbox-hadoop-containerexecutor
Size: 10.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

younthu/docker-hadoop Fork of big-data-europe/docker-hadoop
Apache Hadoop docker image cluster
Language: Shell - Size: 759 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

kevin85421/docker-compile-hadoop
Compile hadoop in docker container
Language: Dockerfile - Size: 6.84 KB - Last synced at: 19 days ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

tertiarycourses/ApacheHadoop
Exercise files for Apache Hadoop Big Data Training
Size: 63.5 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

codito/hadoop-expt
Experiments with Hadoop cluster setups in Docker
Size: 1.95 KB - Last synced at: 19 days ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 3

vladimir-kazan/hadoop
Developer's environment for Hadoop
Language: Dockerfile - Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

marycboardman/Assessment-Attempts
Data processing using docker containers, kafka, spark, and hadoop
Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 1

imdeepanshugpt/Hadoop
Hadoop-Cluster
Language: Python - Size: 887 KB - Last synced at: about 1 month ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

maiktheknife/hadoop-docker Fork of SingularitiesCR/hadoop-docker
Apache Hadoop Docker Image
Language: Shell - Size: 133 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0
