GitHub topics: hadoop-mapreduce
jathavaan/bds-seoul-hadoop
language: Python - size: 81.1 KB - last synced: 3 days ago - saved: 3 days ago - stars: 0 - forks: 0

NitchayaninT/EGCI466_BigData
For a big data processing course. Lectures cover the use of Hadoop, MongoDB, etc.
language: Jupyter Notebook - size: 2.61 MB - last synced: 7 days ago - saved: 7 days ago - stars: 0 - forks: 0

benedekh/bigdata-projects
Student projects in the Big Data field.
language: Java - size: 198 KB - last synced: 10 days ago - saved: 10 days ago - stars: 19 - forks: 12

janheinrichmerker/hadoop-ktx
💾 Kotlin Extensions for Apache Hadoop (MapReduce).
language: Kotlin - size: 178 KB - last synced: 7 days ago - saved: 13 days ago - stars: 1 - forks: 0

SaltFishGC/SteamGameDataAnalysis
Big data course project: Steam game data analysis, presenting results with hadoop+hive+sqoop+mysql+springboot+echarts.
language: JavaScript - size: 9.3 MB - last synced: 14 days ago - saved: 15 days ago - stars: 0 - forks: 0

Lokeshkanna7/An-End-to-End-Big-Data-Pipeline-for-Amazon-Book-Reviews-using-Hadoop-and-Spark
A scalable big data pipeline built with Hadoop and Spark to analyze Amazon book reviews. This project performs sentiment analysis, rating prediction, and fake review detection using PySpark, demonstrating real-world applications of distributed systems and machine learning.
language: Jupyter Notebook - size: 459 KB - last synced: 18 days ago - saved: 18 days ago - stars: 0 - forks: 0

mahmoudparsian/data-algorithms-book
MapReduce, Spark, Java, and Scala for Data Algorithms Book
language: Java - size: 397 MB - last synced: 18 days ago - saved: 8 months ago - stars: 1,075 - forks: 661

JKA098/Pokemon-Feistiness-Apache-Spark-Job
This README assumes that, before running the Spark analytic job, you have already installed the correct versions of **Java**, **Hadoop**, and **Spark**, and that you are working inside **Ubuntu**.
language: Python - size: 184 KB - last synced: about a month ago - saved: about a month ago - stars: 0 - forks: 0

JKA098/Pokemon-Feistiness-MapReduce-Job
This project aims to implement a **Hadoop MapReduce job in Pseudo-Distributed Mode** to determine the **feistiest Pokémon** based on their **type**. The job processes the Pokémon dataset (`pokemon.csv`) and outputs a CSV file containing Pokémon **type1, type2, name, and feistiness score**.
language: Python - size: 220 KB - last synced: about a month ago - saved: about a month ago - stars: 0 - forks: 0

taabishhh/LLM_Preprocessing
This project implements a Byte Pair Encoding (BPE) tokenization approach along with a Word2Vec model to generate word embeddings from a text corpus. The implementation leverages Apache Hadoop for distributed processing and includes evaluation metrics for optimal dimensionality of embeddings.
language: Scala - size: 7.37 MB - last synced: 6 days ago - saved: 7 months ago - stars: 1 - forks: 0

Yousuf1733/Titanic-Dataset-Analysis
Exploratory data analysis of the Titanic dataset, uncovering insights on passenger survival rates based on gender, age, and class. Includes data cleaning, visualization, and findings.
language: Jupyter Notebook - size: 71.3 KB - last synced: about 2 months ago - saved: about 2 months ago - stars: 0 - forks: 0

groda/big_data
Tutorials on Big Data essentials: Hadoop, MapReduce, Spark. Explore a variety of tutorials and demonstrations on Big Data technologies, primarily in the form of Jupyter notebooks. Most notebooks are self-contained and live—ready to run with a click.
language: Jupyter Notebook - size: 51.9 MB - last synced: 13 days ago - saved: about 2 months ago - stars: 75 - forks: 26

ArianaPerez-24/Hadoop-MapReduce-de-WordCount
Exercises for counting words, sorting numbers (in ascending order), and solving sudoku.
language: Shell - size: 837 KB - last synced: about 2 months ago - saved: about 2 months ago - stars: 0 - forks: 0

KeerthanaJ-rec/210701118-CS19P16-DA-Lab
Data Analytics Laboratory
language: R - size: 23.1 MB - last synced: 2 months ago - saved: 2 months ago - stars: 1 - forks: 0

senthuran16/word-count-streaming-python-hadoop-mapreduce
A word count streaming MapReduce implementation with Python
language: Python - size: 586 KB - last synced: 2 months ago - saved: over 3 years ago - stars: 0 - forks: 0

sueszli/sparkly-svm
Distributed training of an SVM with sparkML
language: Jupyter Notebook - size: 21.6 MB - last synced: 23 days ago - saved: 3 months ago - stars: 0 - forks: 0

lokk798/BigData-Quiz-Bank
A comprehensive collection of multiple-choice questions (MCQs) and assessments covering Hadoop, MapReduce, and the broader Big Data ecosystem.
size: 5.86 KB - last synced: 3 months ago - saved: 3 months ago - stars: 0 - forks: 0

touero/ctenopharyngodon-idella
Uses the MapReduce Java interface to crawl data on Chinese universities in a distributed fashion, and to learn the basics of HDFS.
language: Java - size: 3.75 MB - last synced: 21 days ago - saved: 8 months ago - stars: 140 - forks: 0

imsanjoykb/PySpark-Bootcamp
My practice and project work on PySpark
language: Jupyter Notebook - size: 4.52 MB - last synced: 2 months ago - saved: over 3 years ago - stars: 8 - forks: 3

josericodata/josericodata
Adding a cool README file
size: 87.9 KB - last synced: 3 months ago - saved: 3 months ago - stars: 0 - forks: 0

elaaatif/JPEG-and-JPEG2000-compression-on-Multi-node-cluster-using-hadoop-and-spark
Big Data technologies can be leveraged for efficient, distributed image compression using JPEG2000 (Spark) and JPEG (MapReduce).
size: 14.3 MB - last synced: 2 months ago - saved: 4 months ago - stars: 2 - forks: 0

groda/hats
Hadoop Ansible Test Suite
language: Shell - size: 33.2 KB - last synced: 8 days ago - saved: 4 months ago - stars: 0 - forks: 0

nikisetti01/Hadoop-MapReduce-LetterFrequency-Analysis
Simple example of a Hadoop application that counts letters, with an interesting Romance language analysis
language: Jupyter Notebook - size: 2.71 MB - last synced: 3 months ago - saved: 10 months ago - stars: 2 - forks: 2

berksudan/Analysis-on-Big-Data-with-Hadoop
Implementation of Statistical Methods via Hadoop Map-Reduce Library.
language: Java - size: 75.2 MB - last synced: 4 months ago - saved: 4 months ago - stars: 0 - forks: 0

pngo1997/Big-Data-Mining-Project-PageRank-Hadoop-Streaming
Explores Big Data Processing using Hadoop & MapReduce.
language: Jupyter Notebook - size: 2.54 MB - last synced: 3 months ago - saved: 4 months ago - stars: 0 - forks: 0

madhurimarawat/Big-Data-Analytics
This repository demonstrates big data processing, visualization, and machine learning using tools such as Hadoop, Spark, Kafka, and Python.
language: Jupyter Notebook - size: 10.7 MB - last synced: 3 months ago - saved: 4 months ago - stars: 0 - forks: 1

chouaib-629/CustomerSegmentation
Hadoop-based Customer Segmentation project using the Online Retail Dataset. Implements MapReduce for processing and Python for preprocessing to uncover customer purchasing patterns for targeted marketing.
language: Jupyter Notebook - size: 260 KB - last synced: 4 months ago - saved: 4 months ago - stars: 1 - forks: 0

bytedance/CloudShuffleService
Cloud Shuffle Service (CSS) is a general-purpose remote shuffle solution for compute engines, including Spark/Flink/MapReduce.
language: Java - size: 1.23 MB - last synced: 20 days ago - saved: about a year ago - stars: 255 - forks: 58

arkady-emelyanov/hadoop-playground 📦
🐘Yet another Hadoop playground
language: Shell - size: 49.8 KB - last synced: 3 days ago - saved: about 7 years ago - stars: 2 - forks: 1

developer-sdk/beginner-bigdata-example
Examples of Hadoop, Hive, and Spark jobs
language: Java - size: 3.34 MB - last synced: 2 days ago - saved: 5 months ago - stars: 1 - forks: 1

Hadeel-Abdeljalil/-Advanced-Topics-in-Database-DBMS--University-Assignment
language: Java - size: 411 KB - last synced: 3 months ago - saved: 9 months ago - stars: 1 - forks: 0

chouaib-629/MovieRecommendation
A Hadoop-based Movie Recommendation System using the MovieLens dataset, demonstrating MapReduce for sorting and processing movie ratings.
language: Java - size: 320 KB - last synced: 4 months ago - saved: 5 months ago - stars: 1 - forks: 0

QiushiSun/Distributed-Computing-Systems
2021 Spring (Distributed Computing Systems): Distributed Systems and Programming
language: Java - size: 101 MB - last synced: 2 months ago - saved: almost 4 years ago - stars: 15 - forks: 1

krishnadey30/Intro-to-Hadoop-and-MapReduce
language: Python - size: 6.54 MB - last synced: 3 months ago - saved: almost 7 years ago - stars: 2 - forks: 0

krishnadey30/NewsHeadlines
This repository contains code that extracts meaningful information from a news headline dataset.
language: Python - size: 85.9 KB - last synced: 2 months ago - saved: about 6 years ago - stars: 3 - forks: 2

benjdiasaad/MapReduce_WordCount
Creating a Hadoop Java program: a word occurrence counter. If you want to compile the code manually on the Hadoop virtual machine, you will need to copy this code into the VM.
language: Java - size: 11.7 KB - last synced: 3 months ago - saved: over 4 years ago - stars: 2 - forks: 0

singhdivyank/MongoHadoop
A MongoDB and Hadoop cheat sheet with some commands and a few questions
language: Python - size: 499 KB - last synced: 3 months ago - saved: 5 months ago - stars: 0 - forks: 0

Mariam-iftikhar/BigDataProjects
The repository showcases a series of exercises and projects focused on big data processing using Hadoop, HBase, Hive, and Spark with Python. Hosted on AWS EMR, these projects demonstrate efficient data handling and processing techniques, leveraging the power of cloud computing to tackle complex data challenges.
size: 10.4 MB - last synced: 30 days ago - saved: about a year ago - stars: 0 - forks: 0

sephiroth7712/K-Nearest-Neigbours
Implementation of K-Nearest Neighbors algorithm using multiple parallel computing approaches: CUDA (GPU), Hadoop, Spark, MPI, OpenMP, and PThreads. Demonstrates scalable machine learning across different parallel computing paradigms from GPU to distributed frameworks.
language: C++ - size: 19.5 KB - last synced: 2 months ago - saved: 6 months ago - stars: 0 - forks: 0

mehwishferoz/BDA-project
A Hadoop MapReduce project analyzing the Consumer Complaints dataset with five queries to extract insights like complaints by product, state, company, tags, and timely responses.
language: Java - size: 7.42 MB - last synced: 3 months ago - saved: 6 months ago - stars: 0 - forks: 0

HabibAroua/Newspaper-analysis
language: Java - size: 12.5 MB - last synced: 7 months ago - saved: 7 months ago - stars: 1 - forks: 1

SAKET-SK/Semester6-SPPU-Data-Analysis-Lab
I installed Hadoop on a virtual machine, and all assignments are performed on Ubuntu. Refer to this repo to complete the Hadoop assignments. A stable internet connection is recommended while working through them.
language: Rebol - size: 3.24 MB - last synced: 9 days ago - saved: about 2 years ago - stars: 13 - forks: 6

m-anshu/big-data-coursework
Big Data coursework material
language: Shell - size: 3.27 MB - last synced: 1 day ago - saved: 8 months ago - stars: 0 - forks: 0

RiccardoRevalor/MapReduce
Collection of exercises on Hadoop and the MapReduce approach
language: Java - size: 71.3 KB - last synced: 8 months ago - saved: 8 months ago - stars: 0 - forks: 0

chriniko13/apache-hadoop-word-count-example
language: Java - size: 5.86 KB - last synced: 3 months ago - saved: about 7 years ago - stars: 1 - forks: 0

amaankhan02/maplejuice
A parallel distributed batch processing framework similar to Hadoop MapReduce with a SQL Engine and a distributed file system
language: Go - size: 864 KB - last synced: 9 months ago - saved: 9 months ago - stars: 0 - forks: 0

MariaDukmak/Hadopy
Easy parallel map-reduce command line tool
language: Python - size: 28.3 KB - last synced: 3 days ago - saved: about 4 years ago - stars: 7 - forks: 0

Rifat392000/BigDataAnalytics
language: Jupyter Notebook - size: 18.4 MB - last synced: 30 days ago - saved: 9 months ago - stars: 0 - forks: 0

burhanahmed1/Big-Data-Analytics
Practice tasks in Python programming language using Hadoop, MRJob, PySpark for Big Data Analytics.
language: Jupyter Notebook - size: 40 KB - last synced: 4 months ago - saved: 12 months ago - stars: 2 - forks: 0

drexly/movie140reviewcorpus
Per-movie rating raw data for Spark, covering those of 164,397 Naver Movie titles that have 140-character reviews
size: 336 MB - last synced: 9 months ago - saved: over 7 years ago - stars: 7 - forks: 5

prateekkr1/Project-Work
This repository contains some of my personal projects.
language: Jupyter Notebook - size: 3.79 MB - last synced: 10 months ago - saved: about 5 years ago - stars: 0 - forks: 0

WilliamCallao/HadoopNewsTrends
News trend analysis using Hadoop in a virtualized CentOS environment
language: Python - size: 17.4 MB - last synced: 29 days ago - saved: 12 months ago - stars: 1 - forks: 0

ZiadSalah2003/BigData-Project
"BigData-Project", is a comprehensive Big Data solution that involves various operations such as web crawling, PageRank algorithm, TF-IDF calculations, and inverted index creation using Hadoop.
size: 83 KB - last synced: 25 days ago - saved: about a year ago - stars: 0 - forks: 0

viseshrp/PageRank-MapReduce-Implementation
The MapReduce-Hadoop implementation of Google's PageRank algorithm
language: Java - size: 206 KB - last synced: 6 days ago - saved: about 8 years ago - stars: 2 - forks: 0

highoncarbs/hadoopwithpy
🐘 ➕ 🐍 Learning Hadoop with Python
language: Python - size: 86.6 MB - last synced: 3 months ago - saved: over 7 years ago - stars: 3 - forks: 0

sharma-n/global_event_analytics
Big data analytics using Hadoop on GDELT global news dataset.
language: Java - size: 2.66 MB - last synced: 11 months ago - saved: over 5 years ago - stars: 4 - forks: 1

KingJin-web/Hadoop
Learning hadoop-hdfs and mapreduce
language: Java - size: 7.56 MB - last synced: 12 months ago - saved: almost 4 years ago - stars: 1 - forks: 1

29DCH/Hadoop-HDFS-MapReduce-Examples
Operating on HDFS files with the Java API; a MapReduce-based word frequency counting program and its refactoring; using the Combiner and Partitioner components in MapReduce programming
language: Java - size: 35.2 KB - last synced: 3 months ago - saved: almost 3 years ago - stars: 2 - forks: 1

DecioXXIV/BD-StockAnalysis
Repository for the second project of the "Big Data" course (2023/24)
language: Python - size: 36.1 KB - last synced: 12 months ago - saved: about a year ago - stars: 0 - forks: 0

jbw/hadoop-docker-cluster
Hadoop cluster on Docker (single host)
language: Shell - size: 159 KB - last synced: about a year ago - saved: almost 7 years ago - stars: 0 - forks: 0

Coursal/Text-Sentiment-Analysis-In-Hadoop-And-Spark
The source code developed and used for the purposes of my thesis with the same title under the guidance of my supervisor professor Vasilis Mamalis for the Department of Informatics and Computer Engineering of the University of West Attica.
language: Java - size: 66.5 MB - last synced: about a year ago - saved: about 4 years ago - stars: 1 - forks: 1

Coursal/Hadoop-Letter-File-Index-Counter
A Hadoop-based Java project that counts the max number of word occurrences for each letter in a text file of a folder.
language: Java - size: 213 KB - last synced: about a year ago - saved: over 4 years ago - stars: 0 - forks: 0

Coursal/Hadoop-Examples
Some simple, kinda introductory projects based on Apache Hadoop to be used as guides in order to make the MapReduce model look less weird or boring.
language: Java - size: 340 KB - last synced: about a year ago - saved: about a year ago - stars: 5 - forks: 2

StevenMonty/MapReduceSearchEngine
A containerized search engine GUI that communicates with a Hadoop cluster running MapReduce on GCP to create Inverted Indices for search engine queries.
language: Java - size: 15.8 MB - last synced: about a year ago - saved: over 4 years ago - stars: 0 - forks: 0

chicuongdev2002/BigData_Hadoop_MapReduce
Uses Scrapy, Hadoop, and Pig Latin
language: Python - size: 6.99 MB - last synced: about a year ago - saved: about a year ago - stars: 0 - forks: 0

subhash26jan96/cluster
This repository contains Hadoop cluster setup code (automated, on-demand, and manual) using Python, Linux, HTML, etc.
language: Python - size: 16.6 KB - last synced: about a year ago - saved: almost 7 years ago - stars: 0 - forks: 1

Jacob12138xieyuan/hadoop-mapreduce-with-python
hadoop mapreduce algorithm with hadoop streaming (Python)
language: Jupyter Notebook - size: 16.6 KB - last synced: about a year ago - saved: about a year ago - stars: 1 - forks: 0

VaishnavJois/CLOUDERA
Cloudera commands used for Big Data Analytics
size: 13.7 KB - last synced: about a year ago - saved: over 5 years ago - stars: 0 - forks: 0

raineydavid/big-data-processing
Big Data Processing Notes from Masters in Big Data Science
size: 13.7 MB - last synced: about a year ago - saved: about 6 years ago - stars: 0 - forks: 0

gmarciani/mapreduce-app
Scaffolding for Map/Reduce applications, leveraging Apache Hadoop.
language: Shell - size: 1000 bytes - last synced: about a year ago - saved: almost 8 years ago - stars: 1 - forks: 0

Walrussin/MapReduce-Examples
Analyzing air quality index of eight states
language: Java - size: 35.6 MB - last synced: about a year ago - saved: about 3 years ago - stars: 0 - forks: 0

manoharpalanisamy/Advanced-Map-Reduce
Running MapReduce jobs on a Hadoop cluster with customized parameters
language: Java - size: 21.5 KB - last synced: about a year ago - saved: almost 7 years ago - stars: 0 - forks: 0

andrejanesic/Hadoop-Beginner-Exercise-Football-Data
Hadoop beginner exercise in analyzing European football teams' statistics over the last 20 years. The goal is to determine which team had the highest win percentage.
language: Makefile - size: 453 KB - last synced: about a year ago - saved: over 2 years ago - stars: 0 - forks: 0

zhermin/topkcommonwords
Extracts the Top K Common Words between 2 Text Files using Hadoop's MapReduce
language: Java - size: 84 KB - last synced: about a year ago - saved: over 2 years ago - stars: 0 - forks: 0

MoustafaAMahmoud/BigDataInDepth
Data Engineering Course
language: TeX - size: 78.9 MB - last synced: about a year ago - saved: about a year ago - stars: 15 - forks: 9

YMaher99/Parallelizing-the-Feedforward-Operation-of-Neural-Networks-in-Hadoop-MapReduce
Leveraging the MapReduce paradigm, we propose a solution to parallelize the feedforward operation of neural networks in order to speed it up for sufficiently large NN architectures and sufficiently large datasets. Tested using the MNIST dataset; results can be found in the results.html and results.ipynb files.
language: HTML - size: 2.1 MB - last synced: about a year ago - saved: over 2 years ago - stars: 0 - forks: 0

rodrigoorf/HadoopStudies
Repo with a few Hadoop exercises
language: Java - size: 72.3 KB - last synced: about a year ago - saved: over 5 years ago - stars: 0 - forks: 0

emrectn/HadoopTutorial
hadoop
language: Java - size: 15.6 KB - last synced: about a year ago - saved: about 6 years ago - stars: 0 - forks: 0

prabhuvashwin/PageRank-Algorithm-Implementation 📦
Implementation of Google's PageRank algorithm using Java, Hadoop, and MapReduce
language: Java - size: 10.7 KB - last synced: about a year ago - saved: over 7 years ago - stars: 3 - forks: 1

prabhuvashwin/TFIDF-SearchQuery 📦
Implementation of TF-IDF and keyword-based query search using Java and Apache Hadoop
language: HTML - size: 887 KB - last synced: about a year ago - saved: over 7 years ago - stars: 1 - forks: 1

prabhuvashwin/Credit-Card-Fraud-Detection 📦
Naive Bayes classifier and Logistic Regression classifier to predict whether a transaction is fraudulent or not
language: Java - size: 42.3 MB - last synced: about a year ago - saved: over 7 years ago - stars: 1 - forks: 2

tableMinPark/trendflow
❗ Trend analysis platform - SSAFY 8th cohort specialization project
language: Java - size: 55.6 MB - last synced: about a year ago - saved: about 2 years ago - stars: 0 - forks: 0

Dave-Vedant/BigDataTech
This repository contains small projects related to Hive, Hadoop, and Spark. It is my contribution to learning new technology and sharing my concise knowledge of different big data infrastructures.
language: Scala - size: 533 KB - last synced: about a year ago - saved: almost 5 years ago - stars: 1 - forks: 1

yangboz/mipr Fork of sozykin/mipr
MapReduce Image Processing framework for Hadoop
language: Java - size: 734 KB - last synced: about a year ago - saved: over 8 years ago - stars: 0 - forks: 0

yennanliu/spark_emr_dev
Collection of code for submitting Spark/Hadoop/Hive/Pig tasks to EMR (AWS Elastic MapReduce) | #DE
language: Scala - size: 3.72 MB - last synced: about a year ago - saved: over 5 years ago - stars: 3 - forks: 1

koitoer/hadoop
Hadoop and Big Data Training Resources
language: Java - size: 12.2 MB - last synced: about a year ago - saved: about 8 years ago - stars: 1 - forks: 1

juliengan/TreesAnalysis
Trees Analysis
language: Java - size: 52.3 MB - last synced: about a year ago - saved: about a year ago - stars: 1 - forks: 0

vim89/datapipelines-essentials-python
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, plus SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
language: Python - size: 1.76 MB - last synced: about a year ago - saved: about 2 years ago - stars: 53 - forks: 34

aadishgoel/Hadoop-Codes
Neat and Handy Place for all Hadoop codes
language: Java - size: 25.4 KB - last synced: about a year ago - saved: over 7 years ago - stars: 6 - forks: 3

simondelarue/Hadoop_Map_Reduce_from_scratch
Hadoop MapReduce implementation from scratch using Python | Distributed computing | Multi-processing
language: Python - size: 5.39 MB - last synced: 11 months ago - saved: almost 4 years ago - stars: 2 - forks: 0

ahtezaz123/hadoop-mapreduce-on-wikipedia-articles-
Big Data Analytics Assignment on Hadoop MapReduce
language: Jupyter Notebook - size: 5.54 MB - last synced: about a year ago - saved: about a year ago - stars: 0 - forks: 0

juagarmar/Cov-Cor-matrix-via-Rhadoop
Covariance and correlation matrix via Rhadoop (rmr2 and HDFS)
language: R - size: 2.93 KB - last synced: about a year ago - saved: about 8 years ago - stars: 4 - forks: 0

juagarmar/linear-regression-via-Rhadoop
Linear regression via Rhadoop (rmr2 and RHDFS)
language: R - size: 5.86 KB - last synced: about a year ago - saved: about 8 years ago - stars: 3 - forks: 0

nzsaurabh/hadoop_training
Exercises on MapReduce, Pig, Spark, Relational and Non Relational data stores in Hadoop
size: 939 KB - last synced: about a year ago - saved: over 6 years ago - stars: 0 - forks: 0

syscrest/oozie-graphite
Monitor your oozie server and your oozie bundles with graphite
language: Java - size: 621 KB - last synced: about a year ago - saved: about 10 years ago - stars: 7 - forks: 3

Driramohamedfarouk/bigdata-stock-market-pipeline
language: Scala - size: 310 KB - last synced: about a year ago - saved: about 2 years ago - stars: 0 - forks: 1

Rohit9314/my-hadoop
Set up a Hadoop cluster manually and automatically
language: Python - size: 23.4 KB - last synced: about a year ago - saved: almost 8 years ago - stars: 2 - forks: 0

snehpahilwani/WordCount-hadoop
Word Count code written for Hadoop platform (Java Implementation)
language: Java - size: 1.74 MB - last synced: about a year ago - saved: over 7 years ago - stars: 0 - forks: 0

fzehracetin/big-data-project
Big Data Processing and Analytics course term project.
language: JavaScript - size: 8.77 MB - last synced: about a year ago - saved: almost 4 years ago - stars: 0 - forks: 0

jayanthanantharapu/WordCounter
Hadoop job to count words
language: Java - size: 3.91 KB - last synced: about a year ago - saved: almost 8 years ago - stars: 0 - forks: 0
