GitHub topics: mapreduce-java

Repositories

nielsbasjes/splittablegzip

Splittable Gzip codec for Hadoop

Language: Java - Size: 1.38 MB - Last synced at: 11 days ago - Pushed at: 22 days ago - Stars: 73 - Forks: 9

SPARK is an extensible energy analytics platform designed for processing large-scale renewable energy datasets using the Hadoop MapReduce framework. It offers data analytics, machine learning-based forecasting, and energy trend insights in a fully modular setup utilizing Hadoop, MapReduce, and Apache Spark, complemented by a Streamlit UI.

Language: Jupyter Notebook - Size: 21.6 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

16kushaal/SPARK-Sustainable-Power-Analytics-and-Renewable-Kinetics

SPARK is an extensible energy analytics platform designed for processing large-scale renewable energy datasets using the Hadoop MapReduce framework. It offers data analytics, machine learning-based forecasting, and energy trend insights in a fully modular setup utilising Hadoop, MapReduce, and Apache Spark, complemented by a Streamlit UI.

Language: Jupyter Notebook - Size: 13.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

SimoBkr/MapReduceJAVA

JAVA SWING APPLICATION MAPREDUCE

Language: Java - Size: 40 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

PasanAbeysekara/Taxi-Pickup-Hotspot-Analysis-using-Hadoop-MapReduce

This project analyzes one month of NYC Yellow Taxi trip data (January 2016) to identify the busiest taxi pickup locations. It utilizes the Hadoop MapReduce framework to process the data and a lookup table to map location IDs to human-readable zone names.

Language: Java - Size: 5.65 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

gilberto-009199/bigdata

Big Data workspaces

Language: Java - Size: 60.4 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

TechAlhan826/Hadoop-Tasks

Hadoop MapReduce Tasks Java - Big Data Project 🚀

Language: Java - Size: 275 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

KhalilKrugerOS/PaymentMethodCounter

INSAT exercise solution that counts how many transactions use Mastercard, using the MapReduce framework on Hadoop

Language: Java - Size: 0 Bytes - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

Dare-marvel/Big-Data-Analytics--BDA--

💾 Welcome to the Big Data Analytics Repository! 📚✨ Immerse yourself in a carefully curated reservoir of knowledge on Big Data Analytics. 🌐💡 Explore the intricacies of deriving insights from vast datasets and navigating powerful analytics tools. 🚀🔍

Language: Java - Size: 174 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 2

berksudan/Analysis-on-Big-Data-with-Hadoop

Implementation of Statistical Methods via Hadoop Map-Reduce Library.

Language: Java - Size: 75.2 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

ashishgopalhattimare/Parallel-Concurrent-and-Distributed-Programming-in-Java

Parallel, Concurrent, and Distributed Programming in Java | Coursera

Language: Java - Size: 34.5 MB - Last synced at: 4 months ago - Pushed at: almost 5 years ago - Stars: 2 - Forks: 2

agustin-recoba/mosquitos-hpc

Hadoop MapReduce project implementing trend-detection algorithms for time series, applied to sales data for pest-control products (repellents and insecticides).

Language: Java - Size: 1.03 MB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

raza8899/Recommend_Friends_through_MapReduce

A MapReduce program that suggests people you may know on the basis of mutual friends

Language: Java - Size: 2.66 MB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

anpu9/MIT6.824-MapReduce

MapReduce Implementation - Distributed System

Language: Go - Size: 21.1 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Coursal/Hadoop-Examples

Some simple, kinda introductory projects based on Apache Hadoop to be used as guides in order to make the MapReduce model look less weird or boring.

Language: Java - Size: 340 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 2

amarkum/crunch-demo

crunch demo project

Language: Java - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

debajyotiguha11/BigDataAssignment_WordCount

Class assignment to understand the MapReduce Programming model in Hadoop.

Language: Java - Size: 5.9 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

emrectn/HadoopTutorial

hadoop

Language: Java - Size: 15.6 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

changfubai/hadoop-wordcount

Language: Java - Size: 3.64 MB - Last synced at: over 1 year ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

aadishgoel/Hadoop-Codes

Neat and Handy Place for all Hadoop codes

Language: Java - Size: 25.4 KB - Last synced at: over 1 year ago - Pushed at: almost 8 years ago - Stars: 6 - Forks: 3

fzehracetin/big-data-project

Big Data Processing and Analytics course term project.

Language: JavaScript - Size: 8.77 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

fbaldi6/PageRank-Hadoop Fork of edofazza/PageRank-Hadoop

Implementation of the MapReduce PageRank algorithm using the Hadoop framework in Java (developed for Cloud Computing course)

Size: 5.35 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

fbaldi6/PageRank-Spark Fork of edofazza/PageRank-Spark

Implementation of the MapReduce PageRank algorithm using the Spark framework both in Python and in Java (developed for Cloud Computing course)

Size: 4.99 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

10lloydj/NLP-RDF-Inverted-Index

This Map Reduce program should read in a set of RDF/XML documents and output the data in the form: {object}, [(predicate1, position, subject1)...]

Language: Java - Size: 11.7 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

SomeshChevella/Apache-Hadoop-Map-Reduce--Basic-Sentiment-Analysis-on-Yelp-Dataset

In this project we will use Hadoop MapReduce to implement a very basic “Sentiment Analysis” using the review text in the Yelp Academic Dataset as training data.

Language: Java - Size: 7.39 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

HarshitDawar55/MapReduce

MapReduce programs written in Java with minimal complexity!

Language: Java - Size: 76.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

iulianoroberto/MapReduceBasicApplications

Basic MapReduce applications in Java.

Language: Java - Size: 16.6 KB - Last synced at: 7 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Sumonta056/Hadoop-Clustering-Docker-Guide

Hadoop-Clustering-Docker-Guide: complete documentation for setting up Hadoop and trying out clustering.

Language: Java - Size: 51.5 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

HxnDev/Hadoop-MapReduce-to-Analyze-Sentiment-of-Keyword

In this task, we had to write a MapReduce program to analyze the sentiment of a keyword from a list of comments. This was done using Hadoop HDFS.

Language: Java - Size: 1000 KB - Last synced at: 3 days ago - Pushed at: about 4 years ago - Stars: 6 - Forks: 0

leightonllc/FTEC4005 📦

FTEC4005 - Financial Informatics/ FTEC4003 - Data Mining for FinTech -- This repository contains codes for the bonus task, as well as the group project.

Language: Java - Size: 161 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0

arberkuci/shared-memory-map-reduce

A shared-memory implementation of MapReduce.

Language: Java - Size: 12.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

anshul1004/MutualFriends

Implementation of Hadoop and Spark

Language: Java - Size: 23 MB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

benjdiasaad/MapReduce_K-means

Implementation of the k-means clustering algorithm using the Hadoop framework, version 3.1.3 (MapReduce).

Language: Java - Size: 32.2 KB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 2

NikolaAndro/Pagerank_Hadoop_MapReduce

Language: Java - Size: 6.02 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

kushalhebbar/Big-data-project

Optimizing the storage capability of HDFS and HBase through data size factor with integrated security feature

Size: 48.9 MB - Last synced at: almost 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

ucapdak/Olympic-Tweets

Assignment for Big Data Processing: A collection of programs for analysing tweets related to the 2012 Olympics.

Language: Java - Size: 223 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 0

ucapdak/MapReduce-Spark-Comparison

Assignment for Big Data Processing: Comparison between Spark and MapReduce programs for analysing large data sets.

Language: Java - Size: 651 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0

DanMolenhouse/Distributed-Systems-Project5-Hadoop-and-Spark

In this project, we used both Hadoop MapReduce and Spark for distributed computing. The first task was to perform a series of operations using Mapper and Reducer Java files implemented on a Hadoop server. The second task was to perform similar operations on Spark instead.

Language: Java - Size: 70.3 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

DA1OOO/Big-Data-Systems-and-Information-Processing

Storage and processing of various kinds of big data on a Hadoop cluster.

Language: Java - Size: 107 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

backslash112/crystal-ball-hadoop

A crystal ball to predict events that may happen once a certain event happened with MapReduce.

Language: Java - Size: 18.6 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

RonnJacob/PageRank-MapReduce-Spark

Implemented the PageRank algorithm in Hadoop MapReduce framework and Spark.

Language: Java - Size: 442 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 1

ManasaPola/Distributed-Parallel_DB

Distributed and Parallel Database Tasks

Language: Python - Size: 1.46 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

pancr9/Cloud-Computing

This repository contains the project and assignments for Cloud Computing for Data Analysis.

Language: Java - Size: 2.91 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

harsh306/Hadoop_Task 📦

Language: Java - Size: 92.8 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 1

a-poliakov/distributed_computing

Language: Java - Size: 15.2 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Elzawawy/hadoop-word-count

A simple MapReduce and Hadoop application to count words in a document, implemented in Java to get a flavor for how they work.

Language: Java - Size: 22.5 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 2

SarahAyaz/YouTube_Data_Analysis

Analysis of YouTube data using the Hadoop MapReduce framework in Java.

Language: Java - Size: 24.5 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 2

jieren123/Bigdata_Project_Recommender_System

Recommender system based on Item Collaborative Filtering and MapReduce

Language: Java - Size: 389 KB - Last synced at: over 2 years ago - Pushed at: about 8 years ago - Stars: 17 - Forks: 3

shashankg32/big_data_lab_nmit_6th_sem

big data lab nmit 6th sem

Language: Java - Size: 11.7 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Raveesh1505/BigData-Training

Big data training material

Language: Python - Size: 45.9 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

hiifong/MapReduce-multi-table-merge

MapReduce multi-table merge

Language: Java - Size: 6.84 KB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

warrenlyr/K-Nearest-Neighbors-Implementation-in-Parallel-Programming

K-Nearest Neighbors implementation in parallel programming and cloud computing with MPI, MapReduce, Spark, and MASS.

Language: Java - Size: 33.5 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

charliecai00/Tree-Versus-Income

Examining the Relationship Between Tree Quality and Socioeconomic Status in New York City

Language: Java - Size: 32.5 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

concealedtea/cardinaalit

Cardinality counter for large .data files

Language: Java - Size: 21.5 KB - Last synced at: over 2 years ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0

dddddkio/Data-analysis-of-Sogou-query-log

Data analysis of the 2008 Sogou query log using Hadoop MapReduce

Language: Java - Size: 120 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 1

markomih/kmeans_mapreduce

K-means MapReduce implementation

Language: Java - Size: 51 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 39 - Forks: 17

huangyueranbbc/RecommendByItemcf

Hadoop MapReduce. Collaborative-filtering item recommendation system based on ItemCF

Language: Java - Size: 498 KB - Last synced at: over 2 years ago - Pushed at: almost 7 years ago - Stars: 20 - Forks: 13

razo7/Nap

Nap: Network-Aware Data Partitions for Efficient Distributed Processing

Language: Mathematica - Size: 186 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

Michu-dev/big-data-first-project

First academic big data project, implementing analysis using the MapReduce and Hive platforms

Language: Java - Size: 109 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

RiccardoSagramoni/map-reduce-bloom-filter 📦

University Project for "Cloud Computing" course (MSc Computer Engineering @ University of Pisa). MapReduce applications implemented in Hadoop and Spark.

Language: Java - Size: 8.86 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

GovardhanR26/webserver-log-analysis

Language: Jupyter Notebook - Size: 1.83 MB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

jhlfrfufyfn/hadoop-web-robot

Web robot made with Hadoop MapReduce and Java

Language: Java - Size: 3.7 MB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

careycwang/CS5425-MapReduce-Common-Words

CS5425 Assignment 1: Top K Common Words

Language: Java - Size: 60.5 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

ling67/Cloud-Computing

Cloud Computing Learning and Project 👩‍🎓‍🤦‍♀️🤷‍♀️

Language: HTML - Size: 933 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

HxnDev/Hadoop-MapReduce-to-Find-Average-Length-of-Comments

In this task, we had to find the average length of comments given in the dataset. It was done using Hadoop MapReduce and Hadoop HDFS.

Language: Java - Size: 675 KB - Last synced at: 7 months ago - Pushed at: about 4 years ago - Stars: 4 - Forks: 1

Ishuan/Information-Retrieval

Information retrieval (IR) is concerned with finding material (e.g., documents) of an unstructured nature (usually text) in response to an information need (e.g., a query) from large collections. One approach to identifying relevant documents is to compute scores based on the matches between terms in the query and terms in the documents. For example, a document with words such as ball, team, score, and championship is likely to be about sports. It is helpful to define a weight for each term in a document that can be meaningful for computing such a score. We describe below popular information retrieval metrics such as term frequency, inverse document frequency, and their product, term frequency-inverse document frequency (TF-IDF), that are used to define weights for terms.

Term Frequency: Term frequency is the number of times a particular word t occurs in a document d.

TF(t, d) = No. of times t appears in document d

Since the importance of a word in a document does not necessarily scale linearly with the frequency of its appearance, a common modification is to instead use the logarithm of the raw term frequency.

WF(t, d) = 1 + log10(TF(t, d)) if TF(t, d) > 0, and 0 otherwise

We will use this logarithmically scaled term frequency in what follows.

Inverse Document Frequency: The inverse document frequency (IDF) is a measure of how common or rare a term is across all documents in the collection. It is the logarithmically scaled inverse fraction of the documents that contain the word, and is obtained by taking the logarithm of the ratio of the total number of documents to the number of documents containing the term.

IDF(t) = log10(Total # of documents / # of documents containing term t)

Under this IDF formula, terms appearing in all documents are assumed to be stopwords and subsequently assigned IDF = 0. We will use the smoothed version of this formula as follows:

IDF(t) = log10(1 + Total # of documents / # of documents containing term t)

Practically, smoothed IDF helps alleviate the out-of-vocabulary (OOV) problem, where it is better to return results to the user rather than nothing, even if their query matches every single document in the collection.

TF-IDF: Term frequency-inverse document frequency (TF-IDF) is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus of documents. It is often used as a weighting factor in information retrieval and text mining.

TF-IDF(t, d) = WF(t, d) * IDF(t)

Language: Java - Size: 378 KB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0
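
The weighting formulas quoted in the Ishuan/Information-Retrieval description can be sketched in a few lines of plain Java. This is a minimal single-machine illustration, not the repository's MapReduce code: the class name `TfIdfSketch`, the method names, and the tiny in-memory corpus are all hypothetical, and returning 0 for a term that appears in no document is an assumption the description does not state.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the WF, smoothed IDF, and TF-IDF formulas; not taken from the repository.
public class TfIdfSketch {

    // TF(t, d): number of times the term appears in the tokenized document.
    static int tf(String term, List<String> doc) {
        int count = 0;
        for (String token : doc) {
            if (token.equals(term)) {
                count++;
            }
        }
        return count;
    }

    // WF(t, d) = 1 + log10(TF(t, d)) if TF(t, d) > 0, and 0 otherwise.
    static double wf(int tf) {
        return tf > 0 ? 1.0 + Math.log10(tf) : 0.0;
    }

    // Number of documents in the corpus that contain the term at least once.
    static int documentFrequency(String term, List<List<String>> corpus) {
        int df = 0;
        for (List<String> doc : corpus) {
            if (doc.contains(term)) {
                df++;
            }
        }
        return df;
    }

    // Smoothed IDF(t) = log10(1 + N / df).
    // Assumption: a term found in no document gets weight 0 instead of dividing by zero.
    static double smoothedIdf(int totalDocs, int docsWithTerm) {
        if (docsWithTerm == 0) {
            return 0.0;
        }
        return Math.log10(1.0 + (double) totalDocs / docsWithTerm);
    }

    // TF-IDF(t, d) = WF(t, d) * IDF(t).
    static double tfIdf(String term, List<String> doc, List<List<String>> corpus) {
        int df = documentFrequency(term, corpus);
        return wf(tf(term, doc)) * smoothedIdf(corpus.size(), df);
    }

    public static void main(String[] args) {
        // Toy corpus of three pre-tokenized documents (illustrative data only).
        List<List<String>> corpus = Arrays.asList(
                Arrays.asList("ball", "team", "score", "ball"),
                Arrays.asList("team", "budget"),
                Arrays.asList("market", "budget", "score"));

        for (String term : Arrays.asList("ball", "team", "market")) {
            System.out.printf("TF-IDF(%s, doc0) = %.4f%n",
                    term, tfIdf(term, corpus.get(0), corpus));
        }
    }
}
```

With the smoothed formula, a term that occurs in every document still receives a positive IDF of log10(2), so a query made only of such terms still produces non-zero scores, which is the behaviour the description motivates.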