GitHub topics: hadoop-mapreduce
41xu/Hadoop-ClassNotes
Some code written while learning Hadoop.
Language: Java - Size: 6.1 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

biagiocornacchia/bloom-filters-in-mapreduce
Implementation of the MapReduce Bloom filter construction algorithm using the Hadoop and Spark frameworks.
Language: Java - Size: 717 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

JohandeGraaf/PageRank
PageRank algorithm implemented in Hadoop MapReduce.
Language: Java - Size: 2 MB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

csd-auth-ftw/hadoop-http-logs
A Hadoop application that searches for errors in Apache logs.
Language: Java - Size: 3.91 KB - Last synced at: about 1 year ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 0

JiangtaoXu93/Routing-Project
Given historical airplane on-time performance data, offers suggestions for two-hop flights that minimize the chance of missing a connection.
Language: Java - Size: 42.5 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

verma-rahul/MapReduceProjects
Language: Java - Size: 1.29 MB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

smaddikonda/Hadoop-MapReduce
Parallel Data Processing using Hadoop MapReduce
Language: Makefile - Size: 35.3 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

salonishah11/MapReduce
Contains PageRank algorithm implemented in MapReduce and Spark. Programs for Combiner, NoCombiner and InMapperCombiner patterns along with Secondary Sort algorithm executed on temperature data.
Language: Java - Size: 1.2 MB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

marcelxyz/twitter-analysis-hadoop
Hadoop implementation of tweet length and frequent hashtag analysis
Language: Java - Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

aravind2060/HadoopAndHiveforLargeScaleDataAnalysis
Language: Java - Size: 186 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

nabai-max/mapreduce-wordcount
This project leverages Java and Hadoop MapReduce to analyze text and flight data, focusing on a classic Word Count problem and detailed flight data analysis.
Language: Java - Size: 703 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

nabai-max/Hadoop-MapReduce
Changed readme. This is a Java project that is used for a simple MapReduce Word Count problem.
Language: Java - Size: 711 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

darule0/yarndiff
A rudimentary command line utility for contrasting Apache Yarn container logs.
Language: Shell - Size: 59.6 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

shask9/Matrix-Multiplication-Hadoop
Hadoop MapReduce program to compute multiplication of two sparse matrices
Language: Java - Size: 96.7 KB - Last synced at: 7 months ago - Pushed at: about 7 years ago - Stars: 8 - Forks: 5

lalkakonus/ir-hw4
Mail TehnoSphere Information Retrieval HW4
Language: Java - Size: 1.12 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

RevanthPosina/YouTube_Analysis_with_Java
YouTube Analysis to find out the top 5 categories with maximum number of videos uploaded and the top 10 rated videos on YouTube
Language: Java - Size: 537 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

WindowsXp-Beta/BookHub
An online bookstore integrated with many fancy technologies.
Language: JavaScript - Size: 6.07 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

10lloydj/NLP-RDF-Inverted-Index
This Map Reduce program should read in a set of RDF/XML documents and output the data in the form: {object}, [(predicate1, position, subject1)...]
Language: Java - Size: 11.7 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

AkshayJaitly/CS643-AKSHAY-JAITLY
HADOOP WORDCOUNT ON AWS EC2 INSTANCE
Language: Java - Size: 9.46 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

Ruggero1912/mapreduce-bloom-filters
This project investigates how to build Bloom filters using the MapReduce approach in Hadoop and Spark. Different implementations and further performance analysis are reported.
Language: Jupyter Notebook - Size: 1.74 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

seyfal/MapReduceGraphComparison
Distributed computational problem-solving project, which aims to perform large-scale graph matching using cloud computing technologies. The project allows users to import two directed graphs and analyze the differences between them.
Language: Scala - Size: 1.76 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

j-buitrago/Distributed-processing-AWS
Amazon Web Services to process big data using a Hadoop cluster
Language: Python - Size: 18.6 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

hyeonsangjeon/dataplatform
Hadoop3.2 single/cluster mode with web terminal gotty, spark, jupyter pyspark, hive, eco etc.
Language: Shell - Size: 549 KB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 11 - Forks: 1

mhamadelitawi/Handoop
Hadoop Map-Reduce implementations of many scientific computations
Language: Java - Size: 2.46 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 1

HarshitDawar55/MapReduce
Programs for MapReduce written in java with least complexity!
Language: Java - Size: 76.2 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

yuliya-akchurina/Big-Data-Programming
Big Data Programming Projects
Language: Python - Size: 57.5 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

DavideBruni/ParallelK-Means
Implementation of Parallel k-means using MapReduce in Hadoop
Language: Jupyter Notebook - Size: 485 KB - Last synced at: 6 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

drewm8080/big_data_management
Contains all homework from the course Foundations of Database Management at USC
Language: Python - Size: 4.75 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

hamzahamidi/map-reduce-sample
MapReduce exercises sample
Language: Java - Size: 24.7 MB - Last synced at: 3 months ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 1

HxnDev/Hadoop-MapReduce-to-Analyze-Sentiment-of-Keyword
In this task, we had to write a MapReduce program to analyze the sentiment of a keyword from a list of comments. This was done using Hadoop HDFS.
Language: Java - Size: 1000 KB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 6 - Forks: 0

mikeroyal/Apache-Hadoop-Guide
Apache Hadoop Guide
Size: 141 KB - Last synced at: 2 months ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 2

marcocolangelo/Big-Data-processing-and-Analytics
The current repository contains all the code developed during the Big Data processing and Analytics laboratories. Data are processed and analyzed using Hadoop and Spark
Language: Java - Size: 6.1 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

JPThakur361/Sample-Projects
MapReduce implements various mathematical algorithms to divide a task into small parts and assign them to multiple systems. In technical terms, the MapReduce algorithm helps send the Map and Reduce tasks to appropriate servers in a cluster. Examples include sorting, searching, indexing, and TF-IDF. Here we implemented a few small pieces of an indexing algorithm.
Language: Java - Size: 1.68 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

Tarunpreetsingh16/hadoop
Data analysis using hadoop.
Size: 63.7 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

PranavPKS/big-data-small-projects
Learning basic concepts of standard big-data technologies
Language: Java - Size: 18.5 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

anshul1004/MutualFriends
Implementation of Hadoop and Spark
Language: Java - Size: 23 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

katipogluMustafa/BigData
Map Reduce
Language: Java - Size: 245 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

gowribhat/sms-corpus-keyword-analysis
Language: Jupyter Notebook - Size: 144 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

yp3722/Distributed-Log-Processing
A distributed system built on the Hadoop file system that employs a MapReduce approach to analyze large volumes of data and extract insights
Language: Scala - Size: 194 KB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

evaesqmor/WordCountMapReduce
Big Data: Map Reduce Example
Language: Java - Size: 4.88 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

thenielfarias/MapReduce-application-with-Hadoop-and-Java-for-WordCount
MapReduce application with Hadoop and Java for WordCount that counts the number of occurrences of each word in the books of the input set.
Language: Java - Size: 784 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

benjdiasaad/MapReduce_K-means
Implementation of the k-means clustering algorithm using the Hadoop framework, version 3.1.3 (MapReduce).
Language: Java - Size: 32.2 KB - Last synced at: 26 days ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 2

lauravoicu/Coursera-Hadoop-Platform-Application
Language: Python - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

MatteoM95/Big-data-processing-and-analytics
Exercises on Spark and Hadoop - Done in Distributed architectures for big data processing and analytics course at Politecnico di Torino
Language: Java - Size: 4.94 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 2

HaneefAhamed/Hadoop_Map_Reduce
Hadoop setup and Getting Started with developing Hadoop programs
Size: 11.7 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

ayush-usf/stack-overflow-logs-hadoop-analysis
Ask Ubuntu logs analysis with Hadoop, MapReduce 2 (YARN)
Language: Java - Size: 108 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

SubalakshmiShanthosi/PCP1211DALab
Language: TeX - Size: 34.4 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

NikhilURao/H1B_VisaProject
This repository contains the H1B_Visa Applicants Data Analysis project/case study using Hadoop, undertaken during training at NIIT. MapReduce, Hive, Pig, Sqoop, and shell scripting are the technologies used.
Language: Shell - Size: 729 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 2 - Forks: 5

ronellsalunke/Titanic-BigData
Java Hadoop MapReduce code for my Big Data Analytics Project using the Titanic dataset
Language: Java - Size: 41 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

rfhussain/Running-a-Spark-Job-on-AWS-Cluster
When dealing with huge datasets, the code is unlikely to execute successfully on a personal desktop. You need either a locally installed cluster environment, such as Hadoop MapReduce, or a cloud such as AWS. Here is an example of running such a job on the AWS cloud.
Language: Python - Size: 804 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

toukirnaim08/Python-Hadoop-MapReduce
Python Hadoop/MapReduce Program
Language: Python - Size: 5.52 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

toukirnaim08/HiveQL-Hadoop-MapReduce
A HiveQL script with Hadoop/MapReduce Program to find out the most popular movies for different age groups.
Language: HiveQL - Size: 5.52 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

toukirnaim08/PigScript-Hadoop-MapReduce
PigLatin script and Hadoop/MapReduce Program
Language: PigLatin - Size: 5.52 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

JhonWilderParionaVilca/MapReduce
Example of using MapReduce with Hadoop and Jupyter
Language: Jupyter Notebook - Size: 34.2 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

kartik894/hadoop-sched
Hadoop is an open-source implementation of MapReduce that enjoys wide adoption and is often used for short jobs where low response time is critical. Hadoop’s performance is closely tied to its task scheduler, which implicitly assumes that cluster nodes are homogeneous and that tasks make progress linearly, and uses these assumptions to decide when to speculatively re-execute tasks that appear to be stragglers. In practice, the homogeneity assumptions do not always hold. MapReduce uses speculative execution to improve fault tolerance. The current Hadoop implementation decides whether to run speculative tasks based on the progress rates of running tasks, which does not take into account the absolute progress of each task. The modified Hadoop framework was deployed on six t2.medium EC2 instances in a master-slave configuration.
Language: Java - Size: 1.57 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 2

shubhamwaghe/Scalable-Data-Mining
Scalable Data Mining - Assignment submissions
Language: Python - Size: 3.38 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 0

Goutham88/Parallel-and-Weighted-Itemset-Mining-by-means-of-MapReduce-FrameWork
Mines heavily weighted itemsets (ratings, reviews)
Language: Java - Size: 75.2 KB - Last synced at: over 1 year ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

tirthmehta/Big-Data-Analysis-with-Apache-Hadoop-Pig-Latin
Big data analysis of datasets that takes character occurrences into account.
Language: PigLatin - Size: 1000 Bytes - Last synced at: over 1 year ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0

asaldelkhosh-learning/hadoop
Learning Hadoop and Map-Reduce!
Size: 33.2 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

maphdev/M2_Big_Data_Project
Create a world map with zooms using NASA's geographical data. Use of Hadoop, Spark and HBase.
Language: Java - Size: 48.4 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

xichie/Hadoop
Language: Java - Size: 46.5 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

dboston1/Reddit-Sentiment-Analysis
Program that performs textual analysis of Reddit data (approx. 300 GB) preprocessed by another team member. Uses Hadoop MapReduce to classify comments as either positive or negative based on certain keywords, negation, etc.
Language: Java - Size: 2.34 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 5 - Forks: 0

mohammadsadra/iust-cc-401
This repo contains all supplementary items for the Cloud Computing course taught at IUST in Fall 2022.
Size: 340 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Keerthivasan13/CSCI572-Information_Retrieval_And_Web_Search_Engines
Search Engine projects
Language: Java - Size: 34.5 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 11 - Forks: 17

amitkedia007/Analysis-of-AirBnB-data-Hadoop-Mapreduce
This repo explains the implementation of the MapReduce algorithm on Airbnb data to understand consumer satisfaction by region and country. It is an effective use of parallel distributed computing to solve big data problems.
Language: Java - Size: 1.8 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

joshi-aditya/Amazon-Reviews-Dataset-Analysis-MapReduce
Amazon Customer Reviews Dataset Analysis using Hadoop MapReduce, Pig. Semester end project for INFO7250 Engineering of Big Data Systems course.
Language: Java - Size: 1.66 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 6 - Forks: 2

RahulReddy-Arva/Search-Engine-BAsed-on-TFIDF
A basic search engine that ranks documents in decreasing order of their TF-IDF values for the user's search query and retrieves the top 100 documents for the request. Term frequency-inverse document frequency (TF-IDF) is used for information retrieval. It is implemented in a distributed computing environment using Apache Hadoop.
Language: HTML - Size: 870 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

Yeema/WordCount
Uses Hadoop to rank vocabulary alphabetically (Aa-Zz)
Language: Java - Size: 2.01 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

Yeema/Average_Sort
Calculates the average number of occurrences and sorts the results using multiple reducers
Language: Java - Size: 2.64 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

Yeema/PageRank
Language: Jupyter Notebook - Size: 779 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

Yeema/LSH
find similar articles
Language: Java - Size: 623 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

Arushi2002/Yet_Another_Map_Reduce
Implemented the core concepts of Hadoop's Map Reduce Framework.
Language: Python - Size: 20.5 KB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Shubham-vish/hadoop-B-Tree
Language: Java - Size: 42 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 1

ucapdak/Olympic-Tweets
Assignment for Big Data Processing: A collection of programs for analysing tweets related to the 2012 Olympics.
Language: Java - Size: 223 KB - Last synced at: over 1 year ago - Pushed at: almost 8 years ago - Stars: 1 - Forks: 0

tugrulhkarabulut/hadoop-movie-rating-prediction
Movie rating prediction application
Language: CSS - Size: 3.46 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 4 - Forks: 0

AH-Yussef/Health-Monitor-Big-Data-System
A Health Monitor to simulate receiving and processing large amounts of health metrics from many clients with the goal of efficiently finding aggregate statistics
Language: Java - Size: 319 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

DanMolenhouse/Distributed-Systems-Project5-Hadoop-and-Spark
In this project, we used both Hadoop MapReduce and Spark for distributed computing. The first task was to perform a series of operations using Mapper and Reducer Java files run on a Hadoop server. The second task was to perform similar operations on Spark instead.
Language: Java - Size: 70.3 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

suncle1993/hadoop-mapreduce-demo
Hadoop3.1 MapReduce Demo -- Python
Language: Python - Size: 781 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 2

PrateekKumar1709/Ngram-Language-Model-Hadoop-MapReduce
A project to implement language models (n-grams) with Hadoop MapReduce
Language: Python - Size: 4.88 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

avulaankith/Matrix-Multiplication-Hadoop
Code for matrix multiplication using the Hadoop framework in Java and the Spark framework in Scala
Language: HTML - Size: 257 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

e-petrachi/AmazonFoodAnalytic
A project comparing Hadoop, Spark, and Hive on similar queries for distributed analysis of a CSV dataset of Amazon food product reviews
Language: Java - Size: 535 KB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

mihir09/Burnol
A search engine that lets users search with multi-word queries and displays the top 10 Wikipedia pages matching the query. Wikipedia was scraped using Beautiful Soup, and the data is indexed using Hadoop MapReduce.
Language: Java - Size: 138 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

akarsh3007/HadoopMapRExamples
Language: Java - Size: 35.9 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Grg0rry/MapReduce-Recommendation-System
A recommendation system built on top of Hadoop Distributed File System and MapReduce
Language: Java - Size: 204 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

ruxuebu/Java-based-Movie-Recommender
A movie recommendation system implemented in Java, based on item-item collaborative filtering algorithms
Language: Java - Size: 8.79 KB - Last synced at: almost 2 years ago - Pushed at: about 8 years ago - Stars: 4 - Forks: 2

iRahulP/COMP6231
Implementations of all assignments for the COMP6231 (Distributed System Design) course at Concordia University, Winter 2021.
Language: Java - Size: 8.87 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

nish-d/MapReduce_on_Cancer_Database
Map Reduce Queries on United States Cancer Statistics Data. Database can be found at mentioned link
Language: Java - Size: 14.3 MB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

ash-0521/Ensuring-Smiles-using-Spark-ML
The primary objective of this study is to explore the feasibility of using machine learning algorithms to classify health insurance plans based on their coverage for routine dental services. To achieve this, I used six different classification algorithms: LR, DT, RF, GBT, SVM, and FM. (Tech: PySpark, SQL, Databricks, Zeppelin books, Hadoop, Spark-Submit)
Language: Python - Size: 15.4 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Vzzarr/BigData---FineFoodReviews
Language: JavaScript - Size: 2.57 MB - Last synced at: almost 2 years ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0

PeterSchuld/UCSanDiego_MicroMasters_DataScience-BigDataAnalyticsUsingSpark
The University of California, San Diego, course DSE230x "Big Data Analytics Using Spark" (Summer 2019): Learn how to analyze large datasets using Jupyter notebooks, MapReduce and Spark as a platform. Part 4 of the »Data Science« MicroMasters® Program on edX. Instructor: Yoav Freund, Professor of CS and Engineering, University of California San Diego.
Size: 6.12 MB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 1

shreyas15/Ranked-File-Search
Information retrieval (IR) is concerned with finding material (e.g., documents) of an unstructured nature (usually text) in response to an information need (e.g., a query) from large collections. One approach to identifying relevant documents is to compute scores based on the matches between terms in the query and terms in the documents. For example, a document with words such as "ball", "team", "score", and "championship" is likely to be about sports. It is helpful to define a weight for each term in a document that can be meaningful for computing such a score. I use popular information retrieval metrics such as term frequency, inverse document frequency, and their product, term frequency-inverse document frequency (TF-IDF), to define weights for terms.
Language: Java - Size: 974 KB - Last synced at: almost 2 years ago - Pushed at: over 8 years ago - Stars: 1 - Forks: 0

backslash112/crystal-ball-hadoop
A crystal ball, built with MapReduce, that predicts events that may happen once a certain event has happened.
Language: Java - Size: 18.6 KB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

suyashdamle/ScalableDataMining
Assignments from the course of Scalable Data Mining
Language: Jupyter Notebook - Size: 494 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 1

Shineuptillast/hive_practice
Hive Practice Material
Size: 1.78 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

HaElMe/BigDataTrainings
Big Data for All Levels
Size: 103 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

somisettyv/HadoopWordCount
Hadoop MapReduce Word Count
Language: Java - Size: 5.86 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

RonnJacob/PageRank-MapReduce-Spark
Implemented the PageRank algorithm in Hadoop MapReduce framework and Spark.
Language: Java - Size: 442 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 1

pancr9/Cloud-Computing
The repository consists of Cloud Computing for Data Analysis project and assignments.
Language: Java - Size: 2.91 MB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

AlbertHunduza/Hadoop-MapReduce-Sentiment-Analysis
Calculating Average Sentiment of Words/Tokens
Language: Python - Size: 120 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

HarigovindV10/NYC-Subway-Data-Analysis
An analysis of NYC Subway Data using Hadoop Map Reduce
Language: Jupyter Notebook - Size: 529 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 1
