An open API service providing repository metadata for many open source software ecosystems.

Topic: "cloudera-hadoop"

sergevs/ansible-cloudera-hadoop

Ansible playbook to deploy Cloudera Hadoop components to a cluster

Language: Shell - Size: 6.3 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 52 - Forks: 41

tilakpatidar/cdh5

Docker image for Cloudera Hadoop components (CDH5)

Language: Shell - Size: 51.8 KB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 9 - Forks: 5

smartlin5228/CCA175

Language: Java - Size: 107 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 7 - Forks: 10

Ranjandas/Dirty-CDH-Docker

A quick-and-dirty CDH cluster skeleton using Docker for testing

Language: Shell - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: almost 9 years ago - Stars: 6 - Forks: 2

dengshaochun/cdh-tools

Automated installation of Cloudera Hadoop

Language: Shell - Size: 923 KB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 4 - Forks: 1

haspdecrypted/OS-for-Big-Data-and-Hadoop

Getting Started with Hadoop and Big Data

Size: 23.4 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 1

achintya-kumar/BD2017

Otto-von-Guericke Universität Magdeburg - Big Data, summer semester 2017

Language: Java - Size: 28.1 MB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 0

kwartile/spark-benchmark

Spark Benchmark suite to evaluate cluster configuration and compare the performance with other big data frameworks.

Language: Scala - Size: 28.3 KB - Last synced at: 3 months ago - Pushed at: about 8 years ago - Stars: 2 - Forks: 0

JohnnyFoulds/local-hadoop

This project creates a small local Hadoop cluster using Cloudera CDH and CentOS.

Language: Python - Size: 216 MB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 1

vodkolav/DataEngineerProject

This is my final project for the Data Engineer Expert course at Naya College.

Language: Jupyter Notebook - Size: 930 KB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

SakhriHoussem/Apache-Hive-Tutorial

Learn how Hive works through a simple example

Size: 4.88 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 1

Ishuan/Page-Rank-Implementation

The goal of this programming assignment is to compute the PageRank scores of an input set of hyperlinked Wikipedia documents using Hadoop MapReduce. The PageRank score of a web page serves as an indicator of the page's importance. Many web search engines (e.g., Google) use PageRank scores in some form to rank results for user-submitted queries. The goals of this assignment are to: 1. Understand the PageRank algorithm and how it works in MapReduce. 2. Implement PageRank and execute it on a large corpus of data. 3. Examine the output from running PageRank on Simple English Wikipedia to measure the relative importance of pages in the corpus. To run your program on the full Simple English Wikipedia archive, you will need to run it on the dsba-hadoop cluster to which you have access.

Language: Java - Size: 36.1 KB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

dorianbg/cloudera-quickstart-installation-guide

How to install Cloudera QuickStart

Size: 909 KB - Last synced at: over 2 years ago - Pushed at: almost 8 years ago - Stars: 1 - Forks: 3

syscrest/cloudera-manager-hipchat-chatbot

Chatbot for HipChat (cloud or on-premises) that lets you talk to your Cloudera Manager

Language: Java - Size: 77.1 KB - Last synced at: about 1 year ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 1

Rifat392000/BigDataAnalytics

Language: Jupyter Notebook - Size: 18.4 MB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

meetgajjarx07/Baseball-analysis-BigData

This project uses the Cloudera platform and Pig queries to analyze and retrieve information on specific baseball performance and statistics problems. By employing big data methods, the analysis offers insights into player performance, game trends, and strategic patterns.

Size: 3 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

AdrianYuu/qualification-big-data-processing

A qualification project for teaching as an assistant at SLC in the COMP6579001 Big Data Processing course.

Language: Jupyter Notebook - Size: 2.11 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

shubnimkar/Hadoop

This repository includes two versions of Hadoop management tools

Size: 320 MB - Last synced at: 9 days ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Johnny1110/Hadoop_Note

Hadoop study notes

Language: Shell - Size: 8.41 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

Rishi500067313/Twitter-data-stream-into-MySQL-table-using-NiFI

Size: 1.51 MB - Last synced at: almost 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

VaishnavJois/CLOUDERA

Cloudera commands used for Big Data Analytics

Size: 13.7 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

marycboardman/Assessment-Attempts

Data processing using Docker containers, Kafka, Spark, and Hadoop

Size: 6.84 KB - Last synced at: over 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 1

SakhriHoussem/HBase-Tutorial

A simple HBase tutorial

Size: 16.6 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 1

SakhriHoussem/SparkSQL-Tutorial

A simple SparkSQL tutorial

Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

SakhriHoussem/Apache-Spark-Tutorial

A simple Apache Spark tutorial

Size: 5.57 MB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

SakhriHoussem/HBase-With-Hive

Learn how Hive works with HBase through a simple example

Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

nikitaeverywhere/hadoop-network-of-keywords

Keyword network builder based on TF-IDF, built on the Hadoop platform

Language: Python - Size: 86.9 KB - Last synced at: 3 months ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

Mantej-Singh/Apache-Spark-Under-the-hood--WordCount

Running my first PySpark app on CDH5

Language: Jupyter Notebook - Size: 165 KB - Last synced at: 9 days ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 1

akshaydake123/Sentiment-Analysis-on-Twitter-Data

This repository shows how to perform sentiment analysis on tweets from Twitter using Hive. Collect the tweets from Twitter using Flume. Because the incoming tweets are in JSON format, they need to be loaded into Hive using a JSON input format; use the Cloudera Hive JSON SerDe for this purpose.

Size: 575 KB - Last synced at: 10 months ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

bishalpaudel/HadoopProductPurchaseProbability

Anticipatory customer order prediction after the purchase of item(s).

Language: Java - Size: 10.7 KB - Last synced at: 4 months ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 1