An open API service providing repository metadata for many open source software ecosystems.

Topic: "mapreduce"

donnemartin/data-science-ipython-notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Language: Python - Size: 46.8 MB - Last synced at: 19 days ago - Pushed at: about 1 year ago - Stars: 28,123 - Forks: 7,967

heibaiying/BigData-Notes

A beginner's guide to big data :star:

Language: Java - Size: 22.9 MB - Last synced at: 16 days ago - Pushed at: over 1 year ago - Stars: 16,368 - Forks: 4,277

PowerJob/PowerJob

Enterprise job scheduling middleware with distributed computing ability.

Language: Java - Size: 18.6 MB - Last synced at: 5 days ago - Pushed at: 4 months ago - Stars: 7,449 - Forks: 1,311

douban/dpark 📦

Python clone of Spark, a MapReduce-like framework in Python

Language: Python - Size: 2.65 MB - Last synced at: 22 days ago - Pushed at: over 4 years ago - Stars: 2,682 - Forks: 530

water8394/BigData-Interview

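:dart: :star2: [Big data interview questions] A collection of big data interview questions gathered from the web, with my own summarized answers. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.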

Size: 6.59 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 1,610 - Forks: 446

collabH/bigdata-growth

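A big data knowledge repository covering data warehouse modeling, real-time computing, big data, data middle platforms, system design, Java, algorithms, and more.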

Language: Shell - Size: 221 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 1,579 - Forks: 377

mahmoudparsian/data-algorithms-book

MapReduce, Spark, Java, and Scala for Data Algorithms Book

Language: Java - Size: 397 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 1,073 - Forks: 663

microsoft/Mobius

C# and F# language binding and extensions to Apache Spark

Language: C# - Size: 6.44 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 940 - Forks: 211

happyer/distributed-computing

distributed_computing includes mapreduce, kvstore, etc.

Language: Go - Size: 16.8 MB - Last synced at: 11 months ago - Pushed at: almost 5 years ago - Stars: 796 - Forks: 213

cdapio/cdap

An open source framework for building data analytic applications.

Language: Java - Size: 612 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 773 - Forks: 347

bcongdon/corral

🐎 A serverless MapReduce framework written for AWS Lambda

Language: Go - Size: 1.43 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 693 - Forks: 40

sunnyandgood/BigData

💎🔥 Big data study notes

Language: Java - Size: 316 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 647 - Forks: 222

grailbio/bigslice

A serverless cluster computing system for the Go programming language

Language: Go - Size: 2.66 MB - Last synced at: 20 days ago - Pushed at: almost 2 years ago - Stars: 553 - Forks: 34

apache/uniffle

Uniffle is a high performance, general purpose Remote Shuffle Service.

Language: Java - Size: 12.9 MB - Last synced at: 1 day ago - Pushed at: 3 days ago - Stars: 415 - Forks: 155

CamDavidsonPilon/tdigest

t-Digest data structure in Python. Useful for percentiles and quantiles, including distributed environments like PySpark

Language: Python - Size: 91.8 KB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 392 - Forks: 54

cubefs/compass

Compass is a task diagnosis platform for bigdata

Language: Java - Size: 5.92 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 380 - Forks: 137

RedisGears/RedisGears

Dynamic execution framework for your Redis data

Language: Rust - Size: 4.76 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 373 - Forks: 66

cwensel/cascading

Cascading is a feature rich API for defining and executing complex and fault tolerant data processing flows locally or on a cluster.

Language: Java - Size: 32.1 MB - Last synced at: 28 days ago - Pushed at: about 1 month ago - Stars: 349 - Forks: 221

datawhalechina/juicy-bigdata

🎉🎉🐳 Datawhale introductory tutorial on big data processing | The opening course of the big data technology track 🎉🎉

Language: Python - Size: 27.4 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 312 - Forks: 43

DigitalPebble/behemoth 📦

Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.

Language: Java - Size: 7.45 MB - Last synced at: 6 months ago - Pushed at: about 7 years ago - Stars: 281 - Forks: 60

Tencent/Firestorm

Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark and Apache Hadoop MapReduce applications to store shuffle data on remote servers

Language: Java - Size: 1.63 MB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 248 - Forks: 75

xingdl2007/6.824-2017

:zap: 6.824: Distributed Systems (Spring 2017). A course which presents abstractions and implementation techniques for engineering distributed systems.

Size: 0 Bytes - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 215 - Forks: 78

BWbwchen/MapReduce

An easy-to-use Map Reduce Go parallel-computing framework inspired by 2021 6.824 lab1. It currently supports multiple worker threads on a single machine and multiple processes on a single machine.

Language: Go - Size: 2.6 MB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 214 - Forks: 13

mahmoudparsian/data-algorithms-with-spark

O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian

Language: Python - Size: 44.9 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 213 - Forks: 93

lynnlangit/learning-hadoop-and-spark

Companion to Learning Hadoop and Learning Spark courses on LinkedIn Learning

Language: HTML - Size: 13.6 MB - Last synced at: 7 days ago - Pushed at: 5 months ago - Stars: 194 - Forks: 167

kevwan/mapreduce

An in-process MapReduce library to help you optimize service response time or concurrent task processing.

Language: Go - Size: 44.9 KB - Last synced at: 7 days ago - Pushed at: 10 months ago - Stars: 173 - Forks: 24

mahmoudparsian/big-data-mapreduce-course

Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University

Language: HTML - Size: 601 MB - Last synced at: 29 days ago - Pushed at: 5 months ago - Stars: 155 - Forks: 142

touero/ctenopharyngodon-idella

Use MapReduce's Java interface to crawl data on Chinese universities in a distributed fashion and learn the basics of HDFS.

Language: Java - Size: 3.75 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 140 - Forks: 0

mimecast/dtail

DTail is a distributed DevOps tool for tailing, grepping, catting logs and other text files on many remote machines at once.

Language: Go - Size: 12.3 MB - Last synced at: 8 days ago - Pushed at: 9 months ago - Stars: 128 - Forks: 10

CocaineCong/tangseng

Tangseng search engine including full-text search and vector search, based on Golang. A Go-based search engine and information retrieval system.

Language: Go - Size: 6.12 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 125 - Forks: 36

asakusafw/asakusafw

Asakusa Framework

Language: Java - Size: 34.8 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 117 - Forks: 13

miguno/avro-hadoop-starter 📦

Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.

Language: Java - Size: 650 KB - Last synced at: 6 days ago - Pushed at: over 9 years ago - Stars: 114 - Forks: 83

feng-li/Distributed-Statistical-Computing

Teaching Materials for Distributed Statistical Computing (teaching materials for big data distributed computing)

Language: HTML - Size: 49.1 MB - Last synced at: about 2 months ago - Pushed at: 11 months ago - Stars: 106 - Forks: 66

Refefer/Dampr

Python Data Processing library

Language: Python - Size: 178 KB - Last synced at: 17 days ago - Pushed at: over 1 year ago - Stars: 102 - Forks: 6

andreaiacono/MapReduce

MapReduce by examples

Language: Java - Size: 33.5 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 94 - Forks: 76

maxis42/Big-Data-Engineering-Coursera-Yandex

Big Data for Data Engineers Coursera Specialization from Yandex

Language: Jupyter Notebook - Size: 66.2 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 92 - Forks: 74

iflytek/Guitar

A Simple and Efficient Distributed Multidimensional BI Analysis Engine.

Language: Java - Size: 1.5 MB - Last synced at: 6 months ago - Pushed at: over 3 years ago - Stars: 86 - Forks: 22

GoCollaborate/src

A light-weight distributed stream computing framework for Golang

Language: Go - Size: 9.33 MB - Last synced at: 11 months ago - Pushed at: almost 7 years ago - Stars: 86 - Forks: 24

kwartile/connected-component

Map Reduce Implementation of Connected Component on Apache Spark

Language: Scala - Size: 26.4 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 84 - Forks: 18

mahmoudparsian/pyspark-algorithms

PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2

Language: Python - Size: 40.5 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 84 - Forks: 44

flipkart-incubator/hbase-orm

A production-grade HBase ORM library that makes accessing HBase clean, fast and fun (Can also be used as Bigtable ORM)

Language: Java - Size: 363 KB - Last synced at: 26 days ago - Pushed at: almost 2 years ago - Stars: 81 - Forks: 41

tracy-talent/curriculum

a repository for my curriculum project

Language: Python - Size: 135 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 81 - Forks: 66

groda/big_data

Tutorials on Big Data essentials: Hadoop, MapReduce, Spark. Explore a variety of tutorials and demonstrations on Big Data technologies, primarily in the form of Jupyter notebooks. Most notebooks are self-contained and live—ready to run with a click.

Language: Jupyter Notebook - Size: 51.9 MB - Last synced at: 19 days ago - Pushed at: 20 days ago - Stars: 75 - Forks: 26

nellore/rail

Scalable RNA-seq analysis

Language: Python - Size: 249 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 73 - Forks: 11

am-kantox/elixir-iteraptor

Handy enumerable operations implementation.

Language: Elixir - Size: 206 KB - Last synced at: about 15 hours ago - Pushed at: 2 months ago - Stars: 72 - Forks: 9

razertory/MIT6.824-Java

Distributed systems course (MIT6.824) implemented in Java

Language: Java - Size: 1.43 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 65 - Forks: 19

TurboWay/pybigdata

Working with various big data components using Python

Language: Python - Size: 85 KB - Last synced at: 29 days ago - Pushed at: about 2 years ago - Stars: 63 - Forks: 18

arindas/mit-6.824-distributed-systems

Template repository to work on the labs from MIT 6.824 Distributed Systems course.

Language: Go - Size: 1.42 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 59 - Forks: 7

liumingmusic/HadoopLearning

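A complete set of basic big data tutorials, including the very basics such as centos and maven. The big data material mainly covers hdfs, mr, yarn, hbase, kafka, scala, sparkcore, sparkstreaming, and sparksql. The tutorials include all source code demos and online documentation.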

Language: Scala - Size: 5.95 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 52 - Forks: 24

niqdev/devops

DevOps

Language: Shell - Size: 9.18 MB - Last synced at: about 1 month ago - Pushed at: about 2 years ago - Stars: 48 - Forks: 19

yenkuanlee/IPDC

IPDC (InterPlanetary Distributed Computing) is a distributed computation service: a peer-to-peer hypermedia protocol to make computation faster, open, and more scalable.

Language: Python - Size: 20.9 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 48 - Forks: 9

aikuyun/bigdata-doc

Big data study notes, learning roadmaps, and a collection of technical case studies.

Language: Shell - Size: 2.38 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 47 - Forks: 19

vivek2319/Learn-Hadoop-and-Spark

This repository focuses on gathering and making a curated list of resources to learn Hadoop for FREE.

Language: Python - Size: 211 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 46 - Forks: 39

asuiu/pyxtension

Pure Python extensions library that includes Scala-like streams, Json with attribute access syntax, and other common use stuff

Language: Python - Size: 334 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 45 - Forks: 1

jehiah/gomrjob

gomrjob - a Go Framework for Hadoop Map Reduce Jobs

Language: Go - Size: 583 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 42 - Forks: 4

whitfin/efflux

Easy Hadoop Streaming and MapReduce interfaces in Rust

Language: Rust - Size: 51.8 KB - Last synced at: 25 days ago - Pushed at: about 1 year ago - Stars: 39 - Forks: 7

azavea/terraform-aws-emr-cluster 📦

A Terraform module to create an Amazon Web Services (AWS) Elastic MapReduce (EMR) cluster.

Language: HCL - Size: 16.6 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 39 - Forks: 52

commoncrawl/cc-warc-examples Fork of Smerity/cc-warc-examples

CommonCrawl WARC/WET/WAT examples and processing code for Java + Hadoop

Language: Java - Size: 30.3 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 38 - Forks: 19

maniram-yadav/Big_DataHadoop_Projects

Big data projects implemented by Maniram yadav

Language: PigLatin - Size: 2.79 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 33 - Forks: 33

hiejulia/Data-pipeline-project

Data pipeline project

Language: Jupyter Notebook - Size: 55.1 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 32 - Forks: 22

jishnub/ParallelUtilities.jl

Fast and easy parallel mapreduce on HPC clusters

Language: Julia - Size: 992 KB - Last synced at: 13 days ago - Pushed at: over 3 years ago - Stars: 31 - Forks: 0

amarkum/interview-refresher-java-bigdata

A one-stop repo to look up code snippets for core Java concepts, SQL, data structures, and big data. It also includes interview questions asked in real life.

Language: Java - Size: 2.05 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 30 - Forks: 9

tiemma/sonic-distribute

Accelerate your distributed processes with this MapReduce framework. Focus on your logic and deploy tasks to workers seamlessly.

Language: TypeScript - Size: 912 KB - Last synced at: 21 days ago - Pushed at: about 2 years ago - Stars: 30 - Forks: 2

OrangeDrk/JavaNotes

Java backend study notes, covering Linux, maven, git, internet architecture, the big data ecosystem, and more.

Size: 149 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 29 - Forks: 9

saleyn/etran

Erlang Parse Transforms Including Fold (MapReduce) comprehension, Elixir-like Pipeline, and default function arguments

Language: Erlang - Size: 130 KB - Last synced at: 23 days ago - Pushed at: over 1 year ago - Stars: 29 - Forks: 2

chucheng92/HadoopDedup

:watermelon: Large-scale deduplication of massive data based on Hadoop and HBase

Language: Java - Size: 12 MB - Last synced at: 27 days ago - Pushed at: about 7 years ago - Stars: 29 - Forks: 16

SSQ/Coursera-UW-Machine-Learning-Clustering-Retrieval

Language: Python - Size: 81.9 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 28 - Forks: 28

CLDXiang/Mining-Frequent-Pattern-from-Search-History

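Course project for "Big Data Mining Techniques" at Fudan University. It attempts to find frequent 2-itemsets of keywords with high support in search records from the Sogou Labs user query log dataset (2008). For the implementation, I built a miniature Hadoop cluster of five servers and implemented the three MapReduce passes of the Parallel FP-Growth algorithm in Python.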

Language: Python - Size: 1.52 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 26 - Forks: 2

hobbyquaker/mqttDB

JSON Store with MQTT Interface :books::open_file_folder::satellite:

Language: JavaScript - Size: 99.6 KB - Last synced at: 5 days ago - Pushed at: almost 7 years ago - Stars: 26 - Forks: 0

onanypoint/yandex-big-data-engineering 📦

Language: Jupyter Notebook - Size: 458 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 25 - Forks: 39

longshilin/Hadoop-MapReduce

Application examples based on MapReduce :ear_of_rice:

Language: Java - Size: 30.3 KB - Last synced at: 18 days ago - Pushed at: over 7 years ago - Stars: 25 - Forks: 7

caizkun/mapreduce-examples

A collection of mapreduce problems and solutions

Language: Java - Size: 91.8 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 24 - Forks: 10

d2si-oss/ooso

Java library for running Serverless MapReduce jobs

Language: Java - Size: 82.9 MB - Last synced at: 23 days ago - Pushed at: almost 8 years ago - Stars: 24 - Forks: 2

srafay/Hadoop-hands-on

Learning how to tame the Big Data with Hadoop and related technologies

Language: PigLatin - Size: 96.7 KB - Last synced at: 6 days ago - Pushed at: about 5 years ago - Stars: 23 - Forks: 21

Deeptiman/offchaindata

Hyperledger Fabric OffChain Storage

Language: Go - Size: 28.3 KB - Last synced at: 19 days ago - Pushed at: almost 4 years ago - Stars: 22 - Forks: 15

gbieul/spyrk-cluster

Spyrk-cluster is a data mini-lab, considering the main technologies used these days. It's useful to either understand how to configure a cluster, or just to take it for granted to use for testing with submit or interactive jobs.

Language: Dockerfile - Size: 572 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 22 - Forks: 11

ishugaepov/MLBD

Materials for "Machine Learning on Big Data" course

Language: Jupyter Notebook - Size: 88.6 MB - Last synced at: 5 months ago - Pushed at: almost 2 years ago - Stars: 21 - Forks: 60

Azure-Samples/durablefunctions-mapreduce-dotnet 📦

An implementation of MapReduce on top of C# Durable Functions over the NYC 2017 Taxi dataset to compute average ride time per-day

Language: C# - Size: 620 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 21 - Forks: 9

in4it/gomap

Run your MapReduce workloads as a single binary on a single machine with multiple CPUs and high memory. Pricing of a lot of small machines vs heavy machines is the same on most cloud providers.

Language: Go - Size: 79.1 KB - Last synced at: 11 months ago - Pushed at: almost 5 years ago - Stars: 21 - Forks: 8

lmarabi/st-hadoop

ST-Hadoop is an open-source MapReduce extension of Hadoop designed specially to analyze your spatio-temporal data efficiently

Language: Java - Size: 125 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 20 - Forks: 6

huangyueranbbc/RecommendByItemcf

Hadoop mapreduce. Collaborative filtering item recommendation system based on ItemCF

Language: Java - Size: 498 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 20 - Forks: 13

TauferLab/Mimir

Mimir is a new implementation of MapReduce over MPI. Mimir inherits the core principles of existing MapReduce frameworks, such as MR-MPI, while redesigning the execution model to incorporate a number of sophisticated optimization techniques that achieve similar or better performance with significant reduction in the amount of memory used.

Language: C++ - Size: 51.9 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 20 - Forks: 7

bilal-elchami/dijkstra-hadoop-spark

Dijkstra Algorithm - Python Hadoop Streaming and Pyspark

Language: TeX - Size: 1.96 MB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 20 - Forks: 2

taha7ussein007/Coursera_Bigdata_UCSD

UCSD Big Data Specialization General Materials and my Capstone Project.

Language: Jupyter Notebook - Size: 11.2 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 20 - Forks: 16

a4tunado/lectures-hse-spark

Scalable machine learning and big data analysis with Apache Spark

Language: Jupyter Notebook - Size: 52.3 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 20 - Forks: 10

InnoFang/subgraph-isomorphism

❄Implement the common subgraph isomorphism algorithms (i.e. Ullmann, VF2) based on MapReduce on Hadoop

Language: Java - Size: 19.6 MB - Last synced at: 13 days ago - Pushed at: almost 3 years ago - Stars: 19 - Forks: 0

zenoyang/web-click-flow

Offline log analysis of website clickstreams

Language: Java - Size: 2.98 MB - Last synced at: 22 days ago - Pushed at: over 6 years ago - Stars: 19 - Forks: 11

benedekh/bigdata-projects

Student projects in Big Data field.

Language: Java - Size: 190 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 18 - Forks: 12

zunzhuowei/qs-hadoop

Learning the big data ecosystem

Language: Java - Size: 5.49 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 18 - Forks: 7

pranitbose/market-basket-analysis

Hadoop MapReduce implementation of Market Basket Analysis for Frequent Item-set and Association Rule mining using Apriori algorithm.

Language: Java - Size: 92.8 KB - Last synced at: about 2 years ago - Pushed at: almost 8 years ago - Stars: 18 - Forks: 11

ly16/GooglePlay-Web-Crawler

Mapreduce project by Hadoop, Nutch, AWS EMR, Pig, Tez, Hive

Language: Java - Size: 1.62 MB - Last synced at: 10 months ago - Pushed at: about 8 years ago - Stars: 18 - Forks: 6

vsmolyakov/pyspark

spark (scala and python)

Language: Python - Size: 2.4 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 17 - Forks: 6

yaa110/goterator

Lazy iterator implementation for Golang

Language: Go - Size: 19.5 KB - Last synced at: 10 days ago - Pushed at: 10 months ago - Stars: 16 - Forks: 4

jarlor/TravelWebsite_BigDataAnalysis

Big data analysis of a travel website (partial Ctrip data): Hadoop course project (undergraduate coursework level)

Language: Java - Size: 639 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 1

goldmansachs/MRWord2Vec

A MapReduce / Hadoop implementation of Word2Vec

Language: Java - Size: 52.7 KB - Last synced at: about 1 month ago - Pushed at: about 3 years ago - Stars: 16 - Forks: 11

singgel/BigData-skillTree

[Yiche] - Spark, flink, HBase, Hive, flume; includes some demos of native Hadoop APIs (such as HDFS and MapReduce: just these two for now); also tests some exception-related functionality

Language: Java - Size: 107 KB - Last synced at: 29 days ago - Pushed at: about 6 years ago - Stars: 16 - Forks: 11

alash3al/aggrex

a crazy API gateway aggregation using javascript as a language and go as a runtime

Language: Go - Size: 24.4 KB - Last synced at: 13 days ago - Pushed at: almost 7 years ago - Stars: 16 - Forks: 0

marcelmittelstaedt/BigData

Lecture: Big Data

Language: HTML - Size: 588 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 15 - Forks: 12

wxw-matt/docker-hadoop Fork of big-data-europe/docker-hadoop

Run Hadoop on Arm64 (Apple M1) and Intel CPUs Natively Using the Universal Docker Images.

Language: Shell - Size: 114 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 15 - Forks: 8

zhou-yuhan/MIT-6.824-Distributed-Systems

Materials for MIT 6.824: Distributed Systems 2020

Language: Go - Size: 14.5 MB - Last synced at: 11 months ago - Pushed at: over 3 years ago - Stars: 15 - Forks: 4