Mining-Frequent-Pattern-from-Search-History

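Course project for "Big Data Mining Technology" (大数据挖掘技术) at Fudan University. It attempts to find frequent 2-itemsets of keywords with high support in the search records of the Sogou Labs user query log dataset (2008). On the implementation side, I built a small Hadoop cluster of five servers and implemented the three MapReduce passes of the Parallel FP-Growth algorithm in Python.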
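
To make the goal concrete, here is a minimal single-machine sketch of what "frequent 2-itemsets above a support threshold" means. It is not the repository's code and it is not Parallel FP-Growth: it uses plain two-pass counting in memory, and the function name `frequent_pairs`, the `min_support` parameter, and the sample queries are all made up for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return {pair: count} for keyword pairs found in >= min_support queries.

    Hypothetical helper for illustration only; brute-force counting,
    not the Parallel FP-Growth algorithm the project implements.
    """
    # Pass 1: count how many queries contain each keyword
    # (set() ensures a keyword is counted once per query).
    item_counts = Counter()
    for keywords in transactions:
        item_counts.update(set(keywords))
    frequent_items = {k for k, c in item_counts.items() if c >= min_support}

    # Pass 2: count pairs built only from frequent keywords, since a pair
    # cannot be frequent unless both of its members are.
    pair_counts = Counter()
    for keywords in transactions:
        kept = sorted(set(keywords) & frequent_items)
        pair_counts.update(combinations(kept, 2))

    return {p: c for p, c in pair_counts.items() if c >= min_support}


# Made-up sample data: each inner list is the keyword set of one query.
queries = [
    ["hadoop", "mapreduce", "python"],
    ["hadoop", "mapreduce"],
    ["python", "tutorial"],
    ["hadoop", "python"],
]
print(frequent_pairs(queries, min_support=2))
# → {('hadoop', 'mapreduce'): 2, ('hadoop', 'python'): 2}
```

Pruning infrequent single keywords before counting pairs keeps the pair table small; on a real query log, that counting work is what a distributed approach would spread across the cluster.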

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CLDXiang%2FMining-Frequent-Pattern-from-Search-History

Stars: 26
Forks: 2
Open issues: 0

License: None
Language: Python
Size: 1.52 MB
Dependencies parsed at: Pending

Created at: over 5 years ago
Updated at: over 2 years ago
Pushed at: about 4 years ago
Last synced at: about 2 years ago

Topics: fp-growth, hadoop, mapreduce, mapreduce-python
