gitlab.com topics: Apache Spark

Repositories

arise-biodiversity/biocloud/docker-compose-biocloud-dta

Store and find Arise data: Delta Lake and PostgreSQL

Last synced at: about 2 years ago - Stars: 0 - Forks: 0

edu-career/edu-apachespark

Educational repo for Apache Spark

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

dars1608/geographically-weighted-regression-in-apache-spark

Implementation of Geographically Weighted Regression (GWR) using Apache Spark, Spark ML and Apache Sedona.

Last synced at: 11 months ago - Stars: 0 - Forks: 0

cecilegltslmcs/electricity_consumption_production

The aim of this project is to collect information on energy consumption and production in France. The data is collected with Apache Kafka, processed by Apache Spark, and stored in a NoSQL database, MongoDB.

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

cecilegltslmcs/twitter_sentimentanalysis

The aim of this project is to collect tweets by using the Twitter API and display the results of a sentiment analysis on a dashboard.

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

progxaker/sparkplugin

The "Stage Metrics" plugin for Apache Spark, for creating metrics by stage status

Last synced at: almost 3 years ago - Stars: 0 - Forks: 0

leo-plese/artificial-intelligence-machine-learning-deep-learning/machine-learning/apache-spark-python-framework-machine-learning-data-pipeline

Last synced at: almost 3 years ago - Stars: 0 - Forks: 0

saeideh_ab/spark-test

Sentiment analysis using the Spark ML library. Implements classic ML models (SVM, Logistic Regression, Naive Bayes, and Random Forest), embeddings (Word2Vec and TF-IDF), and ensemble and hybrid (ML and lexicon-based) methods.

Last synced at: over 2 years ago - Stars: 0 - Forks: 1

zero323/pyspark-asyncactions

Mirror of https://github.com/zero323/pyspark-asyncactions

Last synced at: almost 3 years ago - Stars: 0 - Forks: 0

leliac/ganymede

Execute Hadoop and Spark applications on the BigData@Polito cluster with a single command

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

siddie/stackexchange-dump-spark-research-tools

Stack Exchange releases "data dumps" of all its publicly available content roughly every three months via archive.org. This project is an example and a framework for building ETL for this data with Apache Spark and Java.

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

Related Keywords

Apache Spark (11), python (3), big data (3), pyspark (3), spark (2), mongodb (2), Apache Kafka (2), Apache Hadoop (1), sentiment analysis (1), Persian reviews (1), MapReduce (1), Digikala comments (1), sentiment classification (1), ETL (1), scalability (1), big data ETL (1), stack exchange (1), machine learning (1), jupyter notebook (1), NLP(Natural Language Process) (1), Large Movie Review Dataset (1), stackexchange (1), plugins (1), plugin (1), streamlit (1), plotly (1), spark-streaming (1), dashboard (1), gis (1), geospatial (1), ML (1), GWR (1), postgresql (1), parquet (1), iRODS (1), Delta Lake (1), ClickHouse (1), Citus (1)

ecosyste.ms
