An open API service providing repository metadata for many open source software ecosystems.

Topic: "big data"

jtholmi/wit_io

WITio: A MATLAB data evaluation toolbox to script broader insights into big data from WITec microscopes

Last synced at: over 2 years ago - Stars: 2 - Forks: 0

nescience/machine_learning

New machine learning algorithms based on the minimum nescience principle

Last synced at: over 2 years ago - Stars: 2 - Forks: 0

appdev87/staff-scheduler-nextjs

Last synced at: over 2 years ago - Stars: 1 - Forks: 0

datahackformation/community/workshops/workshop_1_druid_gcpcertification

Workshop led by Jesús Méndez (https://pe.linkedin.com/in/jmendezgal) and Antonio Cachuán (https://linkedin.com/in/antoniocachuan/) covering Apache Druid, getting certified in GCP, and our Data Engineering Program

Last synced at: over 2 years ago - Stars: 1 - Forks: 0

datahackformation/community/workshops/workshop_2_bigdata_hadoop

Big Data workshop led by Jimmy Farfán, instructor of the online course "Desarrollo de Aplicaciones de Big Data en Hadoop" (Big Data Application Development in Hadoop). For more information or any questions, you can find us on Facebook as Data Hack Formation.

Last synced at: over 2 years ago - Stars: 1 - Forks: 0

gmarciani/flink-app

Scaffolding for data stream processing applications, leveraging Apache Flink.

Last synced at: over 2 years ago - Stars: 1 - Forks: 0

gmarciani/mapreduce-app

Scaffolding for Map/Reduce applications, leveraging Apache Hadoop.

Last synced at: over 2 years ago - Stars: 1 - Forks: 0

neuroscience-lab/bndf

Structured big data framework based on Apache Spark for storing and manipulating large-scale, multi-channel neurophysiological recording data

Last synced at: over 2 years ago - Stars: 1 - Forks: 0

neuroscience-lab/bndfcluster

BNDF Private cluster

Last synced at: over 2 years ago - Stars: 1 - Forks: 2

v-bootcamp-bd-ml/big-data-processing.spark-y-scala.practice

Practice assignment for the Big Data Processing (Spark and Scala) module of the V Bootcamp BD & ML at Keepcoding

Last synced at: over 2 years ago - Stars: 1 - Forks: 0

amit-kamat/Map-Reduce-Ukraine

This project aggregates trending data from Ukraine-based Twitter accounts. The raw aggregated data is cleansed before analysis using some big data methods. The purpose of this project is to familiarize myself with how Hadoop's HDFS and MapReduce infrastructure works.

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

arnimjenett/fsdb

File-system-based database for managing big image data.

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

contactprincebansal/pyspark-azure-hdinsight-sample

Deploying PySpark Jobs on Azure HDInsight Spark Cluster (CI/CD)

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

dars1608/geographically-weighted-regression-in-apache-spark

Implementation of Geographically Weighted Regression (GWR) using Apache Spark, Spark ML and Apache Sedona.

Last synced at: 9 months ago - Stars: 0 - Forks: 0

dars1608/geospatial-index-distributed

Spatial join of geospatial data from Kafka streams using Apache Spark (Spark Streaming).

Last synced at: 9 months ago - Stars: 0 - Forks: 0

dephekt/crawler

A Python app for scanning large data sets of URLs for a given signature and storing the results in an Elasticsearch index. Useful for CERTs and security researchers, and possibly others.

Last synced at: about 2 years ago - Stars: 0 - Forks: 0

erichgatejen/autohit-2003

XML based testing platform.

Last synced at: about 2 years ago - Stars: 0 - Forks: 0

erichgatejen/dadadJ

dadadJ data operating environment

Last synced at: about 2 years ago - Stars: 0 - Forks: 0

franckf/enwiki-20210620-abstract.xml

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

jal7/DataScience

A Learning Path for Data Science professional development

Last synced at: 10 months ago - Stars: 0 - Forks: 1

jbferet/bigRaster

The bigRaster package handles large rasters that can be processed in chunks. This includes computing spectral indices, applying regression models, stacking individual rasters into larger rasters...

Last synced at: almost 2 years ago - Stars: 0 - Forks: 0

kaelta/kwn

Studying the effects of music on the growth of plants through an IoT automated farming solution.

Last synced at: over 1 year ago - Stars: 0 - Forks: 0

knowledge-bases/data-science

https://data.rtfm.page

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

leliac/ganymede

Execute Hadoop and Spark applications on the BigData@Polito cluster with a single command

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

maspadaru/taskmaster

Taskmaster is a lightweight open-source software framework that aims to simplify the distribution of big data processing and analysis tasks across multiple worker nodes.

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

migandr/hadoop-premier-league

This project was an exercise for the Master in Big Data Engineering and Data Science at "Universidad Autónoma de Madrid". See the readme.md for more information.

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

NCouli/BigDataManagement

Last synced at: about 2 years ago - Stars: 0 - Forks: 0

rvalfo/shluker

NLTK-based sentiment analysis of a Twitter stream for a given word. Includes configuration scripts for MongoDB and Twitter streaming.

Last synced at: over 2 years ago - Stars: 0 - Forks: 1

rychly-edu/theses/dist-forensic-digital-data-repo

Distributed storage for digital forensic data with a data/metadata repository, an API for queries and incoming/outgoing data, indexing, a plug-in system for not-yet-supported data types, etc.

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

samy_benslimane/nf26-project

Analysis of aviation data from ASOS (https://mesonet.agron.iastate.edu/request/download.phtml) to highlight some patterns

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

siddie/stackexchange-dump-spark-research-tools

Stack Exchange releases "data dumps" of all its publicly available content roughly every three months via archive.org. This project is an example and a framework for building ETL for this data with Apache Spark and Java.

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

stefano.slobodiuk/open-data-for-bike-marstefo

The unofficial Bike Sharing analytics service for Udine (aka Bike MarStefo) makes its dataset freely available for everyone to download.

Last synced at: 10 months ago - Stars: 0 - Forks: 0

therackio/big-data/binaries/apache-hadoop-bin-arm64

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

therackio/big-data/binaries/apache-hive-bin-arm64

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

therackio/big-data/binaries/apache-hudi-bin-arm64

Apache Hudi (https://hudi.apache.org/), compiled on ARM64.

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

therackio/big-data/binaries/apache-spark-bin-arm64

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

tymyrddin/seedlings

Deep learning (CNN) deployment pipeline

Last synced at: over 2 years ago - Stars: 0 - Forks: 0

varunkp420/titanic

Last synced at: about 2 years ago - Stars: 0 - Forks: 0

vqphuynh/dp3-algorithm

DP3 is an algorithm for distributed and shared-memory parallel Frequent Itemset Mining.

Last synced at: over 2 years ago - Stars: 0 - Forks: 0