An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: big-data-analytics

ydataai/ydata-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.

Language: Python - Size: 840 MB - Last synced at: 43 minutes ago - Pushed at: 4 days ago - Stars: 12,986 - Forks: 1,722

bose234/data-storage-project

This repository contains a data storage project focused on analyzing sales and returns using a real-world dataset. It features SQL-based ETL processes, data visualization with Tableau, and a comparison of relational and graph databases. 🐙📊

Language: TSQL - Size: 24.5 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

Peippo1/TrendNest

TrendNest is a modular, AI-integrated data pipeline that extracts, cleans, models, and visualizes time-based trends from data. It includes Gemini 1.5 summarisation, CSV export, and a dashboard UI. Built with Python and SQL, with BigQuery support, and fully dockerized for deployment. Intended for data engineering and analytics portfolios.

Language: Python - Size: 93.7 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

Lara33779/Alibaba-Cloud-Useful-Resources

This repository shares useful resources, updates, and tips to help you navigate the world of cloud computing with Alibaba Cloud.

Size: 388 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

clairelee-codes/bigdata-analyst-notes

Big Data Analyst Certification

Language: Jupyter Notebook - Size: 56.6 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

v6d-io/v6d

vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)

Language: C++ - Size: 19.4 MB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 902 - Forks: 124

ingef/conquery

Visual, interactive queries against big databases

Language: Java - Size: 49 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 37 - Forks: 13

HuaTanSang/VreID

Vehicle re-identification - Big data analysis final project

Language: Python - Size: 5.7 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 0 - Forks: 0

dongsuo/vue-data-board

A Data Analysis Board in Vue.

Language: Vue - Size: 10.4 MB - Last synced at: 15 days ago - Pushed at: 16 days ago - Stars: 1,329 - Forks: 291

MrXujiang/v6.dooring.public

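A big-screen data visualization solution that provides a visual editing engine, helping individuals and companies easily customize their own big-screen visualization applications.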

Language: TypeScript - Size: 36 MB - Last synced at: 9 days ago - Pushed at: 6 months ago - Stars: 659 - Forks: 152

madhurimarawat/Madhurima-Mindscape

This is a personal blog where I share a variety of content, including personal reflections, tech insights, project diaries, and creative photography. Explore different categories such as personal growth, tech insights, and project experiences.

Language: HTML - Size: 27.9 MB - Last synced at: 21 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

lithops-cloud/lithops

A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀

Language: Python - Size: 12.9 MB - Last synced at: 24 days ago - Pushed at: about 1 month ago - Stars: 334 - Forks: 114

mahmoudparsian/pyspark-tutorial

PySpark-Tutorial provides basic algorithms using PySpark

Language: Jupyter Notebook - Size: 8.96 MB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 1,223 - Forks: 478

jdvelasq/courses

Support material for courses, Facultad de Minas, Universidad Nacional de Colombia

Language: Python - Size: 470 MB - Last synced at: 29 days ago - Pushed at: 29 days ago - Stars: 17 - Forks: 7

ICT-BDA/EasyML

Easy Machine Learning is a general-purpose dataflow-based system for easing the process of applying machine learning algorithms to real world tasks.

Language: Java - Size: 14.9 MB - Last synced at: 24 days ago - Pushed at: over 1 year ago - Stars: 1,976 - Forks: 439

varshithdupati/yelp-business-analysis

Big Data analysis on Yelp reviews/businesses for Arizona. Using Hadoop, Spark, PySpark.

Language: Jupyter Notebook - Size: 686 KB - Last synced at: 16 days ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

JuanParias29/BigDataProcessingProject

This repository contains a large-scale data analysis and processing project based on the CRISP-DM methodology, focused on answering business questions in the education domain.

Language: Jupyter Notebook - Size: 4.3 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

ceredamatteo-lab/GSECA

Gene Set Enrichment Class Analysis for heterogeneous RNA sequencing data

Language: R - Size: 56.4 MB - Last synced at: about 1 month ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 1

theoliverlear/Crypto-Trader

A Spring Boot web app that buys and sells cryptocurrencies from API data sources. Its quick trading and other features allow users to leverage computer power to outperform the market.

Language: Java - Size: 34.2 MB - Last synced at: 18 days ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

mgunawardhana/learning-RStudio

R is a programming language used for statistical analysis, data visualization, and data science. It is widely used by researchers, data analysts, and scientists around the world.

Language: R - Size: 621 KB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

archivesunleashed/aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Language: Scala - Size: 39.5 MB - Last synced at: 11 days ago - Pushed at: over 1 year ago - Stars: 144 - Forks: 33

PriyankaJhaTheDeveloper/YellowTaxiNYC_HiveCaseStudy

This repository shows the Case Study of Yellow Taxi Cabs of NYC, using the Hadoop-Hive ecosystem with HiveQL.

Size: 556 KB - Last synced at: 28 days ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 3

SepidehHayati/Projects

Includes both my personal and academic projects, reports, assignments at the University of Pavia.

Language: Jupyter Notebook - Size: 30 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Irish-C/learner_dashboard

A Group Project for Big Data Analytics with Dash.

Language: Python - Size: 62 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

LatiefDataVisionary/big-data-and-data-analytics-college-task

Language: Jupyter Notebook - Size: 63.2 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

JKA098/Pokemon-Feistiness-MapReduce-Job

This project implements a Hadoop MapReduce job in pseudo-distributed mode to determine the feistiest Pokémon based on their type. The job processes the Pokémon dataset (`pokemon.csv`) and outputs a CSV file containing each Pokémon's type1, type2, name, and feistiness score.

Language: Python - Size: 220 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

JKA098/CSTADS-2021-22-Substance-Use-Analysis

The Canadian Student Tobacco, Alcohol and Drugs Survey (CSTADS) 2021–22 dataset is analyzed to explore provincial variation in youth cannabis, alcohol, and tobacco use; the impact of cannabis legalization; access networks for each substance; and regional policy implications using geospatial and network analysis.

Size: 4.37 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

adwaiy2912/BDA-Lab

Repository contains weekly lab work and assignments for the Big Data Analytics (BDA) course

Language: Python - Size: 7.8 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

vinay-ram1999/data-engineer-playground

Language: TypeScript - Size: 9.4 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

prof-rjimenez/cit_bigdata_basico

Repository for the lab classes of the basic introductory course on Big Data.

Language: Python - Size: 97.2 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 4

LatiefDataVisionary/big-data-for-data-science-college-task

Language: Mermaid - Size: 3.73 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

y0nil/kusto.blog

A technical blog about Kusto

Language: HTML - Size: 2.78 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 11 - Forks: 2

MrHAM17/Spotify_Streaming_Analytics

This is my Sem 7 BDA Lab Project. For complete details, kindly check the below README File.

Language: Jupyter Notebook - Size: 14.9 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

exajobs/data-engineering-collection

A collection of awesome software, libraries, Learning Tutorials, documents, books, resources and interesting stuff about Big Data Science & Engineering

Size: 241 KB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 1

rouyang2017/SISSO

A data-driven method combining symbolic regression and compressed sensing for accurate & interpretable models.

Language: Fortran - Size: 3.88 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 276 - Forks: 85

K-G-PRAJWAL/Big-Data-Engineering

Language: PLpgSQL - Size: 254 MB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 22 - Forks: 14

caioricciuti/ch-ui

Use CH-UI to work with data in your self-hosted ClickHouse instance through a user-friendly interface. CH-UI is a modern, feature-rich user interface for ClickHouse databases. It offers an intuitive platform for querying ClickHouse databases, executing queries, and visualizing metrics about your instance.

Language: TypeScript - Size: 24.1 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 346 - Forks: 26

trieu/leo-cdp-free-edition

The binary build of LEO CDP Free Edition for training purposes

Language: HTML - Size: 782 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 38 - Forks: 14

yaoguangluo/ChromosomeDNA

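"DNA Meta-Base Catalysis and Peptide Computing": in evolutionary computation, optimizing software function files through PDE metabolism with DNA semantic meta-base index encoding is an effective evolutionary approach.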

Language: Java - Size: 676 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 7 - Forks: 2

Yash22222/Olympic-Games-Analytics-Using-Apache-Spark

The "Olympic Games Analytics Using Apache Spark Databricks" project explores data from the Olympic Games (1896-2016) to identify trends and insights. Using Apache Spark for big data processing and Databricks for visualization, the project analyzes key factors like top-performing countries and athlete attributes, showcasing real-world analytics.

Language: HTML - Size: 18.7 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Houssam-11/BigData-Architecture

Big Data system predicts pandemic risk (COVID-19) via data analysis, ML modeling, and real-time dashboard.

Language: Python - Size: 29 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

bydevmar/Master_MASD_FPO

This GitHub repository gathers all the courses, practical work, tutorials, projects, and exercises from my master's program in applied mathematics for data science. Browse it for a complete view of my academic path and a detailed perspective on my learning in this field.

Language: Jupyter Notebook - Size: 155 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 0

KayvanShah1/Big-Data-Specialization-Coursera

Repository for the Big Data Specialization from University of California San Diego on Coursera

Language: Python - Size: 20 MB - Last synced at: 9 days ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 1

bilgeswe/BigDataManagement

Building a Data Pipeline with Lakehouse Architecture on Microsoft Azure Platform

Language: TSQL - Size: 2.02 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

SrLozano/Tinder-Big-Data-Analysis

Big Data Analysis of Tinder done at Universitat Rovira i Virgili and Universitat Politècnica de Catalunya · BarcelonaTech

Language: Jupyter Notebook - Size: 21.7 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 10 - Forks: 5

FastTrack-Academy/adolescent-suicide-dashboard

An interactive data visualization and analytics tool designed to analyze risk factors, trends, and disparities in adolescent suicide rates. Using machine learning and open data, this dashboard helps policymakers, educators, and mental health professionals identify patterns and develop prevention strategies to support adolescent well-being. 🚀

Language: HTML - Size: 11.9 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

chandrahask535/Big-Data-Analysis-to-Identify-Adverse-Effects-of-Covid-19-Vaccines2.0

This project utilizes big data analytics, machine learning, and statistical methods to identify and classify adverse effects of COVID-19 vaccinations. By analyzing large datasets, it aims to uncover patterns and correlations, providing valuable insights into vaccine safety and efficacy.

Language: Python - Size: 5.71 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

ToheedAsghar/Smart-Meters-London-Analytics

This project analyzes the Smart Meters in London dataset, performing data preprocessing, EDA, and predictive modeling to forecast energy usage and identify optimization opportunities. It demonstrates my expertise in transforming raw data into actionable insights for improving energy efficiency using AI and real-world datasets.

Language: Jupyter Notebook - Size: 2.17 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

rapticore/ssvc_ore_miner

SSVC Ore Miner - www.rapticore.com

Language: Python - Size: 433 KB - Last synced at: 21 days ago - Pushed at: 7 months ago - Stars: 9 - Forks: 1

grahman20/ADF

Adaptive Decision Forest (ADF) is an incremental machine learning framework that produces a decision forest to classify new records. ADF can classify new records even if they are associated with previously unseen classes. It can also identify and handle concept drift without forgetting previously gained knowledge. Moreover, ADF can handle big data if the data can be divided into batches.

Language: Java - Size: 1.63 MB - Last synced at: 26 days ago - Pushed at: about 2 years ago - Stars: 4 - Forks: 0

Nico-Curti/PhDthesis

PhD thesis in Applied Physics

Language: TeX - Size: 220 MB - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 8 - Forks: 0

Wittline/pyspark-on-aws-emr

The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on writing pyspark code.

Language: Python - Size: 3.61 MB - Last synced at: 2 months ago - Pushed at: about 3 years ago - Stars: 27 - Forks: 13

yuvrajsaraogi/Unemployment-Analysis-with-Python

Unemployment is measured by the unemployment rate which is the number of people who are unemployed as a percentage of the total labour force. We have seen a sharp increase in the unemployment rate during Covid-19, so analyzing the unemployment rate can be a good data science project.

Language: Jupyter Notebook - Size: 244 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

OwenOrcan/YiraBot-Crawler

YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.

Language: Python - Size: 221 KB - Last synced at: 20 days ago - Pushed at: 7 months ago - Stars: 19 - Forks: 0

JoseRuiz01/AirlineOn-TimePerformanceAnalysis

Airline on-time performance analysis using Spark Machine Learning libraries

Size: 0 Bytes - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

SohelRana-aiub-Pro/Traffic-Forecasting-Graph-Neural-Networks-LSTM

https://docs.omniverse.nvidia.com/prod_install-guide/prod_install-guide/overview.html

Language: Jupyter Notebook - Size: 1.07 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

deepaiimpactx/BARS

Language: Python - Size: 16.2 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

BasharatWali/Medicine_Rec_System

Language: Jupyter Notebook - Size: 27.3 KB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

khushi-sabarad/PsyliqIntenshipDataAnalysis

Big Data Analysis Internship. Diabetes Prediction, HR & Employee Data Analysis. Tools: SQL, Power BI and Excel

Size: 22.5 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

JosepSampe/lithops Fork of lithops-cloud/lithops

A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀

Language: Python - Size: 12.9 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

Advaitiyer/advaitiyer.github.io

Data Scientist's Portfolio covering the topics: Big Data Analytics, Information Visualization, Advanced Data Mining, Applied Data Analytics, Financial, and Marketing Analytics, Artificial Intelligence, and Deep Learning.

Language: HTML - Size: 53.2 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

ellie991/Titanic-Dataset-Analysis

Big Data Analysis on Titanic Dataset

Language: R - Size: 190 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

ellie991/Spark-Spotify-Analysys

SPOTIFY - Big Data Analysis w/ Spark

Language: Python - Size: 11 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

GMAP/DSPBench

A suite of benchmark applications for distributed data stream processing systems

Language: Java - Size: 250 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 28 - Forks: 3

Matt-J-Dong/Top-Towns-To-Take-Over-Tech

Which American cities are the best for tech jobs?

Language: Scala - Size: 12.8 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 1

Kaustubh-Indulkar/TE-IT-DSBDA-Assignmnets

This repository contains the solutions for a series of assignments covering Data Science And Big Data Analytics concepts.

Language: Jupyter Notebook - Size: 9.71 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

lesleyzhao/Bus_Delays_Analysis

Bus Delays Analysis is a big data analytics project designed to do ETL and analyze bus delays using Scala, Apache Spark, and HDFS.

Language: Scala - Size: 12.7 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

msche81/2-Jedha_Fullstack

450h Data Scientist training - Collect and store large amounts of data - Build prediction models in Machine Learning and Deep Learning - Deploy your models in real conditions

Language: Jupyter Notebook - Size: 248 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

subhanjandas/RDBMS-to-GraphDB---Big-Data-Analytics-using-Neo4j

This project involves migration from a traditional RDBMS to Neo4j for big data analytics. Using graph database technology, various business-critical questions are addressed, including identifying the employees who sold Tofu, the products sold with Tofu, the total number of products, top 5 products by sales, and the category with the highest sales.

Language: JavaScript - Size: 668 KB - Last synced at: 3 days ago - Pushed at: 11 months ago - Stars: 1 - Forks: 1

tashi-2004/Apache-Hadoop-Spark-Hive-CyberAnalytics

This project utilizes Apache Hadoop, Hive, and PySpark to process and analyze the UNSW-NB15 dataset, enabling advanced query analysis, machine learning modeling, and visualization. The project demonstrates efficient data ingestion, processing, and predictive analytics for network security insights.

Language: Jupyter Notebook - Size: 2.62 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

OuchenOussama/hespressence

Kappa Architecture Based Sentiment Analysis System for User Comments

Language: Python - Size: 10.8 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 1

mohamedsaleh1984/twitter-spark

Fetch data from Twitter and push it through Kafka to Spark then HDFS

Language: Java - Size: 7.82 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

Dare-marvel/Big-Data-Analytics--BDA--

💾 Welcome to the Big Data Analytics Repository! 📚✨ Immerse yourself in a carefully curated reservoir of knowledge on Big Data Analytics. 🌐💡 Explore the intricacies of deriving insights from vast datasets and navigating powerful analytics tools. 🚀🔍

Language: Java - Size: 174 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 2

shivatejapecheti/Twitter-Live-Feed-Analysis-and-Streaming-for-Movies

Bigdata Analysis Project

Language: Jupyter Notebook - Size: 165 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

AbdullahKhurshid/ecommerce-marketing-analytics

Using Apache Spark for marketing analytics

Language: R - Size: 2.3 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

madhurimarawat/Big-Data-Analytics

This repository demonstrates big data processing, visualization, and machine learning using tools such as Hadoop, Spark, Kafka, and Python.

Language: Jupyter Notebook - Size: 10.7 MB - Last synced at: 4 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 1

sxhixho/Preprocessing_Analysis

A project that demonstrates data storage, preprocessing, and analysis using tools like HDFS, Apache Pig, and Hive, executed in an Azure virtual machine environment. The project includes cleaning and aggregating a Spotify dataset and running Hive queries to extract meaningful insights.

Size: 4.24 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

notnayan/WLV_HCK

You're welcome.

Language: Jupyter Notebook - Size: 99.4 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

wandersonlira/state-of-data-brazil-2023

This repository hosts the academic project for the course Tópicos de Big Data em Python (Topics in Big Data with Python). The project analyzes data from the annual "State of Data Brazil" survey, conducted by the Data Hackers community in partnership with Bain & Company.

Language: Jupyter Notebook - Size: 17.1 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 3

Amey-Thakur/BIG-DATA-ANALYTICS-AND-COMPUTATIONAL-LAB-I

CSDLO7032: Big Data Analytics & CSL704: Computational Lab - I <Semester VII>

Language: Jupyter Notebook - Size: 183 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 2

metatron-app/metatron-discovery

Powerful & Easy way for big data discovery

Language: TypeScript - Size: 93.3 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 444 - Forks: 111

Radhikareddy-chintareddy/Big-Data-Analysis-NY-Weather-Air-Quality-2022

End-to-end workflow showcasing database setup, API development, and interactive data retrieval of large datasets. Includes integration and analysis of 2022 SURFACE HOURLY weather data (global, US, and NY) merged with NY air pollution data from the EPA to uncover actionable insights.

Language: Jupyter Notebook - Size: 3.47 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

Radhikareddy-chintareddy/Big-Data-Insights-NYC-Taxi-Trips-2013-

A project showcasing memory-efficient big data processing using Python, focusing on scalable data handling to overcome memory constraints. Includes anomaly detection, efficient visualizations, and actionable insights from the 2013 NYC Taxi Trip dataset.

Language: Jupyter Notebook - Size: 2.49 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

haustcsa/SocialSituSecu

SocialSituSecu is a project exploring social network security, computing, and intelligence based on social situational metadata. It is sponsored by the National Natural Science Foundation of China (Grant No. 61972133) and the Project of Leading Talents in Science and Technology Innovation for Thousands of People Plan in Henan Province (Grant No. 204200510021).

Language: Python - Size: 87.9 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 5 - Forks: 1

jaanli/american-community-survey

American Community Survey data on people and households

Language: Jupyter Notebook - Size: 142 MB - Last synced at: 13 days ago - Pushed at: 7 months ago - Stars: 19 - Forks: 1

sanketrs/implementation-of-modern-data-engineering-architecture-with-fabric_analytics

Building a next-generation hybrid data pipeline architecture that combines the power of Microsoft Fabric, Azure Cloud, and Power BI. This pipeline is engineered to tackle the challenges of real-time data ingestion, multi-layered processing, and analytics, delivering business-critical insights.

Language: Python - Size: 32.2 KB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

AAlkiyumi/Predicting-Hospital-Readmission-Risk

This project aims to create a predictive model that forecasts the likelihood of a patient being readmitted to the hospital within 30 days of discharge.

Language: Jupyter Notebook - Size: 13.2 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

tabletop-labs/tabletop

A curated selection of tools, libraries and services that help tame your dataflow to productively build ambitious, data driven & reactive applications on a streaming lakehouse

Language: Go - Size: 290 KB - Last synced at: 2 days ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 0

bibekbhatta/BusinessAnalytics

Anyone (including beginners) can use these resources to get started with accessing, cleaning, and analysing different kinds of data in Python. No installation required. No registration required.

Language: Jupyter Notebook - Size: 84.1 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

bryanfks-dev/Klempoken-Analysis

Analysis and forecasting model for Klempoken MSMEs

Language: Jupyter Notebook - Size: 6.19 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

mehwishferoz/BDA-project

A Hadoop MapReduce project analyzing the Consumer Complaints dataset with five queries to extract insights like complaints by product, state, company, tags, and timely responses.

Language: Java - Size: 7.42 MB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-K-MEANS-CLUSTERING

Big Data Analytics [BDA] Mini Project

Language: Jupyter Notebook - Size: 2.55 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 1

waseemsalami/project-Big-Data-in-behavioral-science-

An exciting Big Data project done during a course I took at the Technion university

Language: HTML - Size: 31.8 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

madhurimarawat/Python-Projects

This repository contains the projects that I made in the Python programming language.

Language: Jupyter Notebook - Size: 17.6 MB - Last synced at: 4 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

MSUSAzureAccelerators/Workplace-Intelligence-Accelerator

The Workplace Intelligence Accelerator leverages machine learning and big data analytics to combine and transform data, allowing customers to easily identify factors that influence how people work in their organization.

Language: TSQL - Size: 22.3 MB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 6 - Forks: 3

chaaalistaa/Thelookecommerce---Project

Analysis of "TheLook" eCommerce data, with goals such as identifying sales trends, understanding customer behaviors, enhancing customer retention, and driving repeat purchases.

Language: Jupyter Notebook - Size: 18.6 KB - Last synced at: 4 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

Ashish7129/Graph_Sampling

Graph Sampling is a Python package containing various approaches that sample the original graph according to different sample sizes.

Language: Python - Size: 4.91 MB - Last synced at: 7 months ago - Pushed at: over 4 years ago - Stars: 161 - Forks: 50

BhushanSagar/Telecom-Data-Analysis

Telecom Data Analysis with Apache Hive

Language: HiveQL - Size: 357 KB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

Ren294/Covid-Data-Process

This project integrates real-time data processing and analytics using Apache NiFi, Kafka, Spark, Hive, and AWS services for comprehensive COVID-19 data insights.

Language: Shell - Size: 6.22 MB - Last synced at: 2 months ago - Pushed at: 9 months ago - Stars: 6 - Forks: 0

hatoonguls/Big-Data-Analytics

The repository contains big data analytics projects using Apache Spark, SQL, and machine learning models.

Language: Python - Size: 197 KB - Last synced at: 4 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0