Topic: "sqoop"
heibaiying/BigData-Notes
Getting Started Guide to Big Data :star:
Language: Java - Size: 22.9 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 16,422 - Forks: 4,279

apache/sqoop 📦
Mirror of Apache Sqoop
Language: Java - Size: 17.9 MB - Last synced at: 4 days ago - Pushed at: about 4 years ago - Stars: 982 - Forks: 583

sunnyandgood/BigData
💎🔥 Big data study notes
Language: Java - Size: 316 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 647 - Forks: 222

WeBankFinTech/Exchangis
Exchangis is a lightweight, highly extensible data exchange platform that supports data transmission between structured and unstructured heterogeneous data sources
Language: Java - Size: 41.3 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 447 - Forks: 207

bluishglc/bdp
A prototype project of a big data platform; the source code for the book Big Data Platform Architecture and Prototype
Language: Java - Size: 403 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 184 - Forks: 135

aliyun/aliyun-maxcompute-data-collectors
Language: Java - Size: 93.7 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 127 - Forks: 64

v5tech/cloud
Cloud computing: environment setup and configuration files for hadoop, hive, hue, oozie, sqoop, hbase, and zookeeper
Language: Shell - Size: 31.7 MB - Last synced at: about 2 months ago - Pushed at: about 8 years ago - Stars: 55 - Forks: 43

dimajix/spark-training
Repository used for Spark Trainings
Language: Jupyter Notebook - Size: 9 MB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 53 - Forks: 66

Cigna/ibis
IBIS is a workflow creation-engine that abstracts the Hadoop internals of ingesting RDBMS data.
Language: Python - Size: 749 KB - Last synced at: 7 months ago - Pushed at: about 3 years ago - Stars: 51 - Forks: 15

vivek2319/Learn-Hadoop-and-Spark
This repository focuses on gathering a curated list of resources to learn Hadoop for FREE.
Language: Python - Size: 211 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 46 - Forks: 39

mrugankray/Big-Data-Cluster
The goal of this project is to build a docker cluster that gives access to Hadoop, HDFS, Hive, PySpark, Sqoop, Airflow, Kafka, Flume, Postgres, Cassandra, Hue, Zeppelin, Kadmin, Kafka Control Center and pgAdmin. This cluster is solely intended for usage in a development environment. Do not use it to run any production workloads.
Language: Shell - Size: 118 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 41 - Forks: 15

san089/Cloudera_Material
Cloudera_Material: Study Material to help people preparing for Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collaborate.
Size: 9.02 MB - Last synced at: 3 months ago - Pushed at: about 5 years ago - Stars: 37 - Forks: 30

Powerspace/pg2bq
Export PostgreSQL tables to Google BigQuery
Language: Scala - Size: 640 KB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 36 - Forks: 11

maniram-yadav/Big_DataHadoop_Projects
Big data projects implemented by Maniram yadav
Language: PigLatin - Size: 2.79 MB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 33 - Forks: 33

peiliping/meepo 📦
Data migration across heterogeneous storage
Language: Java - Size: 986 KB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 30 - Forks: 22

Mrkuhuo/bigdata_learning
Learning code for big data components
Language: Java - Size: 36.7 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 22 - Forks: 7

zenoyang/web-click-flow
Offline log analysis of website clickstreams
Language: Java - Size: 2.98 MB - Last synced at: 2 months ago - Pushed at: almost 7 years ago - Stars: 19 - Forks: 11

tejasjbansal/HELTHCARE-SYSTEM
Data cleaning, pre-processing, and analytics on health care data using Spark and Python.
Language: Jupyter Notebook - Size: 3 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 17 - Forks: 14

ven2day/Bigdata-docker-sandbox
Docker Big Data Tools: This docker-compose file is configured to run multiple nodes. This is a Hadoop cluster that contains the necessary tools that can be used in the Big Data domain. It's a collection of docker containers that you can use directly.
Language: VBA - Size: 79.7 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 17 - Forks: 4

Jayvardhan-Reddy/BigData-Ecosystem-Architecture
Life-cycle: Internal working of HDFS, SQOOP, HIVE, SPARK, HBASE, KAFKA with code.
Language: Shell - Size: 562 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 13 - Forks: 16

conch-stack/conch-bigdata
Big Data
Language: HTML - Size: 14.2 MB - Last synced at: over 1 year ago - Pushed at: about 2 years ago - Stars: 10 - Forks: 3

Stefen-Taime/ETL-Data-Pipeline-RDBMS-TO-HDFS-using-Airflow-Apache-Sqoop-Spark-Postgres-and-Hive
This project aims to move the data from a Relational database system (RDBMS) to a Hadoop file system (HDFS)
Language: Python - Size: 17.7 MB - Last synced at: 3 months ago - Pushed at: about 3 years ago - Stars: 10 - Forks: 4

Sathiyarajan/big-data-pipeline
Big Data
Language: Java - Size: 705 MB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 9 - Forks: 6

Pathairush/rdbms_to_hdfs_data_pipeline
A data pipeline moving data from a Relational database system (RDBMS) to a Hadoop file system (HDFS).
Language: Python - Size: 46.9 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 8 - Forks: 1

milindjagre/HDPCD
This repository contains all the documents related to HDPCD certification.
Language: PigLatin - Size: 42 KB - Last synced at: over 2 years ago - Pushed at: almost 8 years ago - Stars: 8 - Forks: 10

lovnishverma/bigdataecosystem
Complete Big Data Ecosystem on Docker Desktop
Language: Shell - Size: 405 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 7 - Forks: 1

Pathairush/airflow_hive_spark_sqoop
A docker using the airflow with Hadoop ecosystem (hive, spark, and sqoop)
Language: Shell - Size: 24.4 KB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 7 - Forks: 4

MehdiTAZI/BigData-Platform
End to end big data project, that aims to show how to implement different big data layers, from the infrastructure layer to the end user one. [HADOOP][Spark][Kafka][Cassandra][Ansible][Jupyter][Docker]
Language: Jupyter Notebook - Size: 85 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 6

bhvikp/sqoop-spark-hive
MYSQL | SQOOP | SPARK | HIVE workflow
Language: Scala - Size: 33.2 KB - Last synced at: 8 days ago - Pushed at: almost 7 years ago - Stars: 6 - Forks: 8

TritonDataCenter/hadoop-manta
Hadoop Filesystem Driver for Manta
Language: Java - Size: 172 KB - Last synced at: 13 days ago - Pushed at: over 7 years ago - Stars: 6 - Forks: 6

Madadata/hasoop
Hasoop - Node.js client for Sqoop 2
Language: JavaScript - Size: 171 KB - Last synced at: 28 days ago - Pushed at: over 8 years ago - Stars: 6 - Forks: 1

ghosh17/Predictive-Analysis
Predictive Analysis using Big Data platforms and Machine Learning Libraries
Language: Shell - Size: 410 KB - Last synced at: over 2 years ago - Pushed at: almost 9 years ago - Stars: 6 - Forks: 1

melwinmpk/PizzaOrders_DataPipeline
Pizza Orders Data Pipeline Usecase Solved by SQL, Sqoop, HDFS, Hive, Airflow.
Language: Python - Size: 603 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 5 - Forks: 2

NickkBright/Spark-SqoopCDC
Change data capture realization using Spark and Sqoop
Language: Scala - Size: 17.6 KB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 5 - Forks: 2

jazzwang/hive_labs
Hive, Sqoop related labs
Language: Shell - Size: 3 MB - Last synced at: over 2 years ago - Pushed at: almost 7 years ago - Stars: 5 - Forks: 1

SaravananJaichandar/Big-Data
A Hadoop repository to portray the use-cases of different hadoop components with real-time projects and their workings explained in detail.
Size: 13 MB - Last synced at: 11 months ago - Pushed at: over 7 years ago - Stars: 5 - Forks: 4

VictoriaGomesDS/Intro_Ecossistema_Hadoop
Size: 233 KB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

MarceloJSSantos/acelereracao-global-dev-4-everis-dio
Repository created to store notes and activities carried out during training on the Digital Innovation One (DIO) platform for the Data Engineer selection process at Everis.
Language: Jupyter Notebook - Size: 57.3 MB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 4 - Forks: 2

rktrojan/BigDataHadoop
BigData/Hadoop related codebase including Sqoop, Hive/HQL, Spark, Flink
Language: Jupyter Notebook - Size: 1.74 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

thedatasociety/lab-hadoop
Language: PLpgSQL - Size: 4.6 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 7

avatime/gamul-gamul
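🏆Gamul-Gamul: a web app service providing price-index-based food ingredient price information using distributed big data processing - 🥇SSAFY 7th cohort specialization project, Excellence Award, 1st place (2022.10.07)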
Language: Java - Size: 190 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

Jayvardhan-Reddy/BigData_Concepts
The various underlying processes that take place for each concept of the Big Data ecosystem.
Language: Shell - Size: 10.7 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 5

lazy-apple/BigData_Long
Web crawler + big data project
Language: Java - Size: 183 KB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 0

lazy-apple/BigData
Big data e-commerce project
Language: Java - Size: 205 KB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 0

danielsqtang/Data-Ingestion-Shellscript
Scripts for ETL
Language: Shell - Size: 49.8 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 1

shinde-chandrakant/BANK-ATM-ETL
Spar Nord, a Danish bank, optimizes ATM refill frequency by observing withdrawal behavior and dependent factors. The project builds a batch ETL pipeline to load the data into Redshift Data Mart for analytical queries.
Language: Jupyter Notebook - Size: 2.07 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 1

terodea/CS-BigData
Learn Big Data tools/frameworks by doing examples, POCs, and pet projects.
Language: Python - Size: 15.7 MB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 3

melwinmpk/UserTransactions_DataPipeline
User Transactions Data Pipeline Usecase Solved by SQL, Sqoop, HDFS, Hive. Implements Slowly Changing Dimensions (SCD) Type 1.
Language: Shell - Size: 5.49 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

melwinmpk/Userreview_Data_Pipeline_Sqoop_HDFS
Solving the Restaurant User Review Data Pipeline Scenarios using Shellscript, Python, Sqoop, HDFS
Language: Python - Size: 3.4 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

kasipavankumar/sqoop-docker
Apache Sqoop using Docker. 🐳
Language: Dockerfile - Size: 20.5 KB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 0

alvarofpp/4linux-hadoop 📦
Scripts used during the Big Data Analytics com Hadoop course offered by 4Linux
Language: PigLatin - Size: 1.03 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 0

sujeethshetty/big-data
Projects, assignments & research related to the Hadoop Ecosystem
Language: Jupyter Notebook - Size: 16.7 MB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 1

siddharth271101/Stock-Exchange-Analysis
Created a data pipeline using sqoop to ingest data from sql server into the hive table and used hive for feature engineering and analysis.
Language: Shell - Size: 14.5 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 1

NikhilURao/H1B_VisaProject
This repository contains the H1B_Visa Applicants Data Analysis project/case study using Hadoop undertaken during the training at NIIT. MapReduce, Hive, Pig, Sqoop, and shell scripting are the technologies used.
Language: Shell - Size: 729 KB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 2 - Forks: 5

aj-22/incremental
Incremental updates in HIVE via CLI and HUE
Language: TSQL - Size: 167 KB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 2 - Forks: 5

Niranjankumar-c/DataAnalytics_using_ClickstreamData
Case study completed as part of Big Data training from AnalytixLabs
Size: 12.6 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 2

vishal2232/Project_1-Spark-using-Scala-API-
Problem statement: get the revenue and number of orders from order_items on a daily basis.
Size: 1.67 MB - Last synced at: about 2 years ago - Pushed at: over 8 years ago - Stars: 2 - Forks: 0

VladimirZelenokor1/Big-Data-Project---Predicting-Trip-Fares-with-Spark-Hive
A CRISP-DM–based big data pipeline for predicting NYC ride-sharing trip fares: ingesting 2024 TLC data via Sqoop into HDFS/Hive, performing ETL and feature engineering with Spark & PySpark, training and tuning Linear Regression & Gradient Boosted Tree models, and outlining end-to-end deployment.
Language: Java - Size: 906 KB - Last synced at: 18 days ago - Pushed at: 27 days ago - Stars: 1 - Forks: 0

DadaNanjesha/Redshift-ETL-Project
The project covers the complete data pipeline—from importing data from an RDS source to HDFS using Sqoop, processing data with Spark, to executing analytical queries on an AWS Redshift cluster.
Language: Jupyter Notebook - Size: 833 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

thdaraujo/cheat
A handful of cheatsheets and programming tips.
Size: 105 KB - Last synced at: 5 days ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Heisenberghj7/Retail-Store-BigData
📊 📑 This project provides step-by-step big data analytics applied to the retail industry using a variety of big data technologies, such as HDFS, Hive, and Spark.
Language: Python - Size: 2.11 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

EddieAmaitum/Spar-Nord-Bank-ETL-and-BI-Project
In this project I build a batch ETL pipeline to read transactional data from Amazon RDS, transform it to a usable format and then load it into an Amazon S3 bucket. The data is then loaded into Redshift Tables, after which I perform analytical queries on the loaded data to gain insights.
Language: Jupyter Notebook - Size: 2.72 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

EddieAmaitum/NYC-Yellow-Taxi-DataOps-with-AWS-Analyzing-TLC-Datasets
Performed business operations using Big Data technologies: AWS EMR, AWS RDS (MySQL), Hadoop, Apache Sqoop, Apache HBase, MapReduce
Language: Python - Size: 5.63 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

sarathchandrikak/ETL-Bank-Transcation
Data Analysis of bank transaction data
Language: Jupyter Notebook - Size: 9.34 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

naimazizi/hive-export
Combines Apache Spark and Sqoop to extract data from a Hive table into a relational database, integrated into a pipeline using Luigi.
Language: Python - Size: 11.7 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

ahpuchend/agdata
Agricultural product data analysis system
Language: JavaScript - Size: 9.4 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

Psingh12354/Internship_Notes_Cog
Language: Scala - Size: 737 KB - Last synced at: 4 months ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

melwinmpk/SCD_in_Warehouse
Hive project: understand the various types of SCDs and implement these slowly changing dimensions in Hadoop Hive and Spark.
Language: HiveQL - Size: 22.7 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 4

Linho1150/BIGDATA_PROGRAMMING_PROJECT
Analyzed traffic flow around the university using bus arrival times at the bus stop at Myongji University. Uses web crawling (Python), Hadoop, HDFS, Sqoop, Hive, Zeppelin, and Amazon RDS (MySQL).
Language: Python - Size: 11.9 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

anaghazachariah/sqoop-installation-ubuntu
Size: 22.5 KB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

RaghuKantamsetti/Hadoop-Use-Case-on-Healthcare
This repository is about processing and storing healthcare data using the Big Data tool Hadoop and its components.
Language: Shell - Size: 42 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 1

AnkitaSinha98/Customer360-Data-Analysis
Stores and analyzes big data on various customers using Hadoop and other tools such as Hive, ZooKeeper, HBase, and Sqoop. All customer details are analyzed and the results are reported; these results are very useful for companies.
Size: 292 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 1

alokjani/bigdata-vagrant-devlab
Hadoop Software Development sandbox
Language: Shell - Size: 206 KB - Last synced at: 3 months ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 1

Sailendra-R-D/Prep-Resource-CCA175
A quick lookup for CCA-175 certification
Size: 27.3 MB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 2

amittian/BANK-BIG-DATA-ANALYSIS-USING-HADOOP
Banking Data Analysis Using SQL ,SQOOP, HIVE, HADOOP, TABLEAU, R, UNIX
Size: 13.7 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

f2e-awesome/HadoopEcosystem
Hadoop ecosystem
Language: JavaScript - Size: 3.91 KB - Last synced at: over 2 years ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 1

Niranjankumar-c/SqoopCaseStudy-ALabs
Sqoop case studies done during AnalytixLabs Big Data classes
Size: 13.7 KB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 2

anirudhgupta22/Microsoft-Azure-HDInsight
Short documentation on Microsoft's Azure HDInsight
Size: 2.02 MB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

ahmedmohamedfoua/Big-Data-Project---Predicting-Trip-Fares-with-Spark-Hive
This repository provides a complete workflow for predicting ride-sharing trip fares in New York City using Spark and Hive. Explore the data, models, and results while leveraging the power of big data! 🐙🚀
Language: Java - Size: 906 KB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

Fradhyle/Voo-ong
The Joeun Computer Academy, Big Data 10th cohort, final team project
Language: HiveQL - Size: 57.7 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

gilberto-009199/bigdata
BigData workspaces:
Language: Java - Size: 60.4 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

Zain970/ETL-Data-Pipline-Using-Apache-Airflow
Utilize Sqoop to import data from relational databases and ingest files from S3 buckets into HDFS. Apply complex transformations using Apache Spark to prepare data for analysis and reporting. Create and manage Hive tables for structured data storage and query optimization. Load processed data into HBase, making it accessible for various teams and app
Language: Python - Size: 3.91 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

tejaswirupa/Big-Data-Systems-Project-Hadoop-Hive-MapReduce-Sqoop-Workflows
Designed and implemented scalable data workflows using Hadoop, Hive, and Sqoop. This project involved log aggregation, airline delay analysis, word frequency processing, and TF-IDF computation across multiple datasets using MapReduce, Hive queries, and Hadoop Streaming.
Size: 3.75 MB - Last synced at: 2 days ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

raja9283/HadoopSCD
A data pipeline on GCP Dataproc using Sqoop, HDFS, Hive, and PySpark to implement SCD Type 2 for an e-commerce use case. Tracks customer and product changes (e.g., address, price) and their impact on sales, demonstrating scalable data warehousing and processing.
Language: Python - Size: 12.7 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

ANKIT21111/Patient-Alert-ETL
The Patient Alert ETL 🚑 project creates a real-time data pipeline to monitor vital health parameters from IoT devices in hospitals. Using Apache Kafka, Spark, and HBase, it processes streaming data and sends immediate alerts via Amazon SNS when vitals exceed normal thresholds, enhancing patient care through timely interventions.
Language: Python - Size: 5.47 MB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

ccao-data/service-sqoop-iasworld 📦
Service to continually import iasWorld backend data to Parquet using Apache Sqoop
Language: Shell - Size: 407 KB - Last synced at: about 2 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

offthetab/VKAPI-ML-DataHarvester
Pipeline to harvest data via VK API for ML analysis with hadoop and spark
Language: Jupyter Notebook - Size: 6.69 MB - Last synced at: 9 days ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

sebastianruizm/CCA175-Exam-Preparation
Backup of my preparation for Cloudera's CCA175 exam
Language: Python - Size: 42 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

shinde-chandrakant/online-advertising-platform
Online Advertising Platform - a comprehensive big data project
Language: Python - Size: 7.73 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

shivananda199/hive-analytics-in-aws-for-e-commerce
A project to create a Hive data warehouse for E-commerce in AWS and perform data analysis.
Language: HiveQL - Size: 344 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

shinde-chandrakant/BigData-Ops-on-TLC-Yellow-Taxi
Analysed New York City's Yellow taxi data set with Big Data tools such as Hadoop, HBase, Sqoop, MapReduce and AWS Cloud Infrastructure.
Language: Python - Size: 7.19 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

faryar251/ecomsales-and-walmartstock-analysis
Performed end-to-end big data analysis on E-Commerce Sales & Walmart Stock data, extracting valuable insights for impactful reporting.
Size: 5.79 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

ashok-amsamani/HIVE-SQOOP-Integration
Lists the steps to move data from MySQL to Hive using Sqoop, and from Hive to MySQL using Sqoop.
Size: 216 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

fernandadiasm/study
Repository created to collect exercises and notes on technologies and languages.
Language: Shell - Size: 4.01 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

aalexren/iu-bigdata-project
[Innopolis University] Big Data Course 2023. Final Project.
Language: HiveQL - Size: 118 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

sablokgaurav/data_engineering
java_codes
Language: Java - Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Mehak0310/Real_time_Health_Alert_Notification
Propose a reliable data pipeline solution to capture a high-velocity stream of patient vitals such as body temperature, heartbeat, and blood pressure (BP) coming from IoT devices and send an instant email notification in case of abnormal vitals.
Language: Python - Size: 3.23 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Mehak0310/ATM-data-ETL-Pipeline-Sqoop-Pyspark-Redshift
Build a batch ETL pipeline to read transactional data from RDS, transform it, and load it into target dimensions and facts on a Redshift Data Mart (schema), after which some analytical queries are performed on the loaded data.
Language: Jupyter Notebook - Size: 1.98 MB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

ZGG2016/sqoop
Size: 5.86 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

abhij215/dataEngineeringNotes
Contains notes on various topics related to data engineering.
Size: 4.88 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0
