GitHub topics: sqoop-import

Repositories

abhilash-1/pyspark-project

This is our first project using Apache Spark. We downloaded loan, customer credit card, and transaction datasets from Kaggle and cleaned the data. Then, using tools such as Spark, HDFS, Hive, and others, we ran new use cases on those datasets. Apache Spark is a framework that can quickly process very large datasets.

Language: Jupyter Notebook - Size: 1.87 MB - Last synced at: about 2 months ago - Pushed at: over 3 years ago - Stars: 17 - Forks: 13

san089/Cloudera_Material

Cloudera_Material: Study material to help people prepare for the Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collaborate.

Size: 9.02 MB - Last synced at: 2 months ago - Pushed at: about 5 years ago - Stars: 37 - Forks: 30

ANKIT21111/SparNordETL

ETL pipeline for Spar Nord Bank to analyze the refilling frequency of ATMs across Europe

Language: Jupyter Notebook - Size: 4.59 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

bluishglc/bdp

A prototype big data platform project, containing the source code for the book Big Data Platform Architecture and Prototype

Language: Java - Size: 403 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 184 - Forks: 135

AyanChatterjee20/Sqoop

Size: 5.86 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

kinszee/MySQL-Hive-PowerBI-Pipeline

Built a data pipeline by creating tables in a MySQL database, ingesting them into Hadoop for data warehousing, and building HiveQL views. The Hive views on a Linux VM were connected to the Power BI application on Windows to create visualizations.

Size: 2.17 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Subham2S/BigData-Engineering-Capstone-Project-1

Big Data Engineering Capstone Project with tech stack: Linux, MySQL, Sqoop, HDFS, Hive, Impala, SparkSQL, SparkML, Git

Language: Python - Size: 15.2 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 6 - Forks: 0

alexandrustoica/pipeoop 📦

Real-Time & Batch Data Processing Pipeline

Language: Python - Size: 679 KB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

anaghazachariah/sqoop-installation-ubuntu

Size: 22.5 KB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

PannagaS/Banking-Query

A query system for a hypothetical bank scenario

Language: HiveQL - Size: 31.3 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

marco-gallegos/sqoopit (fork of lucafon/pysqoop)

A Python package that lets you use Sqoop to import data from an RDBMS into HDFS, Hive, or HBase

Language: Python - Size: 1.7 MB - Last synced at: 5 months ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

Related Keywords

sqoop-import (11), sqoop (6), hadoop (5), hive (5), spark (5), sqoop-export (4), big-data (3), pyspark (3), sql (3), hdfs (3), mysql (2), bigdata (2), spark-streaming (2), spark-sql (2), python (2), hiveql (2), hadoop-hdfs (2), kafka (2), mysql-database (2), powerbi (1), sub-queries (1), views (1), power-query (1), odbc-driver (1), etl-pipeline (1), data-warehousing (1), data-transformation (1), data-ingestion (1), databricks (1), git (1), hdfs-dfs (1), impala (1), linux-shell (1), sparkml (1), cassandra (1), mqtt (1), university-project (1), zeppelin-notebook (1), hql (1), hbase (1), py (1), python3 (1), dataframes (1), github (1), jupyter-notebook (1), vscode (1), cca (1), cca175 (1), certification (1), cloudera (1), flume (1), hive-metastore (1), sqoop-session (1), amazon-redshift (1), demo (1), middle-end (1), middle-office (1), oozie (1), prototype (1), quickstart (1), redis (1), spark-demo (1), spark-examples (1), spark-streaming-examples (1), sparksql (1), dashboards (1), data-analysis (1)
