GitHub topics: vectorassembler
CamilaJaviera91/pyspark-first-approach
This code demonstrates how to integrate PySpark with datasets and perform simple data transformations. It either loads a sample dataset using PySpark's built-in functionality or reads data from external sources, then converts it into a PySpark DataFrame for distributed processing and manipulation.
Language: Python - Size: 2.72 MB - Last synced at: 1 day ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

arnoldchrisoduor1/LinearRegression-Model-with-ApacheSpark-and-DataBricks
Using Apache PySpark on Databricks, I performed feature engineering on customer data, then trained a linear regression model and used it to predict each customer's bill based on previous customer trends.
Language: Jupyter Notebook - Size: 3.91 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

mauryashobhit/cruise_ship_member_prediction
Predicting the number of crew members on a ship based on multiple parameters.
Language: Jupyter Notebook - Size: 22.5 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0
