An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-exploration-and-preprocessing

AI-Northstar-Tech/vector-io

Comprehensive vector data tooling. The universal interface for all vector databases, datasets, and RAG platforms. Easily export, import, back up, re-embed (using any model), or access your vector data from any vector database or repository.

Language: Jupyter Notebook - Size: 4.4 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 243 - Forks: 28

SayamAlt/Company-Bankruptcy-Prediction

A machine learning model that predicts whether a firm will go bankrupt, based on features such as net value growth rate, borrowing dependency, and cash/total assets.

Language: Jupyter Notebook - Size: 18.4 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 0

SayamAlt/Symptoms-Disease-Text-Classification

A fine-tuned BERT transformer model that classifies symptoms into their corresponding diseases with an accuracy of up to 89%.

Language: Jupyter Notebook - Size: 860 KB - Last synced at: 21 days ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

whoisalisha/BOM-Business-Analyst-Technical-Case-Study-Assignment

BOM (Bill of Materials) Business Analyst case study solution using Python, pandas data manipulation, and visualization techniques.

Size: 1.95 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

SayamAlt/Financial-News-Sentiment-Analysis

A fine-tuned DistilBERT transformer model that predicts the overall sentiment of a piece of financial news with an accuracy of nearly 81.5%.

Language: Jupyter Notebook - Size: 745 KB - Last synced at: 21 days ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

Aniket2021448/Movie-recommender-system

A machine learning project implemented from scratch, involving web scraping, data engineering, exploratory data analysis, NLP, and ML, to build a content-based movie recommender system.

Language: HTML - Size: 383 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Andersoncrs/Analisis_Exploratorio_De_Datos-EDA-_Rendimiento_Estudiantil

This exploratory data analysis (EDA) of a student performance dataset aims to identify and understand the factors that influence students' academic performance. Through data cleaning, transformation, and visualization, it seeks to uncover meaningful patterns and relationships.

Language: Jupyter Notebook - Size: 698 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

srosalino/Prediction_of_Seoul_Bikes_Demand

The objective of this project is to predict the number of bicycles that need to be made available each hour so that the service runs as efficiently as possible.

Language: Jupyter Notebook - Size: 11.2 MB - Last synced at: 3 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

AngelX62/Data-Science-Job-Clean

Data was downloaded from Kaggle.

Language: Jupyter Notebook - Size: 4.43 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

nafisalawalidris/Employee-Attrition-Control

The Employee Attrition Control project uses data analysis and predictive modeling to understand and address employee turnover. It provides insights and recommendations to reduce attrition and improve employee satisfaction and retention.

Language: Jupyter Notebook - Size: 4.04 MB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 2 - Forks: 0

SayamAlt/Global-News-Headlines-Text-Summarization

A text summarization model built with Seq2Seq modeling and Luong attention, which produces concise summaries of global news headlines.

Language: Jupyter Notebook - Size: 513 KB - Last synced at: 21 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

venkat-a/Happiness-Prediction

Prediction of happy customers based on happiness survey data.

Language: Jupyter Notebook - Size: 1.16 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

saikaryekar/PySpark-Plane-Dataset-Exploration

Explored a dataset of planes while learning PySpark commands.

Language: Jupyter Notebook - Size: 24.4 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

SaiSurajMatta/Covid-19-Data-Exploration-Project

An SQL-based exploration of COVID-19 data and vaccination progress using the Covid-Deaths dataset for insights into global pandemic trends.

Size: 19.7 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

SayamAlt/Credit-Card-Approval-Prediction

A machine learning model that predicts, with up to 100% accuracy, whether a given applicant's credit card application will be approved, based on demographic features such as applicant age, total income, marital status, and total years of work experience.

Language: Jupyter Notebook - Size: 15.8 MB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

SayamAlt/Taxi-Trip-Fare-Prediction

A machine learning model that predicts the fare of a taxi trip based on features such as trip duration and tip amount.

Language: Jupyter Notebook - Size: 29.8 MB - Last synced at: 21 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

ImAnitaYadav07/FIFA24_line_up_analysis

This project analyzes EA FIFA 23 data to predict current and future potential for team formations and to list possible FIFA 24 team lineups.

Language: Jupyter Notebook - Size: 1.38 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

SayamAlt/Employee-Attrition-Prediction

A machine learning model that predicts whether an employee of a given company will leave in the near future, based on employee details and employment metrics.

Language: Jupyter Notebook - Size: 9.54 MB - Last synced at: 21 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Related Keywords
data-exploration-and-preprocessing (18), machine-learning (7), data-visualization (7), model-training-and-evaluation (7), exploratory-data-analysis (5), cross-validation (5), model-deployment (4), hyperparameter-optimization (4), feature-engineering (4), natural-language-processing (4), binary-classification (3), model-architecture-and-implementation (3), model-inference (3), data-cleaning (3), text-tokenization (3), seaborn (3), data-exploration (3), pandas (2), numpy (2), fine-tune-bert-tensorflow (2), hugging-face-transformers (2), model-selection (2), cicd-deployment (2), hyperparameter-tuning (2), statistical-analysis (2), visualization (2), text-preprocessing (2), multiclass-classification (2), python (2), attention-mechanism (1), luong-attention (1), continuous-integration (1), turnover-analysis (1), time-series-analysis (1), predictive-modeling-techniques (1), feature-selection-and-engineering (1), data-visualization-and-storytelling (1), plotly (1), scikit-learn (1), regularization-methods (1), data-analysis (1), streamlit-webapp (1), continuous-deployment (1), python-lambda (1), fifa23 (1), regression-modelling (1), model-testing (1), model-retraining (1), sql-queries (1), sql-data-exploration (1), sql-data-analysis (1), sql (1), mysql (1), mssql (1), pyspark-notebook (1), sklearn (1), model-evaluation-and-tuning (1), metrics (1), matplotlib (1), text-summarization (1), text-generation (1), seq2seq-model (1), github-actions (1), docker-container (1), zilliz (1), vector-search-engine (1), vector-database (1), turbopuffer (1), qdrant (1), pinecone (1), parquet (1), milvus (1), lancedb (1), kdb (1), huggingface-datasets (1), huggingface (1), datastax (1), data-import (1), data-export (1), data-backup (1), chromadb (1), nltk-python (1), free-hosting-service (1), sentiment-analysis (1), distilbert-model (1), window-functions (1), seaborn-plots (1), pivot-tables (1), matplotlib-pyplot (1), kpi (1), key-metrics (1), google-colaboratory (1), generative-ai (1), excel (1), dax-expression (1), calculated-fields (1), business-analytics (1), aggregate-functions (1), text-classification (1), bert-fine-tuning (1)