An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-sampling

ufoym/imbalanced-dataset-sampler

A (PyTorch) imbalanced dataset sampler for oversampling low-frequency classes and undersampling high-frequency ones.

Language: Python - Size: 181 KB - Last synced at: 1 day ago - Pushed at: 15 days ago - Stars: 2,295 - Forks: 265
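
The oversampling and undersampling idea in this entry's description can be illustrated with a minimal sketch. This is not the repository's code: it assumes the common approach of weighting each item by the inverse of its class frequency and drawing with replacement, and the function name `balanced_sample_indices` and the toy labels are hypothetical. It uses only the Python standard library; in PyTorch, `torch.utils.data.WeightedRandomSampler` accepts per-sample weights of this kind.

```python
import random
from collections import Counter


def balanced_sample_indices(labels, num_samples, seed=None):
    """Draw indices with replacement, weighting each item by 1 / (size of its class).

    Hypothetical helper for illustration only. Every class receives the same
    total weight, so rare classes are drawn more often per item (oversampled)
    and common classes less often per item (undersampled).
    """
    counts = Counter(labels)
    weights = [1.0 / counts[label] for label in labels]
    rng = random.Random(seed)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


# Toy imbalanced dataset (assumed for the example): 90 "cat" items, 10 "dog" items.
labels = ["cat"] * 90 + ["dog"] * 10
indices = balanced_sample_indices(labels, num_samples=10_000, seed=0)

# Count how often each class was drawn; the two counts should be roughly equal.
drawn = Counter(labels[i] for i in indices)
print(drawn)
```

Because each class carries the same total weight, the drawn class proportions approach 50/50 here even though the underlying data is split 90/10.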

minhtran241/arxiv-citation-network

This project aims to analyze the citation network of arXiv papers. We use Python to clean the data and create a Neo4j network to visualize and analyze the citation relationships between arXiv papers.

Language: Python - Size: 1.43 MB - Last synced at: 19 days ago - Pushed at: 5 months ago - Stars: 2 - Forks: 1

madhurimarawat/Data-Wrangling

This repository contains experiments on data wrangling techniques, focusing on methods for handling missing values, filtering, aggregation, and more.

Language: Jupyter Notebook - Size: 621 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

layumi/AdaBoost_Seg

TIP2022: Adaptive Boosting (AdaBoost) for Domain Adaptation? Why not!

Language: Python - Size: 671 KB - Last synced at: 10 days ago - Pushed at: about 2 years ago - Stars: 47 - Forks: 3

sbittla/gatling-javafaker-maven

Generating realistic test data or simulating load with authentic, dynamic data using the Gatling framework and JavaFaker

Language: Scala - Size: 86.9 KB - Last synced at: 12 days ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

codexlynx/hardware-attacks-state-of-the-art

Microarchitectural exploitation and other hardware attacks.

Size: 189 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 85 - Forks: 11

uwgraphics/flexibleSubsetSelection

A Python package for flexible subset selection for data visualization.

Language: Jupyter Notebook - Size: 44.7 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

Anshika-Codsoft/Codsoft_task-5

Here is Task 5: Credit card fraud detection using machine learning, for my data science internship with Codsoft

Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

gegaryfa/Adaptive-data-sampling-and-transmission

Adaptive data sampling and transmission in a wireless sensor node as a function of energy reserves

Language: Arduino - Size: 454 KB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 0

jialincheoh/statistical-analysis-tool

This is a course that I took in Summer 2021 at Purdue University. It covers A/B testing, false discovery rate, Bonferroni correction, t-tests, data sampling, machine learning model evaluation, hypothesis testing, the chi-squared test, etc.

Language: Jupyter Notebook - Size: 13.8 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

neurocard/neurocard

State-of-the-art neural cardinality estimators for join queries

Language: Python - Size: 52.2 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 51 - Forks: 18

SmellyArmure/OC_DS_Project7

Implementation of a scoring model (OpenClassrooms | Data Scientist | Project 7)

Language: Jupyter Notebook - Size: 34.7 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 3 - Forks: 2

ReML-AI/RDS

Reinforced Data Sampling

Language: Python - Size: 1.57 MB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 7 - Forks: 2

chomiczdawid/data-preparation

Process of data preparation in R.

Language: R - Size: 3.63 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

avalanchesiqi/twitter-sampling

Code and Data for paper: Variation across Scales: Measurement Fidelity under Twitter Data Sampling (ICWSM '20)

Language: Python - Size: 1.09 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

roberto1648/sampling-of-biased-physical-signals

A method for sampling a balanced dataset from biased signals by leveraging statistical distributions derived from the data.

Language: Jupyter Notebook - Size: 11.3 MB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

Related Keywords
data-sampling (16), data-visualization (5), machine-learning (3), python (3), research (2), jupyter-notebook (2), data-aggregation (2), statistics (1), learned-database (1), deep-generative-model (1), cardinality-estimation (1), visuzalization (1), t-tests (1), scikit-learn (1), python36 (1), matplotlib (1), hypothesis-testing (1), false-discovery-rate (1), chi-squared-test (1), bonferroni (1), xbee (1), iot (1), arduino (1), image-classification (1), logistic-regression (1), internship-task (1), data-science (1), visualization (1), summarization (1), subsets (1), subset-selection (1), sampling (1), optimization (1), mip (1), exemplars (1), decluttering (1), state-of-the-art (1), speculative-execution (1), spectre (1), sampling-methods (1), twitter-api-stream (1), twitter (1), measurements (1), imputation-methods (1), data-preparation (1), data-analysis (1), correlation-analysis (1), reinforcement-learning (1), streamlit (1), shap (1), risk-modeling (1), pycaret (1), model-interpretation (1), local-interpretation (1), lgbmclassifier (1), imbalanced-classification (1), heroku-deployment (1), flask-api (1), feature-engineering (1), data-cleaning (1), data-ag (1), dashboards (1), dashboard (1), custom-metrics (1), bayesian-optimization (1), transformers (1), self-supervised-learning (1), query-optimization (1), probabilistic-models (1), ml-for-systems (1), side-channel-attacks (1), gatling-example (1), gatling (1), datageneration (1), tip (1), gta5 (1), domainadaptation (1), domain-adaptation (1), data-efficient (1), cityscapes (1), adapative (1), adaboost (1), text-data-processing (1), output (1), markdown (1), pytorch (1), handling-missing-values (1), detailed-documentation (1), date-time-processing (1), data-wrangling-workflow (1), data-wrangling (1), neo4j (1), data-reshaping (1), data-preprocessing (1), data-merging (1), data-filtering (1), data-conversion (1), data-concatenation (1), network-analysis (1), codes (1)