Topic: "data-enrichment"
HoloClean/holoclean
A Machine Learning System for Data Enrichment.
Language: Python - Size: 8.09 MB - Last synced at: 1 day ago - Pushed at: about 2 years ago - Stars: 525 - Forks: 131

upgini/upgini
Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs
Language: Python - Size: 166 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 337 - Forks: 25

HoloClean/HoloClean-Legacy-deprecated 📦
A Machine Learning System for Data Enrichment.
Language: Python - Size: 179 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 75 - Forks: 22

chaoss/grimoirelab-elk
Language: Python - Size: 6.49 MB - Last synced at: about 15 hours ago - Pushed at: 9 days ago - Stars: 60 - Forks: 122

IBM/watson-discovery-food-reviews
Combine Watson Knowledge Studio and Watson Discovery to discover customer sentiment from product reviews
Language: JavaScript - Size: 18.4 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 36 - Forks: 44

peopledatalabs/peopledatalabs-python
A Python client for the People Data Labs API
Language: Python - Size: 691 KB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 35 - Forks: 4

UtrechtUniversity/ricgraph
Ricgraph - Research in context graph
Language: Python - Size: 154 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 30 - Forks: 5

peopledatalabs/peopledatalabs-js
A universal JS client with TypeScript support for the People Data Labs API
Language: TypeScript - Size: 1.03 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 25 - Forks: 9

wingkwong/hk-atm-locator
:atm: Hong Kong ATM Locator :atm: Centralising Automated Teller Machine (ATM) data in Hong Kong in a well-defined, standardised format and displaying it in a web portal for public use
Language: JavaScript - Size: 11.1 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 17 - Forks: 7

audiomuze/tagminder
Import, maintain and export tag metadata to/from audio files and a dynamically created SQLite table. Automates incremental tag cleanup, enrichment and standardisation for your digital audio library at scale using pre-scripted SQL queries and Polars, achieving quality and consistency in your metadata not possible with a tagger
Language: Python - Size: 1020 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 11 - Forks: 1

dice-group/deer Fork of GeoKnow/DEER
RDF Dataset Enrichment Framework
Language: JavaScript - Size: 73.5 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 10 - Forks: 7

analogueapp/mercury
Data Enrichment Service
Language: Python - Size: 513 KB - Last synced at: 6 days ago - Pushed at: almost 2 years ago - Stars: 9 - Forks: 1

peopledatalabs/peopledatalabs-go
A Go client for the People Data Labs API
Language: Go - Size: 110 KB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 8 - Forks: 0

Steve0verton/google-maps-geocode-enrichment
This project repository provides a headless module to enrich location data in a database table using the Google Maps Geocode API.
Language: Python - Size: 2 MB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 0

xueyouluo/cn-data-enhance
Data augmentation using translation techniques.
Language: Python - Size: 11.7 KB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 7 - Forks: 3

AmadeusITGroup/CrawlerBox
CrawlerBox is an automated analysis framework designed for parsing emails and crawling embedded web resources.
Language: Python - Size: 162 KB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 0

LeadMagic/leadmagic-mcp
🚀 Production-ready Model Context Protocol server for LeadMagic API - Complete B2B data enrichment suite with 19 powerful tools. Zero-config setup for Claude, Cursor, Windsurf, Continue.dev, and all MCP clients.
Language: TypeScript - Size: 251 KB - Last synced at: 28 days ago - Pushed at: 2 months ago - Stars: 2 - Forks: 2

peopledatalabs/peopledatalabs-rust
A Rust client for the People Data Labs API
Language: Rust - Size: 94.7 KB - Last synced at: 24 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 2

mljar/enrichment
Data enrichment with AI for pandas DataFrame
Language: Python - Size: 22.5 KB - Last synced at: 2 days ago - Pushed at: 4 months ago - Stars: 2 - Forks: 1

dtim-upc/THOR
THOR (Text Homogenization from Oblivion to Reality)
Language: Jupyter Notebook - Size: 5.44 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 1

daliclass/annotator
Web GUI and Java service to annotate items with additional human-level information. #Annotation #ML
Language: Java - Size: 229 KB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

GGiecold-zz/Keras_playground
Training deep neural networks with Keras
Language: Python - Size: 6.78 MB - Last synced at: over 1 year ago - Pushed at: almost 8 years ago - Stars: 2 - Forks: 0

spiliossp/Multimodal_Image_Super_Resolution_using_Deep_Neural_Networks
Multimodal Image Super Resolution using (Interpretable) Deep Neural Networks
Language: Jupyter Notebook - Size: 130 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

LeadMagic/leadmagic-n8n
🚀 Professional n8n community node for LeadMagic - Complete B2B data enrichment, email finder, company intelligence, LinkedIn enrichment, and sales automation API integration
Language: TypeScript - Size: 153 KB - Last synced at: about 12 hours ago - Pushed at: 2 months ago - Stars: 1 - Forks: 2

patricksferraz/cep2address
A high-performance Python tool for batch processing Brazilian postal codes (CEP) into complete addresses. Features parallel processing, multiple API sources, and flexible I/O formats. Perfect for data enrichment and address validation.
Language: Python - Size: 9.77 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 1

berksudan/Data-Engineering-Case-Study-with-SQL-Optimization-and-Python-Data-Integration
Case Study for a data engineering job application at a company
Language: Python - Size: 2.09 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

enrichment-api/Enrichment-api-node
Company and Email Enrichment API NodeJs
Language: JavaScript - Size: 30.3 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

HerculesCRUE/HerculesED
CV management tool. Hércules ED - Data Enrichment (Enriquecimiento de Datos)
Language: C# - Size: 195 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

markdouthwaite/demography
Quickly load demographic data based on UK post codes to enrich your dataset. Based on data made available by the UK's Office for National Statistics (ONS).
Language: Python - Size: 6.72 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

magifd2/lookup-go
A powerful Go-based CLI tool to enrich JSON/JSONL data streams by looking up values in CSV or JSON files. Inspired by Splunk's lookup command.
Language: Go - Size: 28.3 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

AdityaSreevatsaK/SmartFlow-Prep
SmartFlow-Prep is the data preprocessing pipeline for the SmartFlow system. It collects, cleans, and enriches historical CitiBike trip data with contextual features such as weather, time, holidays, and station metadata. The processed output serves as the structured input for SmartFlow’s bike rebalancing model.
Language: Python - Size: 15.6 KB - Last synced at: 23 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

WillianMonteiro23/projetos-sql
This repository contains projects that use SQL for data analysis. Each project explores datasets, running queries to extract insights, transform data, and visualize results, demonstrating skills in database management and query optimization.
Language: TSQL - Size: 69.1 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

SangSokPark/leadmagic-n8n
Boost your B2B data enrichment with LeadMagic for n8n. Achieve high accuracy in email finding and streamline your lead generation process. 🚀✨
Language: TypeScript - Size: 132 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

LeadMagic/leadmagic-openapi
🎯 Production-ready OpenAPI 3.1 specification for LeadMagic's complete B2B data enrichment API suite. 19 endpoints, 249 examples, platform-agnostic documentation with comprehensive testing framework.
Language: JavaScript - Size: 59.6 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

TechWithTy/trestle-python-sdk
Python SDK for Trestle API - A comprehensive OSINT and data enrichment toolkit with support for phone validation, reverse lookups, and more.
Language: Python - Size: 39.1 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

nymeria-io/nymeria.py
A Python client for the Nymeria API
Language: Python - Size: 41 KB - Last synced at: about 5 hours ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

nymeria-io/nymeria.rb
A Ruby client for the Nymeria API
Language: Ruby - Size: 49.8 KB - Last synced at: 23 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

nymeria-io/nymeria.go
A Go client for the Nymeria API
Language: Go - Size: 122 KB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

ankaboot-source/thedig
🧩➜👤 TheDig enriches personal data from a full name and an email
Language: Python - Size: 5.74 MB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

TechnikInterlytics/VerityExamples
Data file examples and user guides for VerityPy and VerityDotNet libraries
Language: HTML - Size: 3.84 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

rianlucascs/webscraper_e_enriquecedor_de_dados
This project presents software dedicated to automating web data collection. Using the provided company name and domain, it extracts relevant information and filters it to ensure the accuracy of the data obtained. This approach significantly simplifies the process of obtaining valuable data directly from the internet.
Language: Python - Size: 962 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

la-gruge/urssaf-phone-finder
Phone finder using URSSAF data
Language: Python - Size: 546 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

aflip/mood-muse
Embedding based semantic search app for poetry [App and EDA notebooks]
Language: Jupyter Notebook - Size: 30.3 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

jacobc5266/Data-Processing-and-Reporting-Automation-Program
Efficiently process and automate complex reports using Python. This project streamlines data integration, cleaning, and report generation, reducing processing time by 97%.
Size: 203 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

karlahrnndz/gdq-run-genre-medium
Code for the Medium article "Visualizing “Games Done Quick” Video Game Genres Over the Years".
Language: Python - Size: 22.5 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

kvndrsslr-zz/docker-deer
Docker for DEER
Size: 1.95 KB - Last synced at: over 1 year ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0
