An open API service providing repository metadata for many open source software ecosystems.

Topic: "data-enrichment"

HoloClean/holoclean

A Machine Learning System for Data Enrichment.

Language: Python - Size: 8.09 MB - Last synced at: 1 day ago - Pushed at: about 2 years ago - Stars: 525 - Forks: 131

upgini/upgini

Data search & enrichment library for Machine Learning → Easily find and add relevant features to your ML & AI pipeline from hundreds of public and premium external data sources, including open & commercial LLMs

Language: Python - Size: 166 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 337 - Forks: 25

HoloClean/HoloClean-Legacy-deprecated 📦

A Machine Learning System for Data Enrichment.

Language: Python - Size: 179 MB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 75 - Forks: 22

chaoss/grimoirelab-elk

Language: Python - Size: 6.49 MB - Last synced at: about 15 hours ago - Pushed at: 9 days ago - Stars: 60 - Forks: 122

IBM/watson-discovery-food-reviews

Combine Watson Knowledge Studio and Watson Discovery to discover customer sentiment from product reviews

Language: JavaScript - Size: 18.4 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 36 - Forks: 44

peopledatalabs/peopledatalabs-python

A Python client for the People Data Labs API

Language: Python - Size: 691 KB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 35 - Forks: 4

UtrechtUniversity/ricgraph

Ricgraph - Research in context graph

Language: Python - Size: 154 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 30 - Forks: 5

peopledatalabs/peopledatalabs-js

A universal JS client with TypeScript support for the People Data Labs API

Language: TypeScript - Size: 1.03 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 25 - Forks: 9

wingkwong/hk-atm-locator

:atm: Hong Kong ATM Locator :atm: Centralising Automated Teller Machine (ATM) data in Hong Kong in a well-defined, standardised format and displaying it in a web portal for public use

Language: JavaScript - Size: 11.1 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 17 - Forks: 7

audiomuze/tagminder

Import, maintain and export tag metadata to/from audio files and a dynamically created SQLite table. Automates incremental tag cleanup, enrichment and standardisation for your digital audio library at scale using pre-scripted SQL queries and Polars, achieving quality and consistency in your metadata not possible with a tagger

Language: Python - Size: 1020 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 11 - Forks: 1

dice-group/deer (fork of GeoKnow/DEER)

RDF Dataset Enrichment Framework

Language: JavaScript - Size: 73.5 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 10 - Forks: 7

analogueapp/mercury

Data Enrichment Service

Language: Python - Size: 513 KB - Last synced at: 6 days ago - Pushed at: almost 2 years ago - Stars: 9 - Forks: 1

peopledatalabs/peopledatalabs-go

A Go client for the People Data Labs API

Language: Go - Size: 110 KB - Last synced at: 10 days ago - Pushed at: 11 days ago - Stars: 8 - Forks: 0

Steve0verton/google-maps-geocode-enrichment

This project repository provides a headless module to enrich location data in a database table using the Google Maps Geocode API.

Language: Python - Size: 2 MB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 0

xueyouluo/cn-data-enhance

Data augmentation using translation techniques.

Language: Python - Size: 11.7 KB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 7 - Forks: 3

AmadeusITGroup/CrawlerBox

CrawlerBox is an automated analysis framework designed for parsing emails and crawling embedded web resources.

Language: Python - Size: 162 KB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 3 - Forks: 0

LeadMagic/leadmagic-mcp

🚀 Production-ready Model Context Protocol server for LeadMagic API - Complete B2B data enrichment suite with 19 powerful tools. Zero-config setup for Claude, Cursor, Windsurf, Continue.dev, and all MCP clients.

Language: TypeScript - Size: 251 KB - Last synced at: 28 days ago - Pushed at: 2 months ago - Stars: 2 - Forks: 2

peopledatalabs/peopledatalabs-rust

A Rust client for the People Data Labs API

Language: Rust - Size: 94.7 KB - Last synced at: 24 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 2

mljar/enrichment

Data enrichment with AI for pandas DataFrame

Language: Python - Size: 22.5 KB - Last synced at: 2 days ago - Pushed at: 4 months ago - Stars: 2 - Forks: 1

dtim-upc/THOR

THOR (Text Homogenization from Oblivion to Reality)

Language: Jupyter Notebook - Size: 5.44 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 1

daliclass/annotator

Web GUI and Java service to annotate items with additional human-level information. #Annotation #ML

Language: Java - Size: 229 KB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

GGiecold-zz/Keras_playground

Training deep neural networks with Keras

Language: Python - Size: 6.78 MB - Last synced at: over 1 year ago - Pushed at: almost 8 years ago - Stars: 2 - Forks: 0

spiliossp/Multimodal_Image_Super_Resolution_using_Deep_Neural_Networks

Multimodal Image Super Resolution using (Interpretable) Deep Neural Networks

Language: Jupyter Notebook - Size: 130 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

LeadMagic/leadmagic-n8n

🚀 Professional n8n community node for LeadMagic - Complete B2B data enrichment, email finder, company intelligence, LinkedIn enrichment, and sales automation API integration

Language: TypeScript - Size: 153 KB - Last synced at: about 12 hours ago - Pushed at: 2 months ago - Stars: 1 - Forks: 2

patricksferraz/cep2address

A high-performance Python tool for batch processing Brazilian postal codes (CEP) into complete addresses. Features parallel processing, multiple API sources, and flexible I/O formats. Perfect for data enrichment and address validation.

Language: Python - Size: 9.77 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 1

berksudan/Data-Engineering-Case-Study-with-SQL-Optimization-and-Python-Data-Integration

Case Study for a data engineering job application at a company

Language: Python - Size: 2.09 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

enrichment-api/Enrichment-api-node

Company and Email Enrichment API for Node.js

Language: JavaScript - Size: 30.3 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

HerculesCRUE/HerculesED

CV management tool. Hércules ED (Enriquecimiento de Datos, i.e. data enrichment)

Language: C# - Size: 195 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

markdouthwaite/demography

Quickly load demographic data based on UK post codes to enrich your dataset. Based on data made available by the UK's Office for National Statistics (ONS).

Language: Python - Size: 6.72 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

magifd2/lookup-go

A powerful Go-based CLI tool to enrich JSON/JSONL data streams by looking up values in CSV or JSON files. Inspired by Splunk's lookup command.

Language: Go - Size: 28.3 KB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

AdityaSreevatsaK/SmartFlow-Prep

SmartFlow-Prep is the data preprocessing pipeline for the SmartFlow system. It collects, cleans, and enriches historical CitiBike trip data with contextual features such as weather, time, holidays, and station metadata. The processed output serves as the structured input for SmartFlow’s bike rebalancing model.

Language: Python - Size: 15.6 KB - Last synced at: 23 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

WillianMonteiro23/projetos-sql

This repository contains projects that use SQL for data analysis. Each project explores datasets, running queries to extract insights, transform data, and visualize results, demonstrating skills in database management and query optimization.

Language: TSQL - Size: 69.1 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

SangSokPark/leadmagic-n8n

Boost your B2B data enrichment with LeadMagic for n8n. Achieve high accuracy in email finding and streamline your lead generation process. 🚀✨

Language: TypeScript - Size: 132 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

LeadMagic/leadmagic-openapi

🎯 Production-ready OpenAPI 3.1 specification for LeadMagic's complete B2B data enrichment API suite. 19 endpoints, 249 examples, platform-agnostic documentation with comprehensive testing framework.

Language: JavaScript - Size: 59.6 KB - Last synced at: about 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

TechWithTy/trestle-python-sdk

Python SDK for Trestle API - A comprehensive OSINT and data enrichment toolkit with support for phone validation, reverse lookups, and more.

Language: Python - Size: 39.1 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

nymeria-io/nymeria.py

A Python client for the Nymeria API

Language: Python - Size: 41 KB - Last synced at: about 5 hours ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

nymeria-io/nymeria.rb

A Ruby client for the Nymeria API

Language: Ruby - Size: 49.8 KB - Last synced at: 23 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

nymeria-io/nymeria.go

A Go client for the Nymeria API

Language: Go - Size: 122 KB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

ankaboot-source/thedig

🧩➜👤 TheDig enriches personal data from a full name and an email address

Language: Python - Size: 5.74 MB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

TechnikInterlytics/VerityExamples

Data file examples and user guides for VerityPy and VerityDotNet libraries

Language: HTML - Size: 3.84 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

rianlucascs/webscraper_e_enriquecedor_de_dados

This project presents software dedicated to automating web data collection. Using the supplied company name and domain, it extracts relevant information and filters it to ensure the accuracy of the data obtained. This approach significantly simplifies the process of obtaining valuable data directly from the internet.

Language: Python - Size: 962 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

la-gruge/urssaf-phone-finder

Phone finder using URSSAF data

Language: Python - Size: 546 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

aflip/mood-muse

Embedding based semantic search app for poetry [App and EDA notebooks]

Language: Jupyter Notebook - Size: 30.3 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

jacobc5266/Data-Processing-and-Reporting-Automation-Program

Efficiently process and automate complex reports using Python. This project streamlines data integration, cleaning, and report generation, reducing processing time by 97%.

Size: 203 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

karlahrnndz/gdq-run-genre-medium

Code for the Medium article "Visualizing “Games Done Quick” Video Game Genres Over the Years".

Language: Python - Size: 22.5 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

kvndrsslr-zz/docker-deer

Docker for DEER

Size: 1.95 KB - Last synced at: over 1 year ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0
