An open API service providing repository metadata for many open source software ecosystems.

Topic: "data-extraction"

pim97/scrappey.js

Scrappey.js: A versatile JavaScript wrapper for the Scrappey API that solves Cloudflare and DataDome challenges, enabling seamless web scraping of websites protected by anti-bot systems. Simplify data extraction with robust functionality and reliable results. Unlock valuable insights effortlessly. Get started with Scrappey

Language: JavaScript - Size: 124 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 9 - Forks: 4

steno-aarhus/ukbAid

Aid Steno Researchers Who Work on the UKB RAP.

Language: R - Size: 16.3 MB - Last synced at: 29 days ago - Pushed at: about 2 months ago - Stars: 9 - Forks: 5

armiro/crawlers

A bunch of crawlers for extracting data from various sites (site name is mentioned for each one)

Language: Python - Size: 163 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 18

deadbits/trs

🔭 Threat report analysis via LLM and Vector DB

Language: Python - Size: 1.29 MB - Last synced at: 29 days ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 1

Shiva-sankaran/LineEX

Data Extraction from Scientific Line Charts

Language: Python - Size: 2.62 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 0

shine-jayakumar/Extract-Data-From-PDF-In-Python

Batch-convert PDF to text and extract data from PDF files in Python

Language: Python - Size: 13.7 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 9 - Forks: 4

NMikolajewicz/MetaLab

Meta-analysis toolbox for basic research applications. Developed in MATLAB R2016b.

Language: MATLAB - Size: 3.07 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 9 - Forks: 6

armiro/cv-data-extractor

Extract essential data (e.g. GPA, skills, education, age, ...) from PDF-formatted resume files (under development)

Language: Python - Size: 49.8 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 9 - Forks: 3

Boomslet/Web_Crawler

Open-source web crawler

Language: Python - Size: 34.2 KB - Last synced at: about 2 years ago - Pushed at: almost 7 years ago - Stars: 9 - Forks: 6

jakubjafra/stellaris-map-generation 📦

Extracts geopolitical data from Stellaris save game files

Language: JavaScript - Size: 8.67 MB - Last synced at: about 1 month ago - Pushed at: almost 8 years ago - Stars: 9 - Forks: 0

N4rr34n6/TelegramBackup

TelegramBackup is a sophisticated tool designed for extracting, organizing, and archiving messages from your Telegram chats, channels, and groups.

Language: Python - Size: 39.1 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 8 - Forks: 1

rririanto/unstructured-demo-streamlit

Extract your docs (CSV, PDF, JSON, HTML, DOCS, Sheets and more) for your own GPT and LLM projects using Unstructured.io via streamlit

Language: Python - Size: 6.84 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 8 - Forks: 0

pmpontes/OCR-Medical-Records

Tool to perform OCR on the growth chart kept in a child's individual health record.

Language: CSS - Size: 107 MB - Last synced at: about 1 year ago - Pushed at: almost 8 years ago - Stars: 8 - Forks: 2

venkat-0706/Amazon-WebScraper

An Amazon web scraper extracts product data like prices, reviews, and ratings using tools like BeautifulSoup or Scrapy, aiding in market research while adhering to ethical and legal guidelines.

Language: Python - Size: 7.81 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 7 - Forks: 0

BeautifulMoon211/Onthemarket-Scraping

Web scraping tool used to extract real estate information from OnTheMarket.com, a leading property portal in the United Kingdom.

Language: TypeScript - Size: 10.7 MB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 7 - Forks: 0

blalop/bbva2pandas

Extract the data from your BBVA monthly statements

Language: Python - Size: 122 KB - Last synced at: 15 days ago - Pushed at: 9 months ago - Stars: 7 - Forks: 3

geniuszly/GenPythonDoxing

GenPythonDoxing is a demo version of a Python-based tool designed for gathering publicly available information about email addresses, usernames, IP addresses, and Minecraft nicknames. It utilizes various APIs and web scraping techniques to collect data, providing a comprehensive view of online footprints.

Language: Python - Size: 8.79 KB - Last synced at: about 1 month ago - Pushed at: 9 months ago - Stars: 7 - Forks: 0

ksm26/Function-Calling-and-Data-Extraction-with-LLMs

Master the techniques of function-calling and structured data extraction with LLMs. Learn to enhance LLM capabilities, integrate web services, and build practical applications for real-world data usability.

Language: Jupyter Notebook - Size: 1.09 MB - Last synced at: about 2 months ago - Pushed at: 11 months ago - Stars: 7 - Forks: 3

Bisaloo/xlcutter

Parse Batches of 'xlsx' Files Based on a Template

Language: R - Size: 4.5 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 0

masterT/yolo-scraper

A simple way to structure your web scraper.

Language: JavaScript - Size: 294 KB - Last synced at: 10 days ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 1

milahu/reverse-template-engine

Find the common template of many similar HTML files

Language: JavaScript - Size: 159 KB - Last synced at: 30 days ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 0

xarantolus/jsonextract

Go package for finding and extracting any JavaScript object (not just JSON) from an io.Reader

Language: Go - Size: 231 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 0

maitreyeepaliwal/Alleropedia-Database-for-Allergens

Metadatabase of 13145 allergen records with a tabular view of the data. A connected web interface eases use, analysis, and extraction of the data and adds several further functionalities. A tutorial section educates users about the interface design, its features, and the database.

Language: HTML - Size: 994 KB - Last synced at: 3 months ago - Pushed at: almost 4 years ago - Stars: 7 - Forks: 0

desininja/voice-disorder

Data Science project. ML algorithms to detect voice disorders.

Language: Jupyter Notebook - Size: 3.61 MB - Last synced at: 14 days ago - Pushed at: almost 5 years ago - Stars: 7 - Forks: 4

Banz99/Nier-Texture-Manager

Nier Texture Extractor and Repacker for the PS3 and Xbox 360 releases.

Language: C# - Size: 996 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 7 - Forks: 0

jrdnbradford/readMDTable

R 📦 for reading markdown tables into tibbles

Language: R - Size: 7.66 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 6 - Forks: 0

lhotanok/zalando-scraper

Apify actor extracting data from Zalando

Language: TypeScript - Size: 149 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 6 - Forks: 3

imubit/pi-pbook-data-extractor 📦

ProcessBook applet for extracting historical data from PI Server

Size: 395 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 4

DemonMartin/scrappey-wrapper

An API wrapper for Scrappey.com written in Node.js (cloudflare bypass & solver)

Language: JavaScript - Size: 61.5 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

sypht-team/sypht-csharp-client

A C# / .NET client for the Sypht API

Language: C# - Size: 65.4 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 1

sypht-team/sypht-elixir-client

An Elixir client for the Sypht API https://sypht.com

Language: Elixir - Size: 47.9 KB - Last synced at: about 1 month ago - Pushed at: about 5 years ago - Stars: 6 - Forks: 0

nlpub/babelnet-extract 📦

An application for extracting certain data from BabelNet.

Language: Java - Size: 43 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 6 - Forks: 1

ImranR98/InstacartFlation

A Python script that scrapes your Instacart order history and saves the data in a JSON file.

Language: Python - Size: 55.7 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 5 - Forks: 1

MiniAiLive/ID-DocumentRecognition-Docker

MiniAiLive Intelligent ID OCR for Reliable Identity Verification. From document verification to data entry, our MiniAiLive OCR solution can help transform your identity verification process.

Language: Python - Size: 1.65 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 5 - Forks: 4

GMDSantana/crivo

A tool for extracting and filtering URLs, IPs, domains, and subdomains from text or web pages, with built-in web scraping capabilities.

Language: Python - Size: 77.1 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 5 - Forks: 0

davidumoru/scryer

Analyse and generate reports of websites using Gemini AI

Language: TypeScript - Size: 686 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 5 - Forks: 0

darsan-in/Job-Crawler

The Job Crawler is an integral component of the Job RAID project, designed to automatically scrape and collect data from various job listing websites. This crawler enables Job RAID to aggregate comprehensive job listings, ensuring that users have access to up-to-date and relevant job opportunities.

Language: Python - Size: 6.83 MB - Last synced at: 2 days ago - Pushed at: 6 months ago - Stars: 5 - Forks: 0

lykmapipo/Python-Spark-Log-Analysis

Python scripts to process, and analyze log files using PySpark.

Language: Python - Size: 131 KB - Last synced at: 18 days ago - Pushed at: 10 months ago - Stars: 5 - Forks: 0

os-climate/crrf-det

A web application for PDF content and table extraction, featuring image-based visual layout analysis, indexed document search, batch processing and extraction result annotation.

Language: C++ - Size: 6.63 MB - Last synced at: 24 days ago - Pushed at: 11 months ago - Stars: 5 - Forks: 3

The-Nebula-Developers/Complex-Parser

Complex Parser is a powerful Python package designed to streamline the process of data extraction from JSON-like structures while also enriching the extracted data with synonym retrieval capabilities.

Language: Python - Size: 37.1 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 0

shine-jayakumar/Web-Scraping-With-Python

Script to extract customer reviews from a webpage while bypassing bot challenge

Language: Python - Size: 112 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 5 - Forks: 0

promptapi/scraper-py

Python package for Prompt API's Scraper API

Language: Python - Size: 55.7 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 3

akshayuppal3/twitter_scraping_tool

A scraping tool based on tweepy module

Language: Python - Size: 41 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 5 - Forks: 0

sbaresearch/bitcoin-data-extractor

A tool for extracting data from Bitcoin-like blockchains into a relational database model.

Language: Java - Size: 495 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 5 - Forks: 2

chofste/ETL

Language: Python - Size: 2.71 MB - Last synced at: about 11 hours ago - Pushed at: about 13 hours ago - Stars: 4 - Forks: 0

pepe-god/DataProphet

Extracts citizens' identity information from MySQL, builds a family network based on TC ID number, and exports it to CSV

Language: Python - Size: 195 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 4 - Forks: 0

danhilse/web-scraper

A versatile Python-based web scraper that extracts content from single URLs or entire sitemaps, organizing data into structured text files. Features include sitemap parsing, content grouping by URL structure, and an easy-to-use command-line interface. Ideal for data extraction, content analysis, and web research tasks.

Language: Python - Size: 4.39 MB - Last synced at: 20 days ago - Pushed at: about 2 months ago - Stars: 4 - Forks: 1

Solrikk/DataDigger

DataDigger is a powerful and intuitive web application designed to extract and analyze data from web pages.

Language: Go - Size: 38.1 KB - Last synced at: 28 days ago - Pushed at: 2 months ago - Stars: 4 - Forks: 0

defnecirci/MatSciTableExtract

Extracting structured materials science data from tables using LLMs

Language: Python - Size: 147 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 4 - Forks: 0

kanugurajesh/Invoice

Automated Data Extraction and invoice management application

Language: TypeScript - Size: 1.19 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 1

acuciureanu/js-maid

A rule-driven engine designed for seamless extraction of data from JavaScript files.

Language: TypeScript - Size: 429 KB - Last synced at: about 1 month ago - Pushed at: 4 months ago - Stars: 4 - Forks: 0

yukiyuqichen/His-Geo

A library to extract historical toponyms from texts, geocode and visualize the results on maps.

Language: Jupyter Notebook - Size: 44.9 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 4 - Forks: 0

serenity4/OpenType.jl

Extract font data from OpenType font files

Language: Julia - Size: 1.67 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 4 - Forks: 0

shaadclt/PDF-Data-Extraction-PyMuPDF4LLM

This repository demonstrates how to extract text, images, and structured content from PDF documents using pymupdf4llm in Google Colab. It also includes data preparation for LlamaIndex for further document analysis and information extraction.

Language: Jupyter Notebook - Size: 17.6 KB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 4 - Forks: 0

lykmapipo/NYC-TLC-Trip-Data

Python scripts to download, process, and analyze the New York City Taxi and Limousine Commission (TLC) Trip Record Data dataset

Language: Jupyter Notebook - Size: 100 MB - Last synced at: 18 days ago - Pushed at: 9 months ago - Stars: 4 - Forks: 1

ReconXSecurityHQ/highlight

highlight is a script to detect and highlight patterns such as URLs, domains, IPv4 addresses, IPv6 addresses, subnets, ports, categories, HTML tags, and more.

Language: Shell - Size: 8.79 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 4 - Forks: 2

irfanalidv/trustpilot_scraper

A Python library for scraping Trustpilot reviews.

Language: Python - Size: 56.6 KB - Last synced at: about 1 month ago - Pushed at: 12 months ago - Stars: 4 - Forks: 1

ahmedmujtaba1/Python-Projects

A collection of source code for scraping various social media platforms and websites.

Language: Jupyter Notebook - Size: 28.4 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 1

swainshashwat/Flock

Craft custom Large Language Models (LLMs) effortlessly using Flock. Build LLMs for specific domains like a pro, supported by WizardLM, BLOOM, Falcon, and Llama. Extract insights from text and images seamlessly. Powered by Python, PDFMiner, LangChain, and Streamlit. Unlock domain-specific intelligence with Flock! 🚀

Language: Jupyter Notebook - Size: 35.2 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 3

farukalamai/yelp-scraper-scrapy-python

Yelp Restaurant data scraping using python, scrapy spider

Language: Python - Size: 23.4 KB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 4 - Forks: 2

coskundeniz/twitter-data-extractor

Twitter Data Extractor

Language: Python - Size: 17.6 MB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 4 - Forks: 0

LylaCoding/Website-Subpage-Scraper

This Python script scrapes internal links on a webpage. It prompts for a URL, sends a GET request to retrieve HTML, uses BeautifulSoup to parse and filter links. Then it prompts the user for output mode (terminal or file) to either print or write the links. Installs required modules (requests and beautifulsoup4) if not found.

Language: Python - Size: 16.6 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 4 - Forks: 1

openlawnz/openlawnz-parsers

PDF data extraction parsers that get published onto npm. Standalone, but run in conjunction with the openlawnz-pipeline.

Language: TypeScript - Size: 709 KB - Last synced at: 22 days ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 2

ReaperZ0v/csgo-stats-scraper

A web scraper written in Python that scrapes gaming stats for Counter-Strike: Global Offensive (CSGO) and exports them to CSV

Language: Python - Size: 1.95 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 1

ishitashah23/sql-practice-code

SQL practice code from reference books and online practice websites

Size: 15.6 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 0

RocktimRajkumar/ATS

🏆 An applicant tracking system (ATS) is a software application that enables the electronic handling of recruitment and hiring needs. Corporate recruiters or hiring managers can then search and sort through the resumes in a number of ways, depending on the needs

Language: Python - Size: 1.82 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 3

Kalebu/Worldmeter-coronavirus-scraper

A Python program that tracks coronavirus statistics based on the Worldometer website

Language: Python - Size: 7.81 KB - Last synced at: 5 days ago - Pushed at: almost 5 years ago - Stars: 4 - Forks: 2

sypht-team/sypht-ruby-client

A Ruby client for the Sypht API

Language: Ruby - Size: 53.7 KB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 0

YoannPa/Gff3ToBed

Language: Shell - Size: 22.5 KB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 3

hugcis/data_journalism_extractor

A tool for extracting and integrating data from heterogeneous data sources

Language: Python - Size: 4.82 MB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 4 - Forks: 1

AmirAli104/Text2Excel

A GUI desktop application that can extract data from a text file and put them in an Excel or CSV file using regular expression (regex) patterns

Language: Python - Size: 135 KB - Last synced at: about 4 hours ago - Pushed at: about 6 hours ago - Stars: 3 - Forks: 0

atakanwas/discord-twitter-scraper

Scrape connected Twitter/X accounts from Discord member profiles automatically using Selenium and undetected_chromedriver.

Size: 1.63 MB - Last synced at: about 7 hours ago - Pushed at: about 9 hours ago - Stars: 3 - Forks: 0

lightfeed/lightfeed

Lightfeed SDK to search and filter web data

Language: Python - Size: 112 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 3 - Forks: 1

mzazakeith/PuppetMaster

Puppeteer & Crawl4AI microservice for web automation, scraping, and AI processing with Bull queues

Language: Python - Size: 47.9 KB - Last synced at: about 8 hours ago - Pushed at: 7 days ago - Stars: 3 - Forks: 0

oxylabs/how-to-scrape-wayfair

A step-by-step tutorial on extracting data from Wayfair’s product pages at scale and in real time. The guide details actionable code and considers various aspects before and during the scraping process.

Language: Python - Size: 3.07 MB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

thiagosalvatore/pyrser

Turn anything into a pydantic-based schema

Language: Python - Size: 242 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 3 - Forks: 0

aglasencnik/Parsera.NET

A lightweight NuGet package for the Parsera API, designed to simplify interactions and streamline data scraping tasks. This wrapper offers an easy-to-use interface, enabling developers to harness the power of Parsera's capabilities effortlessly.

Language: C# - Size: 39.1 KB - Last synced at: 8 days ago - Pushed at: 7 months ago - Stars: 3 - Forks: 0

kingtroga/linkedin_scraper

This Python script automates interactions with LinkedIn Sales Navigator using Selenium WebDriver. It can log into LinkedIn, navigate through lists of prospects, send messages, and manage contacts based on specific criteria.

Language: Python - Size: 2.5 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 3 - Forks: 0

ksm26/Functions-Tools-and-Agents-with-LangChain

Explore Functions, Tools and Agents with LangChain along with LangChain Expression Language

Language: Jupyter Notebook - Size: 1.65 MB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 6

odevjorge/instagram-post-fetcher

"instagram-post-fetcher" is a Python module leveraging Selenium to extract Instagram post details, including account username, descriptions, media URLs, and post timestamps. Simplifying access to Instagram data for analytics and research.

Language: Python - Size: 9.77 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

Phantom-fs/Projects

Projects with multiple different concepts and usages

Language: Java - Size: 28.6 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

fcoagz/xtweet

xtweet is a Python library for interacting with the Twitter API.

Language: Python - Size: 6.84 KB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

mukherjee07/ORCA-project-Li-ion-battery-material-discovery

These are important tools written for working with ORCA (a computational chemistry open source package) in a super-computing environment. They include multiple job submission scripts (BASH), multiple job cancellation (BASH), job setup with different geometries (in python), data extraction (BASH), final optimized geometry extraction (BASH) and

Language: Python - Size: 18.6 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 1

tdiam/greece-population-census-2021

👥 Greece Population Census 2021 Data

Language: Python - Size: 22.3 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 0

hudavn/InfoMagic

Lookup tool that extracts data from online sources

Language: Jupyter Notebook - Size: 92.5 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

hudavn/PodcastGuests

Lead generation for podcast guests in the Business category

Language: Jupyter Notebook - Size: 2.18 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 0

itsmeale/sspdata

Data extraction from the open data portal of the São Paulo State Public Security Secretariat.

Language: Python - Size: 189 KB - Last synced at: 10 months ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 0

IshtyM/Data-Extraction-and-Text-Analytics

Text Analysis that includes extraction of word count, Positive Score, Negative Score, Polarity, Uncertainty, Constraining, Positive Word Proportion, Negative Word Proportion, Uncertainty Word Proportion and many more.

Language: Jupyter Notebook - Size: 84 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 0

esencgr/Python_Scripts_Projects

Data Extraction & Dataset Creation & Data Scraping

Language: Python - Size: 6.34 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 2

hive-scripts/hivehoney

Extract data from remote Hive to local Windows OS.

Language: Python - Size: 291 KB - Last synced at: about 6 hours ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 1

promptapi/scraper-go

Golang wrapper for Prompt API's Scraper API

Language: Go - Size: 43.9 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

rubenandrebarreiro/fire-fighting-autonomous-intelligent-sensors-mobile-app

☁️ 📲 🔥 A Mobile and Pervasive Computing project built with Arduino, C++, Java, Android, and Google App Engine. The chosen scenario is fighting forest fires: sensors and actuators detect and prevent fire occurrences in forests, and the data collected from them is analysed and stored on a web server. A mobile app was also built to support some functions and interactions with the system.

Language: Java - Size: 50 MB - Last synced at: almost 2 years ago - Pushed at: almost 6 years ago - Stars: 3 - Forks: 0

datamade/illinois-criminal-justice

Data extraction tools for Illinois State Reports on the Criminal Justice System

Language: Makefile - Size: 1.95 KB - Last synced at: 2 months ago - Pushed at: almost 9 years ago - Stars: 3 - Forks: 0

lykmapipo/US-Gas-Prices

Python scripts that scrape US gas prices

Language: Python - Size: 1.83 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2 - Forks: 0

GhentCDH/taulu

Taulu is a Python package designed to segment tabular data in scanned or photographed documents.

Language: Python - Size: 11.2 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 2 - Forks: 0

ntbies/ntbies_services

This Odoo module streamlines data extraction from supplier invoices and expenses and automates contact info pre-filling, utilizing NTBIES services for enhanced operational efficiency.

Language: Python - Size: 137 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 2 - Forks: 0

rithulkamesh/docproc

Opinionated and Sophisticated Document Region Analyzer.

Language: Python - Size: 219 KB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

Traves-Theberge/webform-cli

A CLI tool for extracting unstructured data from websites using customizable schemas and Google's Gemini API and outputting it in structured form.

Language: TypeScript - Size: 59.6 KB - Last synced at: 29 days ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 1

AAlkiyumi/Senior-Design-Project

Web scraper for collecting product and review data from e-commerce sites using Scraping Bee, AWS, Selenium, and Pandas. Focuses on cost-effective solutions, user-friendly interfaces, and efficient data extraction and analysis.

Language: Python - Size: 208 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 0

ImranDevPython/zillow-property-scraper

A powerful Zillow property scraper

Language: Python - Size: 253 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 1

Related Topics
python (235), web-scraping (146), automation (69), data-analysis (66), data-science (53), pandas (49), data-visualization (47), machine-learning (45), data-mining (45), beautifulsoup (42), python3 (39), scraper (38), selenium (37), webscraping (35), data (34), data-cleaning (32), llm (30), data-engineering (27), api (27), csv (26), data-processing (26), data-transformation (26), web-scraper (25), nlp (24), javascript (22), scraping (22), ai (22), sql (21), ocr (21), data-scraping (21), pdf (20), data-exploration (20), json (19), requests (19), open-source (18), etl (17), crawler (16), html (16), nodejs (15), web-crawler (15), streamlit (15), beautifulsoup4 (15), excel (14), natural-language-processing (14), blockchain-analysis (14), data-parser (13), extract (13), data-recovery (13), digital-forensics (13), scrapy (13), image-processing (13), database (13), web-scraping-python (12), blockchain-parser (12), pdf-parser (12), crypto-tool (12), cryptocurrency-parser (12), numpy (12), digital-wallet-tool (12), wallet-tool (12), blockchain-security (12), blockchain-tool (12), btc-analysis (12), invoice (12), bitcoin-tool (12), bcparser (12), btc-data-analysis (12), btc-security (12), crypto-analysis (12), crypto-parser (12), data-preprocessing (12), crypto-analysis-tool (12), web-crawling (11), sentiment-analysis (11), cli (11), invoice-parser (11), api-client (11), text-extraction (10), etl-pipeline (10), data-manipulation (10), web-automation (10), text-mining (10), structured-data (10), java (10), typescript (10), selenium-webdriver (10), html-parsing (9), document-capture (9), extract-fields (9), receipt-capture (9), receipt-reader (9), receipt-scanner (9), openai (9), receipt-scanning (9), sypht (9), sypht-api (9), r (9), jupyter-notebook (9), css-selector (9), text-processing (8)