GitHub topics: extract-data
meltano/meltano
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Language: Python - Size: 140 MB - Last synced at: about 14 hours ago - Pushed at: about 15 hours ago - Stars: 2,110 - Forks: 177

opendatalab/MinerU
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.
Language: Python - Size: 125 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 35,316 - Forks: 2,879

MeltanoLabs/tap-stackexchange
Singer tap for the StackExchange API
Language: Python - Size: 1.19 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 3 - Forks: 1

pymupdf/PyMuPDF
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Language: Python - Size: 332 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 7,394 - Forks: 619

DocumindHQ/documind
Open-source platform for extracting structured data from documents using AI.
Language: JavaScript - Size: 1020 KB - Last synced at: 4 days ago - Pushed at: about 1 month ago - Stars: 1,326 - Forks: 48

MeltanoLabs/tap-dbt
Singer Tap for dbt API v2 built with the Meltano SDK
Language: Python - Size: 1010 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 12 - Forks: 7

elixir-crawly/crawly
Crawly, a high-level web crawling & scraping framework for Elixir.
Language: Elixir - Size: 2.8 MB - Last synced at: 4 days ago - Pushed at: 9 months ago - Stars: 1,026 - Forks: 118

Lamouchi-Bayrem/Document_Scanner
Flask web app that scans documents using OpenCV
Language: Python - Size: 4.1 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

Agenty/scrapingai
Build web scraping agents with Agenty that use AI to auto-extract data from websites, capture screenshots, generate PDFs from URLs, and crawl the web
Language: TypeScript - Size: 209 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 17 - Forks: 2

m92vyas/llm-reader
Turns a webpage into LLM-friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web scraping, and image and webpage link extraction easy.
Language: Python - Size: 92.8 KB - Last synced at: 20 days ago - Pushed at: 21 days ago - Stars: 191 - Forks: 14

OmkarPathak/ResumeParser
A simple resume parser used for extracting information from resumes
Language: Python - Size: 1.54 MB - Last synced at: 26 days ago - Pushed at: over 1 year ago - Stars: 303 - Forks: 172

tarqhilmarsiregar/fashion-scraping-etl
A simple ETL pipeline implementation for web scraping fashion data, covering extraction, cleaning, transformation, and storage to CSV, a PostgreSQL database, and Google Sheets as a basis for data insights
Language: Python - Size: 6.84 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 0 - Forks: 0

guillaC/SQLiteDiskExplorer
SQLiteDiskExplorer enables you to explore, catalog, and batch extract SQLite files from disks and removable media.
Language: C# - Size: 400 KB - Last synced at: 1 day ago - Pushed at: 11 months ago - Stars: 17 - Forks: 0

bda-research/node-crawler
Web Crawler/Spider for NodeJS + server-side jQuery ;-)
Language: TypeScript - Size: 1.04 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 6,757 - Forks: 879

apurvasijaria/GooglePlayStoreScrape
Python module to extract Google Play Store reviews and other information for any Android app.
Language: Python - Size: 114 KB - Last synced at: 3 days ago - Pushed at: 10 months ago - Stars: 4 - Forks: 0

BaseMax/ExtractWord
Extract word(s) from the lines of the file.
Language: PHP - Size: 23.4 KB - Last synced at: 7 days ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 1

ammaryasirnaich/PyReqify
This project is a lightweight Python module designed to generate the requirements.txt file. It streamlines dependency management by automatically extracting imported modules from Python or Jupyter files and generating their requirements.txt
Language: Python - Size: 63.5 KB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

alienzhou/giframe
Extracts the first frame of a GIF without reading all of its bytes; supports both the browser and Node.js 📸
Language: TypeScript - Size: 6.46 MB - Last synced at: 8 days ago - Pushed at: over 5 years ago - Stars: 23 - Forks: 6

LivingSkySchoolDivision/MySchoolSaskIntegrations
Export definitions, and notes regarding how they work, for extracting data from MySchoolSask (an implementation of Follett Aspen)
Language: PowerShell - Size: 1.28 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 2

danschultzer/receipt-scanner
Receipt scanner extracts information from your PDF or image receipts - built in NodeJS
Language: JavaScript - Size: 3.54 MB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 299 - Forks: 56

Abimathi03/Android-JSON-App
An Android application that demonstrates how to extract employee information from a JSON string and display it on the screen using basic TextView widgets.
Language: Java - Size: 96.7 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

zebbern/JSX
🕵️‍♂️ | A Chrome extension that collects all JavaScript (.js) links, form endpoints, and all other links from a webpage with a single click!
Language: JavaScript - Size: 1.06 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

Dann-Oliv/Query-Results-To-Excel
Language: Python - Size: 7.81 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

laur89/docker-seedbox-rclone-fetch-extract
Dockerised service pulling data from remote seedbox & extracting archives
Language: Shell - Size: 841 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 14 - Forks: 3

Agenta-AI/job_extractor_template
Template for an AI application that extracts job information from a job description using OpenAI functions and LangChain
Language: Python - Size: 15.6 KB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 1

ropensci/smapr
An R package for acquisition and processing of NASA SMAP data
Language: R - Size: 6.48 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 85 - Forks: 25

Techcatchers/PyLyrics-Extractor
Get the lyrics for any song in less than 2 seconds just by passing in the song name (spelled correctly or misspelled) using this Python library.
Language: Python - Size: 17.6 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 57 - Forks: 18

umLu/tubeframes
A Python package for retrieving YouTube data, including video statistics, captions, and channel information. TubeData outputs results in a user-friendly pandas DataFrame format, making it ideal for data analysis workflows — especially in Jupyter Notebooks.
Language: Python - Size: 53.7 KB - Last synced at: 19 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

Mohdadnan2320/JobQuest
Full-Stack Developer (MERN) assignment for Jobsforce.ai LLC: build a job recommendation system
Language: JavaScript - Size: 76.2 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Arman2409/data-falcon
Web crawler
Language: TypeScript - Size: 1.62 MB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

aidayang/MinerU-OneClick
MinerU install-free deployment: a one-click launch bundle
Size: 49.8 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 7 - Forks: 0

ark-mod/ArkSavegameToolkitNet
Library for reading ARK Survival Evolved savegame files using C#.
Language: C# - Size: 5.85 MB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 20 - Forks: 27

Mysteriza/Show-Saved-WiFi
Extract and manage saved Wi-Fi profiles on Windows with ease!
Language: Python - Size: 25.7 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

LMLK-seal/Printext
Printext is a lightweight application that extracts text from images.
Language: Python - Size: 404 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

CairX/extract-colors-py
Extract colors from an image. Colors are grouped based on visual similarities using the CIE76 formula.
Language: Python - Size: 4.72 MB - Last synced at: 8 days ago - Pushed at: over 4 years ago - Stars: 68 - Forks: 20

DevExpress-Examples/winforms-dashboard-extract-data-source
This example demonstrates how to create the Extract data source, replace existing dashboard data sources with Extract data sources and update the Extract data file.
Language: C# - Size: 2.04 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 1

DevExpress-Examples/wpf-dashboard-how-to-update-extract-data-source-file
This example demonstrates how to update the extract data file at runtime.
Language: C# - Size: 2.45 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 1

asad70/Insider-Trading
This program extracts insider trading data from the SEC website and stores it in an Excel file for the specified time frame.
Language: Python - Size: 98.6 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 53 - Forks: 15

serhaturtis/TOOL-FastBatchImageCrop
A simple UI tool to batch crop images to prepare datasets from images and videos.
Language: Python - Size: 955 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 27 - Forks: 3

ionictemplate-app/Social-Network-Data-Scraper-Pro
Easily scrape 10,000+ email messages in one hour, helping you quickly increase your customers. Extracts data from LinkedIn, Facebook, Instagram, YouTube, Pinterest, and Twitter. Search by specific keywords. Ready-to-use Social Network Data Scraper software to get started instantly. Includes 100% of the source code and the install file.
Size: 45.9 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 27 - Forks: 6

manucabral/pysoccerdata
A python package for extracting real-time soccer data from diverse online sources, providing essential statistics and insights.
Language: Python - Size: 32.2 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

msoap/html2data
Library and cli for extracting data from HTML via CSS selectors
Language: Go - Size: 7.15 MB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 69 - Forks: 3

pdfix/pdfix_sdk_example_npm
Example project demonstrating how to use PDFix SDK WebAssembly build in Node.js. Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...
Language: JavaScript - Size: 882 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

rainergo/UASFRA-MS-KnowledgeGraph
Python project to read and use ESG data from XBRL-files to construct a neo4j Knowledge-Graph to be enriched with external data (Wikidata, DBPedia). An OpenAI-attached chat bot is used to query the Graph.
Language: HTML - Size: 158 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

peterbencze/serritor
Serritor is an open source web crawler framework built upon Selenium and written in Java. It can be used to crawl dynamic web pages that require JavaScript to render data.
Language: Java - Size: 969 KB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 32 - Forks: 15

DaoMinhThong/E-commerce_SQL_project
Language: Jupyter Notebook - Size: 1020 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

yuanxu-li/html-table-extractor
Extracts data from HTML tables
Language: Python - Size: 31.3 KB - Last synced at: 2 months ago - Pushed at: about 5 years ago - Stars: 86 - Forks: 22

pdfix/pdfix_sdk_example_cpp
Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...
Language: C++ - Size: 21.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 20 - Forks: 4

steffegit/VeridionAssignment
Address Extraction Challenge for Veridion Internship
Language: Python - Size: 271 KB - Last synced at: about 7 hours ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

ArthurSilvaDantas/ExtractJSON
Web application for extracting information from a JSON file.
Language: JavaScript - Size: 49.5 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

kormanowsky/jextract
Allows extracting data from DOM
Language: JavaScript - Size: 140 KB - Last synced at: 3 days ago - Pushed at: almost 5 years ago - Stars: 7 - Forks: 1

pdfix/pdfix_sdk_example_java
PDFix SDK samples for Java Maven. PDF manipulation, content extraction, conversion, accessibility and more...
Language: Java - Size: 20.7 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 4 - Forks: 2

drisskhattabi6/Meteo-Data-Mining
This repo uses data mining techniques to analyze meteorological (meteo) data. The objective is to extract meaningful insights and patterns from the data that can aid in understanding weather phenomena and predicting future weather conditions.
Language: Jupyter Notebook - Size: 16.1 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

izikeros/todo-extractor
Script for extracting TODO notes from a text file
Language: Python - Size: 38.1 KB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

Alapipapi/MinerU Fork of opendatalab/MinerU
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.
Language: Python - Size: 103 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

floriancochard/extract-data-from-paper
A tool designed to extract numerical data from scanned historical weather documents.
Language: Python - Size: 151 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 13 - Forks: 2

rakhi9932/Amazon_Analysis
Amazon sales data analysis interactive dashboard
Size: 6.65 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

KEZIMAdynamics/DokuExtractor
Easily extract data from PDF documents
Language: C# - Size: 74.7 MB - Last synced at: 6 days ago - Pushed at: 7 months ago - Stars: 10 - Forks: 5

CatherineFramework/mercy
Mercy is an open-source Rust crate and CLI designed for building cybersecurity utilities and projects.
Language: Rust - Size: 548 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 0

osh/gr-eventstream
gr-eventstream is a set of GNU Radio blocks for creating precisely timed events and either inserting them into, or extracting them from normal data-streams precisely. It allows for the definition of high speed time-synchronous c++ burst event handlers, as well as bridging to standard GNU Radio Async PDU messages with precise timing easily.
Language: C++ - Size: 842 KB - Last synced at: 2 months ago - Pushed at: over 7 years ago - Stars: 44 - Forks: 28

darkskygit/ChatImporter
Imports chat records from your IM and stores them in a single SQLite database
Language: Rust - Size: 494 KB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 11 - Forks: 1

labteral/bluebird 📦
Unofficial Python client for Twitter
Language: Python - Size: 112 KB - Last synced at: 23 days ago - Pushed at: over 4 years ago - Stars: 43 - Forks: 14

Zuriel-HR/PEtoJSON
Extraction of features from files in Portable Executable format into a JSON file
Language: Python - Size: 26.4 KB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

walidbosso/R_Data_mining
Extract knowledge from data using different techniques, including Association Rules, Hierarchical Agglomerative Clustering (HAC), K-means Clustering, and Decision Trees
Language: R - Size: 9.75 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

tamK-kol/Chatbot-Q-A-in-Invoice-Extractor-LLM
The Invoice Extractor markdown is a specific format used to extract relevant information from invoices. It's a standardized way to annotate invoices with key information, making it easier to automate the extraction process.
Language: Python - Size: 348 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

orvill-as/extract-email
This program prompts the user for input and output file paths, extracts email addresses from the input file using a regular expression, and writes the email addresses to the output file. It also measures and prints the elapsed time taken to run the program.
Language: Python - Size: 1000 Bytes - Last synced at: 8 months ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

Jelared/Project-GEIPAN
Basic data extraction from website GEIPAN
Language: Jupyter Notebook - Size: 85.9 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

OObasuyi/DoctorCandy
Extract IPs and URLs from docx and PDF files
Language: Python - Size: 57.6 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

simplyYan/cutinfo
Go library to extract information based on references
Language: Go - Size: 18.6 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

mhismail/PinPoint-Digitizer
Open source digitizer application to extract data from plots
Language: SCSS - Size: 464 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 1

Zeeshanahmad4/NLP--Data-extraction-Microsoft-Word-documents-into-a-CSV
Language: Jupyter Notebook - Size: 1.37 MB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

pdfix/pdfix_sdk_example_dotnet
Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...
Language: C# - Size: 26.9 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 12 - Forks: 6

timothy-bartlett/PyMuPDF
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Language: Python - Size: 288 MB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Thee-Unruly/Optimal-Character-Recognition
Extracting info from documents / images
Language: Jupyter Notebook - Size: 327 KB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

NostalgicCoder/ReadExcelFile.Lib
Extracts data from a spreadsheet and outputs its contents to a '.SQL' file. Data extraction tool useful for people using SQL Server Express with no access to SSMS addon and import wizard.
Language: C# - Size: 378 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

loglux-lab/ip-extractor
ip-extractor.sh uses nano to extract IP addresses. Results are stored in 'hosts', with duplicates removed. Ideal for sifting through logs and data-rich files.
Language: Shell - Size: 2.93 KB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

ShubhRanpara/Auto-Filler
This repository contains my team's internship project work at Flexbox Technologies. We have developed a system that automatically fills in the patient details form with patient data extracted from a PDF file.
Language: Python - Size: 6.82 MB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

ShubhRanpara/Auto-Filler-Web
This repository contains my internship project work at Flexbox Technologies. I have developed a system that automatically fills in the patient details form with patient data extracted from a PDF file.
Language: HTML - Size: 7.26 MB - Last synced at: 8 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

slotix/dataflowkit
Extract structured data from websites. Website scraping.
Language: Go - Size: 4.61 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 654 - Forks: 80

geanpannellini/real_estate_property_transactions
A repository containing comprehensive data on real estate property transactions, encompassing transaction details, property characteristics, and market insights for analytical purposes in the real estate industry.
Size: 58.6 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

shubhambhandari29/autoMail
Language: HTML - Size: 9.12 MB - Last synced at: 11 months ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

isaacmg/fb_scraper
FBLYZE is a Facebook scraping system and analysis system.
Language: Jupyter Notebook - Size: 2.61 MB - Last synced at: 3 months ago - Pushed at: about 4 years ago - Stars: 64 - Forks: 21

Anjali1751/Extracting-data-of-scanned-images
Extracting Data Of Scanned Images
Language: Python - Size: 607 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

Warard/WordExtractor
Python program that extracts data from a specific Word document used in my company. Before this program, the data was extracted manually by opening hundreds of Word documents one by one to copy and paste information into an Excel file. Now it is fully automatic.
Language: Python - Size: 7.81 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

Qyfashae/Extract_Off_Data
Extract data from offline files, e.g. emails, phone numbers, links, etc.
Language: Python - Size: 9.77 KB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

jehad-halahla/linux_project
A Linux lab Bash project that focuses on automation and text extraction
Language: Shell - Size: 17.6 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 0

FuuToru/Face-Recognition-using-Machine-Learning
This is a repo for face recognition of 5 famous people
Language: Jupyter Notebook - Size: 52.7 MB - Last synced at: 7 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

jeffersonsalvador/cnpj-extractor
🇺🇸 Solution for importing and analyzing public Brazilian business data (CNPJ). 🇧🇷 CNPJ Data Processing: a robust, containerized solution for importing and analyzing Brazilian business data (CNPJ).
Language: PHP - Size: 225 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

leticiamn/WebScrapperAiPapper
Web scraping to extract product data, translation using LibreTranslate, data cleaning, and classification of products into categories using an AI model trained with TensorFlow.
Language: Python - Size: 67.6 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Chauhan-Aniket/Extract-Numbers
Extract numbers from string/file
Language: JavaScript - Size: 1000 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Joffreybvn/mailxtract
Language: HTML - Size: 39.1 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Bessouat40/pdf-region-picker
A project to select only part of a PDF file. It's useful when you want to extract information with a Python library like fitz.
Language: JavaScript - Size: 3.92 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

edgeryders/ebook-utils
Extract metadata from Project Gutenberg e-books, and other utilities.
Language: PHP - Size: 18.6 KB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

RagavendranMRN/WebScraper-WebCrawling
This repo contains the script I used to extract data from webpages (web scraping), a Python script that I wrote using BeautifulSoup
Language: Java - Size: 22.5 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 1

Lenashri7/Excel-Automation
This UiPath project automates the process of extracting data from an Excel sheet and filling out a Google Form with the extracted information.
Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Mamdouh66/Extracty
Extract structured data from any unstructured web page
Language: Python - Size: 258 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 1

iwoodsawyer/qdigiplot
Qt GUI program for extracting data points from a scanned image file of a plot
Language: C++ - Size: 153 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

MrAssembler0x00/PyTypeExtension
Python📦module for data manipulation & extraction using standardized formats📄.
Size: 0 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

dmryutov/parsers
Collection of parsers written in PHP, Python
Language: PLpgSQL - Size: 108 MB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 13 - Forks: 7

bjorn3/goodgame_empire_import 📦
An importer for Goodgame Empire
Language: Rust - Size: 2.41 MB - Last synced at: 1 day ago - Pushed at: about 5 years ago - Stars: 5 - Forks: 2
