Topic: "extract-data"
opendatalab/MinerU
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.
Language: Python - Size: 125 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 36,874 - Forks: 3,017

pymupdf/PyMuPDF
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Language: Python - Size: 331 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 7,444 - Forks: 621

bda-research/node-crawler
Web Crawler/Spider for NodeJS + server-side jQuery ;-)
Language: TypeScript - Size: 1.06 MB - Last synced at: 11 days ago - Pushed at: about 1 month ago - Stars: 6,763 - Forks: 879

meltano/meltano
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Language: Python - Size: 140 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 2,118 - Forks: 177

DocumindHQ/documind
Open-source platform for extracting structured data from documents using AI.
Language: JavaScript - Size: 1020 KB - Last synced at: 16 days ago - Pushed at: about 2 months ago - Stars: 1,326 - Forks: 48

elixir-crawly/crawly
Crawly, a high-level web crawling & scraping framework for Elixir.
Language: Elixir - Size: 2.8 MB - Last synced at: 16 days ago - Pushed at: 10 months ago - Stars: 1,026 - Forks: 118

slotix/dataflowkit
Extract structured data from websites. Website scraping.
Language: Go - Size: 4.61 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 654 - Forks: 80

OmkarPathak/ResumeParser
A simple resume parser used for extracting information from resumes
Language: Python - Size: 1.54 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 303 - Forks: 172

danschultzer/receipt-scanner
Receipt scanner extracts information from your PDF or image receipts - built in NodeJS
Language: JavaScript - Size: 3.54 MB - Last synced at: 11 days ago - Pushed at: over 6 years ago - Stars: 299 - Forks: 56

Qusic/TraceUtility 📦
Extract data from .trace documents generated by Instruments
Language: Objective-C - Size: 47.9 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 221 - Forks: 80

m92vyas/llm-reader
Turn a webpage into LLM-friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web scraping, and image & webpage link extraction easy.
Language: Python - Size: 92.8 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 201 - Forks: 15

yuanxu-li/html-table-extractor
Extract data from HTML tables
Language: Python - Size: 31.3 KB - Last synced at: 3 months ago - Pushed at: about 5 years ago - Stars: 86 - Forks: 22

ropensci/smapr
An R package for acquisition and processing of NASA SMAP data
Language: R - Size: 6.48 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 85 - Forks: 25

msoap/html2data
Library and cli for extracting data from HTML via CSS selectors
Language: Go - Size: 7.15 MB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 69 - Forks: 3

CairX/extract-colors-py
Extract colors from an image. Colors are grouped based on visual similarities using the CIE76 formula.
Language: Python - Size: 4.72 MB - Last synced at: 20 days ago - Pushed at: over 4 years ago - Stars: 68 - Forks: 20

isaacmg/fb_scraper
FBLYZE is a Facebook scraping system and analysis system.
Language: Jupyter Notebook - Size: 2.61 MB - Last synced at: 4 days ago - Pushed at: about 4 years ago - Stars: 64 - Forks: 21

Techcatchers/PyLyrics-Extractor
Get lyrics for any song by just passing in the song name (spelled correctly or misspelled) in less than 2 seconds using this awesome Python library.
Language: Python - Size: 17.6 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 57 - Forks: 18

asad70/Insider-Trading
This program extracts insider trading data from the SEC website and stores it in an Excel file for the specified time frame.
Language: Python - Size: 98.6 KB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 53 - Forks: 15

fivesmallq/web-data-extractor
Extracting and parsing structured data with jQuery Selector, XPath or JsonPath from common web format like HTML, XML and JSON.
Language: Java - Size: 717 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 53 - Forks: 19

osh/gr-eventstream
gr-eventstream is a set of GNU Radio blocks for creating precisely timed events and either inserting them into, or extracting them from normal data-streams precisely. It allows for the definition of high speed time-synchronous c++ burst event handlers, as well as bridging to standard GNU Radio Async PDU messages with precise timing easily.
Language: C++ - Size: 842 KB - Last synced at: 3 months ago - Pushed at: almost 8 years ago - Stars: 44 - Forks: 28

labteral/bluebird 📦
Unofficial Python client for Twitter
Language: Python - Size: 112 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 43 - Forks: 14

giveabit/Trio-Plus-Data
Extract audio and other data from the Digitech Trio Plus guitar pedal's SD card
Language: Python - Size: 23.3 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 38 - Forks: 8

peterbencze/serritor
Serritor is an open source web crawler framework built upon Selenium and written in Java. It can be used to crawl dynamic web pages that require JavaScript to render data.
Language: Java - Size: 969 KB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 32 - Forks: 15

ionictemplate-app/Social-Network-Data-Scraper-Pro
Easily scrape 10,000+ email messages in one hour, helping you quickly increase your customers. Extracts data from LinkedIn, Facebook, Instagram, YouTube, Pinterest, and Twitter. Perfect search by specific keywords. Ready-to-use social network data scraper software to get started instantly. 100% of the source code and the install file are included.
Size: 45.9 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 27 - Forks: 6

serhaturtis/TOOL-FastBatchImageCrop
A simple UI tool to batch crop images to prepare datasets from images and videos.
Language: Python - Size: 955 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 27 - Forks: 3

alienzhou/giframe
extract the first frame in GIF without reading whole bytes, support both browser and nodejs 📸
Language: TypeScript - Size: 6.46 MB - Last synced at: 19 days ago - Pushed at: over 5 years ago - Stars: 23 - Forks: 6

Skyluker4/UnityAssetReplacer
A tool to replace data in a Unity Asset Bundle from modified files.
Language: C# - Size: 140 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 21 - Forks: 4

pdfix/pdfix_sdk_example_cpp
Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...
Language: C++ - Size: 21.4 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 20 - Forks: 4

mhismail/PinPoint-Digitizer
Open source digitizer application to extract data from plots
Language: SCSS - Size: 464 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 1

ark-mod/ArkSavegameToolkitNet
Library for reading ARK Survival Evolved savegame files using C#.
Language: C# - Size: 5.85 MB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 20 - Forks: 27

hseera/python-utilities
Different python utility scripts to help automate mundane/repetitive tasks. Useful for performance testers/data scientist or anyone who wants to automate mundane tasks in python.
Language: Python - Size: 2.81 MB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 20 - Forks: 1

peterstangl/svg2data
A Python module for reading data from a plot provided as SVG file.
Language: Python - Size: 63.5 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 19 - Forks: 3

righthandabacus/mdict_reader
Extract data from Octopus mdict (*.mdd, *.mdx) files
Language: Python - Size: 11.7 KB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 19 - Forks: 7

guillaC/SQLiteDiskExplorer
SQLiteDiskExplorer enables you to explore, catalog, and batch extract SQLite files from disks and removable media.
Language: C# - Size: 400 KB - Last synced at: 5 days ago - Pushed at: 11 months ago - Stars: 17 - Forks: 0

rdlopes/WebHere
HTML scraping for Objective-C.
Language: Objective-C - Size: 1.24 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 17 - Forks: 6

Agenty/scrapingai
Build web scraping agents using AI to auto-extract the data from websites, capture screenshot, generate pdf from URL and web crawling with Agenty
Language: TypeScript - Size: 209 KB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 17 - Forks: 2

laur89/docker-seedbox-rclone-fetch-extract
Dockerised service pulling data from remote seedbox & extracting archives
Language: Shell - Size: 841 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 14 - Forks: 3

floriancochard/extract-data-from-paper
A tool designed to extract numerical data from scanned historical weather documents.
Language: Python - Size: 151 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 13 - Forks: 2

BlockBuilder57/XBC2ModelDecomp 📦
Extracts Xenoblade 2 models into XNALara and glTF format
Language: C# - Size: 162 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 13 - Forks: 2

dmryutov/parsers
Collection of parsers written in PHP, Python
Language: PLpgSQL - Size: 108 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 13 - Forks: 7

MeltanoLabs/tap-dbt
Singer Tap for dbt API v2 built with the Meltano SDK
Language: Python - Size: 1 MB - Last synced at: about 1 hour ago - Pushed at: about 3 hours ago - Stars: 12 - Forks: 7

pdfix/pdfix_sdk_example_dotnet
Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...
Language: C# - Size: 26.9 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 12 - Forks: 6

darkskygit/ChatImporter
Import chat records from your IM and store them in a single SQLite database
Language: Rust - Size: 494 KB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 11 - Forks: 1

CatherineFramework/mercy
Mercy is an open-source Rust crate and CLI designed for building cybersecurity utilities and projects.
Language: Rust - Size: 548 KB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 0

KEZIMAdynamics/DokuExtractor
Easily extract data from PDF documents
Language: C# - Size: 74.7 MB - Last synced at: 6 days ago - Pushed at: 7 months ago - Stars: 10 - Forks: 5

Agenta-AI/job_extractor_template
Template for an AI application that extracts job information from a job description using OpenAI functions and LangChain
Language: Python - Size: 15.6 KB - Last synced at: 30 days ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 1

DapengFeng/waymo-toolkit
A toolkit for extracting elements and visualization for Waymo Open Dataset
Language: Python - Size: 2.78 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 10 - Forks: 2

dewshr/NCBI-GenBank-file-parser
This program can be used to parse the NCBI GenBank file to create a tabulated csv file.
Language: Python - Size: 10.7 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 10 - Forks: 11

Mamdouh66/Extracty
Extract structured data from any unstructured web page
Language: Python - Size: 258 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 1

jehad-halahla/linux_project
a linux lab bash project that focuses on automation and text extraction
Language: Shell - Size: 17.6 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 0

ocampor/pivot-table-to-csv
This repository takes a *.xlsx file that contains a pivot table with hidden external source data and converts the pivot cache into CSV. It handles files that are too big to fit in memory by dividing the original data into n batches.
Language: Python - Size: 20.5 KB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 8 - Forks: 4

aidayang/MinerU-OneClick
MinerU one-click launch bundle for installation-free deployment
Size: 49.8 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 7 - Forks: 0

kormanowsky/jextract
Allows extracting data from DOM
Language: JavaScript - Size: 140 KB - Last synced at: 14 days ago - Pushed at: almost 5 years ago - Stars: 7 - Forks: 1

JdeJabali/JXLDataTableExtractor
Extract data as tables from Excel. Search columns by their header or index number. Sets conditions for extracting the rows.
Language: C# - Size: 493 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 6 - Forks: 0

sypht-team/sypht-elixir-client
An Elixir client for the Sypht API https://sypht.com
Language: Elixir - Size: 47.9 KB - Last synced at: 7 days ago - Pushed at: over 5 years ago - Stars: 6 - Forks: 0

OxideDevX/info_you_windows
Script for extracting data about the computer and recording it in a text log file
Language: Batchfile - Size: 20.5 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 1

mheriyanto/EDFI
:earth_asia: EDFI is an open-source script to extract data from a 2D or 3D image.
Language: MATLAB - Size: 1.02 MB - Last synced at: 2 days ago - Pushed at: almost 3 years ago - Stars: 5 - Forks: 0

alexey-savchenko-am/Excel.DataTable
Extracts data from an Excel table or writes data to a table.
Language: C# - Size: 179 KB - Last synced at: 2 days ago - Pushed at: almost 4 years ago - Stars: 5 - Forks: 1

bjorn3/goodgame_empire_import 📦
An importer for Goodgame Empire
Language: Rust - Size: 2.41 MB - Last synced at: 5 days ago - Pushed at: over 5 years ago - Stars: 5 - Forks: 2

pdfix/pdfix_sdk_example_java
PDFix SDK samples for Java Maven. PDF manipulation, content extraction, conversion, accessibility and more...
Language: Java - Size: 20.7 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 4 - Forks: 2

apurvasijaria/GooglePlayStoreScrape
Python module to extract Google Play store reviews and other information of any android app.
Language: Python - Size: 114 KB - Last synced at: 15 days ago - Pushed at: 11 months ago - Stars: 4 - Forks: 0

orvill-as/extract-email
This program prompts the user for input and output file paths, extracts email addresses from the input file using a regular expression, and writes the email addresses to the output file. It also measures and prints the elapsed time taken to run the program.
Language: Python - Size: 1000 Bytes - Last synced at: 8 months ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

pratik149/pdf-table-extractor
Extract tables from searchable as well as non-searchable pdf files!
Language: Jupyter Notebook - Size: 840 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 1

RobyFerro/ResumeParser.js 📦
A simple tool to parse and extract data from a resume.
Language: JavaScript - Size: 1.11 MB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 4 - Forks: 2

rshad/Extract-Information-from-Log-files-using-Python
Extract the last alert found in a .log file, given a date as a parameter - Use case: Wazuh log file
Language: Python - Size: 9.77 KB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 4 - Forks: 5

BaseMax/ExtractWord
Extract word(s) from the lines of the file.
Language: PHP - Size: 23.4 KB - Last synced at: about 22 hours ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 1

jadsonluan/data-extraction-scripts
Repository for data extraction scripts
Language: JavaScript - Size: 12.7 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 4 - Forks: 2

programmercv/resume_exporter
A ruby gem to export public résumé data from various sources (LinkedIn, Xing, Stackoverflow) to json or xml
Language: Ruby - Size: 326 KB - Last synced at: 15 days ago - Pushed at: about 8 years ago - Stars: 4 - Forks: 3

MeltanoLabs/tap-stackexchange
Singer tap for the StackExchange API
Language: Python - Size: 1.19 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 3 - Forks: 1

zebbern/JSX
🕵️♂️ | A Chrome extension that collects all JavaScript (.js) links, form endpoints, and all other links from a webpage with a single click!
Language: JavaScript - Size: 1.06 MB - Last synced at: 8 days ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

jeffersonsalvador/cnpj-extractor
Solution for importing and analyzing public Brazilian business data (CNPJ). CNPJ Data Processing: a robust, containerized solution for importing and analyzing Brazilian business data (CNPJ).
Language: PHP - Size: 225 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

zSynctic/Img2Txt
Img2Txt - Extract Text From Images using AI
Language: Python - Size: 55.7 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

astemiracle/project-d2
extracting dota2 stats
Language: R - Size: 5.86 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 1

arevish/Brain-wave-detection-system
The project focuses on detecting and extracting a brain wave signal using both analog and digital circuitry. Using active electrodes on the human scalp, the brain signals were fed into a series of hardware and software stages. Simple conscious movements such as blinking caused a change in the detected waveform. Although the project was not successful in discriminating between different motions or in using the signal to control an electrical device, the team was able to separate and display the alpha waves after filtering out all associated unwanted signals.
Size: 6.75 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 0

TIMESTICKING/image_graph_line_to_digital_convert
Read digital numbers (points) from an image with plots. image to plot. image2digital. line2digital. image2digitalline. graph2digitalline. imageline2digital. graphline2digital. pictureline2digital. readimageline.
Language: MATLAB - Size: 56.1 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 0

hive-scripts/hivehoney
Extract data from remote Hive to local Windows OS.
Language: Python - Size: 291 KB - Last synced at: 17 days ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 1

mohamedhaddi/recursive-extractor Fork of AbdullahALRashdan/Cybertalents-f100
Extract a recursively compressed single file (multiple archive formats).
Language: Python - Size: 9.77 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 1

RocktimRajkumar/CV-Grader
:runner: CV parser is a compiler or interpreter that converts the unstructured form of data into a structured form.
Language: Python - Size: 182 KB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 0

palwesh/Resume-Parser
Extract data from resumes using the Django REST API
Language: Python - Size: 13.7 KB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 0

avocadianmage/file-icon-info
Node.js module that extracts icon information from executables.
Language: C# - Size: 39.1 KB - Last synced at: about 18 hours ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 0

malakhovks/doc-docx-extract-api
Atomic Web Service (AWS, REST API) for converting DOC/DOCX files to plain/text, powered by catdoc, docx2txt and Node.js
Language: JavaScript - Size: 26.4 KB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 0

aplicacionamedida/html-snippet
Extract html snippets getting the minimal css rules from source or computing the css values
Language: JavaScript - Size: 15.6 KB - Last synced at: over 2 years ago - Pushed at: almost 8 years ago - Stars: 3 - Forks: 0

LivingSkySchoolDivision/MySchoolSaskIntegrations
Export definitions, and notes regarding how they work, for extracting data from MySchoolSask (an implementation of Follett Aspen)
Language: PowerShell - Size: 1.28 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 2 - Forks: 2

umLu/tubeframes
A Python package for retrieving YouTube data, including video statistics, captions, and channel information. TubeData outputs results in a user-friendly pandas DataFrame format, making it ideal for data analysis workflows — especially in Jupyter Notebooks.
Language: Python - Size: 53.7 KB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

izikeros/todo-extractor
Script for extracting TODO notes from the text file
Language: Python - Size: 38.1 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

cmb-css/twitter-hoover
Collect data from filtered Twitter streams.
Language: Python - Size: 1.19 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 2

ConstantinLenoir/reread-markdown
A Markdown parser for extracting hierarchical content.
Language: JavaScript - Size: 76.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

walidbosso/R_Data_mining
Extract knowledge from data using different techniques, including Association Rules, Hierarchical Agglomerative Clustering (HAC), K-means Clustering, and Decision Trees
Language: R - Size: 9.75 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

ChetanXpro/Document-AI
An app to extract structured data from a PDF document
Language: Python - Size: 1000 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

t-fbd/audiodotturn
General tool/library for extracting simple metadata and producing new file formats from only a filename(s).
Language: Python - Size: 187 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 2 - Forks: 0

MaryemSamet/Neurone-network-that-applies-a-trending-strategy-
Node js & python
Language: Jupyter Notebook - Size: 596 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

Pevicsanch/project-data-of-the-territorial-division-of-Barcelona
collecting data from the Barcelona City Hall Open Data Service's on socioeconomic indicators of the territorial division of the city of Barcelona
Language: Jupyter Notebook - Size: 770 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1

jalal246/corename
Automatically extracts packages root name for monorepos
Language: JavaScript - Size: 111 KB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 0

SC-Networks/Hydrator
A pragmatic hydrator and extractor library
Language: PHP - Size: 50.8 KB - Last synced at: 3 months ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 1

qwazr/extractor
A WEB API for text and meta-data extraction
Language: Java - Size: 25.5 MB - Last synced at: 3 months ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 0

shaddih1/BookExtracting
BookExtracting v0.2 | Coded by Shady H | {Designed to automate book extraction}-Unfinished
Language: Python - Size: 117 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 0

pdfix/pdfix_sdk_example_node_js
Example project demonstrating how to use PDFix SDK WebAssembly build in Node.js. Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...
Language: JavaScript - Size: 329 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

syphonetic/Powershell-LogExtractor
Language: PowerShell - Size: 68.4 KB - Last synced at: 9 months ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

SandeepBalachandran/Pytheract
Tool for extracting data from files.
Language: Python - Size: 212 KB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

amogh9594/amazonscrape
Amazon Product Data Scraper.
Language: Python - Size: 11.7 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 1
