Topic: "data-extraction"
getmaxun/maxun
Open Source No Code Web Data Extraction Platform • Turn Websites To APIs & Spreadsheets With No-Code Robots In Minutes
Language: TypeScript - Size: 4.26 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 12,533 - Forks: 973

vi3k6i5/flashtext
Extract Keywords from sentence or Replace keywords in sentences.
Language: Python - Size: 439 KB - Last synced at: 5 days ago - Pushed at: 28 days ago - Stars: 5,648 - Forks: 603

D4Vinci/Scrapling
An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!
Language: Python - Size: 1.82 MB - Last synced at: 7 days ago - Pushed at: 12 days ago - Stars: 2,969 - Forks: 189

JonathanLink/PDFLayoutTextStripper
Converts a pdf file into a text file while keeping the layout of the original pdf. Useful to extract the content from a table in a pdf file for instance. This is a subclass of PDFTextStripper class (from the Apache PDFBox library).
Language: Java - Size: 21.1 MB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 1,589 - Forks: 214

hi-primus/optimus
:truck: Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Language: Python - Size: 110 MB - Last synced at: 1 day ago - Pushed at: 5 months ago - Stars: 1,508 - Forks: 232

raznem/parsera
Lightweight library for scraping web-sites with LLMs
Language: Python - Size: 2.21 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 1,069 - Forks: 64

thinh-vu/vnstock
A beginner-friendly yet powerful Python toolkit for financial analysis and automation, built to make modern investing accessible to everyone
Language: Python - Size: 56.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 798 - Forks: 175

midavr09/BCParser
Size: 15.6 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 771 - Forks: 0

polyrabbit/hacker-news-digest
:newspaper: Let ChatGPT Summarize Hacker News for You
Language: Python - Size: 4.65 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 712 - Forks: 93

adrienjoly/npm-pdfreader
Parse text and tables from PDF files.
Language: HTML - Size: 1.77 MB - Last synced at: 6 days ago - Pushed at: 4 months ago - Stars: 674 - Forks: 85

chakshu-jain/BCParser
Size: 0 Bytes - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 603 - Forks: 0

a-maliarov/amazoncaptcha
Pure Python, lightweight, Pillow-based solver for Amazon's text captcha.
Language: Python - Size: 81 MB - Last synced at: 5 days ago - Pushed at: 7 days ago - Stars: 471 - Forks: 85

parv-mehta10/BCParser
BCParser Bitcoin-Tool Blockchain-Parser Crypto-Tool BTC-Data-Analysis Blockchain-Analysis Cryptocurrency-Parser Data-Extraction Blockchain-Tool BTC-Analysis Crypto-Parser
Size: 15.6 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 410 - Forks: 0

Almakster/BCParser
BCParser Bitcoin-Tool Blockchain-Parser Crypto-Tool BTC-Data-Analysis Blockchain-Analysis Cryptocurrency-Parser Data-Extraction Blockchain-Tool
Size: 15.6 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 353 - Forks: 0

shcherbak-ai/contextgem
ContextGem: Effortless LLM extraction from documents
Language: Python - Size: 9.73 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 313 - Forks: 26

py-pdf/benchmarks
Benchmarking PDF libraries
Language: Python - Size: 3.73 MB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 269 - Forks: 15

notluken/BCParser
Size: 15.6 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 220 - Forks: 0

jpjacobpadilla/Stealth-Requests
Undetected Web-Scraping & Seamless HTML Parsing in Python!
Language: Python - Size: 691 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 203 - Forks: 10

serpapi/clauneck
A tool for scraping emails, social media accounts, and much more information from websites using Google Search Results.
Language: Ruby - Size: 34.2 KB - Last synced at: about 4 hours ago - Pushed at: about 1 year ago - Stars: 178 - Forks: 11

molybdenum-99/infoboxer
Wikipedia information extraction library
Language: Ruby - Size: 8.17 MB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 175 - Forks: 13

sypht-team/sypht-python-client
A python client for the Sypht API
Language: Python - Size: 165 KB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 162 - Forks: 5

dilawar/PlotDigitizer
A Python utility to digitize plots.
Language: Python - Size: 2.15 MB - Last synced at: 1 day ago - Pushed at: 9 months ago - Stars: 139 - Forks: 24

173TECH/sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Language: Python - Size: 4.54 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 122 - Forks: 15

CambioML/any-parser
Accurate, private and configurable document retrieval LLM
Language: Python - Size: 22.1 MB - Last synced at: 24 days ago - Pushed at: 25 days ago - Stars: 121 - Forks: 11

nfx/go-htmltable
Structured HTML table data extraction from URLs in Go that has almost no external dependencies
Language: Go - Size: 416 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 120 - Forks: 8

johnbumgarner/newspaper3_usage_overview
This repository provides usage examples for the Python module Newspaper3k.
Language: Python - Size: 121 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 120 - Forks: 17

dream-num/univer-clipsheet
A powerful Chrome extension for web scraping
Language: TypeScript - Size: 5.72 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 119 - Forks: 17

villagecomputing/superpipe
Superpipe - optimized LLM pipelines for structured data
Language: Python - Size: 11.2 MB - Last synced at: 9 days ago - Pushed at: 11 months ago - Stars: 110 - Forks: 3

sshniro/line-segmentation-algorithm-to-gcp-vision
Line segmentation algorithm for Google Vision API.
Language: Kotlin - Size: 2.76 MB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 97 - Forks: 37

reincubate/ricloud
Python client for Reincubate's ricloud API. Yes, it works with iOS 14 & iPhone 12 backups!
Language: Python - Size: 220 KB - Last synced at: 22 days ago - Pushed at: about 5 years ago - Stars: 95 - Forks: 25

chenkovsky/cyac
High performance Trie and Ahocorasick automata (AC automata) Keyword Match & Replace Tool for python. Correct case insensitive implementation!
Language: Cython - Size: 1.75 MB - Last synced at: 21 days ago - Pushed at: 7 months ago - Stars: 94 - Forks: 15

hermit-crab/ScrapeMate
Scraping assistant tool. Editing and maintaining CSS/XPath selectors across webpages.
Language: JavaScript - Size: 761 KB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 93 - Forks: 12

tech-engine/goscrapy
GoScrapy: Harnessing Go's power for blazingly fast web scraping, inspired by Python's Scrapy framework.
Language: Go - Size: 6.16 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 89 - Forks: 2

dav009/flash
Golang Keyword extraction/replacement Datastructure using Tries instead of regexes
Language: Go - Size: 7.81 KB - Last synced at: 11 days ago - Pushed at: over 7 years ago - Stars: 89 - Forks: 6

sypht-team/sypht-java-client
A Java client for the Sypht API
Language: Java - Size: 108 KB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 87 - Forks: 1

docwire/docwire
DocWire SDK: Award-winning modern data processing in C++20. SourceForge Community Choice & Microsoft support. AI-driven processing. Supports nearly 100 data formats, including email boxes and OCR. Boost efficiency in text extraction, web data extraction, data mining, document analysis. Offline processing is possible for security and confidentiality
Language: C++ - Size: 35.8 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 83 - Forks: 18

danburzo/hred
Reduce HTML and XML to JSON from the command line, using an expressive query language inspired by CSS selectors.
Language: JavaScript - Size: 207 KB - Last synced at: 8 days ago - Pushed at: 8 months ago - Stars: 73 - Forks: 1

Zubdata/Google-Maps-Scraper
Google maps scraper with gui
Language: Python - Size: 146 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 68 - Forks: 27

WeTransfer/format_parser
file metadata parsing, done cheap
Language: Ruby - Size: 891 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 62 - Forks: 18

chrisrober011/BCParser
BCParser Bitcoin-Tool Blockchain-Parser Crypto-Tool BTC-Data-Analysis Blockchain-Analysis Cryptocurrency-Parser Data-Extraction Blockchain-Tool
Size: 15.6 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 55 - Forks: 0

html-extract/hext
Domain-specific language for extracting structured data from HTML documents
Language: C++ - Size: 2.13 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 53 - Forks: 3

scopashq/typestream
Next-generation data transformation framework for TypeScript that puts developer experience first
Language: TypeScript - Size: 560 KB - Last synced at: 3 days ago - Pushed at: about 3 years ago - Stars: 53 - Forks: 0

uhh-lt/newsleak
Information extraction and interactive visualization of textual datasets for investigative data-driven journalism and eDiscovery
Language: Java - Size: 116 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 52 - Forks: 15

rohanpillai20/Table-Extractor-From-Image
This repository contains the code that extracts a table from an image and exports it to an Excel.
Language: Python - Size: 72.3 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 51 - Forks: 14

StabRise/spark-pdf
PDF DataSource for Apache Spark; allows reading PDF files directly into a DataFrame and running OCR on them
Language: Scala - Size: 5.72 MB - Last synced at: 14 days ago - Pushed at: 14 days ago - Stars: 49 - Forks: 3

Articdive/ArticData
Collection of data extracted from Minecraft.
Size: 7.33 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 44 - Forks: 0

VorTECHsa/refinery
Refinery is a tool to extract and transform semi-structured data from Excel spreadsheets of different layouts in a declarative way.
Language: Kotlin - Size: 374 KB - Last synced at: 12 months ago - Pushed at: almost 2 years ago - Stars: 44 - Forks: 6

serpapi/google-search-results-java
Google Search Results JAVA API via SerpApi
Language: Java - Size: 260 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 43 - Forks: 24

linw1995/jsonpath
A query expression for extracting data from JSON.
Language: Python - Size: 763 KB - Last synced at: 7 days ago - Pushed at: 5 months ago - Stars: 41 - Forks: 4

luminati-io/brightdata-mcp
A powerful Model Context Protocol (MCP) server that provides an all-in-one solution for public web access.
Language: JavaScript - Size: 63.8 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 40 - Forks: 4

shriprem/FWDataViz
Fixed Width Data Visualizer plugin for Notepad++. Turns Notepad++ into Excel for fixed-width data files. Displays cursor position data. Jumps to specific fields. Folding Record Blocks. Extracts Data. Builtin dialogs to configure file-type, record-type & fields; Themes & Colors; and Folding. Handles homogenous, mixed & multi-line records.
Language: C++ - Size: 12.3 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 40 - Forks: 5

VictorAtPL/awesome-receipt-data-extraction
A curated list (and summaries) of awesome research publications on topic of data extraction from photos of receipts.
Language: TeX - Size: 13.6 MB - Last synced at: 5 days ago - Pushed at: over 2 years ago - Stars: 37 - Forks: 5

rekloud/tinvois-parser
Extract receipt info
Language: Python - Size: 22.1 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 35 - Forks: 3

mhucka/taupe
Taupe takes a downloaded Twitter archive ZIP file, extracts the URLs corresponding to tweets, retweets, replies, quote tweets, and liked tweets, and outputs the results in a comma-separated values (CSV) format that you can use with other software tools.
Language: Python - Size: 176 KB - Last synced at: 2 days ago - Pushed at: about 2 years ago - Stars: 33 - Forks: 1

johnbumgarner/newshound
This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around the world in over 50 languages.
Size: 28.3 KB - Last synced at: 6 days ago - Pushed at: about 2 years ago - Stars: 33 - Forks: 3

sypht-team/sypht-golang-client
A Golang client for the Sypht API
Language: Go - Size: 73.2 KB - Last synced at: 28 days ago - Pushed at: almost 5 years ago - Stars: 33 - Forks: 0

MrHacker-X/OsintifyX
OsintifyX: Powerful Open-source OSINT tool for extracting valuable information from Instagram profiles.
Language: Python - Size: 3.83 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 32 - Forks: 2

rubydamodar/ProText-Analyzer
ProText Analyzer is a powerful tool for extracting insights from text. It conducts sentiment analysis, categorizing content as positive, negative, or neutral, while also assessing readability and linguistic complexity. Ideal for businesses and researchers, it enhances understanding of textual data.
Language: Jupyter Notebook - Size: 1.32 MB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 30 - Forks: 1

linw1995/data_extractor
Combine XPath, CSS Selectors and JSONPath for Web data extracting.
Language: Python - Size: 1.07 MB - Last synced at: 7 days ago - Pushed at: 5 months ago - Stars: 28 - Forks: 5

AryanVBW/Exif
ExifTool is a powerful command-line tool that can be used to extract and edit metadata in a wide range of media files, including images, audio, and video. Metadata is information that is stored within a file that describes the file's content or other attributes.
Language: Python - Size: 8.12 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 26 - Forks: 10

gambolputty/wiktionary-de-parser
Extract data from German Wiktionary XML files.
Language: Python - Size: 488 KB - Last synced at: 8 days ago - Pushed at: 4 months ago - Stars: 26 - Forks: 8

NextKore/SmartMuv
An EVM-compatible Solidity Smart Contract Storage/Slot Analyzer and Data Extractor.
Language: Python - Size: 226 KB - Last synced at: 5 days ago - Pushed at: 4 months ago - Stars: 25 - Forks: 7

chaitanyarahalkar/Financial-Info-Extractor
Extract financial information in CSV format for companies compliant to the NSE
Language: Python - Size: 36.1 KB - Last synced at: about 1 month ago - Pushed at: almost 7 years ago - Stars: 22 - Forks: 7

arkutils/Obelisk
Project Obelisk - Uploading Ark Data daily
Size: 59.5 MB - Last synced at: about 2 hours ago - Pushed at: about 3 hours ago - Stars: 21 - Forks: 5

ImranR98/Wealthsimpleton
A Python script that scrapes your Wealthsimple activity history and saves the data in a JSON file.
Language: Python - Size: 9.77 KB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 21 - Forks: 4

cpl/exodus
Data exfiltration using DNS
Language: Go - Size: 38.1 KB - Last synced at: 11 months ago - Pushed at: over 5 years ago - Stars: 21 - Forks: 3

pim97/scrappey-wrapper-python
An API wrapper for Scrappey.com written in Python (cloudflare, datadome bypass & solver)
Language: Python - Size: 248 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 20 - Forks: 0

biraj21/web-wanderer
A multi-threaded web crawler written in Python, utilizing ThreadPoolExecutor and Playwright to efficiently crawl dynamically rendered web pages and download them.
Language: Python - Size: 207 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 20 - Forks: 1

shdev/phpflashtext
Extract Keywords from sentence or Replace keywords in sentences. @ https://github.com/vi3k6i5/flashtext
Language: PHP - Size: 1.21 MB - Last synced at: 6 days ago - Pushed at: almost 6 years ago - Stars: 20 - Forks: 5

OwenOrcan/YiraBot-Crawler
YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.
Language: Python - Size: 221 KB - Last synced at: 13 days ago - Pushed at: 6 months ago - Stars: 19 - Forks: 0

arkutils/Purlovia
Project Purlovia - digging up Ark data
Language: Python - Size: 3.03 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 19 - Forks: 9

peterstangl/svg2data
A Python module for reading data from a plot provided as SVG file.
Language: Python - Size: 63.5 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 19 - Forks: 3

dossma/telegram-downloader
Download all content from a Telegram channel
Language: Python - Size: 56.6 KB - Last synced at: 23 days ago - Pushed at: 24 days ago - Stars: 18 - Forks: 4

Fabiopf02/ofx-data-extractor
A module written in TypeScript that provides a utility to extract data from an OFX file in Node.js and Browser
Language: TypeScript - Size: 210 KB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 18 - Forks: 9

QuantumByteStudios/GitHubUserDataExtractor
A tool that displays information and received events about any user on GitHub straight on your terminal screen
Language: Python - Size: 52.3 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 18 - Forks: 3

NikhilaThota/CapstoneProject_House_Prices_Prediction
Understand the relationships between various features in relation with the sale price of a house using exploratory data analysis and statistical analysis. Applied ML algorithms such as Multiple Linear Regression, Ridge Regression and Lasso Regression in combination with cross validation. Performed parameter tuning, compared the test scores and suggested a best model to predict the final sale price of a house. Seaborn is used to plot graphs and scikit learn package is used for statistical analysis.
Language: Jupyter Notebook - Size: 7.91 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 18 - Forks: 13

arkutils/arkutils-website
The source for the arkutils website, home of a few Ark: Survival Evolved tools.
Language: Svelte - Size: 5.04 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 16 - Forks: 3

ppatrzyk/filmweb-export
Export data from the Filmweb service
Language: Python - Size: 368 KB - Last synced at: 12 days ago - Pushed at: 10 months ago - Stars: 16 - Forks: 2

ROBROICH/SAP_AND_COMMON_DATA_MODEL_DEMO
This demo describes the basic integration between S/4HANA and the Microsoft Common Data Model (Model)
Size: 4.24 MB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 16 - Forks: 2

extralit/extralit
Fast and accurate systemic literature data extraction with LLM assistance
Language: Python - Size: 639 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 15 - Forks: 20

Capevace/data-wizard
Extract Structured Data from PDFs, Word Docs and Images. Embeddable directly into your application, regardless of the stack.
Language: JavaScript - Size: 134 MB - Last synced at: 1 day ago - Pushed at: 4 days ago - Stars: 14 - Forks: 3

robert-mcdermott/ollama-batch-cluster
Large Scale Batch Processing with Ollama
Language: Python - Size: 1.01 MB - Last synced at: 27 days ago - Pushed at: 6 months ago - Stars: 14 - Forks: 3

webmiddle/webmiddle
Node.js framework for modular web scraping and data extraction
Language: JavaScript - Size: 2.53 MB - Last synced at: 3 days ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 2

floriancochard/extract-data-from-paper
A tool designed to extract numerical data from scanned historical weather documents.
Language: Python - Size: 151 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 13 - Forks: 2

sypht-team/sypht-node-client
A Nodejs client for the Sypht API
Language: JavaScript - Size: 62.5 KB - Last synced at: 28 days ago - Pushed at: about 2 years ago - Stars: 13 - Forks: 4

attogram/justrefs
Just Refs - extract just the references and related topics from any page on the English Wikipedia
Language: PHP - Size: 244 KB - Last synced at: 27 days ago - Pushed at: almost 5 years ago - Stars: 13 - Forks: 0

xingbow/SciDaEx
Data Extraction and Structuring Demo
Language: Python - Size: 1.09 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 12 - Forks: 0

MOUHASSINE-badreddine/MoroccanHousing-ETL
Moroccan housing data pipeline using scrapy, mongodb , zyte and digitalocean cloud
Language: Python - Size: 30.3 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 12 - Forks: 0

Jacobvs/ML-Music-Analyzer
This repository uses deep learning to determine real-time chords, bpm, and extract other features from music audio
Language: Python - Size: 63.2 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 12 - Forks: 1

StabRise/ScaleDP
ScaleDP is an Open-Source extension of Apache Spark for Document Processing
Language: Python - Size: 7.88 MB - Last synced at: about 18 hours ago - Pushed at: about 2 months ago - Stars: 11 - Forks: 0

webtap-ai/webtap
AI web scraping python library for efficient and reliable web scraping.
Size: 31.4 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 11 - Forks: 1

fabioms-br/azure-data-factory
Learn ETL/ELT Data Management
Size: 80.1 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 1

sp1thas/book-depository-dataset
A large collection of books, scraped from bookdepository.com
Language: Python - Size: 99.1 MB - Last synced at: 5 days ago - Pushed at: almost 2 years ago - Stars: 11 - Forks: 1

sypht-team/sypht-kotlin-client
A Kotlin client for the Sypht API
Language: Kotlin - Size: 136 KB - Last synced at: 28 days ago - Pushed at: about 2 years ago - Stars: 11 - Forks: 2

petrpatek/airbnb-scraper
Apify public actor for scraping Airbnb homes.
Language: JavaScript - Size: 761 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 11 - Forks: 6

dangvansam/detect-extract-table
Detect and Extract Table On Image (OpenCV)
Language: Python - Size: 610 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 11 - Forks: 2

crispyzingy/PDFExcelWordParser
:rocket:Parse PDFs, Word and Excel documents. Read, Create, Merge/Combine, Extract data from office documents.
Language: Python - Size: 514 KB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 11 - Forks: 8

rubypoddar/GitHub-User-Data-Fetcher
GitHub User Data Fetcher: A tool that extracts and analyzes comprehensive data from GitHub user profiles, including repositories, followers, and activity metrics, to provide actionable insights for recruiters, project managers, and developers.
Language: Python - Size: 17.6 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 10 - Forks: 0

fuchsia-programming/scrape
When you need those jobs hypersonic scrape
Language: JavaScript - Size: 2.79 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 10 - Forks: 3

pim97/scrappey.js
Scrappey.js: A versatile JavaScript wrapper for Scrappey API for solving Cloudflare, datadome, enabling seamless web scraping of anti-bot protected websites. Simplify data extraction with robust functionality and reliable results. Unlock valuable insights effortlessly. Get started with Scrappey
Language: JavaScript - Size: 124 KB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 9 - Forks: 4
