An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: extract-data

meltano/meltano

Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.

Language: Python - Size: 140 MB - Last synced at: about 14 hours ago - Pushed at: about 15 hours ago - Stars: 2,110 - Forks: 177

opendatalab/MinerU

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.

Language: Python - Size: 125 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 35,316 - Forks: 2,879

MeltanoLabs/tap-stackexchange

Singer tap for the StackExchange API

Language: Python - Size: 1.19 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 3 - Forks: 1

pymupdf/PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

Language: Python - Size: 332 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 7,394 - Forks: 619

DocumindHQ/documind

Open-source platform for extracting structured data from documents using AI.

Language: JavaScript - Size: 1020 KB - Last synced at: 4 days ago - Pushed at: about 1 month ago - Stars: 1,326 - Forks: 48

MeltanoLabs/tap-dbt

Singer Tap for dbt API v2 built with the Meltano SDK

Language: Python - Size: 1010 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 12 - Forks: 7

elixir-crawly/crawly

Crawly, a high-level web crawling & scraping framework for Elixir.

Language: Elixir - Size: 2.8 MB - Last synced at: 4 days ago - Pushed at: 9 months ago - Stars: 1,026 - Forks: 118

Lamouchi-Bayrem/Document_Scanner

Flask web app that scans documents using OpenCV

Language: Python - Size: 4.1 MB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

Agenty/scrapingai

Build web scraping agents using AI to auto-extract the data from websites, capture screenshot, generate pdf from URL and web crawling with Agenty

Language: TypeScript - Size: 209 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 17 - Forks: 2

m92vyas/llm-reader

Turn Webpage to LLM friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web scraping, image & webpage links extraction easy.

Language: Python - Size: 92.8 KB - Last synced at: 20 days ago - Pushed at: 21 days ago - Stars: 191 - Forks: 14

OmkarPathak/ResumeParser

A simple resume parser used for extracting information from resumes

Language: Python - Size: 1.54 MB - Last synced at: 26 days ago - Pushed at: over 1 year ago - Stars: 303 - Forks: 172

tarqhilmarsiregar/fashion-scraping-etl

A simple ETL pipeline implementation for web scraping fashion data, covering extraction, cleaning, transformation, and storage to CSV, a PostgreSQL database, and Google Sheets as a basis for data insights

Language: Python - Size: 6.84 KB - Last synced at: 28 days ago - Pushed at: 28 days ago - Stars: 0 - Forks: 0

guillaC/SQLiteDiskExplorer

SQLiteDiskExplorer enables you to explore, catalog, and batch extract SQLite files from disks and removable media.

Language: C# - Size: 400 KB - Last synced at: 1 day ago - Pushed at: 11 months ago - Stars: 17 - Forks: 0

bda-research/node-crawler

Web Crawler/Spider for NodeJS + server-side jQuery ;-)

Language: TypeScript - Size: 1.04 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 6,757 - Forks: 879

apurvasijaria/GooglePlayStoreScrape

Python module to extract Google Play store reviews and other information of any Android app.

Language: Python - Size: 114 KB - Last synced at: 3 days ago - Pushed at: 10 months ago - Stars: 4 - Forks: 0

BaseMax/ExtractWord

Extract word(s) from the lines of the file.

Language: PHP - Size: 23.4 KB - Last synced at: 7 days ago - Pushed at: about 6 years ago - Stars: 4 - Forks: 1

ammaryasirnaich/PyReqify

This project is a lightweight Python module designed to generate the requirements.txt file. It streamlines dependency management by automatically extracting imported modules from Python or Jupyter files and generating their requirements.txt

Language: Python - Size: 63.5 KB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

alienzhou/giframe

Extract the first frame of a GIF without reading all of its bytes; supports both the browser and Node.js 📸

Language: TypeScript - Size: 6.46 MB - Last synced at: 8 days ago - Pushed at: over 5 years ago - Stars: 23 - Forks: 6

LivingSkySchoolDivision/MySchoolSaskIntegrations

Export definitions, and notes regarding how they work, for extracting data from MySchoolSask (an implementation of Follett Aspen)

Language: PowerShell - Size: 1.28 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 2 - Forks: 2

danschultzer/receipt-scanner

Receipt scanner extracts information from your PDF or image receipts - built in NodeJS

Language: JavaScript - Size: 3.54 MB - Last synced at: about 1 month ago - Pushed at: over 6 years ago - Stars: 299 - Forks: 56

Abimathi03/Android-JSON-App

An Android application that demonstrates how to extract employee information from a JSON string and display it on the screen using basic TextView widgets.

Language: Java - Size: 96.7 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

zebbern/JSX

🕵️‍♂️ | A Chrome extension that collects all JavaScript (.js) links, form endpoints, and all other links from a webpage with a single click!

Language: JavaScript - Size: 1.06 MB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 3 - Forks: 0

Dann-Oliv/Query-Results-To-Excel

Language: Python - Size: 7.81 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

laur89/docker-seedbox-rclone-fetch-extract

Dockerised service pulling data from remote seedbox & extracting archives

Language: Shell - Size: 841 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 14 - Forks: 3

Agenta-AI/job_extractor_template

Template for an AI application that extracts the job information from a job description using OpenAI functions and LangChain

Language: Python - Size: 15.6 KB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 10 - Forks: 1

ropensci/smapr

An R package for acquisition and processing of NASA SMAP data

Language: R - Size: 6.48 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 85 - Forks: 25

Techcatchers/PyLyrics-Extractor

Get Lyrics for any songs by just passing in the song name (spelled or misspelled) in less than 2 seconds using this awesome Python Library.

Language: Python - Size: 17.6 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 57 - Forks: 18

umLu/tubeframes

A Python package for retrieving YouTube data, including video statistics, captions, and channel information. TubeData outputs results in a user-friendly pandas DataFrame format, making it ideal for data analysis workflows — especially in Jupyter Notebooks.

Language: Python - Size: 53.7 KB - Last synced at: 19 days ago - Pushed at: 3 months ago - Stars: 2 - Forks: 0

Mohdadnan2320/JobQuest

Full-stack developer (MERN) assignment for Jobsforce.ai LLC: build a job recommendation system

Language: JavaScript - Size: 76.2 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Arman2409/data-falcon

Web crawler

Language: TypeScript - Size: 1.62 MB - Last synced at: 8 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

aidayang/MinerU-OneClick

One-click launch bundle for deploying MinerU without installation

Size: 49.8 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 7 - Forks: 0

ark-mod/ArkSavegameToolkitNet

Library for reading ARK Survival Evolved savegame files using C#.

Language: C# - Size: 5.85 MB - Last synced at: 4 days ago - Pushed at: over 2 years ago - Stars: 20 - Forks: 27

Mysteriza/Show-Saved-WiFi

Extract and manage saved Wi-Fi profiles on Windows with ease!

Language: Python - Size: 25.7 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

LMLK-seal/Printext

Printext is a lightweight application that extracts text from images.

Language: Python - Size: 404 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

CairX/extract-colors-py

Extract colors from an image. Colors are grouped based on visual similarities using the CIE76 formula.

Language: Python - Size: 4.72 MB - Last synced at: 8 days ago - Pushed at: over 4 years ago - Stars: 68 - Forks: 20

DevExpress-Examples/winforms-dashboard-extract-data-source

This example demonstrates how to create the Extract data source, replace existing dashboard data sources with Extract data sources and update the Extract data file.

Language: C# - Size: 2.04 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 1

DevExpress-Examples/wpf-dashboard-how-to-update-extract-data-source-file

This example demonstrates how to update the extract data file at runtime.

Language: C# - Size: 2.45 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 1

asad70/Insider-Trading

This program extracts insider trading data from the SEC website and stores it in an Excel file for the specified time frame.

Language: Python - Size: 98.6 KB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 53 - Forks: 15

serhaturtis/TOOL-FastBatchImageCrop

A simple UI tool to batch crop images to prepare datasets from images and videos.

Language: Python - Size: 955 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 27 - Forks: 3

ionictemplate-app/Social-Network-Data-Scraper-Pro

Easily scrape 10,000+ email messages in one hour, helping you quickly increase your customers. Extracts data from LinkedIn, Facebook, Instagram, YouTube, Pinterest, and Twitter. Search by specific keywords. Ready-to-use Social Network Data Scraper software to get started instantly. Includes source code and install file.

Size: 45.9 KB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 27 - Forks: 6

manucabral/pysoccerdata

A Python package for extracting real-time soccer data from diverse online sources, providing essential statistics and insights.

Language: Python - Size: 32.2 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

msoap/html2data

Library and CLI for extracting data from HTML via CSS selectors

Language: Go - Size: 7.15 MB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 69 - Forks: 3

pdfix/pdfix_sdk_example_npm

Example project demonstrating how to use PDFix SDK WebAssembly build in Node.js. Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...

Language: JavaScript - Size: 882 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

rainergo/UASFRA-MS-KnowledgeGraph

Python project to read and use ESG data from XBRL-files to construct a neo4j Knowledge-Graph to be enriched with external data (Wikidata, DBPedia). An OpenAI-attached chat bot is used to query the Graph.

Language: HTML - Size: 158 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

peterbencze/serritor

Serritor is an open source web crawler framework built upon Selenium and written in Java. It can be used to crawl dynamic web pages that require JavaScript to render data.

Language: Java - Size: 969 KB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 32 - Forks: 15

DaoMinhThong/E-commerce_SQL_project

Language: Jupyter Notebook - Size: 1020 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

yuanxu-li/html-table-extractor

Extract data from HTML tables

Language: Python - Size: 31.3 KB - Last synced at: 2 months ago - Pushed at: about 5 years ago - Stars: 86 - Forks: 22

pdfix/pdfix_sdk_example_cpp

Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...

Language: C++ - Size: 21.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 20 - Forks: 4

steffegit/VeridionAssignment

Address Extraction Challenge for Veridion Internship

Language: Python - Size: 271 KB - Last synced at: about 7 hours ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

ArthurSilvaDantas/ExtractJSON

Web application for extracting information from a JSON file.

Language: JavaScript - Size: 49.5 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

kormanowsky/jextract

Allows extracting data from DOM

Language: JavaScript - Size: 140 KB - Last synced at: 3 days ago - Pushed at: almost 5 years ago - Stars: 7 - Forks: 1

pdfix/pdfix_sdk_example_java

PDFix SDK samples for Java Maven. PDF manipulation, content extraction, conversion, accessibility and more...

Language: Java - Size: 20.7 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 4 - Forks: 2

drisskhattabi6/Meteo-Data-Mining

This repo applies data mining techniques to analyze meteorological (meteo) data. The objective is to extract meaningful insights and patterns from the data that can aid in understanding weather phenomena and predicting future weather conditions.

Language: Jupyter Notebook - Size: 16.1 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

izikeros/todo-extractor

Script for extracting TODO notes from the text file

Language: Python - Size: 38.1 KB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

Alapipapi/MinerU Fork of opendatalab/MinerU

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.

Language: Python - Size: 103 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

floriancochard/extract-data-from-paper

A tool designed to extract numerical data from scanned historical weather documents.

Language: Python - Size: 151 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 13 - Forks: 2

rakhi9932/Amazon_Analysis

Amazon sales data analysis interactive dashboard

Size: 6.65 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

KEZIMAdynamics/DokuExtractor

Easily extract data from PDF documents

Language: C# - Size: 74.7 MB - Last synced at: 6 days ago - Pushed at: 7 months ago - Stars: 10 - Forks: 5

CatherineFramework/mercy

Mercy is an open-source Rust crate and CLI designed for building cybersecurity utilities and projects.

Language: Rust - Size: 548 KB - Last synced at: 2 days ago - Pushed at: about 1 year ago - Stars: 11 - Forks: 0

osh/gr-eventstream

gr-eventstream is a set of GNU Radio blocks for creating precisely timed events and either inserting them into, or extracting them from, normal data streams. It allows for the definition of high-speed, time-synchronous C++ burst event handlers, as well as bridging to standard GNU Radio Async PDU messages with precise timing.

Language: C++ - Size: 842 KB - Last synced at: 2 months ago - Pushed at: over 7 years ago - Stars: 44 - Forks: 28

darkskygit/ChatImporter

Import chat records from your IM and store them in a single SQLite database

Language: Rust - Size: 494 KB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 11 - Forks: 1

labteral/bluebird 📦

Unofficial Python client for Twitter

Language: Python - Size: 112 KB - Last synced at: 23 days ago - Pushed at: over 4 years ago - Stars: 43 - Forks: 14

Zuriel-HR/PEtoJSON

Feature extraction from Portable Executable (PE) files into a JSON file

Language: Python - Size: 26.4 KB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

walidbosso/R_Data_mining

Extract knowledge from data using different techniques, including association rules, hierarchical agglomerative clustering (HAC), k-means clustering, and decision trees

Language: R - Size: 9.75 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

tamK-kol/Chatbot-Q-A-in-Invoice-Extractor-LLM

The Invoice Extractor markdown is a specific format used to extract relevant information from invoices. It's a standardized way to annotate invoices with key information, making it easier to automate the extraction process.

Language: Python - Size: 348 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

orvill-as/extract-email

This program prompts the user for input and output file paths, extracts email addresses from the input file using a regular expression, and writes the email addresses to the output file. It also measures and prints the elapsed time taken to run the program.

Language: Python - Size: 1000 Bytes - Last synced at: 8 months ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

Jelared/Project-GEIPAN

Basic data extraction from website GEIPAN

Language: Jupyter Notebook - Size: 85.9 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

OObasuyi/DoctorCandy

Extract IPs and URLs from docx and PDF files

Language: Python - Size: 57.6 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

simplyYan/cutinfo

Go library to extract information based on references

Language: Go - Size: 18.6 KB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

mhismail/PinPoint-Digitizer

Open source digitizer application to extract data from plots

Language: SCSS - Size: 464 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 1

Zeeshanahmad4/NLP--Data-extraction-Microsoft-Word-documents-into-a-CSV

Language: Jupyter Notebook - Size: 1.37 MB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

pdfix/pdfix_sdk_example_dotnet

Make PDF Files Accessible, Extract Data from PDF, Convert PDF to HTML, Fill-in PDF Form, Stamp PDF and more...

Language: C# - Size: 26.9 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 12 - Forks: 6

timothy-bartlett/PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

Language: Python - Size: 288 MB - Last synced at: 3 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

Thee-Unruly/Optimal-Character-Recognition

Extracting info from documents / images

Language: Jupyter Notebook - Size: 327 KB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

NostalgicCoder/ReadExcelFile.Lib

Extracts data from a spreadsheet and outputs its contents to a '.SQL' file. Data extraction tool useful for people using SQL Server Express with no access to SSMS addon and import wizard.

Language: C# - Size: 378 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

loglux-lab/ip-extractor

ip-extractor.sh uses nano to extract IP addresses. Results are stored in 'hosts', with duplicates removed. Ideal for sifting through logs and data-rich files.

Language: Shell - Size: 2.93 KB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

ShubhRanpara/Auto-Filler

This repository contains my team's internship project work at Flexbox Technologies. We have developed a system that automatically fills in the patient details form with patient data extracted from a PDF file.

Language: Python - Size: 6.82 MB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

ShubhRanpara/Auto-Filler-Web

This repository contains my internship project work at Flexbox Technologies. I have developed a system that automatically fills in the patient details form with patient data extracted from a PDF file.

Language: HTML - Size: 7.26 MB - Last synced at: 8 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

slotix/dataflowkit

Extract structured data from websites. Website scraping.

Language: Go - Size: 4.61 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 654 - Forks: 80

geanpannellini/real_estate_property_transactions

A repository containing comprehensive data on real estate property transactions, encompassing transaction details, property characteristics, and market insights for analytical purposes in the real estate industry.

Size: 58.6 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

shubhambhandari29/autoMail

Language: HTML - Size: 9.12 MB - Last synced at: 11 months ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

isaacmg/fb_scraper

FBLYZE is a Facebook scraping and analysis system.

Language: Jupyter Notebook - Size: 2.61 MB - Last synced at: 3 months ago - Pushed at: about 4 years ago - Stars: 64 - Forks: 21

Anjali1751/Extracting-data-of-scanned-images

Extracting Data Of Scanned Images

Language: Python - Size: 607 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

Warard/WordExtractor

Python program that extracts data from a specific Word document used in my company. Previously this data was extracted manually, by opening hundreds of Word documents one by one and copying and pasting information into an Excel file. Now it is fully automatic.

Language: Python - Size: 7.81 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

Qyfashae/Extract_Off_Data

Extract data from offline files, e.g. emails, phone numbers, links, etc.

Language: Python - Size: 9.77 KB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

jehad-halahla/linux_project

A Linux lab Bash project that focuses on automation and text extraction

Language: Shell - Size: 17.6 KB - Last synced at: 2 months ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 0

FuuToru/Face-Recognition-using-Machine-Learning

This is a repo for face recognition on 5 famous people

Language: Jupyter Notebook - Size: 52.7 MB - Last synced at: 7 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

jeffersonsalvador/cnpj-extractor

Solution for importing and analyzing public Brazilian business data (CNPJ). CNPJ data processing: a robust, containerized solution for importing and analyzing Brazilian business data (CNPJ).

Language: PHP - Size: 225 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

leticiamn/WebScrapperAiPapper

Web scraping to extract product data, translation using LibreTranslate, data cleaning, and classification of products into categories using an AI model trained with TensorFlow.

Language: Python - Size: 67.6 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Chauhan-Aniket/Extract-Numbers

Extract numbers from string/file

Language: JavaScript - Size: 1000 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Joffreybvn/mailxtract

Language: HTML - Size: 39.1 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Bessouat40/pdf-region-picker

A project to select only part of a PDF file. It's useful when you want to extract information with a Python library such as fitz.

Language: JavaScript - Size: 3.92 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

edgeryders/ebook-utils

Extract metadata from Project Gutenberg e-books, and other utilities.

Language: PHP - Size: 18.6 KB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

RagavendranMRN/WebScraper-WebCrawling

This repo contains the script I used to extract data from webpages (web scraping), a Python script that I wrote using BeautifulSoup

Language: Java - Size: 22.5 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 1

Lenashri7/Excel-Automation

This UiPath project automates the process of extracting data from an Excel sheet and filling out a Google Form with the extracted information.

Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Mamdouh66/Extracty

Extract structured data from any unstructured web page

Language: Python - Size: 258 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 9 - Forks: 1

iwoodsawyer/qdigiplot

Qt GUI program for extracting data points from a scanned image file of a plot

Language: C++ - Size: 153 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 1

MrAssembler0x00/PyTypeExtension

Python 📦 module for data manipulation & extraction using standardized formats 📄.

Size: 0 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

dmryutov/parsers

Collection of parsers written in PHP, Python

Language: PLpgSQL - Size: 108 MB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 13 - Forks: 7

bjorn3/goodgame_empire_import 📦

An importer for Goodgame Empire

Language: Rust - Size: 2.41 MB - Last synced at: 1 day ago - Pushed at: about 5 years ago - Stars: 5 - Forks: 2