Topic: "extract-information"
fhamborg/news-please
news-please - an integrated web crawler and information extractor for news that just works
Language: Python - Size: 2.99 MB - Last synced at: about 6 hours ago - Pushed at: 30 days ago - Stars: 2,211 - Forks: 432

OP-Engineering/link-preview-js
⛓ Extract web links information: title, description, images, videos, etc. [via OpenGraph], runs on mobiles and node.
Language: TypeScript - Size: 1.03 MB - Last synced at: 10 days ago - Pushed at: 2 months ago - Stars: 794 - Forks: 128

gkiril/oie-resources
A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Size: 1.26 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 477 - Forks: 58

danschultzer/receipt-scanner
Receipt scanner extracts information from your PDF or image receipts - built in NodeJS
Language: JavaScript - Size: 3.54 MB - Last synced at: 17 days ago - Pushed at: over 6 years ago - Stars: 299 - Forks: 56

garyelephant/pygrok
Python implementation of jordansissel's grok regular expression library
Language: Python - Size: 66.4 KB - Last synced at: 18 days ago - Pushed at: over 1 year ago - Stars: 277 - Forks: 75

opensemanticsearch/open-semantic-etl
Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition), and data enrichment (annotation) pipelines, plus an ingestor to a Solr or Elasticsearch index and a linked data graph database
Language: Python - Size: 615 KB - Last synced at: 17 days ago - Pushed at: over 2 years ago - Stars: 267 - Forks: 72

schollz/pluck
Pluck text in a fast and intuitive way 🐓
Language: Go - Size: 6.55 MB - Last synced at: 4 days ago - Pushed at: over 5 years ago - Stars: 215 - Forks: 6

liaoziyang/OpenIE-Spider
Extract Information from web corpus using Open Information Extraction.
Language: Python - Size: 5.86 KB - Last synced at: 10 months ago - Pushed at: almost 8 years ago - Stars: 175 - Forks: 72

buiquangmanhhp1999/extract-information-from-identity-card
Given an identity card image, this repo detects the 4 corners, aligns the card with OpenCV, then detects the words in the image and recognizes them with Transformer OCR.
Language: Python - Size: 111 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 119 - Forks: 52

uma-pi1/minie
An open information extraction system that provides compact extractions
Language: Java - Size: 4.48 MB - Last synced at: 6 days ago - Pushed at: about 3 years ago - Stars: 91 - Forks: 27

OpenJarbas/simple_NER
simple rule based named entity recognition
Language: Python - Size: 2.1 MB - Last synced at: about 1 month ago - Pushed at: about 3 years ago - Stars: 43 - Forks: 9

bagrii/address_extraction
Extracting addresses from text
Language: Python - Size: 5.44 MB - Last synced at: 10 months ago - Pushed at: about 7 years ago - Stars: 39 - Forks: 24

kanjirz50/python-extractcontent3 Fork of petitviolet/python-extractcontent
Python 3 port of extractcontent.rb, which extracts the main body text from HTML
Language: HTML - Size: 43 KB - Last synced at: 4 days ago - Pushed at: almost 6 years ago - Stars: 23 - Forks: 3

carlospolop/easy_stegoCTF
Brute force for stego CTFs
Language: Python - Size: 77.1 KB - Last synced at: 19 days ago - Pushed at: over 2 years ago - Stars: 16 - Forks: 6

YW-Ma/MBI
Morphological Building Index: extracts buildings from a high-resolution top-view image.
Language: MATLAB - Size: 3.39 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 14 - Forks: 2

dewshr/NCBI-GenBank-file-parser
This program can be used to parse the NCBI GenBank file to create a tabulated csv file.
Language: Python - Size: 10.7 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 10 - Forks: 11

Ardevop-sk/nlp-tools
Natural Language Processing is the process by which a computer understands human language. This library provides a set of tools to understand and extract information from unstructured text in the Slovak language.
Language: Java - Size: 3.05 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 8 - Forks: 0

Dovyski/payload-info-action
GitHub Action to extract info from the webhook payload object using jq filters.
Language: JavaScript - Size: 2.5 MB - Last synced at: 9 days ago - Pushed at: 5 months ago - Stars: 6 - Forks: 7

mohammed-rampurawala/extract_android_data
Language: Java - Size: 91.8 KB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 5 - Forks: 3

danielgp/sharepoint-extractor
Extract information from online SharePoint using nodejs framework
Language: JavaScript - Size: 265 KB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 4 - Forks: 0

RocktimRajkumar/ATS
🏆 An applicant tracking system (ATS) is a software application that enables the electronic handling of recruitment and hiring needs. Corporate recruiters or hiring managers can then search and sort through the resumes in a number of ways, depending on the needs
Language: Python - Size: 1.82 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 3

VictorAlessander/Smith
A toolkit that makes it easy to web scrape the world.
Language: Python - Size: 107 KB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 1

arevish/Brain-wave-detection-system
The project focuses on the detection and extraction of a brain wave signal with the help of analog as well as digital circuitry. Using active electrodes on the human scalp, the brain signals were fed into a series of hardware and software stages. Simple conscious movements such as blinking caused a change in the detected waveform. Although the project was not successful in discriminating between different motions or in using the signal to control an electrical device, the team was able to separate and display the alpha waves after filtering out all associated unwanted signals.
Size: 6.75 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

avidito/swifind
Web scraping scripting language and toolset.
Language: Python - Size: 171 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 0

RobbiNespu/MyKad 📦
A simple Packagist package to extract information from the Malaysian Identity Card (MyKad)
Language: PHP - Size: 14.6 KB - Last synced at: 3 months ago - Pushed at: about 4 years ago - Stars: 3 - Forks: 1

BaseMax/ExtractWord
Extract word(s) from the lines of the file.
Language: PHP - Size: 23.4 KB - Last synced at: 8 days ago - Pushed at: about 6 years ago - Stars: 3 - Forks: 1

friedrith/natural-script
Script language to parse English expressions.
Language: JavaScript - Size: 1.99 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

jalal246/corename
Automatically extracts the packages' root name for monorepos
Language: JavaScript - Size: 111 KB - Last synced at: 29 days ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

trinhdoduyhungss/Plant_keywords_extraction
In this project, you will learn how to easily extract keywords, the words that matter more than others in a sentence, and how to apply them in an actual project. It is a plant keyword extraction project that targets the words characterizing plants.
Language: JavaScript - Size: 945 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

praveengadiyaram369/Antlr_Java_Repository_Miner
Mining Software Repositories Project to analyze Java projects to extract information regarding the evolution of antlr4 patterns
Language: Python - Size: 2.08 MB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 0

praveengadiyaram369/MSR-2
Mining Software Repositories project to analyze antlr4 projects and extract information regarding enter, exit and visit methods
Language: Python - Size: 892 KB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 1

MrShoenel/mkvinfo2json
MkvInfo2Json is a tool to recursively scan directories for MKV files and extract meta-information using mkvinfo that is then stored as JSON.
Language: JavaScript - Size: 328 KB - Last synced at: 2 months ago - Pushed at: about 8 years ago - Stars: 2 - Forks: 0

drisskhattabi6/Chat-with-PDF-Locally
Chat with PDF locally: an advanced chatbot that uses Ollama LLMs to interactively extract information from PDFs, built with Streamlit, Ollama, and LangChain
Language: Python - Size: 11.4 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0

Agenta-AI/job_extractor_template
Template for an AI application that extracts job information from a job description using OpenAI functions and LangChain
Language: Python - Size: 15.6 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

Purviharniya/SPAR10
Language: Jupyter Notebook - Size: 99.5 MB - Last synced at: about 1 year ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 3

rattletat/yt-scraper
A simple command utility to extract information from the YouTube API v3 for scientific purposes.
Language: Python - Size: 433 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 1

Herrmenn/alexa-skill
Language: Java - Size: 107 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

yeremi/stopwords
A lightweight and efficient PHP library tailored for developers working on Natural Language Processing (NLP) tasks in Brazilian Portuguese.
Language: PHP - Size: 167 KB - Last synced at: 24 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 1

CodewithAbhi7/Chat-with-Various-type-Documents-using-Meta-AI
This Streamlit application allows users to upload multiple files (PDFs, DOCX, HTML, and images) and extract text from them. The extracted content is processed into text chunks, embedded into a FAISS vector store, and used for question-answering with the help of the Meta AI API.
Language: Python - Size: 22.5 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

vincenzoaltavilla/monitora-messaggi
Expandable program that lets an admin check the interaction trend of every user on an e-learning platform using the logs, so that its dynamics can be tracked periodically.
Language: Python - Size: 101 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

amit2014/PDF-Extractor
PDF Extractor, a powerful Python application that simplifies the extraction of highlighted text from PDF files.
Language: HTML - Size: 26.1 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

Abhilashayagyaseni/Data-modelling-project
Visualizing and extracting insights from several different sets of related data
Size: 2.76 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

marlonmoreira1/projeto4_telecom_churn
In this project, my task is to build a logistic regression model that predicts whether a customer is going to cancel their plan (Yes or No) and the probability of each outcome.
Language: Jupyter Notebook - Size: 895 KB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

SymbolofMoon/TweepyStream
A CLI tool based on Python
Language: Python - Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

trsavi/Polovni-Automobili-Webscraper
Script that extracts information from car ads on a website and stores it in a MySQL database for later use.
Language: Python - Size: 1.01 MB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 1

FEMessage/parse-resume-server
🔍 A Node.js server that parses resumes and extracts information
Language: JavaScript - Size: 83 KB - Last synced at: 2 months ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 1

SkriptInsight/DocExtractionTool
Tool to extract information from Skript and its addons
Language: Java - Size: 5.16 MB - Last synced at: 8 days ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

BaseMax/SmartFilter
Smart filtering to keep or remove characters or words in a text. (SOON)
Language: PHP - Size: 102 KB - Last synced at: 8 days ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 2

MrShoenel/mediainfo2web
A simple Angular-based tool that can run through a directory of media-files analyzed by mediainfo and show aggregated information about them.
Size: 1000 Bytes - Last synced at: 2 months ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0
