Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub / matebenyovszky / algolia-pdf-crawler
This repository contains a PDF crawler that extracts text from PDF documents (currently using the Microsoft Read model) and uploads it to an Algolia index.
JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matebenyovszky%2Falgolia-pdf-crawler
Stars: 0
Forks: 0
Open Issues: 0
License: other
Language: Python
Repo Size: 49.8 KB
Dependencies: 4
Created: 7 months ago
Updated: 6 months ago
Last pushed: 6 months ago
Last synced: 6 months ago
Topics: algolia, algolia-api, algolia-search, azure-ai, beautifulsoup, bs4, ocr, pdf
Dependencies
- python 3.9 build
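
The JSON API address listed above follows a visible pattern: the host name, then the repository's full name with the slash percent-encoded as %2F. A minimal sketch of building that address and requesting the record, using only the Python standard library, might look like the following. The names API_BASE, repository_url, and fetch_repository are hypothetical helpers introduced for illustration, and the layout of the returned JSON is not described on this page, so the sketch makes no assumption about its fields.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the JSON API URL shown on this page.
API_BASE = "https://repos.ecosyste.ms/api/v1"


def repository_url(host: str, full_name: str) -> str:
    """Build the JSON API URL for a repository (hypothetical helper).

    The "owner/name" separator is percent-encoded as %2F so that it
    occupies a single path segment, matching the URL listed above.
    """
    encoded_host = urllib.parse.quote(host, safe="")
    encoded_name = urllib.parse.quote(full_name, safe="")
    return f"{API_BASE}/hosts/{encoded_host}/repositories/{encoded_name}"


def fetch_repository(host: str, full_name: str, timeout: float = 10.0) -> dict:
    """Request the repository record and decode it as JSON (hypothetical helper).

    Requires network access. The response fields are not assumed here;
    inspect the returned dictionary to see what the service provides.
    """
    url = repository_url(host, full_name)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example call (performs a network request):
# record = fetch_repository("GitHub", "matebenyovszky/algolia-pdf-crawler")
# print(sorted(record))
```

Keeping URL construction separate from the network call makes the encoding step easy to check without contacting the service.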