Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub / matebenyovszky / algolia-pdf-crawler

This repository contains a PDF crawler that extracts text from PDF documents (currently using the Microsoft Read model) and uploads it to an Algolia index.

JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/matebenyovszky%2Falgolia-pdf-crawler
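The JSON API address above follows a visible pattern: the host name, then the repository's full name with its slash percent-encoded as `%2F`. A minimal sketch of building that URL and fetching it with the Python standard library is shown here. The helper names `repo_api_url` and `fetch_repo_metadata` are hypothetical, the URL template is inferred only from the single address listed on this page, and no assumption is made about which fields the response contains.

```python
import json
import urllib.parse
import urllib.request

# Root inferred from the JSON API link shown on this page (an assumption,
# not taken from the service's documentation).
API_ROOT = "https://repos.ecosyste.ms/api/v1"


def repo_api_url(host: str, full_name: str) -> str:
    """Build the metadata URL for a repository such as 'owner/name'."""
    # safe="" makes quote() encode "/" as %2F, matching the listed URL.
    encoded_name = urllib.parse.quote(full_name, safe="")
    return f"{API_ROOT}/hosts/{host}/repositories/{encoded_name}"


def fetch_repo_metadata(host: str, full_name: str, timeout: float = 10.0):
    """Fetch and decode the JSON document; the structure is returned as-is."""
    url = repo_api_url(host, full_name)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `repo_api_url("GitHub", "matebenyovszky/algolia-pdf-crawler")` reproduces the address listed above, and `fetch_repo_metadata` with the same arguments would request it over the network.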

Stars: 0
Forks: 0
Open Issues: 0

License: other
Language: Python
Repo Size: 49.8 KB
Dependencies: 4

Created: 7 months ago
Updated: 6 months ago
Last pushed: 6 months ago
Last synced: 6 months ago

Topics: algolia, algolia-api, algolia-search, azure-ai, beautifulsoup, bs4, ocr, pdf

Dependencies
    Dockerfile (docker)