anyparser_core

Anyparser Python SDK for RAG/ETL Pipelines - File Content Extraction. Supports extraction from various file formats including PDF, Microsoft Office documents, OCR/Image to Text, Audio to Text, and Website to Text.

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anyparser%2Fanyparser_core

Stars: 1
Forks: 1
Open issues: 0

License: apache-2.0
Language: Python
Size: 296 KB
Dependencies parsed at: Pending

Created at: 3 months ago
Updated at: 2 months ago
Pushed at: 2 months ago
Last synced at: 24 days ago

Topics: cache-augmented-generation, crawler, crewai, etl-framework, etl-pipeline, knowledge-graph, knowledgebase, langchain, langgraph, llamaindex, ms-office, n8n, ocr, openai, pdf, python, rag, retrieval-augmented-generation, search-engine, typescript
