GitHub / lh0x00 / docsifer
Docsifer is a tool for converting various data formats into Markdown for applications such as indexing and text analysis. It supports PDF, PowerPoint, Word, Excel, images, audio, HTML, and other text-based formats, and uses LLMs to enhance performance.
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lh0x00%2Fdocsifer
PURL: pkg:github/lh0x00/docsifer
Stars: 5
Forks: 0
Open issues: 0
License: MIT
Language: Python
Size: 150 KB
Dependencies parsed at: Pending
Created at: 6 months ago
Updated at: 5 months ago
Pushed at: 5 months ago
Last synced at: 3 months ago
Topics: analysis, autogen, chunking, docsier, documents, emeddings, indexing, langchain, llama-index, markdown, markitdown, rag, text-embeddings, text-processing, vector-database