An open API service providing repository metadata for many open source software ecosystems.

GitHub / easonlai / chat_with_pdf_table

This repository shows how to extract table data from a PDF file and preprocess it for word embedding. The preprocessing makes table data easier for language models to read and lets them draw more contextual information from the tables.
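
One way to picture this kind of preprocessing is the minimal sketch below, which turns each table row into a sentence of "header: value" pairs so that every embedded chunk carries its own column context. It is not taken from the repository: the function name `rows_to_sentences`, the sample table, and the assumption that the table arrives as a list of rows with `None` for empty cells are all hypothetical.

```python
from typing import List, Optional


def rows_to_sentences(table: List[List[Optional[str]]]) -> List[str]:
    """Turn each data row into one 'header: value' sentence.

    Hypothetical helper: assumes the first row holds the column names
    and that empty cells are None or blank strings.
    """
    if not table:
        return []
    header = [(cell or "").strip() for cell in table[0]]
    sentences = []
    for row in table[1:]:
        pairs = []
        for name, cell in zip(header, row):
            value = (cell or "").strip()
            # Skip empty cells and unnamed columns so they add no noise.
            if name and value:
                pairs.append(f"{name}: {value}")
        if pairs:
            sentences.append("; ".join(pairs) + ".")
    return sentences


# Made-up sample data, for illustration only.
sample_table = [
    ["Region", "Quarter", "Revenue"],
    ["EMEA", "Q1", "1.2M"],
    ["APAC", "Q1", None],
]
sentences = rows_to_sentences(sample_table)
for sentence in sentences:
    print(sentence)
```

Each resulting sentence could then be passed to an embedding model and stored in a vector database. Keeping the column name next to every value is a design choice: a row retrieved on its own still says what each number means.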

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/easonlai%2Fchat_with_pdf_table
PURL: pkg:github/easonlai/chat_with_pdf_table

Stars: 9
Forks: 4
Open issues: 1

License: None
Language: Jupyter Notebook
Size: 85.9 KB
Dependencies parsed at: Pending

Created at: over 1 year ago
Updated at: 9 months ago
Pushed at: over 1 year ago
Last synced at: 29 days ago

Topics: azure-openai, chroma, chromadb, embedding-models, embedding-vectors, embeddings, langchain, langchain-python, pdf, pdf-document-processor, pdf-parser, pdf-parsing, python, word-embeddings
