GitHub / easonlai / chat_with_pdf_table
This repository shows how to extract table data from a PDF file and preprocess it for word embedding. The preprocessing step makes table data more readable for language models and allows more contextual information to be extracted from the tables.
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/easonlai%2Fchat_with_pdf_table
PURL: pkg:github/easonlai/chat_with_pdf_table
Stars: 9
Forks: 4
Open issues: 1
License: None
Language: Jupyter Notebook
Size: 85.9 KB
Dependencies parsed at: Pending
Created at: over 1 year ago
Updated at: 9 months ago
Pushed at: over 1 year ago
Last synced at: 29 days ago
Topics: azure-openai, chroma, chromadb, embedding-models, embedding-vectors, embeddings, langchain, langchain-python, pdf, pdf-document-processor, pdf-parser, pdf-parsing, python, word-embeddings