An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: document-embedding

dissorial/doc-chatbot

Document chatbot — multiple files, topics, chat windows and chat history. Powered by GPT.

Language: TypeScript - Size: 2.54 MB - Last synced at: about 8 hours ago - Pushed at: almost 2 years ago - Stars: 854 - Forks: 146

ddangelov/RESTful-Top2Vec

Expose a Top2Vec model with a REST API.

Language: Python - Size: 243 KB - Last synced at: 11 days ago - Pushed at: over 2 years ago - Stars: 90 - Forks: 20

BobXWu/FASTopic

A Fast, Adaptive, Stable, and Transferable Topic Model (NeurIPS 2024)

Language: Python - Size: 1.67 MB - Last synced at: 14 days ago - Pushed at: 3 months ago - Stars: 94 - Forks: 6

ddangelov/Top2Vec

Top2Vec learns jointly embedded topic, document and word vectors.

Language: Python - Size: 83.4 MB - Last synced at: 19 days ago - Pushed at: 6 months ago - Stars: 3,028 - Forks: 375

cnuahs/semantic-history-search

A Chrome extension to provide semantic search over your browsing history.

Language: TypeScript - Size: 521 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Tobsky/DocuQuery

This Streamlit application demonstrates the integration of ChatGroq (Llama3 model), OpenAIEmbeddings, and FAISS for document embedding and retrieval.

Language: Python - Size: 1.58 MB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

ehtisham-sadiq/Exploring-Word2Vec-and-Doc2Vec

Dive into the world of Word2Vec and Doc2Vec models to uncover insights and applications.

Language: Jupyter Notebook - Size: 2.93 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

EQTPartners/pause

🍊 PAUSE (Positive and Annealed Unlabeled Sentence Embedding), accepted by EMNLP'2021 🌴

Language: Python - Size: 83 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 25 - Forks: 1

chen0040/java-text-embedding

Word embedding in Java

Language: Java - Size: 55.7 KB - Last synced at: about 1 month ago - Pushed at: about 4 years ago - Stars: 7 - Forks: 1

samhavens/flair-as-service

Container-first, JSON-configurable, NLP REST service based on Flair

Language: Python - Size: 23.4 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 10 - Forks: 0

pprablanc/doc_embedding_topic_mod

Improving document embedding with weighted average of word embedding through topic modeling

Language: R - Size: 1.37 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

mathiasbruun/politician2vec

Utilities for learning, manipulating, and visualising politician embeddings in semantic space and inferring party positions.

Language: Python - Size: 181 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

jdenes/TopicEmbeddings

An open-source framework to create and test document embeddings using topic models.

Language: Python - Size: 208 MB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

marcomoldovan/hierarchical-language-modeling

We address the task of learning contextualized word, sentence and document representations with a hierarchical language model by stacking Transformer-based encoders on a sentence level and subsequently on a document level and performing masked token prediction.

Language: Jupyter Notebook - Size: 6.83 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 6 - Forks: 0

stko-lab/LD-Connect

LD Connect: A Linked Data Portal for IOS Press Scientometrics

Language: JavaScript - Size: 2.96 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

leyresv/Book_Recommendation_System

Content-based book recommendation system

Language: Python - Size: 24.4 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

maxoodf/tgnews

Telegram Data Clustering Contest (Bossy Gnu's submission)

Language: C++ - Size: 41 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 2

ChiaraDiBonaventura/covid_opinion

Applying NLP to understand people's sentiment about Covid-19 and Government actions in Italy, conditional on their political affiliation.

Language: Jupyter Notebook - Size: 13.4 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

eriknovak/python-text-embedding-microservice

Service for producing text representations via word embeddings

Language: Python - Size: 248 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 2

inimah/Neural-Language-Models

Experiments on Neural Language Embeddings

Language: Python - Size: 187 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

Related Keywords
document-embedding (20), word-embeddings (7), nlp (6), topic-modeling (5), natural-language-processing (3), word-embedding (3), semantic-search (3), machine-learning (2), sentence-encoder (2), top2vec (2), text-search (2), clustering (2), word-vectors (2), language-model (2), sentence-embeddings (2), deep-learning (2), openai (2), embeddings (2), langchain (2), nlp-machine-learning (2), knowledge-graph (1), geo-enrichment (1), coreference-resolution (1), knowledge-graph-embedding (1), ld-connect (1), linked-data (1), ontology-engineering (1), rdf (1), scientometrics (1), transformer (1), transfer-learning (1), representation-learning (1), pytorch (1), natural-language-understanding (1), information-retrieval (1), document-retrieval (1), attention-mechanism (1), topic-models (1), political-scaling (1), ideological-scaling (1), distributed-representations (1), translation-model (1), sequence-to-sequence (1), semi-supervised-learning (1), mono-language (1), cross-languages (1), cross-language-embeddings (1), binary-classification (1), bilingual-word-embedding (1), microservice (1), tsne-algorithm (1), tfidf-vectorizer (1), tfidf-text-analysis (1), tfidf (1), matrix-factorization (1), latent-dirichlet-allocation (1), data-visualization (1), data-preprocessing (1), data-cleaning (1), data-analysis (1), covid19-data (1), covid-19 (1), word2vec (1), telegram (1), document-similarity (1), document-clustering (1), cpp (1), cosine-similarity (1), bookrecommendsystem (1), sparql (1), semantic-web (1), nlp-apis (1), topic-search (1), topic-modelling (1), text-semantic-similarity (1), sentence-transformers (1), pre-trained-language-models (1), bert (1), neural-topic-models (1), neural-topic-modeling (1), topic-model (1), text-similarity (1), semantic-search-engine (1), restful-api (1), rest-api (1), fastapi (1), vectorization (1), typescript (1), tailwindcss (1), reactjs (1), pinecone (1), pdf-processing (1), openai-api (1), nextjs (1), mongoose (1), gpt-4 (1), gpt-3 (1), chatbot (1), chat (1), kubernetes (1)