embedding-models | Topic | Ecosyste.ms: Repos

Separius/awesome-sentence-embedding 📦

A curated list of pretrained sentence and word embedding models

Language: Python - Size: 282 KB - Last synced at: 18 days ago - Pushed at: about 4 years ago - Stars: 2,260 - Forks: 263

Hironsan/awesome-embedding-models

A curated list of awesome embedding models tutorials, projects and communities.

Language: Jupyter Notebook - Size: 47.9 KB - Last synced at: 17 days ago - Pushed at: over 6 years ago - Stars: 1,796 - Forks: 251

ContextualAI/gritlm

Generative Representational Instruction Tuning

Language: Jupyter Notebook - Size: 11.3 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 654 - Forks: 47

StarlightSearch/EmbedAnything

Production-ready Inference, Ingestion and Indexing built in Rust 🦀

Language: Rust - Size: 30.9 MB - Last synced at: 3 days ago - Pushed at: 19 days ago - Stars: 648 - Forks: 56

Sujit-O/pykg2vec

Python library for knowledge graph embedding and representation learning.

Language: Python - Size: 9.29 MB - Last synced at: 20 days ago - Pushed at: about 4 years ago - Stars: 614 - Forks: 113

marl/openl3

OpenL3: Open-source deep audio and image embeddings

Language: Jupyter Notebook - Size: 687 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 517 - Forks: 60

BBC-Esq/VectorDB-Plugin

Plugin that lets you ask questions about your documents including audio and video files.

Language: Python - Size: 34.4 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 340 - Forks: 44

mana-ysh/knowledge-graph-embeddings 📦

Implementations of Embedding-based methods for Knowledge Base Completion tasks

Language: Python - Size: 10.2 MB - Last synced at: 9 days ago - Pushed at: over 4 years ago - Stars: 259 - Forks: 63

CVxTz/image_search_engine

Image search engine

Language: Python - Size: 1.75 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 226 - Forks: 41

lgalke/vec4ir

Word Embeddings for Information Retrieval

Language: Python - Size: 965 KB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 225 - Forks: 42

spcl/ncc

Neural Code Comprehension: A Learnable Representation of Code Semantics

Language: Python - Size: 9.16 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 213 - Forks: 51

yusufhilmi/client-vector-search

A client side vector search library that can embed, store, search, and cache vectors. Works on the browser and node. It outperforms OpenAI's text-embedding-ada-002 and is way faster than Pinecone and other VectorDBs.

Language: TypeScript - Size: 314 KB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 210 - Forks: 14

akutuzov/webvectors

Web-ify your word2vec: framework to serve distributional semantic models online

Language: Python - Size: 4.85 MB - Last synced at: 7 days ago - Pushed at: 5 months ago - Stars: 200 - Forks: 47

mangopy/tool-retrieval-benchmark

Official code for ACL2025 "🔍 Retrieval Models Aren’t Tool-Savvy: Benchmarking Tool Retrieval for Large Language Models"

Language: JavaScript - Size: 3.3 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 176 - Forks: 2

jgraving/selfsne

Self-Supervised Noise Embeddings (Self-SNE)

Language: Jupyter Notebook - Size: 2.79 MB - Last synced at: 22 days ago - Pushed at: 3 months ago - Stars: 158 - Forks: 13

formath/tensorflow-predictor-cpp

TensorFlow prediction using the C++ API

Language: Python - Size: 94.7 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 146 - Forks: 60

ALucek/QuicKB

Optimize Document Retrieval with Fine-Tuned KnowledgeBases

Language: Python - Size: 1.63 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 107 - Forks: 21

p768lwy3/torecsys

ToR[e]cSys is a PyTorch framework for implementing recommendation system algorithms, including but not limited to click-through-rate (CTR) prediction, learning-to-rank (LTR), and matrix/tensor embedding. The project aims to build an ecosystem for experimenting with, sharing, reproducing, and deploying models in real-world settings smoothly and easily.

Language: Python - Size: 6.42 MB - Last synced at: 12 days ago - Pushed at: over 3 years ago - Stars: 104 - Forks: 18

shobrook/weightgain

Train an adapter for any embedding model in under a minute

Language: Python - Size: 544 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 98 - Forks: 2

shamspias/langchain-chat

langchain-chat is an AI-driven Q&A system that leverages OpenAI's GPT-4 model and FAISS for efficient document indexing. It loads and splits documents from websites or PDFs, remembers conversations, and provides accurate, context-aware answers based on the indexed data. Easy to set up and extend.

Language: Python - Size: 1.34 MB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 86 - Forks: 17

HITsz-TMG/KaLM-Embedding

Code for KaLM-Embedding models

Language: Python - Size: 319 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 81 - Forks: 6

ikergarcia1996/MetaVec

A monolingual and cross-lingual meta-embedding generation and evaluation framework

Language: Python - Size: 69.3 KB - Last synced at: 3 months ago - Pushed at: about 3 years ago - Stars: 80 - Forks: 5

kaushalshetty/Positional-Encoding

Encoding position with the word embeddings.

Language: Jupyter Notebook - Size: 154 KB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 79 - Forks: 13

D2KLab/entity2vec

Generates a set of property-specific entity embeddings from knowledge graphs using node2vec

Language: Python - Size: 23.7 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 77 - Forks: 24

ART-Group-it/KERMIT

🐸 KERMIT - A lightweight library to encode and interpret Universal Syntactic Embeddings

Language: JavaScript - Size: 14.6 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 56 - Forks: 8

datquocnguyen/STransE

STransE: a novel embedding model of entities and relationships in knowledge bases (NAACL 2016)

Language: C++ - Size: 36.4 MB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 51 - Forks: 16

RoyZhengGao/edge2vec

Learning node representation using edge semantics

Language: Python - Size: 9.65 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 48 - Forks: 21

oracle-samples/ai-optimizer

GenAI/RAG Optimizer and Toolkit for experimentation using Oracle Database AI Vector Search

Language: Python - Size: 25.9 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 47 - Forks: 23

user1342/Tweezer

A binary analysis tool for identifying unknown function names, using a word-2-vec model

Language: Python - Size: 11.5 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 47 - Forks: 5

Glaciohound/VCML

PyTorch implementation of the paper "Visual Concept-Metaconcept Learner", NeurIPS 2019

Language: Python - Size: 2.43 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 47 - Forks: 7

alisonbma/aiSFX

Representation Learning for the Automatic Indexing of Sound Effects Libraries (ISMIR 2022): Deep audio embeddings pre-trained on UCS & Non-UCS-compliant datasets.

Language: Python - Size: 59.6 KB - Last synced at: 21 days ago - Pushed at: about 2 years ago - Stars: 45 - Forks: 4

worldbank/GISTEmbed

GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings

Language: Python - Size: 1.3 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 42 - Forks: 3

nsrinidhibhat/gradio_RAG

Code and resources showcasing the Retrieval-Augmented Generation (RAG) technique, a solution for enhancing data freshness in Large Language Models (LLMs). Incorporate up-to-date external knowledge into LLM-generated responses. Additionally, this repository includes a Gradio-based user interface for seamless model deployment.

Language: Python - Size: 779 KB - Last synced at: 12 months ago - Pushed at: almost 2 years ago - Stars: 35 - Forks: 13

databricks-industry-solutions/product-search

Semantic product search on Databricks

Language: Python - Size: 513 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 32 - Forks: 14

thustorage/PetPS

PetPS: Supporting Huge Embedding Models with Tiered Memory

Language: C++ - Size: 32.2 MB - Last synced at: 7 months ago - Pushed at: about 1 year ago - Stars: 30 - Forks: 2

BoYanSTKO/place2vec

Place2Vec ground truth dataset

Size: 16.6 KB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 25 - Forks: 7

yuniko-software/tokenizer-to-onnx-model

Convert Hugging Face tokenizers to ONNX models for cross-language compatibility (.NET, Java, Python) with embedding models

Language: Jupyter Notebook - Size: 43 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 24 - Forks: 2

maxscheurer/cppe

C++ and Python library for Polarizable Embedding

Language: C++ - Size: 4.56 MB - Last synced at: 6 days ago - Pushed at: 11 months ago - Stars: 22 - Forks: 5

UWNETLAB/dcss_supplementary

Supplementary materials for McLevey 2021 Doing Computational Social Science (Sage, UK).

Language: HTML - Size: 436 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 18 - Forks: 9

Zipstack/unstract-adapters

Unstract's interface to LLMs, Embeddings and VectorDBs.

Language: Python - Size: 632 KB - Last synced at: 8 days ago - Pushed at: 12 months ago - Stars: 18 - Forks: 3

FabianGroeger96/deep-embedded-music

Creation of an embedding space using unsupervised triplet loss and Tile2Vec that can be used for a variety of downstream tasks

Language: Jupyter Notebook - Size: 26.4 MB - Last synced at: 2 months ago - Pushed at: almost 4 years ago - Stars: 18 - Forks: 2

su-park/mteb_ko_leaderboard

Korean text embedding model leaderboard

Size: 2.51 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 17 - Forks: 1

Wang-Yu-Qing/UTPM

Code for paper: Learning to Build User-tag Profile in Recommendation System

Language: Python - Size: 2.79 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 17 - Forks: 4

rbitr/ferrite

Simple, lightweight transformers in Fortran

Language: Fortran - Size: 28.3 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 16 - Forks: 1

easonlai/chatbot_with_pdf_streamlit

This code example shows how to make a chatbot for semantic search over documents using Streamlit, LangChain, and various vector databases. The chatbot lets users ask questions and get answers from a document collection. The code is in Python and can be customized for different scenarios and data.

Language: Jupyter Notebook - Size: 6.57 MB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 15 - Forks: 5

ashutosh1919/data2vec-pytorch

Ready to run PyTorch implementation of Data2Vec 2.0: Highly efficient self-supervised representation learning for vision, speech and text.

Language: Python - Size: 116 KB - Last synced at: 5 days ago - Pushed at: over 2 years ago - Stars: 15 - Forks: 3

chaoweifang/PFE

Piecewise Flat Embedding for Image Segmentation

Language: C++ - Size: 35 MB - Last synced at: about 2 years ago - Pushed at: about 7 years ago - Stars: 14 - Forks: 3

KevKibe/docindex 📦

⚡️Framework for fast persistent storage of multiple document embeddings and metadata into Pinecone for source-traceable, production-level RAG.

Language: Python - Size: 816 KB - Last synced at: 28 days ago - Pushed at: 7 months ago - Stars: 13 - Forks: 3

aws-samples/fine-tune-embedding-models-on-sagemaker

This repository contains samples for fine-tuning embedding models using Amazon SageMaker. Embedding models are useful for tasks such as semantic similarity, text clustering, and information retrieval. Fine-tuning these models on your specific domain data can greatly improve their performance.

Language: Jupyter Notebook - Size: 47.9 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 12 - Forks: 0

ritaranx/BMRetriever

This is the code for our paper "BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers".

Language: Python - Size: 728 KB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 12 - Forks: 2

mana-ysh/symmetry-learning-kgc

Python implementation of "Data-dependent Learning of Symmetric/Antisymmetric Relations for Knowledge Base Completion [Manabe+. 2018]"

Language: Python - Size: 11.7 MB - Last synced at: 9 days ago - Pushed at: over 7 years ago - Stars: 11 - Forks: 1

itmo-mbss-lab/sr_labs_book

The project is related to the development of labs for the ITMO Speaker Recognition Course.

Language: Jupyter Notebook - Size: 3.25 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 10 - Forks: 8

easonlai/chat_with_pdf_table

The contents of this repository showcase how to extract table data from a PDF file and preprocess it to facilitate word embedding. This preprocessing step enhances the readability of table data for language models and enables us to extract more contextual information from the tables.

Language: Jupyter Notebook - Size: 85.9 KB - Last synced at: 20 days ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 4

natelalor/AI_report_generator

A tool that converts long audio files into a thorough, summarized report. Leverages OpenAI and its API (ChatGPT backend), Langchain for text processing, and Pinecone for vector database facilitation.

Language: Python - Size: 15.3 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 2

karolzak/images-vector-search

Simple implementation of search for visually similar images using deep learning and vector search. It is based on pretrained ImageNet weights, so it doesn't require any additional training.

Language: Python - Size: 9.23 MB - Last synced at: 2 months ago - Pushed at: almost 2 years ago - Stars: 8 - Forks: 2

bnabis93/vision-language-examples

Vision-language model example code.

Language: Python - Size: 2.99 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 7 - Forks: 0

agankur21/entity_disambiguation

Neural network models that map a mention in a text to the corresponding entity in the knowledge base

Language: Python - Size: 1.99 MB - Last synced at: over 1 year ago - Pushed at: almost 8 years ago - Stars: 7 - Forks: 3

sovit-123/local_file_search

Local file search using embedding techniques

Language: Python - Size: 113 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 6 - Forks: 1

touhi99/DL_Dialogue_act_classification

DL Lab Project: given a subset of the Switchboard corpus, the goal is to classify dialogue acts from speech and text data. We define an RNN-LSTM model for text classification and a CNN model for speech classification, then ensemble both models to produce a more stable, higher-performing model.

Language: Python - Size: 1.33 MB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 0

chaosgen/awesome-sentence-embedding

A curated list of pretrained sentence and word embedding models

Language: Python - Size: 213 KB - Last synced at: 4 days ago - Pushed at: 9 months ago - Stars: 5 - Forks: 0

Seven-33/langchain-chat

langchain-chat is an AI-driven Q&A system that leverages OpenAI's GPT-4 model and FAISS for efficient document indexing. It loads and splits documents from websites or PDFs, remembers conversations, and provides accurate, context-aware answers based on the indexed data. Easy to set up and extend.

Language: Python - Size: 433 KB - Last synced at: 3 months ago - Pushed at: about 2 years ago - Stars: 5 - Forks: 0

nur-ag/IGEL

IGEL: Inductive Graph Embeddings through Locality Encodings

Language: Jupyter Notebook - Size: 13.2 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

p16i/siamese-net-and-friends

Language: Python - Size: 5.41 MB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 5 - Forks: 3

SnowNation101/NYX

Unified Multimodal Retriever for RAG

Language: Python - Size: 1.92 MB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 4 - Forks: 0

louisbrulenaudet/lemone-embed

All-in-one repo for the Lemone-embed project, a series of fine-tuned embedding models for Tax retrieval augmented generation (RAG).

Language: Python - Size: 3.6 MB - Last synced at: 12 days ago - Pushed at: 3 months ago - Stars: 4 - Forks: 0

jargonsdev/ai

The AI-Powered assistant for jargons.dev ecosystem

Language: TypeScript - Size: 135 KB - Last synced at: 11 days ago - Pushed at: 4 months ago - Stars: 4 - Forks: 1

whw199833/gbiz_torch

A comprehensive toolkit package designed to help you accurately predict key metrics in commercial settings

Language: Python - Size: 242 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

huyhoang17/Visual_Embedding_Tutorial

MNIST Embedding Visualisation using Tensorflow Projector, link blog:

Language: HTML - Size: 15.6 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 0

xinyu-intel/ncf_mxnet

Neural Collaborative Filtering with MXNet

Language: Python - Size: 27.4 MB - Last synced at: 3 months ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 2

andikarachman/News-Title-Classification

Machine learning model to classify news into categories based on their headline

Language: Jupyter Notebook - Size: 6.26 MB - Last synced at: over 2 years ago - Pushed at: almost 6 years ago - Stars: 4 - Forks: 2

trustlelab/siteware-backend-v2

Siteware Backend - German Voice AI Agent provider - Deepgram + Twilio + Elevenlabs + OpenAI + Pinecone

Language: TypeScript - Size: 110 KB - Last synced at: 3 months ago - Pushed at: 8 months ago - Stars: 3 - Forks: 0

wuji3/nlpdk

Natural Language Processing(NLP) Toolbox

Language: Python - Size: 324 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 3 - Forks: 1

ksm26/Embedding-Models-From-Architecture-to-Implementation

Understand and build embedding models, focusing on word and sentence embeddings, dual encoder architectures. Learn to train embedding models using contrastive loss, implement them in semantic search and RAG systems.

Language: Jupyter Notebook - Size: 2 MB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 3 - Forks: 0

olasunkanmi-SE/IntelliSearch

IntelliSearch is an advanced retrieval-based question-answering and recommendation system that leverages embeddings and a large language model (LLM) to provide accurate and relevant information to users.

Language: TypeScript - Size: 1010 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

BogdanFloris/detecting-and-addressing-change

Code for my Master Thesis: How to detect and address changes in machine learning based data pipelines

Language: Python - Size: 151 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 3 - Forks: 0

GiorgiaAuroraAdorni/bachelor-thesis

Detailed analysis of the research project, carried out during my bachelor internship, entitled "Neural Networks for the Learning of personality traits from Natural Language". @ Unimib 17/18.

Language: TeX - Size: 12.6 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

MottoX/EBR-papers

Helpful papers on embedding-based retrieval (EBR)

Size: 1.95 KB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 0

GiorgiaAuroraAdorni/learning-personality

Learning Personality is a bachelor internship project that uses neural networks to automatically extract personality traits from natural language, in particular from reviews. @ Unimib 17/18.

Language: Python - Size: 1010 KB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 1

firojalam/crisis-embedding-models

Embedding models designed using crisis related tweets collected by AIDR (http://aidr.qcri.org/)

Language: Python - Size: 950 KB - Last synced at: about 2 months ago - Pushed at: almost 7 years ago - Stars: 3 - Forks: 0

SINGHxTUSHAR/IMDB-Analysis

IMDB-Analysis is a sentiment analysis project that classifies movie reviews as positive or negative. The model uses a simple RNN architecture with word2vec embeddings and is deployed as a Streamlit web app on an open cloud service.

Language: Jupyter Notebook - Size: 16 MB - Last synced at: 3 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

SubhangiSati/RAG-using-DeepSeek-R1

This repository highlights my learning journey in building Retrieval-Augmented Generation (RAG) pipelines using DeepSeek on Lightning AI, covering document ingestion, retrieval, and integration with generative AI. It showcases fine-tuning, evaluation, and optimization for accurate open-domain QA and knowledge management.

Language: Jupyter Notebook - Size: 1.01 MB - Last synced at: 4 months ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

tobiasodion/RAGBOT

A CLI chatbot that uses the RAG architecture to improve and adapt an LLM to a specific context. It allows users to ask questions and get responses directly from open-source LLMs (OpenAI, MistralAI etc.) or from the information on a website, which is provided as context using the RAG architecture.

Language: JavaScript - Size: 788 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

madeyexz/social_vegan

A dating (match-making) app for serious daters with embeddings and vector database.

Language: Python - Size: 2.98 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

VectifyAI/FAE

A method to fine-tune the black box OpenAI’s embedding model.

Language: Jupyter Notebook - Size: 16 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

karhunenloeve/NTOPL 📦

Estimation of Neural Network Dimension using Algebraic Topology and Lie Theory.

Language: Python - Size: 713 KB - Last synced at: 9 months ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

nanxstats/exp2vec

🧬 Tissue-specific gene embeddings trained on GTEx data

Language: R - Size: 99.6 MB - Last synced at: about 18 hours ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 1

BatoolHamawi/COVID-19WordEmbeddings

COVID-19 Arabic Word Embeddings is a domain-specific pre-trained distributed word representation of COVID-19 tweets that aims to provide the Arabic NLP research community with free-to-use and powerful word embedding models.

Language: Jupyter Notebook - Size: 54.7 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

pHequals7/NLP_Notebooks

NLP related concepts, challenges and datasets

Language: Jupyter Notebook - Size: 172 KB - Last synced at: about 1 year ago - Pushed at: almost 6 years ago - Stars: 2 - Forks: 0

imansaleh16/Stack-Overflow-Tags-Communities

Dataset used to produce communities of related tags in Stack Overflow

Size: 7.99 MB - Last synced at: 10 months ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

sthanhng/Fashion-MNIST-Embedding-Visualization

Fashion-MNIST Embedding Visualization using TensorFlow Projector

Language: Python - Size: 14.1 MB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

kaggledevs/Datasets

Collection of datasets related to ML, AI, NLP and DL

Size: 2.93 KB - Last synced at: 25 days ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 1

mana-ysh/poincare-embeddings

Implementation of poincare embeddings

Language: Scala - Size: 12.7 KB - Last synced at: 9 days ago - Pushed at: almost 8 years ago - Stars: 2 - Forks: 1

rafay123321/embedding-hallucinations

This repo shows how foundation models hallucinate and how such hallucinations can be fixed by fine-tuning them

Language: Python - Size: 476 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 1 - Forks: 0

with-caer/curtana

Simplified zero-cost wrapper over llama.cpp powered by the llama-cpp-2 crate.

Language: Rust - Size: 24.4 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

yuniko-software/bge-m3-onnx

ONNX implementation of the BGE-M3 multilingual embedding model and tokenizer. Generates all three embedding types: dense vectors, sparse weights, and ColBERT vectors

Language: C# - Size: 51.8 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 1 - Forks: 0

harehimself/pinecone-lab

Experimenting with Pinecone as vector data continues to take center stage in AI-native systems. The purpose of this project is to explore the core capabilities, benchmark performance across different embedding models, and better understand what is possible with vector search in production environments.

Language: Python - Size: 301 KB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 1 - Forks: 0
