An open API service providing repository metadata for many open source software ecosystems.

Topic: "unstructured-data"

iterative/dvc

🦉 Data Versioning and ML Experiments

Language: Python - Size: 19.6 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 14,683 - Forks: 1,235

voxel51/fiftyone

Refine high-quality datasets and visual AI models

Language: Python - Size: 1.96 GB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 9,730 - Forks: 652

Zipstack/unstract

No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents

Language: Python - Size: 34.4 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 5,467 - Forks: 515

neo4j-labs/llm-graph-builder

Neo4j graph construction from unstructured data using LLMs

Language: Jupyter Notebook - Size: 53 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 3,613 - Forks: 618

towhee-io/towhee

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

Language: Python - Size: 37.2 MB - Last synced at: 2 days ago - Pushed at: 9 months ago - Stars: 3,390 - Forks: 261

instill-ai/instill-core

🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications

Language: Python - Size: 11 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2,273 - Forks: 118

milvus-io/bootcamp

Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.

Language: Jupyter Notebook - Size: 111 MB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 2,168 - Forks: 638

nomic-ai/nomic

Interact, analyze and structure massive text, image, embedding, audio and video datasets

Language: Python - Size: 24.1 MB - Last synced at: 2 days ago - Pushed at: about 1 month ago - Stars: 1,758 - Forks: 193

dingodb/dingo

A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ultra-low latency.

Language: Java - Size: 29.6 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 1,613 - Forks: 262

tstanislawek/awesome-document-understanding

A curated list of resources on Document Understanding (DU)

Size: 5.56 MB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 1,440 - Forks: 161

NanoNets/docext

An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)

Language: Python - Size: 6.76 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 1,355 - Forks: 99

emcf/thepipe

Get clean data from tricky documents, powered by vision-language models ⚡

Language: Python - Size: 4.07 MB - Last synced at: 23 days ago - Pushed at: about 2 months ago - Stars: 1,277 - Forks: 80

shcherbak-ai/contextgem

ContextGem: Effortless LLM extraction from documents

Language: Python - Size: 39.9 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,254 - Forks: 95

lotus-data/lotus

LOTUS: A semantic query engine for fast and easy LLM-powered data processing

Language: Python - Size: 1.51 MB - Last synced at: about 16 hours ago - Pushed at: about 21 hours ago - Stars: 1,248 - Forks: 107

Renumics/spotlight

Interactively explore unstructured datasets from your dataframe.

Language: TypeScript - Size: 47 MB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 1,186 - Forks: 86

yobix-ai/extractous

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

Language: Rust - Size: 2.88 MB - Last synced at: 7 days ago - Pushed at: 7 months ago - Stars: 1,183 - Forks: 52

amphi-ai/amphi-etl

Visual Data Preparation and Transformation. Low-Code Python-based ETL.

Language: TypeScript - Size: 1.58 MB - Last synced at: 14 days ago - Pushed at: 2 months ago - Stars: 1,079 - Forks: 68

databricks/lilac

Curate better data for LLMs

Language: Python - Size: 37 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 1,046 - Forks: 100

Open-Source-Legal/OpenContracts

Enterprise-grade and API-first LLM workspace for unstructured documents, including data extraction, redaction, rights management, prompt playground, and more!

Language: Python - Size: 131 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 894 - Forks: 88

nuclia/nucliadb

NucliaDB, The AI Search database for RAG

Language: Python - Size: 40.5 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 700 - Forks: 54

EulerSearch/embedding_studio

Embedding Studio is a framework that lets you transform your vector database into a feature-rich search engine.

Language: Python - Size: 10.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 380 - Forks: 5

graphlit/graphlit-mcp-server

Model Context Protocol (MCP) Server for Graphlit Platform

Language: TypeScript - Size: 605 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 318 - Forks: 35

garyelephant/pygrok

Python implementation of jordansissel's grok regular expression library

Language: Python - Size: 66.4 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 280 - Forks: 75

harishdeivanayagam/rowfill

Open-source unstructured data (PDFs, images, audio files) processing platform built for knowledge workers

Language: TypeScript - Size: 1.2 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 277 - Forks: 14

fzliu/radient

Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more.

Language: Python - Size: 127 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 276 - Forks: 11

automorphic-ai/trex

Enforce structured output from LLMs 100% of the time

Language: Python - Size: 7.81 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 239 - Forks: 8

RelevanceAI/relevanceai

Home of the AI workforce - Multi-agent system, AI agents & tools

Language: Python - Size: 69 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 232 - Forks: 36

velocitybolt/open-extract

Structured Data Extractor for AI Agents. Search your documents or the web for specific data and get it back in JSON or Markdown in a single tool call.

Language: Python - Size: 8.91 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 162 - Forks: 13

CambioML/any-parser

Accurate, private and configurable document retrieval LLM

Language: Python - Size: 27.6 MB - Last synced at: 6 days ago - Pushed at: 29 days ago - Stars: 126 - Forks: 12

mitdbg/palimpzest

A System for Optimized Semantic Computation

Language: Python - Size: 374 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 119 - Forks: 21

jostmey/dkm

Dynamic Kernel Matching (DKM) for Classifying Data with Non-conforming Features

Language: HTML - Size: 7.15 MB - Last synced at: 4 months ago - Pushed at: about 2 years ago - Stars: 95 - Forks: 6

DerwenAI/strwythura

How to construct knowledge graphs from unstructured data sources

Language: Jupyter Notebook - Size: 1.22 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 67 - Forks: 6

wangxb96/RAG-QA-Generator

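RAG-QA-Generator is an automated knowledge base construction and management tool for retrieval-augmented generation (RAG) systems. It reads document data, uses large language models to generate high-quality question-answer (QA) pairs, and inserts them into a database, automating the construction and management of the RAG system's knowledge base.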

Language: Python - Size: 1.72 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 62 - Forks: 6

kuzudb/baml-kuzu-demo

Demo of knowledge graph creation and Graph RAG with BAML and Kuzu

Language: Python - Size: 5.09 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 60 - Forks: 12

BartJongejan/Bracmat

Programming language for symbolic computation with unusual combination of pattern matching features: Tree patterns, associative patterns and expressions embedded in patterns.

Language: C - Size: 23.9 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 47 - Forks: 5

IBM/pixiedust-facebook-analysis 📦

A Jupyter notebook that uses the Watson Visual Recognition and Natural Language Understanding services to enrich Facebook Analytics and uses Cognos Dashboard Embedded to explore and visualize the results in Watson Studio

Language: Jupyter Notebook - Size: 6.67 MB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 44 - Forks: 64

instill-ai/console

📺 Instill Console for 🔮 Instill Core: https://github.com/instill-ai/instill-core

Language: TypeScript - Size: 13.3 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 40 - Forks: 10

ScrapeGraphAI/Scrapontologies

Python library for extracting entities, relationships, and schemas from documents

Language: Python - Size: 688 KB - Last synced at: 7 days ago - Pushed at: 8 months ago - Stars: 40 - Forks: 3

adansons/base

Adansons Base is a data programming tool for error-analysis of training results. It organizes metadata of unstructured data and creates and organizes datasets. It makes dataset creation more effective and helps to find low-quality data by using the training results and improves AI performance.

Language: Jupyter Notebook - Size: 12.8 MB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 28 - Forks: 3

instill-ai/pipeline-backend

⇋ A REST/gRPC server for Instill VDP API service

Language: Go - Size: 74.3 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 26 - Forks: 21

antvis/T8

🧬 Narrative text visualization for unstructured data.

Language: TypeScript - Size: 593 KB - Last synced at: 4 days ago - Pushed at: 12 days ago - Stars: 25 - Forks: 1

instill-ai/cli

⌨️ Instill CLI for 🔮 Instill Core: https://github.com/instill-ai/instill-core

Language: Go - Size: 634 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 23 - Forks: 3

Zipstack/unstract-sdk

A framework for writing Unstract Tools/Apps

Language: Python - Size: 3.45 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 22 - Forks: 1

osllmai/inDox

The Indox Ecosystem offers integrated AI tools for data workflows. Our four components (IndoxArcg, IndoxMiner, IndoxJudge, and IndoxGen) enhance AI applications with advanced retrieval, extraction, evaluation, and generation capabilities, supporting multiple document formats and LLM providers.

Language: Jupyter Notebook - Size: 106 MB - Last synced at: 4 days ago - Pushed at: 17 days ago - Stars: 20 - Forks: 2

instill-ai/deprecated-model

⚗️ Instill Model contains components for AI model orchestration

Language: Makefile - Size: 6.06 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 4

Zipstack/unstract-adapters

Unstract's interface to LLMs, Embeddings and VectorDBs.

Language: Python - Size: 632 KB - Last synced at: 19 days ago - Pushed at: about 1 year ago - Stars: 18 - Forks: 3

instill-ai/model-backend

⇋ A REST/gRPC server for Instill Model API service

Language: Go - Size: 20.2 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 17 - Forks: 7

TuanaCelik/unstructuredio-haystack

💙 Unstructured Data Connectors for Haystack 2.0

Language: Python - Size: 22.5 KB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 16 - Forks: 2

nicbet/infozilla

The infoZilla unstructured software engineering data mining tool. It can find and extract source code regions, patches, stack traces, enumerations and itemizations from discussion threads.

Language: Java - Size: 530 KB - Last synced at: 4 months ago - Pushed at: over 6 years ago - Stars: 15 - Forks: 2

SachinKalsi/html_tag_annotator

A machine learning tool for creating training datasets quickly and easily with a smart Chrome extension

Language: JavaScript - Size: 11.8 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 3

IBM/generate-insights-from-data-formats-with-watson 📦

How do we process data in different formats such as docx and pdf, and generate insights that can be linked with structured data in a database? This pattern helps establish relations between structured and unstructured data to generate recommendations using Watson NLU and Watson Studio.

Language: Jupyter Notebook - Size: 1.06 MB - Last synced at: about 2 months ago - Pushed at: about 5 years ago - Stars: 14 - Forks: 14

jokruger/rl3examples

RL3 examples repository (information extraction, NER, NLP, web & text mining, etc).

Language: Python - Size: 89.8 KB - Last synced at: 9 months ago - Pushed at: over 6 years ago - Stars: 14 - Forks: 1

aclai-lab/SoleData.jl

Manage logical datasets!

Language: Julia - Size: 4.16 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 13 - Forks: 2

lazyhope/metamodel

Intelligent Schema Designer and Unstructured Data Parser

Language: JavaScript - Size: 164 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 13 - Forks: 0

floriancochard/extract-data-from-paper

A tool designed to extract numerical data from scanned historical weather documents.

Language: Python - Size: 151 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 13 - Forks: 2

instill-ai/deprecated-core

🔮 Instill Core contains components for supporting Instill VDP and Instill Model

Language: Makefile - Size: 1.25 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 13 - Forks: 4

mkearney/wibble

Web Data Frames

Language: R - Size: 497 KB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 12 - Forks: 0

chaitjo/knowledge-graphs

Building Knowledge Graphs from Unstructured Text

Language: Jupyter Notebook - Size: 42.9 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 11 - Forks: 6

aws-samples/content-repository-with-dynamic-access-control

Code and walkthrough to build an end-to-end content repository for unstructured data with dynamic access control.

Language: TypeScript - Size: 1000 KB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 10 - Forks: 1

rririanto/unstructured-demo-streamlit

Extract your docs (CSV, PDF, JSON, HTML, DOCS, Sheets and more) for your own GPT and LLM projects using Unstructured.io via streamlit

Language: Python - Size: 6.84 KB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 8 - Forks: 0

maithilish/gotz

Gotz - Heavy duty ETL to automate data extraction from tons of HTML pages

Language: Java - Size: 1.41 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 8 - Forks: 0

ZoralLabs/rl3stdlib

The RL3 Standard Library is a collection of modules accessible to a RL3 program to simplify the programming process and removing the need to rewrite commonly used RL3 patterns and predicates.

Size: 32.2 KB - Last synced at: 9 months ago - Pushed at: over 6 years ago - Stars: 8 - Forks: 2

DataCanvasIO/dingo (fork of dingodb/dingo)

A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ultra-low latency.

Language: Java - Size: 18.9 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 2

MoinDalvs/Resume_Screening_and_Parser

Business objective: the document classification solution should significantly reduce manual human effort in HRM. It should achieve a higher level of accuracy and automation with minimal human intervention. Sample data set details: resumes and financial documents.

Language: Jupyter Notebook - Size: 95.9 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 2

Clarifai/clarifai-python-datautils

Extract, transform, and load unstructured data into Clarifai's AI platform

Language: Python - Size: 1.01 MB - Last synced at: 5 days ago - Pushed at: 17 days ago - Stars: 6 - Forks: 0

Shizheng-Wen/GAOT-3D

Code release for "Geometry-Aware Operator Transformer: A Generalizable Framework for PDE Solution Operators"

Language: Python - Size: 1.9 MB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 6 - Forks: 0

libraryofcelsus/LLM_File_Parser

AutoML/Unstructured Data Processing for RAG and LLM Dataset Creation. Current Database Options are: Qdrant or Marqo DB.

Language: Python - Size: 43 KB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 1

thu-west/AnnotationTool

An annotation tool designed for unstructured health data

Language: Java - Size: 13.8 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 4

kodexa-ai/kodexa

Kodexa Python Client

Language: Python - Size: 12.8 MB - Last synced at: 7 days ago - Pushed at: 9 days ago - Stars: 4 - Forks: 1

SupermatAI/supermat

Novel data representation leading to granular citations and higher accuracy

Language: Python - Size: 5.57 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 4 - Forks: 1

hupe1980/go-textractor

📄 Amazon Textract response parser written in Go.

Language: Go - Size: 6.24 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

Menziess/Databook

Data Engineering knowledge as a readable tutorial (collaboratively).

Size: 2.44 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 4 - Forks: 1

kaloslazo/PyFuseDB

Database system that combines structured data retrieval through inverted indexes with unstructured data (images, audio) search using multidimensional vector embeddings, all within a unified platform.

Language: Python - Size: 631 MB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 3 - Forks: 0

instill-ai/connector-backend 📦

⇋ A REST/gRPC server for Instill AI's data connector service

Language: JavaScript - Size: 1.63 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 3

yrnigam/Named-Entity-Recognition-NER-using-LSTMs

Named Entity Recognition (NER) using LSTMs with Keras

Language: Jupyter Notebook - Size: 3.78 MB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 6

saranpal/Spark-RDD-Set-Top-Box-Data-Analysis

Spark RDD transformation and action, process unstructured data

Language: Scala - Size: 654 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 3

instill-ai/helm-charts

⎈ The Helm charts of Instill AI

Size: 237 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2 - Forks: 1

b-cubed-eu/comp-unstructured-data

Scripts to explore the conditions that determine the reliability of models, trends and status by comparing aggregated cubes with structured monitoring schemes

Language: R - Size: 1.69 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

SAP-archive/hana-structurer-one 📦

SAP HANA Extreme application that analyzes unstructured data (tweets) to retrieve information such as location, people, companies, and also sentiment analysis.

Language: CSS - Size: 3.81 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 4

pradeepdev-1995/Index-based-semantic-similarity-unstructured-data-search

Unstructured data refers to information that is not organised using a predetermined data model or schema and cannot be stored in a conventional relational database system. There are several methods for searching unstructured data semantically, that is, by taking into account the actual context and meaning of the sentences. One of the best approaches is the index-based approach.

Language: Jupyter Notebook - Size: 249 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

dominiksalvet/crypto-addr-extract

Extract cryptocurrency addresses from big datasets

Language: Python - Size: 42 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

instill-ai/mgmt-backend

⇋ A REST/gRPC server for Instill AI's Management API service

Language: Go - Size: 1.09 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 2

instill-ai/.github

🏡 Instill AI organisation profile and default configuration

Size: 52.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

kodexa-ai/kodexa-cli

Command Line Tools for Kodexa

Language: Python - Size: 1.12 MB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 1

camlab-ethz/GAOT

Code release for "Geometry Aware Operator Transformer As An Efficient And Accurate Neural Surrogate For PDEs On Arbitrary Domains"

Language: Python - Size: 35.9 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

Toschu95/my-benefit-finder-vienna

My Benefit Finder Vienna is an AI-powered system designed to help individuals in Vienna quickly find and apply for relevant social benefits, grants, and subsidies. Using RAG (Retrieval-Augmented Generation) and a Large Language Model (LLM), this tool provides personalized recommendations based on the latest available data from official sources.

Language: Jupyter Notebook - Size: 637 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

FroCode/Realtime_Streaming_Unstructured-Data

Real-time streaming and processing of unstructured data (spark, airflow)

Language: Python - Size: 128 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

faisalman/re-parse-js

Compose structured data from unstructured text using regex-based pattern matching, as found in UAParser.js

Language: TypeScript - Size: 31.3 KB - Last synced at: 2 days ago - Pushed at: 7 months ago - Stars: 1 - Forks: 1

yeisonmontoya1815/Special-Topics-in-Data-Analytics

In my PDD Data Analytics studies at Douglas College, the Special Topics course stands out as a crucial component. This specialized module delves into advanced aspects of data analysis beyond the core curriculum, offering a deep exploration of intricate domains. Through this focused study, I aim to enhance my proficiency in handling complex datasets

Language: Jupyter Notebook - Size: 15.2 MB - Last synced at: 4 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

konhay/sector-attention-index

Built specifically for the research proposal "Estimating sector attention index with deep learning methods: example of Chinese stock market", Jan. 4, 2024.

Language: Python - Size: 864 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

esteininger/file-processor

A Python library that uses AI to convert unstructured files (like PDFs, HTML, etc.) into structured data.

Language: Python - Size: 114 KB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

alexandreLamarre/Fission

Data analytics & Structured streaming optimized for the Edge

Language: Rust - Size: 31.3 KB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

mkirslis/Warship-Data

Generates a CSV file of warship data from Wikipedia.

Language: Python - Size: 155 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

ash-0521/Abandoned-Object-Detection-in-crowded-environment-using-MATLAB

Trained MATLAB models for 82% precision/80% recall, optimized with blob analysis for 25% performance boost. User-friendly alarm system with 500+ engaged users.

Size: 682 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

ttariqaziz/statistical_modeling_matlab

Highlights of my research work in MATLAB, statistical modeling of the unstructured raw data from GPS satellites for several years. Data modeling and processing, followed by different residual plots including trends and root mean square. In the end, the result was compared with independent data set models for validation purposes. The results were also presented at a European conference.

Size: 10.8 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

aws-samples/content-repository-with-multilingual-search

Code and walkthrough to build an end-to-end content repository for unstructured data with multilingual semantic search and dynamic access control.

Language: TypeScript - Size: 3.19 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

mohamad1014/SmartSearch

A repository demonstrating the use of LLMs for semantic search. The data considered are specific curated documents targeting closed-domain search. It was created to show how relatively simple it is to use these methods and increase productivity within an organisation.

Language: Python - Size: 3.91 KB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

malfusion/sentiment-keyword-extraction

Multi-Pipeline Keyword Extractor and Word Cloud Visualizer for Sentiment Analysis tasks

Language: Java - Size: 6.76 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

boomalope/ltb

Code for my working paper: The Winners and Losers of Rental Tribunals (February 14, 2022). Available at SSRN: https://ssrn.com/abstract=4029114

Language: HTML - Size: 69.7 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

adisorbo/NEON_tool

NEON mines rules for detecting natural language patterns in software informal documents. The inferred rules can be used for identifying and extracting relevant information embedded in unstructured texts.

Language: Java - Size: 68.8 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 5