Topic: "unstructured-data"
iterative/dvc
🦉 Data Versioning and ML Experiments
Language: Python - Size: 19.6 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 14,683 - Forks: 1,235

voxel51/fiftyone
Refine high-quality datasets and visual AI models
Language: Python - Size: 1.96 GB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 9,730 - Forks: 652

Zipstack/unstract
No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents
Language: Python - Size: 34.4 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 5,467 - Forks: 515

neo4j-labs/llm-graph-builder
Neo4j graph construction from unstructured data using LLMs
Language: Jupyter Notebook - Size: 53 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 3,613 - Forks: 618

towhee-io/towhee
Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
Language: Python - Size: 37.2 MB - Last synced at: 2 days ago - Pushed at: 9 months ago - Stars: 3,390 - Forks: 261

instill-ai/instill-core
🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications
Language: Python - Size: 11 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 2,273 - Forks: 118

milvus-io/bootcamp
Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.
Language: Jupyter Notebook - Size: 111 MB - Last synced at: 16 days ago - Pushed at: 17 days ago - Stars: 2,168 - Forks: 638

nomic-ai/nomic
Interact, analyze and structure massive text, image, embedding, audio and video datasets
Language: Python - Size: 24.1 MB - Last synced at: 2 days ago - Pushed at: about 1 month ago - Stars: 1,758 - Forks: 193

dingodb/dingo
A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ultra-low latency.
Language: Java - Size: 29.6 MB - Last synced at: 4 days ago - Pushed at: 5 days ago - Stars: 1,613 - Forks: 262

tstanislawek/awesome-document-understanding
A curated list of resources for Document Understanding (DU) topic
Size: 5.56 MB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 1,440 - Forks: 161

NanoNets/docext
An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit. (https://idp-leaderboard.org/)
Language: Python - Size: 6.76 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 1,355 - Forks: 99

emcf/thepipe
Get clean data from tricky documents, powered by vision-language models ⚡
Language: Python - Size: 4.07 MB - Last synced at: 23 days ago - Pushed at: about 2 months ago - Stars: 1,277 - Forks: 80

shcherbak-ai/contextgem
ContextGem: Effortless LLM extraction from documents
Language: Python - Size: 39.9 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,254 - Forks: 95

lotus-data/lotus
LOTUS: A semantic query engine for fast and easy LLM-powered data processing
Language: Python - Size: 1.51 MB - Last synced at: about 16 hours ago - Pushed at: about 21 hours ago - Stars: 1,248 - Forks: 107

Renumics/spotlight
Interactively explore unstructured datasets from your dataframe.
Language: TypeScript - Size: 47 MB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 1,186 - Forks: 86

yobix-ai/extractous
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Language: Rust - Size: 2.88 MB - Last synced at: 7 days ago - Pushed at: 7 months ago - Stars: 1,183 - Forks: 52

amphi-ai/amphi-etl
Visual Data Preparation and Transformation. Low-Code Python-based ETL.
Language: TypeScript - Size: 1.58 MB - Last synced at: 14 days ago - Pushed at: 2 months ago - Stars: 1,079 - Forks: 68

databricks/lilac
Curate better data for LLMs
Language: Python - Size: 37 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 1,046 - Forks: 100

Open-Source-Legal/OpenContracts
Enterprise-grade and API-first LLM workspace for unstructured documents, including data extraction, redaction, rights management, prompt playground, and more!
Language: Python - Size: 131 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 894 - Forks: 88

nuclia/nucliadb
NucliaDB, The AI Search database for RAG
Language: Python - Size: 40.5 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 700 - Forks: 54

EulerSearch/embedding_studio
Embedding Studio is a framework that allows you to transform your Vector Database into a feature-rich Search Engine.
Language: Python - Size: 10.2 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 380 - Forks: 5

graphlit/graphlit-mcp-server
Model Context Protocol (MCP) Server for Graphlit Platform
Language: TypeScript - Size: 605 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 318 - Forks: 35

garyelephant/pygrok
Python implementation of jordansissel's grok regular expression library
Language: Python - Size: 66.4 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 280 - Forks: 75

harishdeivanayagam/rowfill
Open-source unstructured data (PDFs, Images, Audiofiles) processing platform built for knowledge workers
Language: TypeScript - Size: 1.2 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 277 - Forks: 14

fzliu/radient
Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more.
Language: Python - Size: 127 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 276 - Forks: 11

automorphic-ai/trex
Enforce structured output from LLMs 100% of the time
Language: Python - Size: 7.81 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 239 - Forks: 8

RelevanceAI/relevanceai
Home of the AI workforce - Multi-agent system, AI agents & tools
Language: Python - Size: 69 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 232 - Forks: 36

velocitybolt/open-extract
Structured Data Extractor for AI Agents. Search your documents or the web for specific data and get it back in JSON or Markdown in a single tool call.
Language: Python - Size: 8.91 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 162 - Forks: 13

CambioML/any-parser
Accurate, private and configurable document retrieval LLM
Language: Python - Size: 27.6 MB - Last synced at: 6 days ago - Pushed at: 29 days ago - Stars: 126 - Forks: 12

mitdbg/palimpzest
A System for Optimized Semantic Computation
Language: Python - Size: 374 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 119 - Forks: 21

jostmey/dkm
Dynamic Kernel Matching (DKM) for Classifying Data with Non-conforming Features
Language: HTML - Size: 7.15 MB - Last synced at: 4 months ago - Pushed at: about 2 years ago - Stars: 95 - Forks: 6

DerwenAI/strwythura
How to construct knowledge graphs from unstructured data sources
Language: Jupyter Notebook - Size: 1.22 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 67 - Forks: 6

wangxb96/RAG-QA-Generator
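RAG-QA-Generator is an automated knowledge-base construction and management tool for retrieval-augmented generation (RAG) systems. It reads document data, uses large language models to generate high-quality question-answer (QA) pairs, and inserts them into a database, automating the construction and management of a RAG system's knowledge base.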
Language: Python - Size: 1.72 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 62 - Forks: 6

kuzudb/baml-kuzu-demo
Demo of knowledge graph creation and Graph RAG with BAML and Kuzu
Language: Python - Size: 5.09 MB - Last synced at: 6 days ago - Pushed at: about 1 month ago - Stars: 60 - Forks: 12

BartJongejan/Bracmat
Programming language for symbolic computation with unusual combination of pattern matching features: Tree patterns, associative patterns and expressions embedded in patterns.
Language: C - Size: 23.9 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 47 - Forks: 5

IBM/pixiedust-facebook-analysis 📦
A Jupyter notebook that uses the Watson Visual Recognition and Natural Language Understanding services to enrich Facebook Analytics and uses Cognos Dashboard Embedded to explore and visualize the results in Watson Studio
Language: Jupyter Notebook - Size: 6.67 MB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 44 - Forks: 64

instill-ai/console
📺 Instill Console for 🔮 Instill Core: https://github.com/instill-ai/instill-core
Language: TypeScript - Size: 13.3 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 40 - Forks: 10

ScrapeGraphAI/Scrapontologies
Python library for Entities, relationships and schemas extraction from documents
Language: Python - Size: 688 KB - Last synced at: 7 days ago - Pushed at: 8 months ago - Stars: 40 - Forks: 3

adansons/base
Adansons Base is a data programming tool for error-analysis of training results. It organizes metadata of unstructured data and creates and organizes datasets. It makes dataset creation more effective and helps to find low-quality data by using the training results and improves AI performance.
Language: Jupyter Notebook - Size: 12.8 MB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 28 - Forks: 3

instill-ai/pipeline-backend
⇋ A REST/gRPC server for Instill VDP API service
Language: Go - Size: 74.3 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 26 - Forks: 21

antvis/T8
🧬 Narrative text visualization for unstructured data.
Language: TypeScript - Size: 593 KB - Last synced at: 4 days ago - Pushed at: 12 days ago - Stars: 25 - Forks: 1

instill-ai/cli
⌨️ Instill CLI for 🔮 Instill Core: https://github.com/instill-ai/instill-core
Language: Go - Size: 634 KB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 23 - Forks: 3

Zipstack/unstract-sdk
A framework for writing Unstract Tools/Apps
Language: Python - Size: 3.45 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 22 - Forks: 1

osllmai/inDox
The Indox Ecosystem offers integrated AI tools for data workflows. Our four components (IndoxArcg, IndoxMiner, IndoxJudge, and IndoxGen) enhance AI applications with advanced retrieval, extraction, evaluation, and generation capabilities, supporting multiple document formats and LLM providers.
Language: Jupyter Notebook - Size: 106 MB - Last synced at: 4 days ago - Pushed at: 17 days ago - Stars: 20 - Forks: 2

instill-ai/deprecated-model
⚗️ Instill Model contains components for AI model orchestration
Language: Makefile - Size: 6.06 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 20 - Forks: 4

Zipstack/unstract-adapters
Unstract's interface to LLMs, Embeddings and VectorDBs.
Language: Python - Size: 632 KB - Last synced at: 19 days ago - Pushed at: about 1 year ago - Stars: 18 - Forks: 3

instill-ai/model-backend
⇋ A REST/gRPC server for Instill Model API service
Language: Go - Size: 20.2 MB - Last synced at: 9 days ago - Pushed at: 10 days ago - Stars: 17 - Forks: 7

TuanaCelik/unstructuredio-haystack
💙 Unstructured Data Connectors for Haystack 2.0
Language: Python - Size: 22.5 KB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 16 - Forks: 2

nicbet/infozilla
The infoZilla unstructured software engineering data mining tool. It can find and extract source code regions, patches, stack traces, enumerations and itemizations from discussion threads.
Language: Java - Size: 530 KB - Last synced at: 4 months ago - Pushed at: over 6 years ago - Stars: 15 - Forks: 2

SachinKalsi/html_tag_annotator
A machine learning tool for creating training datasets quickly and easily using a smart Chrome extension
Language: JavaScript - Size: 11.8 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 3

IBM/generate-insights-from-data-formats-with-watson 📦
How do we process data in different formats, such as docx and pdf, and generate insights to be linked with structured data in a database? This pattern helps establish relations between structured and unstructured data to generate recommendations using Watson NLU and Watson Studio.
Language: Jupyter Notebook - Size: 1.06 MB - Last synced at: about 2 months ago - Pushed at: about 5 years ago - Stars: 14 - Forks: 14

jokruger/rl3examples
RL3 examples repository (information extraction, NER, NLP, web & text mining, etc).
Language: Python - Size: 89.8 KB - Last synced at: 9 months ago - Pushed at: over 6 years ago - Stars: 14 - Forks: 1

aclai-lab/SoleData.jl
Manage logical datasets!
Language: Julia - Size: 4.16 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 13 - Forks: 2

lazyhope/metamodel
Intelligent Schema Designer and Unstructured Data Parser
Language: JavaScript - Size: 164 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 13 - Forks: 0

floriancochard/extract-data-from-paper
A tool designed to extract numerical data from scanned historical weather documents.
Language: Python - Size: 151 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 13 - Forks: 2

instill-ai/deprecated-core
🔮 Instill Core contains components for supporting Instill VDP and Instill Model
Language: Makefile - Size: 1.25 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 13 - Forks: 4

mkearney/wibble
Web Data Frames
Language: R - Size: 497 KB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 12 - Forks: 0

chaitjo/knowledge-graphs
Building Knowledge Graphs from Unstructured Text
Language: Jupyter Notebook - Size: 42.9 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 11 - Forks: 6

aws-samples/content-repository-with-dynamic-access-control
Code and walkthrough to build an end-to-end content repository for unstructured data with dynamic access control.
Language: TypeScript - Size: 1000 KB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 10 - Forks: 1

rririanto/unstructured-demo-streamlit
Extract your docs (CSV, PDF, JSON, HTML, DOCS, Sheets and more) for your own GPT and LLM projects using Unstructured.io via streamlit
Language: Python - Size: 6.84 KB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 8 - Forks: 0

maithilish/gotz
Gotz - Heavy duty ETL to automate data extraction from tons of HTML pages
Language: Java - Size: 1.41 MB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 8 - Forks: 0

ZoralLabs/rl3stdlib
The RL3 Standard Library is a collection of modules accessible to an RL3 program that simplify the programming process and remove the need to rewrite commonly used RL3 patterns and predicates.
Size: 32.2 KB - Last synced at: 9 months ago - Pushed at: over 6 years ago - Stars: 8 - Forks: 2

DataCanvasIO/dingo (fork of dingodb/dingo)
A multi-modal vector database that supports upserts and vector queries using unified SQL (MySQL-Compatible) on structured and unstructured data, while meeting the requirements of high concurrency and ultra-low latency.
Language: Java - Size: 18.9 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 7 - Forks: 2

MoinDalvs/Resume_Screening_and_Parser
Business objective: the document classification solution should significantly reduce manual human effort in HRM. It should achieve a higher level of accuracy and automation with minimal human intervention. Sample data set details: resumes and financial documents.
Language: Jupyter Notebook - Size: 95.9 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 2

Clarifai/clarifai-python-datautils
Extract Transform and Load unstructured data into the Clarifai's AI platform
Language: Python - Size: 1.01 MB - Last synced at: 5 days ago - Pushed at: 17 days ago - Stars: 6 - Forks: 0

Shizheng-Wen/GAOT-3D
About code release of "Geometry-Aware Operator Transformer: A Generalizable Framework for PDE Solution Operators"
Language: Python - Size: 1.9 MB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 6 - Forks: 0

libraryofcelsus/LLM_File_Parser
AutoML/Unstructured Data Processing for RAG and LLM Dataset Creation. Current Database Options are: Qdrant or Marqo DB.
Language: Python - Size: 43 KB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 1

thu-west/AnnotationTool
An annotation tool designed for unstructured health data
Language: Java - Size: 13.8 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 4

kodexa-ai/kodexa
Kodexa Python Client
Language: Python - Size: 12.8 MB - Last synced at: 7 days ago - Pushed at: 9 days ago - Stars: 4 - Forks: 1

SupermatAI/supermat
Novel data representation leading to granular citations and higher accuracy
Language: Python - Size: 5.57 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 4 - Forks: 1

hupe1980/go-textractor
📄 Amazon Textract response parser written in Go.
Language: Go - Size: 6.24 MB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

Menziess/Databook
Data Engineering knowledge as a readable tutorial (collaboratively).
Size: 2.44 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 4 - Forks: 1

kaloslazo/PyFuseDB
Database system that combines structured data retrieval through inverted indexes with unstructured data (images, audio) search using multidimensional vector embeddings, all within a unified platform.
Language: Python - Size: 631 MB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 3 - Forks: 0

instill-ai/connector-backend 📦
⇋ A REST/gRPC server for Instill AI's data connector service
Language: JavaScript - Size: 1.63 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 3

yrnigam/Named-Entity-Recognition-NER-using-LSTMs
Named Entity Recognition (NER) using LSTMs with Keras
Language: Jupyter Notebook - Size: 3.78 MB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 6

saranpal/Spark-RDD-Set-Top-Box-Data-Analysis
Spark RDD transformation and action, process unstructured data
Language: Scala - Size: 654 KB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 3

instill-ai/helm-charts
⎈ The Helm charts of Instill AI
Size: 237 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2 - Forks: 1

b-cubed-eu/comp-unstructured-data
Scripts to explore the conditions that determine the reliability of models, trends and status by comparing aggregated cubes with structured monitoring schemes
Language: R - Size: 1.69 MB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

SAP-archive/hana-structurer-one 📦
SAP HANA Extreme application that analyzes unstructured data (tweets) to retrieve information such as locations, people, and companies, and to perform sentiment analysis.
Language: CSS - Size: 3.81 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 4

pradeepdev-1995/Index-based-semantic-similarity-unstructured-data-search
Unstructured data refers to information that is not organised using a predetermined data model or schema and cannot be stored in a conventional relational database system. There are several methods for searching unstructured data semantically, that is, by taking the actual context and meaning of the sentences into account. One good approach is the index-based approach.
Language: Jupyter Notebook - Size: 249 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

dominiksalvet/crypto-addr-extract
Extract cryptocurrency addresses from big datasets
Language: Python - Size: 42 KB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

instill-ai/mgmt-backend
⇋ A REST/gRPC server for Instill AI's Management API service
Language: Go - Size: 1.09 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 2

instill-ai/.github
🏡 Instill AI organisation profile and default configuration
Size: 52.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

kodexa-ai/kodexa-cli
Command Line Tools for Kodexa
Language: Python - Size: 1.12 MB - Last synced at: 7 days ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 1

camlab-ethz/GAOT
About code release of "Geometry Aware Operator Transformer As An Efficient And Accurate Neural Surrogate For PDEs On Arbitrary Domains"
Language: Python - Size: 35.9 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

Toschu95/my-benefit-finder-vienna
My Benefit Finder Vienna is an AI-powered system designed to help individuals in Vienna quickly find and apply for relevant social benefits, grants, and subsidies. Using RAG (Retrieval-Augmented Generation) and a Large Language Model (LLM), this tool provides personalized recommendations based on the latest available data from official sources.
Language: Jupyter Notebook - Size: 637 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

FroCode/Realtime_Streaming_Unstructured-Data
Real-time streaming and processing of unstructured data (spark, airflow)
Language: Python - Size: 128 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

faisalman/re-parse-js
Compose structured data from unstructured text using regex-based pattern matching, as found in UAParser.js
Language: TypeScript - Size: 31.3 KB - Last synced at: 2 days ago - Pushed at: 7 months ago - Stars: 1 - Forks: 1

yeisonmontoya1815/Special-Topics-in-Data-Analytics
In my PDD Data Analytics studies at Douglas College, the Special Topics course stands out as a crucial component. This specialized module delves into advanced aspects of data analysis beyond the core curriculum, offering a deep exploration of intricate domains. Through this focused study, I aim to enhance my proficiency in handling complex datasets
Language: Jupyter Notebook - Size: 15.2 MB - Last synced at: 4 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

konhay/sector-attention-index
Specifically built for the research proposal: Estimating sector attention index with deep learning methods: example of Chinese stock market, Jan. 4, 2024.
Language: Python - Size: 864 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

esteininger/file-processor
A Python library that uses AI to convert unstructured files (like PDFs, HTML, etc.) into structured data.
Language: Python - Size: 114 KB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

alexandreLamarre/Fission
Data analytics & Structured streaming optimized for the Edge
Language: Rust - Size: 31.3 KB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

mkirslis/Warship-Data
Generates a CSV file of warship data from Wikipedia.
Language: Python - Size: 155 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

ash-0521/Abandoned-Object-Detection-in-crowded-environment-using-MATLAB
Trained MATLAB models for 82% precision/80% recall, optimized with blob analysis for 25% performance boost. User-friendly alarm system with 500+ engaged users.
Size: 682 KB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

ttariqaziz/statistical_modeling_matlab
Highlights of my research work in MATLAB, statistical modeling of the unstructured raw data from GPS satellites for several years. Data modeling and processing, followed by different residual plots including trends and root mean square. In the end, the result was compared with independent data set models for validation purposes. The results were also presented at a European conference.
Size: 10.8 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

aws-samples/content-repository-with-multilingual-search
Code and walkthrough to build an end-to-end content repository for unstructured data with multilingual semantic search and dynamic access control.
Language: TypeScript - Size: 3.19 MB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

mohamad1014/SmartSearch
A repository exploring the use of LLMs for semantic search. The data considered are specific curated documents targeting closed-domain search. It was created to show how relatively simple it is to use these methods and increase productivity within an org.
Language: Python - Size: 3.91 KB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

malfusion/sentiment-keyword-extraction
Multi-Pipeline Keyword Extractor and Word Cloud Visualizer for Sentiment Analysis tasks
Language: Java - Size: 6.76 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

boomalope/ltb
Code for my working paper: The Winners and Losers of Rental Tribunals (February 14, 2022). Available at SSRN: https://ssrn.com/abstract=4029114
Language: HTML - Size: 69.7 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

adisorbo/NEON_tool
NEON mines rules for detecting natural language patterns in software informal documents. The inferred rules can be used for identifying and extracting relevant information embedded in unstructured texts.
Language: Java - Size: 68.8 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 5
