An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: document-processing

Daniel-codi/Concept_Curve_Embeddings_Indexation

Code that gives any AI unlimited-context persistent memory. The example is software that lets any AI read the Uniform Commercial Code of Michigan, a document of 220,000 tokens.

Language: JavaScript - Size: 20.3 MB - Last synced at: about 15 hours ago - Pushed at: about 16 hours ago - Stars: 0 - Forks: 0

jmanhype/DSPy-Multi-Document-Agents

An advanced distributed knowledge fabric for intelligent document processing, featuring multi-document agents, optimized query handling, and semantic understanding.

Language: Python - Size: 135 KB - Last synced at: 1 day ago - Pushed at: 9 months ago - Stars: 29 - Forks: 2

felixdittrich92/docling-OCR-OnnxTR

OnnxTR OCR plugin for Docling

Language: Python - Size: 1.47 MB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 4 - Forks: 0

ucbepic/TWIX

TWIX is an open-source data extraction tool that reconstructs structured data from documents at scale, accurately and at low cost, by inferring the shared underlying visual template across documents

Language: Python - Size: 177 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 171 - Forks: 7

ucbepic/docetl

A system for agentic LLM-powered data processing and ETL

Language: Python - Size: 66.5 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1,937 - Forks: 185

kevv1m/tikara

The metadata and text content extractor for almost every file type.

Size: 1000 Bytes - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

awslabs/project-lakechain

⚡ Cloud-native, AI-powered document processing pipelines on AWS.

Language: TypeScript - Size: 177 MB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 177 - Forks: 26

mancrurod/Resume-Optimization

Personal project that automates resume adaptation using LLMs. Converts .docx resumes to Markdown, tailors them to job descriptions with GPT-4o-mini or Gemini, and exports clean HTML and PDF resumes — with built-in editing and logging features.

Language: Python - Size: 71.3 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2 - Forks: 0

abdullahshafiq-20/ResumeTex

ResumeTex is an AI-powered tool that converts standard PDF resumes into professionally formatted LaTeX documents. This service helps you create elegant, structured resumes without needing to learn LaTeX syntax.

Language: JavaScript - Size: 560 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 19 - Forks: 1

diegoabeltran16/OpenPages-pipeline

Open-source tool for turning technical documents into AI-ready formats. Built for better access to knowledge.

Language: Python - Size: 1.78 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

AshrafulAlamShaqib/pdf-page-counter

Offline web app to count pages in PDF files using PDF.js

Language: JavaScript - Size: 0 Bytes - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 0 - Forks: 0

AhmedZeyadTareq/Smart-markdown-Extractor

A smart AI-powered application to extract, reorganize, and interact with file content, converting it into clean Markdown format using OpenAI and Streamlit.

Language: Python - Size: 5.86 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

awslabs/rhubarb

A Python framework for multi-modal document understanding with Amazon Bedrock

Language: Python - Size: 31.7 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 82 - Forks: 6

iamarunbrahma/pdf-to-markdown

Conversion of PDF documents to structured Markdown, optimized for Retrieval Augmented Generation (RAG) and other NLP tasks. Extract text, tables, and images with preserved formatting for enhanced information retrieval and processing.

Language: Python - Size: 69.3 KB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 74 - Forks: 7

0x22B9/ai-telegram-bot

AI Telegram bot using Gemini for chat, audio, and docs, with HuggingFace image gen. Deploy on Fly.io. Try it now!

Language: Python - Size: 233 KB - Last synced at: 13 days ago - Pushed at: 13 days ago - Stars: 0 - Forks: 0

formkiq/formkiq-core

A full-featured Document Management Platform / Document Layer for your application, providing storage, discovery, processing, and retrieval. Deploys directly into your Amazon Web Services Cloud. Please 🌟 star to support our work!

Language: Java - Size: 20.1 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 128 - Forks: 18

enoch3712/ExtractThinker

ExtractThinker is a Document Intelligence library for LLMs, offering ORM-style interaction for flexible and powerful document workflows.

Language: Python - Size: 20.3 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 1,205 - Forks: 118

credeed/credeed-pdf-to-markdown

Convert PDF to Markdown using AI so that agents can understand documents.

Size: 0 Bytes - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

CentralFloridaAttorney/zmongo_retriever

Use data from MongoDB in LangChain, Llama and OpenAI

Language: Python - Size: 27.3 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 4 - Forks: 1

aws-samples/sample-document-processing-with-amazon-bedrock-data-automation

This repository contains examples for customers to get started using Amazon Bedrock Data Automation. The samples focus mainly on document processing use cases

Language: Jupyter Notebook - Size: 9.09 MB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 5 - Forks: 2

eklem/stopword-trainer

A module for creating stopword lists for any language, based on a set of documents.

Language: JavaScript - Size: 6.16 MB - Last synced at: 6 days ago - Pushed at: 8 months ago - Stars: 15 - Forks: 0

gs-ai/PDFProfessor

PDF Professor 2.0 extracts and processes PDF text, analyzed by Ollama for summarization, data extraction, and insights. More coming soon!

Language: Python - Size: 1.95 MB - Last synced at: 24 days ago - Pushed at: 24 days ago - Stars: 1 - Forks: 0

AmadeusITGroup/docs2vecs

CLI that helps with docs splitting, embedding and exposing them in a seamless manner

Language: Python - Size: 1.51 MB - Last synced at: 25 days ago - Pushed at: 26 days ago - Stars: 3 - Forks: 5

Node0/timbermill

OCR-powered chat session renderer that slices long conversations into paginated, searchable PDFs

Size: 3.91 KB - Last synced at: 5 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

swiss-ai-center/layout-analysis-service

Layout Analysis Service detects parts of an image-based document using PP-PicoDet.

Language: Python - Size: 9.99 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

aws-samples/sample-for-multi-modal-document-to-json-with-sagemaker-ai

This open-source project delivers a complete pipeline for converting multi-page documents (PDFs/images) into structured JSON using Vision LLMs on Amazon SageMaker. The solution leverages the SWIFT Framework to fine-tune models specifically for document understanding tasks.

Language: Jupyter Notebook - Size: 3.18 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 3 - Forks: 0

aws-solutions/enhanced-document-understanding-on-aws

Enhanced Document Understanding on AWS delivers an easy-to-use web application that ingests and analyzes documents, extracts content, identifies and redacts sensitive customer information, and creates search indexes from the analyzed data.

Language: JavaScript - Size: 61.7 MB - Last synced at: 27 days ago - Pushed at: about 1 month ago - Stars: 37 - Forks: 14

jromero132/pdf-splitter

PDF Splitter is a Python tool that takes a multi-page PDF file and splits it into individual PDF files, one for each page of the original document.

Language: Python - Size: 2.93 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

souvik03-136/TenderBot

Task

Language: Python - Size: 127 MB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

jromero132/pdf-merger

A Python utility for merging multiple PDFs and images into a single PDF file. This tool maintains aspect ratios, centers content on custom-sized pages (default A4), and supports recursive directory processing. Perfect for organizing documents and creating cohesive PDF compilations.

Language: Python - Size: 2.93 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

easytocloud/Mac-letterhead

A macOS utility for merging letterhead templates with PDF and Markdown documents using a drag-and-drop interface

Language: Python - Size: 3.2 MB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

QuiddityAI/PDFerret

An all-in-one converter to make your files LLM-understandable

Language: HTML - Size: 32.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

kili-technology/awesome-datasets

A comprehensive list of annotated training datasets classified by use case.

Size: 24.9 MB - Last synced at: 6 days ago - Pushed at: almost 3 years ago - Stars: 33 - Forks: 6

JDM-Github/debahra-efficio

DEHBARA (Efficio) is a React and Express-based web application designed to streamline service requests for DTI, SSS, and other document processing needs. It simplifies the process of requesting official papers and services, integrating cloud storage for efficient data management.

Language: TypeScript - Size: 13.3 MB - Last synced at: 27 days ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

parsee-ai/parsee-core

Retrieval of fully structured data made easy. Use LLMs or custom models. Specialized in PDFs and HTML files. Extensive support for tabular data extraction and multimodal queries.

Language: Python - Size: 1.24 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 66 - Forks: 1

FayazK/Document-Metadata-Extractor

A Python tool that uses Google's Gemini AI to automatically extract structured metadata from PDF and DOCX documents, saving results to Excel for easy analysis and organizing raw responses as JSON files.

Language: Python - Size: 11.7 KB - Last synced at: about 1 month ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

aswinpradeepc/llmsearch

AI-powered search tool for querying financial reports, mutual fund documents, and market research using natural language. Built with FastAPI, Streamlit, OpenAI embeddings, and Pinecone vector search.

Language: Python - Size: 17.6 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

jcaperella29/Document_cleaning_CLI

A deep learning-based pipeline for cleaning scanned document images. Automatically removes noise, enhances text clarity, and optimizes images for OCR. 🚀

Language: MATLAB - Size: 94.5 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1 - Forks: 0

unix-ami/Invertify

Invertify is a tool for inverting the colors of PDF files, perfect for creating dark mode versions of documents.

Language: Python - Size: 9.77 KB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

adhikaritusharAAA/Document_cleaning_CLI

A deep learning-based pipeline for cleaning scanned document images. Automatically removes noise, enhances text clarity, and optimizes images for OCR. 🚀

Language: Python - Size: 0 Bytes - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

Swiftgum/swiftgum

The user data connection layer for AI applications. Transform any source into LLM-ready markdown. Focus on your AI, not integrations.

Language: TypeScript - Size: 3.05 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 4 - Forks: 0

KrzysztofTybinka/DocMiner

RAG API with an OCR feature and the option to use local embeddings and language models for secure, offline document processing and intelligent retrieval.

Language: C# - Size: 547 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Md-Emon-Hasan/LangChain

Powerful framework for building applications with Large Language Models (LLMs), enabling seamless integration with memory, agents, and external data sources.

Language: Jupyter Notebook - Size: 737 KB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

abdur75648/urdu-text-detection

Text line detection for Urdu OCR (UTRNet)

Language: Python - Size: 48.5 MB - Last synced at: 12 days ago - Pushed at: 7 months ago - Stars: 6 - Forks: 1

dhlab-epfl/dhSegment

Generic framework for historical document processing

Language: Python - Size: 5.89 MB - Last synced at: about 2 months ago - Pushed at: almost 4 years ago - Stars: 374 - Forks: 115

baughmann/tikara

The metadata and text content extractor for almost every file type.

Language: Python - Size: 161 MB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

kallebysantos/ocrlot

A distributed OCR engine 🐆

Language: Elixir - Size: 291 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

Jayanth-MKV/advanced-rag-cookbooks

Advanced RAG Techniques and Projects

Language: HTML - Size: 1.71 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

adibshirazi/PDFMerger

PDF Merger Tool

Language: TypeScript - Size: 13.7 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

acsenrafilho/cucaracha

A bureaucratic cockroach (cucaracha) assistant to help with document processing and analysis

Language: Python - Size: 6.44 MB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 1

oeo/processor-rs

High-performance document processing pipeline in Rust. Extracts text, performs OCR, and optimizes images from PDFs and other document formats with parallel processing and memory efficiency.

Language: Rust - Size: 42 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

qlfv/Docling-Testing

Repository for testing and demonstrating the capabilities of Docling for document conversion.

Language: HTML - Size: 18.4 MB - Last synced at: 24 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 2

Huang-lab/figure-extractor

Flask-based service using PDFFigures 2.0 to extract figures and tables from scholarly PDFs. Features REST API, CLI, Docker support, and JSON metadata output (~1.5s/page processing). Designed for document processing and RAG pipelines.

Language: Python - Size: 16.8 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

drgsn/filefusion

FileFusion is a powerful file concatenation tool designed specifically for Large Language Model (LLM)

Language: Go - Size: 173 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 6 - Forks: 0

LF3551/AutoDocMark

AutoDocMark: Streamline Document-to-Markdown Workflows

Language: Python - Size: 112 KB - Last synced at: 5 days ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

steindani/pandoc-include

An include filter for Pandoc

Language: Haskell - Size: 9.77 KB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 62 - Forks: 20

BjornMelin/pdfusion

A lightweight Python utility for effortlessly merging multiple PDF files into a single document.

Language: Python - Size: 40 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

jayllfpt/table2html

A Python package that converts table images into HTML format using an object detection model and OCR.

Language: Python - Size: 365 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

terilios/file-upload-embeddings

Enterprise-grade document intelligence platform leveraging vector embeddings and LLMs for advanced document processing, semantic search, and information retrieval.

Language: Python - Size: 173 KB - Last synced at: about 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

MBAigner/PDFSegmenter

This library builds a graph-representation of the content of PDFs. The graph is then clustered, resulting page segments are classified and returned. Tables are retrieved formatted as a CSV.

Language: Python - Size: 399 KB - Last synced at: 15 days ago - Pushed at: over 4 years ago - Stars: 22 - Forks: 3

maemresen/mae-ghostscript

mae-ghostscript is a Docker-based tool for compressing PDF files efficiently using Ghostscript. This containerized solution simplifies the process of PDF compression, providing a consistent environment that works across different platforms. Users can run the container by mounting their local directories and specifying the PDF to compress.

Language: Shell - Size: 12.7 KB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

towfique-elahe/pdf-to-structured-csv

A Python-based tool for extracting structured data from PDFs using OCR and regex, and exporting it to CSV. Ideal for processing invoices, logs, or scanned documents into organized, usable datasets.

Language: Jupyter Notebook - Size: 27.3 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

digiparser/digiparser-website

DigiParser | Extract data from documents and emails

Language: TypeScript - Size: 88.1 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

cburschka/lyx

Unofficial mirror of git://git.lyx.org/lyx.git (updates daily. not affiliated with lyx.org.)

Language: C++ - Size: 616 MB - Last synced at: 6 days ago - Pushed at: about 2 years ago - Stars: 36 - Forks: 7

deBUGger404/RAG-Powered-GPT-4-Chatbot

🚀 Revolutionize your data interaction with a cutting-edge chatbot built on Retrieval-Augmented Generation (RAG) and OpenAI’s GPT-4. Upload documents, create custom knowledge bases, and get precise, contextual answers. Ideal for research, business operations, customer support, and more!

Language: HTML - Size: 23.4 KB - Last synced at: about 1 month ago - Pushed at: 8 months ago - Stars: 0 - Forks: 1

Shahrom-S/BarsAI

AI assistant

Language: Python - Size: 11.2 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

kaypro283/document-merger-analyzer

Automate merging of DOC, DOCX, and PDF files with word frequency analysis. Streamlines document consolidation for large-scale projects.

Language: Python - Size: 7.81 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

pratheeshkumar99/Document-based-Question-Answering-System

This project demonstrates a Retrieval-Augmented Generation (RAG) system for question answering. It integrates OpenAI’s GPT-4 model with FAISS for vector similarity search, enabling the system to provide accurate and contextually relevant answers based on a given document or dataset.

Language: Python - Size: 13.7 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

caltechlibrary/popstar

Phone-Oriented Processing SofTware for ARchives

Language: Makefile - Size: 49.2 MB - Last synced at: 29 days ago - Pushed at: 10 months ago - Stars: 2 - Forks: 0

swiss-ai-center/document-vectorizer-service

Service to vectorize documents into a FAISS vectorstore.

Language: Python - Size: 557 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

SDpDas/Document_annotate_tool

Adds an annotation to each element in a document, identifying what it is.

Language: Python - Size: 292 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

ArtemZarubin/XmlDocumentProcessor

XmlDocumentProcessor: A .NET component for XML document processing. It analyzes XML content, performs keyword-based queries, and transforms data into HTML. Emphasizes design patterns like Strategy pattern, with a focus on class diagramming. Implements penalty for non-compliance.

Language: C# - Size: 19.5 KB - Last synced at: 2 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

dayang4321/MSc-Team-Project-CMPU9010-2023-24-Group-3

TU Dublin Computer Science MSc. Final Project Group 3 - Accessibilator

Language: Jupyter Notebook - Size: 100 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

Jackojc/old-wotpp 📦

A document preprocessor that works in conjunction with tools like groff/troff & refer.

Language: C++ - Size: 60.5 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

rina-reimer/uwb-hacks-ai-local

AI-powered chatbot designed to simplify the job search process

Language: TypeScript - Size: 443 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

thoth2357/Watermark-removal

Program that helps remove watermarks from a PDF document

Language: Python - Size: 3.91 KB - Last synced at: 3 months ago - Pushed at: about 5 years ago - Stars: 1 - Forks: 0

afrozas/proceedings

Semantic extraction from conference proceedings.

Language: Python - Size: 1.06 MB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 31 - Forks: 1

johnsirmon/clearcouncil

ClearCouncil: Automated tools for collecting, organizing, and embedding publicly available local state county council documents (minutes, agendas) into LLMs. Python, JS, and wget scripts included for easy data retrieval and integration.

Language: Python - Size: 71.3 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

cemonal/Pdf2xNet

Pdf2xNet is a .NET library for seamless integration with Xpdf tools, enabling easy conversion of PDF documents to text, images, and HTML formats within your .NET applications.

Language: C# - Size: 11.9 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

greed2411/tokyo

tokyo is a REST API that, given any type of document 📄, identifies the MIME type 🧐, suggests an extension 🦔, and also extracts text 💪.

Language: Clojure - Size: 19.5 KB - Last synced at: 4 days ago - Pushed at: almost 5 years ago - Stars: 18 - Forks: 0

m4nd0mb3/document-templater

Document Templater is a powerful tool for automated document generation. Streamline the process of creating standard documents, such as contracts, reports, and forms, using predefined templates. This repository contains the source code for Document Templater, allowing you to easily integrate this functionality into your projects and automate docs.

Language: JavaScript - Size: 579 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 0

Oneirocom/generative-intent-detection

Generative intent detection with Magick

Language: TypeScript - Size: 42 KB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

fonckchain/pdf-text-converter

Python tool for converting PDF files to text. Simplify your document processing tasks.

Language: Python - Size: 1000 Bytes - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

x1ao4/doc-merger

Merge two relatively incomplete documents into one complete document via a Python script

Language: Python - Size: 22.5 KB - Last synced at: 3 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

jackvaughan09/phil (fork of hudnash/phil)

Minimize the time requirement of audit report analysis with a containerized file conversion and scraping system

Language: Jupyter Notebook - Size: 106 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

anne27/Information-Retrieval

An implementation of basic IR techniques from scratch.

Language: Python - Size: 27.8 MB - Last synced at: 11 months ago - Pushed at: almost 6 years ago - Stars: 1 - Forks: 0

NinjaRocks/Data2Xml

Data2Xml is a .NET 6.0 library that maps data to XML using a list of XPath expressions. Supports data sets from APIs and SQL.

Language: C# - Size: 58.6 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

joseferrerh/invoices-leanautomation

This set of robots automatically obtains information from invoices using the docDigitizer API and keeps track of the processed invoices in an Airtable repository.

Language: RobotFramework - Size: 403 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

RPetitpierre/Generic_Semantic_Segmentation_of_Historical_Maps

Language: Jupyter Notebook - Size: 94.4 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

jeanbaptisteb/doccleaner

A Python command-line utility intended for automating some copyediting tasks in documents. It allows editing zipped, XML-based files (e.g. docx, odt, or epub), through XSLT stylesheets. Can be rather easily extended with your own custom xsl stylesheets.

Language: XSLT - Size: 81.1 KB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 6 - Forks: 2

zyrolasting/dynamic-xml

Apply keyword procedures in a given Racket namespace using X-expressions.

Language: Racket - Size: 5.86 KB - Last synced at: 7 days ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

trehman65/backtoschool

School/College Stationery List OCR and Parsing

Language: C++ - Size: 13.7 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0
