GitHub / deviknitkkr / Jemini
This project provides a REST API that lets users submit URLs for crawling. The app publishes each submitted URL to RabbitMQ, then consumes the queued messages and fetches each page's content with Jsoup. It also extracts links from the fetched pages and indexes their content with Apache Lucene.
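The flow described above (submit a URL, queue it, fetch it, extract links, index the text) can be sketched with the Java standard library alone. This is not the project's code: every name below is hypothetical, an in-memory `BlockingQueue` stands in for RabbitMQ, a regular expression stands in for Jsoup's link extraction, a `Map` of term to URLs stands in for the Lucene index, and a `Map` of canned pages stands in for the network.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the submit -> queue -> fetch -> index flow.
final class CrawlPipelineSketch {
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Assumption: stands in for the RabbitMQ queue of submitted URLs.
    private final BlockingQueue<String> urlQueue = new LinkedBlockingQueue<>();
    // Assumption: stands in for the Lucene index (term -> URLs containing it).
    private final Map<String, Set<String>> index = new HashMap<>();
    // Assumption: canned pages (URL -> HTML) instead of real HTTP fetches.
    private final Map<String, String> pages;
    // URLs already queued, so a page is never processed twice.
    private final Set<String> seen = new TreeSet<>();

    CrawlPipelineSketch(Map<String, String> pages) {
        this.pages = pages;
    }

    // What the REST endpoint would do: publish the URL if it is new.
    void submit(String url) {
        if (seen.add(url)) {
            urlQueue.add(url);
        }
    }

    // Crude stand-in for Jsoup: collect the values of href="..." attributes.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    // What the queue listener would do: fetch, index, and enqueue found links.
    void drain() {
        String url;
        while ((url = urlQueue.poll()) != null) {
            String html = pages.get(url);
            if (html == null) {
                continue; // unknown page: nothing to index
            }
            String text = TAG.matcher(html).replaceAll(" ").toLowerCase(Locale.ROOT);
            for (String token : text.split("[^a-z0-9]+")) {
                if (!token.isEmpty()) {
                    index.computeIfAbsent(token, k -> new TreeSet<>()).add(url);
                }
            }
            for (String link : extractLinks(html)) {
                submit(link);
            }
        }
    }

    // Look up the URLs whose text contained the given term.
    Set<String> search(String term) {
        return index.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
    }
}
```

To try it, build a map of two or three pages that link to each other, call `submit` with the first URL, then `drain`, then `search` for a word that appears on a linked page. In the real application the queue consumer would run asynchronously and the index would be persisted; the sketch processes everything in one synchronous loop to keep the data flow visible.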
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deviknitkkr%2FJemini
PURL: pkg:github/deviknitkkr/Jemini
Stars: 2
Forks: 0
Open issues: 0
License: apache-2.0
Language: Java
Size: 114 KB
Dependencies parsed at: Pending
Created at: about 2 years ago
Updated at: 3 months ago
Pushed at: 3 months ago
Last synced at: 3 months ago
Topics: apache-lucene, rabbitmq, search-engine, spring-boot, webcrawler