Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub / gurtejrehal / FALCON---AI-Data-Crawler

Falcon Search was created to aid the National Crime Records Bureau, which needs an efficient AI data crawler that collects classified data from the web based on given keywords. It is a SaaS web data integration (WDI) platform that converts unstructured web data into a structured format by extracting, preparing and integrating crime-related web data for use by criminal investigation agencies. After the target website URL is specified, the web data extraction module provides a visual environment for designing automated workflows that harvest and transform data. These workflows go beyond HTML/XML parsing of static content: they automate end-user interactions to yield data that would otherwise not be immediately visible. Once the data is extracted, the software provides full data preparation capabilities for harmonizing and cleansing it. Falcon offers several options for consuming the results. It has its own visualization and dashboarding module to help criminal investigators gain the insights they need. It also provides APIs that offer full access to everything that can be done on the platform, allowing web data to be integrated directly. FALCON is capable of crawling ten million links and scraping one million links per month using Celery workers. It has the potential to exceed these numbers if tested on standard cloud platforms.

JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gurtejrehal%2FFALCON---AI-Data-Crawler
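
The JSON API URL above percent-encodes the slash between owner and repository name as %2F. A minimal Python sketch of building and fetching such a URL is given below, assuming the endpoint answers an unauthenticated GET with a JSON object. The helper names `repository_url` and `fetch_repository` are hypothetical and are not part of any Ecosyste.ms client library.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# API root taken from the JSON API URL shown on this page.
API_ROOT = "https://repos.ecosyste.ms/api/v1"


def repository_url(host: str, full_name: str) -> str:
    """Build the JSON API URL for one repository (hypothetical helper)."""
    # safe="" makes quote() encode the "/" in "owner/repo" as %2F,
    # matching the URL format shown above.
    return (
        f"{API_ROOT}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def fetch_repository(host: str, full_name: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the repository metadata (assumes a JSON object reply)."""
    with urlopen(repository_url(host, full_name), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    print(repository_url("GitHub", "gurtejrehal/FALCON---AI-Data-Crawler"))
```

Calling `fetch_repository("GitHub", "gurtejrehal/FALCON---AI-Data-Crawler")` would perform the network request. The field names in the response are not listed on this page, so inspect the returned dictionary rather than assuming specific keys.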

Stars: 33
Forks: 7
Open Issues: 13

License: None
Language: JavaScript
Repo Size: 91.6 MB
Dependencies: 79

Created: almost 4 years ago
Updated: 8 months ago
Last pushed: over 1 year ago
Last synced: 8 months ago
