Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub / Sakshi-ai999 / PDFDataExtractorOCR
This project shows how to extract data from a PDF file and store it in text format.
JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sakshi-ai999%2FPDFDataExtractorOCR
Stars: 0
Forks: 0
Open Issues: 20
License: None
Language: Python
Repo Size: 33.2 KB
Dependencies: 143
Created: almost 4 years ago
Updated: almost 4 years ago
Last pushed: over 1 year ago
Last synced: about 1 year ago
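The JSON API URL listed above percent-encodes the slash in the repository's full name (`%2F`). A minimal sketch of building that URL and fetching the record with the Python standard library follows. The helper names `repository_url` and `fetch_repository` are hypothetical, and the shape of the returned JSON is not assumed here; inspect the response for the fields you need.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the JSON API URL shown on this page.
API_BASE = "https://repos.ecosyste.ms/api/v1"


def repository_url(host: str, full_name: str) -> str:
    """Build the repository endpoint URL (hypothetical helper).

    safe="" makes quote() encode the "/" in "owner/name" as %2F,
    matching the URL format shown on this page.
    """
    return (
        f"{API_BASE}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def fetch_repository(host: str, full_name: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the repository record (requires network access)."""
    with urlopen(repository_url(host, full_name), timeout=timeout) as response:
        return json.load(response)
```

For example, `repository_url("GitHub", "Sakshi-ai999/PDFDataExtractorOCR")` reproduces the JSON API URL given for this repository.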
Dependencies
requirements.txt
pypi
- Cython ==0.29.17
- Django ==3.0.7
- EbookLib ==0.17.1
- Flask ==1.1.2
- IMAPClient ==2.1.0
- Jinja2 ==2.11.2
- MarkupSafe ==1.1.1
- Pillow ==7.1.2
- PyMuPDF ==1.17.1
- PyPDF2 ==1.26.0
- PyYAML ==5.3.1
- Pygments ==2.6.1
- QtPy ==1.9.0
- Send2Trash ==1.5.0
- SpeechRecognition ==3.8.1
- Unidecode ==1.1.1
- Wand ==0.5.9
- Werkzeug ==1.0.1
- XlsxWriter ==1.2.9
- appdirs ==1.4.4
- argcomplete ==1.10.0
- asgiref ==3.2.7
- atomicwrites ==1.3.0
- attrs ==19.1.0
- backcall ==0.1.0
- beautifulsoup4 ==4.8.0
- bleach ==3.1.5
- certifi ==2020.4.5.1
- cffi ==1.14.0
- chardet ==3.0.4
- ci-info ==0.2.0
- click ==7.1.2
- colorama ==0.4.1
- coloredlogs ==14.0
- contextlib2 ==0.6.0.post1
- cycler ==0.10.0
- dateparser ==0.7.6
- decorator ==4.4.2
- defusedxml ==0.6.0
- distlib ==0.3.1
- distro ==1.5.0
- docker ==4.2.0
- docx2txt ==0.8
- entrypoints ==0.3
- etelemetry ==0.2.1
- extract-msg ==0.23.1
- filelock ==3.0.12
- humanfriendly ==8.2
- idna ==2.9
- image ==1.5.32
- img2pdf ==0.3.6
- importlib-metadata ==0.21
- invoice2data ==0.3.5
- ipykernel ==5.2.1
- ipython ==7.14.0
- ipython-genutils ==0.2.0
- ipywidgets ==7.5.1
- isodate ==0.6.0
- itsdangerous ==1.1.0
- jedi ==0.17.0
- jsonschema ==3.2.0
- jupyter ==1.0.0
- jupyter-client ==6.1.3
- jupyter-console ==6.1.0
- jupyter-core ==4.6.3
- kiwisolver ==1.2.0
- lxml ==4.5.0
- matplotlib ==3.2.1
- mistune ==0.8.4
- mod-wsgi ==4.7.1
- more-itertools ==7.2.0
- nbconvert ==5.6.1
- nbformat ==5.0.6
- networkx ==2.4
- nibabel ==3.1.0
- notebook ==6.0.3
- numpy ==1.18.3
- ocrmypdf ==10.1.0
- olefile ==0.46
- opencv-python ==4.2.0.34
- packaging ==19.1
- pandas ==1.0.5
- pandocfilters ==1.4.2
- parso ==0.7.0
- pdf2image ==1.13.1
- pdfminer.six ==20181108
- pdfplumber ==0.5.21
- pickleshare ==0.7.5
- pikepdf ==1.15.1
- pluggy ==0.13.1
- prometheus-client ==0.7.1
- prompt-toolkit ==3.0.5
- protobuf ==3.11.3
- prov ==1.5.3
- py ==1.8.0
- pycparser ==2.20
- pycryptodome ==3.9.7
- pydot ==1.4.1
- pydotplus ==2.0.2
- pyparsing ==2.4.2
- pypiwin32 ==223
- pyreadline ==2.1
- pyrsistent ==0.16.0
- pytesseract ==0.3.4
- pytest ==5.1.2
- python-dateutil ==2.8.1
- python-pptx ==0.6.18
- pytz ==2020.1
- pywin32 ==227
- pywinpty ==0.5.7
- pyxnat ==1.3
- pyzmq ==19.0.1
- qtconsole ==4.7.3
- rdflib ==5.0.0
- regex ==2020.6.8
- reportlab ==3.5.42
- requests ==2.23.0
- scipy ==1.4.1
- selenium ==3.141.0
- simplejson ==3.17.0
- six ==1.12.0
- sortedcontainers ==2.2.2
- soupsieve ==2.0.1
- sqlparse ==0.3.1
- tabula-py ==2.1.1
- terminado ==0.8.3
- testpath ==0.4.4
- textract ==1.6.3
- tika ==1.24
- tornado ==6.0.4
- tqdm ==4.46.1
- traitlets ==4.3.3
- tzlocal ==1.5.1
- unicodecsv ==0.14.1
- urllib3 ==1.25.3
- virtualenv ==20.0.25
- wcwidth ==0.1.7
- webencodings ==0.5.1
- websocket-client ==0.57.0
- widgetsnbextension ==3.5.1
- xlrd ==1.2.0
- xtarfile ==0.0.3
- zipp ==0.6.0
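Every entry above is pinned to an exact version with `==`. As an illustration only, the listing lines can be turned into a name-to-version mapping with a small parser. The function name `parse_pins` and the regular expression are assumptions made for this sketch; they handle the `- name ==version` layout shown on this page, not the full requirements file syntax.

```python
import re

# Matches lines such as "- Cython ==0.29.17" or "Cython==0.29.17".
# The leading "- " is the page's list marker, not part of requirements.txt.
PIN_PATTERN = re.compile(
    r"^\s*-?\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*(\S+)\s*$"
)


def parse_pins(lines):
    """Return {package_name: pinned_version} for lines with an exact pin.

    Lines that do not match (headings such as "pypi", blanks) are skipped.
    """
    pins = {}
    for line in lines:
        match = PIN_PATTERN.match(line)
        if match:
            name, version = match.groups()
            pins[name] = version
    return pins


if __name__ == "__main__":
    sample = ["pypi", "- PyMuPDF ==1.17.1", "- pytesseract ==0.3.4"]
    for name, version in parse_pins(sample).items():
        print(f"{name} is pinned to {version}")
```

A mapping like this makes it straightforward to look up which version of a given package, for instance `ocrmypdf` or `pdfplumber`, the repository pins.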