Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub / marianna13 / pile_tokenizer
Downloads, extracts, and tokenizes data from the Pile (https://the-eye.eu/public/AI/pile).
JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marianna13%2Fpile_tokenizer
Stars: 1
Forks: 0
Open Issues: 0
License: MIT
Language: Python
Repo Size: 89.5 MB
Dependencies: 168
Created: almost 2 years ago
Updated: over 1 year ago
Last pushed: almost 2 years ago
Last synced: about 1 year ago
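The JSON API address listed above percent-encodes the slash in the repository's full name (`marianna13%2Fpile_tokenizer`). A minimal sketch of how such a URL could be assembled is shown here; the helper name `repository_api_url` and the `API_ROOT` constant are illustrative assumptions, not part of the service's documentation, and the path layout is taken only from the URL shown on this page.

```python
from urllib.parse import quote

# Base path inferred from the JSON API URL shown on this page (assumption).
API_ROOT = "https://repos.ecosyste.ms/api/v1"


def repository_api_url(host: str, full_name: str) -> str:
    """Build a repository lookup URL (hypothetical helper).

    safe="" makes quote() encode "/" as %2F, so "owner/repo"
    stays a single path segment.
    """
    encoded_host = quote(host, safe="")
    encoded_name = quote(full_name, safe="")
    return f"{API_ROOT}/hosts/{encoded_host}/repositories/{encoded_name}"


url = repository_api_url("GitHub", "marianna13/pile_tokenizer")
print(url)
```

The resulting string can be passed to any HTTP client; the shape of the JSON response is not assumed here.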
Dependencies
requirements.txt
pypi
- AppKit ==0.2.8
- ConfigParser ==5.2.0
- Cython ==0.29.30
- Foundation ==0.1.0a0.dev1
- HTMLParser ==0.0.2
- JPype1 ==1.4.0
- Jinja2 ==3.1.1
- MarkupSafe ==2.1.1
- Numeric ==24.2
- Pillow ==9.1.1
- PyQt4 ==4.11.4
- PyQt5 ==5.15.6
- QtPy ==2.1.0
- SQLAlchemy ==1.4.37
- Sphinx ==5.0.1
- accelerate ==0.9.0
- aiodns ==3.0.0
- apache_beam ==2.39.0
- apex ==0.9.10dev
- asynctest ==0.13.0
- azureml ==0.2.7
- beautifulsoup4 ==4.11.1
- boto3 ==1.22.12
- botocore ==1.25.12
- brotli ==1.0.9
- brotlicffi ==1.0.9.2
- brotlipy ==0.7.0
- cchardet ==2.1.7
- cftime ==1.6.0
- classy_vision ==0.6.0
- cloudpickle ==2.1.0
- codecarbon ==2.1.2
- comet_ml ==3.31.3
- cookiecutter ==2.1.1
- cryptography ==36.0.0
- ctypes_snappy ==1.03
- cycler ==0.11.0
- dall_e ==0.1
- diff ==0.6.1
- disco ==1.40.4
- distributed ==2022.6.0
- dl ==0.1.0
- docutils ==0.18.1
- elasticsearch ==8.2.2
- fairscale ==0.4.6
- fairseq ==0.12.1
- faiss ==1.5.3
- faiss_cpu ==1.7.2
- fastai ==2.6.3
- fastapi ==0.78.0
- fastparquet ==0.8.1
- flax ==0.5.1
- ftfy ==6.1.1
- fugashi ==1.1.2
- genapi ==0.0.8
- gluonnlp ==0.10.0
- gunicorn ==20.1.0
- h5py ==3.6.0
- haiku ==0.01
- hypothesis ==6.47.2
- idna_ssl ==1.1.0
- importlib_metadata ==4.11.3
- importlib_resources ==5.7.1
- ipadic ==1.0.0
- ipython ==8.4.0
- ipywidgets ==7.7.0
- isal ==0.11.1
- jax ==0.3.13
- jaxlib ==0.3.10
- jieba ==0.42.1
- jnius ==1.1.0
- kenlm ==0.0.0
- keyring ==23.6.0
- librosa ==0.9.1
- lockfile ==0.12.2
- lxml ==4.9.0
- lz4 ==4.0.1
- lzmaffi ==0.3.0
- matplotlib ==3.5.1
- mlflow ==1.26.1
- mock ==4.0.3
- mtrand ==0.1
- mxnet ==1.9.1
- mypy ==0.961
- nose ==1.3.7
- numarray ==1.5.1
- numexpr ==2.8.1
- odfpy ==1.4.1
- onnxruntime ==1.11.1
- openpyxl ==3.0.10
- optuna ==2.10.1
- ordereddict ==1.1
- paramiko ==2.11.0
- phonemizer ==3.2.1
- pickle5 ==0.0.12
- pox ==0.3.1
- protobuf ==4.21.1
- psutil ==5.9.0
- psycopg2 ==2.9.3
- py3nvml ==0.2.7
- pyOpenSSL ==22.0.0
- pycocotools ==2.0.4
- pyctcdecode ==0.3.0
- pydantic ==1.9.1
- pygit2 ==1.9.2
- pymysql ==1.0.2
- pytesseract ==0.3.9
- pytest ==7.1.2
- pythainlp ==3.0.8
- pytorch_lightning ==1.6.4
- pytorch_quantization ==0.0.1.dev5
- pyxlsb ==1.0.9
- railroad ==0.5.0
- rarfile ==4.0
- ray ==1.13.0
- requests_kerberos ==0.14.0
- rjieba ==0.1.11
- s3fs ==2022.3.0
- s3prl ==0.3.4
- sacremoses ==0.0.49
- scann ==1.2.7
- scikit_learn ==1.1.1
- scipy ==1.8.0
- sentencepiece ==0.1.96
- sets ==0.3.2
- setuptools_scm ==6.4.2
- sigopt ==8.4.0
- simplejson ==3.17.6
- slack_sdk ==3.17.1
- soundfile ==0.10.3.post1
- spacy ==3.3.1
- starlette ==0.20.3
- statsmodels ==0.13.2
- sympy ==1.10.1
- t5x ==0.0.0
- tables ==3.7.0
- tabulate ==0.8.9
- tensorboardX ==2.5.1
- tensorflow_hub ==0.12.0
- tensorflow_probability ==0.17.0
- tensorflow_text ==2.9.0
- tf2onnx ==1.11.1
- timeout_decorator ==0.5.0
- timm ==0.5.4
- tokio ==0.2.0
- toml ==0.10.2
- torch ==1.11.0
- torch_scatter ==2.0.9
- torchaudio ==0.11.0
- torchvision ==0.12.0
- tornado ==6.1
- traitlets ==5.1.1
- tzdata ==2022.1
- unicodedata2 ==14.0.0
- unidic ==1.1.0
- unidic_lite ==1.0.8
- uvicorn ==0.17.6
- uvloop ==0.16.0
- vissl ==0.1.6
- wandb ==0.12.14
- wget ==3.2
- xarray ==2022.3.0
- xlrd ==2.0.1
- xlsxwriter ==3.0.3
- xlwt ==1.3.0
- xmlrpclib ==1.0.1
- xxhash ==3.0.0
- zipp ==3.8.0
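
Each entry above follows the pattern `- name ==version`. As a rough sketch, assuming the listing has been saved as plain text lines in exactly that form, the pins could be collected into a dictionary like this; the function name `parse_pins` and the sample lines are illustrative only.

```python
import re

# Matches lines such as "- torch ==1.11.0" as they appear in this listing.
PIN = re.compile(r"^-\s*(?P<name>[A-Za-z0-9_.\-]+)\s*==\s*(?P<version>\S+)$")


def parse_pins(lines):
    """Return {package: version} for lines in "- name ==version" form.

    Lines that do not match (headings, widget text) are skipped.
    """
    pins = {}
    for line in lines:
        match = PIN.match(line.strip())
        if match:
            pins[match.group("name")] = match.group("version")
    return pins


# A few entries copied from the list above, plus one non-matching line.
sample = ["- torch ==1.11.0", "- torchaudio ==0.11.0", "- wget ==3.2", "pypi"]
pins = parse_pins(sample)
print(pins)
```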