An open API service providing repository metadata for many open source software ecosystems.

Topic: "summarization-dataset"

HHousen/TransformerSum

Models to perform neural summarization (extractive and abstractive) using machine learning transformers and a tool to convert abstractive summarization datasets to the extractive task.

Language: Python - Size: 11.7 MB - Last synced at: 5 months ago - Pushed at: almost 2 years ago - Stars: 427 - Forks: 58

csebuetnlp/xl-sum

This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.

Language: Python - Size: 5.41 MB - Last synced at: 9 months ago - Pushed at: about 1 year ago - Stars: 249 - Forks: 42

IlyaGusev/gazeta

Gazeta: Dataset for automatic summarization of Russian news

Language: Python - Size: 76.2 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 27 - Forks: 1

rajdeep345/ECTSum

Dataset and code for our EMNLP 2022 Main Conference long paper titled "ECTSum: A New Benchmark Dataset For Bullet Point Summarization of Long Earnings Call Transcripts"

Language: Python - Size: 21.2 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 26 - Forks: 11

ziegler-ingo/CRAFT

Code, datasets, and checkpoints for the paper "CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation"

Language: Python - Size: 1.55 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 15 - Forks: 5

amazon-science/abstractive-factual-tradeoff

Code and data for the Dreyer et al. (2023) paper on abstractiveness and factuality in abstractive summarization

Language: Python - Size: 1.98 MB - Last synced at: 18 days ago - Pushed at: almost 2 years ago - Stars: 11 - Forks: 0

dennlinger/klexikon

Klexikon: A German Dataset for Joint Summarization and Simplification

Language: Python - Size: 58.8 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 11 - Forks: 0

griff4692/calibrating-summaries

This is the official PyTorch codebase for the ACL 2023 paper: "What are the Desired Characteristics of Calibration Sets? Identifying Correlates on Long Form Scientific Summarization".

Language: Python - Size: 10.1 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 0

MohanKrishnaGR/Infosys_Text-Summarization

This repository contains the implementation of a Transformer-based model for abstractive text summarization and a rule-based approach for extractive text summarization.

Language: Jupyter Notebook - Size: 15.9 MB - Last synced at: 17 days ago - Pushed at: 9 months ago - Stars: 8 - Forks: 2

nakhunchumpolsathien/ThaiCrossSum_Corpora

Thai Crosslingual Summarization Datasets.

Language: Jupyter Notebook - Size: 3.1 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 0

tafseer-nayeem/BengaliSummarization

Code and dataset for our work "Unsupervised Abstractive Summarization of Bengali Text Documents", accepted at EACL 2021.

Language: Python - Size: 2.63 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 6 - Forks: 6

tafseer-nayeem/NeuFuse

[Computer Speech & Language, Elsevier] - Neural Sentence Fusion for Diversity Driven Abstractive Multi-Document Summarization.

Language: Python - Size: 14.6 KB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 1

zenquiorra/M3LS

M3LS: Multi-lingual Multi-modal summarization dataset

Language: Python - Size: 30.3 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

trongtuyen99/ViWikiSum

Vietnamese multi-document summarization dataset

Size: 37.3 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

BaseMax/DeepSummarizationNLP

Summarizing text with a deep learning NLP model.

Language: Python - Size: 27.3 KB - Last synced at: about 14 hours ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

DevKhizerer/T5_Summarizer

Fine-tuning T5-Small on the BBC article summarization dataset.

Language: Jupyter Notebook - Size: 29.3 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

plandes/cnndmdb

CNN/DailyMail Dataset as SQLite

Language: Python - Size: 281 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

giganttheo/tib-dataset

Dataset for abstractive summarization of long multimodal presentations

Size: 1.95 KB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

theQuert/COVID-Tweets-Summ

COVID-19: Analyzing Tweets for Extractive Multi-Document Summarization on News

Language: HTML - Size: 39.7 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Related Topics
summarization (9)
abstractive-summarization (5), machine-learning (5)
text-summarization (4)
nlp (3), deep-learning (3), dataset (3), abstractive-text-summarization (3)
transformer-models (2), summarization-algorithm (2), automatic-summarization (2), fine-tuning (2), low-resource-languages (2), text-summarisation (2), summarization-corpora (2), abstractive-summarization-dataset (2)
reinforcement-learning (1), scientific-machine-learning (1), generative-ai (1), pytorch-nlp (1), model-calibration (1), metrics (1), factuality-checking (1), factuality (1), abstractive (1), sentence-fusion (1), neural-sentence-fusion (1), multi-document-summarization (1), multimodal-deep-learning (1), roberta (1), bart (1), sqlite3 (1), task-specific (1), synthetic-dataset-generation (1), synthetic-data (1), question-answering (1), question-answer-generation (1), large-language-models (1), instruction-tuning (1), data-augmentation (1), corpus-data (1), financial-data (1), benchmarking (1), multi-modal (1), multi-lingual (1), large-scale-dataset (1), huggingface-transformers (1), huggingface-datasets (1), pytorch-lightning (1), social-network (1), social-media-analysis (1), social-media (1), covid-19 (1), russian-language (1), vietnamese-nlp (1), datasets (1), crawling-python (1), bengali-summarization-dataset (1), bengali-summarization (1), bengali-nlp (1), bengali-abstractive-summarization (1), text-simplification (1), python3 (1), german-language (1), german (1), crosslingual-summarization (1), extractive-summarization (1), distilbert (1), bert (1), albert (1), python (1), neural-network (1), deep-neural-networks (1), text-summarization-model (1), text-summarization-dataset (1), multilinguality (1), multilingual-text-summarization (1), multilingual-summarization (1), multilingual (1), low-resource-text-summarizarion (1), low-resource-summarization (1), tweets-extraction (1), tweets (1)