Topic: "text-as-data"
JasonKessler/scattertext
Beautiful visualizations of how language differs among document types.
Language: Python - Size: 39.4 MB - Last synced at: 11 days ago - Pushed at: about 1 month ago - Stars: 2,302 - Forks: 292

MilaNLProc/contextualized-topic-models
A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
Language: Python - Size: 32 MB - Last synced at: 25 days ago - Pushed at: 4 months ago - Stars: 1,228 - Forks: 152

jboynyc/textnets
Text analysis with networks.
Language: Python - Size: 2.92 MB - Last synced at: about 6 hours ago - Pushed at: about 2 months ago - Stars: 285 - Forks: 25

ryanjgallagher/shifterator
Interpretable data visualizations for understanding how texts differ at the word level
Language: Python - Size: 40.1 MB - Last synced at: 19 days ago - Pushed at: 4 months ago - Stars: 275 - Forks: 29

JasonKessler/Scattertext-PyData
Notebooks for the Seattle PyData 2017 talk on Scattertext
Language: HTML - Size: 20.7 MB - Last synced at: about 1 month ago - Pushed at: over 7 years ago - Stars: 142 - Forks: 53

chkla/CSS-Events
Summer/winter schools, workshops, and conferences in computational social science 🫂
Size: 225 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 40 - Forks: 2

umanlp/SemScale (fork of codogogo/topfish)
A tool for Semantic Scaling of Political Text (branch of Topfish, a suite of tools for Political Text Analysis)
Language: Python - Size: 17.5 MB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 26 - Forks: 4

chkla/Populism-Text-Analysis
Literature 📄 and datasets 📚 on automatic populism detection
Size: 268 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 16 - Forks: 0

fedenanni/Computational-Text-Analysis-2018-19
2018 Computational Text Analysis Notebooks, University of Mannheim
Language: Jupyter Notebook - Size: 25.3 MB - Last synced at: 27 days ago - Pushed at: over 6 years ago - Stars: 13 - Forks: 7

cjerzak/LinkOrgs-software
LinkOrgs: An R package for linking records on organizations using half a billion open-collaborated records from LinkedIn
Language: HTML - Size: 90.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 12 - Forks: 1

wesslen/summer2017-socialmedia
Summer 2017 Social Media Analytics Workshop Series
Language: HTML - Size: 22.5 MB - Last synced at: about 2 months ago - Pushed at: about 7 years ago - Stars: 11 - Forks: 3

davidycliao/bisCrawler
An automated web crawler for extracting central bankers' speeches
Language: Python - Size: 59.7 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 10 - Forks: 2

tweedmann/3x8emotions
Code and models for 3 different tools to measure appeals to 8 discrete emotions in German political text
Language: Jupyter Notebook - Size: 3.12 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 10 - Forks: 0

thieled/dictvectoR
'dictvectoR' measures the similarity between a concept dictionary and documents, using fastText word vectors. Implements the "Distributed-Dictionary-Representation" (Garten et al. 2018) method in R.
Language: R - Size: 5.6 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 2

WZBSocialScienceCenter/tm_corona
A small showcase for topic modeling with the tmtoolkit Python package. I use a corpus of articles from the German online news website Spiegel Online (SPON) to create a topic model for before and during the COVID-19 pandemic.
Language: Jupyter Notebook - Size: 51.5 MB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 0

aflueckiger/KED2022
The ABC of Computational Text Analysis. BA Seminar, Spring 2022, University of Lucerne
Language: HTML - Size: 187 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

adamlauretig/gensim_in_R
Code for estimating word embeddings with gensim in R.
Size: 225 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 1

marcosfanton/stm_filobr
Using structural topic modeling to analyze theses and dissertations from graduate programs in philosophy in Brazil.
Language: R - Size: 461 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

jfjelstul/regular-expressions-tutorial
A tutorial on using regular expressions in R
Size: 1.27 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

CT-P/portuguese_open_data
Empirical framework applied to parliament discourses and Twitter data, with a Discourse Polarization Index.
Language: Jupyter Notebook - Size: 17.3 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

bgonzalezbustamante/TextClass-Benchmark
TextClass Benchmark Leaderboards
Language: Jupyter Notebook - Size: 147 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

varvarailyina/mds_thesis
All code and results for my MDS thesis at the Hertie School
Language: Jupyter Notebook - Size: 72.3 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

ichalkiad/datadescriptor_uselections2020
Code for collecting and cleaning speeches (text) of the US 2020 election campaign. Corresponding publication: "A text dataset of campaign speeches of the main tickets in the 2020 US presidential election", by Ioannis Chalkiadakis, Louise Anglès d’Auriac, Gareth W. Peters, and Divina Frau-Meigs
Language: Python - Size: 38.3 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Jszabo16/EU-sentiments_NRSR
Replication script for mining sentiments towards the EU from Parliamentary Speeches in the National Council of the Slovak Republic (1994-2023)
Language: R - Size: 93.8 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

marek-chadim/Empirical-Economics
Coding and Machine Learning for Economists PhD course
Language: HTML - Size: 326 MB - Last synced at: 14 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

Sam-Gartenstein/Machine-Learning-for-the-Social-Sciences
Material from my Machine Learning for the Social Sciences course
Language: Jupyter Notebook - Size: 1.75 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

alexgatsby/Code-Samples---Alexsandra-Cavalcanti
A little sample of my recent work as a data analyst.
Language: HTML - Size: 11 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

BenjaminFReese/american_constitutional_praxis
This repository uses text-as-data methods alongside traditional primary source reading to analyze early American state constitutions. The R scripts create a function to scrape and clean the constitutional text, run sentiment analysis, calculate tf-idf, and perform LDA. This is a work-in-progress.
Language: HTML - Size: 2.78 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

aflueckiger/KED2021
The ABC of Computational Text Analysis. BA Seminar, Spring 2021, University of Lucerne
Language: HTML - Size: 241 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0
