An open API service providing repository metadata for many open source software ecosystems.

Topic: "text-as-data"

JasonKessler/scattertext

Beautiful visualizations of how language differs among document types.

Language: Python - Size: 39.4 MB - Last synced at: 11 days ago - Pushed at: about 1 month ago - Stars: 2,302 - Forks: 292

MilaNLProc/contextualized-topic-models

A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).

Language: Python - Size: 32 MB - Last synced at: 25 days ago - Pushed at: 4 months ago - Stars: 1,228 - Forks: 152

jboynyc/textnets

Text analysis with networks.

Language: Python - Size: 2.92 MB - Last synced at: about 6 hours ago - Pushed at: about 2 months ago - Stars: 285 - Forks: 25

ryanjgallagher/shifterator

Interpretable data visualizations for understanding how texts differ at the word level

Language: Python - Size: 40.1 MB - Last synced at: 19 days ago - Pushed at: 4 months ago - Stars: 275 - Forks: 29

JasonKessler/Scattertext-PyData

Notebooks for the Seattle PyData 2017 talk on Scattertext

Language: HTML - Size: 20.7 MB - Last synced at: about 1 month ago - Pushed at: over 7 years ago - Stars: 142 - Forks: 53

chkla/CSS-Events

Summer/winter schools, workshops, and conferences in computational social science 🫂

Size: 225 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 40 - Forks: 2

umanlp/SemScale (fork of codogogo/topfish)

A tool for Semantic Scaling of Political Text (branch of Topfish, a suite of tools for Political Text Analysis)

Language: Python - Size: 17.5 MB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 26 - Forks: 4

chkla/Populism-Text-Analysis

Literature 📄 and datasets 📚 on automatic populism detection

Size: 268 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 16 - Forks: 0

fedenanni/Computational-Text-Analysis-2018-19

2018 Computational Text Analysis Notebooks, University of Mannheim

Language: Jupyter Notebook - Size: 25.3 MB - Last synced at: 27 days ago - Pushed at: over 6 years ago - Stars: 13 - Forks: 7

cjerzak/LinkOrgs-software

LinkOrgs: An R package for linking records on organizations using half a billion open-collaborated records from LinkedIn

Language: HTML - Size: 90.6 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 12 - Forks: 1

wesslen/summer2017-socialmedia

Summer 2017 Social Media Analytics Workshop Series

Language: HTML - Size: 22.5 MB - Last synced at: about 2 months ago - Pushed at: about 7 years ago - Stars: 11 - Forks: 3

davidycliao/bisCrawler

An Automation Webcrawler for Extracting Central Bankers' Speeches

Language: Python - Size: 59.7 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 10 - Forks: 2

tweedmann/3x8emotions

Code and models for 3 different tools to measure appeals to 8 discrete emotions in German political text

Language: Jupyter Notebook - Size: 3.12 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 10 - Forks: 0

thieled/dictvectoR

'dictvectoR' measures the similarity between a concept dictionary and documents, using fastText word vectors. Implements the "Distributed-Dictionary-Representation" (Garten et al. 2018) method in R.

Language: R - Size: 5.6 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 9 - Forks: 2

WZBSocialScienceCenter/tm_corona

A small showcase for topic modeling with the tmtoolkit Python package. I use a corpus of articles from the German online news website Spiegel Online (SPON) to create a topic model for before and during the COVID-19 pandemic.

Language: Jupyter Notebook - Size: 51.5 MB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 4 - Forks: 0

aflueckiger/KED2022

The ABC of Computational Text Analysis. BA Seminar, Spring 2022, University of Lucerne

Language: HTML - Size: 187 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

adamlauretig/gensim_in_R

Code for estimating word embeddings with gensim in R.

Size: 225 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 1

marcosfanton/stm_filobr

Uses structural topic modeling to analyze theses and dissertations from graduate programs in philosophy in Brazil.

Language: R - Size: 461 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

jfjelstul/regular-expressions-tutorial

A tutorial on using regular expressions in R

Size: 1.27 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

CT-P/portuguese_open_data

Empirical framework applied to parliament discourses and Twitter data, with a Discourse Polarization Index.

Language: Jupyter Notebook - Size: 17.3 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

bgonzalezbustamante/TextClass-Benchmark

TextClass Benchmark Leaderboards

Language: Jupyter Notebook - Size: 147 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

varvarailyina/mds_thesis

All code and results for my MDS thesis at the Hertie School

Language: Jupyter Notebook - Size: 72.3 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

ichalkiad/datadescriptor_uselections2020

Code for collecting and cleaning speeches (text) of the US 2020 election campaign. Corresponding publication: "A text dataset of campaign speeches of the main tickets in the 2020 US presidential election", by Ioannis Chalkiadakis, Louise Anglès d’Auriac, Gareth W. Peters, and Divina Frau-Meigs

Language: Python - Size: 38.3 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Jszabo16/EU-sentiments_NRSR

Replication script for mining sentiments towards the EU from Parliamentary Speeches in the National Council of the Slovak Republic (1994-2023)

Language: R - Size: 93.8 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

marek-chadim/Empirical-Economics

Coding and Machine Learning for Economists PhD course

Language: HTML - Size: 326 MB - Last synced at: 14 days ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

Sam-Gartenstein/Machine-Learning-for-the-Social-Sciences

Material from my Machine Learning for the Social Sciences course

Language: Jupyter Notebook - Size: 1.75 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

alexgatsby/Code-Samples---Alexsandra-Cavalcanti

A little sample of my recent work as a data analyst.

Language: HTML - Size: 11 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

BenjaminFReese/american_constitutional_praxis

This repository uses text-as-data methods alongside traditional primary source reading to analyze early American state constitutions. The R scripts create a function to scrape and clean the constitutional text, run sentiment analysis, calculate tf-idf, and perform LDA. This is a work-in-progress.

Language: HTML - Size: 2.78 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

aflueckiger/KED2021

The ABC of Computational Text Analysis. BA Seminar, Spring 2021, University of Lucerne

Language: HTML - Size: 241 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Related Topics
computational-social-science (10), text-analysis (7), natural-language-processing (6), nlp (5), r (4), political-science (4), topic-modeling (4), visualization (3), machine-learning (3), sociology (3), sentiment-analysis (3), political-communication (2), emotions (2), word2vec (2), text-visualization (2), word-embeddings (2), covid-19 (2), social-science (2), python (2), scraping (2), text-mining (2), webscraping (2), teaching (2), stylometric (1), sentiment (1), stylometry (1), semiotic-squares (1), scatter-plot (1), llms (1), deepseek (1), japanese-language (1), nlp-machine-learning (1), topic-coherence (1), transformer (1), data-visualization (1), digital-humanities (1), information-theory (1), dictionary (1), ideology (1), scaling (1), word-representations (1), word-vectors (1), network-analysis (1), facebook-api (1), geospatial (1), twitter-api (1), d3 (1), eda (1), exploratory-data-analysis (1), toxicity-classification (1), zero-shot-classification (1), rhetoric (1), us-election-2020 (1), us-elections (1), conferences (1), events (1), summer-schools (1), winter-schools (1), workshops (1), community-detection (1), equinox (1), jax (1), organizational-units (1), record-linkage (1), transformer-architecture (1), elo-rating (1), gpt-4 (1), gpt-4o (1), leaderboards (1), llama (1), llm (1), llms-benchmarking (1), misinformation (1), mistral (1), nous-hermes (1), ollama (1), openai (1), perspective-api (1), qwen2-5 (1), text-classification (1), toxicity (1), nlp-library (1), corona (1), stm (1), pos-graduacao (1), philosophy (1), lda (1), educacao (1), capes (1), unsupervised-machine-learning (1), supervised-machine-learning (1), neural-networks (1), gensim (1), text-scaling (1), wordfish (1), python-functions-examples (1), python-functions (1), logistic-regression (1), impeachment-of-brazilian-president (1), geobr (1)