Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: stylometry

jon-chun/AI-LIT

AI-LIT: Using AI Embeddings to find what is Lost in Translation

Language: Python - Size: 7.83 MB - Last synced: about 5 hours ago - Pushed: 1 day ago - Stars: 0 - Forks: 0

tsilvs/anonymouth Fork of codeclimate-testing/anonymouth

Document Anonymization Tool based on stylometric techniques

Language: Java - Size: 499 MB - Last synced: 6 days ago - Pushed: 7 days ago - Stars: 0 - Forks: 0

fastdatascience/faststylometry

Stylometry library for Burrows' Delta method

Language: Jupyter Notebook - Size: 10.2 MB - Last synced: 2 days ago - Pushed: about 1 month ago - Stars: 25 - Forks: 6

PaschalisAg/seneca_stylometry

This repository contains the dataset and the code used to run the experiments for the case of the disputed authorship of two of the Senecan plays, Octavia and Hercules Oetaeus.

Language: Jupyter Notebook - Size: 140 MB - Last synced: 23 days ago - Pushed: 24 days ago - Stars: 0 - Forks: 0

SupervisedStylometry/SuperStyl

Supervised Stylometry

Language: Python - Size: 135 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 20 - Forks: 4

mullerpeter/authorstyle

Python package to deal with PAN corpora and extract stylometric features from text documents.

Language: Python - Size: 1.21 MB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 14 - Forks: 0

ecomp-shONgit/stylo-ah-online

Stylo ah online is an online tool for comparative text analysis in your browser. It implements a pipeline consisting of text (string) normalization, string decomposition (into tokens / features), counting and building a feature vector, measure computation (creating a distance matrix), and clustering.

Language: JavaScript - Size: 124 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 1 - Forks: 0

JasonKessler/scattertext

Beautiful visualizations of how language differs among document types.

Language: Python - Size: 40.5 MB - Last synced: about 2 months ago - Pushed: 3 months ago - Stars: 2,197 - Forks: 285

Nicolas-le/argumentRetrieval

This git repository documents the code base used in a custom argument retrieval system. The system was built as part of the Information Retrieval module at the University of Leipzig.

Language: Python - Size: 13 MB - Last synced: about 2 months ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0

dmitryvoinov/voicesofplato

Stylometry-driven research of Plato's dialogues

Size: 256 KB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

isabel-mm/stylo-r-novels

R+Python code for stylometric analysis on a corpus of Anglophone novels.

Language: Python - Size: 18.4 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 1 - Forks: 0

evllabs/JGAAP

The Java Graphical Authorship Attribution Program

Language: Java - Size: 272 MB - Last synced: 2 months ago - Pushed: 12 months ago - Stars: 256 - Forks: 72

top-on/llmask

A command-line tool for masking authorship of text, by changing the writing style with a Large Language Model.

Language: Python - Size: 88.9 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 2 - Forks: 0

Jur1cek/gcj-dataset

Collected solutions from Google Code Jam programming competition (2008-2020).

Size: 1000 MB - Last synced: 3 months ago - Pushed: 4 months ago - Stars: 57 - Forks: 9

goldmonkey21/doxer

Stylometric Data Mining Library with a focus on identifying Satoshi Nakamoto as a case study.

Language: Python - Size: 108 MB - Last synced: 4 months ago - Pushed: 5 months ago - Stars: 21 - Forks: 3

7PartidasDigital/AnalisisTextual

All ancillary and related material for the project on text analysis with R (Análisis de textos con R)

Language: R - Size: 11.6 MB - Last synced: 17 days ago - Pushed: 5 months ago - Stars: 5 - Forks: 0

rafayetrafi/BanglaMusicStylo-A-Stylometric-Dataset-of-Bangla-Music-Lyrics

With the rapid growth of the Bangla music industry, a huge volume of Bangla songs is produced every day. An immense number of producers, lyricists, singers and artists are involved in the production of songs from different genres. Among the many genres of Bangla music, classical, folk, baul, modern music, Rabindra Sangeet, Nazrul Geeti, film music, rock music and fusion music have gained the highest popularity. Lyricists try to express their feelings and views on a situation or subject through their writing, so each lyricist has their own dictionary of thoughts to put into music lyrics. In this paper, we present “BanglaMusicStylo”, the very first stylometric dataset of Bangla music lyrics. We have collected 2824 Bangla song lyrics by 211 lyricists in digital form. All the lyrics are stored in text format for further use. This dataset could be used for stylometric analysis such as authorship attribution, linguistic forensics, gender identification from textual data, Bangla music genre classification, vandalism detection, emotion classification, etc. Having identified the significant research opportunities in this area, we have formalized this dataset for use in stylometric analysis.

Size: 24.1 MB - Last synced: 6 months ago - Pushed: almost 5 years ago - Stars: 4 - Forks: 0

arian-askari/anonymous-comment

On Anonymous Commenting: A Greedy Approach to Balance Utilization and Anonymity for Instagram Users - Accepted at SIGIR 2019

Language: Python - Size: 22.5 KB - Last synced: 6 months ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

ML-D00M/navalny-NLP

NLP-driven stylometric analysis to investigate the authorship of Alexey Navalny's texts from jail

Language: Jupyter Notebook - Size: 30 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

ee-2/register

A toolkit for analyzing register, genre and style

Language: Python - Size: 104 KB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 0

dykang/PASTEL

Data and code for Kang et al., EMNLP 2019's paper titled "(Male, Bachelor) and (Female, Ph.D) have different connotations: Parallelly Annotated Stylistic Language Dataset with Multiple Personas"

Language: Python - Size: 54.5 MB - Last synced: 8 months ago - Pushed: about 4 years ago - Stars: 29 - Forks: 3

jmclawson/stylo2gg

Visualize and explore stylo data with ggplot2

Language: R - Size: 6.93 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 1 - Forks: 0

a-coles/SMS-Stylometry

A tool that predicts the dialect of English of an SMS message using recurrent neural networks supplemented with data from Google Trends.

Language: Python - Size: 25.3 MB - Last synced: 17 days ago - Pushed: over 6 years ago - Stars: 6 - Forks: 2

radiaw/stylometry

Personal Project on analysis of authorship of fantasy books

Language: HTML - Size: 29 MB - Last synced: 9 months ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0

czcorpus/QuitaUp

QuitaUp: A tool for quantitative stylometric analysis

Language: R - Size: 119 MB - Last synced: about 2 months ago - Pushed: about 2 years ago - Stars: 8 - Forks: 4

metasyn/stylometry-talk

some slides i put together for a short talk on stylometry at splunk

Language: HTML - Size: 4.42 MB - Last synced: 10 months ago - Pushed: almost 6 years ago - Stars: 0 - Forks: 0

versotym/stichometry

Stylometric analysis of poetic texts based on their versification

Language: Python - Size: 19.5 KB - Last synced: 10 months ago - Pushed: over 6 years ago - Stars: 3 - Forks: 0

gmikros/Stylo-Tutorial

This is a short introductory tutorial on the Stylo package for the R language

Language: R - Size: 4.07 MB - Last synced: 10 months ago - Pushed: over 5 years ago - Stars: 2 - Forks: 0

dimboump/nettt-2022

Code for Boumparis & Giannoutsos (2022) poster announcement at NeTTT Conference 2022.

Language: Jupyter Notebook - Size: 11.1 MB - Last synced: 10 months ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

kishordgupta/Authentication_by_Stylometry_NLP

This project aims to evaluate the efficiency of the stylometry approach as an authentication method. Stylometric analysis has been researched for author identification, and there has been significant progress in recognizing an author based on their written texts. In this project, we tried to detect differences between writing styles on the same topic provided by a set of users, and we test whether these differences are sufficient for use in an authentication system.

Language: C# - Size: 33.2 MB - Last synced: 10 months ago - Pushed: about 5 years ago - Stars: 0 - Forks: 0

ashenoy95/writeprints-static

Writeprints-Static Feature Set extraction for Adversarial Stylometry

Language: Python - Size: 8.79 KB - Last synced: 11 months ago - Pushed: about 6 years ago - Stars: 0 - Forks: 0

Jero2760/estilometria

Open corpus of Spanish-language works in txt format for stylometry studies

Size: 27 MB - Last synced: 10 months ago - Pushed: over 3 years ago - Stars: 2 - Forks: 1

ArtaXerxess/Stylometric-Analysis

Basic implementation for stylometric analysis using Principal component analysis

Language: Python - Size: 1.02 MB - Last synced: 11 months ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

Michaeljfang/PyGAAP Fork of DavidBerdik/PyGAAP

The Python Graphical Authorship Attribution Program — An experimental Python port of the Duquesne University Evaluating Variations in Language Lab's JGAAP.

Language: Python - Size: 8.79 MB - Last synced: almost 1 year ago - Pushed: almost 1 year ago - Stars: 4 - Forks: 2

rakshithShetty/A4NT-author-masking

Repository for author masking

Language: Python - Size: 156 KB - Last synced: over 1 year ago - Pushed: over 5 years ago - Stars: 9 - Forks: 1

tonyamart/smirnova_hoax

The code and data used in the article dedicated to the hoax "Various Poems by Anna Smirnova" (1837)

Language: Jupyter Notebook - Size: 17 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

Rohrym/Stylometric-Analysis-on-British-Political-Speeches

An exploratory research project focussing on extracting and analysing speeches from British political leaders, chief among them Winston Churchill.

Language: Jupyter Notebook - Size: 19.3 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 3 - Forks: 1

Jur1cek/source2vec 📦

Source code embeddings for various programming languages

Language: Jupyter Notebook - Size: 6.59 MB - Last synced: over 1 year ago - Pushed: almost 6 years ago - Stars: 11 - Forks: 2

procesaur/Parallel-doc-embeds

Comparison of classification power (literary authorship attribution case) of word-based, lemma-based, POS-based and mBERT-based document embeddings, as well as their combinations.

Language: Python - Size: 2.51 GB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 1 - Forks: 0

Darkar25/CSGAAP

C# implementation of evllabs's JGAAP

Language: C# - Size: 8.35 MB - Last synced: about 1 year ago - Pushed: about 1 year ago - Stars: 1 - Forks: 0

zd341/Author-Authentication

Authorship Attribution Comparing Deep Learning and Machine Learning Models and Methods.

Language: Jupyter Notebook - Size: 17.4 MB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 1

VictorIJnr/bu

I like the name bu, but I called this User Stylometry Association, or UStylA, in my paper. In short, this just clusters users based on their stylometry - how they write stuff. This ended up as my Senior Honours project at The University of St Andrews. I had more ambitious plans but I didn't have enough time for them. This isn't half bad either though.

Language: Python - Size: 174 MB - Last synced: over 1 year ago - Pushed: almost 5 years ago - Stars: 2 - Forks: 0

severinsimmler/shylo

A Shiny GUI for Stylo

Language: R - Size: 104 KB - Last synced: over 1 year ago - Pushed: over 5 years ago - Stars: 5 - Forks: 0

christofs/stylometry-bibliography

Bibtex copy of the Zotero bibliography on Stylometry

Language: TeX - Size: 979 KB - Last synced: about 1 year ago - Pushed: almost 7 years ago - Stars: 6 - Forks: 1

hennyu/style2666

Data from the computational stylistic analysis of the novel 2666 by Roberto Bolaño

Language: HTML - Size: 28.3 KB - Last synced: over 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

hentisch/FastDelta

A C++ library for fast, performant stylometric analysis

Language: C++ - Size: 49.8 KB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0

PaschalisAg/SAPS

Stylometric Analysis of Political Speeches (SAPS)

Language: Jupyter Notebook - Size: 8.01 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

sam0jones0/pyantistylometry

Project exploring the feasibility of an automated and extensible anti-stylometry tool written in Python.

Language: Python - Size: 28.3 KB - Last synced: over 1 year ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0

arojascastro/fabulasmitologicas

A collection of Golden Age poems in Spanish in TEI and plain text

Language: XSLT - Size: 2.7 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 2 - Forks: 0

ABC-DH/EnExDi2020

Materials for EnExDi2020 (Poitiers, February 10-14)

Language: JavaScript - Size: 212 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 2 - Forks: 3

tonyamart/elegies_dhn

Data and code used in the article "What is Russian Elegy? Computational Study of a Nineteenth-Century Poetic Genre"

Language: R - Size: 7.83 MB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 0 - Forks: 0

jamestiotio/genshin-of-the-wild

Genshin Impact vs The Legend of Zelda: Breath of the Wild

Language: Jupyter Notebook - Size: 432 MB - Last synced: about 1 month ago - Pushed: over 2 years ago - Stars: 1 - Forks: 1

jeffasante/authorship-attribution

Authorship attribution in tweeting.

Language: JavaScript - Size: 12.5 MB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0

rudrajit1729/Cool-ideas-python

Covers a wide range of industry-implemented topics. (Course on JOC by IIT Ropar via NPTEL)

Language: Python - Size: 7.33 MB - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 0 - Forks: 1

jtonra/moore

Corpora of Thomas Moore texts and results of stylometric analysis

Size: 2.87 MB - Last synced: 11 months ago - Pushed: over 4 years ago - Stars: 0 - Forks: 0

borh/TextStylometry.jl

Julia package for stylometric analysis

Language: Julia - Size: 36.1 KB - Last synced: about 1 year ago - Pushed: about 6 years ago - Stars: 0 - Forks: 0

christofs/zeta-dhd2018

Language: JavaScript - Size: 14.2 MB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

Related Keywords
stylometry 57, nlp 12, digital-humanities 8, authorship-attribution 8, r 6, machine-learning 6, natural-language-processing 5, poetry 4, privacy 4, stylo 3, stylometric-features 3, stylometric 3, visualization 3, text-classification 2, exploratory-data-analysis 2, churchill 2, literature 2, python 2, author-attribution 2, anonymity 2, word2vec 2, llm 2, sentiment 2, sentiment-analysis 2, privacy-tools 2, feature-extraction 2, python3 2, authorship-identification 2, spanish 1, novel 1, source-code-analysis 1, zotero 1, multilanguage 1, parallel 1, bibtex 1, shiny-gui 1, csharp 1, deep-learning 1, bibliography 1, feature-selection 1, clustering 1, convolutional-neural-networks 1, authorship 1, ai 1, statistics 1, versification 1, machine-translation 1, authentication 1, python-3 1, estilometria 1, goldenage 1, plain-text 1, siglo-de-oro 1, spanish-language 1, pca 1, principal-component-analysis 1, stylometric-analysis 1, a4nt 1, text-style-transfer 1, political-history 1, united-kingdom 1, fasttext 1, glove 1, source-code 1, british 1, zelda 1, zelda-botw 1, zelda-breath-of-the-wild 1, browser-automation 1, collatz-conjecture 1, game-development 1, gps-python 1, image-compression 1, image-processing 1, map-area-estimation 1, networkx 1, page-rank-algorithm 1, speech-recognition 1, speech-to-text 1, turtle 1, corpora 1, moore 1, julia 1, dhd2018 1, keyness 1, zeta 1, politics 1, anonymization 1, privacy-protection 1, stylometric-techniques 1, style 1, tei 1, tei-xml 1, cartography 1, commandline 1, digitaleditions 1, digitalhumanities 1, audio-analysis 1, botw 1, breath-of-the-wild 1