Abstract: Optical Character Recognition (OCR) can open up understudied historical
documents to computational analysis, but the accuracy of OCR software varies.
This article reports a benchmarking experiment comparing the performance of
Tesseract, Amazon Textract, and Google Document AI on images of English and
Arabic text. English-language book scans (n = 322) and Arabic-language article
scans (n = ...
Abstract: Parliamentary and legislative debate transcripts provide access to
information concerning the opinions, positions, and policy preferences of
elected politicians. They attract attention from researchers across a wide variety
of backgrounds, from political and social sciences to computer science. As a
result, the problem of computational sentiment and position-taking analysis has
been tackled ...
Bennett Kleinberg, Isabelle van der Vegt, Paul Gill
Abstract: The increased threat of right-wing extremist violence necessitates a
better understanding of online extremism. Radical message boards, small-scale
social media platforms, and other internet fringes have been reported to fuel
hatred. The current paper examines data from the right-wing forum Stormfront
between 2001 and 2015. We specifically aim to understand the development of user
activity ...