Abstract: Optical Character Recognition (OCR) can open up understudied historical documents to computational analysis, but the accuracy of OCR software varies. This article reports a benchmarking experiment comparing the performance of Tesseract, Amazon Textract, and Google Document AI on images of English and Arabic text. English-language book scans (n<...
Abstract: Parliamentary and legislative debate transcripts provide access to information concerning the opinions, positions, and policy preferences of elected politicians. They attract attention from researchers from a wide variety of backgrounds, from political and social sciences to computer science. As a result, the problem of computational sentiment and position-...
Bennett Kleinberg, Isabelle van der Vegt, Paul Gill
Abstract: The increased threat of right-wing extremist violence necessitates a better understanding of online extremism. Radical message boards, small-scale social media platforms, and other internet fringes have been reported to fuel hatred. The current paper examines data from the right-wing forum Stormfront between 2001 and 2015. We specifically aim to understand ...