Abstract
Twitter is increasingly used as a source of data in the social sciences. However, deriving the demographic characteristics of users and dealing with the non-random, non-representative populations from which they are drawn remain challenges for social scientists. This paper has two objectives. First, it compares different methods for estimating demographic information from Twitter data, based on the crowd-sourcing platform CrowdFlower and the image-recognition software Face++. Second, it proposes a method for calibrating the non-representative sample of Twitter users with auxiliary information from official statistics, thereby allowing findings based on Twitter to be generalized to the general population.
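The abstract does not say which calibration procedure the paper uses. As a rough illustration of the general idea only, the sketch below applies simple post-stratification, one common way of reweighting a non-representative sample to known population margins. The age groups, sample counts, population shares, and function names are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def poststratification_weights(sample_cells, population_shares):
    """Return one weight per sampled unit.

    Each weight is the population share of the unit's cell divided by
    that cell's share in the sample, so weighted cell shares match the
    population shares. Assumes every sampled cell appears in
    population_shares.
    """
    counts = Counter(sample_cells)
    n = len(sample_cells)
    return [population_shares[cell] / (counts[cell] / n) for cell in sample_cells]


def weighted_share(cell, sample_cells, weights):
    """Weighted share of the sample that falls into the given cell."""
    total = sum(weights)
    in_cell = sum(w for w, c in zip(weights, sample_cells) if c == cell)
    return in_cell / total


# Hypothetical sample of 10 users, skewed towards the youngest age group.
sample_cells = ["18-34"] * 6 + ["35-54"] * 3 + ["55+"] * 1

# Hypothetical population shares, standing in for official statistics.
population_shares = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

weights = poststratification_weights(sample_cells, population_shares)
```

After weighting, `weighted_share("55+", sample_cells, weights)` equals the assumed population share for that group rather than its raw share in the sample. Post-stratification corrects only for the variables used to define the cells; it is shown here as one possible technique, not as the method proposed in the paper.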
Original language | English |
---|---|
Title of host publication | SIS 2017. Statistics and Data Science: new challenges, new generations |
Editors | A. Petrucci, R. Verde |
Place of Publication | Florence, IT |
Publisher | Firenze University Press |
Pages | 1025-1031 |
Number of pages | 7 |
ISBN (Print) | 9788864535210 |
Publication status | Published - 2017 |
Research programs
- ESSB SOC