Measurement error in social network data: A re-classification

Dan J. Wang
Xiaolin Shi
Daniel A. McFarland
Jure Leskovec
Social Networks
2012

Abstract

Research on measurement error in network data has typically focused on missing data. We embed missing data, which we term false negative nodes and edges, in a broader classification of error scenarios that also includes false positive nodes and edges and falsely aggregated and disaggregated nodes. We simulate these six measurement errors using an online social network and a publication citation network, reporting their effects on four node-level measures: degree centrality, clustering coefficient, network constraint, and eigenvector centrality. Our results suggest that in networks with more positively skewed degree distributions and higher average clustering, these measures tend to be less resistant to most forms of measurement error. In addition, we argue that the sensitivity of a given measure to an error scenario depends on the idiosyncrasies of the measure’s calculation, thus revising the general claim from past research that the more ‘global’ a measure, the less resistant it is to measurement error. Finally, we anchor our discussion to commonly used networks in past research that suffer from these different forms of measurement error and make recommendations for correction strategies.
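As a rough illustration of the simulations the abstract describes, the sketch below perturbs a small toy graph with two of the six error scenarios (false negative edges and false positive edges) and recomputes two of the four measures (degree centrality and clustering coefficient). It is not the authors' procedure. The toy graph, the 25% error rate, the helper names, and the mean absolute change used as a sensitivity summary are all assumptions chosen for brevity.

```python
import random
from itertools import combinations

def make_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def edge_list(adj):
    """Return each undirected edge once, as a sorted list of (u, v) with u < v."""
    return sorted((u, v) for u in adj for v in adj[u] if u < v)

def degree_centrality(adj):
    """Degree divided by the maximum possible degree, n - 1."""
    n = len(adj)
    return {u: len(nb) / (n - 1) for u, nb in adj.items()}

def clustering(adj):
    """Local clustering: share of a node's neighbor pairs that are linked."""
    out = {}
    for u, nb in adj.items():
        k = len(nb)
        if k < 2:
            out[u] = 0.0  # convention: undefined cases are set to 0
            continue
        links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        out[u] = 2 * links / (k * (k - 1))
    return out

def drop_edges(adj, frac, rng):
    """False negative edges: delete a random fraction of true edges."""
    edges = edge_list(adj)
    removed = set(rng.sample(edges, int(round(frac * len(edges)))))
    noisy = {u: set() for u in adj}  # every node is kept
    for u, v in edges:
        if (u, v) not in removed:
            noisy[u].add(v)
            noisy[v].add(u)
    return noisy

def add_edges(adj, frac, rng):
    """False positive edges: add random ties that do not exist in the true graph."""
    non_edges = [(u, v) for u, v in combinations(sorted(adj), 2)
                 if v not in adj[u]]
    k = min(len(non_edges), int(round(frac * len(edge_list(adj)))))
    noisy = {u: set(nb) for u, nb in adj.items()}
    for u, v in rng.sample(non_edges, k):
        noisy[u].add(v)
        noisy[v].add(u)
    return noisy

def mean_abs_change(true, noisy):
    """Average absolute difference in a node-level measure (illustrative summary)."""
    return sum(abs(true[u] - noisy[u]) for u in true) / len(true)

# Hypothetical toy network: two triangles joined by a bridge, plus one pendant node.
g = make_graph([(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5), (5, 6)])
rng = random.Random(0)

for name, perturb in [("false negative edges", drop_edges),
                      ("false positive edges", add_edges)]:
    noisy = perturb(g, 0.25, rng)
    print(name,
          "| degree:", round(mean_abs_change(degree_centrality(g),
                                             degree_centrality(noisy)), 3),
          "| clustering:", round(mean_abs_change(clustering(g),
                                                 clustering(noisy)), 3))
```

On a graph this small the numbers say little. The point is the workflow: compute a measure on the "true" network, apply one error scenario at a chosen rate, recompute, and compare. A fuller version would repeat each perturbation many times and cover the node-level scenarios as well.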

Affiliation: 
CSS
IRiSS