Analysing community reaction to refugees through text analysis of social media data

Pages 492-534 | Published online: 16 Aug 2022
 

ABSTRACT

Understanding the social integration of refugees requires scholars and community leaders to understand the complex and varied political reaction of citizens to the prospect and reality of refugees entering their local communities. In this study, we apply the Structural Topic Model (STM) to characterise citizen-level discourse in comments posted in response to refugee-related news articles on Facebook in Lancaster, Pennsylvania, and Roanoke, Virginia, two cities with similar demographics and conservative partisanship, but sharply contrasting refugee-related policies and experiences. We find that, overall, commenters framed their arguments with an identity-based frame more often than economics, morality, security or legality frames, but that these tended to be blended in ways that obscure the basis in identity. We also find that comments within the discourse of the more refugee-experienced Lancaster community were more likely to involve substantive arguments than in Roanoke, more likely to use economics frames, less likely to use identity frames, less likely to involve incivility and less likely to feature a salient misinformation-influenced theme (refugees vs. homeless veterans). This suggests that host community discourse grows more substantive and positive as a function of hospitable refugee policies and refugee hosting experience, and we discuss how this research might be expanded beyond this pair of cases to evaluate this broader implication.

Data availability statement

The data that support the findings of this study are available on request from the corresponding author, CEK. The data are not publicly available due to concerns about the privacy of the Facebook commenters.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 Which community has the larger population depends on the level of aggregation. Within city boundaries, Lancaster has 60% of Roanoke's population; comparing Metropolitan Areas, as defined by the U.S. Census, Roanoke has 58% of Lancaster's population.

4 They label one variant, labour competition, as a ‘zombie theory’, persisting in the face of contrary evidence. See also Malhotra, Margalit, and Mo (Citation2013).

5 This appeared repeatedly in our data, e.g. ‘If you had a bag of M&Ms and you knew 1 was poisoned, how many would you eat’? The posited base rates of metaphorical poisoned M&Ms included one in a bag, six in a bag, at least one in a bowl, a handful in a bowl, 1 to 10% (of a bowl), and 20% (of a dish). It mutated into other forms as well, e.g. ‘You're handing out food to homeless people. There is a basket of apples…one apple is poisonous and is likely to kill them. Knowing this, do you still give them the basket’?

6 Many reply comments, and some others, contained the full name of a prior speaker on the same thread. Both for privacy reasons, and to avoid the model using these words to form topics, we removed those that remained after our other cuts, replacing them with the token _name.
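The name-masking step in note 6 can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function `mask_names`, the idea of passing in a list of known speaker names, and the placeholder names are all assumptions; only the replacement token _name comes from the note.

```python
import re

def mask_names(comment, names, token="_name"):
    """Replace each known full name in a comment with a placeholder token.

    `names` is a hypothetical list of speaker names collected from the thread.
    """
    if not names:
        return comment
    # Try longer names first so a long name is not partially matched by a shorter one.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(n) for n in ordered) + r")\b",
        flags=re.IGNORECASE,
    )
    return pattern.sub(token, comment)

# Placeholder names, not taken from the data.
masked = mask_names(
    "Jane Doe, that is not what John Q. Public said.",
    ["Jane Doe", "John Q. Public"],
)
print(masked)
```

Masking before tokenisation keeps a name from being split into separate word tokens that a topic model could then pick up.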

7 Note that a summary topic label is not strictly necessary. Many applications of topic models simply use the topic number, letting the keywords implicitly act as an automated label.

8 shows the guidance we used in assigning the topics to classes.

9 The Fightin' Words technique has been applied in other studies to a wide range of data sources ranging from history textbooks (discussions of different ethnic groups) (Lucy et al. Citation2020) to restaurant menus (cheap vs. expensive) (Jurafsky Citation2014). Applications to Facebook specifically include comparing posts by supporters of the Britain First movement and the UK Independence Party (Davidson and Berezin Citation2018) and comparing posts of those who self-disclose medical information with those who don't (Valizadeh et al. Citation2021).
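For readers unfamiliar with the technique, the sketch below shows one common formulation of Fightin' Words: a log-odds ratio between two groups, smoothed by a Dirichlet prior and divided by its approximate standard error. The notes do not say which prior or variant was used in this study, so the symmetric prior of 0.01, the function name `fightin_words_z`, and the toy word counts are all assumptions made for illustration.

```python
import math
from collections import Counter

def fightin_words_z(counts_a, counts_b, prior=0.01):
    """z-scored log-odds ratios with a symmetric Dirichlet prior (assumed here).

    Positive scores mark words over-represented in group A, negative in group B.
    """
    vocab = set(counts_a) | set(counts_b)
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    a0 = prior * len(vocab)  # total prior mass across the vocabulary
    scores = {}
    for w in vocab:
        y_a = counts_a.get(w, 0) + prior
        y_b = counts_b.get(w, 0) + prior
        log_odds_a = math.log(y_a / (n_a + a0 - y_a))
        log_odds_b = math.log(y_b / (n_b + a0 - y_b))
        variance = 1.0 / y_a + 1.0 / y_b  # approximate variance of the difference
        scores[w] = (log_odds_a - log_odds_b) / math.sqrt(variance)
    return scores

# Toy counts, invented for the example.
counts_a = Counter("cheap fries cheap burger".split())
counts_b = Counter("truffle burger truffle foie".split())
scores = fightin_words_z(counts_a, counts_b)
for word, z in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word:8s} {z:+.2f}")
```

Dividing by the standard error keeps rare words from dominating the ranking purely because their raw log-odds are extreme.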

10 These numbers give different impressions of the prevalence of topics in the corpus because of higher-order differences, e.g. skew, in the distributions of topic proportions. For example, Topic 21, which captures quotations of the Emma Lazarus poem appearing on the Statue of Liberty, is highly skewed: comments that quote the poem often contain little else, while most comments do not mention it. Conversely, the more general ‘Moral duty’ or ‘Empathy and compassion’ themes are spread more evenly across documents. The extreme case of the latter is the ‘Residual’ topic, which has an average proportion across the corpus of 1.6%, but is the largest topic in only 2 comments (0.04%).
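The two prevalence summaries contrasted in note 10 can be computed side by side, as in the sketch below. The document-topic matrix is invented for illustration (it is not the study's data), and the function name `prevalence_summaries` is hypothetical.

```python
def prevalence_summaries(doc_topic):
    """Return (mean proportion per topic, share of documents where each topic is largest)."""
    n_docs = len(doc_topic)
    n_topics = len(doc_topic[0])
    mean_prop = [sum(row[k] for row in doc_topic) / n_docs for k in range(n_topics)]
    top_counts = [0] * n_topics
    for row in doc_topic:
        top_counts[row.index(max(row))] += 1  # first index wins on ties
    top_share = [c / n_docs for c in top_counts]
    return mean_prop, top_share

# Made-up topic proportions for 4 comments over 3 topics.
doc_topic = [
    [0.90, 0.05, 0.05],  # topic 0 is concentrated in a single comment
    [0.05, 0.60, 0.35],
    [0.05, 0.55, 0.40],
    [0.00, 0.60, 0.40],  # topic 2 is spread evenly but is never the largest
]
mean_prop, top_share = prevalence_summaries(doc_topic)
print(mean_prop, top_share)
```

In this toy matrix the evenly spread topic has a higher mean proportion than the skewed one, yet it is the largest topic in no comment, which is the same pattern the note describes for the ‘Residual’ topic.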

11 So, if one were to count all comments, and count them equally regardless of length, the estimated topic proportions in the depicted model understate the proportion of nonsubstantive comments, including uncivil comments, and overstate the proportion of substantive comments.

12 The high-loading words street, mission, water, rescue are due to this topic capturing a large cluster of comments encouraging people to volunteer at Lancaster's Water Street Rescue Mission.

13 The model also keys off the high-loading word law to include a small number of comments on the largely unrelated identity-based subject of Sharia law.

18 For example, ‘American citizens (and our nation's security) must come first. When we have absolutely no homeless citizens – especially our veterans – and when we have absolutely no one going hungry or without clothes then, and only then, should we consider taking in any refugees. Period. MAGA!!!’ We also note that the top 20 words co-appearing with veterans in our data include own, country, people, americans, first.

19 By this same logic, the least effective construction would be something like ‘rich terrorists’, combining power with undeservedness. We in fact see these notions applied separately in the ‘poisoned candy’ meme and the faked ‘luxury hotel’ headlines.

20 In each example these are words used at least ten times in the data that we judged to evoke the concept of focus with relatively high precision. The estimate for any one word is not affected by the inclusion or exclusion of any other word in the set.

21 At least one – white – is used more often to label another commenter's implicit identity group, and some, e.g. christian, are used both in this way and for self-identification.

22 The four most common words that remain after our filtering to ‘substantive’ lemmas are people, have, refugee, and country.

23 For a discussion of the challenges that come with private companies' power to provide (and withdraw) data for computational social science, see Drouhot et al. (Citation2023).
