Opinion polarization by learning from social feedback


ABSTRACT

We explore a new mechanism to explain polarization phenomena in opinion dynamics in which agents evaluate alternative views on the basis of the social feedback obtained on expressing them. High support of the favored opinion in the social environment is treated as a positive feedback which reinforces the value associated to this opinion. In connected networks of sufficiently high modularity, different groups of agents can form strong convictions of competing opinions. Linking the social feedback process to standard equilibrium concepts we analytically characterize sufficient conditions for the stability of bi-polarization. While previous models have emphasized the polarization effects of deliberative argument-based communication, our model highlights an affective experience-based route to polarization, without assumptions about negative influence or bounded confidence.


1. Introduction

The public discourse around political polarization has regained momentum in the recent past. The fundamental divide between Democrats and Republicans in the United States and the rise of nationalistic voices in the European sphere reflect fundamental differences in attitudes and ideas regarding the ≫right≪ direction to go. These developments have engaged the interest of investigators in social network science, not least because polarization is a hard problem to model. With this paper we aim to identify mechanisms and conditions for the emergence of polarization. Polarization refers either to a distribution of opinions with multiple local maxima or to the process by which such strong divergences of opinions that divide a population come about (Bramson et al., Citation2016; DiMaggio, Evans, & Bryson, Citation1996). The challenge has been to develop an explanation of polarization with mathematical or computational models of opinion dynamics (Abelson, Citation1964; Chatterjee & Seneta, Citation1977; Deffuant, Neau, Amblard, & Weisbuch, Citation2000; DeGroot, Citation1974; French, Citation1956; Friedkin & Johnsen, Citation1990; Hegselmann & Krause et al., Citation2002). We propose a very basic reinforcement learning mechanism that leads to the emergence and persistence of stable patterns of bi-polarization even if the interaction network that encodes the patterns of social influence is strongly connected. With this mechanism, the paper provides a parsimonious answer to Abelson’s old question of ≫what on earth one must assume in order to generate the bimodal outcome of community cleavage studies≪ (Abelson, Citation1964, p. 153).
In contrast to some previous models of opinion bi-polarization, our approach does not rely on negative social influence (Baldassarri & Bearman, Citation2007; Flache & Macy, Citation2011; Macy, Kitts, Flache, & Benard, Citation2003; Mark, Citation2003) or on notions of opinion homophily and bounded confidence (Axelrod, Citation1997; Deffuant et al., Citation2000; Duggins, Citation2017; Hegselmann & Krause et al., Citation2002; Mäs & Flache, Citation2013), but on a simple reinforcement mechanism that has not yet been explored in the context of opinion and polarization dynamics.

The main idea is that individuals express their opinion about an issue and are sensitive to approval and disapproval by their peers (Homans, Citation1974). Agreement leads to a positive experience which strengthens attachment to the expressed opinion. Conversely, disagreement is assumed to be related to a negative sentiment and decreases attachment to the current opinion. In a wide range of interaction networks with sufficiently high modularity this reinforcement leads to the formation of different opinion clusters in which agents become collectively more and more committed to their cluster’s opinion. The process we propose can be seen as an abstraction from recent models of polarization that rely on ideas from argument persuasion (Dandekar, Goel, & Lee, Citation2013; Mäs & Flache, Citation2013). However, our model comes with another connotation as the opinion change process is not assumed to involve argument processing but more elementary responses that are mediated by the positive (negative) experience that agreement (disagreement) brings about.

Although this mechanism is derived as a plausible heuristic motivated by psychological research on implicit processes of attitude change (Fazio, Citation2001; Fazio, Eiser, & Shook, Citation2004), its formalization is highly compatible with a subjective and procedural notion of rationality (Goldthorpe, Citation1998; Simon, Citation1978). It gives rise to a reward-driven reinforcement learning scheme that is psychologically plausible yet minimal. In the course of the process, agents internalize the expected opinion in their neighborhood and learn to associate values to the different opinion expressions that converge to the payoffs in the corresponding ≫opinion game≪. This means that classical equilibrium concepts can be applied to characterize the stable macroscopic outcomes of the opinion formation process. This bridge from a rather basic process of opinion formation to a setting where game-theoretic tools become applicable is one of the main contributions of the present paper.

While early models of social influence (Abelson, Citation1964; Chatterjee & Seneta, Citation1977; DeGroot, Citation1974; French, Citation1956) as well as more recent applications of this paradigm (Friedkin, Citation1999; Friedkin & Johnsen, Citation1990; Friedkin, Proskurnikov, Tempo, & Parsegov, Citation2016) allow for analytical treatment, analytical solutions are difficult to obtain if non-linearities such as bounded confidence (Deffuant et al., Citation2000; Hegselmann & Krause, Citation2002) or continuous forms of opinion-dependent interaction weights (Duggins, Citation2017; Mäs & Flache, Citation2013) are introduced (cf. Hegselmann & Krause et al., Citation2002; Flache & Macy, Citation2011). Consequently, new approaches to the modeling of opinion bi-polarization that generally come with non-linear extensions to produce the desired behavior rely on agent-based simulations to explore the model behavior. In this paper, we also use simulations to illustrate that opinion bi-polarization is possible—indeed likely—by the social feedback account. One main objective of the computational experiments, however, is to explore and validate the connection from a plausible social feedback mechanism to game-theoretic notions of equilibrium through the use of reinforcement learning. Once this connection is established, the concept of cohesion as used in social network analysis (Wasserman & Faust, Citation1994) and adopted in the theory of games on networks (Jackson & Zenou, Citation2014; Morris, Citation2000) provides a precise structural condition for the stability of bi-polarization in heterogeneous networks.

The incorporation of ideas from reinforcement learning in the context of opinion dynamics bears great potential as a new modeling paradigm. It differs fundamentally from most previous modeling approaches in conceiving the articulation of an opinion as a communication act that reflects but does not directly correspond to an agent’s attitudinal evaluation of an issue. While, in this paper, the decision of what opinion to express is based on an agent’s conviction that its opinion is approved by peers, the framework is general enough to incorporate situational factors as well as strategic considerations that may be involved in opinion statements. It assumes that agents express their opinions in their social environment and that peers may respond to this communication in different ways. These responses provide feedback to the sender and sometimes, in some media settings, the sheer number of responses provides a valuable reward. Implicitly or explicitly, these rewards lead to a re-evaluation of the expression that has triggered the responses which will affect the future behavior of that agent. This framework, therefore, may shift the explanatory focus from forms of social influence (e.g. strong versus weak or positive versus negative) to the incentives and rewards of opinion expression in different social settings.

The opinion polarization model we devise and analyze throughout this paper is a first illustration of this more general program. We focus on dyadic interaction events constrained by a time-homogeneous network and show that the existence of cohesive subgroups (Morris, Citation2000; Wasserman & Faust, Citation1994) is sufficient to generate stable bi-polarization even if the subgroups are connected. More generally, this simple model shows that very basic social feedback mechanisms may be involved in processes by which an initially moderate population polarizes into two camps which strongly support opposing views. Its formulation in terms of reinforcement learning and game theory provides a new account of polarization processes that can be studied using analytical tools. Compared with more recent proposals to model bi-polarization, our model comes with a minimal set of individual-level assumptions and, most notably, does not rely on opinion homophily or bounded confidence by which interaction probabilities depend on opinion similarity. In that sense, our model identifies a previously unseen combination of mechanisms that can explain the emergence and stability of bi-polarization.

There are two basic properties that, in combination, lead to the emergence and persistence of bi-polarization in the proposed model. On the one hand, the reinforcement mechanism gives rise to a group polarization process by which a densely connected group initially inclined in one attitudinal direction becomes more extreme in interaction. See, for instance, Sunstein (Citation2002) for a review of ample empirical evidence on this phenomenon. On the other hand, the model exhibits ≫gate keeping≪ such that opinions do not spread across structural holes (Burt, Citation2004) between different communities, which prevents a single opinion from spreading over the entire network. We focus on these two properties and their interplay in Section 5.

In the presentation of our approach we proceed in the following way. In Section 2 we review the field of opinion dynamics with a special focus on recent approaches to model opinion bi-polarization. We describe our model in Section 3 and provide all the details on the implementation to allow for the reproduction of the results. In Section 4 we take a macroscopic and a microscopic perspective to analyze model realizations on random geometric graphs (Dall & Christensen, Citation2002; Penrose, Citation2003) in order to illustrate how the mechanism leads to stable polarization. Section 5 is devoted to the mathematical analysis of the model and establishes the fact that the learning scheme approaches the corresponding ≫opinion game≪ in the course of a simulation. We show that the model leads to a group polarization process by which opinions become extreme in densely connected groups and that structural holes prevent a single opinion from spreading over the entire network. We then focus on a two-community setting to provide a more specific account of how the interplay of these two properties leads to polarization. In Section 6 we draw a conclusion on the paper as a whole and discuss the implications of combining opinion dynamics with reinforcement learning for future research.

2. Modeling opinion polarization

Modeling opinion bi-polarization has recently received considerable attention in the opinion dynamics community. In the last few years, a series of possible interaction mechanisms have been proposed by which a population polarizes and approaches a state with a bi-polar opinion distribution (Baldassarri & Bearman, Citation2007; Dandekar et al., Citation2013; Duggins, Citation2017; Flache & Macy, Citation2011; Friedkin, Citation2015; Macy et al., Citation2003; Mäs & Bischofberger, Citation2015; Mäs & Flache, Citation2013) and researchers have started to assess the relevance of these competing explanatory variants through experiments (Chacoma & Zanette, Citation2015; Moussaïd, Kämmer, Analytis, & Neth, Citation2013; Takács, Flache, & Mäs, Citation2016). The comparison of different models in terms of experimentally founded microscopic assumptions on the one hand and their capabilities to generate plausible macroscopic outcomes on the other is a quite remarkable scientific program, as it aims at an empirical justification of rather different links in the explanatory chain (Hedström & Ylikoski, Citation2010).

A major motivation for such a mechanism-based approach to polarization has been the fact that most early models of opinion dynamics lead to either consensus or disagreeing opinions that are not polarized. These social influence models (Abelson, Citation1964; DeGroot, Citation1974; French, Citation1956; Friedkin & Johnsen, Citation1990) are generally based on mechanisms of iterated weighted averaging by which people's views on an issue tend to become more similar.¹ Classic models (Abelson, Citation1964; DeGroot, Citation1974; French, Citation1956) do not entail the possibility to generate persistent bi-polarization in a strongly connected (irreducible) network (cf. Dandekar et al., Citation2013) even if the opinion space becomes arbitrarily complex (Chatterjee & Seneta, Citation1977; DeGroot, Citation1974). Within this linear class of models, persistent diversity may be achieved by the introduction of stubborn agents that, to a certain degree, are attached to their initial opinion. In the models devised by Friedkin, Johnsen and co-workers (Friedkin, Citation1999, Citation2015; Friedkin & Johnsen, Citation2011; Parsegov, Proskurnikov, Tempo, & Friedkin, Citation2017) this is modeled by an inhomogeneous term in the social influence system that balances the importance assigned to the initial opinion against the effects of the social influence process. By defining the patterns of mutual influence and persistence of initial opinions appropriately, this framework has proven quite productive in generating a wide range of macro outcomes such as choice shifts (Friedkin, Citation1999) and most recently the evolution of belief systems (Friedkin et al., Citation2016; Parsegov et al., Citation2017). If individual susceptibilities depend on the initial opinions, even a bimodal opinion distribution may be a stable outcome of a linear averaging process (Friedkin, Citation2015).
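The iterated weighted averaging behind the classic social influence models can be sketched in a few lines of Python. This is only an illustration of the general DeGroot-style update; the four-agent weight matrix `W` and the initial opinions `x0` are hypothetical values chosen for the example and do not come from any of the cited papers.

```python
def degroot_step(weights, opinions):
    """One round of weighted averaging: x_i <- sum_j w_ij * x_j."""
    return [sum(w * x for w, x in zip(row, opinions)) for row in weights]


def degroot_run(weights, opinions, steps):
    """Iterate the averaging rule for a fixed number of rounds."""
    for _ in range(steps):
        opinions = degroot_step(weights, opinions)
    return opinions


# Hypothetical 4-agent line network with self-weights; every row sums to 1.
W = [
    [0.50, 0.50, 0.00, 0.00],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.00, 0.00, 0.50, 0.50],
]
x0 = [0.0, 0.0, 1.0, 1.0]  # an initially split population

x_final = degroot_run(W, x0, steps=500)
spread = max(x_final) - min(x_final)  # distance between the most extreme agents
```

Because this example network is connected, the initially split population ends up in consensus: `spread` shrinks towards zero, which is the behavior that motivates the search for additional mechanisms.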

We note that global agreement is also the outcome of most socio-physics models of social dynamics which—based on an analogy to simple spin systems (Ising, Citation1925) —often conceive opinions as a binary variable (e.g. ≫yes≪ versus ≫no≪) on which agents agree in interaction (e.g. by copying). There is a huge body of literature devoted to the analysis of the statistical properties of systems in which agents connected in a network exchange binary opinions and we refer to Castellano, Fortunato, & Loreto (Citation2009) for a review (but see, for instance, Frachebourg & Krapivsky, Citation1996; Slanina & Lavicka, Citation2003; Sood & Redner, Citation2005; Galam, Citation2008; Banisch, Lima, & Araújo, Citation2012; Gleeson, Citation2013 as well as Moran, Citation1958; Kimura & Weiss, Citation1964; Clifford & Sudbury, Citation1973 for predecessors in theoretical biology). As long as no noise (Carro, Toral, & San Miguel, Citation2016) or anti-conformism (Galam, Citation2004) is added to the system, the opinion imitation mechanism leads to global agreement whenever the interaction network is connected, because under these conditions the dynamics give rise to a random walk with absorbing boundaries (Banisch, Citation2014b, Citation2016).

The approach taken in this paper can be seen as a combination of physics-inspired binary opinion dynamics and social influence models. Agents are allowed to express two alternative opinions (binary choice) but evolve a private evaluation of these alternatives that takes continuous values. Besides accounting for the emergence and persistence of bi-polarization even if the interaction network consists of one connected component, this combination also leads to microscopically more plausible opinion change dynamics than socio-physics models, as agents do not constantly switch back and forth between the two opinions but may go through phases of confidence and indecision (see Figure 4).

Figure 1. Runnable MATLAB code of the model.

Figure 2. Runnable MATLAB code of the random graph model.

Figure 3. Time evolution of the model. The plot shows 10000 iterations for 100 agents (on average 100 expressions by each agent) along with the system state at t = 0, 2500, 5000, 7500, 10000. The size of the nodes represents their support level and the color which opinion is favored. The average opinion represents the fraction of agents that support 1 and the support strength is the value difference averaged over the respective sets of supporters. In the definition of the polarization dispersion and bimodality measures we follow DiMaggio et al. (Citation1996).


Figure 4. Time evolution of three selected individuals in the above example run. The plot shows 10000 iterations (on average 100 expressions by each agent). The yellow stars represent opinion change events where the values associated with the two alternatives change ranking.

The first generation of opinion dynamics models that try to address the issue of stable opinion plurality is based on the continuous social influence models and includes further constraints, most prominently, assumptions about value or opinion homophily (McPherson, Smith-Lovin, & Cook, Citation2001). In the opinion dynamics context this type of homophily refers to the fact that similar views may lead to attraction and an increasing likelihood of interaction. Paired with positive social influence it describes a process of repeated influence events by which ≫similarity leads to interaction and interaction leads to still more similarity≪ (Banisch, Araujo, & A, Citation2010, 109). In the classical models with bounded confidence (Deffuant et al., Citation2000; Hegselmann & Krause, Citation2002) this is implemented via a threshold mechanism that switches off influence if the opinion discrepancy exceeds a certain threshold. In relation to linear social influence models, this effectively means that the matrix of social influences $A$ is a function of the opinions that agents hold, and that it may disintegrate into disconnected groups of influence due to the threshold mechanism. This procedural disintegration into disconnected influence groups is necessary for persistent disagreement in this class of models.
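The threshold mechanism can be illustrated with a short Python sketch in the spirit of the synchronous bounded confidence model: each agent averages only over those agents whose opinion lies within a confidence threshold of its own. The function names, the initial opinions and the threshold value are assumptions made for this illustration, not taken from the cited papers.

```python
def bounded_confidence_step(opinions, threshold):
    """Each agent adopts the mean opinion of all agents within `threshold` of it."""
    updated = []
    for x in opinions:
        peers = [y for y in opinions if abs(x - y) <= threshold]  # includes x itself
        updated.append(sum(peers) / len(peers))
    return updated


def bounded_confidence_run(opinions, threshold, steps=50):
    """Apply the synchronous update a fixed number of times."""
    for _ in range(steps):
        opinions = bounded_confidence_step(opinions, threshold)
    return opinions


# Hypothetical initial opinions forming two loose groups on the interval [0, 1].
x0 = [0.0, 0.1, 0.2, 0.8, 0.9, 1.0]
final = bounded_confidence_run(x0, threshold=0.25)
```

In this example the two groups never influence each other, so disagreement persists, but each group contracts to its own mean (about 0.1 and 0.9). The surviving opinions are therefore less extreme than the initial extremes 0 and 1.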

The influential model by Axelrod (Citation1997) also achieves global polarization by mechanisms of local convergence due to a homophily mechanism that disables mutual influence if the individuals become too different. In his multi-dimensional model, the interaction probability depends on the opinion overlap (i.e. the number of traits the two agents already share) and vanishes if agents are unaligned on all dimensions. Unlike other models with a multidimensional representation of opinions (Banisch et al., Citation2010; Lorenz, Citation2007; Mäs & Flache, Citation2013), the Axelrod model relies on a spatial embedding in the form of a lattice that shares certain similarities with the random geometric graph model we use in Section 4 to illustrate our results. This model gives rise to a number of spatial clusters of agents with aligned opinions and these configurations of homogeneous regions become stable as the overlap across regions and consequently the probability of cross-cutting assimilation vanishes. When run on a spatial topology, our model also gives rise to spatial clusters of opinions, but the interaction probabilities are not affected by this such that agents at the interface between different clusters are still exposed to non-confirming feedback. The social feedback process stabilizes because the locally prevailing opinion is more often reinforced in the interaction with peers. Therefore, while cultural drift or communication errors implemented as small random perturbations are ≫eroding the borders≪ (Centola, Gonzalez-Avella, Eguiluz, & San Miguel, Citation2007, p. 907) between culturally stable regions in the cultural dissemination model (Klemm, Eguiluz, Toral, & Miguel, Citation2003), the interfaces between different opinion clusters are stable with respect to noise in the social feedback model where such perturbations occur naturally through exploration.

Bounded confidence can lead to a stable co-existence of a plurality of opinions in a population. However, unless ≫extreme≪ agents are artificially inserted (Deffuant, Amblard, Weisbuch, & Faure, Citation2002), bounded confidence models cannot drive a population into the extremes or lead to the emergence of antagonistic opinion groups as the opinion averaging procedure always locally reduces diversity. It crucially depends on the initial opinion diversity whether or not multiple opinions survive and the model will always end up with final opinions that are more moderate than the initial extremes.

The recent approaches to modeling polarization dynamics introduce new mechanisms of opinion exchange and influence in addition to positive influence and homophily. They have been classified into models that rely on the assumption of negative social influence and models that draw upon ideas from persuasion theory (cf. Mäs & Bischofberger, Citation2015). An alternative introduced in the physics literature on opinion dynamics differentiates between a continuous internal opinion and discrete opinion expression (Martins, Citation2008). We will briefly describe all three variants here.

The first branch of models seeks an explanation of polarization patterns by assuming a negative social influence in the interaction of distant agents such that the encounter and communication of two agents with very different views leads them to adopt even more distant positions in the opinion space (Baldassarri & Bearman, Citation2007; Flache & Macy, Citation2011; Macy et al., Citation2003; Mark, Citation2003). Although such models can generate stable polarization patterns that drive the population into two maximally extreme clusters, a recent experimental study found no indication of negative influence of this kind (Takács et al., Citation2016).² It is noteworthy that the combination of positive and negative social influence can lead to polarization but that this feature disappears in the presence of opinion homophily (Mäs & Bischofberger, Citation2015).

The second type of models capable of explaining polarization is based on psychological work on persuasion and attitude change (Ajzen, Citation2001; Fishbein, Citation1963; Lord, Ross, & Lepper, Citation1979; Petty, Cacioppo, & Goldman, Citation1981). Persuasion models generally assume that communication partners exchange arguments about the object on which an attitude is formed and that new arguments are learned from an interaction partner if they are in support of an agent's view. The model by Mäs & Flache (Citation2013) is based on an explicit representation of attitudes borrowed from expectancy-value theory which treats an attitude as a weighted combination of a set of pro and con-arguments regarding certain aspects of the issue under discussion (see also Urbig & Malitz, Citation2005, for an early related model). Their so-called ≫argument-communication theory of bi-polarization≪ posits a mechanism for the co-evolution of arguments and the associated weights with the latter encoding a limited capacity to process all arguments at once. That is, only a subset of arguments is considered to be relevant and recently discussed issues play a more important role. At each time step, two agents are chosen at random with a probability that is proportional to the similarity of their attitudes (opinion homophily). In the interaction, the first agent adopts a randomly chosen argument from its interaction partner (positive social influence). The larger the homophily, the more likely an agent with a similar opinion will be selected as interaction partner. As the similarity in attitudes of both agents may come about by different subsets of arguments (those that are currently relevant), the focal agent is likely to receive an argument in favor of its own attitude from that partner. As a result, the model leads to a process where agents with a similar attitude mutually reinforce that attitude by the exchange of supportive arguments.³

The model proposed in Mäs & Flache (Citation2013) operationalizes a deliberative argument-based route to polarization in line with the persuasive argument theory described in the seminal paper on group polarization by Sunstein (Citation2002). Sunstein reviews experimental evidence on group polarization and choice shifts through group discussions and debates its political and institutional implications. His argument is that groups inclined to a certain attitudinal direction rely on a limited argument pool which is biased in the respective direction as there is a disproportionate number of supporting claims. As arguments are exchanged the group members acquire new arguments that tend to speak even more in favor of the initial direction. In the Mäs/Flache model such biased and limited argument pools come about by an increased interaction probability of agents which already hold similar attitudes, leading to enclaves of individuals with similar and increasingly extreme inclinations. It is worth noting that such a persuasive argument account is also in agreement with the functional approach to argumentative reasoning put forth by Mercier & Sperber (Citation2011), who, however, put more weight on the intuitive sources of attitudes and supporting arguments. The learning process described in the present paper is potentially more related to such an account. In particular, it resonates well with the paradigm of automatic (implicit) attitude activation and the learning of evaluative associations put forth by Fazio and colleagues (Fazio, Citation2001; Fazio et al., Citation2004).

Another persuasion-based proposal to the modeling of bi-polarization has been made by Dandekar et al. (Citation2013) who base their argument on the work of Lord et al. (Citation1979) on biased assimilation. The principal idea is that people who are very convinced of their view tend ≫to accept “confirming” evidence at face value while subjecting “disconfirming” evidence to critical evaluation≪ (p. 2098). This is sometimes also referred to as confirmation bias. Unlike the model described above, Dandekar et al. (Citation2013) take an abstract perspective in terms of opinion representation and include biased argument evaluation in the classic repeated averaging model (DeGroot, Citation1974), which operates on a single continuous opinion dimension ($o_{i} \in [0, 1]$). A bias function is introduced that decides about strength and direction of opinion change as a function of the current conviction of one or the other view (the extremes $0$ and $1$ are interpreted as two opposing opinions on an issue). The model contains DeGroot’s averaging process as a limiting case and Dandekar et al. (Citation2013) show that there is a critical bias level at which the process becomes polarizing. Notably, polarization may occur even in the absence of opinion homophily.

A third mechanism capable of producing a bimodal opinion distribution has been proposed by Martins (Citation2008). The model originated from physics-inspired models where agents face binary choices and update their opinion by imitation of their neighbors, which generally leads to all agents holding the same opinion. The main idea of Martins (Citation2008) is to introduce an internal opinion that encodes how many encounters are needed for an agent to change opinion. That is, whenever an agent observes another one expressing the same opinion, the internal opinion is reinforced in the respective direction such that it becomes still harder for the agent to switch. These internal opinions, also referred to as inflexibility (Martins & Galam, Citation2013), may polarize in the sense that two opposing camps of more and more inflexible or extreme supporters of the different options emerge.

The mechanism this paper adds to the polarization literature is related to this last approach in that it differentiates between the discrete options of opinion expression and their internal evaluation. As in persuasion models, opinions are reinforced by encountering agents with similar views, but in our model this ≫reinforcement in agreement≪ is mediated by a very different psychological process. Namely, reinforcement, or respectively weakening, of support for an expressed view is not obtained by a costly process of argument persuasion but rather by the positive (negative) experience that agreement (disagreement) brings about. Agents form their opinions on the basis of the social feedback they obtain by expressing them. The learning scheme employed in the model is in line with the psychological theory of Fazio et al. (Citation2004) which views attitudes as evaluative associations of varying strength mediated through positive and negative experience. Recent neuro-physiological studies provide support for such evaluative mechanisms in social interaction (Campbell-Meiklejohn, Bach, Roepstorff, Dolan, & Frith, Citation2010; Ruff & Fehr, Citation2014; see concluding section). Moreover, rooted in reinforcement learning, our model is amenable to game-theoretic considerations which have recently proven very productive in relating agent-based models to an economic model of inter-temporal optimization (Banisch & Olbrich, Citation2017). Here (in Section 5) we link to the developed body of literature on games on networks (Jackson & Zenou, Citation2014, and references therein) and derive analytical results for the stability of polarization.

3. Model description

3.1. Theoretical model

Suppose there are two opinions that agents can adopt and express. We denote the opinion currently held by agent $i$ as $o_{i}$, $i$ being an agent index, and set $o_{i} \in \{- 1, 1\}$ for further convenience. In the opinion model we put forth here, an agent (say $i$) is chosen at random and expresses its current opinion $o_{i}$ to a randomly chosen neighbor $j$. That is, a first agent $i$ is chosen uniformly from the set of all agents and the second agent $j$ is sampled out of the set of $i$'s neighbors. Agent $i$ expresses its opinion $o_{i}$ to agent $j$ and this agent responds to $i$'s expression with approval or disapproval (agreement or disagreement) depending on its current opinion $o_{j}$.

We further assume that agents become more convinced of an opinion if it is approved by their interaction partners and that their conviction in an expressed opinion is challenged if others disagree. This is accounted for by two real-valued terms per agent— $Q_{i} (1)$ and $Q_{i} (- 1)$ —that capture how well the expression of $1$ and $- 1$ respectively is perceived by peers in an agent’s social environment. That is, the $Q_{i} (o)$ represent an internal evaluation of the different options based on the social response the agent obtains on expressing them. These values are updated as

(1)

Q_{i} (o) \leftarrow \begin{cases} (1 - α) Q_{i} (o) + α r_{i} & \text{if } o \text{ is the expressed opinion} \\ Q_{i} (o) & \text{else.} \end{cases}

with

(2)

r_{i} = o_{i} o_{j}


leading to a positive feedback for $o_{i} = o_{j}$ and to a negative one if $o_{i} \neq o_{j}$ . The parameter $α$ is referred to as the learning rate (see below) and governs the magnitude of change. In the context of social influence opinion dynamics, $α$ can be seen as the susceptibility of an agent to revise its opinion evaluation on the basis of the social feedback obtained on expressing it. If not otherwise stated, it is set to $α = 0.05$ for the simulations performed in this paper.
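The update (1) with the reward (2) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' MatLab code; the function names `reward` and `update_q` and the dictionary that stores the two values per agent are assumptions made here for readability.

```python
ALPHA = 0.05  # learning rate used for the simulations in the text


def reward(o_i, o_j):
    """Social feedback (2): +1 on agreement, -1 on disagreement."""
    return o_i * o_j


def update_q(q, expressed, r, alpha=ALPHA):
    """Apply update (1): only the value of the expressed opinion changes.

    q is a dict {1: Q(1), -1: Q(-1)} (a representation assumed here).
    """
    new_q = dict(q)
    new_q[expressed] = (1 - alpha) * q[expressed] + alpha * r
    return new_q


# An agent expresses -1 to a neighbor who holds opinion 1 (disagreement).
q = {1: 0.2, -1: 0.6}
q_after = update_q(q, expressed=-1, r=reward(-1, 1))
print(q_after)
```

In this example the value of the expressed opinion moves from 0.6 toward the reward −1 by a fraction $α$ of the gap, while the value of the unexpressed opinion is left unchanged.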

On expressing their current opinion, agents thus receive an affirmative or disconfirming response depending on the current opinions in their neighborhood. Agreement signals approval and leads to a positive experience ( $r_{i} = 1$ ) by which the evaluation $Q_{i} (o)$ of the expressed opinion increases. Conversely, a disconfirming response gives rise to negative feedback ( $r_{i} = - 1$ ) and decreases the value associated with the expressed opinion. We shall therefore interpret the values $Q_{i} (o)$ associated with the two opinions as the strength with which the respective view is supported by $i$ or, likewise, as $i$'s conviction regarding the two alternatives. In particular, in the case where only two competing opinions may be expressed, the difference $Δ Q_{i} = Q_{i} (1) - Q_{i} (- 1)$ can be interpreted as a conviction that one opinion is more favorable. Consequently, we assume that when asked to articulate their current opinion or to respond to such an articulation, agents express the option they support more strongly at the current time step. In other words, $o_{i} = - 1$ if $Q_{i} (- 1) > Q_{i} (1)$ and $o_{i} = 1$ if $Q_{i} (- 1) < Q_{i} (1)$ or, more generally:

(3)

o_{i} = \arg\max_{o} Q_{i} (o) .

We assume that agents generally follow (3) in their expression choice but deviate from this scheme with a small probability $ϵ = 0.1$ . The mathematical reason is that this ensures that both opinion options are tested by the agents, so that both $Q$-values are updated from time to time; for this reason $ϵ$ is usually referred to as the exploration rate. Exploration also seems plausible, especially when the difference in conviction is small.⁴ Moreover, one may argue that the noise introduced by $ϵ$ accounts for issues that might occur in communication, such as misunderstanding or misinterpretation of an articulation.
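The expression rule (3) together with exploration can be sketched as follows. The sketch reads "deviate with probability $ϵ$" as "express the non-preferred opinion with probability $ϵ$", which is one possible reading of the text; the function name `express` and the tie-breaking rule are assumptions made here.

```python
import random

EPSILON = 0.1  # exploration rate from the text


def express(q, epsilon=EPSILON, rng=random):
    """Return the opinion to express.

    q is a dict {1: Q(1), -1: Q(-1)}. The greedy choice follows (3);
    with probability epsilon the other opinion is expressed instead.
    Ties are broken toward -1, an arbitrary choice made for this sketch.
    """
    greedy = 1 if q[1] > q[-1] else -1
    if rng.random() < epsilon:
        return -greedy  # exploration: test the currently less valued option
    return greedy


rng = random.Random(0)
q = {1: 0.9, -1: 0.1}
sample = [express(q, rng=rng) for _ in range(10)]
print(sample)
```

Because the non-preferred option is expressed from time to time, its value keeps being updated, which is the mathematical role of exploration described above.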

Readers familiar with game theory and reinforcement learning will have realized that this model set-up casts opinion dynamics as a game played repeatedly on a network in which agents learn the best response by a simple form of independent Q-learning (Busoniu, Babuska, & De Schutter, Citation2008; Sutton & Barto, Citation1998). In this context, opinion expression should be seen as an action that leads to a certain reward $r$ and the (state-less) Q-values associated to the two possible actions (i.e. $o_{i} \in \{- 1, 1\}$ ) are updated based on this reward signal. In fact, we will make use of this analogy in the sequel to mathematically characterize a series of stylized situations that will be helpful to provide an overall picture of the model behavior.

We also note that the model behavior is essentially determined by the way in which the reward system is designed. In this basic model agents learn to avoid dissonance and play a coordination game, but we envision that different more complex and possibly heterogeneous rewards are a propelling ingredient that deserves further exploration. The main purpose of this paper, however, is to introduce this type of opinion game in the context of opinion models and polarization dynamics in particular, and to demonstrate that opinion polarization can result from very few relatively mild basic assumptions:

1. agreement is positively, disagreement negatively experienced,

2. these experiences drive opinion conviction ( $Δ Q_{i}$ ) and expression ( $o_{i}$ ),

3. the probability of interaction is structured.

While the first two assumptions are realized through rewards and the update scheme as specified by (1)–(3), the third aspect is included in the model by an interaction network that determines the probability with which pairs of agents engage in communication with one another. In order to illustrate the model’s capability to generate stable bi-polarization in connected networks with a direct or indirect path between all pairs of agents, we generate a random geometric network (Dall & Christensen, Citation2002; Penrose, Citation2003) to define the neighborhood structure. According to that network model, $N$ agents are assigned a random position in the unit square $(x_{i}, y_{i}) \in [0, 1] \times [0, 1]$ and a link is established whenever the distance between two agents is below a threshold $r$ . Notice that the average degree of the network depends on both the number of agents and the threshold $r$ and can be approximated as $π N r^{2}$ for large $N$ (Penrose, Citation2003). Consequently, the network density is proportional to $π r^{2}$ . A further question typically addressed in the analysis of random graph models and of special interest to the present discussion concerns the critical threshold $r_{c}$ above which a giant connected component is formed. In two dimensions, this threshold has been found by Dall & Christensen (Citation2002) to scale as $r_{c} = 4.52 / N$ with the system size. Notice finally that random geometric graphs are a special type of random graphs that differ from the latter especially regarding clustering (Dall & Christensen, Citation2002).
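The construction of the random geometric graph and the approximation $π N r^{2}$ for the average degree can be checked with a short sketch. The function name `random_geometric_graph`, the adjacency-list representation, and the values of `n` and `r` are illustrative choices made here, not taken from the paper.

```python
import math
import random


def random_geometric_graph(n, r, rng=random):
    """Place n agents uniformly in the unit square; link pairs closer than r."""
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = pos[i][0] - pos[j][0]
            dy = pos[i][1] - pos[j][1]
            if math.hypot(dx, dy) < r:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return pos, neighbors


rng = random.Random(1)
n, r = 1000, 0.05          # illustrative values, chosen so that r is small
pos, neighbors = random_geometric_graph(n, r, rng)
mean_degree = sum(len(nb) for nb in neighbors) / n
approx = math.pi * n * r ** 2   # large-N approximation quoted in the text
print(mean_degree, approx)
```

Agents close to the border of the unit square have part of their disc of radius $r$ outside the square, so the empirical mean degree tends to lie slightly below the approximation.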

This graph model is very simple, comes with only a single parameter and is well-suited for the purpose of illustration as it naturally embeds into two dimensions. In fact, spatial random graphs can be seen as an intermediate between the regular lattice and the classical Erdős-Rényi random graph model which have both been used frequently in previous simulation studies. We do not further specify what the similarity in the unit square accounts for but remark that besides spatial proximity it may also mimic social proximity patterns as they come about due to homophily regarding age, status or socio-economic situation (Lazarsfeld & Merton, Citation1954; McPherson et al., Citation2001). Notice, again, that contrary to most existing models of opinion dynamics and especially those that aim at explaining bi-polarization, our model comes without any assumption about homophily regarding opinions. One of the main purposes of this paper is to show that static patterns of interaction and influence as caricatured by random spatial networks are sufficient for bi-polarization to emerge even if indirect influence patterns expand over the entire population.

3.2. Implementation details

In order to facilitate the reproduction of results and give readers the possibility to make their own explorations with the model, we provide all the details of the model in the form of two code snippets. Notably, the model is simple enough to be implemented in a few lines of MatLab (R2013a for Mac) and is executable as presented ( $N$ and the number of iteration steps have to be specified). One snippet presents the model initialization and the iteration loop, the other the random geometric graph model. Notice that for the computation-intensive experiments the model was also implemented in C++.

In addition, two online implementations of the model are available under

• www.universecity.de/demos/OpinionValuesSmall.html

• www.universecity.de/demos/OpinionValuesBig.html

The former is a model with $N = 100$ agents which is also described in the next section. The latter is an implementation with $N = 10000$ agents where the network is not visualized for performance reasons. Notice that running the models requires a browser with WebGL support.

4. Model behavior on random geometric graphs

The aim of this section is to provide an intuition about the model behavior. We will look in some detail at a realization of the model on a random geometric graph. We have used a network parameter significantly above the critical value ( $r > r_{c}$ ) in order to generate a graph with a single connected component, highlighting that the social feedback model goes beyond social influence models in its capability to generate stable opinion bi-polarization in connected graphs. We will consider individual trajectories as well as some macroscopic indicators and show that polarization patterns emerge in these networks.

4.1. Simulation setting

As described above, we generate a random spatial network to define the neighborhood structure of the agents. $N$ agents are assigned a random position in the unit square $(x_{i}, y_{i}) \in [0, 1] \times [0, 1]$ and a link is established whenever the distance between two agents is below a threshold $r$ . The agent population is initialized by setting the initial values $Q_{i} (1), Q_{i} (- 1)$ at random according to a uniform distribution within $[0, 1]$ . This defines the initial opinions (i.e. what the agents express at $t = 0$ ). Due to the initialization of the $Q$-values, on average, half of the population will initially hold opinion $1$ and the other half opinion $- 1$ .

The exemplary model realization discussed in this section uses $N = 100$ agents. The threshold for the random geometric graph model is $r = 0.175$ and the resulting network is shown in the accompanying figure. The learning rate is $α = 0.05$ and the exploration rate $ϵ = 0.1$ .
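The simulation setting above can be assembled into a compact, self-contained sketch. It is an illustrative Python re-statement under stated assumptions, not the authors' MatLab or C++ code: the names `build_graph`, `preferred` and `simulate` are hypothetical, exploration is applied only to the expressing agent while the responder answers with its currently preferred opinion, and isolated agents are simply skipped.

```python
import math
import random


def build_graph(n, r, rng):
    """Random geometric graph in the unit square as adjacency lists."""
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    return [[j for j in range(n) if j != i and
             math.hypot(pos[i][0] - pos[j][0], pos[i][1] - pos[j][1]) < r]
            for i in range(n)]


def preferred(q):
    """Greedy choice (3); ties go to -1 (arbitrary choice for this sketch)."""
    return 1 if q[1] > q[-1] else -1


def simulate(n=100, r=0.175, alpha=0.05, eps=0.1, steps=10000, seed=0):
    rng = random.Random(seed)
    nbrs = build_graph(n, r, rng)
    # Initial values drawn uniformly from [0, 1], as in the text.
    Q = [{1: rng.random(), -1: rng.random()} for _ in range(n)]
    for _ in range(steps):
        i = rng.randrange(n)              # expressing agent
        if not nbrs[i]:
            continue                      # isolated agent: nobody responds
        j = rng.choice(nbrs[i])           # responding neighbor
        o_i = preferred(Q[i])
        if rng.random() < eps:
            o_i = -o_i                    # exploration by the expressing agent
        o_j = preferred(Q[j])             # assumption: responder does not explore
        Q[i][o_i] = (1 - alpha) * Q[i][o_i] + alpha * o_i * o_j
    return Q, nbrs


Q, nbrs = simulate()
convictions = [q[1] - q[-1] for q in Q]
share_plus = sum(c > 0 for c in convictions) / len(convictions)
print(share_plus)
```

Since the initial values lie in $[0, 1]$ and every update is a convex combination with a reward in $\{-1, 1\}$, all values stay within $[-1, 1]$ and the convictions $Δ Q_{i}$ within $[-2, 2]$.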

4.2. Macroscopic and microscopic dynamics

An example run of the model with $N = 100$ agents is shown in the accompanying figure. For 10000 simulation steps, it shows the microscopic system configuration at times 0, 2500, 5000, 7500 and 10000 along with different macroscopic observables that measure the amount of polarization in the population. Like polarization mechanisms, the empirical characterization of polarization patterns has recently received some attention, and we refer to Bramson et al. (Citation2016) for an accessible overview of different measures and a conceptual discussion of the different aspects of polarization they account for. For the purposes of this paper, we follow DiMaggio et al. (Citation1996) in the definition of polarization measures and define dispersion as the variance $σ^{2}$ over the distribution of convictions $Δ Q_{i} = Q_{i} (1) - Q_{i} (- 1)$ and bimodality by its kurtosis

(4)

κ = \frac{\frac{1}{N} \sum_{i = 1}^{N} {(Δ Q_{i} - \overline{Δ Q})}^{4}}{σ^{4}} - 3.


Notice that the kurtosis is most often interpreted as a measure of outliers, with $κ = 0$ for the normal distribution. Borrowing the interpretation from DiMaggio et al. (Citation1996, p. 694–696),⁵ positive kurtosis indicates a very peaked consensus distribution whereas it becomes negative for flat and even more so for bimodal distributions, reaching $κ = - 2$ in the two-peaked case. For the sake of visualizing bimodality in the same interval as the other measures, the figure therefore shows bimodality transformed as $(κ + 2) / 2$ such that a value of zero indicates complete bimodality and a value of one no deviation from the normal distribution.
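Dispersion, the kurtosis (4) and the transformed bimodality index can be computed directly from a list of convictions. The following is a minimal sketch; the function names are chosen here for illustration, and population moments (division by $N$) are used as in (4).

```python
def dispersion(x):
    """Variance of the convictions (population variance, division by N)."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def excess_kurtosis(x):
    """Kurtosis as in (4): fourth central moment over sigma^4, minus 3."""
    m = sum(x) / len(x)
    fourth = sum((v - m) ** 4 for v in x) / len(x)
    return fourth / dispersion(x) ** 2 - 3


def bimodality_index(x):
    """Transformed kurtosis (kappa + 2) / 2 used for plotting."""
    return (excess_kurtosis(x) + 2) / 2


# Two equally sized groups at the extreme convictions -2 and 2.
two_peaks = [-2.0] * 50 + [2.0] * 50
print(dispersion(two_peaks), excess_kurtosis(two_peaks), bimodality_index(two_peaks))
# prints 4.0 -2.0 0.0
```

The two-peaked example reproduces the limiting values quoted above: $κ = -2$ and a transformed index of zero.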

In addition to the measures used by DiMaggio et al. (Citation1996), the polarization measure introduced in Flache & Macy (Citation2011), which we refer to as dissimilarity, is shown. Dissimilarity is defined as the standard deviation of the distribution of opinion distances between all pairs of agents and is zero for consensus and one for the case of two equally sized groups at the extremes (Mäs & Flache, Citation2013, p. 8). Moreover, the average opinion (fraction of agents that express $o_{i} = 1$ ) and the average strength of support (average over the absolute values of $Δ Q_{i}$ ) within the two different groups of supporters are shown over time.

Initially, due to the random initialization of values, approximately one half of the population supports opinion 1 and the other half supports −1. Notice that these fractions do not change much during this simulation run even if single agents do change their opinion. The left-most network shows that supporters are initially distributed without a particular spatial organization. The individual convictions $Δ Q_{i}$ , which account for the strength with which the respective alternative is supported (size of the nodes in the network plots), are at a relatively low level for both opinions. The initial average strength of support (dashed curves), that is, the average absolute value over convictions of supporters of 1 and −1 respectively, is around $0.4$ for both alternatives.⁶

The strength of support decreases during an initial period of alignment because, due to the random initial configuration, agents meet on average an equal number of agents from both camps, such that no opinion is clearly favored and the $Δ Q_{i}$ tend to zero. However, this initial period results in a strong spatial organization into opinion regions that is clearly visible at step 2500. This spatial distribution of opinions remains rather stable in the subsequent steps. While strength of support is still small for most agents at $t = 2500$ (approximately at the initial level), some clusters have emerged in which one opinion is more strongly supported after 5000 steps. In particular, we observe two regions in which agents strongly support 1 (blue) which are connected by a band of agents that moderately support 1. We also observe two disconnected clusters where −1 (red) is supported more and more strongly. However, decisive changes may still occur at the interfaces between regions in which different opinions are supported and some clusters may be invaded in a long transient. Despite support strength remaining smaller in those areas, on average, support strength increases for both opinions.

This evolution is also captured by the different polarization measures. First, dispersion and dissimilarity behave very similarly and show an initial decrease in polarization followed by a steady increase once the opinion clusters have formed. Kurtosis, which relates to the bimodality of a distribution, behaves differently. Namely, there is an initial increase from a moderate value toward a value close to one, capturing the fact that the initially uniform distribution approaches a normal one with the $Δ Q_{i}$ approaching zero during the very first period. The first peak in the measure is therefore indicative of a considerable reduction in polarization. However, as with the other measures, it indicates that polarization returns to the initial level after approximately 2500 steps (second network). It decreases further but, in contrast to the other measures, the bimodality indicator reaches a stable level at around 4000 to 5000 steps. Notice that the value at which it settles is larger than zero. This is due to the fact that not all agents develop the same extreme level of conviction but some (i.e. those at the interface between different opinion regions) remain slightly less convinced. The saturation of the bimodality index at a low level therefore indicates that from about step 4000 to 5000 onward there is a low but constant number of agents in an intermediate regime of conviction (cf. DiMaggio et al., Citation1996, p. 694).

To fully understand the dynamical behavior of the model, let us look at the temporal evolution of some particular agents. The accompanying figure shows the values $Q_{i} (1)$ (blue) and $Q_{i} (- 1)$ (red) for three different agents. Events where the ranking of values changes, so that another opinion will be chosen in expression and response, are highlighted by yellow stars. Aside from providing a better intuition of typical agent behaviors under the social feedback dynamics, this close-up view highlights the differences of the model with respect to continuous social influence models and physics-inspired binary state models (see Section 2).

The first one (agent 29, located in the upper right corner of the networks) starts in a state where $Q (1) > Q (- 1)$ but both values are at a relatively high level. Initially, this agent is surrounded by an equal number of blue and red agents, which leads to a decrease in both $Q (1)$ and $Q (- 1)$ as the expected reward (local feedback) is zero in such a situation. However, already at step 2500 (see the second network) the neighborhood of 29 has aligned to −1 (red) and from that time on the value $Q (- 1)$ increases steadily until it saturates at $Q (- 1) = 1$ , which is the expected social reward in a homogeneous surrounding. Moreover, $Q (1)$ decreases step by step as the result of negative feedback on expressing $1$ due to the exploration probability ( $ϵ = 0.1$ ). $Q (1)$ will eventually approach −1.

This behavior is characteristic of many agents in the simulation that lack a strong initial preference. There is an initial phase of alignment in which a local consensus emerges (weakly supported, at first) which, once established, leads to a reinforcement of the respective view. Notice that agents with a strong or moderate initial support of red in that cluster in fact never change opinion. Their values both tend to zero as well, but the positive reinforcement of their initial preference begins before a change of ranking takes place.

Secondly, we consider agent 17 with a relatively strong initial preference for −1. Her neighborhood remains unaligned for a rather long time and still is at step 5000. In other words, the agent is located in the interface region between different opinion clusters that have emerged early in the simulation. As shown in the network plots, the agent forms part of a small cluster (along with 62, 75, 76) that is only slowly ≫invaded≪ by the blue opinion. Around step 7000, however, most of her neighbors express preference for opinion 1, which leads to a rapid increase of $Q (1)$ in subsequent steps.

Finally, another node at the interface between two opinion regions is considered, namely agent 69. With an initial preference for blue, this agent first aligns with the red cluster to which agent 29 belongs and learns to support this opinion rather strongly. However, as the neighboring cluster around agent 17 aligns more and more on 1, expressing −1 yields negative feedback with higher probability and leads to a decrease of $Q (- 1)$ . Consequently, between steps 6000 and 8000 agent 69 experiences a second period of opinion changes, the result of which is a slight preference for 1. It is likely that a weak to moderate preference for blue will be stable as the agent is surrounded by four red and six blue agents. However, there may be chains of random events that take the agent to temporarily supporting −1 again.

An important characteristic visible in this microscopic perspective is that agents do not immediately adapt their expressed opinion as a response to interaction with unaligned peers. The imitation process implemented in many physics-inspired models of binary opinion dynamics is replaced by a continuous re-evaluation of the two options. Periods of ≫indecision≪ during which individuals try different options alternate with periods during which agents show a clear preference of one or the other opinion.

Yet, this continuous adaptation is different from continuous models for opinion dynamics where an ≫average opinion≪ emerges within different network clusters. To the contrary, the social feedback mechanism produces polarization by reinforcement of the support assigned to an opinion once a community is locally aligned. In other words, the contraction mechanism leading to different opinion clusters in bounded confidence models (at least if the initial diversity is large enough) is replaced by a mechanism that drives opinion clusters to a higher degree of polarization (even if initial convictions are relatively close).

4.3. Global connectivity and consensus probability

Previous work by Flache & Macy (Citation2011) has started to address the effect of network density and long-range ties on the polarization process generated by different microscopic assumptions. In their paper a model with positive social influence and selection based on homophily has been compared with a model where negative ties may form if opinions are too different and drive opinions still further apart. While increased global connectivity due to the introduction of long-range ties fosters integration and consensus in the former, it fosters polarization under the latter assumptions.

In random geometric graphs, global connectivity is modulated by the distance threshold $r$ . If the radius $r$ is very small, that is, below the critical threshold of $r_{c} = 4.52 / N$ (Dall & Christensen, Citation2002), the network consists of many disconnected components. As $r$ increases above the critical value ( $r > r_{c}$ ), a giant connected component forms and the network becomes globally connected. In this experiment we aim to provide a more complete picture of the model behavior on random geometric graphs by looking at its long-run behavior as a function of the radius $r$ . For this purpose, we consider the consensus probability (the fraction of realizations that end up with all agents in the same opinion state) and compare it to the fraction of graph realizations that consist of a single connected component. Notice that the latter is a very conservative measure of global connectivity because it considers graphs with a single isolated node as not connected. Figure 5 shows this for three different system sizes of 100, 200 and 500 agents. Each data point is obtained by averaging over 100 realizations of the model after $20000 \times N$ time steps.
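The two outcome indicators of this experiment, a single connected component and consensus, can be sketched as simple predicates. The names `is_connected` and `is_consensus` are hypothetical, and the consensus check below uses only the sign of the convictions $Δ Q_{i}$, which is a simplification assumed here.

```python
from collections import deque


def is_connected(neighbors):
    """True if the graph, given as adjacency lists, is a single component."""
    if not neighbors:
        return True
    seen = {0}
    queue = deque([0])
    while queue:                      # breadth-first search from node 0
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(neighbors)


def is_consensus(convictions):
    """True if all agents favor the same opinion (same sign of Delta Q)."""
    return all(c > 0 for c in convictions) or all(c < 0 for c in convictions)


path = [[1], [0, 2], [1]]        # 0 - 1 - 2: connected
with_isolate = [[1], [0], []]    # node 2 is isolated: counted as not connected
print(is_connected(path), is_connected(with_isolate))
# prints True False
```

The second example illustrates why the connectivity criterion is conservative: a single isolated node is enough for a realization to count as not connected.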

Figure 5. Probability of consensus (all agents maximally support the same opinion) and of a single connected component as a function of the distance threshold r governing global connectivity. The inset shows the respective likelihood that a graph with a single connected component is generated and polarization is observed on it.

The figure shows that polarization is very likely in a considerable range of networks in which all agents influence one another through direct or indirect influence paths. There are three different regimes. For very low connectivity, the networks consist of disconnected parts each performing an independent social feedback process. This is the only regime where repeated averaging (Abelson, Citation1964; Friedkin, Citation1999; Friedkin & Johnsen, Citation1990) would predict persistent disagreement. In the second regime, starting with an $r$ that decreases with the system size, the probability that the graph is connected sharply increases. Notice that this transition takes place at values well above $r_{c}$ due to the fact that the probability of a single connected component is sensitive to single isolated nodes. The consensus probability, on the other hand, only gradually increases and reaches one at a level of connectivity that is considerably higher than the level at which a single connected component becomes certain. This means that in this region of the parameter space a large proportion of model realizations converges to a polarized situation despite the fact that there is an influence path between all pairs of agents. This observation becomes more pronounced when the number of agents increases. For $N = 500$ and $r$ from $0.1$ to $0.15$ , for instance, almost all realizations lead to a connected graph and a polarizing opinion formation process on it. The inset of Figure 5 illustrates this by multiplying the probability of polarization and that of a single connected component. Finally, as $r$ grows very large, the resulting networks become rather dense, less and less modular in terms of spatially divided subgroups, and consensus (all agents maximally support the same opinion) becomes the most likely outcome of the model.

In order to show that the spatial organization into relatively (but not completely) segregated communities of agents (see the network plots of the example run) is decisive for polarization to emerge, Figure 6 reprints the results for $N = 200$ and compares them to an Erdős–Rényi random graph of the same size, that is, a graph without any spatial organization. Notice that the probabilities are shown as a function of network density in order to compare the two models. While the probability of consensus remains low compared to the probability of a single component in the spatial random graph model, both behave very similarly in the Erdős–Rényi model. That is, in the Erdős–Rényi graph we observe consensus whenever there is a single connected component.⁷ This indicates that the spatial organization into several communities of agents that influence one another and interact less frequently with other groups is decisive for polarization to emerge under the social feedback process developed in the paper. It points at the community structure of a network as the most important factor to explain polarization in the context of our model. In the remainder of this paper, we will support this interpretation by game-theoretic considerations.

Figure 6. Comparison of the random geometric graph (green) and the Erdős–Rényi random graph (black) for N = 200. Probability of consensus (all agents maximally support the same opinion) and a single connected component as a function of the network density.

5. Mathematical characterizations

By treating opinion expression as an action and the social feedback as the payoff that an expression receives the proposed opinion model is strongly rooted in a form of reinforcement learning known as Q-learning (Busoniu et al., Citation2008; Sutton & Barto, Citation1998). One of the interesting properties of this learning scheme is that its estimates of the value of different actions converge to the true expected utilities under certain conditions. The model proposed in this paper can be seen as a repeated stochastic game played on a network, and therefore its rootedness in Q-learning provides a productive connection to previous work on games on networks (see Jackson & Zenou, Citation2014 and references therein).

In fact, coordination games—as specified by our reward function (2) —have received particular attention and structural conditions for the stability of non-consensus configurations have been identified (Morris, Citation2000). These conditions are based on a notion of cohesion introduced in Wasserman & Faust (Citation1994) and basically state that at least two subgroups with more in-group than out-group connections must exist in a network to allow for an equilibrium in which different actions may survive.

In the context of our model, the existence of multiple cohesive groups leads to group polarization processes that may take different directions within the different groups. Using game-theoretic arguments, in this section, we show that the model leads to such a group polarization process in densely connected groups and that structural holes prevent a single opinion from spreading over the entire network. We use simulations to establish the relevance of these theoretical results for our model. Moreover, we show that the interplay of these two essential properties leads to opinion bi-polarization in the social feedback model introduced in this paper.

5.1. Decision problem from the individual perspective

Let us first look at a single agent in a fixed environment. Assume therefore that a single agent $i$ is surrounded by $k$ neighbors and denote by

(5)

o_{N (i)} = \frac{1}{k} \sum_{j \in N (i)} o_{j}


the average expressed opinion in $i$'s neighborhood $N (i)$ . The expected reward $i$ obtains on expressing $1$ is then given by $o_{N (i)}$ and expressing $- 1$ yields $- o_{N (i)}$ on average. Consequently, using these expectations, the update of the values $Q_{i} (1)$ and $Q_{i} (- 1)$ is given by

(6)

\begin{matrix} Q_{i}^{t + 1} (1) & = Q_{i}^{t} (1) + Pr (o_{i}^{t} = 1) α (o_{N (i)} - Q_{i}^{t} (1)) \\ Q_{i}^{t + 1} (- 1) & = Q_{i}^{t} (- 1) + Pr (o_{i}^{t} = - 1) α (- o_{N (i)} - Q_{i}^{t} (- 1)) \end{matrix}


where $Pr (o_{i}^{t} = o)$ are the probabilities that $i$ performs the respective action (expresses $1$ or $- 1$ ) at time $t$ . In the model this depends on the current values of $Q_{i}^{t} (1)$ and $Q_{i}^{t} (- 1)$ , but we just notice that if we allow for exploration ( $ϵ > 0$ , however small) these probabilities are strictly positive. Under this assumption, therefore, the fixed point of (6) is given by $Q_{i}^{*} (1) = o_{N (i)}$ and $Q_{i}^{*} (- 1) = - o_{N (i)}$ and the respective value difference by $Δ Q_{i}^{*} = 2 o_{N (i)}$ .
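The fixed point of (6) can be checked numerically by iterating the expected update. The sketch below holds the expression probability constant, which is a simplification assumed here (in the model it switches between $ϵ$ and $1 - ϵ$ depending on the current ranking of the values); the function name and the starting values are illustrative.

```python
def iterate_expected_update(q1, qm1, o_nbhd, p1, alpha=0.05, steps=5000):
    """Iterate the expected update (6) with a fixed probability p1 of expressing 1.

    o_nbhd is the average expressed opinion in the neighborhood, as in (5).
    """
    for _ in range(steps):
        q1 = q1 + p1 * alpha * (o_nbhd - q1)
        qm1 = qm1 + (1 - p1) * alpha * (-o_nbhd - qm1)
    return q1, qm1


# Example: three of four neighbors express 1, so o_nbhd = (3 - 1) / 4 = 0.5.
o_nbhd = 0.5
q1, qm1 = iterate_expected_update(q1=-0.8, qm1=0.9, o_nbhd=o_nbhd, p1=0.1)
print(q1, qm1, q1 - qm1)
```

Both values contract toward their targets at a rate proportional to the probability with which the respective opinion is expressed, so the rarely expressed option converges more slowly but to the same fixed point $Q_{i}^{*} (1) = o_{N (i)}$ , $Q_{i}^{*} (- 1) = - o_{N (i)}$ .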

Therefore, if an individual is in a homogeneous environment with all neighbors in the same state (i.e., $o_{N (i)} = 1$ or $- 1$ ), $i$'s conviction settles at a maximal value in compliance with the neighbors whatever $i$'s initial assignment of values has been. That is, $i$ becomes maximally convinced of the respective opinion. Notably, as only the expressed opinions of neighbors are visible to $i$ , this occurs even if neighbors are only weakly supportive of that view. Hence, in a homogeneous environment the model gives rise to a radicalization process by which weakly convinced members approach maximal conviction irrespective of the initial conviction in the neighborhood.

In this sense, the model captures a dynamical process of group polarization by which a group initially inclined into an attitudinal direction delivers a more extreme judgement after discussion (see Sunstein, Citation2002 and references therein as well as Lord et al., Citation1979 on biased assimilation). Contrary to the limited argument pool or biased processing argument, however, our model reveals a social influence route to polarization where agents explore the social acceptability of their opinion and conviction is strengthened in the absence of opposing voices.

5.2. Gate keeping at structural holes

Real social networks are not complete. They are rather sparse and exhibit complex structure. They tend to come in clusters of tightly connected groups within which information flows rapidly, but across which influence and information flow is reduced. Ties connecting different clusters (along with the respective nodes) have received particular attention in social network analysis (Borgatti, Mehra, Brass, & Labianca, Citation2009; Burt, Citation2004) due to their gate-keeping role and strategic position which may come as a competitive advantage in terms of access to information, but also as a challenge to maintain connections with conflicting groups.

In the context of opinion dynamics, it is important to understand whether and at what pace opinions spread across different communities. For this purpose, let us consider the interaction of two agents $A$ and $B$ , each connected to a different community homogeneously supporting a different opinion (see Figure 7). Notice that this is the paradigmatic example of a bridge as considered in Burt (Citation2004). Let us denote by $k_{A}$ the number of neighbors of $A$ within its own community and by $k_{B}$ the number of neighbors of $B$ within its own community. Assume that the opinion in $A$ ’s community is −1 ( $o_{N (A) ∖ B} = - 1$ ) and that $B$ ’s community adheres to opinion 1 ( $o_{N (B) ∖ A} = 1$ ). We leave the respective support levels in the two communities unspecified and only look at the best choices for $A$ and $B$ . From the point of view of an interaction between $A$ and $B$ we can render such a situation as a game with the following bi-matrix for the single shot game:

[Table: payoff bi-matrix of the single shot game between $A$ and $B$ .]

Figure 7. Interaction of two agents A and B who are each linked to an opinion community of different sign.

It is immediately clear that as soon as the size of the communities linked to $A$ and $B$ respectively exceeds one, the Nash equilibrium of the game is given by $(o_{A}^{*} = - 1, o_{B}^{*} = 1)$ such that both agents maintain the opinion of their respective group. This shows that different opinion clusters are a stable outcome of the opinion formation process on clustered interaction networks such as those considered in the previous section.
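The equilibrium claim can be checked by enumerating the four pure strategy profiles. The sketch assumes that an agent's payoff is its expected reward (2) averaged over its community neighbors plus the bridge partner; this normalization, the function names, and the example sizes are assumptions made here.

```python
from fractions import Fraction
from itertools import product


def payoff_a(o_a, o_b, k_a):
    """Expected reward of A: k_a community neighbors express -1, B expresses o_b."""
    return Fraction(o_a * (-k_a + o_b), k_a + 1)


def payoff_b(o_a, o_b, k_b):
    """Expected reward of B: k_b community neighbors express 1, A expresses o_a."""
    return Fraction(o_b * (k_b + o_a), k_b + 1)


def pure_nash(k_a, k_b):
    """Pure profiles (o_A, o_B) from which neither agent gains by deviating alone."""
    equilibria = []
    for o_a, o_b in product((-1, 1), repeat=2):
        a_ok = payoff_a(o_a, o_b, k_a) >= payoff_a(-o_a, o_b, k_a)
        b_ok = payoff_b(o_a, o_b, k_b) >= payoff_b(o_a, -o_b, k_b)
        if a_ok and b_ok:
            equilibria.append((o_a, o_b))
    return equilibria


print(pure_nash(10, 2))   # prints [(-1, 1)]
print(payoff_a(-1, 1, 10), payoff_b(-1, 1, 2))   # prints 9/11 1/3
```

With community sizes of exactly one, an agent facing a disagreeing partner is indifferent between the two opinions, so further profiles pass the check; once the sizes exceed one, only the profile in which each agent keeps the opinion of its own group remains.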

In order to show that the game matrix above can provide a valid approximation for the proposed social feedback model, let us briefly consider the relation between the above payoff matrix and the dynamics (1) of our model. To this end, assume that agent $B$ forms a community with just two other agents ( $k_{B} = 2$ ), that agent $A$ is part of a bigger community with ten others ( $k_{A} = 10$ ), and that $A$’s community is in favor of $- 1$ while $B$’s community supports $1$ . Consequently, the theoretical payoffs are $π_{A} (1) = - 9 / 11$ and $π_{A} (- 1) = 9 / 11$ for $A$, and $π_{B} (1) = 1 / 3$ and $π_{B} (- 1) = - 1 / 3$ for $B$. Figure 8 shows the evolution of the four values $Q_{i} (o_{i})$ along with the payoffs as specified above (l.h.s.) and the resulting differences $Δ Q_{i}$ and $Δ π_{i}$ (r.h.s.). The agents $A$ and $B$ are initialized such that they strongly disagree with their respective group. However, both quickly approach the game payoff values such that, in effect, they learn to play the associated opinion game.

Figure 8. Dynamical evolution of two agents $A$ and $B$, each linked to a fixed community of opposed sign. $A$ is connected to $k_{A} = 10$ agents supporting $- 1$, $B$ to $k_{B} = 2$ agents supporting $1$. A learning rate $α = 0.02$ and an exploration rate $ε = 0.1$ are used. The figure shows a quick convergence of the $Q$-values to the payoffs of the associated opinion game.
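The two-agent experiment can be sketched in a few lines. The dynamics (1) are not restated in this section, so the code below rests on assumptions: an agent expresses an opinion epsilon-greedily, one of its neighbors is drawn uniformly, the reward is +1 for agreement with that neighbor's currently held opinion and −1 otherwise, and the update is $Q \leftarrow Q + α (r - Q)$. The names `simulate` and `greedy` are hypothetical.

```python
import random

def greedy(q_agent):
    """Opinion currently held: the one with the larger Q-value."""
    return 1 if q_agent[1] > q_agent[-1] else -1

def simulate(k_a=10, k_b=2, alpha=0.02, eps=0.1, steps=40000, seed=1):
    rng = random.Random(seed)
    # Both agents start out strongly disagreeing with their own community.
    q = {"A": {-1: -1.0, 1: 1.0}, "B": {-1: 1.0, 1: -1.0}}
    group = {"A": -1, "B": 1}       # fixed opinion of each community
    size = {"A": k_a, "B": k_b}     # number of community neighbours
    other = {"A": "B", "B": "A"}

    for _ in range(steps):
        agent = rng.choice(("A", "B"))
        # Assumed epsilon-greedy expression: explore with probability eps.
        if rng.random() < eps:
            opinion = rng.choice((-1, 1))
        else:
            opinion = greedy(q[agent])
        # Feedback comes from one of the k + 1 neighbours, chosen uniformly.
        if rng.randrange(size[agent] + 1) < size[agent]:
            peer_opinion = group[agent]
        else:
            peer_opinion = greedy(q[other[agent]])
        reward = 1.0 if opinion == peer_opinion else -1.0
        # Assumed update rule: exponential moving average of rewards.
        q[agent][opinion] += alpha * (reward - q[agent][opinion])
    return q

q = simulate()
# Deviation of the learned values from the theoretical payoffs 9/11 and 1/3.
print(q["A"][-1] - 9 / 11, q["B"][1] - 1 / 3)
```

Because the update is a moving average of ±1 rewards, the $Q$-values fluctuate around the game payoffs rather than settling exactly on them; smaller `alpha` reduces these fluctuations at the cost of slower convergence.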

5.3. Opinion games and cohesive sets

The previous analysis points out the relevance of game-theoretic arguments for the analysis of the convergence behavior of the proposed model and the identification of conditions for the emergence of polarization. Coordination games and games of strategic complements more generally have received particular attention in the literature on games on networks (Jackson, Citation2010; Jackson & Zenou, Citation2014; Morris, Citation2000). From this perspective, our model belongs to the category of semi-anonymous games (Jackson, Citation2010) in which only the number of neighbors taking the different actions (expressing their opinion) plays a role regarding the best response of a player and not the specific interaction partners. In other words, players assign equal importance to all of their local neighbors.

In the context of this type of game, a particular notion of group cohesion (Wasserman & Faust, Citation1994) has proven an effective tool to characterize structural conditions for the contagious spreading of one action to the entire population and, conversely, for the stable coexistence of both alternatives in the population (Morris, Citation2000). Given a network along with a pay-off structure such as the one given in (2), the challenge is to identify action configurations over the network that are stable in the game-theoretic sense, that is, configurations in which no player would be better off by unilaterally switching to the other alternative. Notice that in our model, a player’s expression $o_{i}$ is a best response if at least half of its neighbors favor that opinion as well. Cohesion measured as ≫the relative frequency of ties among group members compared to non-members≪ (Morris, Citation2000, p. 64) is therefore a direct way to check whether a group of agents plays at equilibrium. Namely, given a group $S$ of agents such that each member has more connections to agents in $S$ than to others (in the complementary set $\overset{ˉ}{S}$ ), this group plays at equilibrium whenever all members choose the same action. In consequence, whenever there are at least two distinct groups with cohesion greater than or equal to $1 / 2$ (more in-group than out-group links) in a network, configurations in which different opinions are expressed in the different groups are stable.⁸

More formally and following (Morris, Citation2000), let us define the fraction of neighbors of agent $i$ that are also in $S$ as

$$π (S | i) = \frac{\sum_{j \in S} a_{i j}}{\sum_{j = 1}^{N} a_{i j}}, \tag{7}$$

where $a_{i j}$ are the elements of the adjacency matrix. The cohesion of the entire group $S$ of agents is then defined as the smallest value $π (S | i)$ of all the individuals in $S$ , that is

$$\mathrm{coh} (S) = \min_{i \in S} π (S | i) . \tag{8}$$

As mentioned above, the reward system (2) used in this paper corresponds to a symmetric coordination game for which the relevant cohesiveness level is $1 / 2$ . For the stability of a non-consensus configuration it is necessary that there is at least one partition of the network into $S$ and its complement $\overset{ˉ}{S}$ such that both $S$ and $\overset{ˉ}{S}$ are at least $1 / 2$-cohesive. The intuition behind this is relatively simple. If, for example, $\mathrm{coh} (S) < 1 / 2$ , there is at least one agent $i \in S$ that has more neighbors in $\overset{ˉ}{S}$ than in $S$ . Therefore, if one opinion (say $1$ ) is expressed in $S$ and the other ( $- 1$ ) in $\overset{ˉ}{S}$ , agent $i$ would improve its expected pay-off by switching to $- 1$ , destroying the group consensus in $S$ . As other agents in $S$ are connected to $i$ , its switching may lead to a cascade of further opinion changes by other members of the group $S$ and, depending on the connectivity structure, global alignment on expressing $- 1$ may result.

Let us finally illustrate the cohesion concept by a brief look at the example from the previous section where two agents are connected to different communities. It is easy to compute that $A$’s community $S_{A}$ with $|S_{A}| = 11$ is $10 / 11$-cohesive: $A$ has 11 neighbors, 10 of which are in $S_{A}$ , while the remaining agents in $S_{A}$ are connected only to $A$ , so that their individual ratio $π (S_{A} | i)$ is one. Likewise, $S_{B}$ with $|S_{B}| = 3$ is $2 / 3$-cohesive because $B$ has three neighbors, two of which are in $S_{B}$ . Hence, the cohesion condition is satisfied for both subsets ( $\mathrm{coh} (S_{A}) > 1 / 2$ and $\mathrm{coh} (S_{B}) > 1 / 2$ ) and consequently the polarized outcome is stable.
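The cohesion values of this example can be reproduced directly from definitions (7) and (8). The sketch below is illustrative only: the adjacency-dictionary representation, the node labels and the helper names `cohesion` and `bridge_example` are assumptions, and the group members are taken to be linked only to their hub, as described in the text.

```python
from fractions import Fraction

def cohesion(adj, group):
    """coh(S) = min over i in S of the fraction of i's neighbours inside S."""
    group = set(group)
    ratios = []
    for i in group:
        neighbours = adj[i]
        inside = sum(1 for j in neighbours if j in group)
        ratios.append(Fraction(inside, len(neighbours)))
    return min(ratios)

def bridge_example(k_a=10, k_b=2):
    """A and B are linked; each is also linked to k members of its own group."""
    adj = {"A": {"B"}, "B": {"A"}}
    for hub, k in (("A", k_a), ("B", k_b)):
        for n in range(k):
            member = f"{hub.lower()}{n}"   # e.g. "a0", ..., "a9"
            adj[hub].add(member)
            adj[member] = {hub}            # members are linked only to the hub
    s_a = {v for v in adj if v.lower().startswith("a")}
    s_b = set(adj) - s_a
    return adj, s_a, s_b

adj, s_a, s_b = bridge_example()
print(cohesion(adj, s_a), cohesion(adj, s_b))
```

Exact fractions are used so that the comparison with the threshold $1 / 2$ is not affected by floating-point rounding.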

5.4. Cohesion in two-community graphs

Let us generalize this example and consider the case of two communities with $M$ and $L$ agents where we assume that most connections are within and only a few across the two communities. Let $p$ denote the probability of an (undirected) inter-community link between an agent in $S_{M}$ and an agent in $S_{L}$ and $(1 - p)$ the probability of an intra-community link. The obvious advantage of this graph model is that we (externally) define a partition of the network for which we may study how the stability of the non-consensus configurations is affected as $p$ increases. At the same time, this setting can be seen as a prototypical situation that also occurs in more complex social networks.

In fact, there is a long tradition of studying the two-community topology in population genetics (Wright, Citation1943), and theoretical work on opinion dynamics has also relied on it as a stylized description of segregated communication structures (Banisch, Citation2014a; Dandekar et al., Citation2013; Lamarche-Perrin, Banisch, & Olbrich, Citation2016). One could think of it in spatial terms, for instance as a set of villages with intensive interaction among people of the same village and some contact across villages, but it could also be related to homophily with regard to social class, ethnicity, religious community, etc. Figure 9 shows a small realization of such a two-community setting. There are 7 nodes in each community and three links that connect agents from different groups. The numbers shown on the nodes represent the individual ratios $π (S | i)$ given that partition. In this case, these values give rise to a cohesion of $\mathrm{coh} (S) = \min_{i \in S} π (S | i) = 2 / 3$ for both subgroups, which renders stable a bi-polar situation in which two competing views coexist.

Figure 9. A small two-community graph with two densely connected components and a few »long-range« ties connecting them. The fractional numbers shown on the nodes correspond to the respective fraction of neighbors with the same opinion $π (S | i)$.

The ratio $π$ of an agent in (say) community $S_{M}$ is given by

$$π (S_{M} | i) = \frac{\sum_{j \in S_{M}} a_{i j}}{\sum_{j \in S_{M} \cup S_{L}} a_{i j}} = \frac{m_{i}}{m_{i} + l_{i}} \tag{9}$$

where $m_{i}$ ( $l_{i}$ ) is used as a shorthand for the number of links that $i$ maintains to agents in $S_{M}$ ( $S_{L}$ ). Given that within-community and across-community links are created independently with probability $1 - p$ and $p$ respectively, each row of the adjacency matrix can be seen as the result of a finite sequence of Bernoulli trials. Consequently, for an agent $i \in S_{M}$ the probability $\underset{i n}{Pr} (m)$ that exactly $m$ links to agents in the same community are created is given by

$$\underset{\mathrm{in}}{\Pr} (m) = \binom{M - 1}{m} (1 - p)^{m} p^{M - 1 - m} \tag{10}$$

where $M - 1$ corresponds to the number of potential within-community links excluding self-connections. The probability of exactly $l$ across-community connections $\underset{o u t}{Pr} (l)$ reads

$$\underset{\mathrm{out}}{\Pr} (l) = \binom{L}{l} p^{l} (1 - p)^{L - l} . \tag{11}$$

Configurations where different opinions are expressed in the different communities lose their stability if $m_{i} < l_{i}$ for at least one agent $i$ , because $1 / 2$-cohesion of the entire community rests upon $1 / 2$-cohesiveness of each individual with respect to the group, see (8). For each single agent in $S_{M}$ the probability that the agent is less than $1 / 2$-cohesive is given by

$$\Pr (m < l) = \sum_{m = 0}^{M - 1} \underset{\mathrm{in}}{\Pr} (m) \sum_{l = m + 1}^{L} \underset{\mathrm{out}}{\Pr} (l) = q_{M} . \tag{12}$$

For convenience we denote this probability as $q_{M}$ . Likewise, the probability to be less than $1 / 2$ -cohesive for an agent in $S_{L}$ (denoted by $q_{L}$ ) is obtained by exchanging $m$ with $l$ and $M$ with $L$ respectively:

$$\Pr (l < m) = \sum_{l = 0}^{L - 1} \underset{\mathrm{in}}{\Pr} (l) \sum_{m = l + 1}^{M} \underset{\mathrm{out}}{\Pr} (m) = q_{L} . \tag{13}$$

For a network consisting of $M + L$ agents, the probability that there is no agent for which $m_{i} < l_{i}$ can be obtained by a similar reasoning, conceiving the problem as a sequence of $M + L$ Bernoulli trials with probabilities $q_{M}$ and $q_{L}$ respectively. Following this argument, the probability that no agent in $S_{M}$ is less than $1 / 2$-cohesive is given by $(1 - q_{M})^{M}$ and for $S_{L}$ it reads $(1 - q_{L})^{L}$ . Therefore, the probability that the partitions $S_{M}$ and $S_{L}$ both satisfy the cohesion condition for the stability of different actions in the different communities is given by

$$\Pr [\mathrm{coh} (S_{M}) \geq 1 / 2 \land \mathrm{coh} (S_{L}) \geq 1 / 2] = (1 - q_{M})^{M} (1 - q_{L})^{L} . \tag{14}$$
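Equations (10) to (14) translate directly into a short numerical routine. The following is a sketch rather than the authors' code; the function names are hypothetical, and the same routine is reused for $q_{L}$ by exchanging the roles of the two community sizes, as in (13).

```python
from math import comb

def pr_in(m, size, p):
    """Eq. (10): probability of exactly m within-community links."""
    return comb(size - 1, m) * (1 - p) ** m * p ** (size - 1 - m)

def pr_out(l, other_size, p):
    """Eq. (11): probability of exactly l across-community links."""
    return comb(other_size, l) * p ** l * (1 - p) ** (other_size - l)

def q_below_half(size, other_size, p):
    """Eqs. (12)/(13): probability that one agent has fewer in- than out-links."""
    total = 0.0
    for m in range(size):                      # m = 0, ..., size - 1
        tail = sum(pr_out(l, other_size, p)
                   for l in range(m + 1, other_size + 1))
        total += pr_in(m, size, p) * tail
    return total

def prob_both_cohesive(M, L, p):
    """Eq. (14): probability that both communities are at least 1/2-cohesive."""
    q_m = q_below_half(M, L, p)
    q_l = q_below_half(L, M, p)
    return (1 - q_m) ** M * (1 - q_l) ** L

for p in (0.1, 0.3, 0.5):
    print(p, prob_both_cohesive(50, 50, p))
```

The double sum costs on the order of $M \cdot L$ binomial evaluations per value of $p$, which is unproblematic for the community sizes considered here.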

Figure 10 compares the theoretical probability (14) that both communities are at least $1 / 2$-cohesive with an average over 100 network realizations and confirms the previous calculations. It shows four symmetric ( $M = L$ , solid stars) and two asymmetric cases ( $M \neq L$ , dashed circles), but we shall focus on the symmetric ones here. In the cases where $M = L$ we observe that the inter-community link probability $p$ at which the transition from stability to instability occurs increases with the number of agents in the communities. At the same time, the transition becomes sharper with increasing size, which indicates a discontinuous phase transition in the limit of an infinite system. In fact, it is easy to see that when $M \to \infty$ , $L \to \infty$ and $M = L$ the fraction of intra-community links of each agent is precisely $(1 - p)$ such that the critical value at which the communities lose $1 / 2$-cohesion is precisely $p^{*} = 1 / 2$ .
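The limiting value can be made explicit with a short heuristic calculation. It assumes that, for large communities, the link counts $m_{i}$ and $l_{i}$ of an agent in $S_{M}$ can be replaced by their expectations under (10) and (11); this is a law-of-large-numbers argument and not a rigorous proof.

```latex
\begin{align*}
\mathbb{E}[m_i] &= (1-p)(M-1), \qquad \mathbb{E}[l_i] = pL, \\
\pi(S_M \mid i) &\approx \frac{(1-p)(M-1)}{(1-p)(M-1) + pL}
  \;\xrightarrow[\;M = L \to \infty\;]{}\; \frac{1-p}{(1-p) + p} = 1-p, \\
1-p \ge \tfrac{1}{2} &\iff p \le \tfrac{1}{2}
  \quad\Longrightarrow\quad p^{*} = \tfrac{1}{2}.
\end{align*}
```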

Figure 10. Probability that both communities are at least $1 / 2$-cohesive as a function of $p$. The figure compares the theoretical results (14) with an average over 100 network realizations.
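A comparison of this kind can be sketched with a small Monte Carlo estimate. The sampling scheme below is an assumption about how the network realizations are drawn (each undirected pair is linked independently with probability $1 - p$ within and $p$ across communities), and agents without any links are counted as cohesive because $m_{i} < l_{i}$ does not hold for them. The names `sample_both_cohesive` and `estimate` are hypothetical.

```python
import random

def sample_both_cohesive(M, L, p, rng):
    """Draw one two-community graph; True if every agent has m_i >= l_i."""
    n = M + L
    in_links = [0] * n     # m_i: links to the agent's own community
    out_links = [0] * n    # l_i: links to the other community
    for i in range(n):
        for j in range(i + 1, n):
            same = (i < M) == (j < M)          # agents 0..M-1 form S_M
            if rng.random() < ((1 - p) if same else p):
                counts = in_links if same else out_links
                counts[i] += 1
                counts[j] += 1
    return all(m >= l for m, l in zip(in_links, out_links))

def estimate(M, L, p, runs=100, seed=0):
    """Fraction of sampled graphs in which both communities are 1/2-cohesive."""
    rng = random.Random(seed)
    hits = sum(sample_both_cohesive(M, L, p, rng) for _ in range(runs))
    return hits / runs

for p in (0.1, 0.3, 0.5):
    print(p, estimate(20, 20, p))
```

Since each undirected link enters the counts of two agents, the agents are not strictly independent in the sampled graphs, so small deviations from the product formula (14) are to be expected for small communities.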

The combinatorial analysis of cohesion hence predicts a phase transition from stable bi-polarization to consensus on the two-island graph which becomes sharper with increasing system size. In order to assess the significance of this result for the convergence behavior of the model, we initialize the two-island system in a state of maximal polarization and check if polarization persists under the opinion reinforcement mechanism. That is, we initialize agents in community $S_{M}$ with a high support for opinion $- 1$ by setting $Q_{i \in S_{M}}^{0} (- 1) = 1$ and $Q_{i \in S_{M}}^{0} (1) = - 1$ such that $Δ Q_{i \in S_{M}}^{0} = - 2$ and agents in the other community just the opposite such that $Δ Q_{i \in S_{L}}^{0} = 2$ . For different $p$ , the system is then iterated for a relatively long period of $20000 \times N$ steps ( $N = M + L$ ) and we compute the fraction of realizations out of 100 that reached consensus at that time. Consensus (i.e. all agents express either $1$ or $- 1$ ) is a suitable indicator because, once it is reached, the respective opinion is globally reinforced and the probability that an agent reverses due to finite-size fluctuations is effectively zero. In a first experiment the community sizes are varied from $10$ to $500$ (Figure 11, l.h.s.) and the learning rate is set to $α = 0.01$ . In a second experiment the size is fixed to $M = L = 50$ and the influence of the learning rate $α$ is analyzed (reported on the r.h.s. of Figure 11).

Figure 11. Probability of persistent polarization in a suite of 100 simulations per data point (inter-community coupling $p$) compared to the $1 / 2$-cohesion probability as computed by (14). On the l.h.s. the comparison is shown for various community sizes from 10 to 500 with a fixed learning rate $α = 0.01$. On the r.h.s. the influence of the learning rate is studied for a system of fixed size $M = L = 50$.
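The persistence experiment can be sketched at a much smaller scale than the $20000 \times N$ steps used for the figure. The code below rests on assumptions, since the dynamics (1) are not restated in this section: epsilon-greedy expression with an assumed exploration rate `eps`, a reward of +1 or −1 depending on agreement with the opinion currently held by a randomly chosen neighbor, and the update $Q \leftarrow Q + α (r - Q)$. The function names and default parameters are hypothetical.

```python
import random

def two_island_graph(M, L, p, rng):
    """Neighbour lists of a random two-community graph (agents 0..M-1 form S_M)."""
    n = M + L
    nbrs = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            same = (i < M) == (j < M)
            if rng.random() < ((1 - p) if same else p):
                nbrs[i].append(j)
                nbrs[j].append(i)
    return nbrs

def run(M=10, L=10, p=0.05, alpha=0.01, eps=0.1, sweeps=200, seed=0):
    """Return True if both opinions are still held after sweeps * N updates."""
    rng = random.Random(seed)
    n = M + L
    nbrs = two_island_graph(M, L, p, rng)
    # Q-values [Q(-1), Q(+1)]: S_M starts fully convinced of -1, S_L of +1.
    q = [[1.0, -1.0] if i < M else [-1.0, 1.0] for i in range(n)]

    def held(i):
        return 1 if q[i][1] > q[i][0] else -1

    def express(i):
        return rng.choice((-1, 1)) if rng.random() < eps else held(i)

    for _ in range(sweeps * n):
        i = rng.randrange(n)
        if not nbrs[i]:
            continue                       # isolated agents receive no feedback
        j = rng.choice(nbrs[i])
        o_i = express(i)
        reward = 1.0 if o_i == held(j) else -1.0
        idx = (o_i + 1) // 2               # maps -1 -> 0 and +1 -> 1
        q[i][idx] += alpha * (reward - q[i][idx])
    return {held(i) for i in range(n)} == {-1, 1}

print(run())
```

Repeating `run` over many seeds and values of `p` and recording the fraction of runs that return `True` would give a small-scale analogue of the persistence curves, under the stated assumptions.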

The l.h.s. of Figure 11 shows the results of this experiment for different system sizes and compares them to the theoretical $1 / 2$-cohesion probability (14). All in all, a relatively good agreement between the simulations and the theoretical curves is observed, which shows that the combinatorial analysis of group cohesion provides an accurate prediction for the persistence of polarization in the two-community scenario. However, there are two effects that the cohesion probability does not properly account for. First, for the large system with $M = L = 500$ the transition from stable polarization to consensus takes place at a lower value of $p$ compared to the theoretical curve. Second, the transition observed for the model dynamics is generally sharper than the theoretical prediction. In fact, the transition is rather sharp already for a relatively small system of 100 agents.

The r.h.s. of Figure 11, which looks at the influence of the learning rate $α$ , sheds some light on both deviations. First, regarding the lower critical value for large systems, it becomes clear that the learning rate must be small enough to match the theoretical transition point. For a system of 100 agents a learning rate of $α = 0.05$ still leads to a transition from polarization to consensus at a significantly lower value of inter-community coupling, whereas with $α < 0.01$ the theoretical curve is approached well. As the learning rate governs the fluctuations of the $Q$-values, these results suggest that the closer the community graph is to a group cohesion of $1 / 2$ , the more likely it becomes that one community is invaded through a sequence of out-group interactions in which agents in one community occasionally express the opinion of the other, enabling that opinion to spread throughout the entire cluster. The probability of such a cascading invasion increases with the size of the system so that the theoretical curves are matched only if the learning rate (and hence the fluctuations in the $Q_{i}$ ) is reduced.

Cohesion is a structural measure which characterizes stable network configurations in the following sense: given the configuration of all players’ actions no agent alone can improve its payoff by switching to the opposed action. It thus assumes that all players know their best response and play accordingly. In the opinion model, however, agents learn which opinion is favored in their local environment, and they do so in a sequential manner receiving feedback only from one peer at a time. This means that they may deviate from ≫best response behavior≪ from time to time due to imperfect estimates $Q$ of their expected reward. A perfect matching can thus be observed only in the limit $α \to 0$ .

The second observation, that the transition is generally sharper under learning dynamics, has two distinct reasons. On the one hand, notice in Figure 11 (r.h.s.) that the match between theory and simulations becomes fairly accurate in the lower part ( $p > 0.39$ ) of the curve with decreasing $α$ (see brown stars for $α = 0.0001$ ). In this parameter regime, where cohesion is generally very close to $1 / 2$ , a finite learning rate $α > 0$ may lead to a cascade triggered by fluctuations in the $Q_{i}$ as described above. On the other hand, for $p < 0.39$ we notice a higher probability of persistent opinion polarization (even increasing with decreasing $α$ ), which points at an aspect the cohesion probability (14) does not account for. The reason for this is that the theoretical probability (14) is computed by fixing the network partition to $S_{M}$ and $S_{L}$ . In the random assignment of connections, it may, however, happen that a single node becomes relatively disconnected from its predefined set and at the same time cohesively connected to the other set. While (14) would predict a loss of cohesion of $S_{M}$ or $S_{L}$ , the entire network still possesses two cohesive sets (slightly different from $S_{M}$ and $S_{L}$ ) on which different opinions are stably expressed.

6. Conclusion

This paper makes the following main contributions:

It develops a new mechanism for the emergence and persistence of opinion bi-polarization which is based on a parsimonious set of assumptions. As opposed to previous models aiming at an explanation of bimodal opinion distributions, the proposed social feedback model does not rely on negative social influence (Baldassarri & Bearman, Citation2007; Flache & Macy, Citation2011; Macy et al., Citation2003; Mark, Citation2003) or assumptions about opinion homophily or bounded confidence (Axelrod, Citation1997; Deffuant et al., Citation2000; Duggins, Citation2017; Hegselmann & Krause et al., Citation2002; Mäs & Flache, Citation2013). It distinguishes an internal conviction from an externally expressed opinion by a learning process which reinforces an agent’s private conviction in its expressed opinion based on the rewarding or non-rewarding experiences made by communicating one’s views in a social neighborhood. In large social networks that consist of different communities this social feedback mechanism gives rise to group polarization processes by which members of the same community become collectively more convinced of one opinion. This process plays out independently in different communities even if weak ties and individual links are maintained across the different groups, because the rewards gained by adopting and expressing the group opinion are larger than the rewards attainable in less frequent interactions with the out-group. As a consequence, for two connected individuals that are members of different groups it is desirable to maintain their respective group opinion because the rewarding experiences from communication within their group outweigh the negative experiences of disagreement when they occasionally encounter each other.
The paper introduces reinforcement learning to opinion dynamics modeling. While reinforcement learning as a basic model of social behavior is not new and is, for instance, at the core of Homans’ work on social exchange theory (Homans, Citation1958, Citation1974), its application and interpretation as a model for opinion dynamics and polarization in particular is. In our model, we consider that opinion expression is a decision problem that involves an internal evaluation of the expected effect of available expression alternatives based on the social feedback these expressions previously received. Rejection and confirmation of expressed opinions by peers lead to a re-evaluation of the different options which is mediated via a reward signal. This mechanism leads to the formation of strong convictions in an opinion (that is, a clear evaluation that one opinion is preferred over the other) based on how acceptable it is to advocate that standpoint in a social neighborhood. It is noteworthy that studies in neurobiology suggest that ≫social influence mediates very basic value signals in known reinforcement circuitry≪ (Campbell-Meiklejohn et al., Citation2010). However, while neurobiology posits that the ≫rewarding properties of social behavior may have evolved to facilitate group cohesion and cooperation≪ (Ruff & Fehr, Citation2014), our model suggests that polarization (as opposed to cohesion) across groups may be a side-effect of these rewarding properties. In other words, the human ability to coordinate with in-groups comes at the expense of a likely alienation from out-groups, which we could refer to as a ≫tragedy of coordination≪.
Through the use of reinforcement learning, the paper establishes a link between models of opinion formation and standard game-theoretic notions of equilibrium and explores its usefulness in the analysis of the convergence behavior of the model. A particular advantage of social influence network theory (Abelson, Citation1964; French, Citation1956; Friedkin, Citation1999; Friedkin & Johnsen, Citation1990; Friedkin et al., Citation2016; Parsegov et al., Citation2017) is analytical tractability. Some extensions of the theory prompted by the polarization problem introduce a non-linearity that is difficult to handle with analytical tools (Hegselmann & Krause et al., Citation2002). The model we put forth can also be seen as a non-linear extension to these models which, while being plausible and psychologically well-justified, provides a powerful tool to establish the connection to game theory and well-established equilibrium concepts. Namely, through the Q-learning scheme adopted to operationalize the opinion formation process, agents learn to associate values to the different opinion expressions that converge to the payoffs of the corresponding ≫opinion game≪. This has been shown and used throughout Section 5. The theory of games on networks in particular (Jackson & Zenou, Citation2014; Morris, Citation2000) has proven useful to establish conditions under which persistent disagreement can be expected by the mechanism proposed in this paper.
In that context, the notion of cohesive sets (Morris, Citation2000) has been shown to provide a useful structural measure for the characterization of network conditions for polarization. In particular, the paper shows that a bimodal distribution of opinions is a stable outcome of the social feedback model of opinion dynamics whenever at least two communities exist in a network with more connections within than across groups. In such a situation the reinforcement mechanism gives rise to a group polarization process that may take different directions within the different groups. That is, the cohesive structure of a network is decisive for bi-polarization in the context of our model and the effect of different connectivity patterns on the model dynamics (such as, for instance, long-range ties (Centola & Macy, Citation2007; Flache & Macy, Citation2011)) consequently becomes a question of whether the cohesive structure of the network undergoes a qualitative change or not.
In this sense, our model shows that persistent, even increasing polarization may be obtained despite a significant exposure to attitude-challenging content (Bakshy, Messing, & Adamic, Citation2015). Moderate levels of homophily and network segregation are sufficient for opinion polarization.

This paper paves the way for quite a few interesting topics to be addressed by future research. First, the model can be seen as an abstraction of recently proposed persuasion models of polarization in which opinion-confirming interactions reinforce commitment to that view whereas disconfirming interactions weaken opinion support due to biased argument processing (Dandekar et al., Citation2013) or biased argument pools (Mäs & Flache, Citation2013; Sunstein, Citation2002). On the other hand, the model is based on a simpler reinforcement procedure which does not assume any more complex kind of argument processing on the side of the agents but instead conceives opinion evaluation and consolidation as the result of positive and negative experiences that are presumably related to agreement and disagreement with peers. Despite these different interpretations, the collective dynamics of our social feedback model and the argument persuasion model proposed in Mäs & Flache (Citation2013) share certain similarities. In fact, we conjecture that the incorporation of an opinion homophily mechanism at the level of convictions in our model would be compatible with the homophily concept employed in the model by Mäs & Flache (Citation2013) and that the learning rate of the reinforcement scheme used in this paper can be adjusted to the impact that the adoption of a new argument has in their model.

Second, the random geometric graph model used in the first part of the paper has mainly been chosen to illustrate the possibility of stable bi-polarization in a connected graph. On the other hand, the main purpose of the two–community setting used in Section 5 has been to show that the game-theoretic characterization of non-consensus equilibria in terms of cohesive sets provides a valid characterization for the opinion model as well. Future work should explore more complex and more realistic social networks. In particular, our analysis points at the relevance of community structure of a graph which, in network science, is typically captured by measures of modularity (Fortunato, Citation2010; Girvan & Newman, Citation2002). Future work should clarify on mathematical grounds the relation between the notion of cohesion as used in the game-theoretic literature and different conceptions of modularity with the respective approaches to community detection. In addition, we envision that the proposed model is actually well-suited as a tool for community detection and the identification of structural holes in real social networks.

Real social networks are not static but evolve in time and may undergo structural changes. A natural model extension is therefore to provide agents with the possibility to search for new interaction partners if their opinion deviates from the opinion in their current neighborhood. In the framework of reinforcement learning that we propose this can be incorporated in different ways. For instance, given their current opinion agents could learn the rewards they can expect from each other agent in the population and base their decisions of whom to interact with on that evaluation. In the current setting of binary opinions this would either lead to consensus with all agents strongly supporting one opinion or to a complete bi-separation of the network into two groups strongly supporting different views. Other outcomes can be achieved by making network changes costly (Bojanowski & Buskens, Citation2011).

On the whole, we believe that linking opinion dynamics with social feedback mechanisms bears great potential for modeling opinion formation processes in different social settings, and the model presented here should be understood as an initial implementation of this paradigm. The theory of reinforcement learning, in which different behavioral options (opinion expressions in our case) are constantly re-evaluated based on the rewards they give in a certain environment, provides a theoretically convenient and very general framework to address a wide variety of questions that are relevant in the science of opinion dynamics. Most notably, it shifts the explanatory focus from mechanisms of social influence and opinion exchange to the incentives and rewards of opinion expression in different social settings, which is becoming more and more germane to understanding opinion exchange processes on social media platforms.

Acknowledgments

This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No 732942 (www.Odycceus.eu). We are grateful for the repeated discussion of the ideas described in the paper by the members of Odycceus and Sharwin Rezagholi in particular. The feedback by participants of the ≫Interdisciplinary Workshop on Opinion Dynamics and Collective Decision≪, July 5-7, 2017, Bremen, Germany and the ≫Social Simulation Conference (SSC) 2017≪, Dublin, Ireland, is also gratefully acknowledged. We also acknowledge a final reading by Michael Mäs. Finally, the paper substantially improved due to the comments of three anonymous referees.

Notes

¹ See Takács et al. (Citation2016) for a recent experimental confirmation of opinion assimilation.

² The experiment was indicative of a negative influence only when people held very similar opinions, which can be seen as a tendency to individualization. Assumptions in line with this finding have been investigated in, for instance, Banisch (Citation2010) and Mäs, Flache, & Helbing (Citation2010).

³ It is worth noticing that there is a branch in theoretical biology which proposed very similar mechanisms to model sympatric speciation (bimodal distribution of phenotypes) by assortative mating (individuals preferentially mate with similar individuals). See Kondrashov & Shpak (Citation1998) and references therein. Some aspects of the relation between these models and models of opinion dynamics have been discussed in Banisch (Citation2016, Chapter 8).

⁴ In fact, this would motivate the so-called softmax action selection (Sutton & Barto, Citation1998), which assigns equal probability to the two options when their $Q$-values are equal and gradually favors the option with the larger $Q$-value the more they differ. We have tested this alternative and found no qualitative impact on the behavior of the model.

⁵ The reader is referred to their paper for some example distributions and the respective kurtosis values.

⁶ Notice that the theoretically expected value of this measure is 1/3 if both Q-values are drawn from a uniform distribution in $[0, 1]$ and that the value of almost 0.4 is the outcome for the randomly drawn initial condition used in this simulation.

⁷ Notice that the consensus probability is even slightly higher than the probability of a connected graph. The reason for this is that agents in different disconnected components may end up supporting the same opinion with a relatively high probability. For instance, in a random graph with two components and two opinions all agents converge to the same opinion in one half of the cases.

⁸ Cf. Morris (Citation2000, Prop. 5) and Jackson & Zenou (Citation2014, Prop. 3.3).

References

Abelson, R. P. (1964). Mathematical models of the distribution of attitudes under controversy. Contributions to Mathematical Psychology, 14, 1–160.
Google Scholar
Ajzen, I. (2001). Nature and operation of attitudes. Annual Review of Psychology, 52(1), 27–58. doi:10.1146/annurev.psych.52.1.27
PubMed Web of Science ®Google Scholar
Axelrod, R. (1997). The dissemination of culture: A model with local convergence and global polarization. The Journal of Conflict Resolution, 41(2), 203–226. doi:10.1177/0022002797041002001
Web of Science ®Google Scholar
Bakshy, E., Messing, S., & Adamic, L. A. (2015). Exposure to ideologically diverse news and opinion on facebook. Science, 348(6239), 1130–1132. doi:10.1126/science.aaa5139
PubMed Web of Science ®Google Scholar
Baldassarri, D., & Bearman, P. (2007). Dynamics of political polarization. American Sociological Review, 72(5), 784–811. doi:10.1177/000312240707200507
Banisch, S. (2010). Unfreezing social dynamics: Synchronous update and dissimilation. In A. Ernst & S. Kuhn (Eds.), Proceedings of the 3rd World Congress on Social Simulation (WCSS2010).
Banisch, S. (2014a). From microscopic heterogeneity to macroscopic complexity in the contrarian voter model. Advances in Complex Systems, 17(05), 1450025. doi:10.1142/S0219525914500258
Banisch, S. (2014b). The probabilistic structure of discrete agent-based models. Discontinuity, Nonlinearity, and Complexity, 3(3), 281–292. doi:10.5890/DNC.2014.09.005
Banisch, S. (2016). Markov chain aggregation for agent-based models. Understanding Complex Systems. Springer International Publishing.
Banisch, S., Araújo, T., & Louçã, J. (2010). Opinion dynamics and communication networks. Advances in Complex Systems, 13, 95–111. doi:10.1142/S0219525910002438
Banisch, S., Lima, R., & Araújo, T. (2012). Agent based models and opinion dynamics as Markov chains. Social Networks, 34, 549–561. doi:10.1016/j.socnet.2012.06.001
Banisch, S., & Olbrich, E. (2017). The coconut model with heterogeneous strategies and learning. Journal of Artificial Societies and Social Simulation, 20(1), 14. doi:10.18564/jasss.3142
Bojanowski, M., & Buskens, V. (2011). Coordination in dynamic social networks under heterogeneity. The Journal of Mathematical Sociology, 35(4), 249–286. doi:10.1080/0022250X.2010.509523
Borgatti, S. P., Mehra, A., Brass, D. J., & Labianca, G. (2009). Network analysis in the social sciences. Science, 323(5916), 892–895. doi:10.1126/science.1165821
Bramson, A., Grim, P., Singer, D. J., Fisher, S., Berger, W., Sack, G., & Flocken, C. (2016). Disambiguation of social polarization concepts and measures. The Journal of Mathematical Sociology, 40, 80–111. doi:10.1080/0022250X.2016.1147443
Burt, R. S. (2004). Structural holes and good ideas. American Journal of Sociology, 110(2), 349–399. doi:10.1086/421787
Busoniu, L., Babuska, R., & De Schutter, B. (2008). A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews, 38(2). doi:10.1109/TSMCC.2007.913919
Campbell-Meiklejohn, D. K., Bach, D. R., Roepstorff, A., Dolan, R. J., & Frith, C. D. (2010). How the opinion of others affects our valuation of objects. Current Biology, 20(13), 1165–1170. doi:10.1016/j.cub.2010.04.055
Carro, A., Toral, R., & San Miguel, M. (2016). The noisy voter model on complex networks. Scientific Reports, 6, 24775. doi:10.1038/srep24775
Castellano, C., Fortunato, S., & Loreto, V. (2009). Statistical physics of social dynamics. Reviews of Modern Physics, 81(2), 591. doi:10.1103/RevModPhys.81.591
Centola, D., Gonzalez-Avella, J. C., Eguiluz, V. M., & San Miguel, M. (2007). Homophily, cultural drift, and the co-evolution of cultural groups. Journal of Conflict Resolution, 51(6), 905–929. doi:10.1177/0022002707307632
Centola, D., & Macy, M. (2007). Complex contagions and the weakness of long ties. American Journal of Sociology, 113, 702–734. doi:10.1086/521848
Chacoma, A., & Zanette, D. H. (2015). Opinion formation by social influence: From experiments to modeling. PLoS ONE, 10(10), e0140406. doi:10.1371/journal.pone.0140406
Chatterjee, S., & Seneta, E. (1977). Towards consensus: Some convergence theorems on repeated averaging. Journal of Applied Probability, 14(1), 89–97. doi:10.2307/3213262
Clifford, P., & Sudbury, A. (1973). A model for spatial conflict. Biometrika, 60(3), 581–588. doi:10.1093/biomet/60.3.581
Dall, J., & Christensen, M. (2002). Random geometric graphs. Physical Review E, 66(1), 016121. doi:10.1103/PhysRevE.66.016121
Dandekar, P., Goel, A., & Lee, D. T. (2013). Biased assimilation, homophily, and the dynamics of polarization. Proceedings of the National Academy of Sciences, 110(15), 5791–5796. doi:10.1073/pnas.1217220110
Deffuant, G., Amblard, F., Weisbuch, G., & Faure, T. (2002). How can extremism prevail? A study based on the relative agreement interaction model. Journal of Artificial Societies and Social Simulation, 5(4), 1.
Deffuant, G., Neau, D., Amblard, F., & Weisbuch, G. (2000). Mixing beliefs among interacting agents. Advances in Complex Systems, 3(01n04), 87–98. doi:10.1142/S0219525900000078
DeGroot, M. H. (1974). Reaching a consensus. Journal of the American Statistical Association, 69(345), 118–121. doi:10.1080/01621459.1974.10480137
DiMaggio, P., Evans, J., & Bryson, B. (1996). Have Americans' social attitudes become more polarized? American Journal of Sociology, 102(3), 690–755. doi:10.1086/230995
Duggins, P. (2017). A psychologically-motivated model of opinion change with applications to American politics. Journal of Artificial Societies and Social Simulation, 20(1), 13. doi:10.18564/jasss.3316
Fazio, R. H. (2001). On the automatic activation of associated evaluations: An overview. Cognition & Emotion, 15(2), 115–141. doi:10.1080/02699930125908
Fazio, R. H., Eiser, J. R., & Shook, N. J. (2004). Attitude formation through exploration: Valence asymmetries. Journal of Personality and Social Psychology, 87(3), 293. doi:10.1037/0022-3514.87.3.293
Fishbein, M. (1963). An investigation of the relationship between beliefs about an object and the attitude toward that object. Human Relations, 16(3), 233–239. doi:10.1177/001872676301600302
Flache, A., & Macy, M. W. (2011). Small worlds and cultural polarization. The Journal of Mathematical Sociology, 35(1–3), 146–176. doi:10.1080/0022250X.2010.532261
Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486(3–5), 75–174. doi:10.1016/j.physrep.2009.11.002
Frachebourg, L., & Krapivsky, P. L. (1996). Exact results for kinetics of catalytic reactions. Physical Review E, 53(4), R3009–R3012. doi:10.1103/PhysRevE.53.R3009
French, J. R., Jr. (1956). A formal theory of social power. Psychological Review, 63(3), 181. doi:10.1037/h0046123
Friedkin, N. E. (1999). Choice shift and group polarization. American Sociological Review, 856–875.
Friedkin, N. E. (2015). The problem of social control and coordination of complex systems in sociology: A look at the community cleavage problem. IEEE Control Systems, 35(3), 40–51. doi:10.1109/MCS.2015.2406655
Friedkin, N. E., & Johnsen, E. C. (1990). Social influence and opinions. The Journal of Mathematical Sociology, 15(3–4), 193–206. doi:10.1080/0022250X.1990.9990069
Friedkin, N. E., & Johnsen, E. C. (2011). Social influence network theory: A sociological examination of small group dynamics. Cambridge, England: Cambridge University Press.
Friedkin, N. E., Proskurnikov, A. V., Tempo, R., & Parsegov, S. E. (2016). Network science on belief system dynamics under logic constraints. Science, 354(6310), 321–326. doi:10.1126/science.aal1794
Galam, S. (2004). Contrarian deterministic effects on opinion dynamics: “the hung elections scenario”. Physica A: Statistical Mechanics and Its Applications, 333(C), 453–460. doi:10.1016/j.physa.2003.10.041
Galam, S. (2008). Sociophysics: A review of Galam models. International Journal of Modern Physics C, 19(03), 409–440. doi:10.1142/S0129183108012297
Girvan, M., & Newman, M. E. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99(12), 7821–7826. doi:10.1073/pnas.122653799
Gleeson, J. P. (2013). Binary-state dynamics on complex networks: Pair approximation and beyond. Physical Review X, 3, 021004. doi:10.1103/PhysRevX.3.021004
Goldthorpe, J. H. (1998). Rational action theory for sociology. The British Journal of Sociology, 49(2), 167–192. doi:10.2307/591308
Hedström, P., & Ylikoski, P. (2010). Causal mechanisms in the social sciences. Annual Review of Sociology, 36, 49–67. doi:10.1146/annurev.soc.012809.102632
Hegselmann, R., & Krause, U. (2002). Opinion dynamics and bounded confidence models, analysis, and simulation. Journal of Artificial Societies and Social Simulation, 5(3), 2.
Homans, G. C. (1958). Social behavior as exchange. American Journal of Sociology, 63(6), 597–606. doi:10.1086/222355
Homans, G. C. (1974). Social behavior: Its elementary forms (Revised ed.). Oxford, England: Harcourt Brace Jovanovich.
Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik A Hadrons and Nuclei, 31(1), 253–258.
Jackson, M. O. (2010). Social and economic networks. Princeton University Press.
Jackson, M. O., & Zenou, Y. (2014). Games on networks. Handbook of Game Theory, Elsevier, 95.
Kimura, M., & Weiss, G. H. (1964). The stepping stone model of population structure and the decrease of genetic correlation with distance. Genetics, 49, 561–576.
Klemm, K., Eguiluz, V. M., Toral, R., & San Miguel, M. (2003). Global culture: A noise-induced transition in finite systems. Physical Review E, 67, 045101(R). doi:10.1103/PhysRevE.67.045101
Kondrashov, A. S., & Shpak, M. (1998). On the origin of species by means of assortative mating. Proc. R. Soc. Lond. B, 265, 2273–2278. doi:10.1098/rspb.1998.0570
Lamarche-Perrin, R., Banisch, S., & Olbrich, E. (2016). The information bottleneck method for optimal prediction of multilevel agent-based systems. Advances in Complex Systems, 19(01n02), 1650002. doi:10.1142/S0219525916500028
Lazarsfeld, P., & Merton, R. K. (1954). Friendship as a social process: A substantive and methodological analysis. In M. Berger, T. Abel, & C. H. Page (Eds.), Freedom and control in modern society (pp. 18–66). New York, USA: Van Nostrand.
Lord, C. G., Ross, L., & Lepper, M. R. (1979). Biased assimilation and attitude polarization: The effects of prior theories on subsequently considered evidence. Journal of Personality and Social Psychology, 37(11), 2098. doi:10.1037/0022-3514.37.11.2098
Lorenz, J. (2007). Continuous opinion dynamics of multidimensional allocation problems under bounded confidence: More dimensions lead to better chances for consensus. arXiv preprint arXiv:0708.2923.
Macy, M. W., Kitts, J. A., Flache, A., & Benard, S. (2003). Polarization in dynamic networks: A Hopfield model of emergent structure. In Dynamic social network modeling and analysis (pp. 162–173). Washington: National Academies Press.
Mark, N. P. (2003). Culture and competition: Homophily and distancing explanations for cultural niches. American Sociological Review, 68(3), 319–345. doi:10.2307/1519727
Martins, A. C. (2008). Continuous opinions and discrete actions in opinion dynamics problems. International Journal of Modern Physics C, 19(04), 617–624. doi:10.1142/S0129183108012339
Martins, A. C., & Galam, S. (2013). Building up of individual inflexibility in opinion dynamics. Physical Review E, 87(4), 042807. doi:10.1103/PhysRevE.87.042807
Mäs, M., & Bischofberger, L. (2015). Will the personalization of online social networks foster opinion polarization? Available at SSRN 2553436.
Mäs, M., & Flache, A. (2013). Differentiation without distancing. Explaining bi-polarization of opinions without negative influence. PLoS ONE, 8(11), e74516. doi:10.1371/journal.pone.0074516
Mäs, M., Flache, A., & Helbing, D. (2010). Individualization as driving force of clustering phenomena in humans. PLoS Computational Biology, 6(10), e1000959. doi:10.1371/journal.pcbi.1000959
McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27, 415–444. doi:10.1146/annurev.soc.27.1.415
Mercier, H., & Sperber, D. (2011). Why do humans reason? Arguments for an argumentative theory. Behavioral and Brain Sciences, 34(02), 57–74. doi:10.1017/S0140525X10000968
Moran, P. A. P. (1958). Random processes in genetics. Proceedings of the Cambridge Philosophical Society, 54, 60–71.
Morris, S. (2000). Contagion. The Review of Economic Studies, 67(1), 57–78. doi:10.1111/roes.2000.67.issue-1
Moussaïd, M., Kämmer, J. E., Analytis, P. P., & Neth, H. (2013). Social influence and the collective dynamics of opinion formation. PLoS ONE, 8(11), e78433. doi:10.1371/journal.pone.0078433
Parsegov, S. E., Proskurnikov, A. V., Tempo, R., & Friedkin, N. E. (2017). Novel multidimensional models of opinion dynamics in social networks. IEEE Transactions on Automatic Control, 62(5), 2270–2285. doi:10.1109/TAC.2016.2613905
Penrose, M. (2003). Random geometric graphs. New York, USA: Oxford University Press.
Petty, R. E., Cacioppo, J. T., & Goldman, R. (1981). Personal involvement as a determinant of argument-based persuasion. Journal of Personality and Social Psychology, 41(5), 847. doi:10.1037/0022-3514.41.5.847
Ruff, C. C., & Fehr, E. (2014). The neurobiology of rewards and values in social decision making. Nature Reviews Neuroscience, 15(8), 549–562. doi:10.1038/nrn3776
Simon, H. A. (1978). Rationality as process and as product of thought. The American Economic Review, 68(2), 1–16.
Slanina, F., & Lavicka, H. (2003). Analytical results for the Sznajd model of opinion formation. The European Physical Journal B Condensed Matter and Complex Systems, 35(2), 279–288. doi:10.1140/epjb/e2003-00278-0
Sood, V., & Redner, S. (2005). Voter model on heterogeneous graphs. Physical Review Letters, 94(17), 178701. doi:10.1103/PhysRevLett.94.178701
Sunstein, C. R. (2002). The law of group polarization. Journal of Political Philosophy, 10(2), 175–195. doi:10.1111/1467-9760.00148
Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
Takács, K., Flache, A., & Mäs, M. (2016). Discrepancy and disliking do not induce negative opinion shifts. PLoS ONE, 11(6), e0157948. doi:10.1371/journal.pone.0157948
Urbig, D., & Malitz, R. (2005). Dynamics of structured attitudes and opinions. In K. G. Troitzsch (Ed.), Representing social reality: Pre-proceedings of the Third Conference of the European Social Simulation Association (ESSA), September 5–9, Koblenz, Germany (pp. 206–212).
Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications (Vol. 8). Cambridge, UK: Cambridge University Press.
Wright, S. (1943). Isolation by distance. Genetics, 28, 114–138.
