Modeling Bimodal Politics: Part 2
To illustrate my previous blog post, I’ve written some simulations that demonstrate how political candidates can sample the electorate and reposition themselves in order to maximize their probability of winning more votes than their opponents. In these examples, the specific problem is to model elections with a bimodal voter distribution. However, the idea generalizes into an optimization problem. Suppose that you have an unknown, noisy underlying function or distribution (the voter distribution), and you are allowed to repeatedly sample it at N different values simultaneously (running an election) to see how much of the function / distribution is closest to each sample value (the votes each candidate gets). From this, one could then predict what the underlying function / distribution is and/or work out how to optimally respond to an opponent’s move. Additional noise/drift factors and custom sampling costs can also be added for increased complexity. What’s the most efficient way to get a good guess of the underlying distribution and a strategic response given an opponent’s position?
In the practical case of the US, N is typically two: one candidate for each major political party. As mentioned, we are exploring a bimodal distribution. However, the underlying function could in theory be anything. The noise/drift factors are also very relevant in real political elections, since elections always have a degree of luck and randomness and voter preferences do change over time. However, the models that I created do not take these into account.
On a related note, I’m not 100% sure of the robustness and reliability of the models, due to possible flaws in either execution (bugs) or the underlying algorithm design. Therefore, if anyone can offer suggestions, improvements, or enhancements, I would appreciate it.
Model Principles
The basic premise of the models is that we have a spectrum of 0-100 that contains the realm of plausible candidate and policy stances. What the numbers actually stand for is irrelevant, since one can always scale the spectrum to encompass whatever stances one wishes to cover. From there, we create an underlying distribution of voter preferences along this spectrum. This distribution is completely arbitrary and in theory can even vary from iteration to iteration. However, for our context, we will create a distribution with two prominent peaks at 25 and 75, along with a much smaller peak at 50. This represents a political environment with fairly polarized viewpoints and a reduced amount of middling views.
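As a rough illustration, a distribution of this shape could be sampled as a mixture of three normal distributions. This is only a sketch: the function name sample_voters, the Gaussian shape, and the widths and weights of the peaks are all assumptions for illustration, not values from the simulation code.

```python
import random

def sample_voters(n_voters=10_000, seed=0):
    """Draw voter positions on the 0-100 spectrum from a three-peak mixture."""
    rng = random.Random(seed)
    # (mean, standard deviation, mixture weight). The peak locations come from
    # the text; the widths and weights are assumed for illustration.
    peaks = [(25.0, 8.0, 0.45), (75.0, 8.0, 0.45), (50.0, 5.0, 0.10)]
    weights = [w for _, _, w in peaks]
    voters = []
    while len(voters) < n_voters:
        mean, sd, _ = rng.choices(peaks, weights=weights)[0]
        v = rng.gauss(mean, sd)
        if 0.0 <= v <= 100.0:  # reject draws that fall off the spectrum
            voters.append(v)
    return voters

voters = sample_voters()
near_25 = sum(1 for v in voters if 15 <= v <= 35)
near_50 = sum(1 for v in voters if 40 <= v <= 60)
print(near_25, near_50)  # the peak at 25 should hold far more voters than the one at 50
```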
Candidates can then position themselves at some initial point on this spectrum. Again, their initial positions are arbitrary and can be specified by the user or be generated randomly.
After that, the model is run. A candidate wins a voter’s vote if he is closer to that voter than any other candidate. However, as we will discuss later, it is also possible for a voter to not vote for any candidate. After every round, candidates know their own position and vote total as well as the positions and vote totals of their opponents. However, they do not necessarily know what the underlying distribution is.
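The closest-candidate rule can be sketched in a few lines. The function name count_votes is hypothetical, and the way ties are broken (an exactly equidistant voter goes to the candidate listed first) is an assumption, since the text does not specify it.

```python
def count_votes(voters, candidates):
    """Return one vote total per candidate; each voter backs the nearest one."""
    totals = [0] * len(candidates)
    for v in voters:
        # min() returns the first index on a tie (an assumed tie-break rule).
        nearest = min(range(len(candidates)), key=lambda i: abs(candidates[i] - v))
        totals[nearest] += 1
    return totals

example_voters = [10, 20, 30, 60, 70, 90]
example_totals = count_votes(example_voters, [25, 75])
print(example_totals)  # [3, 3]
```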
Interpolation
In order to update their beliefs and possibly reposition after a vote, candidates use linear interpolation. Essentially, since they know the exact positions of each candidate and the total votes that each candidate received, they can roughly interpolate the cumulative voter percentiles. For example, suppose that a candidate positioned herself at 30 and received 15% of the votes. Furthermore, the candidate immediately to her left positioned himself at 25 while the candidate immediately to her right is at 35. Her voters are the ones closer to her than to either neighbor, so they lie between the midpoints 27.5 and 32.5, and we linearly interpolate by treating that 15% of voters as uniformly distributed between 27.5 and 32.5. Obviously, the more candidates we have and the more trials we run on our unchanging voter distribution, the better we can refine our prediction of the distribution.
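The worked example above can be written out as a small calculation. The helper names vote_interval and uniform_density are hypothetical and only illustrate the arithmetic.

```python
def vote_interval(position, left_neighbor, right_neighbor):
    """Range of voters closest to `position`: bounded by the midpoints to each neighbor."""
    return ((left_neighbor + position) / 2, (position + right_neighbor) / 2)

def uniform_density(vote_share, interval):
    """Vote share per unit of the spectrum, assuming voters are uniform on the interval."""
    lo, hi = interval
    return vote_share / (hi - lo)

interval = vote_interval(30, 25, 35)
density = uniform_density(0.15, interval)
print(interval)  # (27.5, 32.5)
print(density)   # about 0.03 of all voters per unit of the spectrum
```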
Candidate Strategy
In a two-person election, it is easy to see that it is optimal for candidates to find the median voter. In this case, it corresponds to the linearly interpolated point where 50% of voters are to the left and 50% are to the right. However, this only holds when all the voters vote. In cases where some voters do not vote, this may not necessarily be true.
Going forward, we will discuss the variations I made to explore this.
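With only two candidates, one election yields a single interior data point: the midpoint between A and B, together with the share of voters to its left. Below is a minimal sketch of how a median could be interpolated from that. It assumes full turnout and assumes voters are spread uniformly within each candidate’s segment; the function name estimate_median is hypothetical and this is one reading of the interpolation, not necessarily how the simulation code does it.

```python
def estimate_median(pos_a, pos_b, share_a, pos_max=100.0):
    """Estimate the median voter from one two-candidate election.

    pos_a < pos_b are the candidate positions and share_a is the fraction of
    votes won by the candidate at pos_a (full turnout assumed).
    """
    boundary = (pos_a + pos_b) / 2  # voters left of this point chose A
    if share_a >= 0.5:
        # The median lies in A's segment [0, boundary], which holds share_a.
        return boundary * 0.5 / share_a
    # Otherwise it lies in B's segment [boundary, pos_max], which holds 1 - share_a.
    return boundary + (pos_max - boundary) * (0.5 - share_a) / (1.0 - share_a)

# A at 20 and B at 40 split the spectrum at 30; A wins only 30% of the vote,
# so the estimated median falls to the right of the boundary.
print(estimate_median(20, 40, 0.30))  # approximately 50
```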
Model 0
Model 0 is pretty simple. It’s so simple, in fact, that in my code I didn’t even write it as a separate model. I simply had it as a baseline subset of the subsequent models. Model 0 generates a set of 100 candidates uniformly distributed between 0-100. These candidates run one simulated election and from there interpolate where the median voter position is. Since there are a lot of candidates, the linear approximation becomes a pretty close representation of the actual distribution. If we want even higher precision, we could simply increase the number of candidates.
Although this method gives the most accurate representation of the underlying voter distribution, it is not the most realistic. The reason is that real elections usually don’t have that many viable candidates who can get votes. Most of the time, there are two, maybe three, candidates that command the largest share of votes. Even if a less popular candidate is closer to your personal beliefs, you probably won’t vote for him since you know he won’t win. This is obviously unfortunate, but also understandable, since running a successful campaign does benefit from having more resources and entrenched political networks. The following models will focus on cases where the number of candidates is limited to two.
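A sketch of what Model 0 amounts to is given below. Several things here are assumptions made for illustration: the function name interpolated_median, evenly spaced candidates (randomly drawn uniform positions would also fit the description), and the illustrative voter sample with peak widths and weights chosen arbitrarily.

```python
import random

def interpolated_median(voters, n_candidates=100, pos_max=100.0):
    """Run one election with many candidates and interpolate the median voter."""
    step = pos_max / n_candidates
    # Candidate i sits at the center of cell i, so the voters closest to it
    # are exactly those in [i * step, (i + 1) * step].
    totals = [0] * n_candidates
    for v in voters:
        cell = min(max(int(v / step), 0), n_candidates - 1)
        totals[cell] += 1
    half = len(voters) / 2
    cumulative = 0
    for i, votes in enumerate(totals):
        if votes and cumulative + votes >= half:
            # Assume this candidate's voters are spread uniformly over the cell.
            return step * (i + (half - cumulative) / votes)
        cumulative += votes
    return pos_max / 2  # only reached when there are no voters

rng = random.Random(0)
# Illustrative voters: peaks at 25 and 75 plus a smaller one at 50 (assumed
# widths and weights), clamped to the 0-100 spectrum.
voters = [
    min(max(rng.gauss(rng.choices([25.0, 75.0, 50.0], weights=[45, 45, 10])[0], 8.0), 0.0), 100.0)
    for _ in range(10_000)
]
median_estimate = interpolated_median(voters)
print(median_estimate)  # should land near the center of this symmetric sample
```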
Models 1 & 2 Overview
Models 1 & 2 allow two candidates to run the election multiple times up until either convergence or a round limit with candidates updating their positions after each run. Both models share many of the same parameters that can be specified by the user.
MIN_DISTANCE: The minimum distance that candidates must keep from each other on the political spectrum. This enforces the need for candidates to be noticeably different. I set this to 3.
MAX_VOTER_DIFF_RATIO: The ratio that determines whether an election is “close” or not. If the difference between A and B’s votes divided by total votes is less than this, then we say that the election was close. In that case, although there might be a “winner” in terms of actual votes, A and B will evaluate the result as a draw for strategic repositioning purposes. This is initialized to 0.05.
POS_MAX: The maximum value at which candidates can take a position, with the spectrum starting at 0. This is initialized to 100 and should probably never actually be changed.
MAX_ITERS: The maximum number of rounds the simulation will be run. If the model doesn’t converge, this prevents it from looping infinitely. If the model does converge, fewer rounds may be run. This is initialized to 50.
VOTING_THRESH: The minimum amount of marginal benefit that a potential voter must receive from one candidate over the other, as specified by the BENEFIT_FUNC detailed next, in order to vote at all. For example, suppose this parameter is 5 and a certain voter scores A = 20 and B = 18. In this case, since the difference between A and B is less than 5, the voter would not vote for either. This is initialized to 0 for one run and 0.25 for another.
BENEFIT_FUNC: The function used to determine the benefit a voter receives from a candidate. The default function is defined in terms of c, the candidate position, and v, the voter position. This function can be played around with (e.g. sub-linear or super-linear scaling on distance).
A: The initial position of A. Initialized as a random uniform number between 0-50.
B: The initial position of B. Initialized as a random uniform number between 50-100.
ALLOW_CANDIDATE_FLIPPING: Allows a losing candidate to “switch” to the other side of the political spectrum in order to maximize his odds of winning future rounds. For example suppose that A is at 45 and B is at 50 and A loses. If this parameter is False, A’s next position must be left of 50. If it is True, then A may choose a position right of 50.
LOSER_MVT: Exclusive to Model 2. The fixed amount that candidates in the election can move per turn. Initialized to 1.
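For reference, the parameters above could be gathered into a single configuration object, with the VOTING_THRESH rule expressed as a small function. This is only a sketch: SimParams and choose_candidate are hypothetical names, the default for allow_candidate_flipping is an assumption (the text gives none), the initial positions A and B are left out, and treating an exact tie as an abstention is also an assumption.

```python
from dataclasses import dataclass

@dataclass
class SimParams:
    min_distance: float = 3.0
    max_voter_diff_ratio: float = 0.05
    pos_max: float = 100.0
    max_iters: int = 50
    voting_thresh: float = 0.0             # 0.25 in the run with abstention
    allow_candidate_flipping: bool = False  # assumed default
    loser_mvt: float = 1.0                  # Model 2 only

def choose_candidate(score_a, score_b, voting_thresh):
    """Return 'A', 'B', or None if the voter abstains.

    score_a and score_b are the voter's benefit scores for each candidate,
    as produced by whatever BENEFIT_FUNC is in use.
    """
    diff = score_a - score_b
    # Abstain when the marginal benefit is below the threshold. An exact tie
    # is also treated as an abstention (an assumption).
    if abs(diff) < voting_thresh or diff == 0:
        return None
    return "A" if diff > 0 else "B"

params = SimParams()
print(choose_candidate(20, 18, 5))     # None: a margin of 2 is below the threshold of 5
print(choose_candidate(20, 18, 0.25))  # A
```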
Model 1 Logic
Essentially, this model compels the “loser” of each election to adjust his strategy while the winner stays at the same position. The intuition behind the winner staying put is that a winner has no reason to change a “winning” position. The loser’s new position is determined by the “Interpolation” function as described above. Note that this only applies if the difference in the two candidates’ votes, as a fraction of total votes, is at least MAX_VOTER_DIFF_RATIO. If not, then both candidates deem the election close and each moves a quarter of the way toward the other (to the points that are 25% and 75% of the way from A to B). This continues until the candidates are within MIN_DISTANCE of each other.
Since there are situations where candidates can end up in an infinite loop, such as a candidate oscillating between two positions that give the same result, we have additional termination conditions. One is an oscillation check, which terminates the run if both candidates’ most recent positions are the same as their positions from 2 and 4 rounds prior. The second is MAX_ITERS, which limits the total number of rounds.
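The close-election branch of Model 1 can be sketched as follows. The function names is_close and close_election_step are hypothetical, and treating an election with zero total votes as close is an assumption.

```python
def is_close(votes_a, votes_b, max_voter_diff_ratio=0.05):
    """True when the vote margin, as a fraction of total votes, is below the ratio."""
    total = votes_a + votes_b
    if total == 0:
        return True  # nobody voted: treated as a draw (an assumption)
    return abs(votes_a - votes_b) / total < max_voter_diff_ratio

def close_election_step(pos_a, pos_b):
    """Both candidates move a quarter of the gap toward each other."""
    gap = pos_b - pos_a
    return pos_a + 0.25 * gap, pos_b - 0.25 * gap

print(is_close(5100, 4900))         # True: the margin is 2% of the votes
print(close_election_step(40, 60))  # (45.0, 55.0)
```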
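A minimal sketch of the oscillation check, assuming the positions of both candidates are stored once per round in a list (the name is_oscillating and this storage layout are hypothetical):

```python
def is_oscillating(history):
    """history holds one (pos_a, pos_b) tuple per round, oldest first.

    Returns True when the latest positions match those from 2 and 4 rounds prior.
    """
    if len(history) < 5:
        return False
    # Exact equality is used here; with floating-point positions a small
    # tolerance may be preferable.
    return history[-1] == history[-3] == history[-5]

looping = [(20, 80), (25, 80), (20, 80), (25, 80), (20, 80)]
settling = [(10, 90), (20, 80), (25, 80), (20, 80), (25, 80)]
print(is_oscillating(looping))   # True
print(is_oscillating(settling))  # False
```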
From there, if ALLOW_CANDIDATE_FLIPPING is False, the simulation terminates. If it is True, the “loser” then “experiments” and crosses over to the other side of his opponent, starting at the midpoint between his opponent’s position and the voter boundary on that side. The simulation is then rerun until it terminates again.
In reality, it’s unusual for candidates to drastically change their stances, so realistically this should always be set to False. However, another way to interpret the flip is as the death of one political party (the loser’s original party) and the birth of a new political party, located where the loser moves to, that better represents the country.
A big problem with Model 1 is that the movement of candidates is pretty discontinuous, with large jumps. This not only produces unstable results, but is also not very realistic. When changing their stances, candidates don’t just “jump” to drastically new positions after each election, but gradually change their stances over time.
Model 2 Logic
Model 2 tries to rectify the “jump” problem of Model 1. It scraps the Interpolation as a basis for movement. Instead, it introduces a new variable called LOSER_MVT, which is the fixed amount that candidates can move per round. If the election is close, both candidates move this amount towards each other. Otherwise, only the loser moves this amount. The default direction is towards the loser’s opponent. However, if the loser has lost two times in a row, and his most recent performance (as measured by the total votes he received) is worse than his previous performance, then the loser moves in the direction opposite to his last move by twice the default distance. This is essentially backtracking: reversing his previous movement in light of new information. This model also incorporates the same logic for candidate flipping, the oscillation check, and maximum iterations.
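The movement rule can be sketched roughly as below. The function names and argument layout are hypothetical, and the sketch ignores MIN_DISTANCE, the spectrum bounds, and candidate flipping, so it shows only the direction and step-size logic.

```python
def loser_next_position(pos, opponent_pos, loser_mvt=1.0, last_move=0.0,
                        lost_previous=False, votes_now=None, votes_prev=None):
    """New position of the loser after a decisive (not close) election.

    last_move is the signed distance the loser moved in the previous round.
    """
    backtrack = (
        lost_previous
        and last_move != 0.0
        and votes_now is not None
        and votes_prev is not None
        and votes_now < votes_prev
    )
    if backtrack:
        # Second loss in a row with fewer votes: reverse the last move,
        # travelling twice the default distance.
        direction = -1.0 if last_move > 0 else 1.0
        return pos + direction * 2 * loser_mvt
    # Default: step toward the opponent.
    direction = 1.0 if opponent_pos > pos else -1.0
    return pos + direction * loser_mvt

def close_step(pos_a, pos_b, loser_mvt=1.0):
    """Close election: both candidates step toward each other (assumes pos_a < pos_b)."""
    return pos_a + loser_mvt, pos_b - loser_mvt

first = loser_next_position(40, 60)  # moves toward the opponent
second = loser_next_position(first, 60, last_move=1.0, lost_previous=True,
                             votes_now=4000, votes_prev=4200)
print(first, second)  # 41.0 39.0
```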
Results
Model 0
Model 1
Model 2
Overall, we see that with VOTING_THRESH = 0, regardless of the process, we always end up with the two candidates near the median voter, as expected. However, as alluded to before, this is not necessarily the case when some voters opt not to vote (VOTING_THRESH > 0). If we allow candidate flipping, we see that Models 1 and 2 can cause the candidates to crowd around one of the peaks. Model 2 with VOTING_THRESH > 0 and ALLOW_CANDIDATE_FLIPPING set to False results in both candidates staying near their respective peaks. In our current political climate, I think this is the most realistic model.
Candidates on Same Side
The following show what happens if the candidates start on the same side of the median voter under the same conditions.
Model 1
Model 2
Possible Extension To Predict Voter Distribution
The simulations save the position of each candidate as well as the cumulative votes they earned per round (in the code it’s called ‘self.results_dict’). From there, perhaps it is possible to use that data to train a model that can predict the optimal position to pick given the results of previous rounds and your opponent’s position. Essentially, suppose you know the previous positions and results of you and your opponent. You also know where your opponent is going to position himself this round. Given this, where should you position yourself to maximize your likelihood of winning? This problem could potentially be approached starting with gradient descent. However, whereas gradient descent relies on a smooth, continuous objective function, the case we have here only has discrete sampled points. As a result, I don’t know how to implement this. Other approaches are also welcome, and I would encourage people to think about it. If you figure something out, please tell me.
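One possible starting point that sidesteps the smoothness issue is a plain grid search rather than gradient descent: build a piecewise-linear estimate of the cumulative voter distribution from the sampled points, then try every allowed position against the opponent’s known position. The sketch below is an assumption-laden illustration, not a tested solution to the problem: the function names, the knot format (position, cumulative share), and the example numbers are all hypothetical, and it assumes full turnout.

```python
from bisect import bisect_right

def cdf(x, knots):
    """Piecewise-linear CDF through knots [(position, cumulative_share), ...].

    Knot positions must be strictly increasing.
    """
    xs = [k[0] for k in knots]
    if x <= xs[0]:
        return knots[0][1]
    if x >= xs[-1]:
        return knots[-1][1]
    i = bisect_right(xs, x)
    (x0, f0), (x1, f1) = knots[i - 1], knots[i]
    return f0 + (f1 - f0) * (x - x0) / (x1 - x0)

def vote_share(x, opponent, knots):
    """Estimated share for a candidate at x against one opponent."""
    if x == opponent:
        return 0.5
    left = cdf((x + opponent) / 2, knots)  # share of voters left of the midpoint
    return left if x < opponent else 1.0 - left

def best_response(opponent, knots, min_distance=3.0, step=0.5, pos_max=100.0):
    """Grid search for the position with the highest estimated vote share."""
    best_x, best_share = None, -1.0
    for i in range(int(pos_max / step) + 1):
        x = i * step
        if abs(x - opponent) < min_distance:
            continue  # respect MIN_DISTANCE
        share = vote_share(x, opponent, knots)
        if share > best_share:
            best_x, best_share = x, share
    return best_x, best_share

# Hypothetical estimate: 40% of voters below 30, 60% below 70.
knots = [(0, 0.0), (30, 0.4), (70, 0.6), (100, 1.0)]
best_x, best_share = best_response(30, knots)
print(best_x, best_share)  # best_x is 33.0 here, just right of the opponent
```

The grid search is brute force, but with a one-dimensional spectrum of 0-100 it is cheap, and it only needs the sampled points rather than a differentiable objective.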
Other Model Extension
Another extension of these simulations can be found here. Like a traditional Hotelling model, this paper assumes a varying cost of travel to either good A or B and a varying price. However, instead of homogeneous goods, the model includes heterogeneous but equally beneficial goods as well as a varying “taste of variety” factor for people. Heterogeneous but equally beneficial means that goods A and B are nominally undifferentiated in terms of benefit, but are still different. The “taste of variety” factor makes it more or less likely for a person to try a good that has a higher procurement cost because it is different from what they’re used to. A good real-world example is a consumer who buys both Nike and Adidas basketball shoes. Functionally, the shoes are pretty much the same, and he should just buy shoes from whichever store is closer or offers better discounts. However, the consumer enjoys owning slightly different brands and styles of shoes. Therefore, for his next new pair, the consumer may be willing to travel to a further / more expensive store.
In terms of elections, this model is somewhat inverted. Voting involves a roughly constant cost of travel and price (going to the voting booth). However, the results and value of different candidates can be vastly different to different people. The paper also assumes a uniform distribution of people, which is not the case here either.
If you increase this “taste” factor even for just one group of people who have a slight but not heavy initial preference for one candidate or the other (in the context of our political models, people who are from 25-50 or 50-75), this encourages that candidate to move towards the more “extreme” portion of his constituency. The reason is that his moderate supporters are unreliable and may not vote for him even if he makes an effort to align more with them. Therefore, it’s much more important for him to secure his core constituency. In response, the other political candidate will actually also move slightly in this direction in order to capture some of the leftover / abandoned moderate voters.
In the context of elections, this actually has a very counter-intuitive effect. A political party that has less reliable moderate supporters may actually move the general politics of the country further in the direction of that party. The reason is that unreliable moderate supporters force a candidate to focus on his core base. Therefore, depending on the “reliability” of the opponent’s core vs. moderate bases, it could shift the opponent towards the candidate’s policies. The opponent is more likely to shift if he believes that he can capture some more of the moderate voters left behind without sacrificing too many of his core voters. For example, two candidates A and B are initially at 40 and 75. A “gives up” on the moderate voters and decides to focus on his core constituency by moving his position to 25. B sees this and then moves to 60 to try to capture some of the moderate voters A left behind.
Conversely, if it is instead the extreme voters (0-25 or 75-100) who are more likely to have a “taste of variety”, then the political candidate is encouraged to move more towards the center. However, in the context of politics, this is not very realistic. After all, no one expects a stalwart Democrat to be more likely than a moderate Democrat to become a Republican.