###### Which party has the most racially-representative voting base?

## Parties in the USA

*Sanders, Cruz, Paul, Rubio, Clinton, Bush, Walker, Carson, Fiorina, O’Malley, et al. …*

The 2016 Presidential field is expanding rapidly each week, but instead of focusing on who is running in which party, let’s look at how Americans identify with each party. For the sake of clarity, I will use the word race when describing the demographics in this article, although I recognize that there is much debate over whether some of these categories should be classified as races, ethnicities, or something else.

The data and analysis presented here are basic, yet insightful, or so I hope. Most of the data comes from a 2012 Gallup poll in which over 300,000 Americans were asked about their race and party affiliation. The figure below displays the resulting racial composition of each major political affiliation (Republican, Democrat, and Independent) juxtaposed with the racial composition of the United States as a whole.

So which political affiliation (Independents included) has a support base most racially similar to the general population of the United States? The common quip is that Democrats are racially diverse and Republicans are not, often with the underlying implication that Democrats are therefore more representative of the American population. The table below presents a simple analysis in which I compared each race’s proportion in a political party to its proportion in the US population. The most representative (i.e. the party with a racial proportion most similar to that of the general population) and least representative political parties are listed, with the right-hand column stating whether the least representative party is over- or under-representing that demographic in its support base. This analysis uses the baseline presumption that a political party should be representative of the general population.

We see that white Democrats (60% of Democrats) are actually the most proportionally similar to the general population (63% of Americans), while there are 41% more white Republicans than you would expect. Interestingly, African Americans are equally misrepresented in both the Democrat and Republican support bases: the former has 83% more black supporters than the proportion of African Americans in the general population would suggest, and the latter has 83% fewer. Additionally, outside of white and other supporters (the latter of whom I presume include Native Americans and mixed-race Americans), neither major political party’s support base is notably representative of the US population. Instead, the collection of Americans who identify as Independents is the most representative for all but two racial groups.
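For readers who want to check the "41% more" style of figure, here is a minimal Python sketch of the comparison (not the original R code). The function name `relative_difference` is my own, and the 89% share for white Republicans is inferred from the 63% population share plus the 26-point gap quoted later in this post, so treat it as an assumption rather than a number read off the poll.

```python
def relative_difference(party_share, population_share):
    """How far a group's share of a party sits from its share of the
    population, expressed relative to the population share."""
    return (party_share - population_share) / population_share


US_WHITE = 0.63  # share of white Americans in the US population (from the text)

examples = {
    "White Democrats": 0.60,    # from the text
    "White Republicans": 0.89,  # assumption: 0.63 + 0.26, inferred from the text
}

for label, share in examples.items():
    diff = relative_difference(share, US_WHITE)
    direction = "over" if diff > 0 else "under"
    print(f"{label}: {abs(diff):.0%} {direction}-represented")
```

A positive result means the group is over-represented in the party relative to the population, and a negative result means it is under-represented.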

We can also break each racial demographic into its corresponding political affiliations. With this view, we can perhaps get a better understanding of why black supporters are so proportionally skewed in both parties. While Independents hold a plurality in other racial groups, African Americans are most likely to identify as Democrats. Conversely, African Americans are also the racial group least likely to identify as Republican. I will not go into the political history that may or may not explain this trend and will instead leave that task to the reader. When only considering the two major parties, all non-white groups are more likely to identify as Democrats (often by a ratio of nearly 2:1), while white Americans are most likely to identify as Republicans.

## Stats quo

Next, I wanted to do some light, and perhaps misguided, statistical analyses of political affiliation. I chose to use two metrics to quantify how well each political party represented the general population. First, I chose mean squared error (MSE), which measures the mean of the squared differences between predicted values and the corresponding observed values. MSE is often called “prediction error” (how convenient!). In this case, our predicted value is the proportion a given demographic holds in the US population while the observed value is the proportion that demographic holds in a given political party. Second, I used a metric of my own creation called “mean proportion squared error” (MPSE). While very similar to mean squared error, this formula takes into account the relative differences by calculating the percent difference between predicted and observed values before squaring. For example, if 50% of your apple grove is red apples, but red apples only make up 30% of the apples in your kitchen, then your kitchen has 40% fewer red apples than expected and a proportion squared error of 0.16. For those familiar with statistics, you may also note that the formula for MPSE is very similar to mean percentage error, except that mean percentage error keeps the sign of each percentage error and thus allows two equal and opposite errors to negate each other, whereas squaring does not.
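The two metrics can be sketched in a few lines of Python. This is my reading of the definitions above, not the original R code: the function names are hypothetical, and I am assuming the percent difference is taken relative to the predicted (population) value. The second example uses made-up shares purely to show why a signed mean percentage error can cancel out.

```python
def mse(observed, predicted):
    """Mean of the squared differences between observed and predicted shares."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mpse(observed, predicted):
    """Mean of the squared *relative* differences (assumed relative to predicted)."""
    return sum(((o - p) / p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mean_percentage_error(observed, predicted):
    """Signed version: equal and opposite errors cancel each other out."""
    return sum((o - p) / p for o, p in zip(observed, predicted)) / len(observed)


# Apple example from the text: 50% expected (grove), 30% observed (kitchen).
kitchen, grove = [0.30], [0.50]
print(f"MSE:  {mse(kitchen, grove):.4f}")
print(f"MPSE: {mpse(kitchen, grove):.4f}")

# Illustrative (made-up) shares: one group 40% over, one group 40% under.
observed, predicted = [0.70, 0.30], [0.50, 0.50]
print(f"Mean percentage error: {abs(mean_percentage_error(observed, predicted)):.4f}")
print(f"MPSE:                  {mpse(observed, predicted):.4f}")
```

In the second example the signed metric comes out at essentially zero even though both groups are 40% off, while MPSE still registers the misfit.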

Let me briefly explain why I chose to use mean proportion squared error. In short, I thought it was necessary because otherwise differences between predicted and observed values will matter less for minority populations than for the majority population, non-Hispanic white Americans. This is because MSE treats percentages and proportions as normal numbers. For example, if the Pirate Party’s voter base is 50% white and 8% Hispanic (remember, the US is 63% white and 17% Hispanic), MSE would essentially view the errors as an absolute difference of 13 percentage points for white voters and 9 percentage points for Hispanic voters. Thus, we would conclude that the party base is least representative of white Americans. On the other hand, if we consider the proportion each group holds in the general population, and therefore treat the proportions as percentages and not normal numbers, we see that there are 21% fewer white voters than expected (13-point absolute difference divided by 63% of the US population) and 53% fewer Hispanic voters than expected (9-point absolute difference divided by 17% of the US population) in the Pirate Party relative to the general population. In this case, we see that the Pirate Party is actually less representative of Hispanic voters than white voters. I thought this a necessary distinction to make, particularly in a democracy where we often think in terms of representation (proportions). If you disagree with my belief that each demographic’s proportion in the US population should be taken into account, do not worry, because I will still report mean squared error.
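The Pirate Party example can be worked through in Python as a rough check on the arithmetic. The dictionary layout and the `group_errors` helper are illustrative choices of mine; only the four percentages come from the example above.

```python
US = {"White": 0.63, "Hispanic": 0.17}            # population shares from the text
PIRATE_PARTY = {"White": 0.50, "Hispanic": 0.08}  # hypothetical party from the text


def group_errors(party, population):
    """Per-group absolute and relative gaps, plus their squares."""
    rows = {}
    for group, pop_share in population.items():
        absolute = party[group] - pop_share
        relative = absolute / pop_share
        rows[group] = {
            "absolute": absolute,
            "relative": relative,
            "squared": absolute ** 2,
            "proportion_squared": relative ** 2,
        }
    return rows


table = group_errors(PIRATE_PARTY, US)
for group, row in table.items():
    print(
        f"{group}: gap of {abs(row['absolute']) * 100:.0f} points, "
        f"{abs(row['relative']):.0%} fewer than expected"
    )

# Which group looks least represented depends on the metric.
worst_by_squared = max(table, key=lambda g: table[g]["squared"])
worst_by_proportion = max(table, key=lambda g: table[g]["proportion_squared"])
print("Largest squared error:", worst_by_squared)
print("Largest proportion squared error:", worst_by_proportion)
```

The two rankings disagree, which is the whole point of the second metric: the squared error singles out white voters, while the proportion squared error singles out Hispanic voters.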

Once again, we see that Independents are the most representative of the American population. In both metrics, Independents have the best fit to the general population. On the other hand, we see that Republican supporters feature the worst fit to the general population. You may also note that MPSE penalizes both Democrats and Republicans for having poor fits to the US minority populations.

Lastly, I broke out each demographic’s squared and proportion squared error within each party. We see that white Republicans stand out as having the largest absolute difference from their proportion of the general population, with a squared error of 0.0676 (an absolute difference of 26 percentage points, i.e. 0.26 squared). Meanwhile, both Republicans and Democrats appear to have decent representation in minority groups when using MSE. However, remember that this is because these groups hold relatively small proportions of the general population. When we consider the proportion each demographic holds in the general population, Republicans actually have the worst representation among Asian, Hispanic, and African Americans. In all three cases, these groups are under-represented in the Republican support base. On the other hand, we see that the Democrat support base is also not spectacularly representative of these three groups, with African Americans over-represented and Asian and Hispanic Americans under-represented.

## Parting Thoughts

Race is a sensitive topic, particularly in the political arena. However, there has been much talk about how Republicans need to expand their support base to include more minorities, particularly as the proportion of white Americans decreases over time. This analysis revealed the Republicans’ weakness among minorities, at least in terms of who is openly affiliated with them. Given the record high number of Americans who now identify as Independent, this analysis may not fully capture the proportion of minorities who vote Republican. I’ll save that for another post. Additionally, with so many Americans identifying as Independent, it is no wonder that they are the most racially similar to the general population. With the increasingly large cohort of Independents, one might reasonably expect regression to the mean. Both Republicans and Democrats face an uphill battle bringing Hispanic supporters on board, although Republicans face a much larger challenge. Republicans also face an ongoing challenge of attracting more black supporters; Republicans have not held a plurality in African American political affiliation since at least 1936. Both parties could also do more to bring Asian supporters into their ranks.

## Methodology

Analysis was conducted in R, without packages. Nothing like plain, vanilla, base R! All figures were created in Excel and PowerPoint using the data exported from R.