Exploring and Visualizing Campaign Contributions
Is a company’s Fortune 500 ranking correlated to its contributions to House and Senate election campaigns?
(Adapted from a final project report for a Data Science class at Brown University in Spring 2016. I worked in a group of three students to clean, analyze, and visualize the data. Please note that this page is not mobile-friendly.)
One of the main issues in the 2016 presidential race was campaign contributions: where do they come from, and who receives them? We were interested in the role played by money in elections. By making massive donations during campaign cycles, are companies essentially “buying off” the government for their own ends?
We began by analyzing publicly available datasets (from OpenSecrets.org) to examine financial contributions towards candidates running for the House of Representatives and the Senate. We explored which sectors donated the most to elections, and found the top 15 industries that contributed to certain campaigns.
In the bubble chart above, each node corresponds to an industry that made contributions during the 2014 House and Senate elections. The size of a node corresponds to the total value of all contributions by that industry.
In the next visualization, we displayed the top 15 industries that contributed to campaigns in the 2014 election cycle.
Our initial explorations were rather broad; we started out with the goal of discovering trends between campaign contributions and the outcomes of close elections. After receiving some feedback on our ideas, we decided to narrow our focus.
We wanted to know whether company contributions might be correlated with company performance; in other words, if a company contributes to at least one winning candidate in the House or Senate, is that contribution correlated with an increase in their own financial success?
As an indicator of company performance, we decided to use the yearly Fortune 500 rankings. We chose this indicator because 1) it was easily accessible and 2) its methodology ranks companies by total revenues for their respective fiscal years and takes into account profits after taxes. For our purposes, this seemed sufficiently representative of an American company's annual financial success.
Our project required data from 2 different sources: Fortune 500 rankings from 2012 and 2013, and OpenSecrets election data from 2012. While there was data about Fortune 500 company profits in 2012 and 2013, we couldn't obtain any for 2014 and so we could not factor profit changes between 2013 and 2014 into our analysis. However, we were able to incorporate Fortune 500 rankings from 2014.
To obtain the election data on campaign contributions, we used the OpenSecrets API to gather the legislators who won in the 2012 election cycle, along with the top 10 industries and companies (and their breakdowns) that donated to each of them, and wrote that information into a JSON file. That way, we could read from our own file for any analysis and avoid making API calls every time we wanted information (API calls were limited to 200/day). Pulling in the data from the API proved to be more work than we anticipated because there were many inconsistencies in the available data. For example, when we tried to match the legislators' IDs to the companies that contributed to them in an election cycle, some of the IDs didn't exist in the candidate contribution API call. We had to check for these specific IDs manually and make sure we skipped them, which took up the majority of our data collection time.
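The write-once, read-many pattern described above can be sketched roughly as follows. This is a minimal illustration, not our original script: `load_or_fetch`, its parameters, and the `fetch` callable are hypothetical names, and `fetch` stands in for whatever function actually calls the OpenSecrets API.

```python
import json
import os


def load_or_fetch(cache_path, legislator_ids, fetch, skip_ids=(), max_calls=200):
    """Return {legislator_id: record}, calling fetch() only for uncached IDs.

    fetch      -- hypothetical callable that performs the real API request
    skip_ids   -- IDs known to be missing from the contribution endpoint
    max_calls  -- daily request budget (the text mentions a 200/day limit)
    """
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cache = json.load(f)

    calls_made = 0
    for lid in legislator_ids:
        if lid in skip_ids or lid in cache:
            continue  # known-bad ID, or already stored locally
        if calls_made >= max_calls:
            break  # out of budget; rerun tomorrow to fill in the rest
        cache[lid] = fetch(lid)
        calls_made += 1

    # Persist everything gathered so far so later runs skip these IDs.
    with open(cache_path, "w") as f:
        json.dump(cache, f, indent=2)
    return cache
```

Because the cache is rewritten after every run, a job interrupted by the daily limit can simply be rerun the next day and will pick up where it left off.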
Fortune 500 Data
We found the Fortune 500 rankings for 2012, 2013, and 2014 from a website that pulled data from the Forbes website. We also manually pulled in information from the Fortune 500 website. Furthermore, we were able to get information on companies’ profits for the years 2012 and 2013.
Once we had our data, we needed to cross reference the companies in the Fortune 500 2012 ranking list with the list of companies that donated to legislators from the OpenSecrets data. Then we divided the 500 companies into 2 groups: companies that donated in 2012 to at least one legislator and companies that didn’t donate (or at least weren’t on the OpenSecrets list).
Our process for getting and cleaning the data was as follows:
Integration and Entity Resolution
Integrating the company names from the 2 data sources proved to be a significant challenge that we dealt with in our divideFortune500.py file. Many of the company names were not consistent across datasets. Some of the more annoying challenges we faced looked like this:
To resolve these differences, we performed the following on company names:
There were also more subtle issues that we had to resolve manually when we found companies that were suspiciously dropping off the list from 2012 to 2013. Some companies changed their names between the two years. For example, Limited Brands became L Brands. In addition, we removed companies that were acquired or merged, as we had no way of keeping track of their performance under their new names. One company (Catalyst Health Solutions) became private and was not included on the Fortune 500 list the next year as a result, so we also excluded that data point. Finally, one company (Aon) dropped off the list because it relocated to London, and the Fortune 500 only includes companies based in the US. That data point was removed as well.
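To give a feel for this kind of entity resolution, here is a small sketch of name normalization followed by the donated / didn't-donate split. It is not the contents of divideFortune500.py: the suffix list, the function names, and the specific cleaning steps are assumptions chosen for illustration. Only the Limited Brands to L Brands alias comes from the text above.

```python
import re

# Hypothetical suffixes to strip; an assumption, not our actual list.
SUFFIXES = {"inc", "corp", "corporation", "co", "company", "llc", "ltd",
            "group", "holdings"}

# Manual aliases for renamed companies (this example is from the text).
ALIASES = {"limited brands": "l brands"}


def normalize(name):
    """Reduce a company name to a comparable key."""
    name = name.lower().replace("&", " and ")
    name = re.sub(r"[^a-z0-9 ]", " ", name)      # drop punctuation
    tokens = [t for t in name.split() if t not in SUFFIXES]
    key = " ".join(tokens)
    return ALIASES.get(key, key)


def split_by_donation(fortune_names, donor_names):
    """Partition Fortune 500 names by whether a donor name matches."""
    donor_keys = {normalize(n) for n in donor_names}
    donated, not_donated = [], []
    for n in fortune_names:
        (donated if normalize(n) in donor_keys else not_donated).append(n)
    return donated, not_donated


fortune = ["Comcast Corp.", "Limited Brands", "Calpine"]
donors = ["Comcast Corporation", "L Brands Inc"]
print(split_by_donation(fortune, donors))
```

Automatic rules like these only go so far. Stripping a word such as "group" can merge two genuinely different companies, which is one reason manual checks were still needed.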
Once we had our data integrated and resolved, we analyzed it to see whether companies’ contributions were correlated with their financial growth.
To do this, we ran a t-test on the average movement in rankings from 2012 to 2013 of Fortune 500 companies that donated to winning candidates in the 2012 election (Group 1) versus the average movement in rankings of companies that didn’t donate (Group 2).
|Metric|Group 1 (Contributed)|Group 2 (Not Contributed)|
|Mean change in rankings|0.475490196078|-1.19217081851|
|Sample size|204|281|
We conducted a 2-sided t-test for the difference of means between Group 1 and Group 2.
Difference between means: approximately 1.6677 (Group 1 mean minus Group 2 mean)
With a p-value of 0.5812, there is no significant difference at an alpha level of 0.05 in the change in rankings from 2012 to 2013 between companies that contributed to the 2012 election cycle and companies that did not. In other words, we found no evidence of a correlation between short-term company performance and whether or not a company contributed to winning candidates in an election. This was bad news for our hypothesis, but we feel that it’s good news overall. Perhaps we can still hold some hope for democracy, despite the fact that many companies hold an extraordinarily large amount of influence in Congress through lobbyists and donations. However, our project only takes into account money that can be tracked publicly. There is still the matter of "dark money," which refers to money given to nonprofits or PACs where the donors and amounts cannot be tracked.
Another reason we may not have found a statistically significant difference is that our sample is relatively small and the standard deviations in our data were extremely high, which makes a small difference in means hard to detect. With far more data points than our roughly 500, a difference of this size, if real, would be easier to detect.
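For readers who want to try a comparison like this themselves, the sketch below computes a two-sided test for a difference of means using only the Python standard library. Two assumptions are worth flagging: it uses the unequal-variance (Welch) form of the statistic, which may not be the variant we ran, and it approximates the p-value with a normal distribution, which is reasonable for groups in the hundreds but not for small samples. The input lists are made-up numbers, not our data.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t_test(a, b):
    """Return (t statistic, approximate two-sided p-value).

    Uses the unequal-variance standard error and a normal
    approximation to the t distribution (large-sample assumption).
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(a) - mean(b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


# Hypothetical rank changes for two tiny groups, for illustration only.
group1 = [5, -12, 30, -8, 0, 14]
group2 = [-3, 7, -20, 11, 2, -9]
t_stat, p_value = welch_t_test(group1, group2)
print(f"t = {t_stat:.3f}, p ~ {p_value:.3f}")
```

The standard error in the denominator grows with the variance of each group, which is why very noisy rank changes make a small difference in means hard to distinguish from chance. For an exact t-distribution p-value, a statistics library such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) is the safer choice.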
We visualized the companies’ rankings from 2012 to 2014. For the purposes of the visualization, if a company drops off the list, we show it as dropping to 501. We thought that it was cool to see that a lot more companies from the top 50 donated money than companies in the bottom 100 of the list. There are also a lot more companies that dropped off the Fortune 500 list from the bottom half of the list than the top half.
We analyzed the distribution of company rankings between companies that donated versus those that didn’t donate below. Some of the rows don’t add up to their bucket size because they had a faulty data point as explained in the Integration and Entity Resolution section above.
|Ranking||# of Companies that donated||# of Companies that didn’t donate|
Analyzing the data further, we discovered that of the companies that donated (Group 1), 44% of them rose in rankings, 4% stayed the same, and 51% dropped in rankings. Of the companies that didn’t donate (Group 2), 47% rose in rankings, 3% stayed the same, and 49% dropped. This was even more surprising to us as we thought that a greater percentage of companies that donated would rise in rankings than companies that didn’t donate.
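A breakdown like this is straightforward to compute from pairs of rankings. The sketch below is illustrative only: `movement_shares` is a hypothetical helper, and apart from the two pairs that appear later in this report (364 to 459 and 312 to 161), the sample pairs are invented. Note that a smaller rank number is a better position, so a company "rises" when its number goes down.

```python
from collections import Counter


def movement_shares(rank_pairs):
    """Percent of companies that rose, stayed the same, or dropped.

    rank_pairs -- iterable of (rank_before, rank_after); lower is better.
    """
    counts = Counter()
    for before, after in rank_pairs:
        if after < before:
            counts["rose"] += 1
        elif after > before:
            counts["dropped"] += 1
        else:
            counts["same"] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no rank pairs supplied")
    return {k: round(100 * counts[k] / total) for k in ("rose", "same", "dropped")}


pairs = [(364, 459), (312, 161), (10, 10), (50, 40)]
print(movement_shares(pairs))
```

Because each share is rounded independently, the three percentages do not always sum to exactly 100.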
We found the companies that dropped and rose the most in rankings to be really interesting. In Group 1, Calpine (a natural gas and geothermal energy company) donated to 3 legislators and dropped 95 spots. The biggest winner out of Group 1 was also an energy company: Energy Transfer Equity (natural gas company in Texas), which went from 312 to 161 and donated to a single legislator in Texas.
In Group 2, the Great Atlantic & Pacific Tea Company dropped the most, as it went bankrupt. Rock-Tenn, a packaging company, rose the most, from 449 to 291.
|Movement|Group 1|Group 2|
|Biggest drop|Calpine: 364 -> 459|Great Atlantic & Pacific Tea: 317 -> bankrupt|
|Biggest rise|Energy Transfer Equity: 312 -> 161|Rock-Tenn: 449 -> 291|
In addition, we compared companies’ profit changes from 2012 to 2013 with the amount they contributed.
This was quite interesting: companies that donated lower amounts tended to lose more money, possibly because companies that donated more were doing better overall. AIG and HP were 2 of the biggest losers, and they both donated less than $25,000.
We were also curious whether the ranking of a company was correlated to how much they donated.
As you can see in the visualization above, almost all of the companies that donated $400,000 or more were top 100 companies, which makes sense as they probably had more resources to spend. Northrop Grumman, Comcast, and Goldman Sachs were the top 3 donors overall.
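We relied on the visualization for this question rather than a formal statistic. One way to put a number on the relationship would be a rank correlation between Fortune 500 position and amount donated, sketched below with the standard library. The function names and the sample figures are hypothetical, and Spearman's coefficient is a suggested technique here, not something we computed for the report.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: Fortune 500 position vs. dollars donated.
positions = [12, 45, 130, 310, 480]
donations = [450_000, 400_000, 90_000, 120_000, 20_000]
print(round(spearman(positions, donations), 3))
```

A coefficient near -1 would mean that companies with smaller rank numbers (higher on the list) tend to donate more, which is the pattern the chart suggests.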
Below is a summary of our process.
Further exploration might include analysis by country: do the results change in different countries, where corruption might be higher? We could also examine how donations influence politicians' stances on issues. For example, if a politician receives a contribution from a biased party, is that correlated with changes in the politician's stance on certain issues?