# Reduce overspending on the most popular categories

**Summary**:
The standard voting method that most cities use can systematically lead to outcomes where too much money is spent on projects of one category, and too little is spent on projects of another category.
For example, in the Bielany district of Warsaw in 2020, 35% of votes went to education projects, but only 6% of the funding went to education.
Simulations show that using the Method of Equal Shares reduces this bias significantly.

Usually, the projects on the ballot in participatory budgeting can be divided into some broad categories, such as:

- Urban greenery and parks
- Public transport and roads
- Education
- Culture and leisure

Cities often organize projects by category to make browsing the list of projects easier for voters. Warsaw, for example, categorizes projects using 10 different tags (where projects can get multiple tags). This allows us to count, for each category, how many votes projects in that category received, and compare this number to the amount of money that was spent on projects in that category.

The standard voting method used in Warsaw and most other cities simply selects the projects with the highest number of votes until the budget runs out. In some cases, this leads to the most popular categories getting most of the money, leaving too little for projects in other categories.
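The standard method can be sketched in a few lines of Python. This is a minimal illustration, not the official procedure of any city: the function name `greedy_select`, the toy projects, and the choice to skip a project that no longer fits (rather than stopping at it) are assumptions made for the example, and tie-breaking is left to the input order.

```python
def greedy_select(projects, budget):
    """Select projects by descending vote count while they fit in the budget.

    projects: dict mapping project name -> (cost, votes).
    Assumption: a project that does not fit is skipped and the next one is tried.
    """
    selected = []
    remaining = budget
    # Rank projects by number of votes, highest first (ties keep input order).
    ranked = sorted(projects, key=lambda name: projects[name][1], reverse=True)
    for name in ranked:
        cost = projects[name][0]
        if cost <= remaining:
            selected.append(name)
            remaining -= cost
    return selected


# Hypothetical toy election: one expensive project and several cheap ones.
toy_projects = {
    "park": (800, 500),
    "road": (300, 450),
    "course_a": (50, 400),
    "course_b": (50, 380),
}
print(greedy_select(toy_projects, budget=1000))
# prints ['park', 'course_a', 'course_b']
```

Note that the rule looks only at vote counts when ranking, so a single expensive project with slightly more votes can crowd out many cheaper ones.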

Using simulations, we quantified this effect. For the years 2020 to 2022, we computed the outcome that Warsaw would have obtained if it had used the Method of Equal Shares, and compared it to the actual outcome. For each category, we computed what fraction of the votes went to projects in that category. For each of the two rules, we then compared these vote fractions to the fraction of the budget that the rule spent on projects in that category. The results for the different districts of Warsaw are shown below. Each point in the gray area corresponds to an election where the Method of Equal Shares produced a budget split between categories that was closer to the vote shares than the actual outcome was.

We can see that the Method of Equal Shares would have reduced the overspending on the most popular categories in essentially all districts, and in some cases by a lot. There are a few exceptions, but in those cases the two methods performed similarly well.

Tip: you can hover over the points to see more information about the election.

## Details of the simulation

The simulation was performed in January 2023 using election data from the Pabulib library, based on participatory budgeting (PB) elections in Warsaw in the years 2020, 2021, and 2022. In each year, there were 19 elections: one for each of the 18 districts of Warsaw, and one for the city as a whole. In the simulation, we only considered the districts, for a total of 3 · 18 = 54 datasets. These datasets are annotated with categories: each project can be assigned to one or several categories. Some projects are not assigned to any category; for convenience, we treat them as belonging to a separate "no tags" category.

We next computed the **vote shares** for each category in each election.
In the first step, each voter is represented by 1 point, which is assigned in equal parts to the projects the voter voted for.
For example, if a voter voted for 4 projects (there is a maximum of 10 votes), then each of the 4 projects gets assigned 0.25 points.
In the second step, for each project, we divide its points equally between the categories (tags) that it belongs to.
For example, if a project has 30 points and belongs to 3 categories, then each category gets assigned 10 points.
Finally, we compute the fraction of the total points that each category received (i.e., we normalize the vote shares to sum to 1).
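The three steps above can be sketched as follows. This is an illustrative reimplementation under assumptions, not the code used for the simulation: the function name `vote_shares`, the input format (a list of ballots and a dict from project to tags), and the toy data are all hypothetical.

```python
from collections import defaultdict

NO_TAGS = "no tags"  # category used for projects without any tag


def vote_shares(ballots, project_tags):
    """Return a dict mapping each category to its share of the votes.

    ballots: list of lists, each holding the projects one voter approved.
    project_tags: dict mapping project name -> list of categories.
    """
    # Step 1: each voter has 1 point, split equally over their approved projects.
    project_points = defaultdict(float)
    for ballot in ballots:
        if not ballot:
            continue  # an empty ballot contributes nothing
        for project in ballot:
            project_points[project] += 1 / len(ballot)

    # Step 2: each project's points are split equally over its categories.
    category_points = defaultdict(float)
    for project, points in project_points.items():
        tags = project_tags.get(project) or [NO_TAGS]
        for tag in tags:
            category_points[tag] += points / len(tags)

    # Step 3: normalize so that the shares sum to 1.
    total = sum(category_points.values())
    if total == 0:
        return {}
    return {tag: points / total for tag, points in category_points.items()}


# Hypothetical toy data: 4 voters, 3 projects.
toy_ballots = [["a", "b"], ["a"], ["c"], ["b", "c"]]
toy_tags = {"a": ["education"], "b": ["education", "greenery"], "c": []}
print(vote_shares(toy_ballots, toy_tags))
```

In this toy election, education ends up with half of the points, the "no tags" category with 37.5%, and greenery with 12.5%.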

We then computed the outcome of the Method of Equal Shares for each election.
For both that outcome and the actual outcome, we computed the **budget shares** for each category.
For each project that was included in one of the outcomes, we split its cost equally between the categories that it belongs to.
For example, if a project has a cost of 600 and belongs to 3 categories, then each category gets assigned 200.
Finally, we compute the fraction of the total cost that each category received (i.e., we normalize the budget shares to sum to 1).
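The budget shares can be computed in the same way. The sketch below is again only an illustration under assumptions: `budget_shares`, its input format, and the toy projects are hypothetical, and the selected projects would in practice come from the voting rule being evaluated.

```python
from collections import defaultdict

NO_TAGS = "no tags"  # category used for projects without any tag


def budget_shares(selected, project_costs, project_tags):
    """Return a dict mapping each category to its share of the money spent.

    selected: list of winning project names.
    project_costs: dict mapping project name -> cost.
    project_tags: dict mapping project name -> list of categories.
    """
    category_cost = defaultdict(float)
    for project in selected:
        tags = project_tags.get(project) or [NO_TAGS]
        # Split the project's cost equally between its categories.
        for tag in tags:
            category_cost[tag] += project_costs[project] / len(tags)

    # Normalize so that the shares sum to 1.
    total = sum(category_cost.values())
    if total == 0:
        return {}
    return {tag: cost / total for tag, cost in category_cost.items()}


# Hypothetical toy outcome: project "a" costs 600 and has 3 tags,
# so each of its categories is assigned 200.
toy_costs = {"a": 600, "b": 400}
toy_tags = {"a": ["education", "greenery", "culture"], "b": ["education"]}
print(budget_shares(["a", "b"], toy_costs, toy_tags))
```

Here education receives 200 + 400 = 600 out of 1000, a budget share of 60%, while greenery and culture each receive 20%.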

Having computed the vote shares and the budget shares, we can compute, for each of the two outcomes, the **distance** between its budget shares and the vote shares.
We do this using the "L1-distance": we sum over all the categories, taking the absolute value of the difference between the vote share and the budget share in that category.
For example, if there were just 3 categories with vote shares (10%, 30%, 60%) and budget shares (20%, 30%, 50%), then the distance would be 10% + 0% + 10% = 20%.
(The marks on the axes of the scatter plot have been rescaled so that the interesting points are in a [0, 10] square, so the numbers do not carry meaning.)
Note that a lower distance means that the budget distribution is closer to the vote share distribution.
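As a small illustration, the L1-distance can be computed as below. The function name `l1_distance` and the dict-based input format are assumptions made for this sketch; categories missing from one of the two dicts are treated as having a share of 0.

```python
def l1_distance(vote_share, budget_share):
    """Sum of absolute differences between vote and budget shares.

    Both arguments are dicts mapping category -> share (fractions summing to 1).
    """
    categories = set(vote_share) | set(budget_share)
    return sum(
        abs(vote_share.get(c, 0.0) - budget_share.get(c, 0.0))
        for c in categories
    )


# The 3-category example from the text (category names are placeholders).
votes = {"A": 0.10, "B": 0.30, "C": 0.60}
budget = {"A": 0.20, "B": 0.30, "C": 0.50}
print(round(l1_distance(votes, budget), 6))  # 0.10 + 0.00 + 0.10, about 0.2
```

The distance is 0 when the budget shares match the vote shares exactly, and reaches its maximum of 2 when all the money goes to categories that received no votes.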

We also tried other distances (such as KL-divergence) but found that the L1-distance worked best. In any case, the results were similar: in the overwhelming majority of cases, the outcome of the Method of Equal Shares was closer to the vote shares than the actual outcome was.

## Example: Underspending on education projects in Bielany, Warsaw

To better understand the positive effect of the Method of Equal Shares, let's look at an example where the actual outcome performs much worse.

We consider the Bielany district of Warsaw. In each of the years 2020, 2021, and 2022, a substantial fraction of the votes went to education projects. However, the actual outcome spent very little money on education projects. For example, in 2020, 35% of the votes went to education projects, but only 6% of the budget was spent on them.

While there was underspending on education projects, there was overspending on other categories, especially public space, environment, and transit and roads. For example, in 2020, 34% of the votes went to these categories, but 55% of the budget was spent on them.

The Method of Equal Shares would have reduced the overspending on these categories (spending 43%), and would have increased the spending on education projects (spending 27%).

What is the underlying reason for this difference? The education projects were mostly small-scale and cheap (for example, organizing a course over the summer), while the projects in the categories with overspending were mostly large-scale and expensive. The method used to produce the actual outcome (the one with overspending) does not take project costs into account and spends most of the budget on a few of those expensive projects. The Method of Equal Shares, on the other hand, does take project costs into account and spends the budget more evenly across all projects.

Looking at the chart, we see a decreasing vote share on education projects over time. One could speculate that proposers of such projects are getting discouraged over time because their proposals are not successful, given the voting method currently in use.