The French Election Results Database (FERD)

Florent GOUGOU

No. 3 / 2025

Data for the Fifth Republic

Abstract

This article introduces data produced within the French Election Results Database (FERD) framework. French election results are generally analysed according to official ‘political nuances’ (or political shades) defined by the Ministry of the Interior. However, datasets based on official categories have major limitations because of the administrative logic underlying the Ministry’s identification of candidates. In this article, which focuses on the Fifth Republic, I describe the data sources and specify the logic behind the coding of candidates that forms the basis of the FERD project. I also detail the structure of the files available and indicate how to obtain them via the Centre for Socio-Political Data (CDSP) databank.

Keywords

Election results

Candidates

Data collection

French elections

French politics

Introduction: the main objectives of the FERD project

Elections are essential to the functioning of modern liberal democracies. To guarantee the integrity of the electoral process and strengthen citizens’ confidence in political institutions, public authorities are responsible for organizing elections and disseminating the associated information with transparency. In France, the Ministry of the Interior is the institution in charge of the electoral process. To do so, it relies on both its central administration and the decentralized services of the State. At the central level, the Bureau of Elections and Political Studies (BEEP) coordinates the organization of the electoral process; at the local level, the prefectures manage the practical aspects of the elections by collecting declarations of candidacy and coordinating the operations carried out in each polling station.

The data collected and disseminated by the Ministry of the Interior are obviously valuable for scientific research. The keystone of these official data is the codebook of ‘political nuances’ (e.g. extreme left, miscellaneous left, center, right, extreme right) that the Ministry establishes before each election and shares with the prefectures to enable the political identification of candidates. These political shades are, however, limited to about 20 categories, whereas France counts more than 500 parties. Two main reasons explain this strategy. First, the Ministry of the Interior needs to communicate aggregate national results that are easy to understand, which leads to a relatively low number of categories. Second, the Ministry is responsible for public financing, so that the categories chosen more often reflect the financial affiliations declared by candidates than the political parties competing in elections. Such administrative considerations are not relevant for scientific research. As a result, the official political identification of candidates is not necessarily suited to the research questions that the social sciences address using election results. The objective of the French Election Results Database (FERD) project is precisely to fill this gap, by enriching the data of the Ministry of the Interior and using an original codebook for classifying candidates that is applicable to all elections by direct suffrage: presidential, legislative, regional, cantonal/departmental, municipal and European.

Beyond the strategies that lead each candidate to claim (or deny) a particular label, the codebook used to identify the political orientation of candidates raises two distinct scientific issues. The first concerns the description of the political offer for each election: a relevant codebook must take into account both the partisan affiliations of the candidates and the pre-election alliances forged between parties. The second issue concerns the comparability of the political offer between different types of election and over time: the codebook must make it possible to compare the results of several elections.

To meet this challenge, the FERD project uses a codebook including three levels of coding for each election. Level 1 coding corresponds to all the shades assigned for the election covered. Level 2 coding corresponds to simplified codes that make it easier to read the results of the election in question, while retaining the specific features that shape the political offer. Level 3 coding corresponds to aggregated codes that make it possible to compare results between different types of election and over time. For each election, the codebook describes how these three levels of coding work together.

Data collection

The FERD project is based on six different types of data: declarations of candidacy filed by candidates, declarations of affiliation to political parties or groups, official party nominations, candidates’ personal statements, information on candidates published in the press, and voting results collected by the Ministry of the Interior. In this first part, I describe these types of data, explain how they are obtained and detail how the information gathered is processed. These data are not available and/or relevant for all elections: Figure 1 specifies the elections for which each type is used.

Declarations of candidacy

Declarations of candidacy refer to the administrative forms filed by individuals in order to run for election. These declarations are recorded by different institutions according to the type of election: the Constitutional Council for the presidential election, the Ministry of the Interior for European elections, and the prefectures for other elections.

Since the introduction of its Elections Information System (SIE) prior to the 1992 cantonal elections, the Ministry of the Interior has compiled a dataset of candidates ("livre des candidatures") for each election, with the exception of presidential elections. Such datasets compile three types of information: information relating to the administrative registration of candidacies (filing number, constituency code, candidate’s number for public campaigning), information relating to the identification of candidates (gender, last name, first name, date of birth, profession, mandates) and information relating to political positioning (the official political nuance).

Since the 2014 municipal elections, candidates’ datasets have been published on data.gouv the week after the candidacy period ends. For previous elections, similar datasets can be obtained from the BEEP. Datasets are made available as CSV files, where each line corresponds to a candidate. Data may be subject to corrections if candidates contest information or if candidacies are annulled by the courts. All information retained in the FERD project is taken from the latest file compiled by the Ministry of the Interior.

The official political nuance assigned to candidates by the prefectural services is the basic information used to compute national election results. Its allocation is also the main problem for scientific research and the reason why the FERD project was set up.

Declarations of financial affiliation to groups or parties

Under the Fifth Republic, several pieces of legislation have been designed to guarantee political parties’ fair access to the media and to regulate their funding. Under these provisions, declarations of candidacy for some elections are accompanied by declarations of affiliation to political parties or groups.

Financial affiliation is a specific feature of legislative elections. It is a special declaration made as part of the allocation of public funding to political parties. When filing their candidacy, people standing for election can choose to support a funding association declared to the Ministry of the Interior prior to the election, so that the votes they obtain in the first round count towards the public funding of the party or group concerned.

Financial affiliation files are drawn up by the Ministry of the Interior and published on its website. These files contain information relating to the identification of candidates (constituency code, surname, first name, gender, date of birth) and information relating to the funding association. These files do not strictly match candidate datasets: candidates who do not join a funding association are not included. Financial affiliation files are made available as Excel or PDF files.

Financial affiliation is a valuable indicator for identifying the official candidates of political parties. However, it must be systematically cross-checked with other information: technical agreements may be made between parties to meet the conditions for access to public funding (regarding the number of candidates and their geographical distribution), so that financial affiliations do not necessarily reveal the partisan affiliations of candidates.

Party nominations

Nominations are the result of the political parties’ internal candidate selection processes. In other words, they are the acts by which political parties officially designate their candidates. Nominations are not an obligation under the electoral rules; an individual does not need a party’s nomination to stand in an election.

Political parties draw up lists of party nominations that systematically include constituency codes and the surname and first name of the endorsed candidates. These lists may include information on candidates’ party affiliation and are distributed by the parties themselves ahead of the elections, usually via their websites or social networks. Where this is not the case, they are obtained from the parties’ election officers. They may take the form of HTML pages, Excel workbooks or PDF files.

Nominations are a key element in the coding of candidates. However, lists published by parties are not always perfectly reliable; some are incomplete, while others contain errors in the partisan affiliations of candidates. Nominations received by candidates are therefore always checked against other data sources.

Candidates’ statements

Candidates’ statements are drawn up by individuals standing for election to present their candidacy to the electorate. These statements are one-sided A4 documents, printed at the candidates’ expense and sent to each voter’s home address at the State’s expense.

Since the 2015 regional elections, the Ministry of the Interior has been gathering candidates’ statements in digital format on a dedicated portal: programme-candidats.interieur.gouv.fr. Candidates’ statements concerning French citizens living abroad are published on diplomatie.gouv.fr, the website of the Ministry of Foreign Affairs. All such files used in the FERD project have been web scraped.[1] For previous elections, some of the documents have been archived by the late Centre d’études de la vie politique française (Cevipof) and made available online via the Archelec portal: archelec.sciencespo.fr.

Candidates’ statements are very rich documents, which may include details of the candidates’ political background and their political opinions. Moreover, statements from candidates belonging to the same party or the same pre-election coalition generally use standardized messages and common iconographic elements (including party logos), which make it possible to bring together candidates from different constituencies. These elements are crucial in the political identification of candidates.

Press articles

The press is used as a complementary source to identify candidates, particularly unknown candidates in the legislative and departmental elections. The articles used in the project come mainly from the local press, which has in-depth knowledge of local politics.

Press articles used in the FERD project are accessed online or via databases, in particular Europresse. However, the press is not perfect when it comes to identifying candidates, and journalists frequently use the official political shades assigned by the Ministry of the Interior. Information obtained from the press is therefore systematically cross-referenced with other available data sources.

Election results published by the Ministry of the Interior

The election results as published by the Ministry of the Interior are compiled on the basis of the reports drawn up in each polling station and validated by the electoral commission. Since the 1992 cantonal elections, these results have been digitized, so that each level of territorial authority (polling station, commune, canton, legislative constituency, department, region) is covered by a specific file. For earlier periods, election results were initially found in printed documents (notably the white papers of the Ministry of the Interior) and were partly digitized as part of other research infrastructures, in particular the Banque de données socio-politiques (BDSP) in Grenoble.

Election results files published by the Ministry include the number of people registered, the number of voters, the number of blank ballot papers, the number of invalid ballot papers and the number of votes obtained by each candidate. When the electoral district is equal to or larger than the territorial unit referred to in the file, the results are allocated directly to the candidates. When the electoral district is smaller than the territorial unit in the file, the results are aggregated on the basis of the political nuances assigned by the Ministry of the Interior.

Since the 2014 municipal elections, the results files compiled by the Ministry of the Interior have been published on data.gouv at the beginning of the week following election day. These are CSV files, in which each line corresponds to a territorial unit. Corrections may be made once the reports drawn up in each polling station have been checked by the electoral commission. Information retained in the FERD project is taken from the file compiled by the Ministry of the Interior after corrections.

Methods: coding the political nuance of candidates

The FERD project is based on three levels of coding: first, a precise coding, which includes all the shades assigned to understand the structuring of the political offer for a specific election (Level 1); second, a simplified coding, which facilitates the presentation of results while preserving the specific logics of the political offer (Level 2); and third, an aggregated coding, which allows the comparison of results between types of elections and over time (Level 3). A codebook of political shades is available for each election, including the links between the three levels of coding. This second part explains the general principles of coding and the specific features of the three levels.

Election specific individual coding: Level 1

Level 1 coding is the basic coding of all candidacies. It systematically involves cross-referencing the data sources presented in the first part of this article and is accompanied by the creation of a codebook.

The general principle of Level 1 nuances is to combine information relating to pre-election alliances and information relating to the individual party affiliation of candidates. These shades can therefore take two distinct forms. If the candidate is standing as part of an alliance between several parties, the form of the code is: ALLIANCE (PARTY). If the candidate does not come from an alliance, the code form is: PARTY. In this way, for the 2024 legislative elections, a candidate who is a member of the Socialist Party (PS) and who is nominated as part of the left-wing alliance Nouveau Front Populaire (UG) is coded UG (PS). On the other hand, a member of the PS standing as a candidate in a constituency not covered by the agreement between the left-wing parties is simply coded PS. Finally, a candidate who is not officially a member of a party but is nominated by the PS, without being part of a pre-election alliance, is coded DVG-PS. In the case of an electoral system based on lists, the coding logic is similar, but a single nuance is assigned to each list: the party affiliation used is that of the list leader.
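The two code forms can be illustrated with a minimal Python sketch. The function names `level1_code` and `parse_level1` are hypothetical and are not part of the FERD files; the sketch only assumes the two forms stated above, ALLIANCE (PARTY) and PARTY.

```python
from typing import Optional


def level1_code(party: str, alliance: Optional[str] = None) -> str:
    """Build a Level 1 shade: 'ALLIANCE (PARTY)' within an alliance, else 'PARTY'."""
    return f"{alliance} ({party})" if alliance else party


def parse_level1(code: str) -> tuple[Optional[str], str]:
    """Split a Level 1 shade into (alliance, party); alliance is None if absent."""
    if code.endswith(")") and " (" in code:
        alliance, party = code[:-1].split(" (", 1)
        return alliance, party
    return None, code


# Examples taken from the 2024 legislative elections described in the text.
in_alliance = level1_code("PS", alliance="UG")   # PS member nominated by the alliance
outside_alliance = level1_code("PS")             # PS member outside the agreement
parsed_alliance = parse_level1("UG (PS)")
parsed_single = parse_level1("DVG-PS")           # hyphenated codes stay whole
```

Keeping the alliance and the party as separate components in this way makes it possible to tabulate results either by alliance or by party without recoding.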

The codebook is built up inductively as coding progresses. It is based on the observation of the political offer in order to give the most detailed account of the configuration of candidacies. The first stage consists of identifying the contours of pre-election alliances, at both national and local level (alliances are often forged at the departmental level, the basic territorial unit for many parties). The second stage consists of identifying the candidates representing these alliances and providing information on the nominations received by the candidates. The third stage consists of identifying the party affiliation of each candidate. When sources are not consistent, party nominations and candidates’ statements are preferred.

Election specific aggregated coding: Level 2

The objective of Level 2 coding is both to describe the structuring of the political offer and to reflect the electoral balance of power in an election with a limited number of codes. It consists of simplifying the Level 1 coding by grouping together the rare categories.

The general principle of Level 2 shades is similar to Level 1 shades: the codes combine information on pre-election alliances between parties and information on the party affiliation of candidates. The form of the code is either ALLIANCE (PARTY) or PARTY.

The Level 2 codebook is built upon the Level 1 codebook. Except for generic labels such as regionalist, miscellaneous left or miscellaneous right, the basic rule is not to include political shades with fewer than 30 candidates. For European, regional and presidential elections, given the small number of candidacies recorded, no general coding simplification rule linked to the number of observations is used a priori.
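As an illustration only, the threshold rule for elections with many candidacies could be sketched as follows. The text does not specify which code each rare shade is grouped into, so the `fallback` lookup is an assumption, as are the function name and the set of generic label codes (`REG` and `XYZ` are invented for the example).

```python
from collections import Counter

MIN_CANDIDATES = 30                      # threshold stated in the text
GENERIC_LABELS = {"REG", "DVG", "DVD"}   # hypothetical codes for generic labels


def simplify_to_level2(level1_shades, fallback):
    """Return a {Level 1 shade: Level 2 shade} mapping.

    Generic labels and shades with at least MIN_CANDIDATES candidates are
    kept as they are. Rarer shades are sent to fallback[shade], a hand-made
    (hypothetical) lookup giving the broader code each one is grouped into.
    """
    counts = Counter(level1_shades)
    mapping = {}
    for shade, n in counts.items():
        if shade in GENERIC_LABELS or n >= MIN_CANDIDATES:
            mapping[shade] = shade
        else:
            mapping[shade] = fallback[shade]
    return mapping


# Illustrative input: one shade per candidate (figures are invented).
shades = ["PS"] * 40 + ["XYZ"] * 3 + ["DVG"] * 5
level2 = simplify_to_level2(shades, fallback={"XYZ": "DVG"})
```

In this sketch, `PS` is kept because it has 40 candidates, `DVG` is kept as a generic label despite having only 5, and the rare shade `XYZ` is grouped into `DVG`.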

Comparative longitudinal coding: Level 3

The purpose of Level 3 coding is to compare election results over time and between different types of elections. Given the multi-party structure of French politics, the Level 3 codebook includes around twenty codes. These codes can be associated with parties that put forward candidates at almost every election (such as Workers’ Struggle, the Socialist Party or the Rassemblement National) or with generic political tendencies (such as regionalists, miscellaneous left, independent ecologists or miscellaneous right).

The general principle of Level 3 political nuances is to simplify Level 2 nuances. In this way, Level 3 codes have a single form: NUANCE. These codes are also intended to last over time and to represent a large number of candidates over several elections. For example, over the period 2012-2024, the Level 3 codebook includes 25 codes covering all national, local and European elections (Table 1). These codes are listed from left to right.

Comparing coding procedures

One major contribution of the FERD project is to better identify candidates running for election. Two examples from recent elections demonstrate how valuable this work is. The first one concerns the 2017 legislative elections. In the official dataset established by the Ministry of the Interior, 914 candidates have been labelled ECO, i.e. ecologists. In the FERD datasets, 39 different Level 1 nuances have been used to identify these 914 candidates, corresponding to 18 Level 2 nuances and 8 Level 3 nuances. The Level 3 coding reveals that 451 of these 914 candidates (49.3%) are actually from the Green Party (EELV), whereas 398 are authentic miscellaneous ecologists (43.5%), 34 miscellaneous left, 11 regionalists and so on.

The second example comes from the 2024 legislative elections. In the official dataset, 187 candidates have been labelled DVD, i.e. miscellaneous right. In the FERD coding, 41 Level 1 nuances have been used to classify these 187 candidates, corresponding to 25 Level 2 nuances and 13 Level 3 nuances. The Level 3 coding indicates that 71 of these 187 candidates (38.0%) are actually from LR and its allies and only 58 are authentic miscellaneous right, whereas 6 should have been labelled extreme right and 4 RN.
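The shares quoted in these two examples can be recomputed from the counts given in the text, taking the number of candidates carrying the official label as the denominator (914 for ECO in 2017, 187 for DVD in 2024). The short Python sketch below does this; the helper name `shares` and the dictionary keys are illustrative, and only the categories explicitly mentioned above are included.

```python
def shares(counts, total):
    """Percentage of `total` represented by each count, rounded to one decimal."""
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# 2017 legislative elections: Level 3 recoding of the 914 official ECO candidates.
eco_2017 = shares(
    {"EELV": 451, "miscellaneous ecologists": 398,
     "miscellaneous left": 34, "regionalists": 11},
    total=914,
)

# 2024 legislative elections: Level 3 recoding of the 187 official DVD candidates.
dvd_2024 = shares(
    {"LR and allies": 71, "miscellaneous right": 58,
     "extreme right": 6, "RN": 4},
    total=187,
)

print(eco_2017["EELV"], eco_2017["miscellaneous ecologists"])  # 49.3 43.5
print(dvd_2024["LR and allies"], dvd_2024["miscellaneous right"])  # 38.0 31.0
```

The listed categories do not sum to the totals, since the remaining candidates are spread over the other Level 3 nuances not detailed in the text.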

Datasets production

The main empirical contribution of the FERD project is the political identification of candidates based on the three levels of coding described above. The main methodological contribution is to provide two types of reliable and manageable data files on French elections: candidate files and results files. The FERD results files can be directly used for statistical analysis, whereas the original files from the Ministry require a great deal of editing before being used for research purposes.

Data consolidation

Prior to the production of the datasets, the original data distributed by the Ministry of the Interior are systematically subjected to consistency tests. These tests mainly concern the internal consistency of the data: (1) for each observation, the number of people registered must be greater than or equal to the number of voters, which must itself be greater than or equal to the number of votes cast; (2) for each observation, the sum of the votes obtained by each candidate must be strictly equal to the number of votes cast. These tests also concern the external consistency of the data: the total number of registered voters, voters, votes cast and votes for the various candidates must be identical, whatever the territorial unit of the results.
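A minimal Python sketch of these checks is given below. It is not the project's actual test code: the column names (`registered`, `voters`, `votes_cast`, `c1`, `c2`) and the figures are hypothetical, and each observation is represented as a plain dictionary.

```python
def internal_errors(row, candidate_cols):
    """Return the internal consistency rules violated by one observation."""
    errors = []
    # Rule (1): registered >= voters >= votes cast.
    if not row["registered"] >= row["voters"] >= row["votes_cast"]:
        errors.append("registered >= voters >= votes cast violated")
    # Rule (2): candidate votes must sum exactly to votes cast.
    if sum(row[c] for c in candidate_cols) != row["votes_cast"]:
        errors.append("candidate votes do not sum to votes cast")
    return errors


def externally_consistent(level_a, level_b, columns):
    """Check that column totals are identical across two territorial levels."""
    return all(
        sum(r[c] for r in level_a) == sum(r[c] for r in level_b)
        for c in columns
    )


# Two invented polling stations; the second one has a one-vote discrepancy.
stations = [
    {"registered": 1000, "voters": 600, "votes_cast": 580, "c1": 300, "c2": 280},
    {"registered": 500, "voters": 300, "votes_cast": 290, "c1": 150, "c2": 139},
]
# The same area reported as a single constituency-level row.
constituency = [
    {"registered": 1500, "voters": 900, "votes_cast": 870, "c1": 450, "c2": 419},
]

report = [internal_errors(s, ["c1", "c2"]) for s in stations]
same_totals = externally_consistent(
    stations, constituency, ["registered", "voters", "votes_cast", "c1", "c2"]
)
```

In this invented example the totals agree across levels even though one polling station fails rule (2), which shows how an error in a single report can be carried over unchanged to higher territorial units and why both kinds of test are needed.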

Since the introduction of the information system developed by the BEEP, anomalies have been rare but not non-existent. Inconsistencies and corrections made are systematically documented in the ‘Notes’ column of the codebook. The 2024 legislative elections provide an illustration of the process. During the test phase, inconsistencies were detected between the total number of votes cast and the sum of the votes obtained by each candidate, for both the first and second rounds. The polling stations concerned were identified (4 in the first round, 1 in the second round) and further tests showed that these basic errors were then transferred to the various levels of territorial units. As the anomalies were not corrected by the electoral commissions, which are responsible for verifying the reports drawn up in the polling stations, the results were proclaimed as they stood and the Ministry of the Interior was no longer authorized to make any changes. The necessary corrections to the number of votes cast were made as part of the data consolidation and documented in the FERD results files.

Candidate files production

The FERD candidate files retain and extend the structure of the candidate files compiled by the Ministry of the Interior. All the variables initially included in the Ministry’s file are kept as they are: billboard number, registration number, sex, surname, first name, official political nuance assigned by the Ministry, date of birth, profession and mandates held. Two other variables from the Ministry’s results files are added: the number of votes in the first round and (where applicable) the number of votes in the second round. The new variables added as part of the project are of two types. The first type is the political nuances according to the three coding levels: for each candidate, there are Level 1, 2 and 3 shades. The second type is the status at the end of each round of election: a status at the end of the first round (qualified, eliminated, defeated, elected), a status at the end of the second round (withdrawn, defeated, elected) and, in the event of victory, the round of election where it occurred.

Candidate files have a single structure, regardless of the type of election or electoral system: each line in the file corresponds to a candidate. In the case of elections based on lists of candidates, additional variables are added to describe the list on which the candidate is standing: name of the list, name of the list leader, position of the candidate on the list and nuance of the list. Table 2 summarizes the basic variables present in the candidate files and their meaning.

Results datasets production

The FERD results datasets include the number of registered voters, the number of voters, the number of blank votes, the number of invalid votes and the number of votes cast, as published in the Ministry of the Interior results datasets, but their structure is modified to make them directly usable. Table 3 shows how the FERD and the Ministry datasets are structured, using four legislative constituencies in the first round of the 2024 legislative elections as illustrations.

In the Ministry of the Interior results datasets, the variables indicating the number of votes correspond to the number assigned to the candidates: the first variable refers to the number of votes for candidate 1, the second variable to the number of votes for candidate 2, and the Xth variable to the number of votes for candidate X. In the FERD files, the votes variables correspond to Level 2 or Level 3 shades: the first variable lists the number of votes for shade 1, the second variable the number of votes for shade 2, and the Xth variable the number of votes for shade X.
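The change of structure can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually used to build the FERD files: the function name, the unit identifier, the shade assignments and the vote counts are all invented, and votes for candidates sharing a shade within the same territorial unit are assumed to be summed.

```python
from collections import defaultdict


def votes_by_shade(unit_id, votes_by_number, shade_of):
    """Re-express one territorial unit's votes by shade instead of candidate number.

    votes_by_number: {candidate number: votes}, as in the Ministry layout.
    shade_of: {(unit id, candidate number): shade}, taken from a candidate file.
    """
    totals = defaultdict(int)
    for number, votes in votes_by_number.items():
        totals[shade_of[(unit_id, number)]] += votes
    return dict(totals)


# Hypothetical constituency with three candidates, two of whom share a shade.
shade_of = {("0101", 1): "UG", ("0101", 2): "RN", ("0101", 3): "UG"}
ministry_row = {1: 12000, 2: 15000, 3: 800}

ferd_row = votes_by_shade("0101", ministry_row, shade_of)
```

Because the resulting columns are shades rather than candidate numbers, rows from different territorial units share the same columns and can be stacked or aggregated directly.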

The results datasets have a common structure, regardless of the type of election and the electoral system: each line in the file corresponds to a territorial unit. Table 4 summarizes the basic variables included in the results datasets and their origin.

Data use and archiving

The aim of the FERD project is to give the scientific community access to reliable and manageable data on French election results. All datasets are deposited in one of the main quantitative political science data repositories in France, data.sciencespo.fr. The CDSP is responsible for curating data and compiling metadata. Both the data and the associated documentation are accessible under the terms of the CC BY-SA 4.0 license.

Data access and use

The data are all deposited in the FERD collection of the CDSP research data repository. The following link provides access to the collection: https://data.sciencespo.fr/dataverse/ferd.

For each election covered, one Candidacies dataset and several Results datasets can be downloaded in Excel format. The results datasets are available with Level 2 and Level 3 coding for each relevant level of territorial unit. The variable and nuance codebooks are inserted directly into the files. A ReadMe sheet is included to introduce the title and the version of the dataset. Each dataset has its own DOI; for example, the Candidacies dataset of the 2024 legislative elections is available at https://doi.org/10.21410/7E4/K5CO76.

For an initial overview of the potential uses of FERD data, see several articles published in the Data and Methods section of the academic journal French Politics (Gougou, 2008; Gougou and Labouret, 2010, 2011, 2013). These articles were based on datasets which largely inspired the contours of the current project. An article in West European Politics also uses FERD data (Gougou, 2025).

Data citation

When using the FERD collection of datasets or part of it, please cite this manuscript. When specificities are described in the codebook of one specific dataset, it is recommended to cite the dataset as well. For any questions, suggestions or requests for collaboration regarding the FERD project, do not hesitate to email me.

Data protection

The FERD project deals with personal data, even though candidacies are public and most of the information used to identify the political affiliation of candidates is available from open data sources. To ensure the project complies with the professional and ethical rules of social science research, it has been registered with the CNRS Data Protection Officer. Its registration certificate was obtained on 16 September 2024 under number 2-24212.

A project web page, accessible via this link, describes how candidates can access data and, if necessary, request corrections.

References

Gougou, F. (2008). The 2008 French Municipal Elections: The Opening and the Sanction. French Politics, 6(4), 395-406.

Gougou, F. (2025). The 2024 French Legislative Elections: Maintaining Elections, Political Crisis. West European Politics, 48(3), 723-737.

Gougou, F. & Labouret, S. (2010). The 2010 French Regional Elections: Transitional Elections in a Realignment Era. French Politics, 8(3), 321-341.

Gougou, F. & Labouret, S. (2011). The 2011 French Cantonal Elections: The Last Voter Sanction Before the 2012 Presidential Poll. French Politics, 9(4), 381-403.

Gougou, F. & Labouret, S. (2013). Revisiting Data on the 2012 French Legislative Elections: Political Supply, Party Competition and Territorial Divisions. French Politics, 11(1), 73-97.

Notes

[1] My warmest thanks to Benjamin Ooghe-Tabanou from the Médialab at Sciences Po Paris for his invaluable help in collecting these candidates’ statements.

Author

Florent GOUGOU

Associate Professor (Maître de conférences)
Sciences Po Grenoble - UGA, Pacte
