Exploring Argentina's 2023 Election Data

Exploring Argentina's 2023 Election Data

Our data exploration begins with Argentina's 2023 election dataset. This blog post serves as the prelude to our analysis, as we initiate the transformation from CSV to SQL, preparing for a more profound understanding of the data.

Step 1: Creating a New Database

To start, I needed a place to store the election data, and what better way to do it than by creating a database? I used SQL to define a table called election_data, which would be the home for all the information I was about to import.

CREATE TABLE election_data (
    año TEXT,
    eleccion_tipo TEXT,
    recuento_tipo TEXT,
    padron_tipo TEXT,
    distrito_id TEXT,
    distrito_nombre TEXT,
    seccionprovincial_id TEXT,
    seccionprovincial_nombre TEXT,
    seccion_id TEXT,
    seccion_nombre TEXT,
    circuito_id TEXT,
    circuito_nombre TEXT,
    mesa_id TEXT,
    mesa_tipo TEXT,
    mesa_electores TEXT,
    cargo_id TEXT,
    cargo_nombre TEXT,
    agrupacion_id TEXT,
    agrupacion_nombre TEXT,
    lista_numero TEXT,
    lista_nombre TEXT,
    votos_tipo TEXT,
    votos_cantidad TEXT
);

Step 2: Importing the CSV File

With the database structure in place, it was time to import the dataset. I used the sqlite3 command to load the data from the CSV file into the election_data table.

sqlite3 election_data.db <<EOS
.mode csv
.import ResultadosElectorales_2023.csv election_data
EOS

Step 3: Transforming Data Types

One of the key tasks in data analysis is ensuring that the data types are correct. I wanted to work with integers for various columns, so I used sqlite-utils to transform the data types in the election_data table.

sqlite-utils transform election_data.db election_data \
    --type año integer \
    --type distrito_id integer \
    --type seccionprovincial_id integer \
    --type seccion_id integer \
    --type circuito_id integer \
    --type mesa_id integer \
    --type cargo_id integer \
    --type agrupacion_id integer \
    --type lista_numero integer \
    --type votos_cantidad integer

Exploring the Data

Now that the data was ready, I set out to uncover some insights. Here are a couple of example queries I ran:

Total Votes per Party

SELECT
    agrupacion_nombre,
    SUM(votos_cantidad) AS total_votes,
    ROUND((SUM(votos_cantidad) * 100.0) / (SELECT SUM(votos_cantidad) FROM election_data), 2) AS percentage
FROM
    election_data
GROUP BY
    agrupacion_nombre
ORDER BY
    total_votes DESC;

Total Votes per Party per District

SELECT
    distrito_nombre,
    agrupacion_nombre,
    SUM(votos_cantidad) AS total_votes,
    ROUND((SUM(votos_cantidad) * 100.0) / (SELECT SUM(votos_cantidad) FROM election_data), 2) AS percentage
FROM
    election_data
GROUP BY
    distrito_nombre,
    agrupacion_nombre
ORDER BY
    distrito_nombre,
    total_votes DESC;

These queries allowed me to gain insights into the election results, including the total votes for each party and how they varied across districts.

Conclusion

Our transition from CSV to SQL signifies the inception of a systematic data analysis journey. The dataset now stands poised for deeper exploration. The electoral data of 2023 is no longer raw information but a structured resource for rigorous analysis. As we close this phase, let's remember that this is just the first step. With SQL, we are equipped to unveil the intricacies of the electoral landscape, allowing data-driven insights to guide us in the quest for deeper understanding.

You can download the dataset from the Argentina government's website.

In future explorations, I hope to uncover more insights and create compelling visualisations to share with others. Data analysis is a powerful tool, and it's an exciting adventure every time.

Stay tuned for more data journeys!


🤖 Acknowledgement: AI technology was used in the production of this blog post.

Did you find this article valuable?

Support Diego Peralta by becoming a sponsor. Any amount is appreciated!