top of page

Cities and demography in Europe (Excel, pivot table…)

  • felixcvk
  • Jan 3, 2023
  • 4 min read

The goal of this project will be to work on dataset from Eurostat. We can have an interesting insight on where people live depending on their age or gender.




That can be immensely helpful for a company that want to better target where they should advertise the most depending on the audience they products or services is. For example, some cities will have more young people than others.


Fun fact: I will found out that the city I'm originally from has the highest percentage of people above 65 years old in all of Europe !!


We are going to see step-by-step how to do data cleaning and Exploratory Data Analysis we only using Excel (Pivot table, filter, graphs, Descriptive analysis and histogram utility tool…).

Database link here.


We will be able to find answer to the following questions:

-Which European country is the most populated?

-How do the the male/female ratio change between those countries?

-What countries highest/lowest percentage of people between 20 and 24 years old and then over 65 years old?

-We will then answer the same questions for cities and try to see if there is any differences in their demography between small, medium or big cities.


1) Getting the data

The data base is really useful because it allows us to download the selected tables in a spreadsheet which is really handy. These are the selected table from Eurostat:

-Total population

-Total male population

-Total female population

-Population from 20 to 24 years old

-Population from 65 to 74 years old

-Population from 75 years old and over


We download the data and have the following spreadsheet


2) Data Cleaning

As you can see the data is raw and we need to clean it in order to be able to use it.


First, we have to separate the countries and cities rows because they are in the same column as you can see below



This is easily solved by creating a column with variable country and city. That will allow us to differenciate them with a filter. We are lucky because all the countries are on top of the column making it easy for the separation.








For each country or city we can see that there are different number for each year


We only one one column for the total. We can solve this by taking the last number available with an Excel formular


We do repeat this procedure for every other sheet and then create a new sheet with the data we need


We will now delete any duplicates (with the delete duplicates option in Excel) and the rows with missing values by filtering them out.

After the last cleaning step, we have a total of 859 rows which 26 are countries and 833 cities.


3) Simple Analysis

We have two sets of data that we have to dissociate: Countries and cities. We will first analyse the demography of the european countries and try to have a global picture.

a) Countries

Because there are not too many countries, we can have a look at the bar chart for the total population of each country. And we can easely see the ones that are the most populated.


Lets now have a look at the percentage of male in each European countries

We can see here that in most European countries have less than 50% of males. It even goes down to almost 46% of men in Latvia and Lituania. This data makes actually sense because it is widely know that women usually live longer than men which could explain this difference. Of course this is probably not the only reason. It would be great to extend this analysis.


It is also interesting to note that three countries have actually a little more men than woman (Malta, Norway and Sweden). A deeper analysis should be made why this is the case.


The next graph is about here are young people from 20 to 24 years old the most represented


And below you can see the percentage of people over 65 years old:


b) Cities

Before going into deep, it good to have an overview of the data for cities and their statistics. In Excel there is a prebuilt tool for descriptive statistics. Here are the results:












With the histogramm below we can easely see what cities are the most representated in the data.



It would be great to separate each city depending on their size. SO we going to great three categories : small (<150 000), middle(<1 000 000) and big cities (>1 000 000).


We can see the sum of each category with a pivot table.

Our dataset reveals that people in Europe predominantly reside in medium-sized cities.


Let’s analyze the city and their demography.

First about their gender:

On the left the cities with the most male and on the right the ones with the most female
On the left the cities with the most male and on the right the ones with the most female

Lets do the same but only for big cities:

On the left the cities with the most male and on the right the ones with the most female
On the left the cities with the most male and on the right the ones with the most female

By comparing the categories and their average % of males, we can see that there is no much differences.


The second part is about which city young people from 20 to 24 years old go. We can easily answer this question by calculating the percentage of the total and then filtering the column:

And indeed, we can see that Oxford or Cambridge are at the top because they are student cities.


We can also filter out the small cities and medium ones.


We can also see the differences from city sizes and will conclude that there is no much difference between them.



We are going to do the same with people over 65 years old

Highest percentage on the right and lowest on the left
Highest percentage on the right and lowest on the left

You can see the city with the highest percentage of people over 65 years old is where I am originally from. How funny!


We can now inspect the differences between sizes of cities:



And can conclude that elderly people tend to prefer small cities.


Conclusion:

We just did some data cleaning and Exploratory Data Analysis only with the help of Excel and its differents tools (formular, pivot table, graphs, histogram utility tools…etc)

It would be interesting to do the same with another part of the world.

bottom of page