This article is a part of a series intended to show how to use Memgraph on real-world data to retrieve some interesting and useful information.
We highly recommend checking out the other articles from this series which are listed in our tutorial overview section.
Football is a word that could mean one of several sports. In this article, we are referring to the best-known type of football, association football. In North America, South Africa, and Australia, to avoid confusion with other types of football, it is called "soccer".
In professional football, a transfer is the action taken whenever a player under contract moves between teams. It refers to the transferring of a player's registration from one association football club to another. In general, the players can only be transferred during a transfer window and according to the rules. The transfer window is a period during the year in which a football team can transfer players. There are two transfer windows per season: winter and summer windows. Winter transfer windows are throughout January while the summer windows are from July till August.
Usually some sort of compensation is paid for the player's rights, which is known as a transfer fee. When a player moves from one team to another, their old contract is terminated and they negotiate a new one with the team they are moving to. In some cases, however, transfers can function similarly to player trades, as teams can offer another player on their team as part of the fee.
As you may presume, there is a lot of money involved in the game of transfers. According to FIFA, in 2018, from January till September, there were 15,626 international transfers with fees totaling US$ 7.5 billion dollars.
Football season is that part of the year during which football matches are held. A typical football season is generally from August/September to May, although in some countries, such as Northern Europe or East Asia, the season starts in the spring and finishes in autumn due to weather conditions encountered during the winter.
In this article, we will present a graph model of football transfers from season 1992/1993 to season 2019/2020 in following five leagues:
- English Premier League
- French Ligue 1
- German Bundesliga
- Italian Serie A
- Spanish Primera Division
The model consists of the following nodes:
Team- a football team with a property
Player- a professional football player, contains properties
"Luka Modric") and
League- a football league where multiple teams play in, contains one property
Transfer- represents football transfer that connects a
Playerthat is transferred from one
Season. Transfer contains one optional property
80.50) that represents a transfer fee in millions of euros and one regular property
1995) that represents how old was a player when the transfer occurred.
Season- a football season with two properties
Nodes are connected with the following edges:
:TRANSFERRED_FROM- connects team node
Transferrepresenting a team where the player is being transferred from.
:TRANSFERRED_TO- connects node
Transferto team node
Teamwhere player is being transferred to.
:TRANSFERRED_IN- connects player node
Transferrepresenting a player that was transferred in the connected transfer.
:HAPPENED_IN- connects node
Transferto the node
Seasonin which transfer has happened.
:PLAYS_IN- connects node
Teamthat plays in league node
You have two options for exploring this dataset.
If you just want to take a look at the dataset and try out a few queries, open
Memgraph Playground and continue with
the tutorial there. Note that you will not be able to execute
On the other hand, if you would like to add changes to the dataset, download the
Memgraph Lab desktop application and navigate
Datasets tab in the sidebar. From there, choose the dataset
Football player's transfers and continue with the tutorial.
In the queries below, we are using Cypher to query Memgraph via the console.
Now when we have a dataset of football transfers from season 1992/1993 to season 2019/2020 loaded in Memgraph, we are ready to gain some information out of it.
1) Let's say you want to find 20 most expensive transfers. As mentioned before, transfers fees are represented in millions of euros.
2) What about finding the most expensive transfer per season?
3) How about finding out which teams your favorite player has played for? If you wish to check the teams for another player, replace "Sime Vrsaljko" with the name of your favorite player.
You might wonder why we haven't specified a direction in our Cypher traversal with
As we want to find the teams that player was transferred from (
(:Transfer)<--(:Team)) and transferred to
(:Transfer)-->(:Team)), we want to collect both inbound and outbound connections. In order to do so, we
omit the arrow (
<) in our Cypher command.
4) Find players that were transferred to and played for FC Barcelona and count them by the player game position.
5) Football has seen a lot of rivalries develop between clubs during its rich and long history. One of the most famous ones is between fierce rivals FC Barcelona and Real Madrid. There is a term, El Clasico, for a match between those two teams. Let's find all the transfers between FC Barcelona and Real Madrid.
6) FC Barcelona is one of the most valuable football clubs in the world. Players often want to play there as long as possible. But what about those players who didn't fit in well? Where do they go?
7) What are the teams that most players went to in season 2003/2004? The results may surprise you.
8) In great teams, there are players who seem to be irreplaceable. When they leave, the club board is often struggling to find a proper replacement for them. Let's find out which positions club "FC Barcelona" spent money on in season 2015/2016.
9) But what was the highest transfer amount per position FC Barcelona spent on in seasons from 1992/1993 till 2019/2020?
10) Now, let's find who were the most expensive players per position in team FC Barcelona.
If we needed to get the maximum transfer fee per position we would only need first
MATCH in the
above query, making it way shorter. In order to match players with maximum transfer fees per position
our query is split into two parts:
MATCHin the query finds the maximum transfer fee per position.
MATCHin the query is finding all players transferred to "FC Barcelona" with the same position and transfer fee equal to the maximum one from the previous query.
11) If you want to find all player transfers between two clubs you can do that also.
In the above query, we will find all players that transferred from "FC Barcelona" to "Sevilla FC". It
will include direct transfers (from "FC Barcelona" to "Sevilla FC") and indirect transfers (from "FC Barcelona"
to one or multiple other clubs and lastly "Sevilla FC"). That is the reason why we started first
searching for all players and transfers that were transferred from "FC Barcelona". Next up is the player transfer
traversal through transfers and teams all the way to the "Sevilla FC".
For this part, we used the breadth-first search (BFS) algorithm with lambda filter
(e, v | condition).
It's a function that takes an edge symbol
e and a vertex symbol
v and decides whether this edge and vertex pair
should be considered valid in breadth-first expansion by returning true or false (or Null). In the above example,
lambda is returning true if a vertex has a label
Team or a label
Transfer. If a vertex is
Transfer there is an
additional check where we need to make sure the transfer is one of the transfers of players transferred from "FC Barcelona".
It needs to be either
Transfer because to get from a team that made the transfer to
the team where the player is being transferred to, we need to go through the node
Transfer that connects those two teams.
So the traversal from "FC Barcelona" to "Sevilla FC" will go through the following nodes: Transfer, Team, Transfer, Team, Transfer, etc.
12) In the previous query, we found all transfers between two clubs. Let's filter out direct ones now. We need to add a small change in the query to only get indirect transfers.
In this query, the only difference is that we need to find players who had a direct transfer to Sevilla first.
In the next
MATCH we use that information to check whether players that were transferred from FC Barcelona,
didn't have direct transfer to Sevilla FC.
If you are running this in Memgraph Lab you can change the query a bit in order to get all nodes and edges required for a visual graph representation of players transferring through teams.
MemgraphLab graph visual representation draws nodes and edges from query results. If you only have nodes in the results then only nodes will be drawn on the canvas. If you have both nodes and edges present in the results, MemgraphLab is able to draw nodes and connections between them because it has all the information relevant for drawing.
In order to change the query to accommodate that, we need to change the types of results that are returned
and collect any missing edge or node information throughout the query. The first part of the query where we
check whether the player was transferred from "FC Barcelona" to "Sevilla FC" stays the same. In order to draw
all connections from players to transfers, we need to collect edges connecting them. That is the reason why we
are collecting edges
e through variable
player_to_transfers because it contains information on which player
is connected to which transfer.
With that in mind, our results contain all the information for the graph visual:
- A path that contains
Teamnodes, and all the edges collected on the
- A list of
- A list of
Player - Transferedges
Here is a picture of how it will look if you run the query in MemgraphLab.