I have a dataset that contains information about FIFA players, with the following columns:
Player id: Unique identifier for each player.
Player: Name of the player.
Position: Position played by the player (e.g., Forward, Midfielder, Defender, Goalkeeper).
Number: Jersey number of the player.
Club: Club the player belongs to.
Club (country): Country where the player’s club is located.
D.O.B: Date of birth of the player in DD.MM.YYYY format.
Age: Age of the player during the 2014 FIFA World Cup.
Height (cm): Height of the player in centimeters.
Country: National team country of the player.
Caps: Number of times the player has represented their national team in international matches before the 2014 FIFA World Cup.
International goals: Number of goals scored by the player for the national team before the 2014 FIFA World Cup.
Plays in home country?: Indicates whether the player plays for a club in their home country (TRUE/FALSE).
The task is to create a property graph, to be implemented in Neo4j, that can answer a couple of questions. I've used Python (pandas) to split the CSV into three tables: player nodes, club nodes, and a relationship called "belong_to".
Code:
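# players_table is the DataFrame loaded from the CSV
players_node_table = players_table.loc[:, ['Player id', 'Player', 'Position', 'D.O.B', 'Age', 'Height (cm)', 'Country', 'Caps', 'International goals']]
# select the club columns from players_table first, then keep one row per club
club_node_table = players_table.loc[:, ['Club', 'Club (country)']].drop_duplicates(subset=['Club', 'Club (country)'])
belong_to_rel_table = players_table.loc[:, ['Player id', 'Club']]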
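For reference, a minimal self-contained sketch of the same split on a toy table. The rows and the reduced column set here are made up for illustration; the real CSV has all the columns listed above.

```python
import pandas as pd

# Toy stand-in for the real CSV; these rows are invented for illustration.
players_table = pd.DataFrame({
    "Player id": [1, 2, 3],
    "Player": ["Player A", "Player B", "Player C"],
    "Position": ["Forward", "Midfielder", "Defender"],
    "Club": ["Club X", "Club X", "Club Y"],
    "Club (country)": ["Spain", "Spain", "England"],
})

# One row per player: these become the player nodes.
players_node_table = players_table.loc[:, ["Player id", "Player", "Position"]]

# One row per distinct club: these become the club nodes.
club_node_table = (
    players_table.loc[:, ["Club", "Club (country)"]]
    .drop_duplicates(subset=["Club", "Club (country)"])
    .reset_index(drop=True)
)

# One row per (player, club) pair: these become the belong_to relationships.
belong_to_rel_table = players_table.loc[:, ["Player id", "Club"]]
```

With this toy input, two players share one club, so the club table ends up with two rows while the relationship table keeps one row per player.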
Once I import this into Neo4j, the nodes and relationships exist in the database, but the relationships are duplicated. For example, Neymar has a "belong_to" relationship with FC Barcelona 13 times.
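A quick way to narrow this down is to check whether the relationship table itself already contains repeated (player, club) pairs before it is imported. The sketch below uses a made-up table with one deliberately repeated row; it is only a diagnostic under that assumption, not a confirmed cause of the problem.

```python
import pandas as pd

# Invented relationship table with one deliberately repeated pair (2, "Club X").
belong_to_rel_table = pd.DataFrame({
    "Player id": [1, 2, 2, 3],
    "Club": ["Club X", "Club X", "Club X", "Club Y"],
})

# Count how many times each (player, club) pair occurs.
pair_counts = belong_to_rel_table.groupby(["Player id", "Club"]).size()

# Pairs that occur more than once would each produce extra relationships.
repeated_pairs = pair_counts[pair_counts > 1]
print(repeated_pairs)

# Keep a single row per pair before exporting the table for import.
deduped = belong_to_rel_table.drop_duplicates(subset=["Player id", "Club"])
```

If every pair appears exactly once in the real table, the duplication is more likely introduced during the import step itself (for example, if the import ran more than once or matched nodes on a non-unique key such as the club name), which this check cannot show.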
Is there another way I could implement this so that each player gets only one relationship to their club?
Any help is appreciated!