I have a very large excel file that contains lots of data that I need to process. Normally I would do it in Excel but that would take me hours and I know that Pandas does it within seconds.
The first column “PlayerID” needs to be normally sorted.
The second column “Country” contains names of countries. In Excel I would filter here country by country to proceed.
The third column contains states of each country, there could be a random number of states for this country, there are states that have either up to 12 players or up to 24 players
The fourth column contains color names that are connected to the PlayerID and would have to be also sorted but not by alphabet but by a given color order (if up to 12 players per state:
red-green-blue-yellow-white-grey-brown-violet-turquois-black-orange-pink
or if up to 24 players:
red-green-blue-yellow-white-grey-brown-violet-turquois-black-orange-pink-red2-green2-blue2-yellow2-white2-grey2-brown2-violet2-turquois2-black2-orange2-pink2).
Now a filter has to be set per state and a new column gets created where increasing numbers are assigned to the colors. For the first filter it just starts counting up for all red = 1, green = 2, blue = 3 and so on. When all the players are assigned numbers, the column gets filtered by the next state and the numbering continues again in the color-order, but the numbers are continuing (so they do not start again from 1). When all states of a country are completed, the next country is filtered out with the next state and the numbering starts from 1 on color red.
The order of states per country is irrelevant, whatever comes first.
Here is a code example. I made a large example so that it becomes maybe more clearer. I commented the numbering column out, because that is the one that I would like to create with the script, it is just for your reference.
import pandas as pd
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
data = {'PlayerID' : ['12319353-999', '12319353-002', '12319353-001', '12319382-999', '12319382-001', '12319380-999', '12319380-001', '12319381-999', '12319381-001', '12319383-999', '12319383-001', '12319504-999', '12319504-002', '12319504-001', '12319503-999', '12319503-001', '12319406-999', '12319406-001', '12319405-999', '12319405-002', '12319405-001', '12319404-999', '12319404-001', '12319699-999', '12319699-002', '12319699-001', '12320050-999', '12320050-001', '12319700-999', '12319700-001', '12319701-999', '12319701-001', '12319598-999', '12319598-001', '12319354-999', '12319354-007', '12319354-006', '12319354-005', '12319354-004', '12319354-003', '12319354-002', '12319354-001', '12319423-999', '12319423-001', '12319418-999', '12319418-001', '12319411-999', '12319411-002', '12319411-001', '12319410-999', '12319410-001', '12320073-999', '12320073-001', '12319426-999', '12319426-002', '12319426-001', '12319415-999', '12319415-001', '12319424-999', '12319424-001', '12319419-999', '12319419-001', '12319425-999', '12319425-002', '12319425-001', '12319409-999', '12319409-001', '12319416-999', '12319416-001', '12319420-999', '12319420-001', '12319421-999', '12319421-001', '12319467-999', '12319467-001', '12319357-999', '12319357-001', '12319468-999', '12319468-001', '23149893-999', '23149893-001', '12319452-999', '12319452-001', '12319384-999', '12319384-001', '12319363-999', '12319363-001', '12319362-999', '12319362-003', '12319362-002', '12319362-001', '12319435-999', '12319435-001', '12319337-999', '12319337-001', '12319371-999', '12319371-001', '12319473-999', '12319473-002', '12319473-001', '12319472-999', '12319472-002', '12319472-001', '12319474-999', '12319474-001', '12319471-999', '12319471-001', '12319492-999', '12319492-001', '12319470-999', '12319470-003', '12319470-002', '12319470-001', '12319476-006', '12319476-005', '12319476-004', '12319476-003', '12319476-002', '12319476-001', '12319375-999', '12319375-004', '12319375-003', '12319375-002', '12319375-001', '12319339-999', '12319339-003', '12319339-002', '12319339-001', '12319374-999', '12319374-003', '12319374-002', '12319374-001', '12319376-999', '12319376-002', '12319376-001', '12319373-999', '12319373-002', '12319373-001', '23149884-999', '23149884-001', '12319372-999', '12319372-001', '12319493-999', '12319493-001', '12319494-999', '12319494-001', '12319506-999', '12319506-001', '12319507-999', '12319507-005', '12319507-004', '12319507-003', '12319507-002', '12319507-001', '12319394-999', '12319394-003', '12319394-002', '12319394-001', '12319505-006', '12319505-005', '12319505-004', '12319505-003', '12319505-002', '12319505-001', '12319413-999', '12319413-001', '12319508-999', '12319508-002', '12319508-001', '12319414-999', '12319414-001', '12319407-999', '12319407-002', '12319407-001', '12319427-999', '12319427-002', '12319427-001', '12319342-999', '12319342-001', '12319343-999', '12319343-001', '12319445-999', '12319445-003', '12319445-002', '12319445-001', '12319449-999', '12319449-001', '12319448-999', '12319448-001', '12319440-999', '12319440-004', '12319440-003', '12319440-002', '12319440-001', '12319446-006', '12319446-005', '12319446-004', '12319446-003', '12319446-002', '12319446-001', '12319390-999', '12319390-001', '23149885-999', '23149885-001', '12319450-999', '12319450-003', '12319450-002', '12319450-001', '12319442-999', '12319442-001', '12319444-006', '12319444-005', '12319444-004', '12319444-003', '12319444-002', '12319444-001', '12319491-999', '12319491-001', '12319351-999', '12319351-002', '12319351-001', '12319490-999', '12319490-001', '12319345-999', '12319345-005', '12319345-004', '12319345-003', '12319345-002', '12319345-001', '12320154-999', '12320154-001', '12320152-999', '12320152-001', '12319344-999', '12319344-001', '12319349-999', '12319349-004', '12319349-003', '12319349-002', '12319349-001', '12319346-999', '12319346-002', '12319346-001', '12319498-999', '12319498-003', '12319498-002', '12319498-001', '12319495-999', '12319495-001', '12319499-999', '12319499-001', '12319500-999', '12319500-003', '12319500-002', '12319500-001', '12320102-999', '12320102-002', '12320102-001', '12319496-999', '12319496-001', '12319497-999', '12319497-002', '12319497-001', '12319396-999', '12319396-009', '12319396-008', '12319396-007', '12319396-006', '12319396-005', '12319396-004', '12319396-003', '12319396-002', '12319396-014', '12319396-013', '12319396-012', '12319396-011', '12319396-010', '12319396-001', '12319347-999', '12319347-001', '12319399-999', '12319399-001', '12319348-999', '12319348-001', '12319398-999', '12319398-003', '12319398-002', '12319398-001', '12319397-999', '12319397-008', '12319397-007', '12319397-006', '12319397-005', '12319397-004', '12319397-003', '12319397-002', '12319397-001', '23149878-999', '23149878-001'],
'Country': ['Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'Germany', 'Germany', 'UK', 'UK', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'UK', 'UK', 'UK', 'UK', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'Germany', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'UK', 'USA', 'USA', 'USA', 'USA', 'UK', 'UK', 'USA', 'USA', 'USA', 'USA', 'USA', 'UK', 'UK', 'UK', 'USA', 'USA', 'USA', 'USA', 'Germany', 'Germany', 'USA', 'USA', 'Germany', 'Germany', 'Germany', 'Germany', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'USA', 'Germany', 'Germany'],
'State' : ['DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State3-12', 'USA-State3-12', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'DE-State3-24', 'DE-State3-24', 'UK-State2-12', 'UK-State2-12', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'UK-State1-24', 'UK-State1-24', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State1-24', 'UK-State1-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State1-24', 'DE-State1-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'DE-State3-24', 'UK-State1-24', 'UK-State1-24', 'UK-State1-24', 'UK-State1-24', 'DE-State2-12', 'DE-State2-12', 'DE-State2-12', 'DE-State2-12', 'DE-State2-12', 'DE-State2-12', 'DE-State2-12', 'DE-State2-12', 'DE-State2-12', 'DE-State2-12', 'DE-State2-12', 'DE-State2-12', 'DE-State2-12', 'DE-State2-12', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'USA-State2-24', 'UK-State1-24', 'UK-State1-24', 'UK-State1-24', 'UK-State1-24', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State1-24', 'UK-State1-24', 'UK-State1-24', 'UK-State1-24', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State2-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State1-24', 'UK-State1-24', 'UK-State3-12', 'UK-State3-12', 'UK-State3-12', 'UK-State1-24', 'UK-State1-24', 'UK-State1-24', 'UK-State1-24', 'UK-State1-24', 'UK-State1-24', 'UK-State1-24', 'UK-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'UK-State1-24', 'UK-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'UK-State1-24', 'UK-State1-24', 'UK-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'DE-State1-24', 'DE-State1-24', 'USA-State1-24', 'USA-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'DE-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'USA-State1-24', 'DE-State1-24', 'DE-State1-24'],
'Color' : ['white', 'white', 'white', 'blue', 'blue', 'green', 'green', 'yellow', 'yellow', 'red', 'red', 'grey', 'grey', 'grey', 'brown', 'brown', 'blue2', 'blue2', 'white2', 'white2', 'white2', 'yellow2', 'yellow2', 'red', 'red', 'red', 'yellow', 'yellow', 'blue', 'blue', 'green', 'green', 'white', 'white', 'grey', 'grey', 'grey', 'grey', 'grey', 'grey', 'grey', 'grey', 'green', 'green', 'yellow', 'yellow', 'grey2', 'grey2', 'grey2', 'red', 'red', 'brown', 'brown', 'white', 'white', 'white', 'grey', 'grey', 'blue', 'blue', 'red', 'red', 'pink', 'pink', 'pink', 'violett', 'violett', 'orange', 'orange', 'black', 'black', 'turquois', 'turquois', 'blue', 'blue', 'yellow', 'yellow', 'green', 'green', 'red', 'red', 'yellow', 'yellow', 'pink', 'pink', 'white', 'white', 'green', 'green', 'green', 'green', 'red', 'red', 'green', 'green', 'red2', 'red2', 'black', 'black', 'black', 'turquois', 'turquois', 'turquois', 'violett', 'violett', 'brown', 'brown', 'green2', 'green2', 'orange', 'orange', 'orange', 'orange', 'pink', 'pink', 'pink', 'pink', 'pink', 'pink', 'green2', 'green2', 'green2', 'green2', 'green2', 'red', 'red', 'red', 'red', 'white', 'white', 'white', 'white', 'yellow', 'yellow', 'yellow', 'blue', 'blue', 'blue', 'red', 'red', 'green', 'green', 'pink', 'pink', 'black', 'black', 'turquois', 'turquois', 'violett', 'violett', 'violett', 'violett', 'violett', 'violett', 'blue2', 'blue2', 'blue2', 'blue2', 'white2', 'white2', 'white2', 'white2', 'white2', 'white2', 'violett2', 'violett2', 'yellow2', 'yellow2', 'yellow2', 'brown2', 'brown2', 'green2', 'green2', 'green2', 'red2', 'red2', 'red2', 'grey', 'grey', 'white', 'white', 'grey', 'grey', 'grey', 'grey', 'brown', 'brown', 'violett', 'violett', 'blue', 'blue', 'blue', 'blue', 'blue', 'turquois', 'turquois', 'turquois', 'turquois', 'turquois', 'turquois', 'orange', 'orange', 'black', 'black', 'black', 'black', 'black', 'black', 'violett', 'violett', 'brown', 'brown', 'brown', 'brown', 'brown', 'brown', 'turquois', 'turquois', 'grey', 'grey', 'grey', 'violett', 'violett', 'brown', 'brown', 'brown', 'brown', 'brown', 'brown', 'green', 'green', 'red', 'red', 'yellow', 'yellow', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'yellow', 'yellow', 'yellow', 'yellow', 'red2', 'red2', 'white', 'white', 'orange', 'orange', 'orange', 'orange', 'grey', 'grey', 'grey', 'brown', 'brown', 'violett', 'violett', 'violett', 'orange', 'black', 'black', 'black', 'black', 'black', 'black', 'black', 'black', 'orange', 'orange', 'black', 'black', 'black', 'black', 'turquois', 'turquois', 'pink', 'pink', 'red2', 'red2', 'green2', 'green2', 'green2', 'green2', 'grey2', 'grey2', 'grey2', 'grey2', 'grey2', 'grey2', 'grey2', 'grey2', 'grey2', 'white', 'white']}
#'Numbering' : ['19', '19', '19', '3', '3', '2', '2', '4', '4', '1', '1', '6', '6', '6', '7', '7', '38', '38', '40', '40', '40', '39', '39', '25', '25', '25', '28', '28', '27', '27', '26', '26', '29', '29', '20', '20', '20', '20', '20', '20', '20', '20', '25', '25', '27', '27', '41', '41', '41', '43', '43', '30', '30', '28', '28', '28', '29', '29', '26', '26', '24', '24', '35', '35', '35', '31', '31', '34', '34', '33', '33', '32', '32', '17', '17', '17', '17', '16', '16', '15', '15', '18', '18', '13', '13', '18', '18', '15', '15', '15', '15', '14', '14', '2', '2', '28', '28', '25', '25', '25', '24', '24', '24', '23', '23', '22', '22', '14', '14', '26', '26', '26', '26', '27', '27', '27', '27', '27', '27', '29', '29', '29', '29', '29', '1', '1', '1', '1', '34', '34', '34', '34', '33', '33', '33', '32', '32', '32', '30', '30', '31', '31', '12', '12', '10', '10', '9', '9', '8', '8', '8', '8', '8', '8', '17', '17', '17', '17', '19', '19', '19', '19', '19', '19', '23', '23', '18', '18', '18', '22', '22', '37', '37', '37', '36', '36', '36', '6', '6', '5', '5', '19', '19', '19', '19', '20', '20', '21', '21', '16', '16', '16', '16', '16', '23', '23', '23', '23', '23', '23', '12', '12', '11', '11', '24', '24', '24', '24', '32', '32', '31', '31', '31', '31', '31', '31', '10', '10', '30', '30', '30', '9', '9', '7', '7', '7', '7', '7', '7', '2', '2', '1', '1', '4', '4', '3', '3', '3', '3', '3', '3', '3', '3', '5', '5', '5', '5', '13', '13', '6', '6', '11', '11', '11', '11', '7', '7', '7', '8', '8', '9', '9', '9', '13', '11', '11', '11', '11', '11', '11', '11', '11', '13', '13', '11', '11', '11', '11', '10', '10', '14', '14', '15', '15', '16', '16', '16', '16', '20', '20', '20', '20', '20', '20', '20', '20', '20', '5', '5']
df = pd.DataFrame(data)
print(df)
To some up again, the steps I would do in Excel:
- Sort PlayerID by alphabet
- Filter column Country by whatever country
- Filter column State by whatever state
- Sort column Color by the color-code
- Start by assigning numbers to the colors
- Filter the next state and continue the increasing number value from the previous state but again starting on red (according to color order)
- Next country (numbering will start on 1 again)
Is there a more elegant way than a nested for-loop?
4