I have the data frame below, which maps letters to numbers.
import pandas as pd
# initialize list of lists
data = [['A', 1], ['B', 2], ['C', 3], ['D', 4], ['E', 5], ['F', 6]]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Letters', 'Num'])
Given this mapping, can we convert a sentence containing these letters to numbers and then take the sum?
Example
I have a sentence like
sen = 'Ab df ec'
This corresponds to 12 46 53, so the sum is 1+2+0+4+6+0+5+3 = 21 (0 is for a space). Therefore the sentence Ab df ec
gives a sum of 21.
A simple dictionary is all you need. Pandas is not helping you here.
data = [['A', 1], ['B', 2], ['C', 3], ['D', 4], ['E', 5], ['F', 6]]
data = {a:b for a,b in data}
data[' '] = 0
sen = 'Ab df ec'
print(sum(data[i.upper()] for i in sen))
Output:
21
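If the mapping already lives in the DataFrame from the question, the dictionary does not have to be retyped. A minimal sketch, assuming the column names Letters and Num from the question; the names lookup and sentence_sum are made up for illustration, and characters missing from the table are treated as 0:

```python
import pandas as pd

df = pd.DataFrame(
    [['A', 1], ['B', 2], ['C', 3], ['D', 4], ['E', 5], ['F', 6]],
    columns=['Letters', 'Num'],
)

# Build a letter -> number lookup straight from the two columns.
lookup = dict(zip(df['Letters'], df['Num']))

def sentence_sum(sentence, table):
    # Hypothetical helper: characters missing from the table
    # (spaces, punctuation, unmapped letters) count as 0.
    return sum(table.get(ch, 0) for ch in sentence.upper())

total = sentence_sum('Ab df ec', lookup)
print(total)  # 21
```

Using table.get(ch, 0) instead of table[ch] means the space does not need its own entry and an unexpected character does not raise a KeyError.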
One of the fastest ways would be the following:
from string import ascii_letters
sen = 'Ab df ec'
sen_idx = [ord(x)-64 for x in sen.upper() if x in ascii_letters]
print(sum(sen_idx))  # 21
ord(x): each character in Python has its own numeric value (its Unicode code point). For example, A is 65, Z is 90, a is 97 and z is 122. The letters are in order, but upper- and lowercase letters have different values, which is why 64 is subtracted from the uppercase value to get A = 1.
- Convert the string with upper() and apply ord only to uppercase letters, because in your case upper- and lowercase letters must map to the same number.
- Avoid calling ord on spaces, commas and other punctuation by checking that each character belongs to ascii_letters.
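The points above can be wrapped into one small function. This is only a sketch: letter_sum is a hypothetical name, and it assumes the full A=1 ... Z=26 numbering rather than just the A to F table from the question.

```python
from string import ascii_uppercase

def letter_sum(sentence):
    # A=1 ... Z=26; anything that is not an ASCII letter is skipped,
    # which has the same effect as counting it as 0.
    return sum(
        ord(ch) - ord('A') + 1
        for ch in sentence.upper()
        if ch in ascii_uppercase
    )

print(letter_sum('Ab df ec'))  # 21
print(letter_sum('Hi, Z!'))    # 8 + 9 + 26 = 43
```

Writing ord(ch) - ord('A') + 1 instead of ord(ch) - 64 gives the same result and makes the origin of the offset visible.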
Second solution:
from string import ascii_uppercase
# dict with all letters and their index
# {'A': 1, 'B': 2, 'C': 3, ...}
data = {letter: idx+1 for idx,letter in enumerate(ascii_uppercase)}
sen = 'Ab df ec'
# for non letters use 0
sen_idx = [data.get(x, 0) for x in sen.upper() ]
print(sen_idx)  # [1, 2, 0, 4, 6, 0, 5, 3]
print(sum(sen_idx))  # 21
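Whether the ord version really is faster than the dictionary lookup depends on your input and Python version, so it is worth measuring. A rough sketch with timeit from the standard library; the function names with_ord and with_dict and the repeated test sentence are assumptions made for this comparison, and no timings are claimed here:

```python
import timeit
from string import ascii_letters, ascii_uppercase

sen = 'Ab df ec' * 1000  # longer input so the timings are measurable

table = {letter: idx + 1 for idx, letter in enumerate(ascii_uppercase)}

def with_ord(s):
    # First approach: arithmetic on character codes, non-letters skipped.
    return sum(ord(x) - 64 for x in s.upper() if x in ascii_letters)

def with_dict(s):
    # Second approach: dictionary lookup, non-letters count as 0.
    return sum(table.get(x, 0) for x in s.upper())

# Both approaches must agree before it makes sense to time them.
assert with_ord(sen) == with_dict(sen)

for fn in (with_ord, with_dict):
    elapsed = timeit.timeit(lambda: fn(sen), number=100)
    print(f'{fn.__name__}: {elapsed:.4f} s')
```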