I'm relatively new to Python and coding, but I think I've slammed my head against the wall enough on this one, so I've come to Stack Overflow for sage advice.
I am writing a program that searches the files in a target folder for predefined terms stored in a keyterms.csv file. keyterms.csv has a single column of words and phrases, one per line, and nothing else. When a word or phrase from the keyterms file is found in a file in the target directory, the program returns the name of the culprit file(s) along with the number of times the key term was found. Running it gives this error:
Traceback (most recent call last):
count = len(re.findall(self.ktxt, ttxt))
TypeError: unhashable type: 'dict'
So I think the answer is that re.findall only accepts a string (or a compiled pattern) as its pattern and will not work with dicts, lists, or anything like that. I have tried several snippets of code from SO to split the contents of the keyterms file into individual strings to feed into re.findall, to no avail. Should I be putting the strings into a list and looping over it with a for loop? I'm not sure exactly what I'm missing.
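For reference, here is a minimal sketch of what I understand so far. The terms and the sample text are made up for illustration and are not my real data. It shows that passing a dict as the pattern raises a TypeError like the one in my traceback, while looping over a list of strings and calling re.findall once per term works. I used re.escape on the assumption that the key terms should match literally rather than as regular expressions.

```python
import re

# Hypothetical key terms and text, for illustration only.
terms = ["foo bar", "baz", "a.b"]
text = "foo bar and baz, then baz again; a.b but not axb"

# Passing a dict as the pattern fails before any matching happens.
raised = False
try:
    re.findall({}, text)
except TypeError:
    raised = True


def count_terms(terms, text):
    """Return {term: count} for every term found at least once in text."""
    counts = {}
    for term in terms:
        # re.escape makes the term match literally ('.' is not a wildcard).
        n = len(re.findall(re.escape(term), text))
        if n > 0:
            counts[term] = n
    return counts


counts = count_terms(terms, text)
print(raised, counts)
```

Since list(reader) would give a list rather than a dict, the 'dict' in my error message may mean self.ktxt was still the empty dict from __init__ when find ran, but I'm not sure about that.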
Disregard my copious print statements; I'm just using them to trace what's going on.
def read_key and def find are where the problem lies, as far as I can tell. I have tried splitting self.ktxt = list(reader) into strings and sending it to def find in several ways, but maybe I just fundamentally don't understand what's going on with it.
import os
import re
import csv

class FindKterms:
    def __init__(self, path, query):
        self.path = path
        self.query = query
        self.ktxt = {}
        self.searched = {}
        print('def init complete ...')

    def read_key(self):
        print('opening keyterms.csv')
        with open(self.query, newline='', encoding='utf_8') as f:
            reader = csv.reader(f, delimiter='n')
            self.ktxt = list(reader)

    def find(self):
        if self.path[-1] != '/':
            self.path += '/'
            print('Appended "/" to given filepath')
        print('starting find function ...')
        for root, dirs, files in os.walk(self.path):
            for file in files:
                print('Searching: ' + file + ' ...')
                f = open(root + file)
                ttxt = f.read()
                f.close()
                print('successfully read and closed ' + file)
                count = len(re.findall(self.ktxt, ttxt))
                if count > 0:
                    self.searched[root + file] = count

    def get_results(self):
        return self.searched
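To make the goal concrete, here is a rough sketch of the behaviour I'm after, not a definitive implementation. The function names read_terms and find_terms are hypothetical, and it makes a few assumptions that differ from my code above: it reads the keyterms file line by line instead of using csv.reader, builds paths with os.path.join instead of root + file, matches terms literally via re.escape, and records a count per term for each file. The temporary directory and file contents at the bottom are made up purely to exercise it.

```python
import os
import re
import tempfile


def read_terms(query_path):
    """Read one key term per line, skipping blank lines."""
    with open(query_path, encoding="utf_8") as f:
        return [line.strip() for line in f if line.strip()]


def find_terms(target_dir, terms):
    """Return {file_path: {term: count}} for files containing any term."""
    results = {}
    for root, dirs, files in os.walk(target_dir):
        for name in files:
            # os.path.join also works for files in subdirectories.
            full_path = os.path.join(root, name)
            with open(full_path, encoding="utf_8", errors="ignore") as f:
                text = f.read()
            hits = {}
            for term in terms:
                n = len(re.findall(re.escape(term), text))
                if n > 0:
                    hits[term] = n
            if hits:
                results[full_path] = hits
    return results


# Made-up files in a temporary directory, for illustration only.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target")
    os.makedirs(os.path.join(target, "sub"))
    key_path = os.path.join(tmp, "keyterms.csv")
    with open(key_path, "w", encoding="utf_8") as f:
        f.write("alpha\nbeta gamma\n")
    with open(os.path.join(target, "a.txt"), "w", encoding="utf_8") as f:
        f.write("alpha alpha beta gamma")
    with open(os.path.join(target, "sub", "b.txt"), "w", encoding="utf_8") as f:
        f.write("nothing here")

    terms = read_terms(key_path)
    results = find_terms(target, terms)
    # Report paths relative to the target folder so the output is stable.
    summary = {os.path.relpath(p, target): hits for p, hits in results.items()}

print(summary)
```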
The main() seems to work fine calling the functions and moving data around. It's just that the strings from keyterms.csv need to be used as the patterns for re.findall across the target directory.
Thanks for any help. I feel like I'm right there, but the solution eludes me.