I have a folder named “files_folder2” that contains multiple text files (e.g. a.txt, b.txt, c.txt).
I have a second folder named “StopWords”, which also contains multiple text files (e.g. currency.txt, names.txt, geographic.txt). Each of these text files contains stopwords that I want to filter out of a.txt, b.txt, etc.
TASK:
1. I want to loop through all the text files in “files_folder2”, then read and print each text file separately (e.g. a.txt prints separately, b.txt prints separately).
2. I want to loop through all the text files in the “StopWords” folder and remove every stopword they contain from “a.txt”, “b.txt”, etc.
3. Store the cleaned “a.txt”, “b.txt”, “c.txt” as new text files.
Problem: my code prints all the text files together on the same page instead of one at a time.
import glob
import codecs
import os
# Code reading the text files
text_folder_path = "files_folder2"
text_files = os.listdir(text_folder_path)
for file in text_files:
    if file.endswith(".txt"):
        file_path = os.path.join(text_folder_path, file)
        with codecs.open(file_path, 'r', encoding='utf-8',
                         errors='ignore') as f:
            data = f.read()
# Code reading the stopwords
stopwords_folder_path = "StopWords/StopWords"
stopwords_files = glob.glob(os.path.join(stopwords_folder_path, '*.txt'))
for file in stopwords_files:
    with open(file, 'r') as w:
        stop_words = w.read()
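
One thing worth noting about the loops above: `data` and `stop_words` are overwritten on every iteration, so after each loop only the last file's content is left. A minimal sketch of the three tasks follows, under some assumptions that are not in the original post: the stopword files hold whitespace-separated single words, matching ignores case and surrounding punctuation, and the function names and the output folder are hypothetical.

```python
import os
import string


def load_stop_words(stopwords_dir):
    """Collect the words from every .txt file in stopwords_dir into one set."""
    stop_words = set()
    for name in sorted(os.listdir(stopwords_dir)):
        if name.endswith(".txt"):
            path = os.path.join(stopwords_dir, name)
            with open(path, "r", encoding="utf-8", errors="ignore") as f:
                # Assumption: one stopword per whitespace-separated token.
                stop_words.update(word.lower() for word in f.read().split())
    return stop_words


def remove_stop_words(text, stop_words):
    """Drop stopwords line by line; a dropped word takes its punctuation with it."""
    cleaned_lines = []
    for line in text.splitlines():
        kept = [
            word for word in line.split()
            if word.strip(string.punctuation).lower() not in stop_words
        ]
        cleaned_lines.append(" ".join(kept))
    return "\n".join(cleaned_lines)


def clean_folder(text_dir, stopwords_dir, output_dir):
    """Print each text file separately, then write a cleaned copy to output_dir."""
    stop_words = load_stop_words(stopwords_dir)
    os.makedirs(output_dir, exist_ok=True)
    for name in sorted(os.listdir(text_dir)):
        if not name.endswith(".txt"):
            continue
        in_path = os.path.join(text_dir, name)
        with open(in_path, "r", encoding="utf-8", errors="ignore") as f:
            data = f.read()
        # Printing inside the loop, with a header, keeps each file separate.
        print("=====", name, "=====")
        print(data)
        out_path = os.path.join(output_dir, name)
        with open(out_path, "w", encoding="utf-8") as out:
            out.write(remove_stop_words(data, stop_words))
```

With the folders from the question, a call might look like `clean_folder("files_folder2", "StopWords/StopWords", "files_folder2_clean")`, where the last folder name is only a placeholder. Keeping the stopwords in a set makes each lookup cheap, and because the cleaning splits on whitespace, multi-word entries such as place names made of two words would need a different matching approach.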