I am trying to rename a large number of files (thousands) in Python using the following algorithm.
There is a list of word patterns. Each entry is on its own line and consists of one or more words. For example:
r'House',
r'Big Tree',
r'Word',
Next, the script takes the names of the files in a folder and checks each name against every pattern. If it finds one or more matches, it removes everything from the file name except the matches (and the extension). Each remaining match is then written the way its pattern is spelled (for example, 'big_tree' becomes 'Big Tree').
The search ignores case, special characters, and delimiters such as '-', '_', and ',' (there may be several in a row). A multi-word pattern such as 'Big Tree' should match every such variation, for example:
__big_tree_
; big,tree_
etc.
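To make the matching rule concrete, here is a minimal sketch of how one pattern could be turned into a case-insensitive, delimiter-tolerant regular expression. It assumes that any run of non-alphanumeric characters (including '_') counts as a delimiter and that a pattern should only match whole words. The names `DELIM` and `compile_pattern` are hypothetical, chosen only for this illustration.

```python
import re

# Assumption: one or more non-alphanumeric characters (or underscores)
# separate the words of a pattern inside a file name.
DELIM = r'[\W_]+'

def compile_pattern(phrase):
    """Compile a phrase such as 'Big Tree' into a tolerant regex."""
    words = [re.escape(word) for word in phrase.split()]
    body = DELIM.join(words)
    # The lookarounds reject a letter or digit directly before or after
    # the match, so 'House' does not match inside 'Household'.
    return re.compile(rf'(?<![^\W_]){body}(?![^\W_])', re.IGNORECASE)

big_tree = compile_pattern('Big Tree')
print(bool(big_tree.search('__big_tree_')))   # True
print(bool(big_tree.search('; big,tree_')))   # True
print(bool(big_tree.search('bigtree')))       # False: no delimiter between the words
```

Whether 'bigtree' (no delimiter at all) should also count as a match is not stated above; changing `+` to `*` in `DELIM` would allow it.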
Here are some example file names before renaming:
- __big_tree_and_5_more_house_and_1_more_drawn_by_a_n_name__f95c144ac5d5cf0a0baa24d6593cc699.jpg
- Table-Home-Word-7325313.jpeg
- kiwi 40055619 virtual YouTube,Search EN, 100+ bookmarks 115480616_p0.png
The output should be:
- Big Tree House.jpg
- Word.jpeg
- kiwi 40055619 virtual YouTube,Search EN, 100+ bookmarks 115480616_p0.png
Since the last file has no matches in its name, it is not renamed.
I am new to Python and can’t figure out how to implement this.
All the solutions found on the internet mostly work with a single pattern, whereas I will have thousands of them.
Ideally, I would like the list of patterns to be stored separately from the code, with a single universal regular expression that searches for matches against all of them.
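As a rough sketch of that idea: the patterns could live in a plain text file (one per line) and be combined into a single alternation, with a dictionary mapping each normalized match back to the spelling in the list. This is only one possible approach under stated assumptions: the file layout, the helper names (`load_patterns`, `normalize`, `build_matcher`, `filter_name`), and the rule that matches are emitted in the order they appear in the file name (inferred from the 'Big Tree House' example) are all my own assumptions.

```python
import re

DELIM = r'[\W_]+'  # assumption: any run of non-alphanumerics is a delimiter

def load_patterns(path):
    """Read one pattern per line from a text file, skipping blank lines."""
    with open(path, encoding='utf-8') as fh:
        return [line.strip() for line in fh if line.strip()]

def normalize(text):
    """Collapse delimiter runs to single spaces and lowercase the result."""
    return re.sub(DELIM, ' ', text).strip().lower()

def build_matcher(phrases):
    """Return (regex, canonical), where canonical maps normalized text
    back to the spelling used in the pattern list."""
    canonical = {normalize(p): p for p in phrases}
    # Longest alternatives first, so longer phrases are tried before shorter ones
    keys = sorted(canonical, key=len, reverse=True)
    body = '|'.join(DELIM.join(map(re.escape, k.split())) for k in keys)
    regex = re.compile(rf'(?<![^\W_])(?:{body})(?![^\W_])', re.IGNORECASE)
    return regex, canonical

def filter_name(stem, regex, canonical):
    """Keep only the matched patterns, in order of appearance, without repeats."""
    found = []
    for match in regex.finditer(stem):
        name = canonical[normalize(match.group())]
        if name not in found:
            found.append(name)
    return ' '.join(found)

regex, canonical = build_matcher(['House', 'Big Tree', 'Word'])
print(filter_name('Table-Home-Word-7325313', regex, canonical))  # Word
```

With these three patterns, the first sample name above yields 'Big Tree House' and the 'kiwi ...' name yields an empty string, which can be used as the signal to leave the file alone.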
Thank you all in advance.
After much trial and error, the best I could come up with is the code below.
Unfortunately, it is not fully functional, and because it moves files to a temporary folder, it duplicates files when it is run again:
import os
import re
import shutil
# Patterns that will be checked for matches in file names
patterns = [
    r'House',
    r'Big Tree',
    r'Word',
    r'Word2',
    r'Home',
]
# Filename filtering
def filter_filename(filename):
    matches = []
    for pattern in patterns:
        match = re.search(pattern, filename)
        if match:
            matches.append(match.group())
    return ' '.join(matches)
# File Renaming sending them to a temporary directory
def rename_files(directory):
    temp_dir = os.path.join(directory, 'temp')
    os.makedirs(temp_dir, exist_ok=True)
    for filename in os.listdir(directory):
        old_filepath = os.path.join(directory, filename)
        if os.path.isfile(old_filepath):
            base_filename, file_extension = os.path.splitext(filename)
            new_filename = filter_filename(base_filename)
            if new_filename:
                new_filename = new_filename
                new_filepath = os.path.join(temp_dir, new_filename)
                # Adding a one for uniqueness of the name
                count = 1
                while os.path.exists(new_filepath):
                    if count < 9999:
                        new_filename_with_count = f"{new_filename} {count}{file_extension}"
                        new_filepath = os.path.join(temp_dir, new_filename_with_count)
                        count += 1
                    else:
                        break
                shutil.move(old_filepath, new_filepath)
    # Returning from a temporary folder
    for filename in os.listdir(temp_dir):
        temp_filepath = os.path.join(temp_dir, filename)
        new_filepath = os.path.join(directory, filename)
        shutil.move(temp_filepath, new_filepath)
    os.rmdir(temp_dir)
# Directory for processing
directory = './.'
rename_files(directory)
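For comparison, here is a hedged sketch of how the renaming step might avoid the temporary folder altogether, so that running the script a second time changes nothing. It is not a drop-in fix for the code above: `unique_path` and `rename_in_place` are hypothetical helpers, and `new_stem_for` stands for whatever function maps a file name (without extension) to its filtered name, returning an empty string when nothing matches.

```python
import re
from pathlib import Path

def unique_path(directory, stem, suffix):
    """Return a free path: 'stem.ext', then 'stem 1.ext', 'stem 2.ext', ..."""
    candidate = directory / f'{stem}{suffix}'
    count = 1
    while candidate.exists():
        candidate = directory / f'{stem} {count}{suffix}'
        count += 1
    return candidate

def rename_in_place(directory, new_stem_for):
    directory = Path(directory)
    # sorted() materializes the listing first, so files renamed during
    # this run are not visited a second time.
    for path in sorted(p for p in directory.iterdir() if p.is_file()):
        new_stem = new_stem_for(path.stem)
        if not new_stem:
            continue  # no match: leave the file untouched
        # Skip names already in target form ('Word' or 'Word 3'),
        # which makes a second run a no-op.
        if re.fullmatch(re.escape(new_stem) + r'(?: \d+)?', path.stem):
            continue
        # The extension is kept here, unlike in the first attempt above.
        path.rename(unique_path(directory, new_stem, path.suffix))
```

A call might look like `rename_in_place('.', filter_filename)`, assuming `filter_filename` returns an empty string for names without matches.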