Say you have an application that saves image (.jpg, .png, etc.) and text (.txt, .xml) files, and the application has all of the file paths hard-coded throughout the code, as in the example below.
Assuming the following file path structure exists:
- jpg: \\MyFileServer\Media\jpg
- png: \\MyFileServer\Media\png
- txt: \\MyFileServer\Text\txt
- xml: \\MyFileServer\Text\xml
The files in the destination file paths are referenced inside of a table, so in a VBA example the file path would be hard coded as:
Dim myFilePath As String
Dim myFileName As String
myFileName = "puppies.jpg" 'This would be the result of a query in a real scenario
myFilePath = "\\MyFileServer\Media\jpg\" & myFileName
Now say I have to split \\MyFileServer into \\MyMediaFileServer and \\MyTextFileServer.
Ideally, I would have had a central location where I could just change a table value or variable, rather than trudging through every single function in the application to update the hard-coded path, only to change it again if the situation arises in the future. So, my goal is to make sure my puppies image can be shown with minimal effort.
As far as industry standards go, I was wondering which is the best option: one of my two options below, or something else entirely.
Option 1:
Single Self Referential Table:
Defined in Mike’s answer at hierarchical/tree database for directories in filesystem
by a hierarchical structure like the following:
        (ROOT)
       /      \
    Dir2      Dir3
   /    \        \
Dir4    Dir5     Dir6
 /
Dir7
You simply have a table like the following:
ID   ParentID   FilePath
______________________________________________
1    NULL       \\MyFileServer
2    1          Images
3    2          jpg
4    3          10 MB Files
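To make the lookup cost of this design concrete, here is a minimal sketch of how a full path might be resolved from such a table. It is written in Python purely for illustration; the in-memory `ROWS` dictionary stands in for the table, and the name `resolve_path` is hypothetical, not something from Mike's answer.

```python
# id -> (parent_id, path segment), mirroring the example table above.
ROWS = {
    1: (None, r"\\MyFileServer"),
    2: (1, "Images"),
    3: (2, "jpg"),
    4: (3, "10 MB Files"),
}


def resolve_path(row_id, rows=ROWS):
    """Walk the ParentID links up to the root and join the segments."""
    segments = []
    seen = set()
    current = row_id
    while current is not None:
        if current in seen:
            # A self-referential table can contain loops by mistake.
            raise ValueError(f"cycle detected at row {current}")
        seen.add(current)
        parent_id, segment = rows[current]
        segments.append(segment)
        current = parent_id
    return "\\".join(reversed(segments))


print(resolve_path(3))  # \\MyFileServer\Images\jpg
```

Every lookup walks the whole chain of parents, which is part of why the tree gets harder to reason about as it grows.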
OR Option 2:
Self Referential Table With Master Roots:
A slight deviation on option 1.
Perhaps a master file path, kept in a separate table, would be clearer:
ID   Name         FilePath
______________________________________________
1    Image Root   \\MyMediaFileServer\Images
2    Text Root    \\MyTextFileServer\Text
The structure from Option 1 then refers to that table through a RootID column:
ID   ParentID   RootID   FilePath
______________________________________________
1    NULL       1        jpg
2    1          NULL     10 MB Files
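As a rough sketch of how this variant could be resolved (again in Python for illustration only, with hypothetical names and in-memory dictionaries standing in for the two tables), the walk stops at the row without a parent and prepends that row's master root:

```python
# Master roots: id -> absolute path.
ROOTS = {
    1: r"\\MyMediaFileServer\Images",
    2: r"\\MyTextFileServer\Text",
}

# Directories: id -> (parent_id, root_id, path segment).
DIRS = {
    1: (None, 1, "jpg"),
    2: (1, None, "10 MB Files"),
}


def resolve_with_root(dir_id, dirs=DIRS, roots=ROOTS):
    """Walk up to the top-level row, then prepend its master root."""
    segments = []
    current = dir_id
    while True:
        parent_id, root_id, segment = dirs[current]
        segments.append(segment)
        if parent_id is None:
            segments.append(roots[root_id])
            break
        current = parent_id
    return "\\".join(reversed(segments))


print(resolve_with_root(2))  # \\MyMediaFileServer\Images\jpg\10 MB Files
```

Moving the images to another server would then mean changing the single `ROOTS[1]` value, which is the "easier absolute file path change" this option is after.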
I feel that Option 1 is simpler to change overall but will become more confusing as the tree expands, while Option 2 makes an absolute file path change easier. Is Option 1 the best solution for storing file paths, or is there another industry standard I am not aware of?
2
Both options are over-engineered: involving the database is inappropriate here, and the directory structure is too deep.
Instead, for administrator-friendly software, I see the following patterns emerging:
- Paths are configured in config files, not in the database.
Usually, you want to be able to copy a database dump to another system.
Often, the database server is on another system.
You simply don’t expect the database content to be entangled with file system details, the only exception being strictly relative paths.
Finally, having to make configuration changes in a database is a pain.
- There’s just one data directory to configure, if any.
This depends on the application, of course, but you almost always want to have everything in a single directory.
Then the application, not the config file, takes care of its inner structure.
In your case, all I want to configure is:
\\MyFileServer
Another common approach is not to configure it at all, but to default to a well-defined subdirectory.
This is either within the user’s HOME directory or within some generic application data directory (different conventions on different operating systems, but that’s another topic).
If you want another place, you simply replace that directory with a symbolic link, such as:
{...}\Data -> \\MyFileServer
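A minimal sketch of the "one configurable data directory, with a sensible default" idea, written in Python for illustration. The application name, the INI section `paths`, the key `data_dir`, and both function names are assumptions made for this example; the real default location depends on the operating system's conventions.

```python
import configparser
from pathlib import Path

APP_NAME = "myapp"  # hypothetical application name


def default_data_dir() -> Path:
    """Fallback: a well-defined subdirectory of the user's home directory."""
    return Path.home() / f".{APP_NAME}" / "data"


def load_data_dir(config_file: Path) -> Path:
    """Return the single configured data directory, or the default."""
    parser = configparser.ConfigParser()
    parser.read(config_file)  # a missing file is silently skipped
    configured = parser.get("paths", "data_dir", fallback=None)
    return Path(configured) if configured else default_data_dir()
```

With a config file containing a `[paths]` section and a line such as `data_dir = \\MyFileServer`, moving the data would be a one-line change; with no config file at all, the application would fall back to the default directory.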
- The inner structure of the data directory is managed by source code.
Good applications have internal functions or a property table for all common directories, usually combined in a single class or module.
The base is something like getRootPath(), which reads the config file and returns something like \\MyFileServer.
Then there are functions like getMediaPath() or getTextPath(), which internally just call getRootPath() (or each other) and append their relative path.
A common variation on the theme is to let those functions take a filename (or relative file path) as an argument and return the full path to that file.
For example, getMediaPath("great.jpg") would return \\MyFileServer\Media\great.jpg.
Ideally, these functions also create the directory if it doesn’t exist yet.
Another approach is to create each directory only when the first file is written to it.
Either way, the point is not to expect a fully populated directory structure, which is one less thing the admin (or installer) might get wrong.
If a future version of the application needs more subdirectories, it usually just creates them, without bothering the administrator to create them, forcing them to add those paths to a config file, or any other stupid hassle.
(See “There’s just one data directory” above.)
- Directory structures are not deeper than necessary.
Your directory structure is essentially split by file extension.
{YourRoot}\Media\jpg\great.jpg
{YourRoot}\Media\png\great.png
{YourRoot}\Text\txt\great.txt
{YourRoot}\Text\xml\great.xml
Is that really necessary?
The files are already uniquely determined by their extension, so why not skip those intermediate paths?
{YourRoot}\great.jpg
{YourRoot}\great.png
{YourRoot}\great.txt
{YourRoot}\great.xml
Of course, applications have good reasons for structuring the data directory.
But that’s usually a separation by usage (or purpose), not merely by file extension.
- If you must, split uniformly.
The only reason for splitting the directories further is that they get too large.
In that case, you still don’t split by file extension, but by something that provides a uniform partition.
Often, this is time-based, e.g. daily or monthly:
{YourRoot}\2015-01\great.jpg
{YourRoot}\2015-01\stuff.png
…
{YourRoot}\2015-02\others.jpg
…
If you have uniform file names (e.g. hashes), their prefix is often chosen as the subdirectory name (and cut from the original file name), such as:
{YourRoot}\12\34567.jpg
…
{YourRoot}\13\43577.png
…
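The path helpers and the two partitioning schemes from the list above could be sketched as one small module. This is a Python illustration under stated assumptions, not a definitive implementation: the snake_case names mirror the getRootPath()/getMediaPath() idea, `set_root_path` stands in for reading the config file, and the SHA-256 digest with a two-character prefix is just one possible choice of uniform file name.

```python
import hashlib
from datetime import date
from pathlib import Path

# Assumption: a real application would read this from its config file.
_ROOT = Path("data")


def set_root_path(root) -> None:
    """Stand-in for configuration: change the single data directory."""
    global _ROOT
    _ROOT = Path(root)


def get_root_path() -> Path:
    """The only function that knows where the data directory lives."""
    return _ROOT


def _ensure(directory: Path) -> Path:
    """Create the directory on demand so no pre-built tree is required."""
    directory.mkdir(parents=True, exist_ok=True)
    return directory


def get_media_path(filename: str = "") -> Path:
    """Media directory, or the full path of a file inside it."""
    return _ensure(get_root_path() / "Media") / filename


def get_text_path(filename: str = "") -> Path:
    """Text directory, or the full path of a file inside it."""
    return _ensure(get_root_path() / "Text") / filename


def get_monthly_path(filename: str, day: date) -> Path:
    """Time-based partition: one subdirectory per month, e.g. 2015-01."""
    return _ensure(get_root_path() / day.strftime("%Y-%m")) / filename


def get_hashed_path(content: bytes, suffix: str, prefix_len: int = 2) -> Path:
    """Hash-based partition: the digest prefix names the subdirectory
    and is cut from the file name itself."""
    digest = hashlib.sha256(content).hexdigest()
    subdir, rest = digest[:prefix_len], digest[prefix_len:]
    return _ensure(get_root_path() / subdir) / f"{rest}{suffix}"
```

Callers would then ask for `get_media_path("great.jpg")` and never build a path by hand, so splitting or moving the file server touches only the root.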
3
Option 1 is definitely overkill.
Mike’s answer is awesome if you want to store data about directories and their structure.
But that is not what you need here. You don’t really care about folder hierarchy and the relations between them. You just want to know what the paths ARE.
You can get these parameters either from a config/ini file or from a table (as you describe in Option 2). I find that config files are more popular, but there are advantages to both methods.
You can find a good discussion of these in Scriptin’s answer to Should I use a config file or database for storing business rules?