Virtual Shelving
19 June 2022 [This entry was revised and expanded on 2022:07/07.]
I am always uncomfortable with the process of organizing books and articles on shelves or in boxes. I desire to have them grouped by each author and by each subject of interest; these desires cannot be reconciled without having multiple copies of each book and of each article, which multiplicity I cannot afford.
Electronic copies are a different matter. Even without multiple copies, symbolic links, which I discussed in a previous entry, make it possible effectively to list the same file in multiple directories. Hereïn, I'll explain the principal structure that I use for organizing documents, and I'll present some small utilities that facilitate creating and maintaining that structure on POSIX-compliant file systems. This structure is not as fine-grained as might be imagined, but it strikes a balance appropriate to my purposes. (For a more sophisticated system one should employ an application storing and retrieving documents mediated by a cataloguing relational database.)
As with many systems, mine have each a directory named Documents. Its two subdirectories relevant to this discussion are Authors and Subjects.
The entries in Subjects are subdirectories with names such as Economics, Logic and Probability, Mathematics, and Philosophy.
In turn, the entries in each of these are subdirectories with the names of authors.
Finally, in each of these subdirectories are entries for files containing their work corresponding to the superdirectory. For example, Documents/Subjects/Logic and Probability/Johnson William Ernest/ would have entries for works by him on logic or on probability, but his article on indifference curves would be listed instead in Documents/Subjects/Economics/Johnson William Ernest/.
Most of the subdirectories of Authors have names corresponding to the subdirectories in the third level of the Subjects substructure, but all of these subdirectories in Authors are different directories from those in the Subjects substructure.
Each of most of these subdirectories of Authors lists not subdirectories nor files, but symbolic links. These links take their names from the subdirectories of Subjects, but they do not link to those subdirectories. Instead, each links to an author-specific sub-subdirectory. Thus, for example, Documents/Authors/Johnson William Ernest/Logic and Probability is a symbolic link to Documents/Subjects/Logic and Probability/Johnson William Ernest. It is as if the subject-specific collection of an author's works is the author-specific collection of works on that subject, just as it should be.
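The relationship between the two substructures can be checked in a few lines of Python. What follows is only a sketch under assumptions: the function make_author_link is a hypothetical name, and a temporary directory stands in for Documents so that nothing real is touched.

```python
import os
import tempfile


def make_author_link(root, author, subject):
    """Create Subjects/<subject>/<author> and link Authors/<author>/<subject> to it."""
    source = os.path.join(root, "Subjects", subject, author)
    link_dir = os.path.join(root, "Authors", author)
    os.makedirs(source, exist_ok=True)
    os.makedirs(link_dir, exist_ok=True)
    link = os.path.join(link_dir, subject)
    if not os.path.lexists(link):
        # The target is relative, and is resolved from the directory holding the link.
        os.symlink(os.path.join("..", "..", "Subjects", subject, author), link)
    return source, link


with tempfile.TemporaryDirectory() as root:  # throwaway stand-in for Documents
    source, link = make_author_link(
        root, "Johnson William Ernest", "Logic and Probability")
    # Both paths should name one and the same directory.
    same_directory = os.path.samefile(source, link)
    link_target = os.readlink(link)

print(link_target)
print(same_directory)
```

Whatever is placed in the directory under Subjects is then visible through the path under Authors, without any second copy.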
One could, instead, use the complementary organization, in which the Subjects substructure were ultimately dependent upon the Authors substructure, or use a hybrid organization in which some of the dependency flows one way and some the other. The determinant should be what is most important to preserve if the collection is copied to a file system that does not support symbolic links, as in the case of an SD card with a FAT file system.
I've sketched the principal structure, but want to note useful complications of two sorts.
The first is that symbolic links may be used to place some subjects effectively under others. For example, logic and probability fall within the scope of philosophy. As well as having a directory named Logic and Probability listed in Subjects, I have a symbolic link to it listed in Philosophy. Indeed, when a subject falls within the intersection of other subjects, each may have such a symbolic link, and I have links to Documents/Subjects/Logic and Probability not only in Philosophy but in Mathematics and in Economics.
The second is that symbolic links may be used effectively to list a document with multiple authors in the directory for each author. And essentially the same device may be used to classify a single document under different subjects.
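Both complications come down to the same operation, which might be sketched as follows. The helper link_relative, the authors "Author One" and "Author Two", and the file paper.pdf are all hypothetical, and a temporary directory again stands in for Documents.

```python
import os
import tempfile


def link_relative(target, link):
    """Create `link` as a symbolic link to `target`, storing a relative path."""
    link_dir = os.path.dirname(link)
    os.makedirs(link_dir, exist_ok=True)
    if not os.path.lexists(link):
        os.symlink(os.path.relpath(target, link_dir), link)


with tempfile.TemporaryDirectory() as root:  # throwaway stand-in for Documents
    subjects = os.path.join(root, "Subjects")
    logic = os.path.join(subjects, "Logic and Probability")
    os.makedirs(logic)

    # First complication: list one subject under several others.
    for parent in ("Philosophy", "Mathematics", "Economics"):
        link_relative(logic, os.path.join(subjects, parent, "Logic and Probability"))

    # Second complication: list one document under each of two co-authors.
    first = os.path.join(subjects, "Economics", "Author One")
    second = os.path.join(subjects, "Economics", "Author Two")
    os.makedirs(first)
    paper = os.path.join(first, "paper.pdf")
    with open(paper, "w") as handle:
        handle.write("placeholder")
    link_relative(paper, os.path.join(second, "paper.pdf"))

    # Collect the subjects under which Logic and Probability is now listed.
    listed_under = sorted(
        name for name in os.listdir(subjects)
        if os.path.isdir(os.path.join(subjects, name, "Logic and Probability"))
    )
    shared = os.path.samefile(paper, os.path.join(second, "paper.pdf"))

print(listed_under)
print(shared)
```

The same helper would serve to classify a single document under a second subject; only the directory receiving the link changes.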
Although this organization is not especially fine-grained, it requires the creation of many directories and symbolic links. I've written seven utilities in Python to reduce the burden. Two of those utilities were presented in a previous 'blog entry because they can be put to more general purpose. Here, I will present five more.
(Again, these utilities are written for POSIX-compliant file systems. Windows is not POSIX-compliant. A full discussion of the relevant issues would be tedious, as would be an effort to rewrite these programs to support Windows.)
The first of these programs, mkdocdir.py, takes two or three arguments, and creates corresponding directories and symbolic links. The first argument is the name of an author, and the second is the name of a subject. A command mkdocdir.py "De Morgan Augustus" Mathematics will create directories Documents (in the home directory of the user), Documents/Authors, Documents/Subjects, Documents/Authors/De Morgan Augustus, Documents/Subjects/Mathematics, and Documents/Subjects/Mathematics/De Morgan Augustus unless they already exist, and will create the symbolic link Documents/Authors/De Morgan Augustus/Mathematics (to Documents/Subjects/Mathematics/De Morgan Augustus) unless it already exists. The optional third argument specifies an alternative to Documents, but the behavior is otherwise the same; a command mkdocdir.py "De Morgan Augustus" Mathematics MyDocs will create MyDocs/Subjects/Mathematics/De Morgan Augustus &c.
    #!/usr/bin/env python
    import os
    import sys

    # True: files live under Subjects, and Authors holds the symbolic links.
    # False: the complementary organization.
    subjectSource = True

    argc = len(sys.argv)
    if argc < 3 or argc > 4:
        # basename() also copes with an invocation whose path contains no "/".
        prog_name = os.path.basename(sys.argv[0])
        print("Syntax: " + prog_name + " <author-name> <subject> [<documents-directory>]\n" +
              "<documents-directory> defaults to \"Documents\".\n" +
              "Examples: " + prog_name + " \"De Morgan Augustus\" Mathematics\n" +
              "          " + prog_name + " Menger_Carl Economics MyDocs")
    else:
        if argc < 4:
            dir_documents = os.path.expanduser("~") + "/Documents"
        else:
            dir_documents = sys.argv[3]
        os.makedirs(dir_documents, 0o777, True)
        os.chdir(dir_documents)
        if subjectSource:
            dir_sources = "Subjects"
            dir_links = "Authors"
            d3 = sys.argv[1]
            d4 = sys.argv[2]
        else:
            dir_sources = "Authors"
            dir_links = "Subjects"
            d3 = sys.argv[2]
            d4 = sys.argv[1]
        source = dir_sources + "/" + d4 + "/" + d3
        link = dir_links + "/" + d3 + "/" + d4
        os.makedirs(source, 0o777, True)
        os.makedirs(dir_links + "/" + d3, 0o777, True)
        if not os.path.exists(link):
            os.chdir(dir_links + "/" + d3)
            # The target is relative to the directory that holds the link.
            os.symlink("../../" + dir_sources + "/" + d4 + "/" + d3, d4)
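The link that mkdocdir.py creates has a relative target (beginning with ../../), and one consequence of that choice can be illustrated with a small experiment. This is a sketch, not part of the utilities: it builds a miniature structure in a temporary directory, adds a second link named Absolute purely for contrast, and then renames the whole collection.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    old_root = os.path.join(scratch, "Documents")
    source = os.path.join(old_root, "Subjects", "Mathematics", "De Morgan Augustus")
    link_dir = os.path.join(old_root, "Authors", "De Morgan Augustus")
    os.makedirs(source)
    os.makedirs(link_dir)

    # A relative link, of the sort created by mkdocdir.py.
    os.symlink("../../Subjects/Mathematics/De Morgan Augustus",
               os.path.join(link_dir, "Mathematics"))
    # An absolute link to the same directory, added only for comparison.
    os.symlink(source, os.path.join(link_dir, "Absolute"))

    # Rename the whole collection from Documents to MyDocs.
    new_root = os.path.join(scratch, "MyDocs")
    shutil.move(old_root, new_root)

    moved_dir = os.path.join(new_root, "Authors", "De Morgan Augustus")
    relative_resolves = os.path.isdir(os.path.join(moved_dir, "Mathematics"))
    absolute_resolves = os.path.exists(os.path.join(moved_dir, "Absolute"))

print(relative_resolves)
print(absolute_resolves)
```

The relative link continues to resolve because its target is interpreted from wherever the link now sits, whereas the absolute link still names the old location.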
The second, cpdoc.py, copies a file into the structure, creating directories and symbolic links as necessary; it thus combines the functionality of creating the relevant directories and symbolic link with that of copying a file into the structure. cpdoc.py takes three or four arguments. The first is a file-path, the second the name of an author, and the third the name of a subject. The optional fourth again specifies an alternative to Documents. A command cpdoc.py "A Treatise on Probability.pdf" "Keynes John Maynard" "Logic and Probability" will create the relevant directories and symbolic link, and then copy a file named A Treatise on Probability.pdf from the present working directory into Documents/Subjects/Logic and Probability/Keynes John Maynard.
    #!/usr/bin/env python
    import os
    import sys
    import shutil

    # True: files live under Subjects, and Authors holds the symbolic links.
    subjectSource = True

    argc = len(sys.argv)
    if argc < 4 or argc > 5:
        # basename() also copes with an invocation whose path contains no "/".
        prog_name = os.path.basename(sys.argv[0])
        print("Syntax: " + prog_name + " <file> <author-name> <subject> [<documents-directory>]\n" +
              "<documents-directory> defaults to \"Documents\".\n" +
              "Examples: " + prog_name + " ~/Downloads/On_the_Syllogism.pdf \"De Morgan Augustus\" \"Logic\"\n" +
              "          " + prog_name + " introductorylec02whatgoog.pdf Whately_Richard Economics MyDocs")
    else:
        if argc < 5:
            dir_documents = os.path.expanduser("~") + "/Documents"
        else:
            dir_documents = sys.argv[4]
        os.makedirs(dir_documents, 0o777, True)
        # Resolve both paths before changing directory, so that a relative
        # <documents-directory> still names the right place afterwards.
        dir_documents = os.path.abspath(dir_documents)
        file = os.path.abspath(sys.argv[1])
        os.chdir(dir_documents)
        if subjectSource:
            dir_sources = "Subjects"
            dir_links = "Authors"
            d3 = sys.argv[2]
            d4 = sys.argv[3]
        else:
            dir_sources = "Authors"
            dir_links = "Subjects"
            d3 = sys.argv[3]
            d4 = sys.argv[2]
        source = dir_sources + "/" + d4 + "/" + d3
        link = dir_links + "/" + d3 + "/" + d4
        os.makedirs(source, 0o777, True)
        os.makedirs(dir_links + "/" + d3, 0o777, True)
        if not os.path.exists(link):
            os.chdir(dir_links + "/" + d3)
            os.symlink("../../" + dir_sources + "/" + d4 + "/" + d3, d4)
        # copy2() preserves the time-stamps and other metadata of the file.
        shutil.copy2(file, dir_documents + "/" + source)
The third, mvdoc.py, is the same as cpdoc.py, except that mvdoc.py moves the file specified into the structure, instead of copying it.
    #!/usr/bin/env python
    import os
    import sys
    import shutil

    # True: files live under Subjects, and Authors holds the symbolic links.
    subjectSource = True

    argc = len(sys.argv)
    if argc < 4 or argc > 5:
        # basename() also copes with an invocation whose path contains no "/".
        prog_name = os.path.basename(sys.argv[0])
        print("Syntax: " + prog_name + " <file> <author-name> <subject> [<documents-directory>]\n" +
              "<documents-directory> defaults to \"Documents\".\n" +
              "Examples: " + prog_name + " ~/Downloads/On_the_Syllogism.pdf \"De Morgan Augustus\" \"Logic\"\n" +
              "          " + prog_name + " introductorylec02whatgoog.pdf Whately_Richard Economics MyDocs")
    else:
        if argc < 5:
            dir_documents = os.path.expanduser("~") + "/Documents"
        else:
            dir_documents = sys.argv[4]
        os.makedirs(dir_documents, 0o777, True)
        # Resolve both paths before changing directory, so that a relative
        # <documents-directory> still names the right place afterwards.
        dir_documents = os.path.abspath(dir_documents)
        file = os.path.abspath(sys.argv[1])
        os.chdir(dir_documents)
        if subjectSource:
            dir_sources = "Subjects"
            dir_links = "Authors"
            d3 = sys.argv[2]
            d4 = sys.argv[3]
        else:
            dir_sources = "Authors"
            dir_links = "Subjects"
            d3 = sys.argv[3]
            d4 = sys.argv[2]
        source = dir_sources + "/" + d4 + "/" + d3
        link = dir_links + "/" + d3 + "/" + d4
        os.makedirs(source, 0o777, True)
        os.makedirs(dir_links + "/" + d3, 0o777, True)
        if not os.path.exists(link):
            os.chdir(dir_links + "/" + d3)
            os.symlink("../../" + dir_sources + "/" + d4 + "/" + d3, d4)
        # The only difference from cpdoc.py: the file is moved, not copied.
        shutil.move(file, dir_documents + "/" + source)
The fourth places a symbolic link to a specific work in another directory. It can be used effectively to list a single work under multiple directories, associated with different authors or with different subjects or with both. lndoc.py takes three or four arguments. The first is a file-path, the second the name of an author, and the third the name of a subject. The optional fourth again specifies an alternative to Documents. A command lndoc.py "Reconsideration of the Theory of Value.pdf" "Allen Roy George Douglas" Economics will create the relevant directories and symbolic link to a directory, and then create, in Documents/Subjects/Economics/Allen Roy George Douglas, a symbolic link to a file named Reconsideration of the Theory of Value.pdf in the present working directory. It will also create the associated directories and a symbolic link Documents/Authors/Allen Roy George Douglas/Economics to Documents/Subjects/Economics/Allen Roy George Douglas if these do not already exist.
    #!/usr/bin/env python
    import os
    import sys

    # True: files live under Subjects, and Authors holds the symbolic links.
    subjectSource = True

    argc = len(sys.argv)
    if argc < 4 or argc > 5:
        # basename() also copes with an invocation whose path contains no "/".
        prog_name = os.path.basename(sys.argv[0])
        print("Syntax: " + prog_name + " <file> <author-name> <subject> [<documents-directory>]\n" +
              "<documents-directory> defaults to \"Documents\".\n" +
              "Examples: " + prog_name + " ~/Downloads/On_the_Syllogism.pdf \"De Morgan Augustus\" \"Logic\"\n" +
              "          " + prog_name + " introductorylec02whatgoog.pdf Whately_Richard Economics MyDocs")
    else:
        if argc < 5:
            dir_documents = os.path.expanduser("~") + "/Documents"
        else:
            dir_documents = sys.argv[4]
        os.makedirs(dir_documents, 0o777, True)
        # Resolve both paths before changing directory, so that a relative
        # <documents-directory> still names the right place afterwards.
        dir_documents = os.path.abspath(dir_documents)
        file = os.path.abspath(sys.argv[1])
        os.chdir(dir_documents)
        if subjectSource:
            dir_sources = "Subjects"
            dir_links = "Authors"
            d3 = sys.argv[2]
            d4 = sys.argv[3]
        else:
            dir_sources = "Authors"
            dir_links = "Subjects"
            d3 = sys.argv[3]
            d4 = sys.argv[2]
        source = dir_sources + "/" + d4 + "/" + d3
        link = dir_links + "/" + d3 + "/" + d4
        os.makedirs(source, 0o777, True)
        os.makedirs(dir_links + "/" + d3, 0o777, True)
        if not os.path.exists(link):
            os.chdir(dir_links + "/" + d3)
            os.symlink("../../" + dir_sources + "/" + d4 + "/" + d3, d4)
        # The link to the file stores the absolute path of that file.
        os.symlink(file, dir_documents + "/" + source + "/" + os.path.basename(file))
Each of the above four programs contains a line subjectSource = True. If this line is changed to subjectSource = False then the aforementioned complementary structure is created, with the Subjects substructure dependent upon the Authors substructure. The code could be modified so that a command-line switch determined which organization were used.
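One way in which such a switch might be added is with the argparse module of the standard library. The following is only a sketch of the idea, not a revision of the programs above; the option name --author-source and the functions parse_arguments and plan are my own inventions for illustration.

```python
import argparse


def parse_arguments(argv):
    """Parse the arguments of a hypothetical switch-aware mkdocdir."""
    parser = argparse.ArgumentParser(
        description="Create directories and a symbolic link for an author and subject.")
    parser.add_argument("author")
    parser.add_argument("subject")
    parser.add_argument("documents", nargs="?", default="~/Documents")
    # Hypothetical switch: when given, files are stored under Authors
    # and the Subjects substructure holds the symbolic links.
    parser.add_argument("--author-source", dest="subject_source",
                        action="store_false",
                        help="store files under Authors instead of Subjects")
    return parser.parse_args(argv)


def plan(args):
    """Return (source, link) paths relative to the documents directory."""
    if args.subject_source:
        dir_sources, dir_links = "Subjects", "Authors"
        d3, d4 = args.author, args.subject
    else:
        dir_sources, dir_links = "Authors", "Subjects"
        d3, d4 = args.subject, args.author
    source = dir_sources + "/" + d4 + "/" + d3
    link = dir_links + "/" + d3 + "/" + d4
    return source, link


default_plan = plan(parse_arguments(["De Morgan Augustus", "Mathematics"]))
switched_plan = plan(parse_arguments(
    ["--author-source", "De Morgan Augustus", "Mathematics"]))
print(default_plan)
print(switched_plan)
```

Because the switch uses store_false, its absence leaves subject_source as True, which matches the behavior of the programs as presented.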
The program mkdoclnk.py is used to create the dependent substructure from the independent substructure. That is to say that if the documents are stored in the Subjects substructure and the Authors substructure has not been created or not fully created, then mkdoclnk.py can create or complete a dependent Authors substructure. If the documents are stored in the Authors substructure, then mkdoclnk.py can create a dependent Subjects substructure. If the substructures have different names (such as Autori and Materia), then mkdoclnk.py can accept those. If a hybrid system is to be used, two passes of mkdoclnk.py can complete it.
mkdoclnk.py takes zero to three arguments. If the first is not given, then the source substructure is assumed to be in a directory named Subjects. If the second is not given, then the dependent substructure is assumed to be in a directory named Authors. And if the third argument is not given, then the source and dependent substructures are assumed to be in a directory named Documents. A command mkdoclnk.py constructs the Authors substructure in Documents from the Subjects substructure in Documents. A command mkdoclnk.py Pundits Topics Yappings constructs a Topics substructure in Yappings from a Pundits substructure in Yappings.

    #!/usr/bin/env python
    import os
    import sys

    argc = len(sys.argv)
    if argc > 4:
        # basename() also copes with an invocation whose path contains no "/".
        prog_name = os.path.basename(sys.argv[0])
        print("Syntax:\n " + prog_name + " [<sources-directory> [<links-directory> [<parent-directory>]]]\n" +
              "Defaults: " + prog_name + " Subjects Authors ~/Documents\n" +
              "Example: " + prog_name + " Authors Subjects")
    else:
        if argc < 4:
            os.chdir(os.path.expanduser("~") + "/Documents")
        else:
            os.chdir(sys.argv[3])
        dir_home = os.getcwd()
        if argc < 3:
            dir_links = "Authors"
        else:
            dir_links = sys.argv[2]
        if argc < 2:
            dir_source = "Subjects"
        else:
            dir_source = sys.argv[1]
        # Real directories (not symbolic links) in the source substructure.
        list_dir_source = [entry for entry in os.scandir("./" + dir_source)
                           if entry.is_dir() and not entry.is_symlink()]
        for entry in list_dir_source:
            # Real directories one level further down.
            list_dir_links = [sub_entry for sub_entry in os.scandir(entry.path)
                              if sub_entry.is_dir() and not sub_entry.is_symlink()]
            for sub_entry in list_dir_links:
                dir_candidate = "./" + dir_links + "/" + sub_entry.name
                if not os.path.exists(dir_candidate):
                    os.makedirs(dir_candidate, 0o777, True)
                if not os.path.exists(dir_candidate + "/" + entry.name):
                    os.chdir(dir_candidate)
                    os.symlink("../../" + dir_source + "/" + entry.name + "/" + sub_entry.name,
                               entry.name)
                    # Return, so that the relative paths above remain valid.
                    os.chdir(dir_home)
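As directories are renamed or removed, symbolic links may be left pointing at nothing. A small check of the following sort can report them. It is not one of the utilities presented above; the function dangling_links is a hypothetical helper, and the example exercises it on a temporary directory in which one link is deliberately broken.

```python
import os
import tempfile


def dangling_links(root):
    """Return the paths (relative to root) of symbolic links whose targets are missing."""
    found = []
    # os.walk() does not descend into symbolic links unless asked to do so.
    for directory, subdirectories, files in os.walk(root):
        for name in subdirectories + files:
            path = os.path.join(directory, name)
            # islink() inspects the link itself; exists() follows it.
            if os.path.islink(path) and not os.path.exists(path):
                found.append(os.path.relpath(path, root))
    return sorted(found)


with tempfile.TemporaryDirectory() as root:  # throwaway stand-in for Documents
    os.makedirs(os.path.join(root, "Subjects", "Economics", "Menger Carl"))
    author_dir = os.path.join(root, "Authors", "Menger Carl")
    os.makedirs(author_dir)
    # A sound link, and one whose target was never created.
    os.symlink("../../Subjects/Economics/Menger Carl",
               os.path.join(author_dir, "Economics"))
    os.symlink("../../Subjects/Logic/Menger Carl",
               os.path.join(author_dir, "Logic"))
    report = dangling_links(root)

print(report)
```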
Tags: document management, documents, e.books, ebooks, electronic books, electronic documents, programming, Python, symbolic links, symlinks