Virtual Shelving

19 June 2022

[This entry was revised and expanded on 2022:07/07.]

I am always uncomfortable with the process of organizing books and articles on shelves or in boxes. I desire to have them grouped by each author and by each subject of interest; these desires cannot be reconciled without having multiple copies of each book and of each article, which multiplicity I cannot afford.

Electronic copies are a different matter. Even without multiple copies, symbolic links, which I discussed in a previous entry, make it possible effectively to list the same file in multiple directories. Hereïn, I'll explain the principal structure that I use for organizing documents, and I'll present some small utilities that facilitate creating and maintaining that structure on POSIX-compliant file systems. This structure is not as fine-grained as might be imagined, but it strikes a balance appropriate to my purposes. (For a more sophisticated system one should employ an application storing and retrieving documents mediated by a cataloguing relational database.)

As with many systems, mine have each a directory named Documents. Its two subdirectories relevant to this discussion are Authors and Subjects.

The entries in Subjects are subdirectories with names such as Economics, Logic and Probability, Mathematics, and Philosophy.

In turn, the entries in each of these are subdirectories with the names of authors.

Finally, in each of these subdirectories are entries for files containing their work corresponding to the superdirectory. For example, Documents/Subjects/Logic and Probability/Johnson William Ernest/ would have entries for works by him on logic or on probability, but his article on indifference curves would be listed instead in Documents/Subjects/Economics/Johnson William Ernest/.

Most of the subdirectories of Authors have names corresponding to the subdirectories in the third level of the Subjects substructure, but all of these subdirectories in Authors are different directories from those in the Subjects substructure.

Each of most of these subdirectories of Authors lists not subdirectories nor files, but symbolic links. These links take their names from the subdirectories of Subjects, but they do not link to those subdirectories. Instead, each links to an author-specific sub-subdirectory. Thus, for example, Documents/Authors/Johnson William Ernest/Logic and Probability is a symbolic link to Documents/Subjects/Logic and Probability/Johnson William Ernest. It is as if the subject-specific collection of an author's works is the author-specific collection of works on that subject, just as it should be.
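The relationship can be checked with a short sketch. It builds the two substructures in a temporary directory, so that nothing under a real Documents is touched, and confirms that the link in Authors and the directory in Subjects are one and the same directory. The function name build_example is hypothetical, introduced only for illustration.

```python
import os
import tempfile

def build_example(root):
    """Create one subject-side directory and the author-side link to it."""
    author = "Johnson William Ernest"
    subject = "Logic and Probability"
    source = os.path.join(root, "Subjects", subject, author)
    link_parent = os.path.join(root, "Authors", author)
    os.makedirs(source, exist_ok=True)
    os.makedirs(link_parent, exist_ok=True)
    link = os.path.join(link_parent, subject)
    # A relative target keeps the link valid if the whole tree is moved.
    os.symlink(os.path.join("..", "..", "Subjects", subject, author), link)
    return source, link

with tempfile.TemporaryDirectory() as root:
    source, link = build_example(root)
    print(os.path.islink(link))            # the Authors entry is a link
    print(os.path.samefile(source, link))  # both names reach one directory
```

A relative link target is used here, as in the utilities below, so that the collection may be moved as a whole without breaking the links.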

One could, instead, use the complementary organization, in which the Subjects substructure were ultimately dependent upon the Authors substructure, or use a hybrid organization in which some of the dependency flows one way and some the other. The determinant should be what is most important to preserve if the collection is copied to a file system that does not support symbolic links, as in the case of an SD card with a FAT file system.
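To see what survives such a copy, here is a minimal sketch that copies a tree while leaving out every symbolic link, which is one way that a copy to a file system without links might be made. The function names skip_links and copy_without_links are hypothetical, and the temporary directory named Card merely stands in for the destination; the sketch is not one of the utilities presented here.

```python
import os
import shutil
import tempfile

def skip_links(directory, names):
    """Tell copytree to ignore every entry that is a symbolic link."""
    return [name for name in names
            if os.path.islink(os.path.join(directory, name))]

def copy_without_links(source, destination):
    """Copy a tree, leaving out symbolic links rather than following them."""
    shutil.copytree(source, destination, ignore=skip_links)

with tempfile.TemporaryDirectory() as root:
    docs = os.path.join(root, "Documents")
    os.makedirs(os.path.join(docs, "Subjects", "Economics", "Menger_Carl"))
    os.makedirs(os.path.join(docs, "Authors", "Menger_Carl"))
    os.symlink(os.path.join("..", "..", "Subjects", "Economics", "Menger_Carl"),
               os.path.join(docs, "Authors", "Menger_Carl", "Economics"))
    card = os.path.join(root, "Card")
    copy_without_links(docs, card)
    # The Subjects side arrives whole; the Authors side arrives
    # as directories that no longer list anything.
    print(sorted(os.listdir(card)))
    print(os.listdir(os.path.join(card, "Authors", "Menger_Carl")))
```

In the copy, the substructure that holds the files is intact, while the dependent substructure is reduced to empty directories.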

I've sketched the principal structure, but want to note useful complications of two sorts.

The first is that symbolic links may be used to place some subjects effectively under others. For example, logic and probability fall within the scope of philosophy. As well as having a directory named Logic and Probability listed in Subjects, I have a symbolic link to it listed in Philosophy. Indeed, when a subject falls within the intersection of other subjects, each may have such a symbolic link, and I have links to Documents/Subjects/Logic and Probability not only in Philosophy but in Mathematics and in Economics.

The second is that symbolic links may be used effectively to list a document with multiple authors in the directory for each author. And essentially the same device may be used to classify a single document under different subjects.
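Both complications come to the same operation: listing an existing directory or file in some other directory by means of a relative symbolic link. A minimal sketch follows. The helper link_into and the names First Author, Second Author, and paper.pdf are hypothetical, and the sketch assumes that resolving both paths with os.path.realpath before computing the relative target is acceptable.

```python
import os
import tempfile

def link_into(target, directory):
    """List target in directory through a relative symbolic link."""
    os.makedirs(directory, exist_ok=True)
    link = os.path.join(directory, os.path.basename(target))
    if not os.path.lexists(link):
        # Resolve both ends first, so that the relative path is computed
        # between real locations rather than through other links.
        relative = os.path.relpath(os.path.realpath(target),
                                   os.path.realpath(directory))
        os.symlink(relative, link)
    return link

with tempfile.TemporaryDirectory() as root:
    subjects = os.path.join(root, "Subjects")
    logic = os.path.join(subjects, "Logic and Probability")
    os.makedirs(logic)
    # First complication: list one subject under another.
    nested = link_into(logic, os.path.join(subjects, "Philosophy"))
    print(os.readlink(nested))  # ../Logic and Probability

    # Second complication: list a co-authored file under the second author.
    first = os.path.join(subjects, "Economics", "First Author")
    os.makedirs(first)
    paper = os.path.join(first, "paper.pdf")
    open(paper, "w").close()
    shared = link_into(paper,
                       os.path.join(subjects, "Economics", "Second Author"))
    print(os.path.samefile(paper, shared))
```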

Although this organization is not especially fine-grained, it requires the creation of many directories and symbolic links. I've written seven utilities in Python to reduce the burden. Two of those utilities were presented in a previous 'blog entry because they can be put to more general purpose. Here, I will present five more.

(Again, these utilities are written for POSIX-compliant file systems. Windows is not POSIX-compliant. A full discussion of the relevant issues would be tedious, as would be an effort to rewrite these programs to support Windows.)

The first of these programs, mkdocdir.py, takes two or three arguments, and creates corresponding directories and symbolic links. The first argument is the name of an author, and the second is the name of a subject. A command mkdocdir.py "De Morgan Augustus" Mathematics will create directories Documents, Documents/Authors, Documents/Subjects, Documents/Authors/De Morgan Augustus, Documents/Subjects/Mathematics, and Documents/Subjects/Mathematics/De Morgan Augustus unless they already exist, and will create the symbolic link Documents/Authors/De Morgan Augustus/Mathematics (to Documents/Subjects/Mathematics/De Morgan Augustus) unless it already exists. (By default, Documents is the directory of that name in the user's home directory.) The optional third argument specifies an alternative to Documents, but the behavior is otherwise the same; a command mkdocdir.py "De Morgan Augustus" Mathematics MyDocs will create MyDocs/Subjects/Mathematics/De Morgan Augustus &c.

#!/usr/bin/env python
import os
import sys

subjectSource = True

argc = len(sys.argv)
if argc < 3 or argc > 4:
    # basename also works when the program is invoked without any "/"
    prog_name = os.path.basename(sys.argv[0])
    print("Syntax: " + prog_name
                     + " <author-name> <subject> [<documents-directory>]\n"
                     +
          "<documents-directory> defaults to \"Documents\".\n"
                     +
          "Examples: " + prog_name
                       + " \"De Morgan Augustus\" Mathematics\n"
                     +
          "          " + prog_name
                       + " Menger_Carl Economics MyDocs")
else:
    if argc < 4:
        dir_documents = os.path.expanduser("~") + "/Documents"
    else:
        dir_documents = sys.argv[3]
    os.makedirs(dir_documents,0o777,True)
    os.chdir(dir_documents)
    if subjectSource == True:
        dir_sources = "Subjects"
        dir_links = "Authors"
        d3 = sys.argv[1]
        d4 = sys.argv[2]
    else:
        dir_sources = "Authors"
        dir_links = "Subjects"
        d3 = sys.argv[2]
        d4 = sys.argv[1]
    source = dir_sources + "/" + d4 + "/" + d3
    link = dir_links + "/" + d3 + "/" + d4
    os.makedirs(source,0o777,True)
    os.makedirs(dir_links + "/" + d3,0o777,True)
    if not os.path.exists(link):
        os.chdir(dir_links + "/" + d3)
        os.symlink("../../" + dir_sources + "/" + d4 + "/" + d3,d4)

The second, cpdoc.py, copies a file into the structure, creating directories and symbolic links as necessary; it thus combines the functionality of creating the relevant directories and symbolic link with that of copying a file into the structure. cpdoc.py takes three or four arguments. The first is a file-path, the second the name of an author, and the third the name of a subject. The optional fourth again specifies an alternative to Documents. A command cpdoc.py "A Treatise on Probability.pdf" "Keynes John Maynard" "Logic and Probability" will create the relevant directories and symbolic link, and then copy a file named A Treatise on Probability.pdf from the present working directory into Documents/Subjects/Logic and Probability/Keynes John Maynard.

#!/usr/bin/env python
import os
import sys
import shutil

subjectSource = True

argc = len(sys.argv)
if argc < 4 or argc > 5:
    # basename also works when the program is invoked without any "/"
    prog_name = os.path.basename(sys.argv[0])
    print("Syntax: " + prog_name
                     +
          " <file> <author-name> <subject> [<documents-directory>]\n"
                     +
          "<documents-directory> defaults to \"Documents\".\n"
                     +
          "Examples: " + prog_name
                       +
          " ~/Downloads/On_the_Syllogism.pdf \"De Morgan Augustus\" \"Logic\"\n"
                     +
          "          " + prog_name
                       +
          " introductorylec02whatgoog.pdf Whately_Richard Economics MyDocs")
else:
    if argc < 5:
        dir_documents = os.path.expanduser("~") + "/Documents"
    else:
        # made absolute because the working directory changes below
        dir_documents = os.path.abspath(sys.argv[4])
    os.makedirs(dir_documents,0o777,True)
    file = os.path.abspath(sys.argv[1])
    os.chdir(dir_documents)
    if subjectSource == True:
        dir_sources = "Subjects"
        dir_links = "Authors"
        d3 = sys.argv[2]
        d4 = sys.argv[3]
    else:
        dir_sources = "Authors"
        dir_links = "Subjects"
        d3 = sys.argv[3]
        d4 = sys.argv[2]
    source = dir_sources + "/" + d4 + "/" + d3
    link = dir_links + "/" + d3 + "/" + d4
    os.makedirs(source,0o777,True)
    os.makedirs(dir_links + "/" + d3,0o777,True)
    if not os.path.exists(link):
        os.chdir(dir_links + "/" + d3)
        os.symlink("../../" + dir_sources + "/" + d4 + "/" + d3,d4)
    shutil.copy2(file,dir_documents + "/" + source)

The third, mvdoc.py, is the same as cpdoc.py, except that mvdoc.py moves the file specified into the structure, instead of copying it.

#!/usr/bin/env python
import os
import sys
import shutil

subjectSource = True

argc = len(sys.argv)
if argc < 4 or argc > 5:
    # basename also works when the program is invoked without any "/"
    prog_name = os.path.basename(sys.argv[0])
    print("Syntax: " + prog_name
                     +
          " <file> <author-name> <subject> [<documents-directory>]\n"
                     +
          "<documents-directory> defaults to \"Documents\".\n"
                     +
          "Examples: " + prog_name
                       +
          " ~/Downloads/On_the_Syllogism.pdf \"De Morgan Augustus\" \"Logic\"\n"
                     +
          "          " + prog_name
                       +
          " introductorylec02whatgoog.pdf Whately_Richard Economics MyDocs")
else:
    if argc < 5:
        dir_documents = os.path.expanduser("~") + "/Documents"
    else:
        # made absolute because the working directory changes below
        dir_documents = os.path.abspath(sys.argv[4])
    os.makedirs(dir_documents,0o777,True)
    file = os.path.abspath(sys.argv[1])
    os.chdir(dir_documents)
    if subjectSource == True:
        dir_sources = "Subjects"
        dir_links = "Authors"
        d3 = sys.argv[2]
        d4 = sys.argv[3]
    else:
        dir_sources = "Authors"
        dir_links = "Subjects"
        d3 = sys.argv[3]
        d4 = sys.argv[2]
    source = dir_sources + "/" + d4 + "/" + d3
    link = dir_links + "/" + d3 + "/" + d4
    os.makedirs(source,0o777,True)
    os.makedirs(dir_links + "/" + d3,0o777,True)
    if not os.path.exists(link):
        os.chdir(dir_links + "/" + d3)
        os.symlink("../../" + dir_sources + "/" + d4 + "/" + d3,d4)
    shutil.move(file,dir_documents + "/" + source)

The fourth, lndoc.py, places a symbolic link to a specific work in another directory. It can be used effectively to list a single work under multiple directories, associated with different authors or with different subjects or with both. lndoc.py takes three or four arguments. The first is a file-path, the second the name of an author, and the third the name of a subject. The optional fourth again specifies an alternative to Documents. A command lndoc.py "Reconsideration of the Theory of Value.pdf" "Allen Roy George Douglas" Economics will create, in Documents/Subjects/Economics/Allen Roy George Douglas, a symbolic link to a file named Reconsideration of the Theory of Value.pdf in the present working directory. It will also create the associated directories and a symbolic link Documents/Authors/Allen Roy George Douglas/Economics to Documents/Subjects/Economics/Allen Roy George Douglas if these do not already exist.

#!/usr/bin/env python
import os
import sys
import shutil

subjectSource = True

argc = len(sys.argv)
if argc < 4 or argc > 5:
    # basename also works when the program is invoked without any "/"
    prog_name = os.path.basename(sys.argv[0])
    print("Syntax: " + prog_name
                     +
          " <file> <author-name> <subject> [<documents-directory>]\n"
                     +
          "<documents-directory> defaults to \"Documents\".\n"
                     +
          "Examples: " + prog_name
                       +
          " ~/Downloads/On_the_Syllogism.pdf \"De Morgan Augustus\" \"Logic\"\n"
                     +
          "          " + prog_name
                       +
          " introductorylec02whatgoog.pdf Whately_Richard Economics MyDocs")
else:
    if argc < 5:
        dir_documents = os.path.expanduser("~") + "/Documents"
    else:
        # made absolute because the working directory changes below
        dir_documents = os.path.abspath(sys.argv[4])
    os.makedirs(dir_documents,0o777,True)
    file = os.path.abspath(sys.argv[1])
    os.chdir(dir_documents)
    if subjectSource == True:
        dir_sources = "Subjects"
        dir_links = "Authors"
        d3 = sys.argv[2]
        d4 = sys.argv[3]
    else:
        dir_sources = "Authors"
        dir_links = "Subjects"
        d3 = sys.argv[3]
        d4 = sys.argv[2]
    source = dir_sources + "/" + d4 + "/" + d3
    link = dir_links + "/" + d3 + "/" + d4
    os.makedirs(source,0o777,True)
    os.makedirs(dir_links + "/" + d3,0o777,True)
    if not os.path.exists(link):
        os.chdir(dir_links + "/" + d3)
        os.symlink("../../" + dir_sources + "/" + d4 + "/" + d3,d4)
    os.symlink(file,dir_documents + "/" + source + "/" + os.path.basename(file))

Each of the above four programs contains a line subjectSource = True. If this line is changed to subjectSource = False then the aforementioned complementary structure is created, with the Subjects substructure dependent upon the Authors substructure. The code could be modified so that a command-line switch determined which organization were used.
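One way in which such a switch might be arranged is sketched here, using argparse from the standard library. The option name --author-source and the functions parse_args and plan are assumptions made for illustration; they are not part of the programs above. The sketch computes only the two relative paths that the programs would go on to create.

```python
import argparse

def parse_args(argv=None):
    """Parse the author, the subject, and the optional documents directory."""
    parser = argparse.ArgumentParser(
        description="Plan the directory and link for an author and a subject.")
    parser.add_argument("author")
    parser.add_argument("subject")
    parser.add_argument("documents", nargs="?", default="~/Documents")
    # Hypothetical switch: store the files under Authors instead of Subjects.
    parser.add_argument("--author-source", action="store_true",
                        help="make Subjects dependent upon Authors")
    return parser.parse_args(argv)

def plan(args):
    """Return (source, link), each relative to the documents directory."""
    if args.author_source:
        dir_sources, dir_links = "Authors", "Subjects"
        d3, d4 = args.subject, args.author
    else:
        dir_sources, dir_links = "Subjects", "Authors"
        d3, d4 = args.author, args.subject
    source = dir_sources + "/" + d4 + "/" + d3
    link = dir_links + "/" + d3 + "/" + d4
    return source, link

print(plan(parse_args(["De Morgan Augustus", "Mathematics"])))
print(plan(parse_args(["--author-source", "De Morgan Augustus", "Mathematics"])))
```

Without the switch, the behavior is that of subjectSource = True; with it, that of subjectSource = False.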

The program mkdoclnk.py is used to create the dependent substructure from the independent substructure. That is to say that if the documents are stored in the Subjects substructure and the Authors substructure has not been created or not fully created, then mkdoclnk.py can create or complete the dependent Authors substructure. If the documents are stored in the Authors substructure, then mkdoclnk.py can create a dependent Subjects substructure. If the substructures have different names (such as Autori and Materia), then mkdoclnk.py can accept those. If a hybrid system is to be used, two passes of mkdoclnk.py can complete it.

mkdoclnk.py takes zero to three arguments. If the first is not given, then the source substructure is assumed to be in a directory named Subjects. If the second is not given, then the dependent substructure is assumed to be in a directory named Authors. And if the third argument is not given, then the source and dependent substructures are assumed to be in a directory named Documents. A command mkdoclnk.py constructs the Authors substructure in Documents from the Subjects substructure in Documents. A command mkdoclnk.py Pundits Topics Yappings constructs a Topics substructure in Yappings from a Pundits substructure in Yappings.

#!/usr/bin/env python
import os
import sys

sys.argc = len(sys.argv)
if sys.argc > 4:
    # basename also works when the program is invoked without any "/"
    prog_name = os.path.basename(sys.argv[0])
    print("Syntax:\n " + prog_name
                     + " [<sources-directory> [<links-directory> [<parent-directory>]]]\n"
                     +
          "Defaults: " + prog_name + " Subjects Authors ~/Documents\n"
                     +
          "Example: " + prog_name + " Authors Subjects")
else:
    if sys.argc < 4:
        os.chdir(os.path.expanduser("~") + "/Documents")
    else:
        os.chdir(sys.argv[3])
    dir_home = os.getcwd()
    if sys.argc < 3:
        dir_links = "Authors"
    else:
        dir_links = sys.argv[2]
    if sys.argc < 2:
        dir_source = "Subjects"
    else:
        dir_source = sys.argv[1]
    list_dir_source = [entry for entry in os.scandir("./" + dir_source)
                 if entry.is_dir() and not os.path.islink(entry)]
    for entry in list_dir_source:
        list_dir_links = [sub_entry for sub_entry in os.scandir(entry.path)
                     if sub_entry.is_dir() and not os.path.islink(sub_entry)]
        for sub_entry in list_dir_links:
            dir_candidate = "./" + dir_links + "/" + sub_entry.name
            if not os.path.exists(dir_candidate):
                os.makedirs(dir_candidate,0o777,True)
            if not os.path.exists(dir_candidate + "/" + entry.name):
                os.chdir(dir_candidate)
                os.symlink("../../" + dir_source
                           + "/" + entry.name + "/" + sub_entry.name,
                           entry.name)
                os.chdir(dir_home)
