Facebook Pixel

Directory Structures & Implementations

A directory is essentially a symbol table that translates file names into their corresponding physical directory entries on a disk. The OS needs a logical way to organize these files so users can locate them efficiently.

Over time, as computers evolved from single-user mainframes to modern multi-user personal devices, directory structures evolved from very simple flat lists to highly complex graphs.

1. Single-Level Directory

The simplest possible structure. All files across the entire system are contained in one single, massive master directory.

  • Pros: Extremely simple to implement and understand.
  • Cons: Massive naming collisions. If user Alice names a file `data.txt`, user Bob cannot use that name. Furthermore, users cannot group related files together. This structure is completely useless for multi-user systems and only existed on the earliest personal computers.

2. Two-Level Directory

This structure solves the multi-user naming collision problem. The OS maintains a Master File Directory (MFD), which contains individual User File Directories (UFD). Every user gets exactly one private directory.

  • Pros: Alice and Bob can now both have a file named `data.txt` without any collisions because their files live in separate UFDs. The search path is isolated.
  • Cons: A user still cannot logically group their own files. If Bob has 5,000 files, they are all dumped into his single UFD. He cannot create a "Work" folder and a "Personal" folder.

3. Tree-Structured Directory

The natural evolution. There is a root directory that can contain both files and subdirectories. Subdirectories can contain more files and subdirectories, creating a hierarchy of arbitrary depth.

This is the standard model you are used to on MS-DOS, Windows, and base UNIX systems.

  • Pros: Perfectly allows logical grouping. Users can create deep, organized folder structures for specific projects. Searching is highly efficient using Absolute paths (starting from the root, like `/home/bob/work`) or Relative paths (starting from the current directory).
  • Cons: Sharing files is difficult. A file exists in exactly one place in the tree. If two users need to collaborate on a document, they have to copy it, leading to version control nightmares.

4. Acyclic Graph Directory

This structure extends the tree to solve the sharing problem. An acyclic graph allows directories to share subdirectories and files. A file can appear in multiple places in the directory structure simultaneously, without duplicating the actual data on the disk.

How is it implemented?
UNIX implements this using "Links" (Hard links and Soft/Symbolic links). Windows implements this using "Shortcuts". They are essentially pointer files that redirect the OS to the actual physical file.
  • Pros: Elegant sharing. Two users can collaborate on the exact same physical file from entirely different paths.
  • Cons: Deletion becomes complex. If Alice deletes the shared file, what happens to Bob's link? The OS must maintain "Reference Counts". When a link is deleted, the count drops. The physical file is only deleted from the disk when the reference count hits zero.

5. General Graph Directory

If we allow links to point anywhere without restriction, we might accidentally create cycles (loops) in the directory graph (e.g., Folder A links to Folder B, and Folder B links back to Folder A).

  • Pros: Utmost flexibility in how files are mapped.
  • Cons: Search algorithms and backup utilities can get caught in infinite loops traversing the cycle over and over. Deletion requires complex Garbage Collection algorithms to find unreferenced cycles. Because of these fatal flaws, most modern OSes strictly prevent cycles, restricting themselves to Acyclic Graphs.

Under the Hood: Directory Allocation

Regardless of the logical structure (Tree vs Graph), the OS must physically store the directory data (the list of filenames and their pointers) on the disk. There are two primary ways to implement this under the hood:

  • Linear List: A simple array or linked list of file names with pointers to the data blocks. It is very easy to program, but extremely slow to search. Finding a file requires a linear search O(n) checking every entry one by one.
  • Hash Table: A linear list paired with a hash data structure. The hash function takes the file name and computes exactly where its pointer is located. Search time drops to O(1). This is vastly faster, though the OS must write extra logic to handle hash collisions.

Summary

Directory structures balance simplicity against flexibility. While single and two-level directories are obsolete, the Tree structure forms the backbone of all modern organization. By upgrading the Tree into an Acyclic Graph using Links and Shortcuts, modern operating systems allow seamless file sharing without falling into the trap of infinite loops.

Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.
Please Login.