Summary: I've come up with a variation of hard links officially called AGMS links (Alias Generated Morphing Symbolic links), like Macintosh aliases but better, for my experimental RAM file system for BeOS. The main idea is to let you put your files inside multiple directories; it's useful for filing your data so that you can find your aunt's letter about the cottage in both a directory of aunt related things and a directory of cottage related things. Unlike Macintosh aliases or symbolic links, they are valid even if you delete the original file - one of the AGMS links will morph and become the file itself. Unlike ordinary hard links, it is reasonable to have an AGMS hard link refering to a parent directory - cycles in the file name hierarchy are supported correctly. Finally, because they appear to the OS as symbolic links (which happen to morph once in a while), they are compatible with existing software. How it Appears to the User: Each file system item which is conceptually in multiple directories shows up as a regular file (or whatever it is) in its main parent directory and as AGMS links in the other parent directories. Ideally I'd like to have the AGMS links show up as a new kind of file type (specified by the st_mode field returned by the read status lstat() function), but most software won't understand that, thus I'm using fake symbolic links (until someone rewrites the OS and utilities like "ls"). The use of fake symbolic links for secondary references to a file/directory (rather than hard links) lets existing software work without getting stuck in directory cycles. You'll also see backwards links such as ".." to the main parent directory of an object (a hard link as usual), and "..." as the second one (a fake symbolic link to the directory containing the first AGMS link refering to the item, should really be a hard link but "..." isn't treated like ".." by the OS and utilities so I can't do that), and "...." as the parent directory containing the second AGMS link refering to the item, and so on. Since you can make a cycle of directories (a directory with itself as a subdirectory, even indirectly through other subdirectories, which would otherwise cause infinite loop problems), there is extra code to allow for that sort of situation. It will prevent anything bad from happening, such as deleting the last link between the rest of the file system and a cycle. It also morphs AGMS links into hard links and vice-versa if it needs to change the main parent directory ".." as part of the process of keeping the ".." chain to the root as pure hard links. Historical Development Path, or From Hard Links to AGMS Links: I considered several ways of implementing hard links, half implemented some, and decided that other ways were better. The traditional Unix way of doing hard links just sets the relevant directory entries (containing a name string and an inode number) to all reference the same inode number (Unix inodes don't store the true name of the file). The entries don't even have to use the same name. They only allow the root user to make hard links to directories, and require an extra flag to the "ln" command for safety. Presumably the root user knows enough to avoid making cycles. Pure hard links also have the problem of sharing attributes of the target, such as the position of the icon (a file attribute in BeOS), which causes problems if there are links from different directories (the icon will have the same position in all directory windows, which would look bad in most of them). Unix and BeOS symbolic links store a path string identifying the target (either relative or absolute path) along with the name (doesn't have to be the same as the target) in the directory. The OS and utilities realise that it's not the real file and shouldn't be followed in recursive traversals. It works reasonably well, but if you move or rename the file, it stops working because the path string doesn't change. I much later found out about Macintosh OS "aliases". They are a lot like Unix hard links (renaming or moving the file doesn't cause problems), except that they show up as aliases (names displayed in italic) and the OS and utilities know they aren't the real file or item, thus avoiding recursive loop problems. But when you delete the target of an alias, the alias that remains is broken (using it brings up an error message saying it can't find the target). One idea (now abandonned) I came up with was something I call "unique hard links". Along the way, I found out that the BeOS desktop expects that the true name of an object to match the name in the directory (otherwise draging icons to move them doesn't work correctly). While having multiple names for a file sounds cool to programmers, it's confusing to users, causes problems with the BeOS desktop software, and similar functionality can be implemented with ordinary symbolic links, so I'm open to dropping that requirement. "Unique hard links" are hard links which require that the name in the directory is the same as the true name of the file or directory item. Thus the item can only appear once in a directory, a minor loss in flexability. If you rename a file, all the directories it is in also change their idea of the name to match (thus renaming requires a check for duplicate names in all parent directories). This is nice for users - they see the same name everywhere. However, recursive directory operations (such as using "zip" to compress a directory tree into an archive file) will still get stuck in an infinite loop (until the combined path length exceeds the maximum). Deleting items is a lot more complex - it needs to check that removing a link to a directory won't isolate a section of the file system graph from the root. Another idea I came up with and abandonned was something I call "persistent symbolic links", which turn out to be a lot like Mac aliases. These show up as symbolic links to the file/directory item being referenced, but internally are hard links. I require that all items have one ordinary path from the root directory all the way down to the item (thus imposing a tree structure on the messy directed graph of items). That ordinary path is what shows up in the path string of the fake symbolic links. Internally, each item has an array of parent directories. The first directory (array[0]) is treated as the main parent directory (works like the normal parent directory). The remaining parent directories are auxiliary ones in which the item shows up as a fake symbolic link - the directory listing / Walk() code lists it as a fake symbolic link if the true name is different from the directory's idea of the name or if the directory isn't the main directory of the item. That way recursive directory listings wouldn't get stuck in infinite loops and other operations would work normally. The trouble is that you have to store the full ordinary path to the item in all the hard links, and if the main path changed, all the pseudo-symbolic hard links have to be updated. Alternatively you could build the full path string dymamically by doing a get_node/lock/read name/unlock/put_node for each main parent directory up to the root. For simple operations such as Walk(), that seems a bit excessive. It also brings more chances for deadlock if some other thread is modifying any of those directories. Either way, the string that's returned could be obsolete before it gets used if some other thread renames any of the directories (symbolic links have the same problem). Even worse, when one of the directories in the main path to the root gets its main link deleted, all items attached to it need to have their main parent directory changed so that it uses a directory which has an ordinary path from the root (or if it is the only parent directory of some items, the deletion has to return the directory-not-empty error). This is similar to the check for isolation of a section of the directory graph. An aside on VNodeIDs and Fake Symlinks. This is a trick to have a directory sometimes show up as a symbolic link and sometimes show up as itself, but without creating a real symbolic link. It's a feature that I originally needed for "persistent symbolic links" but now need for for backward references to AGMS link home directories (the "...", "...." and so on extra parent directories that correspond to AGMS links targetting a directory). Actually you could make an AGMS link named "..." but that's awkward and isn't similar to how ".." is handled (and I don't want to store separate attributes for the "..." directory icon position etc). I essentially just want readdir() to list "..." as a symbolic link to the appropriate parent directory. I found out that readdir() gives the caller the vnodeid of the item and its parent, but doesn't specify what kind of file it is so I can't simply tell the caller that it is a symbolic link. D'oh! Instead, "ls" calls rstat() to find out if an item is a symbolic link, file or directory. The trouble is that rstat() takes a pointer to a shadow vnode, and if we used the actual item's pointer, it wouldn't be a symbolic link, it would be whatever it is. So, after much head scratching, I've decided to hand out fake vnodeids when a symbolic link is required in walk() and readdir(). Sure, a few vnodes for fake symbolic links may hang around in memory after deleting the real file, but that shouldn't cause any problems. The fake vnodeid contains a mangled vnodeid of the AGMS link FSO. One way to mangle it is to set the high 32 bits to non-zero and leave the lower 32 bits set to the normal vnodeid. Another way is to add one to the target vnodeid, which is normally an even multiple of 8. I'll use the add-one technique. When one of those mangled vnodeids is converted to a shadow vnode pointer, read_vnode() will use the pointer to the file system object plus 1. Since memory allocation is done on 8 byte boundaries, we can separate real FSO pointers from fake symbolic link shadow vnode pointers by seeing if the lower 3 bits are zero. rstat() will then be able to return info on the file if the provided address is 0 modulo 8, or return info on a simulated symbolic link if the pointer is 1 modulo 8. Now, after half way implementing some other ideas, I'm leaning towards MacOS style aliases with a few extra features. I've also made up a name for them: AGMS links (Alias Generated Morphing Symbolic links). Alternatively it could stand for Absolutely Grand Morphing Symbolic links, or Amazingly Great Morphing Symbolic links :-). Not coincidentally, they are also my initials. AGMS links will appear as symbolic links to the OS and users, but internally they are file system objects themselves which just store a pointer to the target file system object (or a vnodeid - same thing) rather than a path string. When the OS asks for the path, it will dynamically return a string with the current true path. If the target is deleted from its main (true) parent directory, it will change the target's main parent directory to be some other parent directory, replacing the AGMS link object in that directory with the actual target file (the target's true name will also change to match the AGMS link it is replacing). Cycle detection will prevent the user from deleting more than one file/directory at a time yet allow them to delete redundant links to non-empty directories. Implementation Considerations: Because the presence of AGMS links allows for cycles in the file name tree (turning it into a directed graph), we need graph traversal algorithms for ensuring that all directories (except the root) always have a valid main parent directory. Another desired feature is that errors (such as out of memory) during a rename or delete leave the file system unchanged. This leads to a transaction type of approach, where the code figures out all the work that needs to be done, allocates items needed (such as strings for new names), then goes and implements the changes in a way which avoids doing anything that could fail. If it fails at any step before the final implementation step, it just frees what it had allocated and exits. Most operations (create, stat, write, etc) don't involve changes that affect more than one or two things in the file system, so they aren't covered here. So only deletion and renaming need this fancy treatment. Renaming because rename a/b x/y unlinks b from directory a, unlinks y from directory x (maybe deletes it too if it doesn't have any other parents) and adds a link in directory x to b. Deletion (actually, it's more like unlink; deletion of the object doesn't happen until the last parent goes) of a directory can disconnect a ring/cycle of subdirectories from the rest of the file system. An example of this is a tree of directories A/B/C with an AGMS link named D inside C that goes back to B. B is essentially a subdirectory of directories A and C, with B's main parent directory ".." leading to A and B's extra parent directory "..." leading to C. A _ A has B as a subdirectory, ordinary (hard link) connection. |/ \ B | B has C as a subdirectory. | ^ C | C contains an AGMS link named D. | ^ (D) | D is an AGMS link pointing to B, () to show it is an AGMS link. \__/ Now remove the A-B link. As part of that operation, C is promoted to be the new main parent directory of B, and the now useless AGMS link D is deallocated. A _ / \ B | B has C as a subdirectory. | ^ C | C has B as a subdirectory. \__/ B now has C as a parent and C has B as a parent, so they won't get deallocated because they still have parents. However, there's no way of accessing the B/C directories from anywhere else in the file system, or for getting from B/C to the root via ".." operations. We want to detect that. As well you can unlink a directory which formerly provided the current path to the root, leaving some other directories without a series of ".." operations leading to the root. Their main parent directories need to be updated. For example in directory A add an AGMS link X pointing to directory C and then unlink B from A (cd A ; ln -d B/C X ; rmdir B). B's main parent becomes C (via AGMS link D which gets merged with B since you can't have an AGMS link as the main parent directory). Also the main parent of C becomes A (via merging with X), since it has a connection to the root while B just uselessly circles around to C again. Watch out when merging a directory with an AGMS link - there may be other AGMS links pointing to the former AGMS link, which need to be fixed up to point to the directory. Finally, C used to be inside B but when it switched its main parent directory from B to A, it leaves behind a new AGMS link named Z in directory B. Initial situation: A _ A has B as a subdirectory, and contains AGMS link X. /\ / \ (X) B | B has C as a subdirectory. Main parent is A. \ | ^ X is an AGMS link in A which points to C. Main parent A. C | C contains an AGMS link named D. Main parent B. | ^ (D) | D is an AGMS link pointing to B. Main parent C. \__/ After "rmdir B": A A just contains directory C. | _ \ / \ C | C is a directory which contains B. Main parent A. | | B | B is a directory which contains AGMS link Z. | ^ (Z) | Z is an AGMS link to C. \__/ For more details, see the actual code. - Alex (March 8, 2002) Example (November 2003): Sun Nov 2 11:08:49 232 /MyDiskThree/Test>ls -Rla .: total 2 drwxr-xr-x 1 agmsmith agmsmith 3 Sun Nov 02 11:07:47 2003 . drwxrwxrwx 0 agmsmith agmsmith 2 Sun Nov 02 11:07:16 2003 .. drwxr-xr-x 3 agmsmith agmsmith 2 Sun Nov 02 11:04:56 2003 A lrwxrwxrwx 1 agmsmith agmsmith 1 Sun Nov 02 11:07:47 2003 AGMSLinkToA -> /MyDiskThree/Test/A lrwxrwxrwx 1 agmsmith agmsmith 1 Sun Nov 02 11:07:42 2003 AGMSLinkToB -> /MyDiskThree/Test/A/B A: total 3 drwxr-xr-x 3 agmsmith agmsmith 2 Sun Nov 02 11:04:56 2003 . drwxr-xr-x 1 agmsmith agmsmith 3 Sun Nov 02 11:07:47 2003 .. lrwxrwxrwx 0 agmsmith agmsmith 2 Sun Nov 02 11:07:47 2003 ... -> /MyDiskThree/Test lrwxrwxrwx 0 agmsmith agmsmith 2 Sun Nov 02 11:08:29 2003 .... -> /MyDiskThree/Test/A/B drwxr-xr-x 2 agmsmith agmsmith 2 Sun Nov 02 11:08:29 2003 B -rw-r--r-- 1 agmsmith agmsmith 29 Sun Nov 02 11:04:56 2003 This is in directory A A/B: total 2 drwxr-xr-x 2 agmsmith agmsmith 2 Sun Nov 02 11:08:29 2003 . drwxr-xr-x 3 agmsmith agmsmith 2 Sun Nov 02 11:04:56 2003 .. lrwxrwxrwx 0 agmsmith agmsmith 2 Sun Nov 02 11:07:47 2003 ... -> /MyDiskThree/Test lrwxrwxrwx 1 agmsmith agmsmith 1 Sun Nov 02 11:08:29 2003 AGMSLinkBackToA -> /MyDiskThree/Test/A -rw-r--r-- 1 agmsmith agmsmith 29 Sun Nov 02 11:04:51 2003 This is in directory B Sun Nov 2 11:08:51 233 /MyDiskThree/Test>rmdir A Sun Nov 2 11:09:01 234 /MyDiskThree/Test>ls -Rla .: total 2 drwxr-xr-x 1 agmsmith agmsmith 2 Sun Nov 02 11:07:47 2003 . drwxrwxrwx 0 agmsmith agmsmith 2 Sun Nov 02 11:07:16 2003 .. drwxr-xr-x 2 agmsmith agmsmith 2 Sun Nov 02 11:04:56 2003 AGMSLinkToA lrwxrwxrwx 1 agmsmith agmsmith 1 Sun Nov 02 11:07:42 2003 AGMSLinkToB -> /MyDiskThree/Test/AGMSLinkToA/B AGMSLinkToA: total 2 drwxr-xr-x 2 agmsmith agmsmith 2 Sun Nov 02 11:04:56 2003 . drwxr-xr-x 1 agmsmith agmsmith 2 Sun Nov 02 11:07:47 2003 .. lrwxrwxrwx 0 agmsmith agmsmith 2 Sun Nov 02 11:08:29 2003 ... -> /MyDiskThree/Test/AGMSLinkToA/B drwxr-xr-x 2 agmsmith agmsmith 2 Sun Nov 02 11:08:29 2003 B -rw-r--r-- 1 agmsmith agmsmith 29 Sun Nov 02 11:04:56 2003 This is in directory A AGMSLinkToA/B: total 2 drwxr-xr-x 2 agmsmith agmsmith 2 Sun Nov 02 11:08:29 2003 . drwxr-xr-x 2 agmsmith agmsmith 2 Sun Nov 02 11:04:56 2003 .. lrwxrwxrwx 0 agmsmith agmsmith 2 Sun Nov 02 11:07:47 2003 ... -> /MyDiskThree/Test lrwxrwxrwx 1 agmsmith agmsmith 1 Sun Nov 02 11:08:29 2003 AGMSLinkBackToA -> /MyDiskThree/Test/AGMSLinkToA -rw-r--r-- 1 agmsmith agmsmith 29 Sun Nov 02 11:04:51 2003 This is in directory B Sun Nov 2 11:09:03 235 /MyDiskThree/Test>rmdir AGMSLinkToA Sun Nov 2 11:09:24 236 /MyDiskThree/Test>ls -Rla .: total 1 drwxr-xr-x 1 agmsmith agmsmith 1 Sun Nov 02 11:07:47 2003 . drwxrwxrwx 0 agmsmith agmsmith 2 Sun Nov 02 11:07:16 2003 .. drwxr-xr-x 2 agmsmith agmsmith 2 Sun Nov 02 11:08:29 2003 AGMSLinkToB AGMSLinkToB: total 2 drwxr-xr-x 2 agmsmith agmsmith 2 Sun Nov 02 11:08:29 2003 . drwxr-xr-x 1 agmsmith agmsmith 1 Sun Nov 02 11:07:47 2003 .. lrwxrwxrwx 0 agmsmith agmsmith 2 Sun Nov 02 11:04:56 2003 ... -> /MyDiskThree/Test/AGMSLinkToB/AGMSLinkBackToA drwxr-xr-x 1 agmsmith agmsmith 2 Sun Nov 02 11:04:56 2003 AGMSLinkBackToA -rw-r--r-- 1 agmsmith agmsmith 29 Sun Nov 02 11:04:51 2003 This is in directory B AGMSLinkToB/AGMSLinkBackToA: total 2 drwxr-xr-x 1 agmsmith agmsmith 2 Sun Nov 02 11:04:56 2003 . drwxr-xr-x 2 agmsmith agmsmith 2 Sun Nov 02 11:08:29 2003 .. lrwxrwxrwx 1 agmsmith agmsmith 1 Sun Nov 02 11:09:24 2003 B -> /MyDiskThree/Test/AGMSLinkToB -rw-r--r-- 1 agmsmith agmsmith 29 Sun Nov 02 11:04:56 2003 This is in directory A Sun Nov 2 11:09:26 237 /MyDiskThree/Test>