scruss wrote: ↑
Sun Jun 24, 2018 8:00 pm
Okay, very nicely done, but: does it matter that the file/folder paradigm is an “illusion”? Computers are built on abstractions, and a filesystem is one of many.
On 99 out of 100 levels, the "illusion" of a file does NOT matter. For the great majority of people, it has no practical implication.
But there is one giant exception: file creation time – crtime!
Some might think "who cares?" But it is what it is – it happens to be the rabbit hole I'm exploring right now.
Without understanding the "true nature" of a Linux file and its illusions, it would be impossible to grasp what "creation time" actually means. That's why I felt it necessary in my tutorial to first explain the "trinity" of the "file" – the file's name, the file's data, and the file's inode. And yes, there's also the file's path – though as I explained, that's not technically "part" of a file. Understanding all of this is absolutely necessary to have a complete grasp of timestamps.
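The name/data/inode "trinity" can be seen directly with a hard link. The following is a minimal Python sketch, not part of the original tutorial: the function name describe_trinity and the file names are made up for illustration, and it assumes a filesystem that supports hard links. It gives one file two names and checks that both names lead to the same inode and the same data.

```python
import os
import tempfile


def describe_trinity(directory):
    """Create one file with two names and report what the names share."""
    first_name = os.path.join(directory, "book.txt")
    second_name = os.path.join(directory, "same-book.txt")

    with open(first_name, "w") as handle:
        handle.write("Chapter 1")

    # A hard link adds a second name that points at the same inode.
    os.link(first_name, second_name)

    first = os.stat(first_name)
    second = os.stat(second_name)
    with open(second_name) as handle:
        data = handle.read()
    return first.st_ino, second.st_ino, first.st_nlink, data


with tempfile.TemporaryDirectory() as scratch:
    inode_a, inode_b, link_count, data = describe_trinity(scratch)
    print("same inode:", inode_a == inode_b)
    print("names pointing at the inode:", link_count)
    print("data read through the second name:", data)
```

Neither name is "the" file: the timestamps live in the inode that both names share, which is why the inode matters so much for what follows.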
Once you have that understanding, a kind of false labeling is revealed – that file "creation time" is NOT necessarily "file creation time". It's actually "inode creation time"! Of course, if the file still happens to be on the same partition as when you first created it, it actually would be the correct "creation time" – but only by coincidence!
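To make the "new inode" point concrete, here is a rough Python sketch. It does not read crtime itself. It only imitates a move to another partition as a copy followed by a delete (which is what mv falls back to when a plain rename across filesystems is not possible), all inside one temporary directory. The function name simulate_cross_partition_move and the file names are hypothetical, and the 2015 timestamp is simply backdated for the demonstration.

```python
import os
import shutil
import tempfile


def simulate_cross_partition_move(source, destination):
    """Copy data and timestamps to a new file, then delete the original."""
    before = os.stat(source)
    shutil.copy2(source, destination)  # copies the data and the mtime
    after = os.stat(destination)       # both files exist at this point
    os.remove(source)
    return before, after


with tempfile.TemporaryDirectory() as scratch:
    original = os.path.join(scratch, "book.txt")
    moved = os.path.join(scratch, "book-on-new-partition.txt")
    with open(original, "w") as handle:
        handle.write("The Philosophy of Mike")

    # Backdate atime and mtime to 1 January 2015 (UTC) for the demonstration.
    os.utime(original, (1420070400, 1420070400))

    before, after = simulate_cross_partition_move(original, moved)
    print("mtime preserved:", before.st_mtime == after.st_mtime)
    print("new inode:", before.st_ino != after.st_ino)
```

The data and the modification time survive the trip, but the destination is a brand-new inode, so any "creation time" stored in that inode can only describe the moment of the move.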
In fact, I would argue that all of this reveals a conceptual gap in the ext4 filesystem. It's not a bug – but at the very least, it reveals the overly "filesystem-centric" attitude of ext4.
Most computer scientists would agree that in an ideal world, a filesystem should be the SERVANT OF DATA, not the other way around. In other words, it should be as unobtrusive as possible and fully "respect the data" – both the file's content data and, whenever possible, its metadata. After all, a filesystem is only a means to an end. Therefore, "the data" should not be forced to conform to the filesystem unless there's a very good reason.
Let me prove my point with a little story:
When did work begin on The New York Times bestseller, The Philosophy of Mike? [Don't bother looking it up on Amazon – this is a thought experiment!]
The answer is January 1, 2015. That's the day I started working on my book and saved the file for the very first time. That was my book's original "creation time"!
NOTE ON THE SUBTLE, BUT IMPORTANT, CHOICE OF TERMINOLOGY:
I honestly don't like the term "creation time" for any file-related purpose. Why? Because it raises a fundamental question about the ultimate meaning of "creation". Is a book "created" when you only completed the first page and then saved the file for the first time? Or is it only truly "created" when you completed the book and saved the file for the very last time? In the real world, almost everyone would agree that a book isn't truly "created" until the book, in its final published form, actually exists!
Think about the famous Michelangelo sculpture, David. Most people would agree that David wasn't "created" when all that existed was his big toe. That literally was NOT David – it was simply a toe! Perhaps it was the beginning of the David "origin story", but it certainly wasn't David! So by any meaningful standard, David was not "created" at that point. But if it were based on the Linux ext4 notion of "creation", David was indeed "created" when his toe first appeared!
As a result, I prefer the term "BIRTH" to sidestep the misleading implications of "creation" time. The term "birth" is conceptually superior and closer to reality because it implies that it's "just the beginning" of the file – just as a person's final adult form is NOT "created" at the moment of birth. In the real world, as a baby grows to a child and then adulthood, it's continuously "modified" – exactly what happens to a typical file as you work on it over a period of months or years. But knowing the birth time is still essential if you ever want to answer a very basic question: "hmmmmm... when did I first start writing my book?" That, to me, is asking when the book was "born". It wasn't completed or "created" at that point – but it was born!
OK, so back to my story...
Because I'm such a speedy writer, I completed the entire book only a month later – on February 1, 2015. Hence, that was the "last modification" time.
Then, a month later on March 1, 2015, I opened the file, hit the print button, and placed the printed copy in my bookcase. That would be the "last access" time.
Then, three years later in 2018, a neighbor of mine asked if he could borrow my printed copy and keep it in HIS bookcase for a while. I said "no problem dude – anything for a neighbor!"
A few months later, he returns the book to me and I happen to notice that it now says "Creation Date: June 25, 2018".
How could that be? Did he somehow rewrite my book and create a new philosophy? It's a 2015 book and I personally wrote it – so I know it wasn't created in 2018. To resolve the mystery, I use an OCR scanner to capture all "the data" – both the raw data contents of the book AND the "metadata" of the book's cover.
After running a binary compare, I confirm that all "the data" has remained unchanged – except for the creation date!
So I confront him about this oddity.
He then explains to me that the entire town has slipped into a parallel universe. Everything's the same except for one thing: All bookcases now have a hidden "booksystem". He explains that just as computer files require a "filesystem" for storage, bookcases now require a "booksystem". He then explains the hidden architecture of this booksystem. He says that one of the core components of the booksystem is the "bnode". He says it's just like the inode on Linux systems – except the "b" stands for book!
Apparently, whenever a book is placed inside a new bookcase – the equivalent of a new partition – the book must be stripped of its original "creation time" and reassigned a new, arbitrary "creation time" that's based on the arbitrary time that someone placed the book in the new bookcase!
He says this must happen because that's simply the nature of the booksystem – the book must be assigned a new bnode!
I tell the guy I don't care what the "internal logic" of this ridiculous booksystem is! Everyone knows that my famous philosophy book was created in 2015 – not 2018. But because he drank the Kool-Aid of the new universe, he insists that it makes perfect sense – because all that matters is the date of "bnode creation"!
I tell the guy that's insane – but then he claims that it's a "technical limitation" of the bnode system.
First of all, WHO CARES! If that indeed is a technical limitation of the booksystem, then all that means is that the booksystem itself is bogus and needs to be fixed!
But even that doesn't make sense. You see, before I handed it to my neighbor, I used 2 separate ink stamps to place a "last accessed" and "last modified" timestamp on the "metadata" of the outside cover.
Both the atime and mtime timestamps had been completely preserved – exactly the same thing that happens on a Linux system when you move a file from one partition to another! Both timestamps still said "2015". [Yes, that's right – it means the neighbor just wanted to impress his friends by displaying my book in his bookcase. He never actually opened it or read it.]
This definitely reveals a conceptual flaw in the booksystem that CANNOT be excused by any technical explanation. Why? Because the preservation of the atime and mtime metadata proves that the booksystem IS capable of transferring metadata from the bnode of my bookcase to the bnode of my neighbor's bookcase!
Now, back to the world of Linux:
Maybe there is some value in recording the inode creation time – I'm not necessarily against that if there's at least some valid "use case".
But if there is a valid use for it, it should not be given a misleading label. Instead, if it does have some value, I propose that it be called itime – "inode creation time" – the time the inode on the current partition was "created" to accept the new file. Sound familiar? It is! It's exactly what crtime means today!
For backward compatibility, if there is a future Linux filesystem such as ext5, the current crtime metadata could be mapped to itime. In other words, itime in the future would simply mean what crtime means today.
And then there would be "birth time" – btime – the time when the file was originally "born". In other words, btime would be when the newborn file was first saved by the author on its original partition. It would always answer a very basic question at any time in the future, no matter what partition it found itself on: "When did the author start working on the book – when was the file first born?"
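The difference between itime and btime can be sketched as a tiny Python model. This is purely illustrative: ProposedTimestamps and move_to_new_partition are hypothetical names, nothing like them exists in ext4 or in the kernel, and the dates are the ones from the thought experiment above (the pre-move ctime is assumed to equal the last modification).

```python
from dataclasses import dataclass, replace
from datetime import datetime


@dataclass(frozen=True)
class ProposedTimestamps:
    """Hypothetical ext5-style timestamps; not a real kernel structure."""
    btime: datetime  # Birth: first save on the original partition
    itime: datetime  # Inode: inode creation on the current partition
    mtime: datetime  # Modify
    atime: datetime  # Access
    ctime: datetime  # Change


def move_to_new_partition(stamps, when):
    """Return the timestamps a file would carry after a cross-partition move."""
    # btime, mtime and atime travel with the file; itime and ctime are reset.
    return replace(stamps, itime=when, ctime=when)


book = ProposedTimestamps(
    btime=datetime(2015, 1, 1),
    itime=datetime(2015, 1, 1),
    mtime=datetime(2015, 2, 1),
    atime=datetime(2015, 3, 1),
    ctime=datetime(2015, 2, 1),  # assumed: last status change was the final save
)
moved = move_to_new_partition(book, datetime(2018, 6, 25))
print("Birth:", moved.btime.date())
print("Inode:", moved.itime.date())
```

Under this model the move changes only the two values that describe the new inode, and the birth date stays with the file wherever it goes.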
In case Theodore Ts'o or some other Linux luminary stumbles across this post some day, here is my proposal for how the stat command could behave in a future ext5 filesystem – and how it compares to the current ext4 implementation.
NOTE 1: The stat command currently displays atime as "Access" and ctime as "Change". To remain consistent with that convention, I've displayed btime as "Birth" and itime as "Inode".
NOTE 2: The timestamps in the following graphic are based on the fanciful thought experiment you just read. It's what the stat command would generate AFTER my neighbor moved the book file to his "partition". In other words, this is what HIS system would say when he ran the stat command on my book's file. Except this version is my ext5 proposal – we are no longer in the bizarre parallel universe – so the timestamps actually make sense. Birth is btime – the time I first started working on my book and initially saved the file. And Inode is itime – the time my neighbor moved the file to his system (and thus the time when a new inode was created on his computer). Note that Change Time (ctime) also reflects the time my neighbor received the file – since any movement of a file is considered to be a change in the file's status (as is currently the case with ext4):
PS: For those reading this in the present day, check out my implementation of the "xstat" command – it provides everything the stat command currently does PLUS "creation time" – which as you know, at the moment, really means "inode creation time"!