RPi_Mike
Posts: 68
Joined: Sat Dec 09, 2017 12:57 am
Location: United States

TUTORIAL: File Sorting on the Raspberry Pi

Sun Jun 10, 2018 3:39 am

INTRODUCTION: Whether you're a secretary or a scientist, the ability to sort files is about as fundamental as it gets in the world of computing. The problem with this "obscure but fundamental" topic is that the way computers sort files by name is COMPLETELY MESSED UP! You will not find a truly universal standard for sorting files and the Raspberry Pi is no exception. Instead, you will find a variety of different behaviors that make little or no intuitive sense to the vast majority of people. Even within the Linux world, you will find several different implementations that affect the order in which files are presented or processed. To be clear, I'm not suggesting there's any bug or mistake with the Raspberry – or any other computer. Like most things, if you take a "deep dive" on the peculiarities of file sorting on any system, you will usually find that it has some kind of "internal logic" that explains its behavior. That's what my tutorial explores. More importantly, I reveal a powerful new sorting technique that I developed after much experimentation and testing.

RASPBERRY PI FILE SORTING BEHAVIOR: To make things even more confusing, sorting behavior doesn't just vary from one computing system to another – it can vary WITHIN a single system as well! This is very much the case for the Raspberry and its official operating system – Raspbian Stretch. Just to prove my point, I assembled a group of test files and ran a series of sorting experiments. The file names I chose begin with an assortment of numbers, letters, words, and symbols. In my chart, the first two columns are probably the most relevant to most users. The first column shows how File Manager (PCManFM) sorts files by ASCENDING NAME. The second column shows how Terminal sorts the same files when you run the extremely fundamental "ls" command – also by ASCENDING NAME. The ls command is also referred to as the "list command" (that's a lower-case "l" as in "list", not the number 1). I'll explain the other columns in a moment, but your key takeaway at this point should be how RADICALLY DIFFERENT the sorting behaviors are. Pick any color and follow it with your eyes. You will see that File Manager and the ls command – even though they are BOTH part of the same operating system – will sort the exact same files in very different ways:
File_Name_Sorting_RPi_Mike.png
To view this file sorting image at full resolution, right-click and select "open image in new tab" – or on phones and tablets, “tap and hold” and save it to your pictures for full-size viewing.

IMPLICATIONS OF MY CHART: As you can see in the above chart, Raspbian's File Manager does a fairly impressive job of sorting files (by name) in a rather intuitive, predictable and "human friendly" way. Unfortunately, File Manager is basically just a nice piece of custom code that rests on top of the "core" Raspbian operating system. Most programs and scripts you use will NOT rely on the kind of sorting behavior you see in File Manager. Instead, they will typically rely on the sorting behavior exhibited by the ls command, which is part of the GNU "coreutils". As the author of the gigantic FFmpeg / mpv tutorial, I first encountered this phenomenon when experimenting with numeric image sequences that need to be fed into FFmpeg to generate a movie – where the frames must be in the proper order. FFmpeg – like almost every piece of software – is simply not going to reinvent the wheel. In other words, if an operating system already has a built-in sorting mechanism, no rational programmer is going to waste their time creating an entirely new sorting algorithm on their own. In this regard, FFmpeg is no exception. But that can be a real problem if you don't understand how Raspbian's "internal" sorting mechanism behaves. And even if you do understand how it works, that alone is not enough – because you'll still need to know how to control it and make it bend to your will so that the sorting behavior conforms to your wishes!

BIG PROBLEM: If you ask any reasonably intelligent kindergartner to list, in order, the numbers 1, 2, and 10, almost all of them will give you the correct answer. They will say "1, 2, 10". But look at the results of Raspbian's internal "sorting engine" – which is reflected by the output of the ls command. It thinks the correct order is "10, 1, 2". If you were feeding an image sequence into FFmpeg, for example, you would end up with a movie where *ALL 3* frames would be in the wrong location. That would be quite the strange movie – a movie that begins in the future, jumps to the past, and ends in the middle! Now I'm not saying that "10, 1, 2" is wrong – but it's certainly not "human friendly" or even slightly intuitive to most people. There is, however, a certain kind of strange logic to it. If you think about it, the number 2 is the "odd man out" in that numeric sequence. If you look at them more as letters rather than numbers, only 10 and 1 have something in common. That commonality is that they both begin with the same character – the number "1"! In other words, if you think of "1" as being the equivalent of the letter "A", they both begin with "A"! As a result, from a grouping perspective, it makes sense to group the 10 and the 1 together before you get to the 2! (As for why the 10 lands ahead of the 1 with file names like 10.jpg and 1.jpg, the likely explanation is that Raspbian's default locale skips over punctuation such as the dot on its first comparison pass – so the names are effectively compared as "10jpg" and "1jpg", and "0" sorts before "j".) I'll come right out and say it, however – no matter what the historical reasons or other explanations, I think that's a RIDICULOUS approach to sorting! Whenever possible, humans should be the ones training computers – not the other way around! If you have to completely rethink the rules of something as basic as sorting, you're allowing yourself to be trained by the computer! That would not necessarily be a bad thing if that new "training" led to a deeper, more enlightened understanding of the universe – but if it forces you to think in a "messed up" way, that's not good at all!

NAME AND TIME: When all is said and done, there are really only 2 main ways to sort a file – by name or by time! In other words, you can either sort files by their file name or by their timestamp. Both of these are part of the file's "metadata", since they're not technically part of the file's contents. If you have a picture, for example, the picture itself obviously consists of "picture data" – the various colors and brightness levels of each pixel. If you have a picture of a tree, for instance, the file name and timestamp are clearly not part of the tree's photographic depiction – which is why they are considered to be "meta" to the image itself. Now, before anyone starts correcting me, I'm fully aware that you can also sort files by size and type – such as whether they are large or small or end with an extension like .jpg or .txt. That kind of sorting can be extremely useful in limited cases. But the focus of my tutorial is PRACTICAL file sorting – so although I won't ignore the sorting of files by size or type, I will only mention that in passing. Besides, those methods of sorting are quite self-explanatory!

RENAMING FILES: If you don't like the default sorting behavior of Raspbian's "internal engine", you have only one realistic choice in most cases – YOU MUST RENAME THE FILES IN THE ORDER YOU WANT THEM! In theory, you might be able to change a program's internal file sorting order as it processes or displays files – but that capability is RARE! In a few cases, such as the excellent image-viewing program Feh, there is some limited flexibility in this regard – but it's certainly not a common feature!

For example, Raspbian's default image viewer – GPicView – does NOT display files in any recognizable sorting order. It clearly uses the file name to determine the order and it seems to have its own "method to the madness", but it's certainly not intuitive or useful to the average person. For instance, it will display 00 before 0 and 10 before 1 – so you might think it's following the same order as the standard ls command (without any options). But that is NOT the case – because it places the tilde and underscore characters at the top of the sorting list, whereas the ls command will put those toward the bottom. I won't bore you with more details – but suffice it to say that GPicView also does not match the behavior of File Manager, POSIX, or "natural" sorting! Clearly, in a case like this, if you want GPicView to flip through a bunch of images in a predictable manner, your only choice is to rename the files themselves – in a predictable order that you control!

Another theoretical approach would be to alter your system's environment variables so that the internal sorting behavior changes. I demonstrated that in my chart when I switched the behavior from "Raspbian style" sorting to POSIX-style sorting by invoking "LC_ALL=C" as a prefix to the ls command. But as you can see, POSIX-style sorting generated an equally absurd sequence of "1, 10, 2" for the numeric series of "1, 2, 10". That's hardly an improvement in my eyes! You can also trigger the ls command's "-v" option to generate a "natural style" sorting behavior. Basically, this attempts to understand version numbers inside file names and sort them accordingly. This is all well and good, but keep in mind that for 99% of software, such as FFmpeg, you're not going to be able to get "under the hood" and slip in a "-v" option to alter your software's sorting behavior! Instead, with almost all software, it's simply going to mirror the behavior exhibited by the ls command – without any options or alterations. Finally, just to complete the explanation of my chart, the "-X" option sorts by extension type – .bmp files come before .jpg files, for example. If you look carefully, however, you'll see that it precisely matches the sorting behavior of the standard ls command without any options – except that it also takes into account the alphabetical order of the file extension. No great revolution there! If you read the official manual pages for the ls command, you'll see that my chart pretty much reflects the sum total of name-based sorting options. [Yes, super nerds, there are also options to sort by ctime, atime and directory order. But since ctime is merely "change time" and not "creation time", it has very limited value. Separately, since Raspbian uses a "relatime" setting like most Linux systems do by default, atime is virtually useless. And directory order? Hardly worth talking about!]
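The three name-sorting modes discussed above can be compared side by side without renaming anything. What follows is only a sketch: it assumes GNU ls (needed for the "-v" option), it works in a throwaway folder created with mktemp, and the variable names are made up for the demonstration. The plain "ls" result depends on your locale settings, so no particular output is promised for it.

```bash
# Work in a scratch folder so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 1.jpg 2.jpg 10.jpg

# Default behavior: depends on the locale settings of your system.
default_order=$(ls -1)

# POSIX-style: compares raw byte values.
posix_order=$(LC_ALL=C ls -1)

# "Natural" (version) sort, a GNU ls option.
natural_order=$(ls -v -1)

printf 'default:\n%s\n\nPOSIX:\n%s\n\nnatural:\n%s\n' \
  "$default_order" "$posix_order" "$natural_order"
```

With LC_ALL=C the order is 1.jpg, 10.jpg, 2.jpg, and with "-v" it is 1.jpg, 2.jpg, 10.jpg.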

PADDING WITH LEADING ZEROS: This, in my opinion, is the ultimate way to exercise total control over the sorting behavior of your files. Whether it's Linux, Windows or Mac, there is only one truly reliable, "universally recognized" file sorting scheme. If your file names begin with a numeric sequence that always consists of the exact same number of digits – a consistent format that is assured through the use of "leading" zeros that are "padded" to an underlying number series – your system will always recognize them in their intended order! So, for example, 001.jpg, 002.jpg and 010.jpg will always be correctly interpreted in the proper "1, 2, 10" sequence by all modern computing systems. But what is the best way to do this?
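To see why equal-width numbers are safe, here is a small sketch that pads 1, 2 and 10 to three digits with printf and then sorts the resulting names two ways. The numbers, the .jpg extension and the variable names are just sample values.

```bash
# Build zero-padded names from plain numbers (deliberately out of order).
padded=$(for n in 10 2 1; do printf '%03d.jpg\n' "$n"; done)

# Sort them two ways: raw byte order and the current locale's order.
byte_sorted=$(printf '%s\n' "$padded" | LC_ALL=C sort)
locale_sorted=$(printf '%s\n' "$padded" | sort)

printf '%s\n' "$byte_sorted"
```

Both sorts give 001.jpg, 002.jpg, 010.jpg, because names of equal length that differ only in their digits compare the same way as the numbers themselves.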

MY UNIVERSAL FILE SORTING FORMULA: I have personally struggled with this issue, off and on, for several months. Although I quickly came up with a variety of methods for specific sorting scenarios, my dream was to generalize them into a "universal" command line that could be applied to almost ALL sorting scenarios. To be clear, I am certainly not the Einstein of file sorting. My universal formula borrows ideas from more than a dozen Internet postings that I discovered through many hours of Googling. My unique contribution, however, is that I have taken all of those disparate ideas – often presented in completely different contexts – and synthesized them into one unified formula. Even that wasn't enough, however, to perfect my method. I also spent several frustrating hours tweaking the "equation" until it finally worked without throwing inexplicable error messages or behaving in unexpected ways. This was definitely a case of mining "undocumented obscurity" to exploit the hidden power of Raspbian Linux for all it is worth! So here it is – my universal file sorting command line:
File_Sorting_Command_Line_RPi_Mike.png
NOTE: Several variations of my command line are available in copy-friendly plain text in the examples section that appears at the bottom of my tutorial. To view this 1080-wide command line image at full resolution, right-click and select "open image in new tab" – or on phones and tablets, “tap and hold” and save it to your pictures for full-size viewing.

start=1: As you may have noticed, I highlighted – in alternating blue and red – the 5 adjustable elements inside my file sorting command line. So let me start by explaining "start=1". What this does is very simple: It determines the "starting number" for your sorted file sequence. If, for example, you changed "start=1" to "start=300", your file names will begin with 0300, 0301, 0302, etc. If you need to insert files into an existing sequence – such as the frames of a movie – it can be very handy to have FULL CONTROL over the starting value – which also means, by extension, that you also have full control over the ending value. In most cases, however, "start=1" is probably the most useful setting. If you set it to 1, your file names will begin with 0001, 0002, 0003, etc. That is certainly the most "universal" setting. But my command line's start value is extremely flexible. You can even set it to "start=0". If you do that, your file name sequence will go like this: 0000, 0001, 0002, etc.
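As a quick illustration of the start value, the sketch below only prints the names that a start of 300 would produce for three made-up files. It does not rename anything, and the file names are placeholders.

```bash
start=300
names=$(for f in a.jpg b.jpg c.jpg; do
  printf '%04d - %s\n' "$start" "$f"   # pad the counter to 4 digits
  start=$((start + 1))
done)
printf '%s\n' "$names"
```

This prints "0300 - a.jpg", "0301 - b.jpg" and "0302 - c.jpg".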

-tr: In the world of Raspbian, the "-tr" option means "reverse time" order – the "t" for time, the "r" for reverse. But don't let that fool you! Quite ridiculously, different programs and file managers can use OPPOSITE meanings for a "reverse time" sort – although the ls command itself is consistent, because POSIX defines "-t" as newest first and "-r" as the reverse of that. So in Raspbian, "reverse time" counterintuitively means what I and most rational people would refer to as a "chronological sort" – NOT a "reverse chronological" sort. In my book, "reverse time" would always mean "reverse chronological" order – not "chronological" order. Here's a common sense example: If you asked the average person to list, in chronological order, the birth years of 1970, 1980, and 1990 – based on the times those people were "created" – almost all of them would say "1970, 1980, and 1990"! But if they thought like Raspbian's internal sorting engine, they instead would say "1990, 1980, and 1970"! That's probably because the people behind coreutils thought it would be more convenient to list "newest files first" as the default behavior. In other words, without adding the "-r" for reverse option, the default behavior is to place the newest files you created or edited at the top of the list. I actually agree with this approach on ONE BIG LEVEL – because it's my preferred sorting behavior as well. After all, most of the time when you're doing things on your computer, you want to see the most recent files you were working on at the top of the list! What I object to, however, is the decision to construe this as being "chronological" in time. It's not! It's definitely "reverse chronological" order no matter what they say!

RASPBIAN'S BIZARRE NOTION OF TIME – TEST IT YOURSELF: Don't believe me? Try it yourself by creating a brand-new empty folder called "TEST" – or whatever name you wish. Then, open that folder and right-click inside it. Then click "Create New" and then click "Empty File". Using that method, create 3 empty files called 1970, 1980 and 1990. Be sure to create "1970" first and "1990" last. Then, open that folder in Terminal and run the following command. The "-1" at the end forces the ls command to list "one file per line" – a much more "human readable" format than having to read the file names horizontally across the screen:

ls -t -1

You will see that the output – although in theory being in "normal" time order – is actually what most people would call "reverse chronological" order. Think about the people who were born as being letters in the alphabet. Who is the equivalent of the letter "A" in the alphabet? In other words, who came first? It's obviously the person born in 1970! Ascending chronological order should clearly be A, B, C – not C, B, A! Anyway, here's the absurd output the "normal" time command produces:

1990
1980
1970

But if you add "r" for reverse and run this command line instead:

ls -tr -1

You will get the following output – which I consider to be the correct, perfectly normal "chronological" order – even though they consider it to be in "reverse" time order:

1970
1980
1990
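The same test can also be run entirely from Terminal. This sketch sets the modification times explicitly with "touch -t" (the dates are arbitrary) instead of relying on the order in which the files were created, and it works in a scratch folder made with mktemp. The variable names are only for illustration.

```bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Give each file a distinct modification time (format: YYYYMMDDhhmm).
touch -t 201806100100 1970   # oldest
touch -t 201806100200 1980
touch -t 201806100300 1990   # newest

newest_first=$(ls -t -1)     # "normal" time order
oldest_first=$(ls -tr -1)    # "reverse" time order

printf 'ls -t:\n%s\n\nls -tr:\n%s\n' "$newest_first" "$oldest_first"
```

The "-t" listing comes out as 1990, 1980, 1970 and the "-tr" listing as 1970, 1980, 1990, matching the outputs shown above.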


DO YOU TAKE PICTURES? Understanding Raspbian's odd notion of time is probably most important when dealing with pictures and images. If you took a bunch of pictures with your camera on a vacation, you'd probably like to have those pictures in the same order you took them! So if you took a picture on Monday, you'd probably like that to appear before another picture that you took on Tuesday. Likewise, if you've turned your Raspberry into a surveillance camera by using the outstanding motionEyeOS system, you'd probably want the image captures to be sorted in the same order they were acquired. The problem is that if Raspbian's internal sorting engine doesn't like the EXACT format of your file names – and it can be very picky – it will NOT sort them the way you want! Instead, they will end up completely out of sequence!

In Raspbian, therefore, it's important to appreciate the value of "reverse time" (-tr) sorting – because it will cause the oldest picture or file to be listed FIRST. If you think about it, the oldest picture is also the FIRST PICTURE taken by your camera! So for almost all image applications, you'll want Raspbian's "reverse time" sorting! Technical note: Sorting with the ls command is actually based on the most recent file modification time – not the "creation" time. But when you're dealing with image sequences, for example, modification time is usually the same as the creation time – unless of course the image was later edited and re-saved under the same file name.

*.jpg: This item in my sorting command line is entirely optional – but it can be critical in certain cases. Under ideal circumstances, the contents of your folder would be entirely "pure". In other words, the only files inside it are the ones you wish to sort. This is certainly the cleanest way to approach things. But if you have other unrelated files – such as .txt or .wav files, for example – you need to either remove them from the folder first – or filter them out when you run the sort. That's because the sorting technique I developed is completely agnostic. From its standpoint, *ALL* files in a folder are fair game. That makes it both powerful and dangerous if you don't know what you're doing! So in this example, we've added "*.jpg" to ONLY sort files that end in ".jpg". That means it will ignore all .png files, .mp4 files, .txt files, etc. You can obviously change "*.jpg" to "*.png" or "*.txt" – or whatever you want. You can also completely remove that item from my command line if you don't need it or want it!

%04d: This item controls the "digit format" of the sorted files. If you're certain that you will never have a need for more than 9,999 sorted files, "%04d" is perfect. But if you're doing an intense project that might involve several million files – a possibility I'm raising for the benefit of readers on computers more powerful than the Raspberry – you need to carefully consider how many "digits" you'll need. For example, standard 30 frame-per-second video – if run continuously for 24 hours – will generate almost 2.6 MILLION frames per day! That's 2,592,000 frames to be exact. If you notice, that's 7 digits long. However, in less than 4 days, it will break the 10 MILLION frame mark. At that point, you're now in 8-digit territory. So be sure to "plan ahead" when you select the digit value. It's always best to "over plan" and give yourself some extra wiggle room. So if you're planning to break 10 million files, be sure to set it to at least "%08d", which lasts about 38 days at that rate – or even better, "%09d" so that you can make it past the one-year mark! For many common applications – like a brief list of a few dozen items in a folder – using "%02d" is probably the cleanest-looking format.
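The digit planning above is simple arithmetic, and bash can do it for you. This sketch assumes the same 30 frames per second running around the clock; the variable names are made up, and the day counts are rounded down to whole days.

```bash
fps=30
frames_per_day=$((fps * 60 * 60 * 24))
echo "frames per day: $frames_per_day"

# For each digit width, how many whole days fit before the counter runs out?
for width in 7 8 9; do
  max=$((10 ** width - 1))   # largest number that fits in this many digits
  echo "%0${width}d: up to $max files, about $((max / frames_per_day)) days"
done
```

At this rate 7 digits last about 3 days, 8 digits about 38 days and 9 digits about 385 days.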

- $f: This useful option includes the ORIGINAL file name in the newly renamed file. So instead of "My First Picture.jpg" being renamed to "0001.jpg", it will actually be renamed to the very clear and legible "0001 - My First Picture.jpg". I personally love this option because it's THE BEST OF BOTH WORLDS! You get all the sorting advantages of prefixing it with a properly padded number with leading zeros – but you also get to retain the original file name (which will have no impact on the sorting order). But if for some reason you wish to have "pure" numeric values for your newly sorted and renamed files, you can easily DELETE this item from the command line. But you must "delete" it in a very specific way. Please see my "CRITICAL NOTE" in EXAMPLE 2, below, for the simple change you need to make!

WARNING: My command line has proven itself to be very reliable. But "external" events that have nothing to do with my command line can strike at any time – for example, your Raspberry could get hit with a power surge while it's actively sorting and renaming your files. That could permanently mangle them! So if the files you're sorting are of any great importance, there's only one way to protect yourself with 100% certainty: First make a backup copy of all the files you're about to sort and put them on a physically separate storage device! Just sayin'.

CRITICAL TIP: Make sure File Manager is CLOSED when you run the sorting command line. Otherwise, it will process the files much more slowly – because File Manager will be constantly updating the displayed contents of the open folder. As long as you follow that tip, the command line is very fast. It will sort and rename several thousand files per minute on the Raspberry Pi 3!

PASSIVELY TEST BEFORE YOU SORT & RENAME: You should always passively and "non-destructively" test how different options with the ls command affect the sorting order BEFORE you commit to anything! In other words, open Terminal in your folder and run "ls -tr -1", for example, to see how "chronological" sorting order will behave with your files. If it happens to work well for your particular purpose, then you might as well "burn it in" to your actual file names. That way, a standard file name sort in either File Manager or Terminal with the ls command – without any options – will automatically list them in the order you want! It also means that any software that uses the Raspberry's "internal sorting engine" will process them in the correct order. Remember to always end your command line with a "-1" when you run your passive test – because that way it will list each file on easily readable separate lines!
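One way to take the passive test a step further is a "dry run" that prints each planned rename without touching any file. This is only a sketch under a few assumptions: it uses a scratch folder and two made-up file names, it relies on bash, and it passes the file name to printf as data rather than running mv.

```bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch -t 201806100100 "second by name.jpg"   # older file
touch -t 201806100200 "first by name.jpg"    # newer file

start=1
preview=$(ls -tr -- *.jpg | while IFS= read -r f; do
  # Show the old name and the name it WOULD get; nothing is renamed.
  printf '%s  ->  %04d - %s\n' "$f" "$start" "$f"
  start=$((start + 1))
done)
printf '%s\n' "$preview"
```

The older file is listed first and would receive the 0001 prefix, while both files keep their original names on disk.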



EXAMPLE 1: YOU TOOK A SERIES OF PICTURES WHERE THE FILE NAMES NEED TO BE SORTED IN CORRECT CHRONOLOGICAL ORDER – WHILE ALSO RETAINING THE ORIGINAL FILE NAMES:

1.jpg
2.jpg
3.jpg
10.jpg
11.jpg
12.jpg


Unfortunately, Raspbian's internal "sorting engine" – including the ls command – will sequence the file names in this non-numerical, non-chronological order:

10.jpg
11.jpg
12.jpg
1.jpg
2.jpg
3.jpg


To address that, run this command line – it's the same one that appears in my "universal sorting formula" graphic:

start=1; ls -tr *.jpg | cat -n | while read n f; do mv "$f" "`printf "%04d - $f" $start`"; ((start++)); done


The files will now be renamed to this:

0001 - 1.jpg
0002 - 2.jpg
0003 - 3.jpg
0004 - 10.jpg
0005 - 11.jpg
0006 - 12.jpg


IMPORTANT COMMENTS: Notice how it kept the original file names in perfect condition – but added a properly padded numeric sequence as a prefix! This now makes the pictures "universally sortable" by almost any computing system or software. For any of this to work, of course, an extremely modest assumption is being made – that when your camera takes a picture, the file receives a basic timestamp. And no – I'm not referring to the EXIF data that may also record the time the picture was taken. None of that is needed! All that's required is that the file itself has a simple time associated with it. What makes my technique even more flexible is that the time and date don't even have to be correct. Even the year could be wrong. All that matters is that the pictures were saved in chronological order. I'm not aware of any modern camera that doesn't behave in this manner.
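For file names that contain a percent sign or a backslash, a slightly more defensive variant may be useful. To be clear, this is a sketch and not the command line from the graphic: it drops the "cat -n" step (whose line number goes unused), hands the file name to printf as data through "%s" instead of placing it inside the format string, and uses "read -r". It still assumes bash and file names without line breaks, and the sample files and the scratch folder are made up.

```bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch -t 201806100100 "1.jpg"
touch -t 201806100200 "100% done.jpg"   # a percent sign would confuse printf's format
touch -t 201806100300 "10.jpg"

start=1
ls -tr -- *.jpg | while IFS= read -r f; do
  new=$(printf '%04d - %s' "$start" "$f")   # file name passed as data
  mv -- "$f" "$new"
  start=$((start + 1))
done
ls -1
```

The folder then contains "0001 - 1.jpg", "0002 - 100% done.jpg" and "0003 - 10.jpg".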






EXAMPLE 2: YOU TOOK A SERIES OF PICTURES WHERE THE FILE NAMES NEED TO BE SORTED IN CORRECT CHRONOLOGICAL ORDER – WHILE ELIMINATING THE ORIGINAL FILE NAMES:

start=1; ls -tr *.jpg | cat -n | while read n f; do mv "$f" "`printf "%04d.jpg" $start`"; ((start++)); done


Using the same files listed in Example 1, the above command line will sort and rename the files like this:

0001.jpg
0002.jpg
0003.jpg
0004.jpg
0005.jpg
0006.jpg


CRITICAL NOTE: If you look carefully, it's not a completely simple matter of deleting the " - $f" part to get rid of the original file name. Instead, that part must be replaced with ".jpg" immediately after the "%04d" part with no spaces. Obviously, if you were renaming .png or .txt files, for example, you would have to change that part from "%04d.jpg" to "%04d.png" or "%04d.txt", etc.
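If a folder mixes several image types, the extension can be carried over automatically instead of being typed into the format. The following is a sketch only: "${f##*.}" is a bash parameter expansion that keeps the text after the last dot, the sample files are made up, and it assumes none of the new names already exist in the folder.

```bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch -t 201806100100 "beach.jpg"
touch -t 201806100200 "map.png"
touch -t 201806100300 "sunset.jpg"

start=1
ls -tr -- *.jpg *.png | while IFS= read -r f; do
  ext=${f##*.}   # text after the last dot, e.g. "jpg"
  mv -- "$f" "$(printf '%04d.%s' "$start" "$ext")"
  start=$((start + 1))
done
ls -1
```

The files end up as 0001.jpg, 0002.png and 0003.jpg, in the order they were saved.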






EXAMPLE 3: YOU TOOK A SERIES OF "HOW TO" PICTURES TO HELP A FRIEND REPLACE A CAR BATTERY – BUT YOU WANT THEM SORTED IN A PROPER CHRONOLOGICAL ORDER:

First, open the hood of the car to get access to the car battery.jpg

After you get access to the battery, disconnect the top wire.jpg

Carefully disconnect the bottom wire after you remove the top wire.jpg




Unfortunately, both File Manager AND Raspbian's internal sorting engine will sort them, by name, in perfect alphabetical order – which also happens to be the COMPLETELY WRONG order for our needs:

After you get access to the battery, disconnect the top wire.jpg

Carefully disconnect the bottom wire after you remove the top wire.jpg

First, open the hood of the car to get access to the car battery.jpg




But if you run the following command line, you'll get them in the perfect order shown below – with a very neat 2-digit prefix. This of course assumes that you took the pictures in a chronological, step-by-step order:

start=1; ls -tr *.jpg | cat -n | while read n f; do mv "$f" "`printf "%02d - $f" $start`"; ((start++)); done


01 - First, open the hood of the car to get access to the car battery.jpg

02 - After you get access to the battery, disconnect the top wire.jpg

03 - Carefully disconnect the bottom wire after you remove the top wire.jpg






EXAMPLE 4: BURN "NATURAL-STYLE" SORTING INTO YOUR FILE NAMES – SO THAT A STANDARD NAME SORT WILL AUTOMATICALLY REFLECT "NATURAL" NAME SORTING ORDER. FOR EXAMPLE, A SERIES OF FILES WITH "VERSION NUMBERS" INSIDE THEIR FILE NAMES:

Document v1.txt
Document v2.txt
Document v3.txt
Document v10.txt
Document v11.txt
Document v12.txt


By default, Raspbian's internal sort will place them in this unfriendly order:

Document v10.txt
Document v11.txt
Document v12.txt
Document v1.txt
Document v2.txt
Document v3.txt


The following command will burn the "natural sort" directly into the file names. Note that we are no longer sorting by timestamp through the use of the "-tr" option. Instead, we are now sorting by file name in a very specific way – by using the ls command's "natural" sort option. Note also that we have now changed the "*.jpg" to "*.txt" in order to selectively ignore all non-text files:

start=1; ls -v *.txt | cat -n | while read n f; do mv "$f" "`printf "%04d - $f" $start`"; ((start++)); done


The above command line will rename the files like this:

0001 - Document v1.txt
0002 - Document v2.txt
0003 - Document v3.txt
0004 - Document v10.txt
0005 - Document v11.txt
0006 - Document v12.txt






EXAMPLE 5: SORT FILES BY FILE SIZE (the largest files will appear at the top):

start=1; ls -S | cat -n | while read n f; do mv "$f" "`printf "%04d - $f" $start`"; ((start++)); done

NOTE: You can add the "r" option to reverse the file size order – in other words, you would change the "-S" to "-Sr". This will place the smallest files at the top.

NOTE: As you can see, we are no longer filtering files with *.jpg or *.txt extensions. In most cases, if you're sorting by file size, you would want to include ALL files in your sort. Every situation is different, of course – which is why my formula gives you maximum flexibility.






EXAMPLE 6: SORT FILES BY EXTENSION TYPE (an alphabetical sort that's derived from standard ls command sorting):

start=1; ls -X | cat -n | while read n f; do mv "$f" "`printf "%04d - $f" $start`"; ((start++)); done

NOTE: As you can see, we are not filtering files with *.jpg or *.txt extensions. In most cases, if you're sorting by extension type, you would want to include ALL files in your sort.






EXAMPLE 7: SORT FILES IN ACCORDANCE WITH POSIX STYLE:

start=1; LC_ALL=C ls | cat -n | while read n f; do mv "$f" "`printf "%04d - $f" $start`"; ((start++)); done

NOTE: POSIX style follows "traditional sort order" and uses "native byte values" to determine the sequence.
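To make "native byte values" concrete, this sketch prints the byte value of the first character of a few made-up file names and then sorts the names with LC_ALL=C. It assumes bash, where a printf argument that starts with a single quote is converted to the numeric value of the character that follows.

```bash
# Print the byte value of the first character of each sample name.
for name in apple.txt '~draft.txt' Banana.txt _notes.txt 1.txt; do
  printf '%3d  %s\n' "'${name:0:1}" "$name"
done

# Sort the same names by raw byte value.
byte_order=$(printf '%s\n' apple.txt '~draft.txt' Banana.txt _notes.txt 1.txt | LC_ALL=C sort)
printf '%s\n' "$byte_order"
```

The byte values are 49 for "1", 66 for "B", 95 for "_", 97 for "a" and 126 for "~", so the sorted order is 1.txt, Banana.txt, _notes.txt, apple.txt, ~draft.txt. Note that the upper-case "B" comes before the lower-case "a".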

ejolson
Posts: 1637
Joined: Tue Mar 18, 2014 11:47 am

Re: TUTORIAL: File Sorting on the Raspberry Pi

Sun Jun 10, 2018 7:04 pm

RPi_Mike wrote:
Sun Jun 10, 2018 3:39 am
INTRODUCTION: Whether you're a secretary or a scientist, the ability to sort files is about as fundamental as it gets in the world of computing.
Nice rant. Your solution to put spaces in filenames is particularly amusing. What do you think about using tabs instead of spaces?

RPi_Mike
Posts: 68
Joined: Sat Dec 09, 2017 12:57 am
Location: United States

Re: TUTORIAL: File Sorting on the Raspberry Pi

Sun Jun 10, 2018 8:52 pm

ejolson wrote:
Sun Jun 10, 2018 7:04 pm
Nice rant. Your solution to put spaces in filenames is particularly amusing.

My solution is not dependent on spaces. It's just a clean-looking, human-friendly aesthetic. If someone doesn't want the 2 spaces, they can remove them! It has no impact on the performance of my command line either way.

Simply change the "%04d - $f" part to "%04d-$f". You can also change the hyphen to an underscore.
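For anyone who wants no spaces at all, including inside the original names, here is one possible sketch. It is a variant rather than the command line from the tutorial: it assumes bash, uses the "${f// /_}" expansion to swap every space for an underscore, and runs on two made-up files in a scratch folder.

```bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch -t 201806100100 "My First Picture.jpg"
touch -t 201806100200 "My Second Picture.jpg"

start=1
ls -tr -- *.jpg | while IFS= read -r f; do
  # ${f// /_} replaces every space in the original name with an underscore.
  mv -- "$f" "$(printf '%04d_%s' "$start" "${f// /_}")"
  start=$((start + 1))
done
ls -1
```

The result is 0001_My_First_Picture.jpg and 0002_My_Second_Picture.jpg.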

scruss
Posts: 1699
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON

Re: TUTORIAL: File Sorting on the Raspberry Pi

Mon Jun 11, 2018 12:44 am

I think ejolson's ribbing you. It's Not the Unix Way¹ to put spaces in filenames. Spaces in filenames make processing them in the shell much more difficult. While you can never assume that Others (who are, naturally, Wrong) won't put spaces in filenames, you'll make the processing of files that you have control of much easier if you avoid them.

Similarly, do your best not to create filenames with !, + or ~ in them. It's technically possible to have ‘/’ in a filename, but it is immensely difficult to do and causes real problems to the shell.

You might want to look at the rename(1p) command, as it is one of the most powerful renaming tools you'll ever see.

--
¹: “the Unix Way” is a vague handwavey Right Way of Doing Things, often invoked by those who wish to appear Much More Important than they really are. Invoking the Unix Way is also a great way of avoiding proper references and explanations.
‘Remember the Golden Rule of Selling: “Do not resort to violence.”’ — McGlashan.

RPi_Mike
Posts: 68
Joined: Sat Dec 09, 2017 12:57 am
Location: United States

Re: TUTORIAL: File Sorting on the Raspberry Pi

Mon Jun 11, 2018 2:36 am

scruss wrote:
Mon Jun 11, 2018 12:44 am
I think ejolson's ribbing you. It's Not the Unix Way¹ to put spaces in filenames....... do your best not to create filenames with !, + or ~ in them.

I originally posted my tutorial in the General Discussion because it's broadly useful for people of many skill levels. Hence my "secretary and scientist" reference. For whatever reason, the RPF decided to move it to this "nerd-centric" programming forum. There's not the slightest bit of "programming" in my tutorial!

One thing's for certain – none of my tutorials are written for high-volume posters at the top 1% expertise level.

Instead, my target audience is the great bulk of users – the "tool users", not the "tool makers".

Personally, when I work with text files or music videos or pictures or MP3s, I like nice "clean" file names – clear and legible and easy-to-read. I like visual separation and can't stand it when everything is all bunched together. My attitude is that whenever possible, computers should conform to MY WISHES – not the other way around. For the overwhelming majority of my everyday files, I'm simply not concerned with whether their names are "shell friendly" or pay homage to the "UNIX way" – because they will never see the light of day in a shell. And when I do use my build of FFmpeg in the shell – to process image sequences, for example – it has no issue with spaces. But like many programs, it cares deeply about the file names having a rational sorting order.

More important, however, is the giant elephant in the room: Anyone who knows C or Python or writes shell scripts is more than qualified – absurdly well-qualified – to know how to replace a simple space with an underscore! My command line is extremely flexible and fully supports "no spaces" for those who want that.

Finally, I've never advocated the use of unorthodox characters in file names. I agree that would be foolish. My only purpose in including !, + and ~ was to provide a fully comprehensive assessment of sorting behavior across the widest range of possibilities. In many cases, people are handed a bunch of files that may not observe best practices – but they still have to deal with them. My command line handles those situations perfectly.

I realize a one-percenter like you already knows all this stuff – but I wanted to spell it out anyway for the benefit of less-savvy readers.

ejolson
Posts: 1637
Joined: Tue Mar 18, 2014 11:47 am

Re: TUTORIAL: File Sorting on the Raspberry Pi

Mon Jun 11, 2018 7:58 am

RPi_Mike wrote:
Mon Jun 11, 2018 2:36 am
My attitude is that whenever possible, computers should conform to MY WISHES – not the other way around.
I misunderstood your intention behind putting the spaces. Still, as pointed out above, it's easier to get computers to do what you want when there aren't a bunch of spaces in the filenames.

I personally like the LC_ALL=C setting you mention at the end of your post as it lists all the uppercase filenames first, which is traditionally why they were uppercase in the first place.
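
The effect of LC_ALL=C can be previewed without creating any files. This sketch uses the sort command on a few made-up names. In the C locale, names are compared byte by byte, so every uppercase letter sorts ahead of every lowercase one, and ls applies the same rule when run as LC_ALL=C ls.

```shell
# Made-up names, sorted by raw byte value rather than by the locale's rules.
printf '%s\n' apple Zebra mango README | LC_ALL=C sort
# Prints README, Zebra, apple, mango (one per line).
```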
Last edited by ejolson on Mon Jun 11, 2018 5:59 pm, edited 1 time in total.

Gavinmc42
Posts: 2032
Joined: Wed Aug 28, 2013 3:31 am

Re: TUTORIAL: File Sorting on the Raspberry Pi

Mon Jun 11, 2018 8:27 am

I have been using underscore _ for separation.
And those leading zeros are fun to play with when taking images, so I usually include the timestamp in the file name ;)
I'm dancing on Rainbows.
Raspberries are not Apples or Oranges

droleary
Posts: 149
Joined: Fri Feb 09, 2018 3:45 am
Location: Minneapolis, MN USA

Re: TUTORIAL: File Sorting on the Raspberry Pi

Mon Jun 11, 2018 2:28 pm

RPi_Mike wrote:
Mon Jun 11, 2018 2:36 am
I originally posted my tutorial in the General Discussion because it's broadly useful for people of many skill levels. Hence my "secretary and scientist" reference. For whatever reason, the RPF decided to move it to this "nerd-centric" programming forum. There's not the slightest bit of "programming" in my tutorial!
There was also only the slightest bit of RPi-specific content, so it really should have been moved to off topic. I generally don't find it useful to put tutorials in a discussion forum. I'd encourage the RPF to move more things out of here and into other areas of the web site that are more suited to "static" documentation.
One thing's for certain – none of my tutorials are written for high-volume posters at the top 1% expertise level.

Instead, my target audience is the great bulk of users – the "tool users", not the "tool makers".
The problem is that the "great bulk of users" tend to have no understanding that the "tool makers" still haven't figured out that spaces in file names are a thing most modern users expect. Or, at the very least, it's something a lot of older tools can have problems with, and there is very little that the "tool users" can do about it.

jbudd
Posts: 624
Joined: Mon Dec 16, 2013 10:23 am

Re: TUTORIAL: File Sorting on the Raspberry Pi

Mon Jun 11, 2018 3:07 pm

It's quite an achievement to present a whole tutorial about renaming files in various sort orders without mentioning the Linux sort command!

ejolson
Posts: 1637
Joined: Tue Mar 18, 2014 11:47 am

Re: TUTORIAL: File Sorting on the Raspberry Pi

Mon Jun 11, 2018 5:50 pm

droleary wrote:
Mon Jun 11, 2018 2:28 pm
The problem is that the "great bulk of users" tend to have no understanding that the "tool makers" still haven't figured out that spaces in file names are a thing most modern users expect.
Agreed. Spaces in filenames don't present much of a difficulty to someone who knows about computers.

The problem I see is that spaces steepen the learning curve for a newbie to transition from manually clicking on things with a mouse to being able to automate things with scripts. Steepen the learning curve enough and the newbies become modern users who don't realize how easy things could have been. The resulting lack of computer literacy across the board--all due to spaces in filenames--may not be an issue, depending on the socioeconomic goals of the country and the people who live there.

RPi_Mike
Posts: 68
Joined: Sat Dec 09, 2017 12:57 am
Location: United States

Re: TUTORIAL: File Sorting on the Raspberry Pi

Thu Jun 14, 2018 8:37 am

RENAMING FILES WITH TIMESTAMPS: Gavinmc42's brief comment about timestamps got me curious, so I decided to take a deep dive on extracting a file's timestamp and burning it into the file name itself!

This is not so much a file sorting technique but a file renaming technique – although, under the right conditions, it can certainly be used for file sorting purposes as well.

THE PROBLEM: If you're leisurely taking a single picture ONCE every several seconds or minutes, this is a fairly straightforward matter. But as I discovered in my testing, things get much more complicated if you simply take two pictures within the same second. This can easily happen if you hit the shutter button twice in a second – or it can happen constantly if you've placed your camera in "continuous shooting mode". Since modern cameras are able to take several pictures per second in a variety of scenarios, that means you can end up with several different pictures with the same timestamp! Not only would that cause problems if you attempted to rename the files by timestamp – since multiple files are obviously not allowed to have the same file name – but even if you somehow sidestepped that issue, it could easily cause your files to end up in non-chronological order.

Unlike Raspbian – which usually timestamps files to the nanosecond (or at least the microsecond in certain cases) – many cameras still use timestamps that only go out to the second! I'm referring to the file's timestamp, which is usually the most recent "file modification time" – not the picture's internal EXIF data. But typically, on most cameras, the EXIF timestamp ends up being the same thing. More importantly, at least on the cameras I've examined, the EXIF data's timestamp also does not go beyond a whole-number second. It's certainly possible that some camera models use higher-precision timestamps – an approach that would eliminate this problem at the source. Nonetheless, in many cases, you'll be confronted with this fundamental issue – whether you use the file's timestamp or the EXIF timestamp.

Here's a test of several images I took in continuous shooting mode on a consumer-grade Canon camera. I ran the following command to generate the files' timestamp data. As you can see from the ls command's output, all the file timestamps end in ".000000000". That's not good if several pictures were taken within the same second!

ls -tr --full-time

-rw-r--r-- 1 pi pi 2651844 2018-06-12 09:26:18.000000000 -0400 IMG_5961.jpg
-rw-r--r-- 1 pi pi 2684889 2018-06-12 09:26:18.000000000 -0400 IMG_5960.jpg
-rw-r--r-- 1 pi pi 2729342 2018-06-12 09:26:18.000000000 -0400 IMG_5959.jpg
-rw-r--r-- 1 pi pi 2782527 2018-06-12 09:26:20.000000000 -0400 IMG_5964.jpg
-rw-r--r-- 1 pi pi 2744825 2018-06-12 09:26:20.000000000 -0400 IMG_5963.jpg
-rw-r--r-- 1 pi pi 2703361 2018-06-12 09:26:20.000000000 -0400 IMG_5962.jpg


In the listing below, I've removed all the irrelevant data so that only the seconds and file names are displayed. I've also grouped them in triplets to make it easy to see the pattern. My Canon camera is clearly doing a quick burst of 3 pictures in a row – all inside the same second. It then rests for about a second – and then does another quick burst of 3 pictures. As you can see, the ls command does NOT do a good job at sorting the files! (As I explained in great detail in my tutorial, if you want a chronological sort of files, you must nonsensically use the "reverse time" option. But that, at least directly, is not the issue here – since I've already taken that into account with the "-tr" option.)

Instead, the real source of the problem is how the ls command behaves when it's confronted with 3 files in a row that have the exact same timestamp. When timestamps are identical, ls falls back on a "secondary sort" based on file name to break the tie. Unfortunately, with the "-tr" option, that secondary sort ends up sequencing the pictures in REVERSE chronological order! In other words, it goes backward in time – from image 5961 to 5960 to 5959. To make things even more confusing, the ls command only does this within each subset of identical timestamps! So it basically creates multiple messed-up patterns inside a larger messed-up pattern! The explanation is that the "-r" option reverses the ENTIRE sort – the name-based tie-break as well as the time comparison. So if you're using the reverse option to get oldest-first order – which you're forced to do in the first place because of their bizarre notion of time – you get the file names in reverse order as well, whether you want that or not!

18.000000000 IMG_5961.jpg <---3rd pic taken: ls thinks it's the 1st pic!
18.000000000 IMG_5960.jpg <---2nd pic taken: ls is accidentally correct!
18.000000000 IMG_5959.jpg <---1st pic taken: ls thinks it's the 3rd pic!

20.000000000 IMG_5964.jpg <---6th pic taken: ls thinks it's the 4th pic!
20.000000000 IMG_5963.jpg <---5th pic taken: ls is accidentally correct!
20.000000000 IMG_5962.jpg <---4th pic taken: ls thinks it's the 6th pic!
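
For listing purposes (as opposed to renaming), one way around this is to skip ls and sort explicitly: oldest first, with ascending file names inside each group of identical timestamps. The following is a minimal sketch, assuming GNU date and file names without newlines. The function name list_chronological is made up for illustration.

```shell
# list_chronological is a hypothetical helper, not part of coreutils.
# For each file it prints "<mtime as seconds.nanoseconds> <name>", sorts
# numerically on the time, breaks ties by name (ascending, byte order),
# and finally strips the time column again.
list_chronological() {
    for f in "$@"; do
        [ -e "$f" ] || continue
        printf '%s %s\n' "$(date -r "$f" +%s.%N)" "$f"
    done | LC_ALL=C sort -k1,1n -k2 | cut -d' ' -f2-
}

# Usage: list_chronological *.jpg
```

Applied to the six Canon files above, this lists IMG_5959.jpg through IMG_5964.jpg in shooting order.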


THE SOLUTION: Use the date command (with -r option) to extract the file's timestamp, burn it into the file name, and append the file's original name to the end:

for f in *.jpg; do mv -n "$f" "$(date -r "$f" +"%Y-%m-%d---%H-%M-%S---%4N")___$f"; done


The above command line generates the following file names:

2018-06-12---09-26-18---0000___IMG_5959.jpg
2018-06-12---09-26-18---0000___IMG_5960.jpg
2018-06-12---09-26-18---0000___IMG_5961.jpg
2018-06-12---09-26-20---0000___IMG_5962.jpg
2018-06-12---09-26-20---0000___IMG_5963.jpg
2018-06-12---09-26-20---0000___IMG_5964.jpg


EXPLANATION: This neatly lists the year, month, day – then hour, minute and second. At the end is the original file name. Do you see the "0000" that comes just after the seconds? That's the nanoseconds field, truncated to its first 4 digits. In this particular example, that obviously offers no sorting value – because the camera itself did not record that level of temporal precision. Fortunately, since I've included the original file name at the end of the new file name, the files end up in correct chronological order anyway. If you think about it, the camera must still internally use sequential file names – even if it's not recording sufficiently distinct timestamps for each picture. So by placing the original file name at the end, the chronological order is preserved.

But in other cases – such as imaging software that's running directly on the Raspberry – fractional seconds can be extremely useful for sorting purposes. That's especially true if the file names themselves are not "time-sort friendly". In many cases, going out to 1/100th of a second is probably sufficient, although if you're using hardware acceleration to generate small images, for example, the Raspberry is capable of approaching (or exceeding) 100 frames per second. So at a minimum, you should go out to the millisecond – 3 decimal places. But just to take it one order of magnitude further, I decided to generate the timestamps out to 1/10,000th of a second – 4 decimal places. Be aware that those are truncated values – in other words, there's no rounding at the 4th decimal place. Instead, it just cuts off all numbers to the right. You can adjust the displayed precision of the nanoseconds by changing the "%4N" to "%5N", for example – all the way up to the individual nanosecond at "%9N".
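
The truncation can be checked without any image files by handing GNU date a fixed timestamp through its -d option. The timestamp below is made up so that the 5th decimal digit is a 5: rounding would turn "1204" into "1205", truncation leaves it alone.

```shell
# A made-up timestamp with nine fractional digits (GNU date assumed).
stamp='2018-06-12 20:16:04.120456789'

date -d "$stamp" +'%S.%3N'   # milliseconds: prints 04.120
date -d "$stamp" +'%S.%4N'   # precision used in this post: prints 04.1204
date -d "$stamp" +'%S.%9N'   # full nanoseconds: prints 04.120456789
```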

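WANT THE PURE TIMESTAMP? If you only want the pure timestamp without the original file name, simply change the ___$f part to .jpg – or .png or .txt or .mp4, etc. Keep in mind that without the original file name, two files with identical timestamps would end up with the same new name – and because of the -n option, mv will not overwrite an existing file, so the second file simply keeps its old name. The command line would then look like this: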

for f in *.jpg; do mv -n "$f" "$(date -r "$f" +"%Y-%m-%d---%H-%M-%S---%4N").jpg"; done

CRITICAL NOTE: Canon actually used CAPITAL letters for the ".JPG" extension – but the command line uses lower-case ".jpg". That will throw the following error message:

date: '*.jpg': No such file or directory
mv: cannot stat '*.jpg': No such file or directory

To avoid this, simply make sure that the case in the command line matches the case in the files. For example, change .jpg to .JPG – or .png to .PNG, etc.
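
As an alternative to editing the extension by hand, the pattern itself can be made to match either case, and the loop can skip the leftover literal pattern when nothing matches. This is a sketch rather than a drop-in replacement: the wrapper name stamp_photos is made up, and GNU date and GNU mv are assumed.

```shell
# stamp_photos is a hypothetical wrapper around the command line above.
stamp_photos() {
    # [jJ][pP][gG] matches .jpg, .JPG and any mixed-case spelling.
    for f in *.[jJ][pP][gG]; do
        # If nothing matched, the shell leaves the pattern as literal text;
        # skipping it avoids the "No such file or directory" errors.
        [ -e "$f" ] || continue
        mv -n -- "$f" "$(date -r "$f" +"%Y-%m-%d---%H-%M-%S---%4N")___$f"
    done
}

# Usage (inside a folder of COPIES): stamp_photos
```

The original extension is kept exactly as the camera wrote it, since it is part of $f.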


NANOSECOND FILE SORTING TECHNIQUE: The following images were generated on my Raspberry – and then renamed using the above command line. As you can see, my nanosecond technique works perfectly when the data is available in the timestamp – it no longer says "0000". The values are automatically padded with leading zeros as well. So one millisecond, for example, would be listed as 0010 instead of "1". This is ideal for sorting purposes!

2018-06-12---20-16-04---1204___Image01.jpg
2018-06-12---20-16-04---2447___Image02.jpg
2018-06-12---20-18-10---0982___Image03.jpg
2018-06-12---20-18-10---4804___Image04.jpg
2018-06-12---20-20-05---1523___Image05.jpg
2018-06-12---20-20-05---2436___Image06.jpg


TEST FIRST: Because it's always good to "test before you regret", I highly recommend that you first conduct a no-risk test to see how everything behaves BEFORE you actually use the command line in a permanent way. So create a new folder and place a COPY of about 20 of your images inside – then run the command line and see how it behaves!

HAT TIP: The timestamp-based file renaming method is closely based on the work of John1024 at Stack Overflow. I added the nanosecond file sorting technique to differentiate between files with the same whole-number value for seconds. I also changed the formatting and added the original file name in the command line to compensate for the ls command's limitations when it confronts identical timestamps.

TWO-PASS METHOD: You could even use a "two-pass" file renaming method – by combining the timestamp renaming method with the completely different technique described in my main tutorial. The primary application would be any finicky piece of software that requires all file names to begin with a padded numeric sequence. Since the timestamp method would have already placed the file names in proper chronological order, you would simply leverage the ls command's standard file name sorting method (without any -tr). In other words, after you burned the timestamps into the file names, you would run this command from my main tutorial:

start=1; ls *.jpg | cat -n | while read n f; do mv "$f" "`printf "%04d___$f" $start`"; ((start++)); done


When applied to the above files, you get the following file names:

0001___2018-06-12---20-16-04---1204___Image01.jpg
0002___2018-06-12---20-16-04---2447___Image02.jpg
0003___2018-06-12---20-18-10---0982___Image03.jpg
0004___2018-06-12---20-18-10---4804___Image04.jpg
0005___2018-06-12---20-20-05---1523___Image05.jpg
0006___2018-06-12---20-20-05---2436___Image06.jpg


ANTI-SPACERS REJOICE: For your viewing pleasure, I used a non-space version of my command line!
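
The numbering pass can also be written without piping ls through cat -n, which makes it tolerant of awkward names (spaces, "%" signs, backslashes). This is a sketch under assumptions, not the command from the main tutorial. The name number_files is made up, and the order is whatever the shell's glob expansion produces, which in bash normally follows the same locale rules as the default ls order.

```shell
# number_files is a hypothetical helper for the second pass.
number_files() {
    start=1
    for f in "$@"; do
        [ -e "$f" ] || continue
        # The name is appended outside printf, so a "%" or "\" in the
        # file name is never interpreted as part of the format string.
        mv -n -- "$f" "$(printf '%04d' "$start")___$f"
        start=$((start + 1))
    done
}

# Usage: number_files *.jpg
```

The glob is expanded once, before the first rename, so files that have already been renamed are not picked up a second time.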

jbudd
Posts: 624
Joined: Mon Dec 16, 2013 10:23 am

Re: TUTORIAL: File Sorting on the Raspberry Pi

Thu Jun 14, 2018 10:55 am

You seem intent on making the output of the ls command match your preferred format - creation time and/or numeric rather than text order, but I think this is just a step towards your actual goal.
I don't know how you are getting the resulting files into FFmpeg. Is file globbing order relevant, and is it actually the same as default ls order?
Zsh allows you to specify file globbing order.

You mention testing your command lines with cp before using the mv version.
There is a flag available cp -p to preserve file creation, modification and access times. So cp -p $file $newfile; rm $file does the same as mv $file $newfile but without losing the file creation data. (You have to write "$file" "$newfile" if by some misfortune there are spaces in the file names.)

RPi_Mike
Posts: 68
Joined: Sat Dec 09, 2017 12:57 am
Location: United States

Re: TUTORIAL: File Sorting on the Raspberry Pi

Thu Jun 14, 2018 2:34 pm

jbudd wrote:
Thu Jun 14, 2018 10:55 am
You seem intent on making the output of the ls command match your preferred format - creation time and/or numeric rather than text order, but I think this is just a step towards your actual goal.
I don't know how you are getting the resulting files into FFmpeg. Is file globbing order relevant, and is it actually the same as default ls order?

The various techniques I've explored definitely achieve all my file sorting goals! And yes, FFmpeg's use of file globbing follows the ls command's default order. However, importing image sequences into FFmpeg is merely one interest of mine. Although FFmpeg initially triggered my curiosity on this subject, I've since found these file sorting techniques to be very useful for a wide range of other applications. That said, I happen to be the author of the gigantic FFmpeg / mpv tutorial – so if you want to see how I approach image sequences in the context of FFmpeg, check out APPENDIX 5.


You mention testing your command lines with cp before using the mv version.
There is a flag available cp -p to preserve file creation, modification and access times. So cp -p $file $newfile; rm $file does the same as mv $file $newfile but without losing the file creation data.

As for "testing [my] command lines with cp", I'm not sure how you made that inference. In fact, I never mentioned the cp command. All I said was that it was a good idea to make a COPY of some of your files in a separate folder – and then use my actual command line to run a simple test on those files to see how it behaves. I'm a big fan of "real-world" testing – so if I'm going to test a command line, I'm always going to use the actual command line! I would never consider using some other command (such as cp) as a proxy for what mv might do, for example.

Finally, my mv-based command lines have NO impact on the timestamps you mention – mtime, atime and crtime are all completely preserved. I just conducted a series of tests to confirm this. So there's no need to switch to cp with the -p flag. The only timestamp that changes is ctime – which of course is NOT creation time but "change time". In the context of my command line, that simply reflects the time the file name was changed to its new sort-friendly format. Besides, modification time (mtime) is the only timestamp I really care about – since that's what the ls command and File Manager both use for time-based sorting.
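
Anyone who wants to repeat this kind of check can compare the timestamps before and after a rename. Here is a minimal sketch assuming GNU stat. The helper name show_times and the file names in the usage line are made up.

```shell
# show_times is a hypothetical helper built on GNU stat.
# %x = access time, %y = modification time, %z = status change time (ctime)
show_times() {
    stat --format='atime=%x  mtime=%y  ctime=%z' -- "$1"
}

# Usage, on a COPY of a file:
#   show_times old.jpg; mv -n old.jpg new.jpg; show_times new.jpg
```

After a rename on the same filesystem, the atime and mtime columns should be identical in both lines, with only ctime moving forward.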

jbudd
Posts: 624
Joined: Mon Dec 16, 2013 10:23 am

Re: TUTORIAL: File Sorting on the Raspberry Pi

Thu Jun 14, 2018 4:20 pm

I'm not sure how you made that inference.
You are absolutely right. You never suggested testing with cp.
Finally, my mv-based command lines have NO impact on the timestamps you mention – mtime, atime and crtime are all completely preserved.
You are partly right. Here is an example from my Pi showing that ctime is not preserved (though I note that you mention crtime not ctime) I have edited out some irrelevant lines and coloured the output.

$ ls --full-time
-rw-r--r-- 1 pi pi 0 2018-06-14 16:45:12.637445113 +0100 01.jpg

$ stat *
File: 01.jpg
Access: 2018-06-14 16:45:12.637445113 +0100
Modify: 2018-06-14 16:45:12.637445113 +0100
Change: 2018-06-14 16:45:12.637445113 +0100

$ for f in *.jpg; do mv -n "$f" "$(date -r "$f" +"%Y-%m-%d---%H-%M-%S---%4N").jpg"; done

$ ls --full-time
-rw-r--r-- 1 pi pi 0 2018-06-14 16:45:12.637445113 +0100 2018-06-14---16-45-12---6374.jpg

$ stat *
File: 2018-06-14---16-45-12---6374.jpg
Access: 2018-06-14 16:45:12.637445113 +0100
Modify: 2018-06-14 16:45:12.637445113 +0100
Change: 2018-06-14 16:48:27.486223320 +0100


I am sure that your one liners are excellent for some people, but not for me. I apologise if you feel disrespected at all by my comments. I shan't make any more.

RPi_Mike
Posts: 68
Joined: Sat Dec 09, 2017 12:57 am
Location: United States

Re: TUTORIAL: File Sorting on the Raspberry Pi

Thu Jun 14, 2018 5:43 pm

jbudd wrote:
Thu Jun 14, 2018 4:20 pm
Finally, my mv-based command lines have NO impact on the timestamps you mention – mtime, atime and crtime are all completely preserved.
You are partly right. Here is an example from my Pi showing that ctime is not preserved (though I note that you mention crtime not ctime) I have edited out some irrelevant lines and coloured the output.

Actually, sir, I'm not partly right – I'm COMPLETELY right. Re-read my carefully-written words in the last paragraph of my above post – and then read your own words that I was responding to. Your presentation is a classic "straw man" fallacy.

Allow me to quote you: "There is a flag available cp -p to preserve file creation, modification and access times."

Those are the only 3 timestamps you mentioned – so let me repeat them for emphasis: creation, modification and access times. Never once did you say anything about change time (ctime) – which you seem to be confusing with creation time (crtime). They are totally different things!

And just so you know, you won't find crtime with the common stat command you're using. Instead, you have to run the debugfs command directly on the file's Inode number (with a few other parameters, including device name). But I don't wish to get into all that on here – because "creation time" is generally not a meaningful thing in the Linux world to begin with. But since you explicitly brought up "creation time", I tested it and confirmed that it does NOT change.

So let me repeat my previous claim – which still remains 100% true: "My mv-based command lines have NO impact on the timestamps you mention – mtime, atime and crtime are all completely preserved."

Finally, what makes your "gotcha" doubly amusing is that *I* am the one that first mentioned that ctime – which is change time, not creation time – is the ONLY timestamp that DOES change! And then you go about "proving" to me that ctime changes? With a red font and everything? I suppose it's nice that you confirmed my own statement on ctime to be true. And for that I thank you. Unfortunately, none of this is relevant because you never mentioned that inconsequential timestamp in the first place!

scruss
Posts: 1699
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON

Re: TUTORIAL: File Sorting on the Raspberry Pi

Fri Jun 15, 2018 2:23 am

RPi_Mike wrote:
Thu Jun 14, 2018 8:37 am
RENAMING FILES WITH TIMESTAMPS:
… or use a tool that understands camera metadata, such as ExifTool (in the libimage-exiftool-perl package) and jhead.

exiftool is ridiculously complete, and sometimes a cheatsheet is useful. It can rename on millisecond timestamps based on several metrics. Here's an example (nicked from the cheatsheet) to get started with:

exiftool -v '-Filename<${datetimeoriginal}${subsectimeoriginal;$_.=0 x(3-length)}.%e' -d %Y%m%d_%H%M%S .

jhead -ft will set the file's modification time to the file's DateTimeOriginal value. For me, this helps with sorting.

RPi_Mike
Posts: 68
Joined: Sat Dec 09, 2017 12:57 am
Location: United States

Re: TUTORIAL: File Sorting on the Raspberry Pi

Sat Jun 16, 2018 3:12 am

scruss wrote:
Fri Jun 15, 2018 2:23 am
exiftool is ridiculously complete, and sometimes a cheatsheet is useful. It can rename on millisecond timestamps based on several metrics.

It almost feels like we're discussing different things. There are only two core scenarios when it comes to sorting images by time:

1: The imaging device recorded fractional seconds data – one or more of the file's timestamps AND/OR the EXIF data contain milliseconds, microseconds, etc.

2: The imaging device did NOT record fractional seconds data – there is no fractional seconds data in any form (file or EXIF). Only whole-number seconds are available.

That's about it – not much more to explain. If critical time data is missing, it's missing. I don't mean to come across as Captain Obvious, but there's no way to resurrect something that never existed in the first place.

Of course, if no two images were acquired within the same second, item 2 is a non-issue.

But my Canon PowerShot, for example, only records whole-number seconds – in both the file's timestamps and the EXIF data. This was definitely an issue when I used it in continuous shooting mode – although as I mentioned, at least the file names were in a proper sequential order.

It was just an observation about something I've personally encountered.

The real value of my "nanosecond file sorting technique" is when you DO have the fractional time data but the file names, for one reason or another, are either not in proper numerical sequence – or they use a naming scheme that's not "sequence compliant" with either the software you're using or Raspbian's internal "sorting engine". Fortunately, most imaging applications on the Raspberry generate timestamps to at least the microsecond. And several cameras out there – especially the newer models – generate fractional time data as well.

File sorting is very much a case-by-case situation. Some will find my various sorting techniques extremely useful – while others will never need them!

scruss
Posts: 1699
Joined: Sat Jun 09, 2012 12:25 pm
Location: Toronto, ON

Re: TUTORIAL: File Sorting on the Raspberry Pi

Sat Jun 16, 2018 6:03 pm

Maybe we are. I was suggesting mature and tested purpose-built tools for managing image naming and sorting based on metadata as an alternative to hand-rolling chains of commands in the shell. exiftool can construct a file name from the image sequence number if there are images shot within the same second. It can't reconstruct sub-second data if it's not there in the image; nothing can, as you said.

File sorting is an immensely personal thing. Most times, all I really care about is the age of the file, so the output of ls -lNrt is pretty much perfect for me. I might care about dictionary sorting (it used to be my job at Collins) but for large sequences I already stick with lower-case and zero-padded sequence numbers, so it doesn't matter to me. Working across multiple systems (especially NAS with their own loose idea of time stamps) being able to reconstruct an image's mtime from file metadata using jhead is useful to me. It may not be for you. There's more than one way to do it.

RPi_Mike
Posts: 68
Joined: Sat Dec 09, 2017 12:57 am
Location: United States

Re: TUTORIAL: File Sorting on the Raspberry Pi

Sat Jun 16, 2018 7:58 pm

scruss wrote:
Sat Jun 16, 2018 6:03 pm
File sorting is an immensely personal thing.

I agree – file sorting is certainly a very personal thing.

ExifTool is quite robust and you did a good job extolling its merits.

Like most tasks, it's all about having "the right tool for the job".

When dealing specifically with image-related sorting, ExifTool could certainly be part or all of the solution.

When dealing with the broader topic of sorting all files – images or otherwise – some of my "universal" techniques could also be quite useful.

Interested readers now have several options!

ejolson
Posts: 1637
Joined: Tue Mar 18, 2014 11:47 am

Re: TUTORIAL: File Sorting on the Raspberry Pi

Sat Jun 16, 2018 8:47 pm

RPi_Mike wrote:
Sat Jun 16, 2018 7:58 pm
When dealing with the broader topic of sorting all files – images or otherwise – some of my "universal" techniques could also be quite useful.
As has been mentioned, the Unix way to solve a problem is by combining several general purpose tools that each do one thing well. In my opinion (except for the spaces) the original post illustrates the Unix way well.

For me there is a difference between sorting files and renaming them. I understand that the files are being renamed according to a sort order, but my personal preference is to leave the names of the files exactly as the camera named them. Then create a production and editing work flow that doesn't rename the files. Following the pattern already outlined, one way to do this would be to move the original files into a source subdirectory and then create a set of symbolic links with prefixed sequence numbers that link into that directory. There are likely other ways to accomplish the same thing without changing or renaming the original files.
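
The symbolic-link idea could look roughly like the sketch below. The function name link_in_sequence and the directory name source are made up, and the links are numbered in plain glob (name) order, which for camera files is also shooting order.

```shell
# link_in_sequence is a hypothetical helper. It leaves the originals in
# the given directory untouched and creates numbered links to them in
# the current directory.
link_in_sequence() {
    src=$1
    n=1
    for f in "$src"/*; do
        [ -e "$f" ] || continue
        # Creates e.g. 0001___IMG_5959.jpg -> source/IMG_5959.jpg
        ln -s -- "$f" "$(printf '%04d' "$n")___${f##*/}"
        n=$((n + 1))
    done
}

# Usage: mkdir source && mv -- *.jpg source/ && link_in_sequence source
```

Because the links are relative, the directory name should be given relative to the current directory, and the links only resolve while they stay next to it.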
