This post will probably be boring for you; it is mostly a reminder to myself, written in the form of a blog post.
So, I have a directory structure: /some/path/imported/DATE/TIME/file, where DATE is the date of the import, in YYYY-MM-DD format, and TIME is the time of the import, in HHMMSS format.
So, example paths look like this:
./2009-02-26/143251/5a6d001b94e47960fe41a262f70ed96a
./2009-02-26/143321/8e45f68421dad6129914fe068dfa5748
./2009-02-26/143407/aa04aa9c1e8f87b25fef98bd9a64e94d
./2009-02-26/143415/65180d1328e21959229e47b9288b6996
./2009-02-27/083542/5a6d001b94e47960fe41a262f70ed96a
./2009-02-27/084906/aa04aa9c1e8f87b25fef98bd9a64e94d
./2009-02-27/084926/65180d1328e21959229e47b9288b6996
./2009-02-27/155648/65180d1328e21959229e47b9288b6996
As you can see, some of the files were imported many times.
Now, I need to find the latest import of each file.
So, I need a way to convert the above list into:
./2009-02-26/143321/8e45f68421dad6129914fe068dfa5748
./2009-02-27/083542/5a6d001b94e47960fe41a262f70ed96a
./2009-02-27/084906/aa04aa9c1e8f87b25fef98bd9a64e94d
./2009-02-27/155648/65180d1328e21959229e47b9288b6996
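To have something to experiment on, the example layout can be recreated in a scratch directory. This is only a sketch: the use of mktemp, the variable name root, and the empty placeholder files are assumptions made for illustration, not part of the real import directory.

```shell
# Recreate the example DATE/TIME/file layout in a temporary directory.
# The files are empty placeholders; only their paths matter here.
root=$(mktemp -d)

while read -r path; do
    mkdir -p "$root/$(dirname "$path")"   # create DATE/TIME
    : > "$root/$path"                     # create an empty file
done <<'EOF'
2009-02-26/143251/5a6d001b94e47960fe41a262f70ed96a
2009-02-26/143321/8e45f68421dad6129914fe068dfa5748
2009-02-26/143407/aa04aa9c1e8f87b25fef98bd9a64e94d
2009-02-26/143415/65180d1328e21959229e47b9288b6996
2009-02-27/083542/5a6d001b94e47960fe41a262f70ed96a
2009-02-27/084906/aa04aa9c1e8f87b25fef98bd9a64e94d
2009-02-27/084926/65180d1328e21959229e47b9288b6996
2009-02-27/155648/65180d1328e21959229e47b9288b6996
EOF

# List what was created, in the same ./DATE/TIME/file form as above.
( cd "$root" && find . -mindepth 3 -maxdepth 3 | sort )
```

After running it, cd "$root" puts you in a directory where find prints exactly the eight example paths.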
Of course – with 10 imports, it's simple. But what if I had 10000 of them?
Luckily, it is rather simple:
find . -mindepth 3 -maxdepth 3 -exec basename {} \; | \
    sort -u | \
    while read DIR; \
    do \
        find . -name "$DIR" | \
        sort | \
        tail -n 1; \
    done
Of course, I originally typed it as a one-liner 🙂
While writing the post I realized I could do better:
find . -mindepth 3 -maxdepth 3 | \
    sort -r -t/ -k4,4 -k2,2 | \
    awk -F/ 'BEGIN{prev="/"} ($4!=prev) {print $0; prev=$4}'
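To see why this works, the sort and awk stages can be tried on the example list without touching the disk. With / as the separator, field 2 is DATE, field 3 is TIME and field 4 is the file name, so the reverse sort groups equal names together with the newest import first, and awk keeps only the first line of each group. The following is a sketch under a few assumptions: the function name latest_imports is made up, the here-document stands in for the find output, LC_ALL=C is added to make the ordering predictable, and -k3,3 is added to make the TIME ordering explicit (the original appears to get it from sort's fallback comparison of whole lines).

```shell
# Keep only the newest import of each file.
# Reads ./DATE/TIME/NAME paths on stdin, one per line.
latest_imports() {
    # Fields split on "/": 1 is ".", 2 is DATE, 3 is TIME, 4 is NAME.
    # Reverse sort: equal NAMEs end up adjacent, newest DATE/TIME first.
    LC_ALL=C sort -r -t/ -k4,4 -k2,2 -k3,3 |
        # Print a line only when NAME differs from the previous line's NAME.
        awk -F/ 'BEGIN{prev="/"} ($4!=prev) {print $0; prev=$4}'
}

# The example paths from the post, in place of the find output.
result=$(latest_imports <<'EOF'
./2009-02-26/143251/5a6d001b94e47960fe41a262f70ed96a
./2009-02-26/143321/8e45f68421dad6129914fe068dfa5748
./2009-02-26/143407/aa04aa9c1e8f87b25fef98bd9a64e94d
./2009-02-26/143415/65180d1328e21959229e47b9288b6996
./2009-02-27/083542/5a6d001b94e47960fe41a262f70ed96a
./2009-02-27/084906/aa04aa9c1e8f87b25fef98bd9a64e94d
./2009-02-27/084926/65180d1328e21959229e47b9288b6996
./2009-02-27/155648/65180d1328e21959229e47b9288b6996
EOF
)

printf '%s\n' "$result"
```

The result is the four wanted paths, ordered by file name in reverse rather than by date; piping it through one more sort gives the date order shown earlier.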
Well. I understand the code and what it does, but that doesn't change the fact that I'm not really a fan of shell programming.