New Year's meme: What are the oldest files in your home directory?

Celebrate the new year with a blog post discussing the oldest files that are still sitting somewhere beneath your home directory! The procedure is simple:

  1. Run the following script in your home directory. (You might want to use less to read the output.)
  2. Ignore files whose date does not reflect your own activity.
  3. List the oldest files in a blog post and discuss!
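#!/usr/bin/env python
"""Print last-modified times of files beneath '.', oldest first."""
import os, os.path, time
# Every file at any depth beneath the current directory.
paths = ( os.path.join(b,f) for (b,ds,fs) in os.walk('.') for f in fs )
# lstat() reads a symlink's own time rather than following it to its target.
for mtime, path in sorted( (os.lstat(p).st_mtime, p) for p in paths ):
    # Formatting a single string keeps this line valid on both Python 2 and 3.
    print("%s %s" % (time.strftime("%Y-%m-%d", time.localtime(mtime)), path))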

Only include files whose last-modified time is a date on which you really touched the file. The file's time should neither result from an error (a few files beneath my own home directory have an incorrect date of 1970-01-01), nor from unpacking someone else's archive that has old files inside of it. For example, I myself have excluded the following pair of nearly 17-year-old files because their dates reflect their age inside of the Python 3.0 source archive, instead of the actual moment last month when they became part of my home directory:

1992-03-02 ./src/Python-3.0/Demo/scripts/wh.py
1992-03-02 ./src/Python-3.0/Tools/scripts/dutree.doc

But there is no requirement that the actual content of each file you list be your own. Whether you wrote the file yourself long ago, or downloaded it from some ancient and forgotten FTP site, you have a story to share!
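The exclusion rules above can be partly automated. What follows is only a minimal sketch, not part of the original script: it assumes that any file dated within a day of 1970-01-01 is an error, and that unpacked archives live under directories you list yourself. The names `SKIP_DIRS`, `EARLIEST_PLAUSIBLE`, and `oldest_files` are hypothetical, and `./src` is just an example borrowed from the listing above.

```python
#!/usr/bin/env python
"""List files oldest first, skipping bogus dates and unpacked archives."""
import os
import time

# Hypothetical exclusions: directories that hold other people's archives.
SKIP_DIRS = {"./src"}
# Assumption: anything within a day of the epoch is a clock or copy error.
EARLIEST_PLAUSIBLE = 24 * 60 * 60


def oldest_files(top=".", skip_dirs=SKIP_DIRS, earliest=EARLIEST_PLAUSIBLE):
    """Return (mtime, path) pairs beneath top, oldest first."""
    found = []
    for base, dirs, files in os.walk(top):
        # Prune excluded directories in place so os.walk never descends into them.
        dirs[:] = [d for d in dirs if os.path.join(base, d) not in skip_dirs]
        for name in files:
            path = os.path.join(base, name)
            mtime = os.lstat(path).st_mtime
            if mtime >= earliest:
                found.append((mtime, path))
    return sorted(found)


if __name__ == "__main__":
    for mtime, path in oldest_files():
        print("%s %s" % (time.strftime("%Y-%m-%d", time.localtime(mtime)), path))
```

Entries in `SKIP_DIRS` have to be spelled the way `os.walk` builds them (for example `./src` on a Unix-like system). The filter cannot know which dates reflect your own activity, so the output still needs a human eye.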

Within the rules given above, here are the oldest files beneath my own home directory:

# My oldest five files!
# (The links return their content.)

1989-05-17 ./archive/unixpc/cee/crobots/BCRMAD.R
1989-05-17 ./archive/unixpc/cee/crobots/BCRONE.R
1990-01-18 ./archive/unixpc/ref/train.gz
1990-01-18 ./archive/unixpc/ref/xmas.gz
1990-05-18 ./archive/unixpc/save/Rhodes/treasure

You can see that these files were moved, long ago, into an archive directory for files that are no longer part of an active project. These files all date from the era when my home directory was hosted on the Unix PC which my father, a Bell Laboratories engineer, brought home in the 1980s as our personal computer.

We should start with the gzipped files from January 1990, since they actually contain even older content, from December 1987, more than twenty years ago! My father received them as Christmas greetings from fellow engineers at the Labs. You can view xmas right in your browser, since it is a simple ASCII-art holiday greeting (the name “Merrimack Valley” at the bottom refers to the particular Bell Labs location at which my father worked). Viewing train is more difficult, since it contains a VT-100 animation that will display much too quickly if you dump the file to a modern terminal. Instead, download the file and use this Python program to display it at a more traditional speed; you might want to light a candle and play some Christmas music while the animation is displaying:

#!/usr/bin/env python
"""Display the 'train' file, slowly."""
import sys, time
for c in open('train').read():
    sys.stdout.write(c)
    sys.stdout.flush()
    time.sleep(1.0/9600.0) # (drat, really sleeps 0.01s)
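
As the comment in the script admits, a per-character sleep cannot really be that short. One possible workaround, sketched here under assumptions, is to write a small burst of characters per tick instead. The sketch assumes a tick of about 0.01s and a target of roughly 960 characters per second (9600 bits per second at ten bits per character); `paced_chunks` and `display` are hypothetical names.

```python
#!/usr/bin/env python
"""Display a file at roughly terminal speed by writing small bursts."""
import sys
import time


def paced_chunks(data, chars_per_second=960, tick=0.01):
    """Split data into bursts sized so that one burst per tick hits the rate."""
    size = max(1, int(chars_per_second * tick))
    return [data[i:i + size] for i in range(0, len(data), size)]


def display(path, chars_per_second=960, tick=0.01):
    """Write the file to stdout one burst at a time, sleeping between bursts."""
    with open(path, "rb") as f:
        data = f.read()
    # On Python 3, use the byte stream so escape sequences pass through untouched.
    out = getattr(sys.stdout, "buffer", sys.stdout)
    for chunk in paced_chunks(data, chars_per_second, tick):
        out.write(chunk)
        out.flush()
        time.sleep(tick)


# Only runs when a file name is given, e.g.: python slow.py train
if __name__ == "__main__" and len(sys.argv) > 1:
    display(sys.argv[1])
```

The real pace still depends on how long `time.sleep` actually pauses on your system, so treat the rate as approximate.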

Though Dad probably retained the files only through the Christmas season, I was fascinated by watching them scroll silently across the black screen in glowing, green, phosphoric characters that seemed to leave trails behind them like shooting stars, and so I kept the files long after many subsequent Christmases had come and gone.

The other three files are all plain-text files, and are all of my own making. The BCRMAD.R and BCRONE.R programs were experiments in the old crobots game, where you wrote small C-language programs to control robots that drove around on the screen and shot at each other. The names seem to refer to the fact that the first file simulates a “mad bull” that drives straight at targets shooting as fast as it can, while the second program always takes “one shot” at a target then flees to another location on the screen. This would have been one of my very first forays into C programming — perhaps my very first — and so the algorithms are, I must admit, not exactly pinnacles of sophistication.

The last file is named treasure, and it is, indeed, a real treasure of a file to have discovered this New Year's morning! It is my long-lost account, carefully annotated with the venerable mm macro package (as you can see from the troff include directive on the first line), of the day in early 1990 when I discovered buried treasure that my father had hidden as a child on my grandparents' property! I actually saw the “treasure” last week (an ancient coffee can with old toys inside) sitting on a shelf when I visited my grandmother on Christmas Day. I will have to pull it down from the shelf and put some pictures on Flickr, along with excerpts from this long-lost story of its discovery!

In the meantime: what are the oldest files under your home directory?

Posted: Thursday, January 1st, 2009 at 2:27 pm
Categories: Computing, Python, Web Notes


  • rm

    Just use this instead of your Python script:

    $ ls -altr | head

  • Brandon Craig Rhodes

    @rm: The ls -R command, unfortunately, groups its output by directory — first showing all the files that are in ./archive/, then separately listing the files in ./archive/unixpc/, and so forth. Which means that the -t option only sorts files within each individual directory listing; it does not move the oldest file of them all up to the top of the whole ls output.

    So, while ls -R is certainly helpful in other cases, it does not answer the question posed here: what are the oldest files anywhere, at any depth, beneath your home directory?

  • Doug Hellmann

    Thanks for the trip down memory lane, Brandon! I’ve posted the names of some of my old files on my blog.

  • Keegan Carruthers-Smith

    Ugly hack for oldest files under your home directory

    $ find -type f -exec ls -altr '{}' \; | awk '{ print $6 " " $0 }' | sort | head

  • Brandon Craig Rhodes

    @Doug: wow, Pascal programs!

    @Keegan: Your shell script does, indeed, return the result I am looking for. Note, however, that it is vastly slower than the Python script I suggested! Here on my laptop, my Python script runs in 2.9s while your script takes 167.4s — about sixty times slower. But I think that with a few tweaks we can improve it.

    First, since you are only passing regular files to ls, you can remove the options -a and -r which only apply to directories. And since you are letting the final sort put the files in the correct order, we can omit -t since the files are going to be reordered anyway. But these changes are just cosmetic simplifications.

    The real reason for the slow speed is that you are invoking ls separately for every file anywhere beneath the home directory! This amounts to more than 100,000 separate process invocations here on my laptop, and I imagine that a more active developer would have even more files than I do. To improve the situation, we should use xargs to pass as many files to each invocation of ls as it can handle, which will minimize the number of times a process fork must occur.

    This gives us the shell script:

    find -type f -print0 | xargs -0 ls -l | awk '{print$6,$0}' | sort | head
    

    which finishes, on my laptop, in 2.3s — even more quickly than my Python script!

    But I still prefer the Python script, both for its portability — I think that it will probably run on Windows (has anyone tried?) — and, of course, for the fact that the Python script made this post suitable for the Python planets. :-)
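
The batching idea behind xargs can be sketched in Python as well. This is an illustration only, not how xargs itself is implemented: it groups file names so that each command invocation receives as many as fit under a length budget. The names `batch_arguments` and `run_in_batches` are hypothetical, and the default budget of 100,000 characters is an arbitrary assumption, since real argument-length limits vary between systems.

```python
"""Group file names into batches, roughly the way xargs does."""
import subprocess


def batch_arguments(paths, max_chars=100000):
    """Yield lists of paths whose combined length stays within max_chars."""
    batch, used = [], 0
    for path in paths:
        cost = len(path) + 1  # one extra for the separator between arguments
        if batch and used + cost > max_chars:
            yield batch
            batch, used = [], 0
        batch.append(path)
        used += cost
    if batch:
        yield batch


def run_in_batches(command, paths, max_chars=100000):
    """Run command once per batch and return the number of invocations."""
    count = 0
    for batch in batch_arguments(paths, max_chars):
        subprocess.call(list(command) + batch)
        count += 1
    return count
```

On a Unix-like system, something like `run_in_batches(["ls", "-l"], paths)` would then start a handful of processes rather than one per file.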

  • Oldest file (of my own) in my home directory

    [...] An interesting enough question/meme for me to jump on (via David Hancock, via Brandon Rhodes) [...]

  • Aaron West

    Thanks for posting the Python code, Craig. I was having some luck with find and ls, but your script did the trick in the end. I have a blog post going live on Monday morning about this at http://www.trajiklyhip.com/blog

    Cheers!

  • Robert Lehmann

    You don’t have to use ls and parse its output in order to obtain the mtime of a file. Instead one can use the way more lightweight stat utility (directly from find without re-piping with xargs, times are in seconds since epoch then, though):

    find -type f -exec stat -c "%Y %n" {} '+' | sort -n

    Actually invoking other utilities than find is a waste of resources — find is perfectly capable of performing all those tasks itself:

    find . -type f -printf "%T+ %p\n" | sort

  • Martin Vilcans

    Interesting idea. I did the same. Thanks for the nostalgia trip.

  • Brandon Craig Rhodes

    @Robert: Wow, I didn’t know find could do that! Your find that calls stat runs terribly slowly on my laptop (again because, like the other solutions that invoke one command per file, it ends up running more than 100,000 commands), but the second one that uses find’s innate printing ability ran faster than any of the other solutions here! I’m glad to know find can accept formatting codes.

    If only it worked for Windows users too. :-)

  • Personal history in ~ · DragonFly BSD Digest

    [...] and figure out where they came from.  The page I linked to uses a Linux-specific search, but some other pages have a scripted way to do it that should work on DragonFly. Categories Goings-on [...]

  • Jason R. Coombs

    I like the Python script because it runs on my Windows box. I had to pass u'.' to os.walk because I’m using Python 2.5 and have files with unicode characters that will crash the script unless the results are retrieved as unicode.
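
A minimal sketch of that change, for illustration: the listing below passes a unicode starting directory to os.walk, as described in the comment. The `safe_text` helper is an addition that is not part of the original script; it escapes characters the terminal's encoding cannot represent, so that printing does not fail. Both function names are hypothetical. On Python 2, names that cannot be decoded at all may still come back as byte strings, which this sketch does not handle.

```python
#!/usr/bin/env python
"""Oldest-first listing that asks os.walk for unicode names."""
import os
import sys
import time


def safe_text(path, encoding=None):
    """Escape any characters that the output encoding cannot represent."""
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return path.encode(encoding, "backslashreplace").decode(encoding)


def oldest_files(top=u"."):
    """Return (mtime, path) pairs beneath top, oldest first."""
    paths = (os.path.join(b, f) for b, ds, fs in os.walk(top) for f in fs)
    return sorted((os.lstat(p).st_mtime, p) for p in paths)


if __name__ == "__main__":
    for mtime, path in oldest_files():
        stamp = time.strftime("%Y-%m-%d", time.localtime(mtime))
        print("%s %s" % (stamp, safe_text(path)))
```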

  • Jason R. Coombs

    Okay. So I debugged the unicode issue, and ran into another issue. Some files I have report mtimes from before 1970:

    $ python -c "import datetime; print(datetime.datetime(1970,1,1)-datetime.timedelta(seconds=1195402711))"
    1932-02-14 07:41:29

    Turns out I got those files from a CD burned of my MRI, so they were probably created by custom software that set that value arbitrarily or used it for something else.
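
For anyone hitting the same problem, here is a small sketch of converting such a timestamp without going through time.localtime, which may reject negative values on some platforms. It assumes the raw mtime was the negative number -1195402711, which the comment implies but does not show directly; `mtime_to_datetime` is a hypothetical helper name.

```python
"""Convert possibly negative mtimes into dates using plain arithmetic."""
import datetime

EPOCH = datetime.datetime(1970, 1, 1)


def mtime_to_datetime(mtime):
    """Return seconds since the epoch (negative allowed) as a naive UTC datetime."""
    return EPOCH + datetime.timedelta(seconds=mtime)


# Assumed value, inferred from the command in the comment above.
print(mtime_to_datetime(-1195402711))  # prints 1932-02-14 07:41:29
```

The result is in UTC rather than local time, which is usually close enough for spotting dates that cannot be right.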

    So after filtering out all the garbage, the first hit is an old game called Cytroxia by Caffeine Software. Then the game “HitchHiker’s Guide to the Galaxy,” a Zork-like adventure game. Then JUMPMAN (I was pretty into games, I guess). Some wav files, then some scraps from a slambook I put together in 1993 in Aldus Pagemaker 4.

    I also tracked down a handy Windows Powershell command, since we don’t have ‘find’ without cygwin:

    dir -rec . | sort -property lastwritetime | select -first 5
