This website is assembled with JavaScript, mainly because that's the easiest way to use the excellent KaTeX. I'm using Metalsmith to write the blog, which has been very pleasant, but every so often the essential madness of JS shines through.
For testing, I imported a few posts from an old dead blog.
To my surprise, the posts show up out of order, but only once every 20 or so times I rebuild the blog. I get enough nondeterministic threading bugs in the day job, and I thought that writing a blog would be a more civilised affair. I start looking around for something to blame.
Metalsmith uses a package called recursive-readdir to list the files containing posts. The documentation for recursive-readdir warns that the order of files inside directories is "not guaranteed", and sure enough it returns entries in nondeterministic order. (Node internally runs calls to fs.stat on background threads, so the callbacks get run in random order.)
But I sort the list of files that readdir returns:
.use(collections({
  posts: {
    pattern: '*/index.md',
    sortBy: 'date',
    reverse: 'true'
  }
}))
metalsmith-collections sorts the posts by date, in reverse order. Surely sorting will put the posts in a particular order, no matter what order they arrive in. Isn't that what sorting is? But no, not in JavaScript.
The date field comes from the "frontmatter", a chunk of YAML-formatted metadata at the top of a post's source file:
---
title: "The incomparable JavaScript"
date: 2017-03-05
---
This website is assembled with JavaScript, ...
The old posts I'd imported were written using Octopress, which writes its dates like so:
---
title: "Some old nonsense"
date: 2012-12-30 13:12
---
The formats are different, but this doesn't seem like it should cause a problem: both of them are valid ISO-8601 dates. If you sort them as strings, you get the same answer as if you sort them as dates. Even if you sort them by some bizarre ordering that puts "2012-12-30 13:12" before "2017-03-05", it should at least be deterministic. So what's going on?
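As a quick sanity check of that claim, here is a minimal sketch that sorts the two date values from the frontmatter samples both ways. The helper toMillis is hypothetical and exists only for this illustration; it swaps the space for a "T" so that Date.parse sees a format the language specification defines.

```javascript
// The two date values from the frontmatter examples.
const stamps = ['2017-03-05', '2012-12-30 13:12'];

// Hypothetical helper: turn either format into milliseconds since the epoch.
// Replacing the space with "T" gives a format that Date.parse is specified to accept.
function toMillis(stamp) {
  return Date.parse(stamp.replace(' ', 'T'));
}

// Sort copies of the list, once as plain strings and once as dates.
const sortedAsStrings = [...stamps].sort();
const sortedAsDates = [...stamps].sort((a, b) => toMillis(a) - toMillis(b));

console.log(sortedAsStrings); // [ '2012-12-30 13:12', '2017-03-05' ]
console.log(sortedAsDates);   // [ '2012-12-30 13:12', '2017-03-05' ]
```

Either way the 2012 post sorts first, so the mix of formats alone does not explain the shuffling.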
Well, to parse the frontmatter, Metalsmith calls gray-matter, which calls js-yaml, which makes a brave effort to parse dates and times using these regexes:
var YAML_DATE_REGEXP = new RegExp(
'^([0-9][0-9][0-9][0-9])' + // [1] year
'-([0-9][0-9])' + // [2] month
'-([0-9][0-9])$'); // [3] day
var YAML_TIMESTAMP_REGEXP = new RegExp(
'^([0-9][0-9][0-9][0-9])' + // [1] year
'-([0-9][0-9]?)' + // [2] month
'-([0-9][0-9]?)' + // [3] day
'(?:[Tt]|[ \\t]+)' + // ...
'([0-9][0-9]?)' + // [4] hour
':([0-9][0-9])' + // [5] minute
':([0-9][0-9])' + // [6] second
'(?:\\.([0-9]*))?' + // [7] fraction
'(?:[ \\t]*(Z|([-+])([0-9][0-9]?)' + // [8] tz [9] tz_sign [10] tz_hour
'(?::([0-9][0-9]))?))?$'); // [11] tz_minute
This matches 2017-03-05, but not 2012-12-30 13:12: it only matches times if they include a seconds field. When js-yaml finds a match, it turns it into a JS Date; otherwise, it leaves it as a string. So, I end up with two posts with these dates:
date: new Date(2017, 2, 5) // JS months are zero-based, so 2 is March
date: "2012-12-30 13:12"
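The matching behaviour is easy to check by hand. The sketch below copies the two patterns quoted above and runs them against both date formats; the helper looksLikeYamlTimestamp is a made-up name for this illustration, not a js-yaml function.

```javascript
// The two patterns quoted above, copied as-is.
const DATE_RE = new RegExp(
  '^([0-9][0-9][0-9][0-9])' +           // [1] year
  '-([0-9][0-9])' +                     // [2] month
  '-([0-9][0-9])$');                    // [3] day
const TIMESTAMP_RE = new RegExp(
  '^([0-9][0-9][0-9][0-9])' +           // [1] year
  '-([0-9][0-9]?)' +                    // [2] month
  '-([0-9][0-9]?)' +                    // [3] day
  '(?:[Tt]|[ \\t]+)' +                  // ...
  '([0-9][0-9]?)' +                     // [4] hour
  ':([0-9][0-9])' +                     // [5] minute
  ':([0-9][0-9])' +                     // [6] second
  '(?:\\.([0-9]*))?' +                  // [7] fraction
  '(?:[ \\t]*(Z|([-+])([0-9][0-9]?)' +  // [8] tz [9] tz_sign [10] tz_hour
  '(?::([0-9][0-9]))?))?$');            // [11] tz_minute

// Hypothetical helper: does either pattern accept this value?
function looksLikeYamlTimestamp(value) {
  return DATE_RE.test(value) || TIMESTAMP_RE.test(value);
}

console.log(looksLikeYamlTimestamp('2017-03-05'));          // true
console.log(looksLikeYamlTimestamp('2012-12-30 13:12'));    // false: no seconds field
console.log(looksLikeYamlTimestamp('2012-12-30 13:12:00')); // true
```

Adding ":00" to the Octopress-style date is enough to make it match, which suggests one way out: give the old posts a seconds field.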
So what happens if you sort a list containing dates and strings in JavaScript? Well, at some point, you're going to compare a string and a Date, and JS doesn't know how to do that. Its response is to convert both to numbers, presumably on the basis that it does know how to compare those, and it wishes you'd just asked it a simple question like that rather than something so complicated.
Converting the Date turns it into the number of milliseconds since the epoch, which is vaguely reasonable, but converting the string yields NaN ("it's Not a Number", says JavaScript, beaming proudly, before being gently pulled away from the keyboard by embarrassed parents). We get:
date: 1488672000000
date: NaN
The check 1488672000000 < NaN returns false. The check NaN < 1488672000000 also returns false. When we sort a list containing these two, the order that things come out in depends on the order they went in.
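Here is a small sketch that reproduces the effect outside Metalsmith. The comparator byDate is an assumption, written in the spirit of a typical "sort by key" function rather than copied from metalsmith-collections, and the two post objects are stand-ins for the real file metadata.

```javascript
// One date that was parsed into a Date, one that stayed a string.
const parsedDate = new Date('2017-03-05');
const unparsedDate = '2012-12-30 13:12';

console.log(Number.isNaN(Number(unparsedDate))); // true: the string converts to NaN
console.log(parsedDate < unparsedDate);          // false
console.log(unparsedDate < parsedDate);          // false

// Hypothetical comparator: neither branch fires for a Date and a
// non-numeric string, so the pair is reported as "equal".
function byDate(a, b) {
  if (a.date < b.date) return -1;
  if (a.date > b.date) return 1;
  return 0;
}

const newPost = { title: 'The incomparable JavaScript', date: parsedDate };
const oldPost = { title: 'Some old nonsense', date: unparsedDate };

// Same two posts, fed to the sort in the two possible arrival orders.
const newFirst = [newPost, oldPost].sort(byDate).map((p) => p.title);
const oldFirst = [oldPost, newPost].sort(byDate).map((p) => p.title);

console.log(newFirst); // [ 'The incomparable JavaScript', 'Some old nonsense' ]
console.log(oldFirst); // [ 'Some old nonsense', 'The incomparable JavaScript' ]
```

The sort leaves each list exactly as it arrived, so whatever order the file listing happens to produce is the order the posts end up in.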