arXiv, Our Printing Press

Johannes Gutenberg, inventor of the printing press, and possibly the only photogenic thing on the Mainz campus

I’ve had a few occasions to dig into older papers recently, and I’ve noticed a trend: old papers are hard to read!

Ok, that might not be surprising. The older a paper is, the greater the chance it will use obsolete notation, or assume a context that has long passed by. Older papers have different assumptions about what matters, or what rigor requires, and their readers cared about different things. All this is to be expected: a slow, gradual approach to a modern style and understanding.

I’ve been noticing, though, that this slow, gradual approach doesn’t always hold. Specifically, it seems to speed up quite dramatically at one point: the introduction of arXiv, the website where we store all our papers.

Part of this could just be a coincidence. As it happens, the founding papers in my subfield, those that started Amplitudes with a capital “A”, were written right around the time that arXiv first got going. It could be that all I’m noticing is the difference between Amplitudes and “pre-Amplitudes”, with people in the subfield sharing notation more than they did before they had a shared identity.

But I suspect that something else is going on. With arXiv, we don’t just share papers (that was done, piecemeal, before arXiv). We also share LaTeX.

LaTeX is a document formatting language, like a programming language for papers. It’s used pretty much universally in physics and math, and increasingly in other fields. As it turns out, when we post a paper to arXiv, we don’t just send a PDF: we include the raw LaTeX code as well.

Before arXiv, if you wanted to include an equation from another paper, you’d format it yourself. You’d probably do it a little differently from the other paper, in accord with your own conventions, and just to make it easier on yourself. Over time, more and more differences would crop up, making older papers harder and harder to read.

With arXiv, you can still do all that. But you can also just copy.

Since arXiv makes the LaTeX code behind a paper public, it’s easy to lift the occasional equation. Even if you’re not lifting it directly, you can see how the authors coded it. And even if you don’t plan on copying, the default gets flipped around: instead of trying to reproduce the previous paper’s equation by hand and accidentally getting it wrong, you start from their version, so every difference is intentional.
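
To make that concrete, here is a made-up illustration (the choice of formula and the symbol names are my own assumptions, not taken from any particular paper). Two authors retyping the same kinematic invariant by hand might each follow their own conventions, and a reader has to work out that the two lines mean the same thing:

```latex
% Hypothetical example: one quantity, typed by hand by two different authors.

% Author A's conventions: momenta called p, invariant called s.
\begin{equation}
  s = (p_1 + p_2)^2
\end{equation}

% Author B's conventions: momenta called k, invariant carries indices.
\begin{equation}
  s_{12} \equiv (k_1 + k_2)^{2}
\end{equation}
```

If Author B instead starts from Author A’s source, the first equation comes across character for character, and any later change (say, adding the indices) is a choice rather than an accident.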

This reminds me, in a small-scale way, of the effect of the printing press on anatomy books.

Before the printing press, books on anatomy tended to be full of descriptions, but not illustrations. Illustrations weren’t reliable: there was no guarantee the monk who copied them would do so correctly, so nobody bothered. This made it hard to tell when an anatomist (fine, it was always Galen) was wrong: he could just be using an odd description. It was only after the printing press that books could actually have illustrations that were reliable across copies of a book. Suddenly, it was possible to point out that a fellow anatomist had left something out: it would be missing from the illustration!

In a similar way, arXiv seems to have led to increasingly standard notation. We still aren’t totally consistent…but we do seem a lot more consistent than older papers, and I think arXiv is the reason why.


2 thoughts on “arXiv, Our Printing Press”

  1. Wyrd Smythe

    I’ve been wondering lately about how the electronic world in general will affect future history, especially now that we routinely record, not just images, but moving images complete with sound.

    The same process you describe with anatomy books and equations in arXiv papers may come to apply to history in general.

  2. ohwilleke

    As good a place as any to make this pitch. arXiv costs money and takes time. It is not a “for profit” venture and does not rely upon advertising revenue. It needs donor and volunteer support to keep running. Yet it is arguably one of the biggest global institutional assets known to the discipline, and through that to the advancement of the human prospect in the long run. I do my share to contribute to this critical part of the scientific infrastructure and you (readers and author alike) should too.

