An ironic young man ... may be viewed as a pest to society.

- Carlyle

Sunday, June 12, 2011

Archives, Antiquarianism, and the Digital Age

One day, they find (in the old papers from the mill) the draft of a letter from Vaucorbeil to the Prefect.
The prefect has asked whether Bouvard and Pécuchet are dangerously insane. The doctor's letter is a confidential report explaining that they are just two harmless imbeciles. They recapitulate their actions and thoughts, which for the reader should be a critique of the novel.
"What shall we do with this?"--No time for reflection! Let's copy! The page must be filled, the "monument" completed. All things are equal: good and evil, beautiful and ugly, insignificant and characteristic. There is no truth in phenomena.
End with a view of our two heroes leaning over their desk, copying.
- Gustave Flaubert, notes for the last chapter of Bouvard and Pécuchet [reprise from 3 years ago]
When you use an archive or a manuscript department in Russia, every dusty old file you work with comes with a little sheet of paper in the front with rows of dates and names. This is a list of everyone--often back to the 1940s--who's ever looked at this file before you, and, ideally, what they used it for. Adding your name to the list, as the archivists insist you do, is a little thrill: you feel like you're joining a whole tradition of anonymous scholarly scribblers who've tried to puzzle out the quirks in your manuscript. (Of course, when you find a blank sheet, or one on which the last entry is from 1953, it's an entirely different kind of satisfaction.)

Signing dozens of these things in the past few weeks has made me perplexed and even a little obsessed about the question of what it all means. Or, more specifically: what makes me different from them? If I see the same name recurring through all the folio miscellanies I've been reading, am I, in effect, just copying someone else's work, despite never having heard of them before? What am I even contributing here? These are not, I hasten to add, the kinds of texts that newish techniques of interpretation borrowed from anthropology or literary studies can really contribute to unpacking, at least not at this stage of the process. The relevant questions are much more basic. Who was the author of, say, this 18th-century list of forms of Sino-Russian diplomatic address tucked in the back of a 1787 collection of Russian-language sources on China? Why is the paper noticeably different from the rest of the book? For that matter, who made the book and why, who read it, who sold it and bought it, how did it end up in the Imperial Public Library? With manuscript books these problems are omnipresent, because they're all unique and heterogeneous texts in a much more profound way than any pomo novel. We know so little about the contents that even a discovery of authorship or the successful attribution of an ex-libris counts as a major revelation.

This is not a category of sources I'm used to using. As an eighteenth-century historian, paradoxically enough, I do almost all my primary-source work in the confines of Google Books in between posting on Facebook and reading rage comics written by anonymous nineteen-year-olds somewhere in Ohio. Even when the books I'm working with are relatively unknown, they're provided with title pages and call numbers and keyword classifications and even decodings of pseudonymous authorship and other esoteric bibliographica. In the reassuring high-contrast black-and-white of print, there is clearly a text: something firm and concrete and page-numbered that can be interpreted with the help of all the field-specific apparatus I've accumulated. With manuscript books, half the time I'm not even sure if the text has an end or not, and how many pages it might be missing.

There's one thing that the older historians who'd inscribed their names above mine definitely had over me. They still knew how to read handwriting. I don't think I've written anything by hand longer than a couple of paragraphs in years, and even beautiful modern handwriting can be a struggle to get through. Eighteenth-century Russian script, in which a good chunk of the letters are set off in a modified form above the line but look for all the world like indistinguishable squiggles, is just painful, especially if you're trying to transcribe it at any length.
And then, too, they were the beneficiaries of a tradition of scholarship that emphasized precise and rigorous technical training and source-mastery. In fact, really, that's the heart of the issue. Historians spent so long castigating their more strictly-trained predecessors for their obsession with antiquarian drudgery and their lack of imagination that they left us, their students, totally helpless in philological matters. Reading Soviet-era catalog descriptions is an especially humiliating experience, since the catalog's author inevitably manages to pry more information out of the text than you can even dream of doing. I can barely figure out what a watermark is supposed to depict half the time; they've not only decoded it but found the precise year and date the paper was manufactured.

It would be tempting to just take my helplessness as a given and give up on the thing entirely; there must still be some intellectual historians out there who can make a whole career without seeing the inside of an archive. But this is actually where things get interesting. As a historian of the digital generation, I am far better equipped than any of my predecessors, for all their tedious training, with the kind of technical information that was their hallmark. This is, essentially, the result of three pieces of technology which are monumental in themselves but which can now be taken more or less for granted:
  1. Scanning and ClearScan OCR, courtesy of Adobe Acrobat;
  2. Google Books;
  3. PDF and document indexing at the level of the operating system, whether in OS X or in Windows 7.
The confluence of these technologies means that I can type in any proper noun that seems to be important in a given text and have it produce results across a wide range of scanned reference materials and other scholarly guides that I keep on my computer (in addition to notes, papers, and so on that I've produced myself). Google Books is key for this, because Google Books now preserves virtually anything ever cranked out by a nineteenth-century antiquarian. Where in the past, even discovering the existence of a reference publication or collection of published documents--even one that made it into print in a run of several hundred--would often have required physically traveling to the library where the book is kept, I can now draw instantly on dozens of painstakingly assembled technical references covering virtually every aspect of my subject matter.

The delicious irony is that none of this is a natural Whiggish consequence of the evolution of technology. In fact, the whole thing is built on a central and highly contingent cultural factor that never seems to get mentioned: the existence of a period, whose end mostly corresponded with the last decades (1900-1920 or so) from which sources made it into the public domain, in which a substantial fraction of the scholarly world thought it was worthwhile to contribute tiny bricks to the edifice of human knowledge. We today mostly think those people were naive; to assert that the notion of an edifice was incoherent in the first place is basically conventional wisdom. But if the historiographical enlightenment that eventually led to the demise of positivism had emerged in the 1890s instead of the 1930s, then Google Books wouldn't be nearly as useful for us today. In a way, it turned out that those antiquarians were more right than they could have possibly imagined: with enormous book databases, indexing, and OCR, their bricks really did become part of an edifice that is looking more and more coherent every day.

My broader point here is that there's a lot less that separates the new, trendy digital knowledge from the old, musty book knowledge than we like to think, despite unenlightening commentary from both sides of the aisle. After all, check out the Wikipedia article on, say, Ultramontanism, and what do you find? Two citations at the bottom from the 1913 Catholic Encyclopedia and the 1911 Encyclopedia Britannica. Now, it's fairly obvious that the Wikipedian who produced the entry did not physically go to the library and purposely locate the oldest Catholic reference source he could find. Instead, he (or she, or more likely whatever bot was doing the actual work) went straight to the public-domain sources and took as much text as he could from there. Because of the extent to which the Catholic Encyclopedia and other similar texts are used throughout the Wikipedia world, the site has now become possibly one of the finest sources on obscure dogmatic questions anywhere. Modern nerds obsessively catalogue characters from One Piece; old nerds did the same for the Jansenist controversy--and the insults hurled at both groups are surprisingly similar. Turns out they're the winners in the end.

3 comments:

  1. I'm not sure what you're getting at by pronouncing antiquarianism dead as of the 1930s. In Southern history I find that almost everything of real worth has been done since the 1940s (which makes the older material in Google Books minimally useful and often misleading, at least for anything besides the Civil War). And I would associate the demise of the culture that produced the 1911 and the Catholic Encyclopedia with WWI, not the "demise of positivism." (When I wrote about Wikipedia & the 1911 I quoted it on the subject of modern artillery: "Massed guns with modern shrapnel would, if allowed to play freely upon the attack, infallibly stop, and probably annihilate, the troops making it.")

  2. Positivism, in the way I'm using it here, means "an approach to knowledge that treats it as essentially reified and fungible," which was the governing (if often unarticulated) philosophy of many, many of the people who spent their lives putting together those enormous document collections and references. By the 1930s this was already beginning to look outdated, although it wasn't quite dead yet. I don't think there's any necessary conflict between this kind of process and an event like WWI.

    To me, 19th century antiquarian document collecting and publishing (and the associated study of historical minutiae) has a very different feel and flavor from the same type of work that was done after the 1930s. It is possible that this varies by field.

  3. This is really cool. Without being too reductive of a very complicated history of historiography, can I suggest it would be useful cautiously to distinguish the project of information-gathering from that of information-processing? We may make something quite different of 'how things actually were' from time to time, but having the materials at hand to do the making with is pretty important no matter what.
