The Schoenberg Institute for Manuscript Studies at Penn brings manuscript culture, modern technology and people together.


Digital Manuscripts as Critical Edition

The following post is the written version of a presentation that Christoph Flüeler, Director of e-codices and Professor at the University of Fribourg, gave at the 50th International Congress on Medieval Studies in Kalamazoo, MI, in May 2015. It has been very lightly edited by Dot Porter. Prof. Flüeler has long been a leader in digital manuscript studies, and in his talk he proposed an exciting vision of digital manuscripts as critical edition. With Prof. Flüeler’s permission, we are very pleased to share his talk here on the SIMS blog. He will soon develop these thoughts into a longer article, which will be published in a more formal venue.

The point of departure for my contribution is as follows: in coming years an enormous number of manuscripts, tens of thousands of them from thousands of manuscript collections throughout the world, will be digitized and made available on the Internet. A few years from now perhaps a majority of all manuscripts of great cultural, artistic, and scientific value will be accessible online. As this happens, quality requirements regarding image quality, metadata, and user interfaces will markedly increase, and standards will be established, so that all over the world metadata and images can be processed and annotated via comprehensive and specialized manuscript portals and interoperable image viewing platforms. This presumption is based on careful observation of developments during the past ten years and of the large number of projects currently planned or in progress. Everyone who attended yesterday’s session entitled “All Medieval Manuscripts Online: Strategic Plans in Europe” with presentations by the British Library, the Bibliothèque nationale de France, the Bayerische Staatsbibliothek München, and e-codices knows that I refer here only to concretely planned projects.

If digital manuscripts become ever more important for scholarly research in the future, the following question arises: what is the “scholarly research value” of digital manuscripts?

Discussion of the matter has thus far been conducted in an undifferentiated manner by persons interested in defending the exclusive status of the originals and who in some cases go so far as to question whether digital reproductions have any scholarly research value at all. This point of view strikes me as rather unconstructive, because it simply dismisses as unreliable these resources on which most researchers already rely, and on which they will in future base their work to an ever greater degree.

My perspective is a bit different. What we need to do is to ask the following question: what preconditions must be met in order for a digital manuscript to be understood as a reliable resource for scholarly research, such that a scholarly researcher can, without any great misgivings or doubts, utilize the digital object as the basis for serious research and make use of it to the fullest possible extent?

Central for my reflections is the different status of the physical manuscript and the digital manuscript. The fact that the relationship between physical manuscript and digital manuscript has barely been examined up to this point is rather astonishing. It is probably because a serious theoretical consideration of the immediate precursors of the digital manuscript was never undertaken; I speak here of print facsimiles and microfilms. Facsimile editions are hugely popular with collectors. The production of facsimiles is normally understood as a work of fine craftsmanship. While scholarly researchers are employed in their production, they contribute only the accompanying commentary. Theory has obviously been considered out of place when it comes to the production of facsimiles.

Microfilms, on the other hand, have always been seen as not particularly attractive research aids, are often incomplete, often contain errors, and are as a rule only black-and-white. They are still maintained as archival copies, but for scholarly researchers their usefulness as reproductions has (for the most part) been superseded.

In this context I will not raise the matter of the qualities that distinguish a digital manuscript from a facsimile edition or a microfilm. The advantages of the digital manuscript are too obvious to require enumeration here.

I would like to ask, instead, how a digital manuscript stands in relation to a critical edition of a text. Can the publication of a digital manuscript on the internet be understood as an edition? Further: could such an edition even be regarded as a critical edition?

I would like to consider again the statement I made earlier, in which I asserted that for scholarly research purposes a digital manuscript must be understood as a reliable resource to the extent that medievalists from various disciplines (e.g. History, Art History, History of Law, History of Philosophy, Classical Philology) can utilize the digital object as the basis for serious research and make use of it to the fullest possible extent.

This echoes the proper purpose of a critical text edition. A critical text edition does exactly this, and the science of creating editions has since the 19th century developed methods for achieving this goal. A critical text edition aims to create an authoritative and easily accessible text. Its usefulness is, however, often far greater: a critical text edition can, for example, highlight the historical dimensions of the transmission of a text and use a critical apparatus to tease out intertextual aspects of the text in ways that far exceed simple transcription. In addition, a critical text edition can drill down to a more original text, identify errors in transmission, and provide a text so convincing in its authenticity that it comes to be accepted in the scholarly research community as an authoritative version of the text.

If we do not insist that the definition of edition can only be applied to a traditional text edition, we can in point of fact understand the publication of a digital manuscript on the Internet as a scholarly edition.

In the meantime there are already thousands of texts which have received their first publication as digital manuscripts. This is also true of hundreds of texts found on e-codices. It is important not to underestimate the usefulness to scholarly research of this additional method of editing, especially for texts that have never been edited previously and that would perhaps otherwise never have been critically edited.

What scholars need are good, scientific editions. This is true for both text editions and editions of digital manuscripts. We can only regard as serious critical editions those that follow established scientific criteria, developed with a firm grounding in the concept that the publication can substitute for the original as a resource for research, up to a certain point and for specific purposes, and that it offers some type of added value beyond that of the original. A digital manuscript, like a traditional critical edition, is not merely a cheap copy, but ideally can show aspects of the primary resource, i.e. the original manuscript, that were not visible in such a way when viewing the original.

It is obviously important to note that the critical edition of digital manuscripts is a different task from the critical edition of texts transmitted in manuscripts. It is, however, not any less exacting.

No edition theory has yet been written concerning digital manuscripts. I can only briefly enumerate some relevant themes and desiderata.

A digital manuscript edition should, like a critical text edition, follow documented scholarly research criteria. It should not produce a plain, unexamined reproduction of the material object (in this case a physical manuscript) but should, as I have already emphasized, create some added value and bring out new aspects of the manuscript that have not previously been observed or recognized. A digital manuscript should also, obviously, provide a reliable foundation for current research on the original manuscript.

The most authentic possible scientific reproduction is the first step. Completeness, high image quality, and true color must be provided. Measurability and verifiability are fundamental to access for all purposes of scholarly research. Digital manuscripts consist of digital reproductions. It is therefore essential to provide not only metadata about the manuscript, but also metadata about the digital object. The colors must be measurable, not only by using a simple color sample strip, but by employing a complex Color Management System. This is actually already standard these days, but as soon as the files are uploaded to the Internet, all the care that goes into this is often ignored. Image metadata, such as IPTC metadata, should be available together with the digital image, and should be linked closely enough that when images are transferred (for example, into another image viewing platform) the image metadata are automatically attached. Dimensions should be measurable in every part of a manuscript. Simply including a ruler in an image is here, as in other cases, not sufficient; a digital measuring tool with flexible usability would always be preferable. In practice, we are for the most part still a long way from such precise, reliable and measurable digital images at this point; however, they are fundamental for serious scientific work. How is one to conduct serious research with images, if the images on the screen are often slightly distorted, the colors are not accurate, and no reliable measuring tool is available? Not to mention poor resolutions of less than 300 dpi! Products like this are simply a waste of money.
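To make the idea of a digital measuring tool concrete, here is a minimal sketch of the arithmetic such a tool would rest on. It assumes that the capture resolution (pixels per inch of the physical page) is recorded in the digital object metadata and that the image is not distorted; the function names are hypothetical and do not come from any existing viewer.

```python
import math

MM_PER_INCH = 25.4


def pixels_to_mm(pixels: float, dpi: float) -> float:
    """Convert a length in image pixels to millimetres on the physical page.

    Assumes `dpi` is the capture resolution relative to the original object.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return pixels / dpi * MM_PER_INCH


def distance_mm(p1: tuple, p2: tuple, dpi: float) -> float:
    """Distance in millimetres between two (x, y) pixel coordinates."""
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    return pixels_to_mm(math.hypot(dx, dy), dpi)
```

A call such as `distance_mm((0, 0), (300, 400), dpi=300)` would measure the diagonal between two points chosen on the page image. The result is only as trustworthy as the recorded resolution, which is why the metadata about the digital object matters.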

Digital manuscripts do not consist merely of digital reproductions though. A digital manuscript is a virtual product that reproduces a tangible object in its entirety. This includes the proper sequencing of images. A data model must ensure that the image sequence remains intact when displayed in other viewing platforms. The same is obviously true for metadata regarding the physical manuscript and the digital manuscript, which aid in understanding the manuscript as manuscript, but also as digital object. I am referring to metadata in the broad sense. This includes: basic metadata, structural metadata, scholarly descriptions, image descriptions, metadata regarding codicology, digital object metadata, reports about additional restoration, and ideally even the full range of existing research literature. Finally, and very importantly, this includes the transcriptions and editions of the text contained in the manuscript. If a critical edition of a digital manuscript is to comprehend the physical manuscript in its entirety, then text editions form part of it. In the future, text editions should not be understood as separate from digital objects, but as integral parts of them. I do not regard these parts as competing with one another, nor the text edition as an absolute condition; rather, they are complementary pieces of the ideal whole. Metadata can be added as desired: the richer the data included, the greater the usefulness and the scholarly research value.
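As a rough illustration of what such a data model might look like, the sketch below stores an explicit sequence number with every image and keeps metadata alongside the images, so that another platform could restore the order and detect gaps. All class and field names are hypothetical; this is not the model used by e-codices or by any existing standard.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PageImage:
    sequence: int   # position in the physical codex, starting at 1
    label: str      # folio label, e.g. "1r"
    filename: str   # hypothetical image file name


@dataclass
class DigitalManuscript:
    shelfmark: str
    images: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)  # descriptive and digital object metadata

    def ordered_images(self) -> list:
        """Return the images in codex order, however they were stored."""
        return sorted(self.images, key=lambda img: img.sequence)

    def check_sequence(self) -> bool:
        """True if sequence numbers run 1..n with no gaps or duplicates."""
        numbers = sorted(img.sequence for img in self.images)
        return numbers == list(range(1, len(numbers) + 1))
```

Because the order is carried by the data rather than by file names or upload order, a receiving platform can call `check_sequence()` before display and refuse a manuscript whose pages are missing or duplicated.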

A digital manuscript can and should be used to show more than is visible or explicitly contained in the original. Illustrations can be enlarged. Structural elements of the codex and the text can be accentuated. Individual illustrations or parts of the text can be annotated, and transcriptions and editions can be set next to the page images. Codicological features such as quires, watermarks, and color analysis can not only be provided, but can even be analyzed and interpreted within a digital manuscript. The research area of Image and Text Recognition is hard at work on tools to recognize and analyze layout, script types, scribal practices and eventually even texts.

It is important to emphasize that a digital manuscript should display a manuscript in its entirety. But we should even go a step further. A critical edition of a digital manuscript should not treat only a single manuscript, but should include as much data as possible about other related manuscripts and sources, in order to promote viewing the special qualities and features of the particular manuscript in a broader context. In this area as well the established methods of scientific editing aid me in developing criteria that can be applied to digital manuscript editions.

One fundamental task when editing critical editions of medieval texts transmitted in manuscripts is to collate individual transcriptions of texts and thereby obtain new information. The critical apparatus presents variations of the text as transmitted by the manuscripts used for the edition. This critical apparatus delivers indications of explicit and implicit references to other works as well and unfolds the intertextuality of the text. This means that a critical text edition goes beyond transmission of the text found in a single manuscript.

A critical digital edition of a manuscript can, for example, expand the quire analyses, descriptions of illustrations, script analysis, structural analysis, and watermark analysis of a single manuscript via metadata for another digital object, or other objects can be incorporated for the purpose of gaining new information. Let me offer just one example: quire composition and layout analysis can be performed across manuscripts from the same scriptorium or other scriptoria in order to recognize features peculiar to a particular manuscript, a scriptorium, or an entire epoch. A digital manuscript is thus more than just a digital version produced from a single physical object. It effectively has the potential to incorporate the entirety of manuscript transmission contained in all medieval manuscripts.

The publication of medieval manuscripts on the Internet has made amazing progress during the past ten years. Digital manuscript libraries have transcended the status of pilot projects. Digital manuscript libraries have become more professional and have by now become an essential part of the research infrastructure. This is surely due to the fact that not just a few individual manuscripts, but over 15,000 medieval manuscripts have been presented online up until now.

However, the success and importance of digital manuscript libraries depend not so much on the number of digitized manuscripts as on the scientific quality of those digital manuscripts, which can only achieve fundamental change in the area of manuscript research through a critical theory of the digital manuscript.

Thank you for your kind attention.

Kalamazoo, May 15, 2015

Christoph Flüeler


OPenn:  Primary Digital Resources Available to All through Penn Libraries’ New Online Platform

The Penn Libraries and the Schoenberg Institute for Manuscript Studies are thrilled to announce the launch of OPenn: Primary Resources Available to Everyone, a new website that makes digitized cultural heritage material freely available and accessible to the public. OPenn is a major step in the Libraries’ strategic initiative to embrace open data, with all images and metadata on this site available as free cultural works to be freely studied, applied, copied, or modified by anyone, for any purpose. It is crucial to the mission of SIMS and the Penn Libraries to make these materials of great interest and research value easy to access and reuse. The OPenn team at SIMS has been working towards launching the website for the past year. Director Will Noel’s original idea to make our Medieval and Renaissance manuscripts open to all has grown into a space where the Libraries can collaborate with other institutions who want to open their data to the world.

OPenn launches with the entire corpus of manuscripts donated to the Penn Libraries in 2011 by SIMS founder Lawrence J. Schoenberg and his wife Barbara Brizdle Schoenberg. The Schoenberg Collection features manuscripts from all over the world, with a focus on science, technology, engineering and mathematics. To interest the public in the visual splendor of materials on OPenn we have uploaded some images from the Schoenberg Collection onto Flickr, with links in the records to OPenn.

More datasets, including manuscripts from the University of Pennsylvania’s own holdings and items from other institutions, will be added to the site in the near future, so stay tuned.  Historic diaries from a variety of institutions belonging to the Philadelphia Area Consortium of Special Collections Libraries (PACSCL) are next in line for inclusion on OPenn.  Many of these documents are unknown while others are celebrated, such as the Union League’s Tanner manuscript: a firsthand account of the events surrounding the assassination of Abraham Lincoln.

Images of the manuscripts are currently available on OPenn at full resolution, with derivatives also provided for easy reuse on the web.  Downloading, whether several select images or the entire dataset, is easily accomplished by following instructions or recipes posted in the Technical Read Me on OPenn.  The website is designed to be machine-readable, but easy for individuals to use, too.

SIMS’ very own Dot Porter has already used the dataset to create e-books from the images and metadata on OPenn. You can download the e-books in the free and open epub format at Penn Libraries’ Scholarly Commons. She has also used the Internet Archive BookReader, an open source online page-turning book reader, to generate online versions of each manuscript, including one of LJS 225, Litterarum simulationis liber. Manuscripts in OPenn can also be searched and browsed together with digitized manuscripts from The Digital Walters. These formats serve as excellent tools for raising awareness of manuscript culture and for showcasing manuscripts’ unique graphics and aesthetic appeal. OPenn also enables rigorous study and scholarly discovery by increasing ease of study for researchers interested in these manuscripts. For instance, images of individual pages can be manipulated to re-create the order in which the pages were written, as opposed to the order in which they were collated for binding, providing leeway in exploration that researchers might not have otherwise.

These are just a few ways the data can be manipulated, but we anticipate surprises once scholars and researchers begin using data on OPenn. We hope you are inspired to reuse OPenn data and to share your project with the world. If you have any questions or comments, send us an email.


Manuscript Road Trip: Reconstructing the Beauvais Missal

The latest Manuscript Road Trip post by SIMS friend Lisa Fagin Davis announces a new adventure in digital fragmentology.

Manuscript Road Trip

The Flight into Egypt, Walters Art Museum, MS W.188, f.112r

If you’ve been travelling with me on this virtual road trip around the United States, you have almost certainly come to know the dismembered beauty known as The Beauvais Missal. I’ve mentioned it many times and shown you several different leaves found in various collections. And I’ve ruminated about the possibility of digitally reassembling this masterpiece of thirteenth-century illumination. Well, it’s time to stop dreaming and start doing.

Cleveland Museum of Art, Acc. 1982.141 verso

Working with the “Broken Books” project at the St. Louis University, I have begun a digital reconstruction of the Beauvais Missal. The “Broken Books” project will result in the development of a platform for reconstructing broken books as well as the establishment of a metadata structure designed specifically for manuscript fragments and leaves. My Beauvais Missal project will serve as one of several case studies in…



Ms. Codex 909, [Le livre des Eneydes]: A Fine Example of Lettre Bâtarde

My interest in Ms. Codex 909 began last summer while I was taking a non-credit course, Introduction to Paleography, offered by the Schoenberg Institute for Manuscript Studies and taught by Penn manuscript cataloger Amey Hutchins and Schoenberg Database of Manuscripts project researcher and English doctoral candidate Alex Devine.  I would highly recommend it to anyone with an interest in paleography, whether you’re a graduate student, librarian, or independent scholar.  Each student was asked to give a final presentation on the script used in a manuscript of his or her choosing.  While there were many intriguing manuscripts to choose from, I landed on [Le livre des Eneydes], the first French translation of Virgil’s Aeneid.  It is, in fact, one of four extant manuscripts known to contain Octovien Saint-Gelais’s translation.  Of the other three manuscripts, two are located at the Bibliothèque Nationale de Paris, while the remaining codex can be found at The Hague.

While I transcribed the first two folios for the project, I did not attempt a reading of the entire manuscript due to time constraints; however, an article written in 1990 by Thomas Brückner goes into some detail about Saint-Gelais’s groundbreaking translation.  According to Brückner, the translation is a loyal reproduction of the narrative, but the Virgilian style of the epic poem is not painstakingly interpreted, as later Renaissance translators would attempt under the influence of the ancient poetic concept of imitatio.  Saint-Gelais often omits words, usually descriptive words that were used by Virgil for the sake of embellishment, but rarely an entire verse.  He injects numerous Latinisms into his text, taking a Latin word and giving it a French ending.  Brückner also notes that he draws words from the gloss of Servius at times, and not from the original Latin.  In a way, Saint-Gelais edits Virgil’s text too.  He inserts explanations that do not appear in the original epic for the sake of the reader, due to the fact that he is translating from hexameter into decasyllabic couplets and details may get lost from one line to another.[1]

The manuscript begins with a beautiful sloping ductus written in lettre bâtarde, sometimes called lettre bourguignonne.  Its namesakes are the Dukes of Burgundy, Philip the Good and Charles the Bold, who were patrons of the arts in the 15th century and who commissioned many deluxe manuscripts in the French vernacular.  Penn’s manuscript itself may have been meant for a noble audience, as the large blank spaces left for illumination may indicate.  The script’s other name, lettre bâtarde, refers to the hybridity of the script that displays characteristics of Gothic Textura but incorporates calligraphic features as well.  The Gothic influence can be seen in the single compartment a, the minims that are hard to tell apart, quadrangles that are sometimes formed at the bottom of the minims, and the inclusion of spiky details like horns.  The scribe uses calligraphic technique to create the looping ascenders of the b, h and l.  The letter f and the straight s slope from right to left across the page and are a tell-tale sign of lettre bâtarde.  They are created using a quill with a flexible nib, which can create great differences in the width of strokes.  The f and straight s are thin at the top, with a broad thick stroke in the middle that thins again to a pointed descender with a thin hairline stroke.  The calligraphic technique also includes many flourishes to the letters.


The minims are reminiscent of Gothic Textura


The sloping straight s is typical of lettre bâtarde

As I was flipping through the manuscript for the first time, I noticed what appeared to be two changes in the scribal hand.  I decided to focus my presentation on a comparison of the three hands to determine if my initial thought was correct.  For the sake of ease, I’ll call the hands the first hand, the second hand and the third hand based on where the change occurs in the book.  The change from first to second occurs around f. 50v and the switch from the second hand to the third hand occurs at f. 63r.

I can only estimate where the first change occurs, because it is much harder to tell where that switch in hand happens. It isn’t until f. 50v that I feel most certain it’s another hand. There are a number of differences that become apparent as you move closer to f. 50v, but the change seems to happen gradually. The script becomes smaller and the o, a, and d become rounder. The minims are more reminiscent of Gothic script and the loops on the ascenders are smaller. The l, f and straight s become shorter. The first hand is more angular and the f and s are more slanted, more like archetypal lettre bâtarde. The first hand has more cursive thin hairline strokes and the highest level of execution of the lettre bâtarde of all three hands. The gradual change in the appearance of the script could mean a few things, and raises the question of whether these are truly two distinct hands. It could mean that there was only one scribe, who slowly modified their hand until it came to look quite different by f. 50v. On the other hand, no pun intended, a second scribe may have been trying to imitate some of the features of the first scribe but slowly reverted to their own stylistic idiosyncrasies. I would be interested to hear the opinions of more learned scholars than myself who might want to take a look at Codex 909.


First hand


Second hand


Third hand

The change from the second hand to the third hand is much more noticeable and I have more confidence that it is another scribe altogether.  Only after f. 63r do the ascenders of each top line of every page double in size, taking up the height of almost an entire line (see f.62v-63r below).  The hand looks a bit sloppier, as if it was written more hastily, and in fact there are some erasures and added lines that suggest this may have been the case.  The writing also takes up more space on the page and the lines often run past the ruled lines.  The strokes are thicker and there is more embellishment to the letters, especially compared to the second hand, which is the most spare of the three.  Of all the letters, the easiest ones to look at to identify differences were the d and the straight s in this codex.

f. 62v and f. 63r

I really enjoyed my time looking closely at this manuscript, which is a fine example of lettre bâtarde. As Albert Derolez reminds us, “the impact of the individual scribes on the appearance of [lettre bâtarde] was quite strong, and, whilst the main characteristics remain the same, the visual impression produced by different pages of Bastarda script can be very varied.”[2] This is certainly the case with Codex 909 and it was what made it such an interesting manuscript for my final project. I hope this post piques your interest in my findings. I invite you to take a look at Le livre des Eneydes on Penn in Hand or come see it in person in the Kislak Center reading room.

[1] Brückner, Thomas.  Un traducteur de Virgile inconnu du XVIe siècle : Jean d’Ivry.  Les lettres Romanes, XLIV, n.3, August 1990, pp. 171-180.

[2] Derolez, Albert.  The Paleography of Gothic Manuscript Books from the Twelfth to the Early Sixteenth Century.  Cambridge:  Cambridge University Press, p.160.


Manuscript Road Trip: The Schoenberg Institute for Manuscript Studies

Manuscript Road Trip

The Flight into Egypt, Walters Art Museum, MS W.188, f.112r

As we head north out of Baltimore on I-95, we’ll cross the Delaware River and head into Wilmington, where there are manuscripts to be found at the University of Delaware.

The pre-1600 manuscripts at the University are part of a collection with the shelfmark “MSS 095.” There’s a list of the relevant records here and some highlights are described here. Of particular interest to me is a relatively recent acquisition, U. Delaware MSS 095 no. 31, a Book of Hours for the use of Noyon. There aren’t any images on the Special Collections website, but there are a few on this blogpost written by a Special Collections staff member, as well as a little information about the manuscript’s history. But I’d like to know more…how did it get to Delaware, and what can be gleaned about its history before…



LJS 454 – Seiyō Senpaku Zukai

For the majority of the Edo period (1600-1868), the Japanese shogunate enforced a policy of isolationism referred to as the sakoku policy, which was codified in the 1630s and ended with Matthew Calbraith Perry’s (1794-1858) high-pressure negotiations to open Japan to Western trade. The sakoku period, however, did not relegate Japan to the status of hermit kingdom: trade was conducted with both China and Korea, as well as with the Ryukyuan and the Ainu peoples (each of whose domains would eventually become annexed by Japan). Of Western powers, however, only the Dutch were permitted to trade with the Japanese, and only on a small artificial island at Nagasaki Harbor. Along with material goods the Japanese imported a great deal of so-called “Dutch Learning” (Rangaku): medicine; astronomy; geography; engineering; and, as LJS 454 demonstrates, naval sciences.

LJS 454, Seiyō senpaku zukai 西洋舩舶圖解 (uniform title Gunkan zukai 軍艦図解) is documented innocuously enough in The Lawrence J. Schoenberg Collection of Manuscripts (Philadelphia : Schoenberg Institute for Manuscript Studies, 2013) with the descriptive title “Treatise on how to pack a Dutch merchantship.” It was only this past year that the manuscript was made accessible to the Japanese Studies department, who translated the title slip on the scroll Seiyō senpaku zukai as “An illustrated guide to Western ships”. But neither of these titles hints at the original intent of this work: a practical guide to naval self-defense.

Historical Background

In 1792, the Finland-Swede Adam Kirillovich Laxman (1766-1803?) was commissioned by the Russian Empire to return two Japanese castaways to Japan, with the aim of acquiring trading rights from the shogunate. Laxman landed on Hokkaido and was received by the Matsumae clan, the rulers of the northernmost fiefdom of the Japanese shogunate centered at Edo (present day Tokyo). While Laxman’s request for trade concessions was not granted, he was issued documents promising that one Russian ship would be permitted entry at Nagasaki. It would take more than another decade, however, for Russians to attempt to use this travel pass.


“Plattegrond van de Nederlandse faktorij op het eiland Deshima bij Nangasaki” (1824/1825) (source: Wikimedia)

In the early 1800s, Nikolai Petrovich Rezanov (1764-1807) was commissioned by Tsar Aleksander I to open up trade with Japan at Nagasaki. Despite his attempts to woo the shogunate in 1804, the documents received from the Matsumae clan were not recognized, and Rezanov was sent back to Russia. Embittered by his failure, Rezanov plotted revenge against Japan, and employed two Russian naval officers, Nikolai Khvostov and Gavriil Davydov, to enact his vengeance. The two led a devastating raid on the Japanese settlement at the island of Iturup (whose territory is still in dispute between Japan and Russia today), and at several other points in the Sea of Okhotsk. Along their warpath, Khvostov and Davydov sent a threatening missive in French to the Matsumae clan, warning that further attacks would come if Japan didn’t open itself to Russian trade.


Portrait of Motoki Shōei. (via the City of Nagasaki website)

Despite the fact that these two officers acted in no official capacity, the shogunate considered this a legitimate threat from the Russian Empire, and the Dutch interpreters at Nagasaki were ordered to expand their skillsets by learning French and Russian. One of the interpreters chosen to learn French was Motoki Shōei 本木正栄 (1767-1822), also called Motoki Shōzaemon 本木庄左衛門. Shōei was the son of Motoki Ryōei (also read “Yoshinaga”; 1735-1794), who made a name for himself by translating Dutch books on natural sciences, in particular astronomy. Shōei followed in his father’s footsteps as a Dutch interpreter, and his language skills were advanced enough that he was chosen to act as an official interpreter for Rezanov’s mission to Japan in 1804 (Rezanov himself, however, did not have a positive assessment of Motoki, and requested a new interpreter during negotiations). Besides forming the basis of Motoki’s French studies, the Khvostov and Davydov incident also left the government at Edo nervous about Western naval strength. At the behest of the shogunate, Motoki was chosen to translate critical Western materials into Japanese, including a treatise on Dutch gunnery, a map of the world, and a pictorial guide to Dutch warships. While the exact titles of the original materials are not clear, it is this final item that seems to be the basis of the original text of LJS 454, Gunkan zukai.

Gunkan Zukai and its Manuscripts

LJS 454 is one of several manuscript copies of Gunkan zukai extant in the world, and one of the only known copies existing outside of Japan. The variant copies available for inspection show that the textual content remains consistent across extant copies.

LJS 454 scroll

LJS 454 scroll with title piece “Seiyō senpaku zukai”.

The work is broken into three major parts. The first is a general survey of ships, with the section title Gunkan zukai kōrei 軍艦圖解考例 (“Introductory thoughts on illustrations of warships”). This introductory segment is likely the derivation of the title Gunkan zukai, though it is unclear if Motoki intended for his work to be called that. The kōrei is a lengthy discussion of various aspects of ships, including the circumstances leading to the document, the classification and nomenclature of ships, and remarks on the experience of sailing. This section ends with an attribution to Motoki.

The next major section is a series of illustrations. Some, like the copy held at the Museum of Sea Sciences (with a closeup of illustration here) in Kotohira, Kagawa, show finely detailed shading on the illustrations. That copy, incidentally, is reportedly in Motoki’s own hand, and was owned at one time by the revolutionary Sakamoto Ryōma (1836-1867). Other copies, like Penn’s LJS 454 and Waseda University’s copy (fully digitized), have unshaded diagrams. Still other copies, like the one owned by Tokyo Metropolitan Library (available in reprint), have both shaded and unshaded elements. Other variations include levels of rubrication and the order of illustrations. Finally, some copies have clear notations on their date of copying. The copy held at the Nagasaki Prefectural Nagasaki Library (also available in a 1943 reprint) has a copying date of 1842. LJS 454, unfortunately, has no such information to help date it, though it could have been produced no earlier than 1808.

The final section is a series of remarks on the methods of nautical warfare, and is ostensibly the purpose for this work, despite it being the shortest section of the three.

While Motoki’s work is commonly referred to as Gunkan zukai, again, there is no direct evidence that his document was intended to have that title. The copy in Kotohira (reported to be in Motoki’s hand) is referred to as Seiyō gunkan kōzō bunkai zusetsu 西洋軍艦構造分解図説 (“A pictorial analysis of the structure of Western warships”). The Union Catalogue of Early Japanese Books (Nihon Kotenseki Sōgō Mokuroku) database, an authoritative source for information on Japanese books, offers the variant title Furansu gunkan kaibōzu 払郎察軍艦解剖図 (“An anatomy of French warships”). LJS 454, meanwhile, has a prominent title piece offering Seiyō senpaku zukai 西洋舩舶圖解 (“An illustrated guide to Western ships”). Saigusa Hiroto and Kodama Reizō, the explicators of the reprinted Nagasaki manuscript, had known of this last title but were unable to verify that it was a variant title of the work Gunkan zukai. LJS 454 confirms their supposition that the works are one and the same.

Source Materials

While Motoki is commonly considered the “translator” of this work, in the attribution of LJS 454’s “Introductory thoughts” he is referred to as the yakujutsu 訳述. This is a compound of two roles: translator (yaku) and “expressor” (jutsu). In the context of Gunkan zukai, yakujutsu might be understood as “creator by way of translation.” Indeed, it appears that Motoki translated and recontextualized several elements of Dutch and possibly French materials to create a new work.

The introductory segment of Gunkan zukai (the kōrei) notes publications that served as its foundation, including a specific reference to a diagram published by “Korunerisu Kiri[p]peru” (Cornelis Kribber, active 1739-1780) in Utrecht. While the specific Kribber print is not immediately available for inspection, a likely related print from 1730s Nuremberg shows remarkable similarities to Motoki’s illustrations. Many of these same illustrations appear in L’Art de batir les vaisseaux et d’en perfectionner la construction, originally published in Amsterdam in 1719. This French edition itself seems to be a compilation from earlier Dutch works. While Motoki likely used a source similar to one of these, it is unclear if his translations derived from Dutch sources exclusively or if they drew from French compilations of them. At best, there are unclear references in Motoki’s manuscript notes on compiling Gunkan zukai (held at the Nagasaki City Museum) to a colleague who owned a pictorial guide to Western ships.

Gunkan zukai sundials

Comparison of three sundial images. From left to right: 1730 Nuremberg print; Waseda University’s Gunkan zukai; LJS 454.

Motoki’s manuscript notes notwithstanding, it is still unclear how many items he used as his source materials, if any were owned by Dutch traders at Nagasaki, and if any were in French. It is also unlikely that Motoki would have acquired significant command of the French language in the months between the Iturup incident in February 1808 and the creation of Gunkan zukai in summer of the same year, though he could have made use of a Dutch/French dictionary on hand at Nagasaki to translate diagrams.

The Legacy of Motoki and Gunkan Zukai

While Gunkan zukai may have been commissioned with the intent to protect Japan against Western threats by using Westerners’ knowledge against them, only a few short months after its initial compilation, Japan once again faced a rogue Western commander. In October 1808, the HMS Phaeton under the command of Fleetwood Pellew (1789-1861) entered Nagasaki harbor in an attempt to capture Dutch trading ships, which were now under the authority of the newly Napoleonic “Kingdom of Holland.” In an attempt to fool the Dutch, Pellew flew the Dutch flag on the Phaeton. When several Dutch traders at Nagasaki rowed out to meet this false friend, Pellew revealed the ship’s true colors, capturing the Dutch and threatening to execute them as well as destroy other ships in the harbor. Outgunned, the Nagasaki government gave in to Pellew’s demands.

With the English now demonstrating a potential threat to Japanese interests, the Japanese government ordered its Dutch interpreters to add English to their list of languages. Once again, Motoki Shōei was tasked with learning a new Western language. Motoki went on to create the first English grammar in Japan, Angeria kōgaku shōsen 諳厄利亞興學小筌 (“A beginning to studying English,” 1811), and later the first Japanese-English dictionary of some 6,000 words, Angeria gorin taisei 諳厄利亞語林大成 (“The complete forest of English,” 1814). He also compiled a Japanese-French dictionary and grammar, Furansu jihan 払郎察辞範 (“A model of French vocabulary”), completed in that same year 1814. While none of these texts became standard texts, they surely served as references for future students of Western languages in Japan.

As demonstrated with the attacks at Iturup and the all-too-subsequent Phaeton Incident, Japan’s isolationist policy was simply not strong enough to secure the nation without also assimilating knowledge from the very cultures against whom it was protecting itself. Moreover, despite the strict sakoku policy, unwanted visitors would continue to find their way into Japanese-controlled territories. In only a few short decades Japan would find the chains of sakoku broken with the arrival of Matthew Perry’s “black ships.”

Whether Motoki’s detailed Gunkan zukai was ever used for practical reference is unknown, though with at least seven documented copies in Japan and an eighth here at Penn, it is clear that his work was respected for its invaluable knowledge of 18th century Western maritime culture.

Selected Bibliography

  • Gunkan zukai. Suijōsen setsuryaku 軍艦図解. 水蒸船說略. Edo kagaku koten sōsho 46. Kōwa Shuppan, 1983.
  • Katsumori, Noriko 勝盛典子. “Gunkan zukai” to “Hippokuratesu zō” : Oranda tsūshi Yoshio-ke no bunka bunsei-ki [Gakugeiin no notō kara 65] 「軍艦図解」と「ヒポクラテス像」―阿蘭陀通詞吉雄家の文化・文政期 [学芸員のノートから 65]. [Kōbe Shiritsu] Hakubutsukan dayori 68, p. 6-7, 2000.
  • Loveday, Leo. Language contact in Japan : a sociolinguistic history. Clarendon Press, 1996.
  • March, G. Patrick. Eastern destiny : Russia in Asia and the North Pacific. Praeger, 1996.
  • McOmie, William. From Russia with all due respect : Revisiting the Rezanov Embassy to Japan. The human studies 163, p. A71-A154, December 2007.
  • Sangyō gijutsu hen. Kaijō kōtsū 産業技術篇. 海上交通. Nihon kagaku koten zensho 12. Asahi Shinbunsha, 1943.
  • Tsuzuki, Ichirō 続一郎. Motoki Shōei yakujutsu “Gunkan zukai” to Itō Keisuke yaku “Banpō sōsho gunkan hen yakkō” ni tsuite : Furansugo kotohajime no kanren 本木正栄訳述の「軍艦図解」と伊藤圭介訳「萬宝叢書軍艦篇訳稿」について―フランス語事始との関連. Rangaku shiryō kenkyū 307, p. 117-133, 1976.


Libraries Supporting Digital Scholarship: The Schoenberg Institute for Manuscript Studies as an Object Lesson

A version of this talk was presented as the keynote for the annual meeting of the Association of College and Research Libraries – Delaware Valley Chapter, in Philadelphia PA on November 6, 2014.

Thank you very much, and thank you especially to Terry Snyder for inviting me to speak with you all this morning. Today is a good day to talk about the Schoenberg Institute for Manuscript Studies (SIMS); after this talk I will be heading down the hall to attend the annual SIMS Advisory Board meeting, and tomorrow and Saturday I’ll be attending the 7th annual Schoenberg Symposium on Manuscripts in the Digital Age. So this is an auspicious week for all things SIMS.

The topic of this talk is the Schoenberg Institute for Manuscript Studies and how it may be considered an object lesson for libraries interested in supporting digital scholarship. Penn Libraries has invested a lot in SIMS, and while much of SIMS will be very specific to Penn, I hope our basic practices might provide food for thought for other institutions interested in supporting research and scholarship in the library.

SIMS is a research institute embedded in the Kislak Center for Special Collections, Rare Books and Manuscripts in the University of Pennsylvania Libraries. It exists through the generosity and vision of Larry Schoenberg and his wife, Barbara Brizdle, who donated their manuscript collection (numbering about 300 objects) to Penn Libraries, with the agreement that the Libraries would set up an institute to push the boundaries of manuscript studies, including but not limited to digital scholarship. (Although my job focuses on the digital, indeed that term features in my official title, I also have responsibilities for our physical manuscript collections). Penn did this, and SIMS was launched on March 1, 2013. As a research institute we develop our own projects and push our own agenda, and although many of our projects are highly collaborative we do not “serve” scholars; we are scholars.

Guided by the vision of its founder, Lawrence J. Schoenberg, the mission of SIMS at Penn is to bring manuscript culture, modern technology and people together to bring access to and understanding of our intellectual heritage locally and around the world.
We advance the mission of SIMS by:

  • developing our own projects,
  • supporting the scholarly work of others both at Penn and elsewhere, and
  • collaborating with and contributing to other manuscript-related initiatives around the world.

SIMS has 13 staff members, but it is helpful to know that of this list only two are dedicated to SIMS work full-time (Lynn Ransom, Curator, SIMS Programs and Jeff Chiu, Programmer Analyst for the Schoenberg Database of Manuscripts). Everyone else on staff is either part time (the SIMS Graduate Fellows) or has responsibilities in other areas of the libraries, and beyond. Mitch Fraas, for example, is co-director of the Penn Digital Humanities Forum, a hub for digital humanities at Penn hosted through the School of Arts and Sciences.

Over the last couple of weeks, as I have been considering what I might say to you all this morning, I have also been spending a lot of time working on the Medieval Electronic Scholarly Alliance, a federation of digital medieval collections and projects that I co-direct with Tim Stinson, a professor of English at North Carolina State University. MESA is essentially a cross-search for many and varied digital collections, enabling one (for example) to search for a term – we have a fuzzy search that will include variant spellings in a search – and then one can facet the results by format (for example illustrations, or physical objects), discipline, or genre. One can also federate by “resource”, searching only those items that belong to particular collections.

Searching MESA for Jerusalem with fuzzy search enabled, limited to format of “Illustration”.
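To make the idea of "fuzzy search, then facet" concrete, here is a minimal sketch in Python. It is not how MESA is implemented: the records, field names, and similarity threshold are all invented for illustration, and the standard library's `difflib` stands in for whatever matching the real search index uses.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical records; MESA's real data and field names are not shown here.
RECORDS = [
    {"title": "Map of Jerusalem", "format": "Illustration"},
    {"title": "Pilgrimage to Iherusalem", "format": "Manuscript"},
    {"title": "View of Hierusalem", "format": "Illustration"},
    {"title": "Psalter leaf", "format": "Physical Object"},
]


def fuzzy_match(term, text, threshold=0.8):
    """Return True if any word in text is similar enough to term."""
    term = term.lower()
    return any(
        SequenceMatcher(None, term, word.lower()).ratio() >= threshold
        for word in text.split()
    )


def search(term, records, fmt=None):
    """Fuzzy-search titles, count hits per format, then optionally filter."""
    hits = [r for r in records if fuzzy_match(term, r["title"])]
    facets = Counter(r["format"] for r in hits)  # facet counts before filtering
    if fmt is not None:
        hits = [r for r in hits if r["format"] == fmt]
    return hits, facets


if __name__ == "__main__":
    hits, facets = search("Jerusalem", RECORDS, fmt="Illustration")
    print([r["title"] for r in hits])
    print(dict(facets))
```

The facet counts are computed before the format filter is applied, so a user can see how many hits each format would yield before narrowing the results.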

The work that I’ve been doing for MESA over the past two weeks involves taking data provided to us and converting it from whatever format we get, into the Collex RDF XML format required by MESA. In some cases, this is relatively easy. The Walters Art Museum, for example, through its Digital Walters site, provides high-resolution images of their digitized manuscripts using well-described and consistent naming conventions, and also provides TEI-XML manuscript descriptions that are equally consistent and incredibly robust. These files are all released under a Creative Commons Attribution-ShareAlike 3.0 Unported license, and they are easy to grab or point to once you know the organization of the site and the naming conventions.

Walters Art Museum manuscripts on The Digital Walters site.

Not all project data is so simple to access.

Although the data in the British Library Catalogue of Illuminated Manuscripts is open access (the metadata is under a Creative Commons license and the images are in the public domain), it is “black boxed” – trapped behind an interface. The only way to access the data is to use the search and browsing capabilities provided by the online catalog. To get the data for MESA, our contact at the BL sent me the Access database that acts as the backend for the website, and I was able to convert that to the formats I needed to be able to generate our RDF.

Images from Harley 603 from the British Library Catalogue of Illuminated Manuscripts.

So what does all this have to do with SIMS? Well, as I was doing this conversion work, I had a bit of an epiphany. I realized that pretty much everything we do at SIMS can be described in terms of data reuse.


And as I thought about how I might describe our various projects in terms of data reuse, I also realized that reuse of data is not new. In fact, it is ancient, and thinking in these terms puts SIMS at the tail end of a long and storied history of scholarship.


I’m not starting at the beginning, but I do want to give you a sense of what I mean when I say that data has been reused for the past couple thousand years (at least). One of my favorite early examples would have to be ancient Greek epics, such as the Iliad.

Iliad. Book 10. 421-434, 445-460, P. Mich. Inv. 6972, Special Collections Library (2nd c. BCE)

Here is a papyrus fragment, housed in the University of Michigan Libraries and dating from the second century BCE, containing lines from Book 10 of the Iliad. Thousands of similar fragments survive, containing variant lines from the poem.

Marciana Library 822, Venetus A, fol. 24r (10th c.)

And this is a page from the manuscript commonly known as Venetus A, Marciana Library 822, the earliest surviving complete copy of the Iliad, dating from the 10th century (a full 12 centuries younger than the papyrus fragment). In addition to the complete text, you can see that there are many different layers of glosses here: marginal, interlinear, intermarginal. These glosses contain variant readings of the textual lines, variants which are in many cases reflected in surviving fragments.

Penn Ms. Codex 1058, Glossed Psalter, fol. 12r (ca. 1100)

My next example is from a Glossed Psalter from our collection, Ms. Codex 1058, dating from around 1100. This manuscript is also glossed, but rather than variant readings, these glosses are comments from Church Fathers, pulled out of the context of sermons or letters or other texts, and placed in the margin as commentary on the psalm text.

Penn Ms. Codex 1640, Thomas of Ireland Manipulus Florum, fol. 114r

This example is a bit later, an early 14th century Manipulus Florum, Ms. Codex 1640. Like the glossed psalter, quotes from the church fathers and other philosophers are again pulled out of context, but in this case they are grouped together under a heading – in this example, the heading is “magister”, or teacher, and presumably the quotes following describe or define “magister” in ways that are particularly relevant to the needs of the author.

Penn LJS 267, De ludo scacchorum seu de moribus hominum et officiis nobilium … fol. 136v

Text is not the only type of data that can be reused, historically or now. We can also reuse material. Can you all see the sign of material reuse here? Check the top and bottom of the page. This is a palimpsest. What’s happened here is that a text was written on some parchment, and then someone decided that the text was no longer important. But parchment was expensive, so instead of throwing it away (or just putting it on a shelf and forgetting about it) the text was washed or scraped off the page, and new text was written over top. We can still see the remnants of the older text.

Penn LJS 395, Manuscript pastedowns from De proprietatibus rerum, back pastedown side 2

This is a page from LJS 395, a 13th century manuscript fragment that’s been repurposed to form part of the binding for a 16th century printed book. This is really typical reuse, and many fragments that survive do so because they were used in bindings.

How about this one?

Penn Ms. Codex 1056, Book of Hours Use of Rouen, ff. 24v-25r

This is a trick question. This is an opening from a 15th century book of hours from our collection, to compare with this.

Penn Ms. Coll 713, Breviary Collages, No. 1

This 17th century Breviary Collage was created by literally cutting apart a 15th century Flemish Breviary and pasting the scraps onto a square of cardboard. It is a bit horrifying, but it’s my favorite example of both reuse of material and, if not reuse of text, then reuse of illustration. Certainly the content is being reused as much as the material. Although I would never do this to a manuscript (and I hope none of you would do this either), I feel like I have a kindred spirit in the person who did this back in the 1800s, someone who saw this Breviary as a source of data to be repurposed to create something new.

I do this, only I do it with computers. Here is my collage.

Collation Visualization for LJS 266

Okay, it’s not a collage, it’s a visualization of the physical collation of Penn LJS 266 (La generacion de Adam) from the Schoenberg Collection of Manuscripts, just one created as part of our project to build a system for visualizing the physical aspects of books in ways that are particularly useful for manuscript scholars. Collation visualization creates a page for each quire, and a row on that page for each bifolium in the quire. On the left side of each row is a diagram of the quire, with the “active” bifolium highlighted. To the right of the diagram is an image of the bifolium laid out as it would be if you disbound the book, first with the “inside” of the bifolium facing up, then the “outside” (as though the bifolium is flipped over).

To generate a visualization in the current version of collation visualization, 0.1 (the source XSLT files for which are available via my account on GitHub), I need two things: manuscript images, and a collation formula (the collation formula describes the number of quires in a codex, how many folios in each quire, if any folios are missing, that kind of thing). To create this particular visualization, first I needed to get the images.
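The pairing of leaves that underlies each row of the visualization can be sketched in a few lines of Python. This is only an illustration, not the XSLT in Collation 0.1: it assumes regular quires with an even number of leaves and nothing missing or added, and the function names are made up for the example. In a regular quire of eight leaves, leaf 1 is conjoint with leaf 8, leaf 2 with leaf 7, and so on.

```python
def bifolia(leaves):
    """Pair the conjoint leaves of a regular quire, outermost bifolium first."""
    if leaves <= 0 or leaves % 2:
        raise ValueError("this sketch only handles quires with an even number of leaves")
    return [(i, leaves + 1 - i) for i in range(1, leaves // 2 + 1)]


def sides(first, last):
    """Return the pages seen on the inside and outside of one bifolium laid flat."""
    return {
        "inside": (f"{first}v", f"{last}r"),
        "outside": (f"{last}v", f"{first}r"),
    }


def collate(quire_sizes):
    """Number the folios continuously and list the bifolia of each quire."""
    quires, offset = [], 0
    for size in quire_sizes:
        quires.append([(offset + a, offset + b) for a, b in bifolia(size)])
        offset += size
    return quires


if __name__ == "__main__":
    # A hypothetical codex: one quire of 8 leaves followed by one quire of 6.
    for number, quire in enumerate(collate([8, 6]), start=1):
        for first, last in quire:
            print(number, first, last, sides(first, last))
```

Each tuple corresponds to one row of the visualization, and `sides` gives the two images to place next to the diagram. Handling missing, added, or replaced leaves would need a richer model than a list of quire sizes.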

LJS 266 in Penn in Hand

Our digitized manuscripts are all available through Penn in Hand, which is very handy for looking at manuscript images and reading descriptive information, but much like the British Library database we looked at earlier, it’s a black box.

Downloading an image file from Penn in Hand

It is possible to use “ctrl-click” to save images from the browser, but the file names aren’t accessible (my system reverts to “resolver.jpg” for all images saved from PiH, and it’s up to me to rename them appropriately).

Collation formula for LJS 266 in Penn in Hand (the third entry under Notes:)

The collation formula is in the description, and it’s easy enough for me to cut and paste that into the XSLT that forms the backbone of Collation 0.1.

It is actually possible to get XML from Penn in Hand by replacing “html” in the URL with “XML”.

XML in Penn in Hand

The resulting XML is messy, but reusable – a combination of Dublin Core, MARC XML, and other various non-standard tagsets.

Screenshot of OPenn (under construction)

Because we know how important it is to have clean, accessible data (indeed my own work and other SIMS projects depend on it), we have been working for the past year on OPenn, which will publish high-resolution digital images (including master TIFF files) and TEI-encoded manuscript descriptions (generated from the Penn in Hand XML) in a Digital Walters-style website, with Creative Commons licenses for the TEI and the images in the public domain. OPenn is still in development, but will be launched at the end of 2014.

Having consistent data for our manuscripts in OPenn will enable me to do with our data what I already did with the Digital Walters data: programmatically generate collation visualizations for every manuscript in our collection. Because the Digital Walters data was accessible in a way that made it easy for me to reuse it, and was described and named in such a way that it was easy to figure out what images match up with which folio number, I was able to generate collation visualizations for every manuscript represented in the Digital Walters that includes a collation formula, and I was able to do it in a single afternoon. The complete set of visualizations is available here.

Mock-up of collation form

Version 0.2 of Collation will be based on a form (this is the current mock-up of how the form will look). Instead of supplying a collation formula, one would essentially build the manuscript quire by quire, identifying missing, added, and replaced folios, and the output would be both a visualization and a formula.

Why do this? It is a new way of looking at manuscripts in a computer, completely different from the usual page-turning view, and one that focuses on the physicality of the book as opposed to its state as a text-bearing object. A new view will hopefully lead to new research questions, and new scholarship.

Moving on from Collation, the standard-bearing project for SIMS (and one that predates SIMS itself by many years) is the Schoenberg Database of Manuscripts (SDBM). This is a project that reuses data on a massive scale, and does it to great effect.

Entry #1 in A Catalogue of the Medieval Manuscripts in the University Library, Aberdeen, By M. R. James (1932)

This photo is the first entry in the catalog of manuscripts at the University of Aberdeen Library, written by M. R. James. This entry, and other entries from this catalog, and from many other library and sales catalogues, have been entered into the SDBM.

Entry from Schoenberg Database of Manuscripts (current version)

Here is that same entry in the current version of the catalog. However! This year Lynn Ransom received a major grant from the NEH to convert the database to new technologies, and I’d rather show you that version.

Entry in the Schoenberg Database of Manuscripts (new version)

So, here is that same entry again in the new version of the Schoenberg Database, which is currently under development. “What is the big deal?” I hear you ask. As well you may. Let me show you a different entry from that same catalogue.

Entry for a record with eight matching records

You can see in this example, on the “Manuscript” line: “This is 1 of 8 records referring to SDBM_MS_5688.” The SDBM is in effect a database of provenance – it records, not where manuscripts are now but where they have been noted over time, through appearances in sales and collections catalogues. This manuscript has eight records representing catalogs dated from 1829 to 1932. This enables us to trace the movement of the manuscript during the time represented in the database.

Eight records for a single manuscript from SDBM.
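The core idea, many catalogue records pointing at one manuscript, can be sketched as a simple grouping step. The Python below is an illustration only: the field names and source labels are hypothetical, not the SDBM schema, and the middle record is invented. Only the manuscript identifier and the 1829 and 1932 end dates come from the example above.

```python
from collections import defaultdict

# Hypothetical records; field names and sources are placeholders.
RECORDS = [
    {"manuscript": "SDBM_MS_5688", "source": "collection catalogue (placeholder)", "year": 1932},
    {"manuscript": "SDBM_MS_5688", "source": "sale catalogue (placeholder)", "year": 1829},
    {"manuscript": "SDBM_MS_5688", "source": "sale catalogue (invented)", "year": 1880},
    {"manuscript": "SDBM_MS_0001", "source": "sale catalogue (invented)", "year": 1901},
]


def provenance_chains(records):
    """Group records by manuscript and order each group chronologically."""
    chains = defaultdict(list)
    for record in records:
        chains[record["manuscript"]].append(record)
    for chain in chains.values():
        chain.sort(key=lambda record: record["year"])
    return dict(chains)


if __name__ == "__main__":
    for record in provenance_chains(RECORDS)["SDBM_MS_5688"]:
        print(record["year"], record["source"])
```

Reading a chain from top to bottom gives the sequence of sales and collections in which the manuscript was noted.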

Why create the Schoenberg Database? Although it was begun by Lawrence Schoenberg as a private database, which enabled him to track the price of manuscripts, we develop it now to support research around manuscript studies, and around trends in manuscript collecting. Study of private sales in particular could be useful in other areas of studies, such as economic history (since manuscripts are scarce, and expensive, and people will be more likely to purchase them and pay more money for them when they have money to spare).

A new project, one that we have been working on just this year, is Kalendarium. Instead of a database consisting of manuscript descriptions from catalogs, Kalendarium will be a database consisting of data from medieval calendars themselves.

Calendar from Ms. Codex 1056, Book of Hours Use of Rouen, ff. 1v-2r

This is a couple of pages of a calendar from Penn Ms. Codex 1056, a 15th century Book of Hours. Calendars, common in Books of Hours, Breviaries and Psalters, essentially list saints and other celebrations for specific days of the month. Importance may be indicated by color: as you can see here, some saints’ names are written in gold ink while most alternate between red and blue (red and blue being equally weighted, and gold used for more important celebrations).

A major expectation of Kalendarium is that the data will be generated through crowdsourcing: that is, we’ll build a system where librarians can come and input the data for a manuscript in their collection, or scholars and students can input data for a calendar they find online, or while they are looking at a manuscript in a library. The thing is, transcribing these saints’ names can be difficult, even for someone trained in medieval handwriting. So, instead of transcriptions, we’ll be enabling people to match saints’ names and celebrations to an existing list. And where do we get that list?

Ask and ye shall receive. In the 1890s, Hermann Grotefend published a book, Zeitrechnung des deutschen mittelalters und der neuzeit… (Hannover, Hahn, 1891-98), that included a list of saints, and the dates on which those saints are venerated. And it’s on HathiTrust, so it’s digitized, so we can use it!

Well, it’s in Portable Document Format, more commonly known as PDF. Like Penn in Hand and the British Library Catalog of Illuminated Manuscripts, PDF is another kind of black box. Although it’s fine for reading, it’s not good for reuse (there are ways to extract text from PDF, although it’s usually not very pretty). Luckily, we were able to find another digital version.


This one’s in HTML. Not ideal, not by a long shot, but at least HTML provides some structure, and there is structure internal to the lines (you can see pipes separating dates, for example). Doug Emery, Special Collections Digital Content Programmer and the SIMS staff member responsible for Kalendarium, has been working with a collaborator in Brussels to generate a usable list from this HTML that we can incorporate into Kalendarium as the basis for our identification list.
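To show why even a little internal structure helps, here is a minimal Python sketch of turning pipe-separated lines into a lookup by date. The line layout (a name followed by one or more `MM-DD` dates) is an assumption made for the example, not the actual layout of the HTML version of Grotefend, and the sample lines are illustrative rather than copied from it.

```python
from collections import defaultdict

# Assumed layout: name|MM-DD|MM-DD... (hypothetical, for illustration only).
SAMPLE = """\
Agnes|01-21
Benedictus|03-21|07-11
Agatha|02-05
"""


def parse_line(line):
    """Split one line into a saint's name and a list of dates."""
    name, *dates = [part.strip() for part in line.split("|")]
    return name, dates


def build_index(text):
    """Map each date to the names venerated on it."""
    by_date = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, dates = parse_line(line)
        for date in dates:
            by_date[date].append(name)
    return dict(by_date)


if __name__ == "__main__":
    index = build_index(SAMPLE)
    print(index.get("03-21", []))
```

An index like this is the kind of structure a drop-down list could draw on: given the day a calendar entry falls on, offer the names known for that date first.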

Kalendarium prototype site

We have a prototype site up; it’s not public, and for now it’s only accessible on campus. We’ve been experimenting, and you can see a handful of manuscripts listed here.

Kalendarium form

Similar to Collation 0.2, in Kalendarium you’re using the system to essentially build a version of your calendar. You can identify colors, and select saints from a drop-down list. Unfortunately we have already found that many saints that are showing up in our calendars aren’t in Grotefend, or they are celebrated on dates not included in Grotefend; but this is an opportunity for us to contribute to the list in a major way.

Why do this at all? Calendars are typically used to localize individual manuscripts – if we see that particular saints are included in a calendar, we can posit that the book containing that calendar was intended to be used in the areas where those saints were venerated. However, if we scale up, we’ll be able to see larger patterns: veneration of saints over time, saints being venerated on different days in different places, and we should be able to see new groupings of books as well.

Another set of projects SIMS is involved in, the Penn Parchment Project in 2013 and the Biology of the Book Project starting in 2014, involves testing the parchment in our manuscripts – literally reusing the manuscript, extracting data from the material itself. This involves taking small, non-destructive samples to gather cells from the surface of the parchment and testing them to see what type of animal the parchment is made from. Results are interesting; as part of the Penn Parchment Project, an individual who wishes to remain anonymous made expert identification of ten manuscripts from the Penn collection, and got only five of them correct. Clearly, parchment identification could benefit from a more scientific approach. More recently we have joined Biology of the Book, a far-reaching collaboration (including folks at University of York in the UK, Manchester University, The Folger Shakespeare Library, the Walters Art Museum, Library of Congress, University of Virginia, The Getty, and others) to begin the slow process of moving forward a much larger project with the aim to perform DNA analysis on larger numbers of manuscripts. Very little is actually known about the practices surrounding medieval parchment making, including the agricultural practices that supported the vast numbers of animals that were used to create the manuscripts that survive today (and, of course, all those that don’t survive). We think of parchment as an untapped biological archive, and a database containing millions of DNA samples would enable us to discover the number of animals used to build manuscripts, where those animals were bred (and how far they were imported and exported), what breeds were used – many questions that are simply impossible to answer now.

Mitch Fraas, Curator, Digital Research Services and Early Modern Manuscripts, creates maps and other visualizations relating to early books, and blogs about them. He’s used data from the Schoenberg Database of Manuscripts (which is available for download in comma separated format on the SDBM website, and is updated every Sunday) and data extracted from Franklin, the Penn Libraries’ catalogue, to generate some different visualizations, one of which is shown here: Charting Former Owners of Penn’s Codex Manuscripts.

Diagram: Charting Former Owners of Penn’s Codex Manuscripts (click for interactive version)

The yellow dots are owners, and the larger the dot, the more manuscripts the owner is connected to (Lawrence Schoenberg and Sotheby’s are quite large, as is Bernard M. Rosenthal, a bookseller in New York). Clicking an owner shows the number of manuscripts connected to that person or institution, and clicking a manuscript shows the number of owners connected to that manuscript. This visualization was developed using data from Franklin, and the blog post linked above provides details on how it was done.
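The counting behind "the larger the dot, the more manuscripts" can be sketched with nothing more than Python's standard library. This is not Mitch's actual workflow: the column names and rows below are invented, and the real Franklin and SDBM exports will look different.

```python
import csv
import io
from collections import Counter

# Invented sample data; the column names are assumptions for this sketch.
SAMPLE_CSV = """manuscript,owner
MS-001,Owner A
MS-001,Owner B
MS-002,Owner A
MS-002,Owner A
MS-003,Owner C
"""


def owner_counts(csv_text):
    """Count the distinct manuscripts linked to each owner."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # A set drops repeated manuscript/owner pairs so each link counts once.
    links = {(row["manuscript"], row["owner"]) for row in reader}
    return Counter(owner for _, owner in links)


if __name__ == "__main__":
    for owner, count in owner_counts(SAMPLE_CSV).most_common():
        print(owner, count)
```

The resulting counts could be fed to any charting tool as the dot sizes. Swapping the roles of the two columns gives the number of owners per manuscript.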

Mapping pre-1600 European manuscripts in the U.S. and Canada

Just this week, for the 7th Annual Lawrence J. Schoenberg Symposium on Manuscript Studies in the Digital Age, Mitch has created a new map, Mapping pre-1600 European manuscripts in the U.S. and Canada, using data from the Directory of Institutions in the United States and Canada with Pre-1600 Holdings. This map shows the location of all holdings included in the directory. Larger collections have larger dots on the map. Clicking a dot will give one more information about the owner and the collection, and there are options for showing current collections or former collections, or for showing only collections with codices (full books, as opposed to fragments or single sheets).

Ms. Roll 1066: Genealogical Chronicle of the Kings of England to Edward IV, circa 1461

We have almost reached the end, but I would like to finish by featuring the project of last year’s SIMS Graduate Fellow, the brand new Dr. Marie Turner. The project is still underway, and it is a great example of data reuse. Several years ago, Marie transcribed our Ms. Roll 1066, a 15th-century genealogical roll chronicling the Kings of England from Adam to Edward IV. Her transcription was combined with images of the roll and built into a website, shown in the screenshot here, with links between her transcription and areas on the page. But Marie’s vision is larger than this single roll. There are several other rolls of this type in existence, and her vision is to expand this single project, this silo, to not only incorporate other rolls, but to become a space for collaborative editing (transcription, description, translation, and linking) for the other rolls as well. We have successfully pulled the data from the existing site and converted it into XML, following the Text Encoding Initiative Guidelines, which we’ll use to generate the data we need to import into our new software system.
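To give a rough idea of what linking a transcription to areas of a page can look like in TEI, here is a minimal sketch in Python. It is not the project's actual conversion code. It assumes the general TEI pattern of `zone` elements inside a `facsimile`, with text elements pointing at them through `facs`. The `build_fragment` helper, the image file name, the coordinates, and the sample text are all hypothetical, and the fragment omits the `teiHeader` that a complete TEI document requires.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Serialize TEI elements without a prefix (as the default namespace).
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def build_fragment(image_url, regions):
    """Build a TEI fragment linking transcribed text to image regions.

    regions is a list of (zone_id, (ulx, uly, lrx, lry), transcription)
    tuples; this shape is a hypothetical one chosen for the sketch.
    """
    root = ET.Element(tei("TEI"))
    facsimile = ET.SubElement(root, tei("facsimile"))
    surface = ET.SubElement(facsimile, tei("surface"))
    ET.SubElement(surface, tei("graphic"), {"url": image_url})
    text = ET.SubElement(root, tei("text"))
    body = ET.SubElement(text, tei("body"))
    for zone_id, (ulx, uly, lrx, lry), transcription in regions:
        # The zone records a rectangular area on the image.
        ET.SubElement(surface, tei("zone"), {
            XML_ID: zone_id,
            "ulx": str(ulx), "uly": str(uly),
            "lrx": str(lrx), "lry": str(lry),
        })
        # The text block points back at its zone through the facs attribute.
        block = ET.SubElement(body, tei("ab"), {"facs": f"#{zone_id}"})
        block.text = transcription
    return root


if __name__ == "__main__":
    fragment = build_fragment(
        "roll-membrane-1.jpg",  # hypothetical file name
        [("zone1", (10, 20, 300, 80), "Adam")],  # hypothetical region and text
    )
    print(ET.tostring(fragment, encoding="unicode"))
```

Keeping the regions and the text in separate branches, joined only by identifiers, is what makes this kind of data easy to move from one system to another.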

The new Rolls Project will be built in DM, formerly Digital Mappaemundi, an established tool for annotating and linking images, which has been developed by Martin Foys, a medievalist, and which has recently been brought to SIMS for hosting and continued development.

A screenshot of La Chronique Anonyme Universelle, edited by Lisa Fagin Davis, published in DM

This screenshot illustrates how DM looks in terms of linking annotations to areas of an image, and you can also link areas of images together. Just last week we got a production version of DM set up on our servers at Penn, and next week we’ll be importing our data – the data we exported from the earlier edition of Ms. Roll 1066 project – into that production version. We’ll also be importing images of a half dozen other genealogical rolls. We are immensely excited to move the Rolls project to the next phase – and it was all made possible by


I’d like to close with just a few thoughts about WHAT SIMS IS – and whether or not we are an effective object lesson for libraries supporting digital scholarship is probably up for debate. We certainly do scholarship, effectively, within the context of the library, and we do it ourselves: We are scholars, not service providers. However, I think it’s important to note that our scholarship, our research, our tools and our projects are not ends unto themselves. They will all serve to support more work, to allow other scholars to ask new questions, and hopefully to help them answer those questions.
Since we are not service providers, faculty and graduate students aren’t our clients; they are our collaborators, our equals, our partners. We are in this together!
Finally, and I could have said more about this throughout my talk, we take pride in our data. We want data from all of our projects – all the data that we have reused and brought in from other places – to be consistent, with regard to formatting and documentation, accessible, in the technical sense of being easy to find, and reusable, with regard to both format (it is unlikely you will find PDFs as the sole source for any information on our site) and license. Likewise our code; we make use of GitHub (a site for publishing open source code) individually and through the Library’s account, and all our code is and will always be open source.

Thanks so much again, and I’m happy to take questions now.


Destroying Medieval Books – And Why That’s Useful

Really excellent post on the destruction and reuse of medieval books from Erik Kwakkel.


Old furniture, broken cups, worn-out shoes and stinky mattresses: we don’t think twice about throwing things out that we don’t need anymore. And books? Here things are a bit different. Apart from the fact that you may find it morally abject to throw out a book, that noble carrier of ideas, the object retains its economic value much longer than many other man-made things. Old and worn books will usually have a second – third, fourth or fifth – life in them, for example on the shelves of the secondhand bookstore. Indeed, old age may even increase their value dramatically, as visitors of book auctions will know.

The final curtain call of any book, including medieval ones, is when its content is no longer deemed correct, valid, or useful. Between the end of the Middle Ages and the nineteenth century thousands and thousands of medieval manuscripts were torn apart, ripped to pieces, boiled, burned, and stripped for parts. While these…



Volvelles: LJS 64, Illustrations to Peurbach, p. 4, Theorica motus orbis supremi super cetero mundi

Over the next several months, we’ll be creating Vines (short six-second videos) and animated gifs of all the moving volvelles in our copy of Illustrations to Georg von Peurbach’s Novae theoricae planetarum, LJS 64. This project has a few different aims. First, we’d like to show off one of the gems of our collection. This mid-16th century manuscript was created entirely by hand, to illustrate the theories of planetary motion described in Peurbach’s work. Volvelles are rotating diagrams that illustrate motion through the use of rotating circles. Although the volvelles in LJS 64 start out fairly simply (the volvelle shown in this post is a single piece of paper), as the book progresses they become more complex, and include layered circles, some of those layers having varied rotation points, and some with cut-outs that show the layers underneath. A facsimile of the manuscript is online at Penn in Hand, so you can page through and get a sense of what the volvelles look like – but those volvelles won’t move.

To get a sense of how the volvelles function, we’re creating two different virtual versions of each. One is an animated gif, created by layering and animating still images of the volvelle in Photoshop. The second is a short video, created using the Vine app, which shows a hand moving the pieces of the volvelle in real time. The more complex diagrams may require multiple Vines to show the complete movement. This leads us to the final aim of this project: to illustrate how different a fully virtual, contrived interaction with a physical object (an animated gif) is from a hands-on interaction with that same object. Although the animated gif and the video ostensibly show the same thing, they are substantially different. And although the video purports to show “here is how it looks in real life,” it still isn’t the same experience that you would have if you were sitting at the table moving the volvelle yourself.

Without further ado, here are our first virtual volvelles. This volvelle is captioned Theorica motus orbis supremi super cetero mundi (Theory/observations of the motion of the highest orb/body above the rest of the world.)

Animated gif, Theorica motus orbis supremi super cetero mundi, p. 4

Theorica motus orbis supremi super cetero mundi, p. 4



A Legacy Inscribed: the Lawrence J. Schoenberg Collection of Manuscripts

The exhibition A Legacy Inscribed: The Lawrence J. Schoenberg Collection of Manuscripts is now available online. The original exhibition was curated by Lynn Ransom and took place March 1 – August 16, 2013 in the Penn Library’s Goldstein Family Gallery, located in the Kislak Center for Special Collections, Rare Books and Manuscripts.

In 2011, University of Pennsylvania Board members Barbara Brizdle Schoenberg and Lawrence J. Schoenberg (C53, WG56) donated the Lawrence J. Schoenberg Collection of Manuscripts to the libraries. The Schoenberg collection brings together many of the great scientific and philosophical traditions of the ancient and medieval worlds. Documenting the extraordinary achievements of scholars, philosophers, and scientists in Europe, Africa and Asia, the collection illuminates the foundations of Penn’s academic traditions.

Each section of the exhibition – Arts and Sciences, Communication, Design, Education, Engineering, Law, the Medical Arts, and Social Policy and Practice – showcases texts, textbooks, documents, and letters that embody the history and mission of the schools that form the University. Often illustrated with complex diagrams and stunning imagery, the manuscripts bring to the present the intellectual legacy of the distant past.