The Schoenberg Institute for Manuscript Studies at Penn brings manuscript culture, modern technology and people together.


An Ideal Collation of LJS 101

By Jesse McDowell

The book I was assigned in Will Noel and Dot Porter’s Rare Book School course, “The Medieval Manuscript in the Twenty-First Century,” suited the course’s title well. I spent my time working with a Carolingian manuscript from 9th-century France, LJS 101. Like many medieval manuscripts, this one has been bound more than once, and so I used text-matching and open data to reconstruct the book’s original physical structure in digital form.

LJS 101 is a parchment manuscript bound in 10 quires containing Boethius’s Latin translation of Aristotle’s De interpretatione (On Interpretation). It originates from north-central France, most likely the abbey at Saint-Benoît-sur-Loire (also known as the Abbaye de Fleury).

LJS 101 1v-2r

However, the book contains more than a translation. Though LJS 101 originates in the 9th century, it contains replacement leaves added to the beginning and end in the 11th century (current fols. 1-4 and 45-64), with seemingly the same hand correcting the first scribe’s slips in the translation. Also in the 11th century, color was added to existing initials and to diagrams charting Aristotle’s formal connection between language and logic. Though my narrative does not perform anything close to a textual study, we’ll call these two scribes Scribe A (9th-century hand) and Scribe B (11th-century hand) in the same vein as textual critics.

Scribe A composed the body of the translation itself from folios 5-44. Scribe B’s contribution is sparser, his hand (as mentioned above) showing complete replacement leaves at folios 1-4 and 45-64, as well as added corrections and glossed material throughout the manuscript. Scribe B introduces new genres in folios 45-64 (as the catalogue describes):

“the Perihermeniae attributed to Apuleius, a poem by Decimus Magnus Ausonius on the seven days of Creation, a sample letter of a monk to an abbot with [more] interlinear and marginal glosses, and other miscellaneous verses, definitions, and excerpts.”

What is most striking about Scribe B’s contribution to this manuscript is its variety. We know that Carolingian minuscule became a widely used script for composing codices for professional and educational purposes from the 9th to the 13th century. Scribes continued to use this script widely into the 12th century for a variety of reasons, just as books were being produced by a number of different workers in the secular world. The script itself became codified and sharpened as a professional way of composing. If we can date the burgeoning use of Carolingian minuscule to c. 800, then we can hypothesize that Scribe A was quite new to this script. Indeed, the spaces between words are much clearer in Scribe B’s contributions in fols. 1-4 and 45-64, and clear word separation is one chief characteristic of a well-practiced hand.

When I attempted to establish the structure of the book, I found a discrepancy between my collation formula and the one determined by the cataloguer. The manuscript contains a 19th-century foliation and has prickings throughout, though these proved fruitless when trying to establish just how many gatherings the bound object contained. Initially the collation differences presented a problem in establishing structure. In short, the catalogued information accounted for 9 quires to the 10 I came up with over and over again in my own count.

We were both accounting for the same number of leaves in the manuscript. We also both noticed a discrepancy in the current 2nd and 3rd quires. Herein was the problem: with some help from Will Noel, I discovered that the current quires 2-3 were misplaced. The first problem was that Gregory’s Rule – hair side of parchment always faces hair side – was broken, at 8v-9r. Under my collation, there was no way to tell where the ‘missing’ leaf was.
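The Gregory’s Rule check described above lends itself to a small sketch. The Python below is only an illustration, not part of the collation tool discussed in this post: the hair/flesh values are invented for the example (they are not a record of LJS 101), and the function name is hypothetical. It walks through consecutive leaves and flags any opening where the verso of one leaf and the recto of the next show different sides of the parchment.

```python
from typing import List, Tuple

# Each leaf is (folio label, side shown on recto, side shown on verso),
# where "H" = hair side and "F" = flesh side.
Leaf = Tuple[str, str, str]


def gregory_violations(leaves: List[Leaf]) -> List[str]:
    """Return the openings (e.g. '8v-9r') where facing pages do not match."""
    violations = []
    for (fol_a, _, verso), (fol_b, recto, _) in zip(leaves, leaves[1:]):
        # In a regular gathering, facing pages are hair/hair or flesh/flesh.
        if verso != recto:
            violations.append(f"{fol_a}v-{fol_b}r")
    return violations


# Invented sample data, chosen so that the break falls at 8v-9r.
sample_leaves = [
    ("7", "F", "H"),
    ("8", "H", "F"),
    ("9", "H", "F"),
    ("10", "F", "H"),
]

print(gregory_violations(sample_leaves))
```

A flagged opening only says that something is irregular at that point. As the rest of this post shows, it takes the text itself to work out which leaf actually belongs there.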

Codicologists meet problems with incorrect binding all the time, and while this binding and foliation didn’t account for the discrepancy in the leaves, re-examination of the book eventually did. The manuscript’s current binding, in English diced Russia leather, was made for Sir Thomas Phillipps in the 19th century. The watermark on the pastedown reads ‘J. Whatman 1832.’ This could very well be a manifestation of the use of old books by antiquarians in the 19th century who, not well understanding the nature of medieval codices, rebound and relabeled them into new codices for personal keepsakes and exhibition. Elizabeth Kolbert writes about this reality of antiquarians and aristocrats housing artifacts and even fossils in the 19th century as collectors, rather than researchers. Alas, this seems to epitomize the life of most old books in the hands of owners who do not expose them to research. In the same vein, we should not keep books from being digitized. According to Tim Stinson’s research, less than 2% of the entirety of medieval manuscripts in the world have been digitized. Though this statistic is now a few years old, we might be able to look to the future of accessible manuscripts with a sense of positivity, as recently the Vatican Library put over 4,000 of their manuscripts online for free. But what does this mean for the researcher? Certainly the step now is to not just digitize quickly, but release the manuscripts as Open Data (not with restrictive licenses), as Penn has done with their collections in OPenn.

In the case of LJS 101, I couldn’t examine this central problem of a missing folio without digital images and access to 19th-century editions. I had to find where the text matched up after 8v, for it was not the current 9r. It was time to search for a relatively modern edition of the Latin text itself to reconcile the discrepancy. I found an 1877 edition in HathiTrust (ed. Karl Meiser).

Text-Searching an 1877 Edition

The end of the current 8v contained text that correlated with that of the current 12r.

Bottom 4 lines 8v

So, the text at the bottom of 8v reads (from above):

quod significat subiectum est quocirca unū[m] quoque  

whereas the text at the top of 9r doesn’t match what follows in the 1877 edition:

reterea non quod nos intelligum eequum 

The text itself showed us where Gregory’s Rule had been broken, where leaf 8r in the currently bound manuscript actually coincides with 12v. What essentially occurred was a mis-binding and foliation done in the 19th century, where the currently bound 12v should ideally be 7v. The text at the bottom of 12v currently reads

apud Scythas amara nec acida, sed apud ipsos quoque

and the following text on 13r doesn’t correlate:

p[er] mixtio ista significat; Quod si unum significant to [to]ta p[er] mixtio pars inde separate nihil extra designat;

From the 1877 Latin edition the correct text after ipsos quoque is sunt dulcia et apud omnes, thus

                        [last line 12v]: apud Scythas amara nec acida, sed apud ipsos quoque [begin 8r]: sunt dulcia et apud omnes gentes eodem modo: ita quoque omnia nomina si naturaliter essent, isdem omnes homines uterentur. 

[Images: 12v-8r and 8r]

This is but one example of how the text within the manuscript didn’t match up correctly. Instead of continuing to chart the discrepancies, I’ll explain how I rendered a collation based on these findings. The text was the governing factor in matching up folios in the right order, and at this more minute level we can see how it makes sense to re-puzzle a book whose folios are out of order. On a more general level, all that happened here was that quires 2 and 3 had been separated where they should have been bound together. If rebound, the second and third quires should simply be ‘quire 2.’ Based on the 19th-century foliation, the current folio 5 should ‘ideally’ be 1, 9 should be 2, 10 should be 3, and so on (for a total, of course, of 8 folios in the quire).
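The text-matching step can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure I actually followed (I searched the edition by hand in HathiTrust). The function names are hypothetical, the edition excerpt is just the passage quoted above, and the opening of 13r is given with its abbreviation expanded. The sketch finds the last words of a page in the edition, reads off the words that should follow, and checks which candidate page begins with them.

```python
import re
import unicodedata
from typing import Dict, Optional


def normalize(text: str) -> str:
    """Lowercase, drop diacritics and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    letters_only = re.sub(r"[^a-z\s]", " ", no_marks.lower())
    return " ".join(letters_only.split())


def find_continuation(edition: str, page_end: str,
                      candidates: Dict[str, str],
                      window: int = 3) -> Optional[str]:
    """Return the label of the candidate page that continues page_end."""
    ed, end = normalize(edition), normalize(page_end)
    pos = ed.find(end)
    if pos == -1:
        return None  # the page ending was not found in the edition
    following = ed[pos + len(end):].split()[:window]
    if not following:
        return None  # the edition excerpt stops where the page stops
    expected = " ".join(following)
    for label, opening in candidates.items():
        if normalize(opening).startswith(expected):
            return label
    return None


EDITION = ("apud Scythas amara nec acida, sed apud ipsos quoque sunt dulcia "
           "et apud omnes gentes eodem modo: ita quoque omnia nomina si "
           "naturaliter essent, isdem omnes homines uterentur.")
PAGE_END = "apud Scythas amara nec acida, sed apud ipsos quoque"
CANDIDATES = {
    "13r": "permixtio ista significat",
    "8r": "sunt dulcia et apud omnes",
}

print(find_continuation(EDITION, PAGE_END, CANDIDATES))
```

Exact matching like this assumes that the manuscript and the edition agree in spelling. With real transcriptions, abbreviations and variant spellings would probably call for fuzzier matching.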

I presume Penn isn’t in the business of physically rebinding a 9th-century book, especially when the current binding is in great condition. If we wanted to see this book as it was originally bound, or at least how it was bound before 1832, how could we reposition the folios against the foliation in the upper right-hand corner? We could certainly sit down with a pen and notebook and draw up a new collation, but what if we want to read the text from leaf to leaf as if it were in the correct order?

The interface I used to visualize such a structure was the collation visualization system, initially developed by a collaborative team led by Dot Porter, which generates collation diagrams from a model rather than by counting and charting by hand. We were told during our course discussions on collation that this system had been created for visualizing collation models, but we soon learned it can do much more. At its core, this program invigorates collation methods for medievalists, and it can also be repurposed for tasks beyond the one it was designed for.

The repurposing I refer to came about when I was able to recreate a binding that reflected the original quire structure. Instead of laying out the entire structure online, I used the program to visually capture what couldn’t be imagined without both the digital images and the visualization program. For instance, since the program automatically begins at “Quire 1” with every collation formula, the screenshots provided render “Quire 1” where I am actually visualizing quire 2, a quire 2 that currently doesn’t exist in the book itself.

First we will see quires 1-3 as the book is currently bound. Scribe B added the leaves making up fols. 1-4 for an introduction and the decoration of a beautiful initial, and what follows is what you would see if you walked into the Kislak Center and opened up this book upon request.

Currently bound quires 1-3, Q1:

[Screenshot: currently bound quire 1]


Currently bound Q2:

[Screenshot: currently bound quire 2]


And currently bound Q3:

[Screenshot: currently bound quire 3]


These are screenshots of the visualized quires; they are live online here.

Now onto the digital reconstruction. If you were to walk into the Kislak Center and gaze upon this book, you couldn’t read the book straight through with matching Latin unless you were accompanied by this corrected version:

Here is the live online visualized quire for the reconstruction, and below is the screenshot:


[Screenshots: reconstructed quire 2, in two parts]


What this small foray reveals is the fundamental role that digital scholarship, and digitization itself, can play in medieval studies. What’s more, this demonstration shows but one aspect of what the digital world can offer. In transcription practice, data-mining, and textual editing, tools like T-PEN and of course TEI texts have seriously revitalized what we can see when we evaluate texts and ask the same fundamental questions in order to conduct research. Their value lies not so much in making life easier, though surely they do, as in creating a distinctly different paradigm one can adopt when doing scholarship of any kind with any old book.


Collation Modeling and Visualization: Video Tutorials

Over the past year or so, a group of us at SIMS and elsewhere have been developing a system for visualizing the physical collation of medieval manuscripts. At the moment, this consists of two things:

  1. Figures that illustrate the make-up of quires: number of leaves, whether leaves are missing or added, etc.
  2. Using digital images of manuscript pages to give an idea of how a quire would look, were it disbound: showing how folios that are disjunct in a bound manuscript relate to one another when the manuscript is unbound.

Here is a screenshot of what this looks like:

BL Cotton Claudius b iv, aka the Old English Illustrated Hexateuch. Showing Quire 3 (4, +2).


You can create these yourself, for the manuscripts you are working with! You don’t even need a collation formula. You do need to be able to express the collation, or at least have an idea of which folios go in which quire. One of the nice things about this system, even in the current beta form, is that it can enable you to compare different collations for the same item. It could help you figure it out!

Instructions for building collation models and visualizing them are on Github. You won’t need to download any code, although the code is there if you are interested or curious. If you want the bifolia layout view, you will need to be able to provide an Excel spreadsheet associating folio or page numbers with image files.
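To give a rough idea of what the bifolia layout view works out, here is a minimal Python sketch. It is not the code in the Github repository, and it does not use that system’s data format: the function names are hypothetical, and it only handles a regular quire with no missing or added leaves. It pairs each leaf with its conjugate and lists which pages appear on each side of the unfolded sheet.

```python
from typing import Dict, List, Tuple


def conjugate_pairs(folios: List[int]) -> List[Tuple[int, int]]:
    """Pair the leaves of a regular quire, outermost bifolium first."""
    if len(folios) % 2 != 0:
        raise ValueError("a regular quire has an even number of leaves")
    n = len(folios)
    return [(folios[i], folios[n - 1 - i]) for i in range(n // 2)]


def bifolium_sides(front: int, back: int) -> Dict[str, Tuple[str, str]]:
    """Pages seen (left, right) on each side of an unfolded bifolium."""
    return {
        # Outside of the sheet: the back leaf's verso beside the front leaf's recto.
        "outside": (f"{back}v", f"{front}r"),
        # Inside of the sheet: the front leaf's verso beside the back leaf's recto.
        "inside": (f"{front}v", f"{back}r"),
    }


# A hypothetical regular quire of eight leaves, foliated 1-8.
quire = list(range(1, 9))
for front, back in conjugate_pairs(quire):
    print(front, back, bifolium_sides(front, back))
```

With a table that associates each page label with an image file, as in the spreadsheet mentioned above, these page pairs are all that is needed to lay the images out side by side.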

Does that still sound like a lot of work? Never fear! I’ve made a set of video tutorials to walk you through the entire process. I hope these are helpful. And if you are still unsure about doing this yourself even after the videos, be aware that I’ll be leading a workshop at the International Congress on Medieval Studies in Kalamazoo, MI, next May. Maybe I’ll see you there! The videos are embedded below. Be sure to click on the “HD” button at the bottom of each video, or else the videos are very blurry.


Digital Manuscripts as Critical Edition

The following post is the written version of a presentation that Christoph Flüeler, Director of e-codices and Professor at the University of Fribourg, gave at the 50th International Congress on Medieval Studies in Kalamazoo, MI, May 2015. It has been very lightly edited by Dot Porter. Prof. Flüeler has long been a leader in digital manuscript studies, and in his talk he proposed an exciting vision of digital manuscripts as critical edition. With Prof. Flüeler’s permission, we are very pleased to share his talk here on the SIMS blog. He will soon develop these thoughts into a longer article, which will be published in a more formal venue.

The point of departure for my contribution is as follows: in coming years an enormous number of manuscripts, tens of thousands of them from thousands of manuscript collections throughout the world, will be digitized and made available on the Internet. A few years from now perhaps a majority of all manuscripts of great cultural, artistic, and scientific value will be accessible online. As this happens, quality requirements regarding image quality, metadata, and user interfaces will markedly increase, and standards will be established, so that all over the world metadata and images can be processed and annotated via comprehensive and specialized manuscript portals and interoperable image viewing platforms. This presumption is based on careful observation of developments during the past ten years and of the large number of projects currently planned or in progress. Everyone who attended yesterday’s session entitled “All Medieval Manuscripts Online: Strategic Plans in Europe” with presentations by the British Library, the Bibliothèque nationale de France, the Bayerische Staatsbibliothek München, and e-codices knows that I refer here only to concretely planned projects.

If digital manuscripts become ever more important for scholarly research in future, the following question arises: what is the “scholarly research value” of digital manuscripts?

Discussion of the matter has thus far been conducted in an undifferentiated manner by persons interested in defending the exclusive status of the originals and who in some cases go so far as to question whether digital reproductions have any scholarly research value at all. This point of view strikes me as rather unconstructive, because it simply dismisses as unreliable these resources on which most researchers already rely, and on which they will in future base their work to an ever greater degree.

My perspective is a bit different. What we need to do is to ask the following question: what preconditions must be met in order for a digital manuscript to be understood as a reliable resource for scholarly research, such that a scholarly researcher can, without any great misgivings or doubts, utilize the digital object as the basis for serious research and make use of it to the fullest possible extent?

Central for my reflections is the different status of the physical manuscript and the digital manuscript. The fact that the relationship between physical manuscript and digital manuscript has barely been examined up to this point is rather astonishing. It is probably because a serious theoretical consideration of the immediate precursors of the digital manuscript was never undertaken; I speak here of print facsimiles and microfilms. Facsimile editions are hugely popular with collectors. The production of facsimiles is normally understood as a work of fine craftsmanship. While scholarly researchers are employed in their production, they contribute only the accompanying commentary. Theory has obviously been considered out of place when it comes to the production of facsimiles.

Microfilms, on the other hand, have always been seen as not particularly attractive research aids, are often incomplete, often contain errors, and are as a rule only black-and-white. They are still maintained as archival copies, but for scholarly researchers their usefulness as reproductions has (for the most part) been superseded.

In this context I will not raise the matter of the qualities that distinguish a digital manuscript from a facsimile edition or a microfilm. The advantages of the digital manuscript are too obvious to require enumeration here.

I would like to ask, instead, how a digital manuscript stands in relation to a critical edition of a text. Can the publication of a digital manuscript on the internet be understood as an edition? Further: could such an edition even be regarded as a critical edition?

I would like to consider again the statement I made earlier, in which I asserted that for scholarly research purposes a digital manuscript must be understood as a reliable resource to the extent that medievalists from various disciplines (for ex. History, Art History, History of Law, History of Philosophy, Classical Philology, etc.) can utilize the digital object as the basis for serious research and make use of it to the fullest possible extent.

This echoes the proper purpose of a critical text edition. A critical text edition does exactly this, and the science of creating editions has since the 19th century developed methods for achieving this goal. A critical text edition aims to create an authoritative and easily accessible text. Its usefulness is, however, often far greater: a critical text edition can, for example, highlight the historical dimensions of the transmission of a text and use a critical apparatus to tease out intertextual aspects of the text in ways that far exceed simple transcription. In addition, a critical text edition can drill down to a more original text, identify errors in transmission, and provide a text so convincing in its authenticity that it comes to be accepted in the scholarly research community as an authoritative version of the text.

If we do not insist that the definition of edition can only be applied to a traditional text edition, we can in point of fact understand the publication of a digital manuscript on the Internet as a scholarly edition.

In the meantime there are already thousands of texts which have received their first publication as digital manuscripts. This is also true of hundreds of texts found on e-codices. It is important not to underestimate the usefulness to scholarly research of this additional method of editing, especially for texts that have never been edited previously and that would perhaps otherwise never have been critically edited.

What scholars need are good, scientific editions. This is true for both text editions and editions of digital manuscripts. We can only regard as serious critical editions those that follow established scientific criteria, developed with a firm grounding in the concept that the publication can substitute for the original as a resource for research, up to a certain point and for specific purposes, and that it offers some type of added value beyond that of the original. A digital manuscript, like a traditional critical edition, is not merely a cheap copy, but ideally can show aspects of the primary resource, i.e. the original manuscript, that were not visible in such a way when viewing the original.

It is obviously important to note that the critical edition of digital manuscripts is a different task from the critical edition of texts transmitted in manuscripts. It is, however, not any less exacting.

No edition theory has yet been written concerning digital manuscripts. I can only briefly enumerate some relevant themes and desiderata.

A digital manuscript edition should, like a critical text edition, follow documented scholarly research criteria and not produce a plain, unexamined reproduction of the material object—in this case a physical manuscript, but should—as I already emphasized—create some added value and bring out new aspects of the manuscript that have not previously been observed or recognized; and a digital manuscript should obviously provide a reliable foundation for current research of the original manuscript.

The most authentic possible scientific reproduction is the first step. Completeness, high image quality, and true color must be provided. Measurability and verifiability are fundamental to access for all purposes of scholarly research. Digital manuscripts consist of digital reproductions. It is therefore essential to provide not only metadata about the manuscript, but also metadata about the digital object. The colors must be measurable, not only by using a simple color sample strip, but by employing a complex Color Management System. This is actually already standard these days, but as soon as the files are uploaded to the Internet, all the care that goes into this is often ignored. Image metadata, such as IPTC metadata, should be available together with the digital image, and should be linked closely enough that when images are transferred (for example, into another image viewing platform) the image metadata are automatically attached. Dimensions should be measurable in every part of a manuscript. Simply including a ruler in an image is here, as in other cases, not sufficient; a digital measuring tool with flexible usability would always be preferable. In practice, we are for the most part still a long way from such precise, reliable and measurable digital images at this point; however, they are fundamental for serious scientific work. How is one to conduct serious research with images, if the images on the screen are often slightly distorted, the colors are not accurate, and no reliable measuring tool is available? Not to mention poor resolutions of less than 300 dpi! Products like this are simply a waste of money.
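The arithmetic behind a digital measuring tool is simple enough to sketch. The Python below is only an illustration under stated assumptions: it supposes that the capture resolution recorded in the image metadata is accurate and uniform across the image, and that the image is free of distortion. The function names are hypothetical and do not belong to any existing viewer.

```python
import math
from typing import Tuple

MM_PER_INCH = 25.4


def pixels_to_mm(pixels: float, dpi: float) -> float:
    """Convert a length in pixels to millimetres at a given resolution."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return pixels / dpi * MM_PER_INCH


def distance_mm(p1: Tuple[float, float], p2: Tuple[float, float],
                dpi: float) -> float:
    """Distance in millimetres between two pixel coordinates on an image."""
    return pixels_to_mm(math.dist(p1, p2), dpi)


# At 300 dpi, 300 pixels correspond to one inch (25.4 mm).
print(pixels_to_mm(300, 300))
print(round(distance_mm((0, 0), (300, 400), 300), 1))
```

The same calculation shows why the resolution metadata must travel with the image: without a trustworthy dpi value, the pixel distance cannot be turned into a physical measurement.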

Digital manuscripts do not consist merely of digital reproductions though. A digital manuscript is a virtual product that reproduces a tangible object in its entirety. This includes the proper sequencing of images. A data model must ensure that the image sequence remains intact when displayed in other viewing platforms. The same is obviously true for metadata regarding the physical manuscript and the digital manuscript, which aid in understanding the manuscript as manuscript, but also as digital object. I am referring to metadata in the broad sense. This includes: basic metadata, structural metadata, scholarly descriptions, image descriptions, metadata regarding codicology, digital object metadata, reports about additional restoration, and ideally even the full range of existing research literature. Finally, this includes—and very importantly—the transcriptions and editions of the text contained in the manuscript. If a critical edition of a digital manuscript is to comprehend the physical manuscript in its entirety, then text editions form part of it. In the future, text editions should not be understood as separate from digital objects, but as integral parts of them. I regard these integral parts not as competing or the edition as an absolute condition, but rather that these are complementary pieces of the ideal whole. Metadata can be added as desired—the richer the data included, the greater the usefulness and the scholarly research value.

A digital manuscript can and should be used to show more than is visible or explicitly contained in the original. Illustrations can be enlarged. Structural elements of the codex and the text can be accentuated. Individual illustrations or parts of the text can be annotated, and transcriptions and editions can be set next to the page images. Codicological features such as quires, watermarks, and color analysis can not only be provided, but can even be analyzed and interpreted within a digital manuscript. The research area of Image and Text Recognition is hard at work on tools to recognize and analyze layout, script types, scribal practices and eventually even texts.

It is important to emphasize that a digital manuscript should display a manuscript in its entirety. But we should even go a step further. A critical edition of a digital manuscript should not treat only a single manuscript, but should include as much data as possible about other related manuscripts and sources, in order to promote viewing the special qualities and features of the particular manuscript in a broader context. In this area as well the established methods of scientific editing aid me in developing criteria that can be applied to digital manuscript editions.

One fundamental task when editing critical editions of medieval texts transmitted in manuscripts is to collate individual transcriptions of texts and thereby obtain new information. The critical apparatus presents variations of the text as transmitted by the manuscripts used for the edition. This critical apparatus delivers indications of explicit and implicit references to other works as well and unfolds the intertextuality of the text. This means that a critical text edition goes beyond transmission of the text found in a single manuscript.

A critical digital edition of a manuscript can for example expand quire analyses, descriptions of illustrations, script analysis, structural analysis, watermark analysis of a single manuscript via metadata for another digital object, or other objects can be incorporated for the purpose of gaining new information. Let me offer just one example: quire composition and layout analysis can be performed across manuscripts from the same scriptorium or other scriptoria in order to recognize features peculiar to a particular manuscript, a scriptorium, or an entire epoch. A digital manuscript is thus more than just a digital version produced from a single physical object. It effectively has the potential to incorporate the entirety of manuscript transmission contained in all medieval manuscripts.

The publication of medieval manuscripts on the Internet has made amazing progress during the past ten years. Digital manuscript libraries have transcended the status of pilot projects. Digital manuscript libraries have become more professional and have by now become an essential part of the research infrastructure. This is surely due to the fact that not just a few individual manuscripts, but over 15,000 medieval manuscripts have been presented online up until now.

However, the success and importance of digital manuscript libraries depend not so much on the number of digitized manuscripts as on the scientific quality of those digital manuscripts, which can only achieve fundamental change in the area of manuscript research through a critical theory of the digital manuscript.

Thank you for your kind attention.

Kalamazoo, May 15, 2015

Christoph Flüeler


Reblog: “Tashrih al-badan” (Anatomy of the body, 14th Century)

Reblogging a post about LJS 49 from facsilium: ancient manuscripts and rare books:

The venous system, with figure drawn frontally and the internal organs indicated.

Mansur ibn Muhammad ibn Ahmad ibn Yusuf ibn Ilyas, “Mansur ibn Ilyas,” descended from a Shiraz family of scholars and physicians. His illustrated treatise, “Anatomy of the human body,” often called “Mansur’s Anatomy,” consists of an introduction followed by 5 chapters on the 5 main systems of the body: bones, nerves, muscles, veins and arteries, each illustrated with a full-page diagram. The manuscript was totally new to me, as I always thought that the Qur’an has severe restrictions regarding human representations. Indeed, it has, especially in Sunni Islam (representation of all living beings).

Read the rest here