Brief Technical Documentation
This document has been updated for the upcoming publication of the eighth module:
Play / Comédie (December 2021). The document in its first state
(published on 24/06/2011) can be found here. The document in its second state
(published in September 2013) can be found here.
The electronic edition is encoded in XML (eXtensible Markup Language).
The encoding design started from the P5 version of the Guidelines of the TEI (Text Encoding
Initiative) and expands the DTD (Document Type Definition) with project-specific tags and
attributes where needed.[1] The L'Innommable/The Unnamable module and all modules
published since are based on version 2.3.0 of TEI P5[2], while the first module
published in 2011 is based on version 1.0.0.[3] The tagset of the first
module also incorporated some tags from a working document of the TEI SIG "Manuscripts".[4] A number of these
proposed tags were introduced into TEI P5 version 2.0.0[5], some in modified form.
Because of these modifications and with the following modules in mind, we decided to comply
with the P5 version that was current at the time the XML transcriptions of the
L'Innommable/The Unnamable module were finished (June 2013).[6]
The encoding is based on the definitions of crucial notions such as
'document', 'text', 'version', and 'work' by Peter Shillingsburg in his book Scholarly Editing in the Computer Age, notably the chapter entitled 'Ontology'.
In 2021 the technical architecture behind the edition was migrated from its previous publication framework[7] into eXist-db, a NoSQL document database and application platform.
Metadata
The header contains metadata such as a title statement and
publication statement, mentioning the coordinates of the Centre for Manuscript Genetics
(University of Antwerp); a brief source description; and a profile description with
information on the languages and the handwritings in the document.
Structural tags
<text>: This tag is used to indicate
'the actual order of words and punctuation as contained in any one physical form' (Shillingsburg 1996: 46). The
physical form is paper and ink. As a physical vessel the document contains only one
text, but it may contain more than one version of more than one work - a 'work' being
'the message or experience implied by the authoritative versions of a literary writing'
(Shillingsburg 1996: 176).
The archive catalogue number serves as unique id.
<div type="notebook">: As a first child the <body> tag will have a <div> element that declares the type of document that is being transcribed. Values are "notebook", "typescript" and "looseleaves". In the case of typescripts and loose leaves, documents can sometimes have writing on verso pages. If a transcription contains material on verso pages, this top-level <div> tag will have an attribute 'subtype="withversos"'.
<div type="page" rend="recto">: Since BDMP module 4 we have encoded all documents in a series of <div type="page"> elements.
<div type="paralipomena">: Apart from
versions, a document may also contain fragments of text (jottings, notes, reflections,
try-out sentences, and so forth) which strictly speaking do not belong to a version of a
work. These paralipomena are indicated by means of the tag <div
type="paralipomena">.
<div>: Used without attributes, this
tag indicates a version, i.e. 'one specific form of the work - the one the author
intended at some particular moment in time' (Shillingsburg 1996: 44). The writing layers are
indicated by means of <del> (deletion) and <add> (addition) tags.
<p>: Versions or paralipomena may
consist of several paragraphs.
<seg>: Each paragraph usually consists
of several sentences. When Beckett did not work with full sentences (e.g. in Not I / Pas moi), the segment consists of a few lines of text, i.e.
a unit of text that can easily be compared to other versions.
Global attributes
xml:id
The xml:id of the <text> tag is the document's archive number.
According to Peter Shillingsburg's definition, the variant forms of a work usually have
the same name, but in some cases 'there will be disagreement over whether a variant form
is in fact a variant version or a separate work' (176).
xml:lang
This attribute indicates the language in which the version is
written.
n
The catalogue number is followed by the number of the sentence in the
base text (see chapter "base
texts"):
<seg n="MS-UoR-2934,[0127]">
In the case of a sentence that eventually did not make it into the
base text, the number of the preceding sentence that did make it
into the base text is followed by | and an extra number:
<seg n="MS-UoR-2934,[0127|001]">
The first number always consists of 4 digits: 0001 and so on; the
second number, after the |, always consists of 3 digits. In the visualization, this
extra sentence (or phrase) appears in bold, because it constitutes a deviation from the
base text.
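The numbering scheme above can be checked mechanically. The following is a minimal sketch, not part of the edition's own tooling: the names `parse_n` and `SentenceRef` are hypothetical, and the pattern assumes that the catalogue number is everything before the final `,[`.

```python
import re
from typing import NamedTuple, Optional

# Catalogue number, then a 4-digit base-text sentence number in brackets,
# optionally followed by "|" and a 3-digit number for an extra sentence.
N_PATTERN = re.compile(
    r"^(?P<catalogue>.+),\[(?P<base>\d{4})(?:\|(?P<extra>\d{3}))?\]$"
)


class SentenceRef(NamedTuple):
    catalogue: str        # e.g. "MS-UoR-2934"
    base: int             # sentence number in the base text
    extra: Optional[int]  # None when the sentence is part of the base text


def parse_n(value: str) -> SentenceRef:
    """Split an n attribute value into its components."""
    match = N_PATTERN.match(value)
    if match is None:
        raise ValueError(f"not a valid n value: {value!r}")
    extra = match.group("extra")
    return SentenceRef(
        catalogue=match.group("catalogue"),
        base=int(match.group("base")),
        extra=int(extra) if extra is not None else None,
    )
```

A value with a non-empty `extra` field corresponds to a sentence that deviates from the base text, i.e. one that the visualization shows in bold.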
version
This attribute indicates the chronological order of the versions of a
textual unit (section, paragraph, sentence).
In L'Innommable/The Unnamable, the chronology of versions largely corresponds to the chronology of the documents.[8] A version attribute has been added to the <seg> tag only where one document contains more than one version of the same sentence and the order of writing does not correspond to the documentary order, so that the correct chronology is encoded.
In Stirrings Still/Soubresauts, the chronology is a lot more complex and version attributes have been added to all sections, paragraphs and sentences.
In the case of partial versions the version number is followed by a
letter (e.g. typescript version 12 of Stirrings Still/Soubresauts contains a
redraft of its last paragraph; this redrafted paragraph is indicated by the number 12a).
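To illustrate how such version values could be put in chronological order, here is a small sketch. It rests on assumptions that the text above does not state explicitly: that a version value is a plain number optionally followed by one lowercase letter, and that a partial version such as 12a sorts directly after version 12. The name `version_key` is hypothetical.

```python
import re

# A whole number, optionally followed by a single lowercase letter ("12", "12a").
VERSION_PATTERN = re.compile(r"^(\d+)([a-z]?)$")


def version_key(version: str) -> tuple:
    """Return a sort key: numeric part first, then the optional letter."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        raise ValueError(f"unexpected version value: {version!r}")
    # The empty string sorts before any letter, so "12" precedes "12a".
    return (int(match.group(1)), match.group(2))


ordered = sorted(["12a", "2", "12", "3"], key=version_key)
```

Sorting on the tuple rather than on the raw string avoids the usual pitfall of "12" sorting before "2".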
zone
<seg> tags have a zone attribute which
holds the name(s) of the zone(s) on a page in the image / text feature that the sentence
is a part of.
Textual Alterations
The most frequently occurring tags in the XML transcriptions are
deletions and additions:
<del>: For each cancelled phrase the
type of cancellation, the author of the cancellation, and the writing tools are
indicated, as well as the person responsible for the transcription (the editor):
<del type="crossOut" hand="#SB" rend="black ink" resp="#DVH">...</del>
In the case of instant alterations (currente
calamo) the type attribute value is 'instantcorrection'. Instant corrections
are only marked if there is no doubt that the cancellation cannot have been introduced
at a later stage: for instance in the sentence 'perhaps not again never
to be heard again', 'not again' is
marked as being followed by an instant correction; 'to be heard' is not, because the
cancellation may have been introduced at a later stage.
<delSpan spanTo="#anchor"/>: For
passages cancelled by Beckett or 'marked as used', three types can be distinguished:
heavy crossing-out, a diagonal line, or a St. Andrew's cross.
<add>: For additions the place of the
addition is also indicated:
<add place="marginleft" hand="#SB" rend="black ink" resp="#DVH">...</add>
The place indications used in the present edition are:
'marginleft', 'marginright', 'margintop', 'marginbottom',
'facingleaf', 'inline', 'supralinear', 'infralinear', 'overwritten'.
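A closed list of values like this lends itself to a simple consistency check. The sketch below is an illustration only, not the project's validation procedure: `unexpected_places` is a hypothetical helper, and it assumes un-namespaced element names as in the examples in this document (real TEI files normally declare the TEI namespace).

```python
import xml.etree.ElementTree as ET

# The place values listed in this documentation.
ALLOWED_PLACES = {
    "marginleft", "marginright", "margintop", "marginbottom",
    "facingleaf", "inline", "supralinear", "infralinear", "overwritten",
}


def unexpected_places(fragment: str) -> list:
    """Return the place values of <add> elements that are not in the list."""
    root = ET.fromstring(fragment)
    return [
        add.get("place", "")  # a missing attribute is reported as ""
        for add in root.iter("add")
        if add.get("place", "") not in ALLOWED_PLACES
    ]


# "elsewhere" is an invented value used only to trigger the check.
sample = (
    '<p><add place="marginleft" hand="#SB">one</add> '
    '<add place="elsewhere" hand="#SB">two</add></p>'
)
problems = unexpected_places(sample)
```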
Open variants: alternative readings
Open variants have been marked up in this way:
when at last out again <seg type="alternative" xml:id="alt1">he knew not</seg>
<add place="above" type="alternative" xml:id="alt2">no knowing</add>
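As an illustration of how this markup might be read back, the following sketch collects the competing readings of an open variant. It assumes that the fragment is wrapped in a single parent element and uses no namespace. `alternative_readings` is a hypothetical name, not a function of the edition.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def alternative_readings(fragment: str) -> list:
    """List (xml:id, tag, text) for every element typed as an alternative."""
    root = ET.fromstring(fragment)
    return [
        (el.get(XML_ID), el.tag, "".join(el.itertext()))
        for el in root.iter()
        if el.get("type") == "alternative"
    ]


# The example from this section, wrapped in a <p> for parsing.
sample = (
    '<p>when at last out again '
    '<seg type="alternative" xml:id="alt1">he knew not</seg>'
    '<add place="above" type="alternative" xml:id="alt2">no knowing</add></p>'
)
readings = alternative_readings(sample)
```

The element name in each tuple shows which reading is the original inline text (`seg`) and which was added later (`add`).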
Transpositions
Transpositions (when the author moves blocks of text to a different
position, using arrows, asterisks, numbers or lines) have been marked up in this
way:
where he sits <seg type="transposition" xml:id="trans1">at his table</seg> <seg type="transposition" xml:id="trans2">head on hands</seg>.
All transpositions are declared in the header:
<listTranspose>
<transpose>
<ptr target="#trans2"/>
<ptr target="#trans1"/>
</transpose>
</listTranspose>
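The relation between the inline segments and the header declaration can be sketched as follows. This is a simplified illustration under stated assumptions: the `<ptr>` elements are taken to list the segments in their intended final order, the transposed segments are direct children of the paragraph, and no namespace is used. `apply_transpositions` is a hypothetical helper, not the edition's rendering code.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def apply_transpositions(paragraph, list_transpose) -> str:
    """Return the text of `paragraph` with transposed segments reordered."""
    # Transposition segments by id, in document order.
    segs = {
        seg.get(XML_ID): seg
        for seg in paragraph.iter("seg")
        if seg.get("type") == "transposition"
    }
    replacement = {}
    for transpose in list_transpose.iter("transpose"):
        targets = [ptr.get("target").lstrip("#") for ptr in transpose.findall("ptr")]
        # Positions on the page that take part in this transposition.
        slots = [seg_id for seg_id in segs if seg_id in targets]
        # The i-th slot on the page receives the text of the i-th pointer.
        for slot, target in zip(slots, targets):
            replacement[slot] = "".join(segs[target].itertext())
    parts = [paragraph.text or ""]
    for child in paragraph:
        own_text = "".join(child.itertext())
        parts.append(replacement.get(child.get(XML_ID), own_text))
        parts.append(child.tail or "")
    return "".join(parts)


paragraph = ET.fromstring(
    '<p>where he sits <seg type="transposition" xml:id="trans1">at his table</seg> '
    '<seg type="transposition" xml:id="trans2">head on hands</seg>.</p>'
)
header = ET.fromstring(
    '<listTranspose><transpose><ptr target="#trans2"/>'
    '<ptr target="#trans1"/></transpose></listTranspose>'
)
reading = apply_transpositions(paragraph, header)
```

With the example from this section, the two segments swap places while the surrounding text stays where it is.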
Metamarks
Some passages or signs do not, strictly speaking, belong to the
version: paralipomena, dates and place names, numberings, stamps and 'metamarks',
defined by the TEI as 'any kind of graphic or written signal within a document the
function of which is to determine how it should be read rather than forming part of the
actual content of the document'. In the BDMP these features are encoded as follows:
<metamark>: indicates metamarks, such as 'Stet' as a way of undoing a cancellation; or, for instance, two corresponding instances of the letter 'A' indicating where an addition is to be inserted.
<stamp>: indicates a stamp of the holding library.
<num>: indicates the page number as it is presented on
the page. A 'type' attribute specifies whether these numbers were prenumbered in the
notebook, written by Beckett, or added by an archivist.
<floatingText>: An archive number that was written on a
document by the archivist is treated as a floatingText, as defined by the TEI.
<date>: to encode dates.
Notes:
[1] "pb", "stemma", "time",
"section", "trans", "orig", "textn", "over" and "chrono" have been added to the global
attributes. The attributes "version" and "zone" have been added to the tags <text>, <div>, <p>, <sp>, <l>, <lg>, <stage> and <seg>. A <sub> tag has been added.
[2] Published on 17/1/2013
(http://www.tei-c.org/Vault/P5/2.3.0/doc/tei-p5-doc/en/html/).
[3] Published on 2/11/2007
(http://www.tei-c.org/Vault/P5/1.0.0/doc/tei-p5-doc/en/html/).
[4] TEI SIG website: http://www.tei-c.org/Activities/SIG/Manuscript/.
[5] Published on 16/12/2011
(http://www.tei-c.org/Vault/P5/2.0.0/doc/tei-p5-doc/en/html/).
[6] The differences between
the two versions of TEI P5 come down to these differences between L'Innommable /
The Unnamable and Stirrings Still / Soubresauts: <metamark> vs. <ge:metamark>, <listTranspose> vs. <ge:transposeGrp>, <transpose> vs. <ge:transpose>, <handNotes> and <handNote>
vs. <handList> and <hand>.
[7] From 2011 to 2021 the edition was published as a Cocoon web application inside the Apache Tomcat servlet container (http://tomcat.apache.org/). The search engine made use of Elasticsearch (https://www.elastic.co/).
[8] The chronology of the
sentence versions relating to the first 24 pages of the first English typescript of
The Unnamable differs from the chronology of the rest of the text. A more
detailed analysis is made under "Chronology" in the L'Innommable / The Unnamable
module.
© 2025 Samuel Beckett Digital Manuscript Project
Directors: Dirk Van Hulle and Mark Nixon | Technical realisation: Vincent
Neyt