What is TheSu XML?

TheSu XML (Thesis-Support XML) is a stand-off XML annotation schema for modelling ideas and their discursive contexts in textual sources. Through TheSu XML, declarative statements (theses) can be formally and thematically characterised and linked to their surrounding discourse contexts (supports), such as argumentative justifications, explanatory reformulations, and framing introductions. The schema also supports comparative analysis: multiple statements expressing essentially the same core idea can be linked under abstract propositions. This makes it possible to track how ideas vary in their details, presentation, and use across different sources or contexts. It is tailored for research in the history of ideas, philosophy, science, and technology.

Why Does It Matter?

Historical texts contain complex webs of ideas, arguments, and rhetorical structures. Traditional methods struggle to track these relationships across multiple sources. TheSu XML provides a systematic, computational approach to mapping discourse structures, allowing both detailed analysis and broader pattern discovery.

Who Is It For?

TheSu XML is tailored for researchers in:

  • History of ideas
  • History of philosophy
  • History of science and technology
  • Philology and textual analysis
  • Digital humanities

Whether analysing a single philosophical dialogue or comparing ideas across centuries, TheSu XML provides the structure needed for systematic, reproducible research.

Current Development Context

TheSu XML is currently being developed and refined as part of an FWO-funded research project at KU Leuven (2024-2027), which uses lead and lead white in Greco-Roman sources as a case study.

Note: Most examples throughout this website are drawn from ancient Greco-Roman sources on lead and lead white, reflecting the current research focus. This includes examples from Plutarch, Plato, Dioscorides, Theophrastus, and other ancient authors.

Research Possibilities

History of Ideas

Track Ideas Across Sources

How does an idea appear, develop, and vary across different sources? A statement like "the cosmos is rationally arranged" might be presented as a central philosophical claim in one context but as a passing observation in another, supported by analogical reasoning in one source or logical demonstration elsewhere. TheSu XML supports comparison within a single source, across works by the same author, or between related and unrelated authors, tracking how ideas are presented within their discursive contexts and how intellectual traditions develop.

History of Science

Compare Procedural Knowledge

Which aspects of technical procedures remain stable across centuries, and which transform as knowledge is transmitted? Comparing how processes like distillation or iron tempering are described across different sources can answer such questions. With TheSu XML, step-by-step comparison of procedures becomes possible, identifying patterns of continuity and change in how practical knowledge was preserved, modified, and adapted.

Key Features

Theses

Model Declarative Statements

A thesis is a declarative statement conveyed explicitly or implicitly by a source. With TheSu XML, such statements can be annotated with details including speaker attribution, thematic classification, formal structure (such as embedded analogies or causal explanations), and links to abstract propositions for cross-source comparison.

Each thesis can be marked as explicit (directly stated in the text) or implicit (reconstructed from context), and may also carry keywords. Theses can represent factual claims, philosophical statements, procedural descriptions, or causal explanations. They form the foundation of discourse analysis, enabling systematic cataloguing and indexing of ideas across sources.

Examples
Chemical/Medical Example:
"Lead white is the most cooling of deadly drugs" is an explicit thesis from Plutarch, annotated with macro-theme "physical", micro-themes "chemistry" and "medicine", speaker attribution, and linked to its source text span.
Philosophical Example:
"If someone dyes anything a certain colour, the colour of the thing that has been dyed is not the same as that of what is present to it" is an implicit thesis from Plato's Lysis, attributed to Socrates, with keywords for "colour" and "smear", and annotation of its embedded etiology (cause-and-effect relationship).
Implicit Thesis:
"There are times in which, when bad is present, what is neither bad nor good is not yet bad, and others in which this has already become bad" is an implicit thesis from Plato's Lysis (217e), inferred from Socrates's questions and examples about how the presence of bad affects what is neither good nor bad (e.g., "If something is present to something, will that which possesses it be such as what is present? Or only if it is present in a certain way?").
Procedural Thesis:
Theophrastus's recipe for producing lead white in On Stones is annotated as a single thesis because each step depends on its place within the overall procedure: the steps cannot be meaningfully isolated. The thesis is annotated to include a sequence with seven phases: (1) placing lead over vinegar in closed jars, (2) waiting for the lead to acquire thickness (up to 10 days), (3) opening the jars, (4) scraping away the mould-like substance from the lead (repeating until the lead is consumed), (5) grinding what has been scraped, (6) filtering off what is being ground, and (7) identifying the final precipitate as lead white. Each phase is annotated with objects (lead, vinegar, jars), temporal details (duration, repetition conditions), and transformations (how materials change between phases).
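
The thesis metadata described in this section can be pictured as a small record per thesis. The Python sketch below is only an illustration of that idea: the class name, field names, and identifiers (Thesis, thesis_id, index_by_micro_theme, "t1", "t2") are assumptions made for this example and are not TheSu XML element or attribute names. The sample values come from the Plutarch and Lysis examples above; fields the examples do not specify are left empty.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Thesis:
    """Hypothetical in-memory record of one annotated thesis (not the schema itself)."""
    thesis_id: str
    paraphrase: str
    source: str
    explicit: bool                      # True: directly stated; False: reconstructed
    speaker: Optional[str] = None
    macro_themes: list = field(default_factory=list)
    micro_themes: list = field(default_factory=list)
    keywords: list = field(default_factory=list)


def index_by_micro_theme(theses):
    """Map each micro-theme to the ids of the theses classified under it."""
    index = defaultdict(list)
    for thesis in theses:
        for theme in thesis.micro_themes:
            index[theme].append(thesis.thesis_id)
    return dict(index)


theses = [
    Thesis(
        thesis_id="t1",
        paraphrase="Lead white is the most cooling of deadly drugs",
        source="Plutarch",
        explicit=True,
        macro_themes=["physical"],
        micro_themes=["chemistry", "medicine"],
    ),
    Thesis(
        thesis_id="t2",
        paraphrase="The colour of a dyed thing is not the same as that of what is present to it",
        source="Plato, Lysis",
        explicit=False,
        speaker="Socrates",
        keywords=["colour", "smear"],
    ),
]

theme_index = index_by_micro_theme(theses)
implicit_ids = [t.thesis_id for t in theses if not t.explicit]
print(theme_index)
print(implicit_ids)
```

An index of this kind is one simple way to picture the "systematic cataloguing and indexing of ideas" mentioned above; a real workflow would read these values from the annotation files rather than hard-code them.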

Supports

Map Discursive Contexts

A support is a discourse component that serves another part of the discourse, such as a thesis or another support. Each support can have one or more functions: argumentative (justifications for acceptance or rejection), expository (explanations or clarifications), expansive (elaborations or excursuses), or contextualising (framing or background information). When multiple functions apply, they are ranked to indicate the primary function.

Supports are also classified by their form, such as particular illustrations, affirmations, questions, analogies, or definitions. They can include discourse markers (like "because", "for example", "namely") that signal their function, and are attributed to speakers just like theses. Supports can be explicit (directly stated) or implicit (reconstructed from context), and can themselves employ other theses as premises, enabling reconstruction of complex argumentative structures.
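
As a rough illustration of how ranked functions and nested supports fit together, the Python sketch below models a support as a record pointing at the component it serves. All names here (Support, target_id, chain, the ids "t_cold", "s1", "s2") are assumptions for this example, not TheSu XML vocabulary. The first support reuses the argumentative example from this section; the second is an invented placeholder, included only to show a support serving another support with two ranked functions.

```python
from dataclasses import dataclass, field
from typing import Optional

# The four functions named in the text above.
FUNCTIONS = {"argumentative", "expository", "expansive", "contextualising"}


@dataclass
class Support:
    """Hypothetical record of one support (illustrative field names)."""
    support_id: str
    text: str
    target_id: str                      # id of the thesis or support it serves
    functions: list                     # ranked, primary function first
    form: Optional[str] = None
    markers: list = field(default_factory=list)
    explicit: bool = True

    def __post_init__(self):
        unknown = set(self.functions) - FUNCTIONS
        if not self.functions or unknown:
            raise ValueError(f"invalid functions: {self.functions}")

    @property
    def primary_function(self):
        return self.functions[0]


def chain(target_id, supports, depth=0):
    """Return (depth, support_id) pairs for everything serving target_id, recursively."""
    rows = []
    for support in supports:
        if support.target_id == target_id:
            rows.append((depth, support.support_id))
            rows.extend(chain(support.support_id, supports, depth + 1))
    return rows


supports = [
    # Serves the thesis "Lead is among the naturally cold substances" (id assumed).
    Support("s1", "Because lead can produce the most cooling of deadly drugs",
            target_id="t_cold", functions=["argumentative"], markers=["because"]),
    # Invented placeholder, not taken from any source.
    Support("s2", "<placeholder clarification of s1>",
            target_id="s1", functions=["expository", "contextualising"]),
]

tree = chain("t_cold", supports)
for depth, support_id in tree:
    print("  " * depth + support_id)
```

Storing the functions as an ordered list is one simple way to express the ranking; the first entry is read as the primary function.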

Examples
Expository Support:
"For example, when lead white is applied as a hair dye": this support clarifies the abstract claim "If someone dyes anything a certain colour, the colour of the thing that has been dyed is not the same as that of what is present to it" by providing a concrete illustration, helping readers understand the main point through a specific example.
Argumentative Support:
"Because lead can produce the most cooling of deadly drugs": this support provides a reason that justifies the claim "Lead is among the naturally cold substances", functioning as a premise that supports this conclusion about lead's properties.
Contextualising Support:
"What do you mean?" is Menexenus's question in Plato's dialogue. It frames the discourse, helping readers understand the context and purpose of the lead white examples that follow in the conversation.

Propositions

Enable Cross-Source Comparison

Propositions are abstract ideas that multiple theses represent. Unlike theses, which are linked to specific text spans in sources, propositions exist independently as abstract concepts. They enable comparative analysis by linking multiple statements that express essentially the same core idea, even when those statements appear in different sources, contexts, or time periods.

Each proposition includes a paraphrase describing the abstract idea, and can be classified thematically with macro-themes and micro-themes, just like theses. Propositions can also contain sequences for procedural ideas (such as abstract recipe models). Linking theses to propositions makes it possible to track how ideas vary in their details, presentation, use, or support throughout a text or corpus, showing patterns of intellectual transmission and transformation.

Examples
Chemical/Medical Example:
The proposition "Lead white is a cooling substance" links multiple theses from different sources: Plutarch's statement that "Lead white is the most cooling of deadly drugs" and Dioscorides's descriptions of lead white's cooling properties. This enables comparison of how the same core idea appears and is supported across different texts.
Property Example:
The proposition "Lead white is white in colour" links theses from Plato's discussion of lead white's whiteness and Dioscorides's remark that properly produced lead white "comes about white and effective", showing how the same property claim appears in different contexts: philosophical discourse versus technical description.
Procedural Proposition:
The proposition "To produce lead white: place lead over vinegar, wait for thickness, scrape off the white substance" serves as an abstract recipe model. Multiple recipe theses from Theophrastus, Dioscorides, and other sources can be linked to this proposition, making it possible to compare step by step how the same process is described differently across sources.
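
The linking of theses to propositions can be sketched as a simple many-to-one relation. The Python example below is a minimal illustration under stated assumptions: the class names, function names, and ids (Proposition, ThesisLink, "p_cooling", "p_white", and the thesis ids) are hypothetical, and only the two non-procedural examples above are encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    """Hypothetical record of an abstract proposition."""
    prop_id: str
    paraphrase: str


@dataclass(frozen=True)
class ThesisLink:
    """Hypothetical link from one source-bound thesis to a proposition."""
    thesis_id: str
    author: str
    prop_id: str


propositions = [
    Proposition("p_cooling", "Lead white is a cooling substance"),
    Proposition("p_white", "Lead white is white in colour"),
]

links = [
    ThesisLink("t_plutarch_cooling", "Plutarch", "p_cooling"),
    ThesisLink("t_dioscorides_cooling", "Dioscorides", "p_cooling"),
    ThesisLink("t_plato_white", "Plato", "p_white"),
    ThesisLink("t_dioscorides_white", "Dioscorides", "p_white"),
]


def authors_by_proposition(links):
    """Group the authors whose theses are linked to each proposition."""
    grouped = {}
    for link in links:
        grouped.setdefault(link.prop_id, []).append(link.author)
    return grouped


def propositions_of(author, links):
    """List the propositions that a given author's theses are linked to."""
    return sorted({link.prop_id for link in links if link.author == author})


grouped = authors_by_proposition(links)
dioscorides = propositions_of("Dioscorides", links)
print(grouped)
print(dioscorides)
```

Grouping in both directions (by proposition and by author) mirrors the two comparative questions raised above: which sources express a given idea, and which ideas a given source shares with others.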

Sequences

Analyse Procedural Knowledge

Sequences are detailed breakdowns of processes nested within theses: the step-by-step instructions or chronological events that a thesis describes. When a thesis contains a recipe, procedure, or chain of events, sequences make it possible to break the process into individual phases, each annotated with specific details such as ingredients, objects, duration, repetitions, and conditions.

Each phase can be annotated with the objects or ingredients involved, how long it takes, whether steps need to be repeated, and what conditions must be met. Variant recipes or procedures can be mapped to their corresponding primary processes, supporting detailed comparison of how different sources describe the same process and identifying which steps are shared, which are altered, and which are extended or omitted.

Examples
Recipe Sequence:
Theophrastus's recipe for producing lead white (described in his work On Stones) is annotated as a single thesis because the individual steps cannot be meaningfully separated from the overall procedure. The thesis includes a sequence broken down into seven phases: (1) placing lead over vinegar in closed jars, (2) waiting for the lead to acquire thickness (up to 10 days), (3) opening the jars, (4) scraping away the mould-like substance from the lead (repeating until the lead is consumed), (5) grinding what has been scraped, (6) filtering off what is being ground, and (7) identifying the final precipitate as lead white. Each phase is annotated with objects (lead, vinegar, jars, the substance formed), temporal details (duration, repetition conditions), and how materials transform between phases.
Variant Recipe Mapping:
Different sources describe lead white production with variations: Theophrastus's recipe uses closed jars and scraping, while Dioscorides's recipe uses a lattice of reeds and includes additional steps like decanting and sun-drying. Sequences allow these variant recipes to be mapped to a primary process, showing which phases are shared (placing lead with vinegar, waiting for transformation), which are altered (jar method vs. lattice method), and which are extended (Dioscorides adds decanting and multiple sifting phases).
Phase Details:
Within a sequence, each phase can include detailed annotations: objects are marked as ingredients or instruments (vinegar as ingredient, jars as instruments), transformations are tracked (lead becomes thickened substance), and temporal information is recorded (waiting phases, repetition conditions like "until consumed completely"). This granular annotation enables precise comparison of how different sources describe the same procedural steps.
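
The mapping of a variant recipe onto a primary process can be sketched as follows. This Python example is an illustration only: the names (Phase, maps_to, altered, classify, PRIMARY) are assumptions, not TheSu XML vocabulary. The primary process is taken from the abstract recipe model quoted under Propositions, the seven phases are those listed for Theophrastus above, and the choice of which phases map to which primary step is a reading made for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Abstract recipe model, as worded in the procedural proposition above.
PRIMARY = [
    "place lead over vinegar",
    "wait for thickness",
    "scrape off the white substance",
]


@dataclass
class Phase:
    """Hypothetical record of one phase in a sequence."""
    order: int
    action: str
    maps_to: Optional[str] = None       # matching primary step, if any
    altered: bool = False               # True if the step is carried out differently
    duration: Optional[str] = None
    repeat_until: Optional[str] = None


theophrastus = [
    Phase(1, "place lead over vinegar in closed jars", maps_to=PRIMARY[0]),
    Phase(2, "wait for the lead to acquire thickness", maps_to=PRIMARY[1],
          duration="up to 10 days"),
    Phase(3, "open the jars"),
    Phase(4, "scrape the mould-like substance from the lead", maps_to=PRIMARY[2],
          repeat_until="the lead is consumed"),
    Phase(5, "grind what has been scraped"),
    Phase(6, "filter off what is being ground"),
    Phase(7, "identify the final precipitate as lead white"),
]


def classify(variant, primary):
    """Sort variant phases into shared, altered, and extended; list omitted primary steps."""
    result = {"shared": [], "altered": [], "extended": [], "omitted": []}
    covered = set()
    for phase in sorted(variant, key=lambda p: p.order):
        if phase.maps_to is None:
            result["extended"].append(phase.order)
        else:
            covered.add(phase.maps_to)
            key = "altered" if phase.altered else "shared"
            result[key].append(phase.order)
    result["omitted"] = [step for step in primary if step not in covered]
    return result


report = classify(theophrastus, PRIMARY)
timed = [p.order for p in theophrastus if p.duration or p.repeat_until]
print(report)
print(timed)
```

A second variant, such as the lattice method attributed to Dioscorides, would be encoded as another list of phases and classified against the same primary steps, with the altered flag set where a step is performed differently.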

Get Started

Ready to explore TheSu XML? Learn more about how it works, access the schema documentation, or view visualisations.