DOI: 10.3390/publications11030035 ISSN: 2304-6775

Establishing Genealogies of Born Digital Content: The Suitability of Revision Identifier (RSID) Numbers in MS Word for Forensic Enquiry

Dirk H. R. Spennemann, Rudolf J. Spennemann
  • Computer Science Applications
  • Media Technology
  • Communication
  • Business and International Management
  • Library and Information Sciences

Born-digital content is rapidly becoming the norm for literary works, professional reports, academic journal articles, and formal corporate correspondence. From the perspective of digital forensics, there is a need to understand the origin of a document and its entire creation process, from outlining and drafting to editing the final version of the text. Revision save identifier (RSID) numbers embedded in MS Word documents have been used to examine the nature and extent of individual edits within a document. These RSIDs remain logged in the metadata even if the text with which they were associated has been removed. Because copies of such files retain the original’s RSIDs, this metadata can be used to determine the order in which documents were cloned from one another. As a proof of concept, this paper examined over 400 template files generated by a single publisher for manuscript submissions to its journals. The study shows that it is possible to establish genealogies, and thus relative chronologies, of born-digital content by first identifying those documents that share a document (root) RSID and then seriating those RSIDs that are shared between two or more documents.
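The two steps named in the abstract (grouping documents by a shared root RSID, then ordering them by the RSIDs they have in common) can be illustrated with a minimal Python sketch. It is not the authors' procedure. It assumes that each .docx file is a ZIP archive whose word/settings.xml part lists a w:rsidRoot element and a series of w:rsid elements, and it makes the simplifying assumption that a clone carries all of its parent's RSIDs, so that a parent's RSID set is a strict subset of its child's. The function names are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from collections import defaultdict

# WordprocessingML main namespace used in word/settings.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def parse_rsids(settings_xml):
    """Return (root_rsid, set_of_rsids) from the bytes of word/settings.xml."""
    settings = ET.fromstring(settings_xml)
    rsids_el = settings.find(f"{{{W_NS}}}rsids")
    if rsids_el is None:
        return None, set()
    val = f"{{{W_NS}}}val"
    root_el = rsids_el.find(f"{{{W_NS}}}rsidRoot")
    root = root_el.get(val) if root_el is not None else None
    rsids = {el.get(val) for el in rsids_el.findall(f"{{{W_NS}}}rsid")}
    return root, rsids


def read_docx_rsids(source):
    """Read RSIDs from a .docx given as a path or a binary file-like object."""
    with zipfile.ZipFile(source) as zf:
        return parse_rsids(zf.read("word/settings.xml"))


def group_by_root(docs):
    """Step 1: map each root RSID to the names of the documents that share it.

    `docs` maps a document name to its (root_rsid, set_of_rsids) pair.
    """
    groups = defaultdict(list)
    for name, (root, _) in docs.items():
        groups[root].append(name)
    return dict(groups)


def is_possible_ancestor(docs, older, newer):
    """True if every RSID of `older` also occurs in `newer`, which has more.

    Assumption (hypothetical simplification): clones keep all parent RSIDs.
    """
    return docs[older][1] < docs[newer][1]


def seriate(docs, names):
    """Step 2: order documents of one root group from fewest to most RSIDs."""
    return sorted(names, key=lambda n: (len(docs[n][1]), n))
```

Under these assumptions, documents within one root group are arranged from the smallest to the largest RSID set, and is_possible_ancestor flags pairs where a direct line of descent is plausible. Real genealogies branch, so two documents in the same group may share only part of their RSIDs; in that case neither is an ancestor of the other and the shared RSIDs point to a common predecessor.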
