Logos could help verify and retrieve document images from the medieval period

A research article from 2019 shows that there are different ways to discover logos and seals in historical document images.

The logo in the bottom right-hand corner comes from a historical document from the medieval period. Photo illustration: Diplomatarium database at the Swedish National Archives.

Logos and seals serve to authenticate a document and identify its source. This practice was also prevalent in the medieval period.

Different algorithms exist for detecting logos and seals in document images. A close look at the current state-of-the-art methods reveals that they focus on detecting logos and seals in contemporary document images.

However, such methods are likely to underperform on historical documents, which pose additional challenges such as extra noise, bleed-through, blurred foreground elements and low contrast.

The proposed method frames logo and seal detection as an object detection problem. Using a deep-learning technique, it counters the problems mentioned earlier and avoids the need for any pre-processing stage, such as layout analysis and/or binarization, in the system pipeline.
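
To illustrate what an object detection framing implies: a detector outputs bounding boxes, and a predicted box is commonly compared with an annotated one by their intersection over union (IoU). The sketch below is not taken from the study; the `Box` type and the `iou` function are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def iou(a: Box, b: Box) -> float:
    """Return the intersection over union of two boxes, in [0, 1]."""
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    inter_h = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    intersection = inter_w * inter_h

    area_a = (a.x_max - a.x_min) * (a.y_max - a.y_min)
    area_b = (b.x_max - b.x_min) * (b.y_max - b.y_min)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0
```

A predicted logo box is typically counted as correct when its IoU with an annotated box exceeds a chosen threshold; the threshold used in the study is not stated here.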

The experiments were conducted on historical images from the 12th to the 16th century, and the results were very encouraging for detecting logos in historical document images.

To the best of our knowledge, this is the first attempt at logo detection in historical document images using an object-detection-based approach.

Swedish charters, or charters concerning Sweden, can be found in the Diplomatarium database (SDHK). The SDHK dataset for the present investigation consists of the digitized medieval charters in the card index of Diplomatarium Suecanum, comprising approximately 11,000 charters in total. A substantial number of charters have not yet been digitized.

Charters are legal documents recording various types of transactions, such as purchases, exchanges of property and the like. The charter images in this corpus date from the 12th century to the beginning of the 16th century, with only a few from the beginning of this period and a considerable increase towards the end. The languages represented are mainly Latin and Old Swedish, but some charters in German are also present in the corpus. The oldest charters are in Latin, and the oldest charter in Swedish dates from 1330. Later, especially in the 15th century, there are more charters in Swedish.

The state-of-the-art object detection algorithm YOLOv3 was customized and deployed to detect logos in historical document images.

From a huge corpus, this algorithm could retrieve only those documents that contain logos.
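
A minimal sketch of such a retrieval step, assuming a detector has already produced a list of (label, confidence) detections for each document image. The function name, the "logo" label and the 0.5 confidence threshold are illustrative assumptions, not details from the study.

```python
from typing import Dict, List, Tuple

# One detection: (class label, confidence score between 0 and 1).
Detection = Tuple[str, float]


def documents_with_logos(
    detections_by_doc: Dict[str, List[Detection]],
    min_confidence: float = 0.5,  # assumed threshold, not from the study
) -> List[str]:
    """Return the IDs of documents with at least one confident logo detection."""
    return [
        doc_id
        for doc_id, detections in detections_by_doc.items()
        if any(
            label == "logo" and confidence >= min_confidence
            for label, confidence in detections
        )
    ]


# Hypothetical detector output for three charter images.
example = {
    "charter_001": [("logo", 0.91)],
    "charter_002": [],
    "charter_003": [("logo", 0.32)],
}
retrieved = documents_with_logos(example)
```

With the example input, only `charter_001` is retrieved: `charter_002` has no detections and the detection in `charter_003` falls below the assumed threshold.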

Future research could move towards classifying different types of logos and building a logo catalogue from a historical document archive.


Further information about the study

The researchers behind the study are:
Sukalpa Chanda (Østfold University College)
Prashant Kumar Prasad (Indian Statistical Institute)
Anders Hast (Uppsala University)
Anders Brun (Uppsala University)
Lasse Martensson (Uppsala University)
Umapada Pal (Indian Statistical Institute)

Link to the paper: https://link.springer.com/chapter/10.1007%2F978-3-030-41404-7_58 (access requires payment)

Photo of Sukalpa Chanda

Sukalpa Chanda, PhD
Associate Professor
Faculty of Computer Science
Østfold University College, Norway

Published Dec. 23, 2020 09:12 - Last modified Dec. 23, 2020 09:12