Ongoing project

Hugin-Munin: Enhanced Access to Norwegian Cultural Heritage using AI-driven Handwriting Recognition

The project's main goal is to create a system based on artificial intelligence that can automatically transcribe any historical handwriting by Norwegian writers, including writers whose handwriting the system did not see during its training phase.

About the project

Despite major developments within artificial intelligence, computational linguistics and neural networks, no such general system with acceptable quality exists for Norwegian. The existing systems are specialized: they recognize handwriting with sufficient quality only for writers included in the training set.
Intermediate goals are to further improve the specialized recognition for writers in the training set, to increase the number of writers in the training set, and to automate the training process as much as possible.

The following steps will be used to achieve the goals:
-Building on existing systems to create a robust layout analysis system, i.e. one that finds text lines, that can adapt to a new writer's style.
-Using and adapting state-of-the-art neural network technology for character recognition.
-Utilizing advanced linguistics for historical Norwegian to improve the recognition.
-Incorporating novel techniques, such as generating artificial documents that mimic the handwriting of a writer (using generative adversarial networks, GANs) but have known content, so that they can be used for training without any manual effort. A trainable feature-based method ("zero-shot word spotting") will also be used to recognize words and augment the results from the other processing.
-Generating a large training set with a diverse set of writing styles while minimizing the manual effort needed for transcription.

The project will place great emphasis on testing and analysing test results with feedback to the development to track progress and identify issues that need special attention.
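
As an illustration of how such test results might be analysed, the sketch below computes a character error rate (CER) separately for writers that were in the training set and for writers that were not. This is only a minimal example under stated assumptions: the project description does not say which metric or tooling is used, and the function names `edit_distance` and `cer_by_group`, as well as the sample data, are hypothetical.

```python
from collections import defaultdict


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a character of ref
                            curr[j - 1] + 1,      # insert a character of hyp
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer_by_group(samples, training_writers):
    """Return the CER for 'seen' and 'unseen' writers.

    samples: iterable of (writer_id, reference_text, recognized_text).
    training_writers: set of writer ids that were part of the training set.
    (Both inputs are hypothetical structures chosen for this sketch.)
    """
    errors = defaultdict(int)
    chars = defaultdict(int)
    for writer, ref, hyp in samples:
        group = "seen" if writer in training_writers else "unseen"
        errors[group] += edit_distance(ref, hyp)
        chars[group] += len(ref)
    return {g: errors[g] / chars[g] for g in chars if chars[g] > 0}


# Invented example data, for illustration only.
samples = [
    ("writer_a", "hei", "hei"),    # writer in the training set
    ("writer_b", "brev", "brew"),  # writer not in the training set
]
print(cer_by_group(samples, training_writers={"writer_a"}))
```

Reporting the two groups separately would make it visible whether progress on writers in the training set carries over to writers the system has not seen, which is the distinction the project's main goal rests on.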

The project period

From the starting date: 01.10.2021
To the date of completion: 01.02.2025

Project type

Collaborative Project to Meet Societal and Industry-related Challenges

Funding 

From the Research Council of Norway: 11 995 kNOK. Total for the project: 15 366 kNOK.

Partners 

HØGSKOLEN I ØSTFOLD 
NASJONALBIBLIOTEKET 
TIDVIS AS
ANAHIT AS
TEKLIA

Publications

  • Maarand, Martin; Beyer, Yngvil; Kåsen, Andre; Fosseide, Knut T. & Kermorvant, Christopher (2022). A Comprehensive Comparison of Open-Source Libraries for Handwritten Text Recognition in Norwegian. In: Document Analysis Systems: 15th IAPR International Workshop, DAS 2022, La Rochelle, France, May 22–25, 2022, Proceedings. Springer. ISBN 978-3-031-06554-5. p. 399–413. doi: 10.1007/978-3-031-06555-2_27.

View all works in Cristin


Tags: The Digital Society, DigiTech
Published Nov. 17, 2021 10:42 AM - Last modified July 13, 2023 10:06 PM