On the importance of automatic document verification for an easy and secure onboarding of the IMPULSE eID solution

A holistic approach for the automatic and real-time verification of valid ID documents is being developed for six different European countries as part of the onboarding process of the IMPULSE eID solution. It relies on state-of-the-art Artificial Intelligence techniques and will provide a transparent and secure framework to detect forged ID documents.

Electronic identities (eIDs) have emerged as a novel way of proving identity as part of the digital revolution that society has experienced in recent decades. Nevertheless, their deployment still faces a number of challenges, such as privacy and security issues, infrastructure and interoperability, low digital literacy levels and technology lock-in. Even though some citizens make good use of them and their popularity in certain European countries is fairly high, overall adoption rates remain low and physical IDs are still the preferred option for everyday use.

IMPULSE will provide an eID solution based on disruptive technologies focusing, among other aspects, on simplicity and easy adoption. Following this philosophy, the onboarding process that leads to the generation of the eID will avoid time-consuming paperwork and will only require pictures/video of the users to be captured with their smartphones. The main concern with this approach, especially given that only photos of the selected ID documents will be provided – and not the ID cards themselves – is that the photos may have been manipulated. This creates the need to validate the documents and detect any forgery they might present.

Document forgery is as old as writing itself. Libraries and museums keep records of tampered ancient manuscripts, such as land property titles from the Roman Republic. When proof of identity became mandatory for administrative proceedings, the need for forgery detection on ID documents appeared. The first detection methods, which relied on document experts’ knowledge with no automatic aid, were prone to failure, and it was relatively easy to bypass controls. With the appearance of Computer Vision and Machine Learning (ML) techniques, algorithms of ever-growing sophistication have become key to differentiating between tampered and genuine documents. These methodological and technological advances have led to a cat-and-mouse game between authorities – implementing more effective security measures – and counterfeiters – taking advantage of the newest developments for their own interest. Forgery detection therefore remains a hot topic, as it helps to prevent crime, avoid potential harm to citizens and strengthen the global economy. Thankfully, cutting-edge Artificial Intelligence techniques provide a toolset to make these detections more accurate than ever.

Despite these improvements, implementing ID document verification systems within the European Union and the European Free Trade Association is neither easy nor straightforward. One of the main challenges lies in the broad range of coexisting document types and versions. To begin with, some countries issue national ID cards, whereas others do not and rely on passports, driving licenses or digital certificates for the same purpose. Moreover, designs and layouts differ from country to country and, in most cases, have been updated every few years to incorporate the latest security measures. Consequently, a significant number of citizens own legacy, but still valid, versions of their documents. The IMPULSE project, with its 6 case studies in 5 different European countries, will leverage the common features of these variations and adapt to their differences.

Another difficulty stems from the device used to capture the document pictures. The smartphone ecosystem has grown into a vast landscape of manufacturers, vendors, brands, models, generations and operating system versions. Accordingly, smartphone cameras have highly heterogeneous features, offering different resolutions, lenses, sensors and technologies. On top of these hardware-related issues come the variations in illumination at the locations where the pictures are taken. The result is considerable variability in image colour, sharpness, document positioning and metadata, which must be handled with digital image processing techniques.
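As an illustration of the kind of image-quality check this variability calls for, the sketch below scores sharpness with the variance of the Laplacian, a widely used focus measure. It is not taken from the IMPULSE implementation: the function name `laplacian_variance` and the plain list-of-lists image format are assumptions made to keep the example dependency-free, and a real pipeline would more likely use an image processing library and a threshold tuned per device.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Return the variance of the 4-neighbour Laplacian of a greyscale image.

    `gray` is a hypothetical input format: a list of equally long rows of
    pixel intensities, at least 3x3. Higher scores mean more edge detail,
    so a blurred capture scores lower than a sharp capture of the same scene.
    """
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3 pixels")

    responses = []
    # Border pixels are skipped because they lack a full neighbourhood.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            laplacian = (
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
            responses.append(laplacian)
    return pvariance(responses)


# A hard vertical edge versus the same edge smeared into a linear ramp.
sharp_edge = [[0, 0, 0, 255, 255, 255] for _ in range(4)]
blurred_edge = [[0, 51, 102, 153, 204, 255] for _ in range(4)]
print(laplacian_variance(sharp_edge) > laplacian_variance(blurred_edge))
```

A capture whose score falls below a chosen threshold could be rejected early, with the user asked to retake the picture before any forgery analysis is attempted.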

The IMPULSE ID document verification module will rely on digital image processing together with a handful of related technologies, with Optical Character Recognition (OCR) being one of its main pillars. Put romantically, OCR can be defined as teaching the machine to read. A more practical definition in a modern context would be the conversion of images containing printed text into machine-encoded characters. In IMPULSE, state-of-the-art OCR libraries will be used to compare the information contained in the ID documents with that provided by the users and to locate characters on the document pictures.
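The comparison between OCR output and user-provided data can be sketched as follows. This is a minimal illustration rather than the IMPULSE implementation: it assumes the OCR text is already available as a string, the helper names `normalise`, `field_similarity` and `fields_match` are hypothetical, and the 0.9 threshold is an arbitrary placeholder that would need tuning on real data.

```python
import unicodedata
from difflib import SequenceMatcher


def normalise(text):
    """Upper-case, strip accents and collapse whitespace.

    This makes 'José  García' and 'JOSE GARCIA' compare as equal, since
    printed documents and user input often differ in exactly these ways.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.upper().split())


def field_similarity(ocr_text, declared_text):
    """Return a similarity ratio in [0, 1] between two normalised fields."""
    return SequenceMatcher(
        None, normalise(ocr_text), normalise(declared_text)
    ).ratio()


def fields_match(ocr_text, declared_text, threshold=0.9):
    """Tolerate small OCR slips (e.g. 'I' read as '1'); 0.9 is an assumption."""
    return field_similarity(ocr_text, declared_text) >= threshold


print(fields_match("JOSE GARC1A", "José García"))  # one misread character
print(fields_match("MARIA LOPEZ", "José García"))  # a different person
```

A fuzzy ratio is used instead of strict equality because OCR engines commonly confuse visually similar glyphs, and a single misread character should not block a genuine user.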

Other pieces of the technological stack are worth mentioning as well. The machine-readable zone (MRZ) is formatted to allow standard verification by different authorities. It will be retrieved using specialised software and compared with the rest of the information. Furthermore, several features are considered to compare intra-document similarities at pixel and character level. In this context, image key-points will serve to detect copy-move forgeries (textual areas of the pictures that have been copied and pasted in a different location). Last, but not least, advanced ML algorithms will be deployed to detect manipulation of characters. The resulting models will consider characters’ morphology and detect outliers. Since the outcome of this module is a probability of forgery, together with the specific region(s) of the image where the suspicious element is located, it will serve as a security barrier against counterfeiters and allow only holders of genuine ID documents to move forward with the onboarding process on the IMPULSE mobile app.
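One reason the MRZ lends itself to standard verification is that its fields carry check digits, defined in ICAO Doc 9303, which let a verifier spot fields that were altered or misread. The sketch below shows that check-digit computation only. It is not the IMPULSE implementation: the function names are hypothetical, and the sample values are the well-known specimen data used in ICAO examples.

```python
def mrz_char_value(ch):
    """Map one MRZ character to its numeric value (ICAO Doc 9303).

    Digits keep their value, letters A-Z map to 10-35 and the filler
    character '<' counts as 0.
    """
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"invalid MRZ character: {ch!r}")


def mrz_check_digit(field):
    """Weighted sum with the repeating weights 7, 3, 1, taken modulo 10."""
    weights = (7, 3, 1)
    total = sum(
        mrz_char_value(ch) * weights[i % 3] for i, ch in enumerate(field)
    )
    return total % 10


def field_is_consistent(field, check_digit):
    """Return True if the printed check digit matches the computed one."""
    return check_digit.isdigit() and mrz_check_digit(field) == int(check_digit)


# Specimen document number and date of birth with their check digits.
print(field_is_consistent("L898902C3", "6"))
print(field_is_consistent("740812", "2"))
```

A failed check does not prove forgery on its own, since an OCR misread produces the same symptom, so in practice it would be one signal among several feeding the forgery probability.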

This state-of-the-art, AI-based technological stack will allow IMPULSE to provide an onboarding process that is simple, fast, secure, transparent for users and compliant with the General Data Protection Regulation (GDPR), while remaining robust in the background and highly resilient against attempts to sign up using fake or forged IDs.