Building an OCR App with Entity Extraction Using ML Kit

Kashif Mehmood
Published in Towards Dev
3 min read, Jan 10, 2022

Build a card reader app with ML Kit OCR and Entity Extraction

What is Entity Extraction or Named Entity Extraction?

Entity extraction is the process by which computers pull entities such as phone numbers, emails, addresses, and product names out of text. Search engines use it to understand search queries, and card scanners use it to extract relevant information from visiting cards. Although it is easy for humans to recognize names, places, and addresses, it is a hard task for computers to extract entities from a given passage. For example, “orange” can be both a colour and a fruit; understanding the context of a sentence is important, and computers are not good at it.

More on entity extraction can be found here: https://www.expert.ai/blog/entity-extraction-work/

Source: AI Time Journal
https://www.aitimejournal.com/@akshay.chavan/complete-tutorial-on-named-entity-recognition-ner-using-python-and-keras

Entity Extraction On Android

You may have seen different apps that scan a visiting card and then turn the info on the card into text, which can be saved as a contact in your phone book. All of this is done in two steps:
1. Text Recognition
2. Entity Extraction

Both of these parts used to be done with either OpenCV or TensorFlow Lite. This had a steep learning curve, and you also needed machine learning and NLP expertise to achieve a good result. It was a tedious, time-consuming task that required a lot of effort.

Why ML Kit?

ML Kit is a one-stop shop for on-device ML on Android. It provides many pre-trained models for tasks such as barcode scanning, face detection, text recognition, pose estimation, and now entity extraction as well. Combined with CameraX, ML Kit can do wonders.
More on ML Kit: https://developers.google.com/ml-kit

Let’s get to the code:

This article illustrates only how to extract entities, as there are already many articles on text recognition.

Add this dependency first:

dependencies {
    // …
    implementation 'com.google.mlkit:entity-extraction:16.0.0-beta1'
}

Step 1:
The first step is to create an entity extractor object. In its options you specify the language of the text you want to extract entities from. I am using English.
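
A minimal sketch of this step, assuming the ML Kit entity extraction dependency is on the classpath. The function name createEnglishExtractor is a hypothetical helper introduced only for illustration; the ML Kit classes are the ones from the com.google.mlkit.nl.entityextraction package.

```kotlin
import com.google.mlkit.nl.entityextraction.EntityExtraction
import com.google.mlkit.nl.entityextraction.EntityExtractor
import com.google.mlkit.nl.entityextraction.EntityExtractorOptions

// Hypothetical helper: builds an extractor configured for English text.
fun createEnglishExtractor(): EntityExtractor {
    // The language is chosen through the options builder.
    val options = EntityExtractorOptions.Builder(EntityExtractorOptions.ENGLISH).build()
    return EntityExtraction.getClient(options)
}
```

Swapping EntityExtractorOptions.ENGLISH for another supported language constant should be the only change needed to target a different language.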

Step 2:
In the second step, we check whether the entity extractor model needs to be downloaded. If it has already been downloaded, the call goes straight to the success listener; otherwise it downloads the model first.
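
The download check might look roughly like the sketch below. The function name ensureModelReady, the onReady callback, and the log tag are assumptions made for this example; downloadModelIfNeeded and the listener methods come from ML Kit and the Play services Tasks API.

```kotlin
import android.util.Log
import com.google.mlkit.nl.entityextraction.EntityExtractor

// Hypothetical helper: makes sure the model is available, then runs onReady.
fun ensureModelReady(entityExtractor: EntityExtractor, onReady: () -> Unit) {
    entityExtractor
        .downloadModelIfNeeded()
        // Called right away if the model is already present, or after the download finishes.
        .addOnSuccessListener { onReady() }
        // Called if the model could not be downloaded (for example, no network).
        .addOnFailureListener { e ->
            Log.e("EntityExtraction", "Model download failed", e)
        }
}
```

Extraction should only be started from inside onReady, since the extractor cannot annotate text before its model is available.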

Step 3:

In the third step, we pass the text to our entity extractor object as parameters, and the extractor annotates the text with specific entities. Put your own text in the “text” variable.
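
As a sketch of this step, the text can be wrapped in an EntityExtractionParams object and handed to annotate. The function name annotateText is a hypothetical wrapper; the optional filters on the params builder (entity types, locale, reference time) are left at their defaults here.

```kotlin
import com.google.android.gms.tasks.Task
import com.google.mlkit.nl.entityextraction.EntityAnnotation
import com.google.mlkit.nl.entityextraction.EntityExtractionParams
import com.google.mlkit.nl.entityextraction.EntityExtractor

// Hypothetical wrapper: asks the extractor to annotate the given text.
fun annotateText(
    entityExtractor: EntityExtractor,
    text: String
): Task<List<EntityAnnotation>> {
    // The text to analyse is passed to the params builder.
    val params = EntityExtractionParams.Builder(text).build()
    // annotate runs asynchronously and returns a Task with the annotations.
    return entityExtractor.annotate(params)
}
```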

Step 4:

The last step is to read the annotated text with its entities. The Entity Extraction API supports multiple entity types that a piece of text can be classified into. The annotate call returns a list of annotations, each containing the text that was annotated and a list of the entities that text belongs to. We can loop over them and use a when statement to check which text belongs to which entity.
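
One way to walk the result is sketched below. The function name logEntities, the labels, and the log tag are assumptions for illustration; a card reader would more likely store the values in a contact object instead of logging them. Only a few of the supported entity types are handled explicitly.

```kotlin
import android.util.Log
import com.google.mlkit.nl.entityextraction.Entity
import com.google.mlkit.nl.entityextraction.EntityAnnotation

// Hypothetical helper: logs each annotated piece of text with its entity type.
fun logEntities(annotations: List<EntityAnnotation>) {
    for (annotation in annotations) {
        // One span of text can carry more than one entity.
        for (entity in annotation.entities) {
            val label = when (entity.type) {
                Entity.TYPE_PHONE -> "Phone"
                Entity.TYPE_EMAIL -> "Email"
                Entity.TYPE_ADDRESS -> "Address"
                Entity.TYPE_URL -> "URL"
                else -> "Other"
            }
            Log.d("EntityExtraction", "$label: ${annotation.annotatedText}")
        }
    }
}
```

This function would typically be called from the success listener attached to the Task that annotate returns.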

A working application can be found here:

Connect with me on LinkedIn:

https://www.linkedin.com/in/kashif-mehmood-km/

Software Engineer | Kotlin Multiplatform | Jetpack/Multiplatform Compose