Noupe Editorial Team July 25th, 2017

Smart Text Detection and Manipulation in Images

Every day, web developers look for better ways to enhance the user experience when it comes to images. User-uploaded and third-party images often contain sensitive information, which creates a need to manipulate text embedded in the image. Car registration numbers, identity cards, road signs, and commercials are some of the scenarios in which you may need to manipulate the text content of an image. The requirement gets even more interesting in advanced scenarios, such as translating text written in a foreign language. Done manually, this seems simple, but if you handle hundreds of images regularly, automation is necessary for efficiency.

Services like Cloudinary can simplify the complex process of image text extraction and manipulation. Cloudinary is a cloud-based, end-to-end image and video management service. Storage, manipulation, transformation, and media delivery are what Cloudinary does best. Its wide range of manipulations includes recognizing, extracting, and manipulating text in images. Optical character recognition (OCR) is available as an add-on powered by the Google Vision API.

OCR for Manipulation

The first thing we want to attempt with OCR is manipulating an image based on the characters found in it. For example, on a real estate website, you may want to hide the agent's contact details. Though you may be able to restrict agents from displaying their contact information, they may discover other ways to leave these details in the image itself. In the example image (home_4_sale.jpg), the sign clearly shows the agent's phone number, which might violate your terms and conditions. With OCR, you can replace the text with your own contact information.

To achieve this with Cloudinary, we need to use three parameters:

1. The overlay image: the image we intend to place over the detected text.
2. Gravity set to ocr_text, for correct positioning.
3. fl_region_relative, to adjust the width of the overlay image relative to that of the detected text element.

In the delivery URL, these parameters appear as:

,fl_region_relative,w_1.1,g_ocr_text/home_4_sale.jpg

To replicate the above example, you'll need a free Cloudinary account. Once you have created your account, upload the image to your provisioned cloud and start manipulating the image URL as we are doing here.

Of course, you can also use an SDK to achieve this. Here's an example using the JavaScript SDK to deliver a transformed image:

```js
cloudinary.image("home_4_sale.jpg", {overlay: "call_text", flags: "region_relative", width: "1.1", gravity: "ocr_text"})
```

Rather than covering the text with another image, we could also blur the text if we don't want to display any contact information on the image. Using the same example:

,w_1.1,g_ocr_text/home_4_sale.jpg

So, instead of using an overlay, we set e_pixelate_region to blur the region, with 15 being the level of blur applied. Notice that g_ocr_text is still there to specify the OCR instruction.
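Because the delivery URLs in this section are only partly shown, it may help to see how the named parameters fit together. The following is a minimal sketch, not the article's exact URLs: the helper names (`deliveryUrl`, `ocrOverlay`, `ocrPixelate`) are hypothetical, the cloud name `demo` is a placeholder, and the parameter order is an assumption based on the parameters described above (`l_` for the overlay, `fl_region_relative`, `w_`, `g_ocr_text`, and `e_pixelate_region`).

```js
// Hypothetical helper: assemble a Cloudinary image delivery URL.
// cloudName is a placeholder; use the cloud name from your own account.
function deliveryUrl(cloudName, transformation, publicId) {
  return `https://res.cloudinary.com/${cloudName}/image/upload/${transformation}/${publicId}`;
}

// Overlay variant: place another uploaded image over the detected text.
function ocrOverlay(overlayId, relativeWidth) {
  return [
    `l_${overlayId}`,        // the overlay image
    'fl_region_relative',    // size the overlay relative to the detected text region
    `w_${relativeWidth}`,    // e.g. 1.1 = 110% of the text region's width
    'g_ocr_text'             // position on the text found by OCR
  ].join(',');
}

// Pixelate variant: blur the detected text instead of covering it.
function ocrPixelate(level) {
  return [`e_pixelate_region:${level}`, 'g_ocr_text'].join(',');
}

console.log(deliveryUrl('demo', ocrOverlay('call_text', '1.1'), 'home_4_sale.jpg'));
console.log(deliveryUrl('demo', ocrPixelate(15), 'home_4_sale.jpg'));
```

Keeping the transformation string separate from the URL base makes it easy to switch between the overlay and pixelate variants for the same image.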

OCR for Text Extraction

Another common use case is retrieving the text detected in the image. The extracted text can then be analyzed further to fit the user's needs. You can retrieve this text while uploading or updating an image stored on your Cloudinary server. Let's upload a Pexels image to Cloudinary and extract the text found in it.

To get started, you need to create an account on Cloudinary. Once you have an account, a cloud is provisioned for you to store your images and transform them as you wish. You will also be given your API credentials, which include the cloud name, API key, and API secret. Retrieve these credentials and store them safely.

Next, enable the OCR add-on by going to your add-on settings and selecting the free option under the OCR add-on configuration.

Next, create a simple Node environment by running:

```bash

# Create a Node project
npm init -y

# Add an entry point

touch index.js
```

This also creates an index.js entry point for the example. Before heading into this file, we need to install the Cloudinary SDK:

```bash
npm install --save cloudinary
```

You can now head back to the index.js entry file and configure a Cloudinary instance to connect to your Cloudinary cloud:

```js
const cloudinary = require('cloudinary');

// Replace the placeholders with the credentials from your Cloudinary account
cloudinary.config({
  cloud_name: 'CLOUD_NAME',
  api_key: 'API_KEY',
  api_secret: 'API_SECRET'
});
```

You're all set to start uploading images and retrieving the text they contain. Here is how:

```js
// The first argument is the path or URL of the image to upload
cloudinary.v2.uploader.upload("", { ocr: "adv_ocr" }, function(error, result) {
  if (error) {
    console.log(error);
    return;
  }
  // The first annotation holds the full text detected in the image
  console.log(result.info.ocr.adv_ocr.data[0].textAnnotations[0].description);
});
```

The upload method sends images to your Cloudinary server. If you also want the upload response to include the text detected in the image, you need to set the ocr option to adv_ocr. Run the app with the following command and watch the output in the console:

```bash
node index.js
```

The text detected in the uploaded image is printed in the console. This text can be used as per your requirements.
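Reading a deeply nested field straight from the upload response throws if the image contains no text or the add-on returns nothing. As a minimal sketch, assuming the OCR result is nested under `result.info.ocr.adv_ocr.data` with a `textAnnotations` array, a small guard can return an empty string instead. The helper name `extractOcrText` and the `sampleResult` object are hypothetical and only illustrate the assumed shape; they are not real API output.

```js
// Hypothetical helper: safely pull the full detected text out of an upload result.
// Assumes the shape result.info.ocr.adv_ocr.data[0].textAnnotations[0].description.
function extractOcrText(result) {
  const pages = result?.info?.ocr?.adv_ocr?.data;
  if (!Array.isArray(pages) || pages.length === 0) return '';

  const annotations = pages[0].textAnnotations;
  if (!Array.isArray(annotations) || annotations.length === 0) return '';

  // The first annotation is treated as the full text of the image
  return annotations[0].description || '';
}

// Made-up sample mimicking the assumed response shape
const sampleResult = {
  info: {
    ocr: {
      adv_ocr: {
        data: [
          {
            textAnnotations: [
              { description: 'FOR SALE\n555-0100' },
              { description: 'FOR' }
            ]
          }
        ]
      }
    }
  }
};

console.log(extractOcrText(sampleResult));
console.log(extractOcrText({}) === '');
```

Inside the upload callback, `extractOcrText(result)` could replace the direct property access, so an image without text logs an empty string rather than raising a TypeError.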


It's amazing what we can achieve with Cloudinary's OCR add-on. We only had to add parameters to the delivery URL of an image to replace its text or blur it out. You can also easily set up a process to block images containing adult or sensitive words by extracting the text. Cloudinary OCR enables this and many more use cases. Sign up for free to get started.
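The blocking process mentioned above could start from a simple word check on the extracted text. This is only a sketch under stated assumptions: the function name `findBlockedWords` and the word list are hypothetical, and a real moderation flow would likely need more than exact word matching.

```js
// Hypothetical helper: return the blocked words that occur in the extracted text.
// Matching is case-insensitive and works on whole words only.
function findBlockedWords(text, blockedWords) {
  // Split on anything that is not a letter or digit, dropping empty tokens
  const tokens = new Set(
    text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean)
  );
  return blockedWords.filter((word) => tokens.has(word.toLowerCase()));
}

// Made-up block list and sample text for illustration
const blockList = ['pills', 'casino'];
const hits = findBlockedWords('Call NOW: cheap pills!', blockList);

if (hits.length > 0) {
  console.log('Image rejected, blocked words found:', hits.join(', '));
} else {
  console.log('Image accepted');
}
```

The text passed in would be whatever the OCR add-on returned for the uploaded image; an image whose text produces any hits could then be rejected or sent for manual review.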

