Installing Tesseract-OCR on Windows devices
Tesseract-OCR is an open-source optical character recognition (OCR) engine that converts text within images into machine-readable text. Coro uses Tesseract-OCR to detect sensitive information in image files during data scans on Windows endpoint devices.
Installing Tesseract-OCR
To install Tesseract-OCR on a Windows device:
Download and execute the Tesseract-OCR installation file.
Select a language from the Installer Language dialog dropdown, and then select OK.
Select Next >.
Review the agreement terms, and then select I Agree to continue.
Select a user installation option, and then select Next >.
Select the components to install. Make sure English is selected in Language data.
Select Next >.
Enter the Tesseract-OCR installation directory, or use the default. Select Next > to continue.
Important: Record the Tesseract-OCR installation directory. You need it to configure the TESSDATA_PREFIX environment variable. Without this variable, Tesseract-OCR might not work properly.
Important: If you enter a custom Tesseract-OCR installation directory, you must add this directory to the PATH environment variable so that Tesseract-OCR is accessible from Windows Command Prompt.
Select the Start menu folder in which to create the Tesseract-OCR shortcuts, or select Do not create shortcuts.
Select Install.
The Tesseract-OCR installation starts.
After the installation completes, select Next >.
Select Finish.
Verify the Tesseract-OCR installation by opening Windows Command Prompt and entering:
tesseract -v
Windows Command Prompt displays the details of the Tesseract-OCR installation found on the device.
Important: If Windows Command Prompt does not recognize the command, you must add the Tesseract-OCR installation directory to the PATH environment variable.
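If you need to run the same verification on several devices, it can be scripted. The following is a minimal Python sketch, not part of the Tesseract-OCR or Coro tooling. It assumes Python 3.7 or later is available on the device, and the function names find_tesseract and tesseract_version are hypothetical helpers introduced only for this example.

```python
import shutil
import subprocess


def find_tesseract(search_path=None):
    """Return the full path of the tesseract executable, or None if not found.

    search_path is an optional PATH-style string; None means the current PATH.
    """
    return shutil.which("tesseract", path=search_path)


def tesseract_version(executable):
    """Run `tesseract -v` and return the first line of its output."""
    result = subprocess.run(
        [executable, "-v"], capture_output=True, text=True, check=True
    )
    # Some Tesseract builds print the version banner to stderr instead of stdout.
    output = result.stdout or result.stderr
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    exe = find_tesseract()
    if exe is None:
        # Same situation as Command Prompt not recognizing the command.
        print("tesseract was not found on PATH")
    else:
        print(exe)
        print(tesseract_version(exe))
```

If the script reports that tesseract was not found, the fix is the same as for the manual check: add the installation directory to the PATH environment variable.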
Creating the TESSDATA_PREFIX environment variable
If you installed Tesseract-OCR in a custom directory (different from the default C:\Program Files\Tesseract-OCR), you must perform this procedure.
Tesseract-OCR uses language data files (.traineddata) in the Tesseract-OCR\tessdata folder for OCR. If these files are missing from the default location, Tesseract-OCR might fail to process text correctly. To ensure Tesseract-OCR finds them, set the TESSDATA_PREFIX environment variable after installing it on your Windows device.
To add the TESSDATA_PREFIX environment variable:
These instructions apply to Windows 10 and Windows 11.
Select Search and enter Environment Variables.
Select Edit the system environment variables.
Select Environment Variables... from the System Properties dialog.
Under System variables, select New....
Enter the following configuration:
- Variable name: TESSDATA_PREFIX
- Variable value: Enter the full path to the tessdata folder inside your Tesseract-OCR installation directory. For example, C:\Program Files\Tesseract-OCR\tessdata.
Select OK.
Windows creates the TESSDATA_PREFIX environment variable.
Verify the TESSDATA_PREFIX environment variable by opening a new Windows Command Prompt window (windows that were already open do not pick up the change) and entering:
echo %TESSDATA_PREFIX%
Windows Command Prompt displays the value of the TESSDATA_PREFIX environment variable.
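The echo command only confirms that the variable is set. To also confirm that the folder contains the language data files, a check along the following lines might help. This is a sketch under stated assumptions: Python 3 is installed, English (eng.traineddata) is the required language as selected during installation, and the function names installed_languages and check_tessdata_prefix are hypothetical helpers written for this example.

```python
import os
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return the sorted language codes that have a .traineddata file in tessdata_dir."""
    folder = Path(tessdata_dir)
    if not folder.is_dir():
        return []
    # "eng.traineddata" -> "eng"
    return sorted(p.stem for p in folder.glob("*.traineddata"))


def check_tessdata_prefix(environ=os.environ, required="eng"):
    """Return (ok, message) describing whether TESSDATA_PREFIX points at usable data."""
    prefix = environ.get("TESSDATA_PREFIX")
    if not prefix:
        return False, "TESSDATA_PREFIX is not set"
    languages = installed_languages(prefix)
    if required not in languages:
        return False, f"{required}.traineddata not found in {prefix}"
    return True, f"found {len(languages)} language file(s) in {prefix}"


if __name__ == "__main__":
    ok, message = check_tessdata_prefix()
    print("OK:" if ok else "Problem:", message)
```

Run the script from a Command Prompt window opened after you created the variable, so that the script sees the new value.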
Adding the Tesseract-OCR installation directory to the PATH environment variable
If you installed Tesseract-OCR in a custom directory (different from the default C:\Program Files\Tesseract-OCR), you must perform this procedure.
When you add your Tesseract-OCR installation directory to the PATH environment variable, the operating system can locate and run Tesseract-OCR from Windows Command Prompt without the full path to the Tesseract-OCR executable file.
To add the Tesseract-OCR installation directory to the PATH environment variable:
These instructions apply to Windows 10 and Windows 11.
Select Search and enter Environment Variables.
Select Edit the system environment variables.
Select Environment Variables... from the System Properties dialog.
Locate the Path variable in the System variables list, and then select Edit....
Select New, paste your Tesseract-OCR installation directory, and then select OK.
Windows adds the Tesseract-OCR installation directory to the PATH environment variable.
Verify the PATH environment variable by opening a new Windows Command Prompt window and entering:
tesseract -v
Windows Command Prompt displays the details of the Tesseract-OCR installation found on the device.
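If the command is still not recognized, it can help to confirm that the directory actually appears in PATH. The sketch below illustrates one way to do that comparison using Windows path rules (case-insensitive, trailing separators ignored). It assumes Python 3 is installed. The helper names path_entries and is_on_path, and the directory D:\Tools\Tesseract-OCR, are hypothetical and used only for illustration.

```python
import ntpath
import os


def path_entries(path_value):
    """Split a Windows PATH string into its non-empty entries."""
    return [entry.strip().strip('"') for entry in path_value.split(";") if entry.strip()]


def is_on_path(directory, path_value):
    """Return True if directory is one of the entries in path_value.

    Uses ntpath so the comparison follows Windows rules: case-insensitive,
    either slash direction, trailing separators ignored.
    """
    def normalize(p):
        return ntpath.normcase(ntpath.normpath(p))

    target = normalize(directory)
    return any(normalize(entry) == target for entry in path_entries(path_value))


if __name__ == "__main__":
    # Hypothetical custom installation directory; replace with your own.
    install_dir = "D:\\Tools\\Tesseract-OCR"
    print(is_on_path(install_dir, os.environ.get("PATH", "")))
```

The comparison is purely textual. It does not expand environment variable references inside PATH entries or check that the directory exists.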
