Text Search (OCR)

Find and interact with text on screen using optical character recognition

ocr, text-search, automation

The OCR plugin enables you to find and read text directly from the screen. This is useful when automating applications where text is rendered dynamically, or when maintaining reference images isn't practical.

When to Use OCR

  • Dynamic labels that change based on data (prices, usernames, timestamps)
  • Localized applications where button text varies by language
  • Data extraction from screens that don't expose their content programmatically
  • Verification that expected text appears after an action

For static UI elements, image matching is often faster and more reliable.
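As a rough illustration of the data-extraction case, the sketch below pulls price-like values out of lines of recognized text. The `extractPrices` helper is hypothetical and not part of nut.js or the OCR plugin; it assumes the recognized text is already available as an array of strings.

```typescript
// Hypothetical helper (not part of nut.js): extracts price-like numbers
// such as "1,299.00" or "7.99" from lines of recognized text.
function extractPrices(lines: string[]): number[] {
    const prices: number[] = [];
    for (const line of lines) {
        // Integer part (optionally with thousands separators), then exactly two decimals
        const pricePattern = /(\d{1,3}(?:,\d{3})+|\d+)\.(\d{2})\b/g;
        let match: RegExpExecArray | null;
        while ((match = pricePattern.exec(line)) !== null) {
            // Drop thousands separators before converting to a number
            prices.push(Number(match[1].replace(/,/g, "") + "." + match[2]));
        }
    }
    return prices;
}
```

OCR output can contain misread characters, so values extracted this way are best validated before they are used.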

Installation

bash
npm i @nut-tree/plugin-ocr

Quick Start

typescript
import {screen, singleWord, mouse, straightTo, centerOf} from "@nut-tree/nut-js";
import {useOcrPlugin, configure, LanguageModelType, preloadLanguages, Language} from "@nut-tree/plugin-ocr";

// Register the OCR plugin, then configure it once at startup
useOcrPlugin();
configure({
    dataPath: "./ocr-data",  // Where to store language models
    languageModelType: LanguageModelType.BEST
});

// Load language models up front so the first search is not delayed
await preloadLanguages([Language.English]);

// Find the text, move to its center, and click
const button = await screen.find(singleWord("Submit"));
await mouse.move(straightTo(centerOf(button)));
await mouse.leftClick();

Finding Text on Screen

The singleWord function (from @nut-tree/nut-js) creates a text query. Configure matching behavior via providerData:

typescript
import {screen, mouse, singleWord, straightTo, centerOf} from "@nut-tree/nut-js";
import {useOcrPlugin, configure, Language, LanguageModelType, preloadLanguages} from "@nut-tree/plugin-ocr";

useOcrPlugin();
configure({
    dataPath: "/path/to/store/language/models",
    languageModelType: LanguageModelType.BEST
});

await preloadLanguages([Language.English, Language.German]);

screen.config.ocrConfidence = 0.8;  // Minimum confidence threshold
screen.config.autoHighlight = true; // Highlight matches for debugging

const location = await screen.find(singleWord("WebStorm"), {
    providerData: {
        lang: [Language.English, Language.German],
        partialMatch: false,   // Require exact word match
        caseSensitive: false   // Ignore case
    }
});

await mouse.move(straightTo(centerOf(location)));

Provider Options

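| Option        | Type       | Description                      |
|---------------|------------|----------------------------------|
| lang          | Language[] | Languages to use for recognition |
| partialMatch  | boolean    | Allow matching part of a word    |
| caseSensitive | boolean    | Whether to match case exactly    |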
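To make the two boolean options more concrete, here is a small sketch of how they could affect a comparison between a recognized word and the query. The option names mirror providerData, but `wordMatches` and its logic are assumptions made for illustration, not the plugin's actual implementation.

```typescript
// Illustration only: the option names mirror providerData, but the logic
// below is an assumption about their meaning, not the plugin's source code.
interface MatchOptions {
    partialMatch: boolean;
    caseSensitive: boolean;
}

function wordMatches(recognized: string, query: string, options: MatchOptions): boolean {
    // Fold case on both sides unless an exact-case match is requested
    const haystack = options.caseSensitive ? recognized : recognized.toLowerCase();
    const needle = options.caseSensitive ? query : query.toLowerCase();
    // partialMatch accepts the query anywhere inside the recognized word
    return options.partialMatch ? haystack.indexOf(needle) !== -1 : haystack === needle;
}
```

Under this reading, a recognized word such as "WebStorm:" would only match the query "WebStorm" when partialMatch is enabled.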

Reading Text from a Region

Extract text from a specific area of the screen:

typescript
import {getActiveWindow, screen} from "@nut-tree/nut-js";
import {useOcrPlugin, configure, LanguageModelType, TextSplit, preloadLanguages, Language} from "@nut-tree/plugin-ocr";

useOcrPlugin();
configure({
    dataPath: "/path/to/store/language/models",
    languageModelType: LanguageModelType.BEST
});

await preloadLanguages([Language.English, Language.German]);

const activeWindow = await getActiveWindow();

// Read text, split by lines
const text = await screen.read({
    searchRegion: activeWindow.region,
    split: TextSplit.LINE
});

console.log(text);
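Reading a whole window is not always necessary. The sketch below computes a narrower area, for example a toolbar strip at the top of a window. `RegionLike` and `topStrip` are hypothetical names; the sketch assumes a region can be described by left, top, width, and height.

```typescript
// Plain object describing a rectangle (assumed left/top/width/height shape)
interface RegionLike {
    left: number;
    top: number;
    width: number;
    height: number;
}

// Hypothetical helper: returns the top `fraction` of a region,
// e.g. 0.25 for the upper quarter of a window.
function topStrip(region: RegionLike, fraction: number): RegionLike {
    if (fraction <= 0 || fraction > 1) {
        throw new RangeError("fraction must be greater than 0 and at most 1");
    }
    return {
        left: region.left,
        top: region.top,
        width: region.width,
        height: Math.round(region.height * fraction)
    };
}
```

The resulting values could then be used to build the region passed as searchRegion, assuming the region type in your nut.js version accepts these four values.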

TextSplit Options

| Value               | Description            |
|---------------------|------------------------|
| TextSplit.SYMBOL    | Individual characters  |
| TextSplit.WORD      | Words                  |
| TextSplit.LINE      | Lines                  |
| TextSplit.PARAGRAPH | Paragraphs             |
| TextSplit.BLOCK     | Text blocks            |
| TextSplit.NONE      | All text as one string |
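Once text has been split into lines, a typical follow-up is checking whether an expected phrase appears. The sketch below assumes the recognized text is available as an array of strings; `linesContain` is a hypothetical helper, and the exact return type of screen.read should be checked against the plugin documentation.

```typescript
// Hypothetical helper: checks whether any recognized line contains the expected
// phrase, ignoring case and collapsing runs of whitespace.
function linesContain(lines: string[], expected: string): boolean {
    const normalize = (value: string): string =>
        value.replace(/\s+/g, " ").trim().toLowerCase();
    const needle = normalize(expected);
    return lines.some((line) => normalize(line).indexOf(needle) !== -1);
}
```

Normalizing both sides makes the check tolerant of spacing differences, at the cost of ignoring case.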

Configuration Reference

OCR Plugin Configuration

typescript
configure({
    dataPath: string,              // Directory for language model files
    languageModelType: LanguageModelType  // FAST, DEFAULT, or BEST
});

Language Model Types

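| Type    | Speed    | Accuracy | Use Case                    |
|---------|----------|----------|-----------------------------|
| FAST    | Fastest  | Lower    | Quick checks, known fonts   |
| DEFAULT | Balanced | Good     | General use                 |
| BEST    | Slower   | Highest  | Complex layouts, small text |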

Tips

  • Preload languages before searching to avoid delays on first use
  • Use search regions to limit where OCR looks, which is both faster and more accurate
  • Lower the confidence threshold (screen.config.ocrConfidence) if text isn't being found, but expect more false positives
  • Enable autoHighlight during development to see what's being matched
  • partialMatch: true helps when text might include extra characters or whitespace
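The confidence tip can be automated by retrying a search at progressively lower thresholds. This is a minimal sketch: `findWithFallback` is a hypothetical helper, and the search itself is injected as a callback so the pattern does not depend on any particular API.

```typescript
// Hypothetical helper: tries a search at progressively lower confidence values.
// `attempt` is supplied by the caller; it should resolve with a match,
// or reject when nothing was found at the given confidence.
async function findWithFallback<T>(
    confidences: number[],
    attempt: (confidence: number) => Promise<T>
): Promise<{ result: T; confidence: number }> {
    let lastError: unknown = new Error("no confidence values given");
    for (const confidence of confidences) {
        try {
            return { result: await attempt(confidence), confidence };
        } catch (error) {
            // Nothing found at this threshold; remember the error and try the next one
            lastError = error;
        }
    }
    throw lastError;
}
```

In a real script, the callback might set screen.config.ocrConfidence to the given value and then call screen.find(singleWord("Submit")). Logging the confidence that finally succeeded helps spot searches that only pass at low thresholds.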

Supported Languages

The plugin supports 100+ languages. Common ones include:

Language.English, Language.German, Language.French, Language.Spanish, Language.Italian, Language.Portuguese, Language.Dutch, Language.Polish, Language.Russian, Language.Japanese, Language.ChineseSimplified, Language.ChineseTraditional, Language.Korean, Language.Arabic, Language.Hindi

See the full list in the OCR plugin documentation.
