Text Search (OCR) | nut.js Examples

The OCR plugin enables you to find and read text directly from the screen. This is useful when automating applications where text is rendered dynamically, or when maintaining reference images isn't practical.

When to Use OCR

Dynamic labels that change based on data (prices, usernames, timestamps)
Localized applications where button text varies by language
Data extraction from screens that don't expose their content programmatically
Verification that expected text appears after an action

For static UI elements, image matching is often faster and more reliable.

Installation

bash

npm i @nut-tree/plugin-ocr

Quick Start

typescript

import {screen, singleWord, mouse, straightTo, centerOf} from "@nut-tree/nut-js";
import {useOcrPlugin, configure, LanguageModelType, preloadLanguages, Language} from "@nut-tree/plugin-ocr";

// Activate the OCR plugin, then configure it once at startup
useOcrPlugin();
configure({
    dataPath: "./ocr-data",  // Where to store language models
    languageModelType: LanguageModelType.BEST
});

// Preload the language model to avoid a delay on first use
await preloadLanguages([Language.English]);

// Find the text, move the mouse to it, and click
const button = await screen.find(singleWord("Submit"));
await mouse.move(straightTo(centerOf(button)));
await mouse.leftClick();

Finding Text on Screen

The singleWord function (from @nut-tree/nut-js) creates a text query. To configure matching behavior, pass a providerData object in the second argument to screen.find:

typescript

import {screen, mouse, singleWord, straightTo, centerOf} from "@nut-tree/nut-js";
import {useOcrPlugin, configure, Language, LanguageModelType, preloadLanguages} from "@nut-tree/plugin-ocr";

useOcrPlugin();
configure({
    dataPath: "/path/to/store/language/models",
    languageModelType: LanguageModelType.BEST
});

await preloadLanguages([Language.English, Language.German]);

screen.config.ocrConfidence = 0.8;  // Minimum confidence threshold
screen.config.autoHighlight = true; // Highlight matches for debugging

const location = await screen.find(singleWord("WebStorm"), {
    providerData: {
        lang: [Language.English, Language.German],
        partialMatch: false,   // Require exact word match
        caseSensitive: false   // Ignore case
    }
});

await mouse.move(straightTo(centerOf(location)));

Provider Options

Option	Type	Description
`lang`	`Language[]`	Languages to use for recognition
`partialMatch`	`boolean`	Allow matching part of a word
`caseSensitive`	`boolean`	Whether to match case exactly
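
One way to picture how the two boolean options combine is the small model below. It is an illustrative sketch only, not the plugin's implementation: `MatchOptions` and `wordMatches` are hypothetical names, and it assumes `partialMatch` behaves like a substring test against each recognized word.

```typescript
// Hypothetical model of the matching options; NOT the plugin's actual code.
interface MatchOptions {
    partialMatch: boolean;
    caseSensitive: boolean;
}

// Returns true when a recognized word would count as a hit for the query
// under the assumed semantics (substring test when partialMatch is true,
// full equality otherwise).
function wordMatches(recognized: string, query: string, options: MatchOptions): boolean {
    const word = options.caseSensitive ? recognized : recognized.toLowerCase();
    const needle = options.caseSensitive ? query : query.toLowerCase();
    return options.partialMatch ? word.includes(needle) : word === needle;
}

// OCR output often carries stray punctuation, e.g. "Submit:" instead of "Submit".
const strict = wordMatches("Submit:", "submit", { partialMatch: false, caseSensitive: false });
const relaxed = wordMatches("Submit:", "submit", { partialMatch: true, caseSensitive: false });
console.log(strict, relaxed); // false true
```

Under this model, an exact match rejects the trailing colon while a partial match tolerates it, which is the situation the `partialMatch` option is meant for.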

Reading Text from a Region

Extract text from a specific area of the screen:

typescript

import {getActiveWindow, screen} from "@nut-tree/nut-js";
import {useOcrPlugin, configure, LanguageModelType, TextSplit, preloadLanguages, Language} from "@nut-tree/plugin-ocr";

useOcrPlugin();
configure({
    dataPath: "/path/to/store/language/models",
    languageModelType: LanguageModelType.BEST
});

await preloadLanguages([Language.English, Language.German]);

const activeWindow = await getActiveWindow();

// Read text, split by lines
const text = await screen.read({
    searchRegion: activeWindow.region,
    split: TextSplit.LINE
});

console.log(text);
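
Reading the whole window is not always necessary. As a rough sketch, a smaller rectangle can be derived from the window's region before reading. The `Rect` interface and `topStrip` helper are hypothetical; they only reuse the field names of a nut.js region (left, top, width, height) and do not depend on the library.

```typescript
// Plain rectangle shape using the same field names as a nut.js region.
// Both the interface and the helper are hypothetical, for illustration only.
interface Rect {
    left: number;
    top: number;
    width: number;
    height: number;
}

// Returns the top strip of a rectangle, e.g. a title bar or toolbar area.
// `fraction` is the share of the height to keep, clamped to the range 0.01 to 1.
function topStrip(rect: Rect, fraction: number): Rect {
    const share = Math.min(Math.max(fraction, 0.01), 1);
    return {
        left: rect.left,
        top: rect.top,
        width: rect.width,
        height: Math.round(rect.height * share),
    };
}

const windowRect: Rect = { left: 100, top: 50, width: 800, height: 600 };
const header = topStrip(windowRect, 0.25);
console.log(header); // { left: 100, top: 50, width: 800, height: 150 }
```

The resulting values could then be turned into a nut.js `Region` (for example via `new Region(header.left, header.top, header.width, header.height)`) and passed as `searchRegion`.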

TextSplit Options

Value	Description
`TextSplit.SYMBOL`	Individual characters
`TextSplit.WORD`	Words
`TextSplit.LINE`	Lines
`TextSplit.PARAGRAPH`	Paragraphs
`TextSplit.BLOCK`	Text blocks
`TextSplit.NONE`	All text as one string
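
Once text has been read, data extraction is ordinary string processing. The sketch below assumes the OCR result has already been reduced to an array of plain strings, one per line; the exact return shape of `screen.read` for each split mode is not covered here, so verify it before wiring this in. `extractPrices` is a hypothetical helper, and the dollar-amount pattern is just one example format.

```typescript
// Hypothetical post-processing step for OCR output.
// Assumes `lines` holds plain strings, one per recognized line.
function extractPrices(lines: string[]): number[] {
    const prices: number[] = [];
    for (const line of lines) {
        // Matches amounts such as "$1,299.00" or "$ 12.50".
        // A fresh regex per line keeps the global lastIndex at 0.
        const pattern = /\$\s?(\d+(?:,\d{3})*(?:\.\d{1,2})?)/g;
        let match: RegExpExecArray | null;
        while ((match = pattern.exec(line)) !== null) {
            // Commas are treated as thousands separators and removed.
            prices.push(Number(match[1].replace(/,/g, "")));
        }
    }
    return prices;
}

const sample = ["Laptop  $1,299.00", "Shipping: $ 12.50", "Total due today"];
console.log(extractPrices(sample)); // [ 1299, 12.5 ]
```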

Configuration Reference

OCR Plugin Configuration

typescript

configure({
    dataPath: string,              // Directory for language model files
    languageModelType: LanguageModelType  // FAST, DEFAULT, or BEST
});

Language Model Types

Type	Speed	Accuracy	Use Case
`FAST`	Fastest	Lower	Quick checks, known fonts
`DEFAULT`	Balanced	Good	General use
`BEST`	Slower	Highest	Complex layouts, small text

Tips

Preload languages before searching to avoid delays on first use
Use search regions to limit where OCR looks; a smaller area is faster to process and leaves less room for false matches
Lower confidence (screen.config.ocrConfidence) if text isn't being found, but expect more false positives
Enable autoHighlight during development to see what's being matched
partialMatch: true helps when text might include extra characters or whitespace
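
The confidence tip above can be turned into a simple retry pattern: start strict and relax the threshold only when nothing is found. The following is a minimal sketch with no dependency on nut.js. `findWithFallback` and `fakeFinder` are hypothetical names, and the sketch assumes the lookup rejects (throws) when no match is found.

```typescript
// A lookup that takes a confidence threshold and rejects when nothing is found.
type Finder<T> = (confidence: number) => Promise<T>;

// Hypothetical helper: tries the lookup at each threshold in order and returns
// the first hit together with the threshold that produced it.
async function findWithFallback<T>(
    find: Finder<T>,
    thresholds: number[]
): Promise<{ result: T; confidence: number }> {
    let lastError: unknown = new Error("no thresholds provided");
    for (const confidence of thresholds) {
        try {
            const result = await find(confidence);
            return { result, confidence };
        } catch (error) {
            lastError = error; // remember the failure and try the next threshold
        }
    }
    throw lastError;
}

// Stand-in finder for demonstration: only "succeeds" at 0.7 or below.
const fakeFinder: Finder<string> = async (confidence) => {
    if (confidence > 0.7) {
        throw new Error(`no match at confidence ${confidence}`);
    }
    return "Submit";
};

findWithFallback(fakeFinder, [0.9, 0.8, 0.7]).then(({ result, confidence }) => {
    console.log(`found "${result}" at confidence ${confidence}`);
});
```

In a real script, the finder could set `screen.config.ocrConfidence` to the given value and then call `screen.find(singleWord("Submit"))`. Returning the threshold that succeeded makes it easier to spot matches that only appeared at low confidence and may be false positives.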

Supported Languages

The plugin supports 100+ languages. Common ones include:

Language.English, Language.German, Language.French, Language.Spanish, Language.Italian, Language.Portuguese, Language.Dutch, Language.Polish, Language.Russian, Language.Japanese, Language.ChineseSimplified, Language.ChineseTraditional, Language.Korean, Language.Arabic, Language.Hindi

See the full list in the OCR plugin documentation.