The OCR plugin enables you to find and read text directly from the screen. This is useful when automating applications where text is rendered dynamically, or when maintaining reference images isn't practical.
When to Use OCR
- Dynamic labels that change based on data (prices, usernames, timestamps)
- Localized applications where button text varies by language
- Data extraction from screens that don't expose their content programmatically
- Verification that expected text appears after an action
For static UI elements, image matching is often faster and more reliable.
Installation
npm i @nut-tree/plugin-ocr
Quick Start
import {screen, singleWord, mouse, straightTo, centerOf} from "@nut-tree/nut-js";
import {useOcrPlugin, configure, LanguageModelType, preloadLanguages, Language} from "@nut-tree/plugin-ocr";
// Register the OCR plugin, as the other examples in this guide do
useOcrPlugin();
// Configure once at startup
configure({
    dataPath: "./ocr-data", // Where to store language models
    languageModelType: LanguageModelType.BEST
});
await preloadLanguages([Language.English]);
// Find the text and move the mouse to it
const button = await screen.find(singleWord("Submit"));
await mouse.move(straightTo(centerOf(button)));
Finding Text on Screen
The singleWord function (from @nut-tree/nut-js) creates a text query. Configure matching behavior via providerData:
import {screen, mouse, singleWord, straightTo, centerOf} from "@nut-tree/nut-js";
import {useOcrPlugin, configure, Language, LanguageModelType, preloadLanguages} from "@nut-tree/plugin-ocr";
useOcrPlugin();
configure({
    dataPath: "/path/to/store/language/models",
    languageModelType: LanguageModelType.BEST
});
await preloadLanguages([Language.English, Language.German]);
screen.config.ocrConfidence = 0.8; // Minimum confidence threshold
screen.config.autoHighlight = true; // Highlight matches for debugging
const location = await screen.find(singleWord("WebStorm"), {
    providerData: {
        lang: [Language.English, Language.German],
        partialMatch: false, // Require exact word match
        caseSensitive: false // Ignore case
    }
});
await mouse.move(straightTo(centerOf(location)));
Provider Options
| Option | Type | Description |
|---|---|---|
| lang | Language[] | Languages to use for recognition |
| partialMatch | boolean | Allow matching part of a word |
| caseSensitive | boolean | Whether to match case exactly |
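To make the two boolean options concrete, here is a minimal sketch of the comparison they plausibly control. It is not the plugin's implementation: matchesQuery and MatchOptions are hypothetical names, and treating partialMatch as a substring test is an assumption based on the table's wording.

```typescript
// Hypothetical stand-in for the matching options passed via providerData.
interface MatchOptions {
    partialMatch: boolean;
    caseSensitive: boolean;
}

// Assumed semantics: compare one recognized word against the query text.
function matchesQuery(recognized: string, query: string, options: MatchOptions): boolean {
    const word = options.caseSensitive ? recognized : recognized.toLowerCase();
    const wanted = options.caseSensitive ? query : query.toLowerCase();
    // With partialMatch, the query only needs to appear somewhere inside the word.
    return options.partialMatch ? word.includes(wanted) : word === wanted;
}

const strict: MatchOptions = { partialMatch: false, caseSensitive: false };
const loose: MatchOptions = { partialMatch: true, caseSensitive: false };

console.log(matchesQuery("webstorm", "WebStorm", strict));  // true
console.log(matchesQuery("WebStorm:", "WebStorm", strict)); // false
console.log(matchesQuery("WebStorm:", "WebStorm", loose));  // true
```

The second and third calls show why a partial match can help when OCR attaches punctuation to a word.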
Reading Text from a Region
Extract text from a specific area of the screen:
import {getActiveWindow, screen} from "@nut-tree/nut-js";
import {useOcrPlugin, configure, LanguageModelType, TextSplit, preloadLanguages, Language} from "@nut-tree/plugin-ocr";
useOcrPlugin();
configure({
    dataPath: "/path/to/store/language/models",
    languageModelType: LanguageModelType.BEST
});
await preloadLanguages([Language.English, Language.German]);
const activeWindow = await getActiveWindow();
// Read text, split by lines
const text = await screen.read({
    searchRegion: activeWindow.region,
    split: TextSplit.LINE
});
console.log(text);
TextSplit Options
| Value | Description |
|---|---|
| TextSplit.SYMBOL | Individual characters |
| TextSplit.WORD | Words |
| TextSplit.LINE | Lines |
| TextSplit.PARAGRAPH | Paragraphs |
| TextSplit.BLOCK | Text blocks |
| TextSplit.NONE | All text as one string |
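Once text has been read with TextSplit.LINE, pulling a value out of it is ordinary string processing. The sketch below assumes the result is available as one newline-separated string, which this guide does not confirm; adapt the split step if screen.read returns something else. findLabeledValue and the sample text are hypothetical.

```typescript
// Find the first line of the form "<label>: <value>" and return the value.
function findLabeledValue(ocrText: string, label: string): string | undefined {
    const prefix = label.toLowerCase() + ":";
    for (const rawLine of ocrText.split(/\r?\n/)) {
        const line = rawLine.trim();
        if (line.toLowerCase().startsWith(prefix)) {
            return line.slice(prefix.length).trim();
        }
    }
    return undefined;
}

// Made-up sample standing in for OCR output.
const sampleText = "Order #1042\n  Total: $12.50  \nStatus: Shipped";

console.log(findLabeledValue(sampleText, "Total"));    // $12.50
console.log(findLabeledValue(sampleText, "status"));   // Shipped
console.log(findLabeledValue(sampleText, "Customer")); // undefined
```

Matching the label case-insensitively and trimming whitespace makes the lookup a little more tolerant of OCR noise.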
Configuration Reference
OCR Plugin Configuration
configure({
    dataPath: string, // Directory for language model files
    languageModelType: LanguageModelType // FAST, DEFAULT, or BEST
});
Language Model Types
| Type | Speed | Accuracy | Use Case |
|---|---|---|---|
| FAST | Fastest | Lower | Quick checks, known fonts |
| DEFAULT | Balanced | Good | General use |
| BEST | Slower | Highest | Complex layouts, small text |
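One way to apply this table is to choose the model type per environment, for example FAST in CI and BEST for local debugging. The sketch below only resolves the settings; resolveOcrSettings and the OCR_MODEL and OCR_DATA_PATH variable names are hypothetical, and the fallback path is borrowed from the Quick Start. The string names mirror the three LanguageModelType members listed above.

```typescript
type ModelTypeName = "FAST" | "DEFAULT" | "BEST";

interface OcrSettings {
    dataPath: string;
    modelType: ModelTypeName;
}

const MODEL_TYPE_NAMES: ModelTypeName[] = ["FAST", "DEFAULT", "BEST"];

// Resolve settings from an environment-like map, falling back to DEFAULT
// when the variable is missing or does not name a known model type.
function resolveOcrSettings(env: Record<string, string | undefined>): OcrSettings {
    const requested = (env["OCR_MODEL"] ?? "").toUpperCase();
    const modelType = MODEL_TYPE_NAMES.find((name) => name === requested) ?? "DEFAULT";
    return { dataPath: env["OCR_DATA_PATH"] ?? "./ocr-data", modelType };
}

const ciSettings = resolveOcrSettings({ OCR_MODEL: "fast" });
const fallbackSettings = resolveOcrSettings({ OCR_MODEL: "turbo", OCR_DATA_PATH: "/tmp/ocr" });

console.log(ciSettings.modelType, ciSettings.dataPath);             // FAST ./ocr-data
console.log(fallbackSettings.modelType, fallbackSettings.dataPath); // DEFAULT /tmp/ocr
```

The resolved name can then be mapped to the matching LanguageModelType member before calling configure.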
Tips
- Preload languages before searching to avoid delays on first use
- Use search regions to limit where OCR looks, which makes recognition both faster and more accurate
- Lower confidence (screen.config.ocrConfidence) if text isn't being found, but expect more false positives
- Enable autoHighlight during development to see what's being matched
- partialMatch: true helps when text might include extra characters or whitespace
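The search-region tip is easy to apply with a small helper that carves a sub-area out of a larger one. The sketch below uses a plain Rect with left, top, width and height fields as a stand-in for a screen region; topRightQuarter is a hypothetical helper and the window size is made up.

```typescript
// Plain stand-in for a screen region (hypothetical type).
interface Rect {
    left: number;
    top: number;
    width: number;
    height: number;
}

// Return the top-right quarter of an area, e.g. where a status badge sits.
function topRightQuarter(area: Rect): Rect {
    const width = Math.floor(area.width / 2);
    const height = Math.floor(area.height / 2);
    return {
        left: area.left + area.width - width, // keep the right edge aligned
        top: area.top,
        width,
        height,
    };
}

const windowArea: Rect = { left: 100, top: 50, width: 801, height: 600 };
const searchArea = topRightQuarter(windowArea);

console.log(searchArea.left, searchArea.top, searchArea.width, searchArea.height); // 501 50 400 300
```

Assuming the four values are turned into a region object and passed as searchRegion, OCR only has to scan a quarter of the window.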
Supported Languages
The plugin supports 100+ languages. Common ones include:
Language.English, Language.German, Language.French, Language.Spanish, Language.Italian, Language.Portuguese, Language.Dutch, Language.Polish, Language.Russian, Language.Japanese, Language.ChineseSimplified, Language.ChineseTraditional, Language.Korean, Language.Arabic, Language.Hindi
See the full list in the OCR plugin documentation.
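For localized applications, the languages to preload can be derived from the locale under test. The sketch below maps locale tags to some of the member names listed above; languageNamesForLocale and the mapping table are hypothetical, always including English is an illustrative choice, and it assumes the returned names can be used to look up the corresponding Language members.

```typescript
// Hypothetical mapping from primary language subtags to Language member names.
const LANGUAGE_NAME_BY_SUBTAG = new Map<string, string>([
    ["en", "English"],
    ["de", "German"],
    ["fr", "French"],
    ["es", "Spanish"],
    ["ja", "Japanese"],
]);

// Resolve a locale tag such as "de-DE" to the language names to preload.
// English is always included as a fallback for untranslated labels.
function languageNamesForLocale(locale: string): string[] {
    const subtag = locale.toLowerCase().split(/[-_]/)[0] ?? "";
    const name = LANGUAGE_NAME_BY_SUBTAG.get(subtag);
    return name !== undefined && name !== "English" ? ["English", name] : ["English"];
}

console.log(languageNamesForLocale("de-DE").join(", ")); // English, German
console.log(languageNamesForLocale("en_US").join(", ")); // English
console.log(languageNamesForLocale("xx").join(", "));    // English
```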