On Screen Search

Learn how to search for images on your screen and use them for automation

Image Search


"A picture is worth a thousand words"

When it comes to desktop automation, this holds true as well.
nut.js allows you to locate template images on your screen, a key capability for automation.

ImageFinder Providers


To do so, we will have to install an additional package, providing the actual implementation to perform image comparison.
Otherwise, all functions relying on image matching will throw an error like Error: No ImageFinder registered.

One available option would be @nut-tree/template-matcher:

npm i @nut-tree/template-matcher

To use this provider package, simply require it in your code, e.g. index.js:

const { screen, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher"); // THIS IS NEW

(async () => {
  try {
    await screen.find(imageResource("img.png"));
  } catch (e) {
    console.error(e);
  }
})();

Provided Functionality


find, findAll and waitFor are the main functions when it comes to image search.

While find and findAll try to locate an image on screen at that very moment, waitFor will repeatedly scan the screen for the image until a certain timeout is reached.

By default, image search is carried out over multiple scales.

All above-mentioned functions are very powerful helpers when automating more complex tasks, so let's see how we can use them to our advantage!

Working with template images


In order to search for an image on your screen, we have to provide a template Image.

These images can either be loaded via their full path and loadImage, or relative to a configurable resource directory.

Resource Directory


When working with a resource directory, you can reference template images by filename, omitting the full path.
When a template image is loaded, its filename is resolved relative to screen.config.resourceDirectory.

screen.config.resourceDirectory = "/path/to/my/template/images"

If not configured explicitly, screen.config.resourceDirectory is set to the current working directory.
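
To make the lookup rule concrete, here is a minimal sketch of how a filename could be resolved against a resource directory. resolveTemplatePath is a hypothetical helper written for illustration, not part of nut.js, and it assumes resolution behaves like a plain path join with the current working directory as fallback.

```javascript
const path = require("path");

// Hypothetical helper (NOT a nut.js API): resolves a template filename
// against a resource directory, falling back to the current working
// directory when no directory is given.
function resolveTemplatePath(fileName, resourceDirectory = process.cwd()) {
  return path.join(resourceDirectory, fileName);
}

// Explicitly configured directory
console.log(resolveTemplatePath("mouse.png", "/path/to/my/template/images"));

// No directory configured: relative to the current working directory
console.log(resolveTemplatePath("mouse.png"));
```

With nut.js itself, we never build this path by hand. We only set screen.config.resourceDirectory and pass the bare filename to imageResource.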

Loading Images from Resource Directory


Instead of using loadImage, so-called image resources are loaded via imageResource.

Fetch Images from a Remote Host


fetchFromUrl allows you to pass in a URL to an image located on a remote host, which will be fetched and returned as a nut.js Image.

find


Template images are Images either directly loaded using their full path, relative to a configurable resource directory or from a remote host via fetchFromUrl.

Finding images


Let's dissect how screen.find works by looking at a sample snippet:

const { screen, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher");

(async () => {
    screen.config.resourceDirectory = "/resource/path";
    try {
        const region = await screen.find(imageResource("mouse.png"));
        console.log(region);
    } catch (e) {
        console.error(e);
    }
})();

First things first, we're setting up our imports on lines 1 and 2.

Line 5 sets our resourceDirectory, although the most interesting thing happens in line 7: actually searching the image.

screen.find will scan your main screen for the provided template image, and if it finds a match, it'll return the Region it located the template image in.
Images are matched on a per-pixel basis.
The required share of matching pixels is configurable via the confidence property on the config object.
confidence is expected to be a value between 0 and 1. It defaults to 0.99, which corresponds to a 99% match.
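
The actual comparison is carried out by the ImageFinder provider, so the following is only a simplified sketch of what "share of matching pixels compared against a confidence threshold" means. matchRatio and isMatch are hypothetical names, and the flat arrays of grayscale values merely stand in for real image data.

```javascript
// Simplified illustration, NOT the algorithm used by nut.js or its providers.
// Both arguments are flat arrays of grayscale pixel values of equal length.
function matchRatio(candidate, template) {
  if (template.length === 0 || candidate.length !== template.length) {
    throw new Error("Images must be non-empty and have the same number of pixels");
  }
  let matching = 0;
  for (let i = 0; i < template.length; i++) {
    if (candidate[i] === template[i]) {
      matching++;
    }
  }
  return matching / template.length;
}

// A candidate counts as a match once the ratio reaches the confidence.
function isMatch(candidate, template, confidence = 0.99) {
  return matchRatio(candidate, template) >= confidence;
}

const template = [0, 0, 255, 255, 128, 128, 64, 64, 32, 32];
const candidate = [0, 0, 255, 255, 128, 128, 64, 64, 32, 33]; // last pixel differs

console.log(matchRatio(candidate, template));   // 0.9
console.log(isMatch(candidate, template));      // false with the default of 0.99
console.log(isMatch(candidate, template, 0.9)); // true with a lowered confidence
```

The takeaway for real scripts is the trade-off: lowering confidence tolerates small rendering differences, but it also raises the risk of false positives.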

Note: nut.js currently does not support multi-monitor setups.

The Cross-Platform Trick


The resource directory might seem confusing at first, but it actually has a really nice side effect.
Imagine writing a cross-platform automation script where we're dealing with different UIs and therefore different template images.

Using the resource directory, we can configure our directory depending on our current platform:

screen.config.resourceDirectory = `/path/to/the/project/${process.platform}`;

This way, we can keep all our platform-specific template images in separate folders, but we don't have to actually care in our code.

By using the platform-dependent resource directory, we don't have to deal with platform-specific filenames.
The same filename will load the correct template image for the current platform, no further action required! 💪
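
As a small sketch of the idea, the helper below builds one directory per platform. platformResourceDirectory is a hypothetical function for illustration. The folder names come from process.platform, which reports values such as "darwin", "linux" or "win32".

```javascript
const path = require("path");

// Hypothetical helper (NOT a nut.js API): one sub-folder per platform,
// named after the value reported by process.platform.
function platformResourceDirectory(baseDir, platform = process.platform) {
  return path.join(baseDir, platform);
}

// The same call yields a different folder depending on the platform.
for (const platform of ["darwin", "linux", "win32"]) {
  console.log(platformResourceDirectory("/path/to/the/project", platform));
}

// On the machine running the script, the current platform is used.
console.log(platformResourceDirectory("/path/to/the/project"));
```

Assigning the result to screen.config.resourceDirectory once at startup is enough. Every later imageResource("mouse.png") call then picks up the image from the matching folder.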

Troubleshooting


In case we screwed up, nut.js will let us know by rejecting the returned promise.

Wrong resource directory


Searching for mouse.png failed. Reason: 'Error: Failed to load /foo/bar/mouse.png. Reason: 'Failed to load image from '/foo/bar/mouse.png''.'

No match


Searching for mouse.png failed. Reason: 'Error: No match with required confidence 0.99. Best match: 0 at (0, 0, 477, 328)'

Summary


  • nut.js provides a screen instance to search for template images on your screen.
  • The directory from which template images are loaded is configurable via the config object.
  • find will search your main screen for the template image, and if it finds a match, it'll return the Region it located the template image in.
  • The required share of matching pixels is configurable via the confidence property on the config object.

findAll


findAll is used very similarly to find.
The major difference between the two is the fact that findAll will return a list of all detected matches on your main screen.

Everything else mentioned for find applies to findAll as well.
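
Since findAll hands back a list, a typical next step is to order the matches and derive a point to act on for each one. The sketch below uses plain objects in place of real Regions so it runs without a screen. It assumes a Region exposes left, top, width and height, and inReadingOrder and regionCenter are hypothetical helpers. It also makes no assumption about the order in which findAll returns its matches, which is why it sorts explicitly.

```javascript
// Plain objects stand in for nut.js Regions (assumed shape: left, top,
// width, height), e.g. the result of `await screen.findAll(...)`.
const matches = [
  { left: 300, top: 40, width: 20, height: 20 },
  { left: 10, top: 200, width: 20, height: 20 },
  { left: 10, top: 40, width: 20, height: 20 },
];

// Hypothetical helper: sort top-to-bottom, then left-to-right.
// Copies the array first so the original list stays untouched.
function inReadingOrder(regions) {
  return [...regions].sort((a, b) => a.top - b.top || a.left - b.left);
}

// Hypothetical helper: center point of a region, e.g. as a click target.
function regionCenter(region) {
  return {
    x: region.left + region.width / 2,
    y: region.top + region.height / 2,
  };
}

const centers = inReadingOrder(matches).map(regionCenter);
console.log(centers);
```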

waitFor


Being able to locate images on our screen is a huge benefit when automating things, but in reality, we have to deal with timing.
waitFor is here to help by allowing us to specify a timeout in which we expect our template image to appear on screen!

Waiting for images


Let's tweak the snippet used in the find example just a little:

const { screen, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher");

(async () => {
    screen.config.resourceDirectory = "/resource/path";
    try {
        const region = await screen.waitFor(imageResource("mouse.png"));
        console.log(region);
    } catch (e) {
        console.error(e);
    }
})();

waitFor basically does exactly the same as find, but repeatedly over a specified period of time.

It'll scan your main screen for the given template image, but if it fails to find it, it'll simply give it another shot.
The interval in which these retries happen is configurable as well.

const { screen, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher");

(async () => {
    screen.config.resourceDirectory = "/resource/path";
    try {
        const region = await screen.waitFor(imageResource("mouse.png"), 5000, 1000);
        console.log(region);
    } catch (e) {
        console.error(e);
    }
})();

In the above snippet, we tell waitFor to look for our template image for at most five seconds, retrying every second.

If it still couldn't locate the image after the configured timeout in milliseconds, it'll reject.
Otherwise, it'll return the Region it located the image in, just like find.
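
The retry behaviour described here boils down to a polling loop. The following is a minimal sketch of that pattern, not the actual nut.js implementation. retryUntilTimeout and fakeFind are hypothetical names, and the fake search simply fails twice before succeeding so the example runs without a screen.

```javascript
// Sketch of the retry pattern; NOT the code nut.js uses internally.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `action` until it resolves, waiting `intervalMs` between attempts.
// Rejects once another attempt would no longer fit into `timeoutMs`.
async function retryUntilTimeout(action, timeoutMs, intervalMs) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    try {
      return await action();
    } catch (e) {
      if (Date.now() + intervalMs > deadline) {
        throw new Error(`Action timed out after ${timeoutMs} ms`);
      }
      await sleep(intervalMs);
    }
  }
}

// Stand-in for a search that only succeeds on the third attempt.
let attempts = 0;
const fakeFind = async () => {
  attempts++;
  if (attempts < 3) {
    throw new Error("No match");
  }
  return { left: 0, top: 0, width: 10, height: 10 };
};

const demo = retryUntilTimeout(fakeFind, 500, 50).then((region) => {
  console.log(`Found after ${attempts} attempts:`, region);
  return region;
});
```

Short intervals find the image sooner but scan the screen more often. Long intervals are cheaper but may notice the image late.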

Troubleshooting


Everything mentioned for find applies to waitFor as well.

Timeout


Action timed out after 5000 ms

Summary


  • waitFor will repeatedly search your main screen for the template image and if it finds a match, it'll return the Region it located the template image in.
  • If it can't locate the image, it'll retry the search in configurable intervals until it hits the configured timeout in milliseconds.

Cancelling waitFor


As we learned earlier, waitFor will repeatedly search our screen for a given template image.

This great flexibility does not come for free, so we might not want to wait for the timeout to fire before we can cancel the ongoing search.
nut.js follows the same approach to cancellation as the browser fetch API, using an AbortController.

Before we can actually look at a sample, we will have to install an additional package to our project:

npm i node-abort-controller

Now, let's take a look at a (rather artificial) example:

const { screen, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher");
const { AbortController } = require("node-abort-controller");

(async () => {
    const controller = new AbortController();
    // Schedule the abort up front so it fires while waitFor is still searching
    setTimeout(() => controller.abort(), 2000);
    try {
        await screen.waitFor(imageResource("test.png"), 5000, 1000, {abort: controller.signal});
    } catch (e) {
        // The aborted search rejects, so we handle the rejection here
        console.error(e);
    }
})();

We instantiate our AbortController in line 6 and pass its signal as an OptionalSearchParameter to waitFor.

waitFor has a timeout of 5000 milliseconds configured, retrying after 1000 milliseconds, but after 2000 milliseconds, we call abort() on our AbortController, which will cancel the ongoing search:

Action aborted by signal
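
To illustrate how a signal can interrupt a retry loop in general, here is a self-contained sketch. It is not how nut.js implements cancellation. abortableSleep and searchUntilAborted are hypothetical helpers, and the example relies on the global AbortController that recent Node.js versions provide instead of the node-abort-controller package.

```javascript
// Sketch of an abortable retry loop; NOT the nut.js implementation.
// Resolves after `ms`, or rejects as soon as `signal` is aborted.
function abortableSleep(ms, signal) {
  return new Promise((resolve, reject) => {
    if (signal.aborted) {
      reject(new Error("Action aborted by signal"));
      return;
    }
    const onAbort = () => {
      clearTimeout(timer);
      reject(new Error("Action aborted by signal"));
    };
    const timer = setTimeout(() => {
      signal.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal.addEventListener("abort", onAbort, { once: true });
  });
}

// Retries `action` until it succeeds or the signal is aborted.
async function searchUntilAborted(action, intervalMs, signal) {
  while (true) {
    try {
      return await action();
    } catch (e) {
      await abortableSleep(intervalMs, signal);
    }
  }
}

const controller = new AbortController();
setTimeout(() => controller.abort(), 100);

// The stand-in search never succeeds, so only the abort ends the loop.
const outcome = searchUntilAborted(
  async () => { throw new Error("No match"); },
  30,
  controller.signal
).catch((e) => e.message);

outcome.then((message) => console.log(message));
```

The important detail is that the wait between attempts listens to the signal, so calling abort() ends the loop right away instead of after the next retry.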

highlight


Especially during development, we might want to visually track what happens when executing our script.
When it comes to image search, it's one thing to see in e.g. the log that we found a match, but a visual indicator would be even better.

highlight is exactly this!

Configuration


highlight works by overlaying a Region of interest with an opaque highlight window.

Highlight duration and opacity are once again configurable properties on the screen.config object.

Highlighting regions


highlight receives a Region specifying the area to highlight.
It will then overlay the given region with an opaque highlight window.

const { screen, Region } = require("@nut-tree/nut-js");

(async () => {
    screen.config.highlightDurationMs = 3000;
    const highlightRegion = new Region(50, 100, 200, 300);
    await screen.highlight(highlightRegion);
})();

Auto Highlighting


The way the API is structured, it's really easy to highlight regions located by e.g. find:

const { screen, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher");

(async () => {
    screen.config.resourceDirectory = "/resource/path";
    screen.config.highlightDurationMs = 3000;
    await screen.highlight(screen.find(imageResource("image.png")));
})();

However, manually adding highlights is not only cumbersome, but also requires additional effort in case we want to remove it again before running our script in production.

Therefore, nut.js provides an auto-highlight mechanism which can be toggled via the autoHighlight config property.
Highlight during development, disable it in production!

const { screen, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher");

(async () => {
    screen.config.resourceDirectory = "/resource/path";
    screen.config.autoHighlight = true;
    screen.config.highlightDurationMs = 1500;
    await screen.find(imageResource("test.png"));
})();

With auto highlight turned on, we no longer have to care about manually highlighting find results.
Once find returns a valid Region, it will be highlighted.
And since waitFor reuses find, auto-highlight works there as well!
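
One way to get "highlight during development, disable it in production" without touching the script is to derive the setting from the environment. The sketch below assumes the common NODE_ENV convention, which is our assumption and not something nut.js prescribes. applyHighlightSettings is a hypothetical helper, and plain objects stand in for screen.config so the example runs without nut.js installed.

```javascript
// Hypothetical helper (NOT a nut.js API): switches auto highlighting on
// everywhere except in production. `config` is expected to look like
// screen.config, `env` like process.env.
function applyHighlightSettings(config, env = process.env) {
  const isProduction = env.NODE_ENV === "production";
  config.autoHighlight = !isProduction;
  config.highlightDurationMs = 1500;
  return config;
}

// Plain objects stand in for screen.config here.
const devConfig = applyHighlightSettings({}, { NODE_ENV: "development" });
const prodConfig = applyHighlightSettings({}, { NODE_ENV: "production" });

console.log(devConfig);
console.log(prodConfig);
```

In a real script, we would call applyHighlightSettings(screen.config) once before the first search.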

Summary


  • nut.js provides a way to visually debug image search results.
  • Both the highlight duration and the highlight window opacity are configurable via the config object.
  • Auto highlight will automatically highlight results returned from find.

Parameterize Search


find, findAll and waitFor accept OptionalSearchParameters to fine-tune the search.

This allows you to, for example, limit the search space to a certain portion of your screen:

const { screen, Region, OptionalSearchParameters, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher");
const { AbortController } = require("node-abort-controller");

(async () => {
    // Configure the position and size of the area you want nut.js to search
    const searchRegion = new Region(10, 10, 500, 500);

    // Configure the confidence nut.js requires before it reports a match
    const confidence = 0.88;

    // Configure an AbortController so that you can cancel the find operation at any time
    const controller = new AbortController();
    const { signal } = controller;

    // Feed your parameters into the OptionalSearchParameters constructor to make sure they fit the spec
    const fullSearchOptionsConfiguration = new OptionalSearchParameters(searchRegion, confidence, signal);

    // Schedule the abort before starting the search, otherwise it would only fire after find has settled
    const cancelFindTimeout = setTimeout(() => {
        controller.abort();
    }, 5000);

    try {
        // .find() will return the Region where it found a match based on your search parameters and provided Image data
        const matchRegion = await screen.find(imageResource("image.png"), fullSearchOptionsConfiguration);
        console.log(matchRegion);
    } catch (e) {
        console.error(e);
    } finally {
        // The search has settled, so the pending abort is no longer needed
        clearTimeout(cancelFindTimeout);
    }
})();

Multi-scale image search gives you resilience when switching between multiple screen resolutions, but it comes at a price.
Compared to searching on a single scale, searching through multiple scales might take substantially longer.

Depending on the task at hand, you might not need this additional flexibility and would rather benefit from faster execution.
See this benchmark for an example:

hyperfine --warmup 3 'node multi-scale.js' 'node single-scale.js' --show-output
Benchmark 1: node multi-scale.js
  Time (mean ± σ):     933.5 ms ±  10.4 ms    [User: 1647.4 ms, System: 433.8 ms]
  Range (min … max):   920.9 ms … 948.4 ms    10 runs

Benchmark 2: node single-scale.js
  Time (mean ± σ):     526.8 ms ±   9.3 ms    [User: 400.2 ms, System: 108.4 ms]
  Range (min … max):   514.3 ms … 544.4 ms    10 runs

Summary
  'node single-scale.js' ran
    1.77 ± 0.04 times faster than 'node multi-scale.js'
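
To see where the 1.77 in the summary comes from, we can redo the arithmetic on the two mean run times reported above. The variable names are ours. The numbers are copied from the hyperfine output.

```javascript
// Mean run times in milliseconds, copied from the hyperfine output.
const multiScaleMs = 933.5;
const singleScaleMs = 526.8;

// Speedup factor: how many times faster the single-scale run is.
const speedup = multiScaleMs / singleScaleMs;

// Absolute and relative time saved per run.
const savedMs = multiScaleMs - singleScaleMs;
const savedPercent = (savedMs / multiScaleMs) * 100;

console.log(speedup.toFixed(2));      // "1.77"
console.log(savedMs.toFixed(1));      // "406.7"
console.log(savedPercent.toFixed(1)); // "43.6"
```

In this particular benchmark, that is roughly 0.4 seconds saved per run. Whether it matters depends on how often a script searches the screen.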

© 2023