Learn how to search for images on your screen and use them for automation
"A picture is worth a thousand words"
When it comes to desktop automation, this holds true as well.
nut.js allows you to locate template images on your screen, a key capability for automation.
To do so, we have to install an additional package that provides the actual image comparison implementation.
Otherwise, all functions relying on image matching will throw an error like Error: No ImageFinder registered.
One available option is @nut-tree/template-matcher:
npm i @nut-tree/template-matcher
To use this provider package, simply require it in your code, e.g. in index.js:
const { screen, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher"); // THIS IS NEW

(async () => {
  try {
    await screen.find(imageResource("img.png"));
  } catch (e) {
    console.error(e);
  }
})();
find, findAll and waitFor are the main functions when it comes to image search.
While find and findAll try to locate an image on screen at the very moment, waitFor will repeatedly scan the screen for the image until a certain timeout is reached.
By default, image search is carried out over multiple scales.
All above-mentioned functions are very powerful helpers when automating more complex tasks, so let's see how we can use them to our advantage!
In order to search for an image on your screen, we have to provide a template Image.
These images can either be loaded via their full path using loadImage, or relative to a configurable resource directory.
When working with a resource directory, you can reference template images by filename, omitting the full path.
These filenames are resolved relative to screen.config.resourceDirectory.
screen.config.resourceDirectory = "/path/to/my/template/images"
If not configured explicitly, screen.config.resourceDirectory is set to the current working directory.
Instead of using loadImage, so-called image resources are loaded via imageResource.
fetchFromUrl allows you to pass in a URL to an image located on a remote host, which will be fetched and returned as a nut.js Image.
Template images are Images that are either loaded directly using their full path, loaded relative to a configurable resource directory, or fetched from a remote host via fetchFromUrl.
Let's dissect how screen.find works by looking at a sample snippet:
 1  const { screen, imageResource } = require("@nut-tree/nut-js");
 2  require("@nut-tree/template-matcher");
 3
 4  (async () => {
 5    screen.config.resourceDirectory = "/resource/path";
 6    try {
 7      const region = await screen.find(imageResource("mouse.png"));
 8      console.log(region);
 9    } catch (e) {
10      console.error(e);
11    }
12  })();
First things first, we're setting up our imports on lines 1 and 2.
Line 5 sets our resourceDirectory, but the most interesting thing happens on line 7: actually searching for the image.
screen.find will scan your main screen for the provided template image and, if it finds a match, return the Region in which it located the template image.
Images are matched on a per-pixel basis.
The required share of matching pixels is configurable via the confidence property on the config object.
confidence is expected to be a value between 0 and 1; it defaults to 0.99 (which corresponds to a 99% match).
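To build some intuition for what a confidence threshold means, here is a deliberately simplified sketch. It is not how the matcher plugin computes its score; it merely counts identical pixels in two equally sized grayscale images (represented as flat arrays) and compares the resulting ratio against a confidence value. The helpers matchRatio and isMatch are hypothetical and not part of nut.js.

```javascript
// Toy illustration only, NOT the algorithm used by nut.js or its plugins:
// the share of identical pixels is compared against a confidence threshold.
function matchRatio(template, candidate) {
  if (template.length !== candidate.length || template.length === 0) {
    throw new Error("Images must be non-empty and of equal size");
  }
  let matching = 0;
  for (let i = 0; i < template.length; i++) {
    if (template[i] === candidate[i]) matching++;
  }
  return matching / template.length;
}

// Hypothetical helper mirroring the documented default of 0.99.
function isMatch(template, candidate, confidence = 0.99) {
  return matchRatio(template, candidate) >= confidence;
}

const template = new Array(200).fill(255);
const candidate = [...template];
candidate[0] = 0; // 1 of 200 pixels differs, so the ratio is 0.995

console.log(isMatch(template, candidate));    // true: 0.995 >= 0.99
console.log(isMatch(template, candidate, 1)); // false: a perfect match is required
```

The takeaway is that lowering confidence tolerates more deviation between template and screen content, at the risk of false positives.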
Note: nut.js currently does not support multi-monitor setups.
The resource directory might seem confusing at first, but it actually has a really nice side effect.
Imagine writing a cross-platform automation script where we're dealing with different UIs and therefore different template images.
Using the resource directory, we can configure our directory depending on our current platform:
screen.config.resourceDirectory = `/path/to/the/project/${process.platform}`;
This way, we can keep all our platform-specific template images in separate folders, but we don't have to actually care in our code.
By using the platform-dependent resource directory, we don't have to deal with platform-specific filenames.
The same filename will load the correct template image for the current platform, no further action required! 💪
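If you would rather fail early when a platform has no template folder, the path construction can be wrapped in a small helper. This is only a sketch: platformResourceDirectory and the list of supported platforms are assumptions made for illustration, not part of nut.js.

```javascript
const path = require("path");

// Assumption: these are the platforms we ship template images for.
const SUPPORTED_PLATFORMS = ["darwin", "linux", "win32"];

// Hypothetical helper: builds the platform-specific resource directory
// and throws for platforms without a template folder.
function platformResourceDirectory(baseDir, platform = process.platform) {
  if (!SUPPORTED_PLATFORMS.includes(platform)) {
    throw new Error(`No template images available for platform '${platform}'`);
  }
  return path.join(baseDir, platform);
}

console.log(platformResourceDirectory("/path/to/the/project", "linux"));
```

The result could then be assigned to screen.config.resourceDirectory, exactly like the template string shown above.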
In case we screwed up, nut.js will let us know by rejecting the returned Promise, for example with one of these errors:
Searching for mouse.png failed. Reason: 'Error: Failed to load /foo/bar/mouse.png. Reason: 'Failed to load image from '/foo/bar/mouse.png''.'
Searching for mouse.png failed. Reason: 'Error: No match with required confidence 0.99. Best match: 0 at (0, 0, 477, 328)'
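In a catch block, it can be handy to tell these two failure modes apart. The following sketch does so by inspecting the message text. Note that it relies purely on the wording of the two messages shown above, which is an assumption that may not hold across nut.js versions; classifySearchError is a hypothetical helper.

```javascript
// Hypothetical helper: classifies a rejected search by its message text.
// Assumption: the message wording matches the samples shown in this guide.
function classifySearchError(error) {
  const message = String(error && error.message ? error.message : error);
  if (message.includes("Failed to load")) {
    return "template-not-loadable"; // wrong path or resource directory
  }
  if (message.includes("No match with required confidence")) {
    return "no-match"; // template loaded, but not found on screen
  }
  return "unknown";
}

const sample = new Error(
  "Searching for mouse.png failed. Reason: 'Error: No match with required confidence 0.99. Best match: 0 at (0, 0, 477, 328)'"
);
console.log(classifySearchError(sample)); // no-match
```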
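In summary: use the screen instance to search for template images on your screen. The resource directory is configured via the screen.config object, and the required match quality via the confidence property on the config object.
findAll is used very similarly to find.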
The major difference between the two is that findAll will return a list of all detected matches on your main screen.
Everything else mentioned for find applies to findAll as well.
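Since findAll yields a list, a typical follow-up step is to bring the matches into a predictable order before acting on them. The sketch below sorts matches from top to bottom, then left to right. Plain objects stand in for the Region instances so the snippet runs without a screen; the field names left, top, width and height are assumed to mirror those of Region, and sortTopToBottom is a hypothetical helper.

```javascript
// Hypothetical helper: orders matches top-to-bottom, then left-to-right,
// without mutating the original list.
function sortTopToBottom(regions) {
  return [...regions].sort((a, b) => a.top - b.top || a.left - b.left);
}

// Stand-ins for the Regions a findAll call might resolve to.
const matches = [
  { left: 300, top: 200, width: 32, height: 32 },
  { left: 10, top: 200, width: 32, height: 32 },
  { left: 50, top: 40, width: 32, height: 32 },
];

for (const region of sortTopToBottom(matches)) {
  console.log(`match at (${region.left}, ${region.top})`);
}
```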
Being able to locate images on our screen is a huge benefit when automating things, but in reality, we also have to deal with timing.
waitFor is here to help by allowing us to specify a timeout within which we expect our template image to appear on screen!
Let's tweak the snippet used in the find example just a little:
const { screen, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher");

(async () => {
  screen.config.resourceDirectory = "/resource/path";
  try {
    const region = await screen.waitFor(imageResource("mouse.png"));
    console.log(region);
  } catch (e) {
    console.error(e);
  }
})();
waitFor does essentially the same as find, but multiple times over a specified period of time.
It'll scan your main screen for the given template image, and if it fails to find it, it'll simply give it another shot.
The interval at which these retries happen is configurable as well.
const { screen, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher");

(async () => {
  screen.config.resourceDirectory = "/resource/path";
  try {
    const region = await screen.waitFor(imageResource("mouse.png"), 5000, 1000);
    console.log(region);
  } catch (e) {
    console.error(e);
  }
})();
In the above snippet, we tell waitFor to look for our template image for at most five seconds, retrying every second.
If it still can't locate the image after the configured timeout (given in milliseconds), it'll reject.
Otherwise, it'll return the Region it located the image in, just like find.
Everything mentioned for find applies to waitFor as well.
Action timed out after 5000 ms
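To make the retry behaviour more tangible, here is a generic polling helper that mimics the idea behind waitFor. It is a sketch under assumptions, not the nut.js implementation: retryUntil is a hypothetical function, and a fake search that succeeds on its third attempt stands in for screen.find.

```javascript
// Generic polling sketch, NOT nut.js code: retries `attempt` every
// `intervalMs` until it resolves or `timeoutMs` has elapsed.
async function retryUntil(attempt, timeoutMs, intervalMs) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    try {
      return await attempt();
    } catch (e) {
      // Give up if the next retry would start after the deadline.
      if (Date.now() + intervalMs > deadline) {
        throw new Error(`Action timed out after ${timeoutMs} ms`);
      }
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
}

// Stand-in for an image search that only succeeds on the third attempt.
let calls = 0;
const fakeSearch = async () => {
  calls++;
  if (calls < 3) throw new Error("No match");
  return { left: 0, top: 0, width: 10, height: 10 };
};

retryUntil(fakeSearch, 500, 10)
  .then((region) => console.log(`found after ${calls} attempts`, region))
  .catch((e) => console.error(e.message));
```

A short interval finds the image sooner once it appears, but every retry is a full image search, so it also costs more CPU time.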
waitFor will repeatedly search your main screen for the template image and, if it finds a match, return the Region in which it located the template image.
As we learned earlier, waitFor will repeatedly search our screen for a given template image.
This great flexibility does not come for free, so we might not want to wait for the timeout to fire before we can cancel the ongoing search.
nut.js follows the same approach to cancellation as the browser fetch API, using an AbortController.
Before we can actually look at a sample, we have to install an additional package in our project:
npm i node-abort-controller
Now, let's take a look at a (rather artificial) example:
1  const { screen, imageResource } = require("@nut-tree/nut-js");
2  require("@nut-tree/template-matcher");
3  const { AbortController } = require("node-abort-controller");
4
5  (async () => {
6    const controller = new AbortController();
7    screen.waitFor(imageResource("test.png"), 5000, 1000, {abort: controller.signal}).catch((e) => console.error(e));
8    setTimeout(() => controller.abort(), 2000);
9  })();
We instantiate our AbortController in line 6 and pass its signal as an OptionalSearchParameter to waitFor.
waitFor has a timeout of 5000 milliseconds configured, retrying after 1000 milliseconds, but after 2000 milliseconds, we call abort() on our AbortController, which will cancel the ongoing search:
Action aborted by signal
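If you have not worked with abort signals before, the following standalone sketch shows the general pattern of reacting to one. It does not use nut.js at all: abortableDelay is a hypothetical helper, and the snippet assumes a Node.js version that ships a global AbortController (15 or newer), so no extra package is required to run it.

```javascript
// Hypothetical helper: resolves after `ms`, or rejects early once `signal`
// is aborted. Long-running operations can follow the same pattern.
function abortableDelay(ms, signal) {
  return new Promise((resolve, reject) => {
    if (signal.aborted) {
      reject(new Error("Action aborted by signal"));
      return;
    }
    const onAbort = () => {
      clearTimeout(timer); // stop waiting, nothing left to do
      reject(new Error("Action aborted by signal"));
    };
    const timer = setTimeout(() => {
      signal.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    signal.addEventListener("abort", onAbort, { once: true });
  });
}

// Assumption: global AbortController (Node.js 15+).
const controller = new AbortController();
abortableDelay(5000, controller.signal).catch((e) => console.error(e.message));
setTimeout(() => controller.abort(), 20); // cancel long before the 5 s elapse
```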
waitFor is cancelable using an AbortController.
Especially during development, we might want to visually track what happens when executing our script.
When it comes to image search, it's one thing to see in e.g. the log that we found a match, but a visual indicator would be even better.
highlight is exactly this!
highlight works by overlaying a Region of interest with an opaque highlight window.
Highlight duration and opacity are once again configurable properties on the screen.config object.
highlight receives a Region specifying the area to highlight.
It will then overlay the given region with an opaque highlight window.
const { screen, Region } = require("@nut-tree/nut-js");

(async () => {
  screen.config.highlightDurationMs = 3000;
  const highlightRegion = new Region(50, 100, 200, 300);
  await screen.highlight(highlightRegion);
})();
The way the API is structured, it's really easy to highlight regions located by e.g. find:
const { screen, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher");

(async () => {
  screen.config.resourceDirectory = "/resource/path";
  screen.config.highlightDurationMs = 3000;
  await screen.highlight(screen.find(imageResource("image.png")));
})();
However, manually adding highlights is not only cumbersome, but also requires additional effort in case we want to remove them again before running our script in production.
Therefore, nut.js provides an auto-highlight mechanism which can be toggled via the autoHighlight config property.
Highlight during development, disable it in production!
const { screen, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher");

(async () => {
  screen.config.resourceDirectory = "/resource/path";
  screen.config.autoHighlight = true;
  screen.config.highlightDurationMs = 1500;
  await screen.find(imageResource("test.png"));
})();
With auto-highlight turned on, we no longer have to care about manually highlighting find results.
Once find returns a valid Region, it will be highlighted.
And since waitFor reuses find, auto-highlight works there as well!
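One way to get "highlight during development, disable it in production" without touching the script is to derive the setting from the environment. The sketch below assumes the common NODE_ENV convention; highlightSettings is a hypothetical helper, while autoHighlight and highlightDurationMs are the config properties used in the snippet above.

```javascript
// Hypothetical helper: derives highlight settings from the environment.
// Assumption: NODE_ENV === "production" marks a production run.
function highlightSettings(nodeEnv = process.env.NODE_ENV) {
  return {
    autoHighlight: nodeEnv !== "production",
    highlightDurationMs: 1500,
  };
}

console.log(highlightSettings("development")); // autoHighlight: true
console.log(highlightSettings("production"));  // autoHighlight: false

// With nut.js loaded, the settings could be applied like this:
// Object.assign(screen.config, highlightSettings());
```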
In summary: highlighting is configured via the screen.config object, and auto-highlight applies to the results of find.
find, findAll and waitFor accept OptionalSearchParameters to fine-tune the search.
This allows you, for example, to limit the search space to a certain portion of your screen:
const { screen, Region, OptionalSearchParameters, imageResource } = require("@nut-tree/nut-js");
require("@nut-tree/template-matcher");
const { AbortController } = require("node-abort-controller");

(async () => {
  // Configure the position and size of the area you want nut.js to search
  const searchRegion = new Region(10, 10, 500, 500);

  // Configure the confidence nut.js requires before reporting a match
  const confidence = 0.88;

  // Configure an AbortController so that you can cancel the find operation at any time
  const controller = new AbortController();
  const { signal } = controller;

  // Feed your parameters into the OptionalSearchParameters constructor to make sure they fit the spec
  const fullSearchOptionsConfiguration = new OptionalSearchParameters(searchRegion, confidence, signal);

  // Start the timer before searching, so the search is cancelled if it takes longer than five seconds
  const cancelFindTimeout = setTimeout(() => {
    controller.abort();
  }, 5000);

  try {
    // .find() will return the Region where it found a match based on your search parameters and provided Image data
    const matchRegion = await screen.find(imageResource("image.png"), fullSearchOptionsConfiguration);
    console.log(matchRegion);
  } catch (e) {
    console.error(e);
  } finally {
    // The search has settled, so the pending abort is no longer needed
    clearTimeout(cancelFindTimeout);
  }
})();
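Hard-coded coordinates like the ones above tend to break when the screen size changes. As a sketch, a search area can instead be derived from the screen dimensions. halfScreen is a hypothetical helper that returns plain numbers; the assumption is that you would pass them on to new Region(left, top, width, height), as in the snippet above.

```javascript
// Hypothetical helper: returns the left or right half of a screen as
// { left, top, width, height }, suitable for building a search Region.
function halfScreen(screenWidth, screenHeight, side) {
  const halfWidth = Math.floor(screenWidth / 2);
  if (side === "left") {
    return { left: 0, top: 0, width: halfWidth, height: screenHeight };
  }
  if (side === "right") {
    // The right half takes the remaining pixel for odd widths.
    return { left: halfWidth, top: 0, width: screenWidth - halfWidth, height: screenHeight };
  }
  throw new Error(`Unknown side '${side}'`);
}

console.log(halfScreen(1920, 1080, "right"));
// { left: 960, top: 0, width: 960, height: 1080 }
```

Restricting the search to the part of the screen where the image is expected also reduces the amount of pixel data that has to be compared.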
Multi-scale image search gives you resilience when switching between multiple screen resolutions, but it comes at a price.
Compared to searching on a single scale, searching through multiple scales might take substantially longer.
Depending on your task at hand, you might not need this additional flexibility and would rather benefit from faster execution.
See this benchmark for an example:
hyperfine --warmup 3 'node multi-scale.js' 'node single-scale.js' --show-output
Benchmark 1: node multi-scale.js
  Time (mean ± σ):     933.5 ms ±  10.4 ms    [User: 1647.4 ms, System: 433.8 ms]
  Range (min … max):   920.9 ms … 948.4 ms    10 runs

Benchmark 2: node single-scale.js
  Time (mean ± σ):     526.8 ms ±   9.3 ms    [User: 400.2 ms, System: 108.4 ms]
  Range (min … max):   514.3 ms … 544.4 ms    10 runs

Summary
  'node single-scale.js' ran
    1.77 ± 0.04 times faster than 'node multi-scale.js'
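The summary line can be reproduced from the two mean values reported above. The sketch below divides the means and, assuming standard first-order error propagation for a ratio (an assumption about how the benchmark tool derives its uncertainty), also recovers the ± value. The speedup function is a hypothetical helper.

```javascript
// Mean and standard deviation in milliseconds, taken from the benchmark output.
const multi = { mean: 933.5, sigma: 10.4 };
const single = { mean: 526.8, sigma: 9.3 };

// Hypothetical helper: ratio of the means, with the relative errors
// combined in quadrature (assumed first-order error propagation).
function speedup(slow, fast) {
  const ratio = slow.mean / fast.mean;
  const relativeError = Math.hypot(slow.sigma / slow.mean, fast.sigma / fast.mean);
  return { ratio, uncertainty: ratio * relativeError };
}

const { ratio, uncertainty } = speedup(multi, single);
console.log(`${ratio.toFixed(2)} ± ${uncertainty.toFixed(2)}`); // 1.77 ± 0.04
```

In other words, for this particular benchmark the single-scale search takes a bit more than half the time of the multi-scale search.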