Learn how to search for images on your screen to use them for automation
"A picture is worth a thousand words"
When it comes to desktop automation, this holds true as well.
nut.js allows you to locate template images on your screen, a key capability for automation.
To do so, we will have to install an additional package, providing the actual implementation to perform image comparison.
Otherwise, all functions relying on image matching will throw an error like
Error: No ImageFinder registered.
One available option would be
To use this provider package, simply require it in your code, e.g.
By default, image search is carried out over multiple scales.
All above-mentioned functions are very powerful helpers when automating more complex tasks, so let's see how we can use them to our advantage!
In order to search for an image on your screen, we have to provide a template image.
These images can be loaded either via their full path using loadImage, or relative to a configurable resource directory.
When working with a resource directory, you can reference template images by filename, omitting the full path.
When a template image is loaded, these filenames are resolved relative to the configured resource directory:
screen.config.resourceDirectory = "/path/to/my/template/images"
If not configured explicitly, screen.config.resourceDirectory is set to the current working directory.
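To make the lookup rule a bit more tangible, here is a minimal sketch of how a filename could be resolved against a resource directory that falls back to the current working directory. Note that resolveTemplatePath is a hypothetical helper written purely for illustration; it is not part of nut.js and only mirrors the behaviour described above.

```javascript
const path = require("node:path");

// Hypothetical helper (not a nut.js API): resolves a template filename
// against a resource directory. If no directory is passed, it falls back
// to the current working directory, mirroring the default described above.
function resolveTemplatePath(fileName, resourceDirectory = process.cwd()) {
  return path.join(resourceDirectory, fileName);
}

// Explicitly configured resource directory
console.log(resolveTemplatePath("mouse.png", "/path/to/my/template/images"));

// No directory configured: resolved relative to the current working directory
console.log(resolveTemplatePath("mouse.png"));
```

The point of the indirection is that our script only ever mentions mouse.png, while the location of the file stays a single configuration value.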
Instead of using loadImage, so-called image resources are loaded via
fetchFromUrl allows you to pass in the URL of an image located on a remote host, which will be fetched and returned as a nut.js Image.
Let's dissect how screen.find works by looking at a sample snippet:
First things first, we're setting up our imports on line 1 and 2.
Line 5 sets our resourceDirectory, although the most interesting thing happens in line 7: actually searching for the image.
screen.find will scan your main screen for the provided template image and if it finds a match, it'll return the Region it located the template image in.
Images are matched on a per-pixel basis.
The required amount of matching pixels is configurable via the confidence property on the screen.config object.
confidence is expected to be a value between 0 and 1 and defaults to 0.99 (which corresponds to a 99% match).
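To get a feeling for what a confidence threshold means, here is a deliberately simplified toy sketch. It treats an image as a flat buffer of grayscale values and computes the fraction of identical pixels. Both matchConfidence and isMatch are hypothetical names made up for this illustration, and an actual image matching provider works in a more elaborate way, so please read this as a mental model rather than as the real algorithm.

```javascript
// Toy illustration only: the fraction of identical pixels between two
// equally sized grayscale buffers. This is NOT how a real matcher works.
function matchConfidence(template, candidate) {
  if (template.length === 0 || template.length !== candidate.length) {
    throw new Error("Buffers must be non-empty and of equal length");
  }
  let matching = 0;
  for (let i = 0; i < template.length; i++) {
    if (template[i] === candidate[i]) matching++;
  }
  return matching / template.length;
}

// Hypothetical helper: accept a candidate only if it reaches the threshold.
// The default of 0.99 mirrors the default confidence mentioned above.
function isMatch(template, candidate, confidence = 0.99) {
  return matchConfidence(template, candidate) >= confidence;
}

// A 100 pixel "template" and a copy of it in which one pixel differs
const template = Uint8Array.from({ length: 100 }, (_, i) => i);
const candidate = Uint8Array.from(template);
candidate[0] = 255;

console.log(matchConfidence(template, candidate)); // 0.99
console.log(isMatch(template, candidate)); // true

// A second differing pixel pushes the candidate below the threshold
candidate[1] = 255;
console.log(isMatch(template, candidate)); // false
```

Lowering the confidence makes the search more tolerant of small rendering differences, at the price of a higher risk of false positives.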
nut.js currently does not support multi-monitor setups.
The resource directory might seem confusing at first, but it actually has a really nice side effect.
Imagine writing a cross-platform automation script where we're dealing with different UIs and therefore different template images.
Using the resource directory, we can configure our directory depending on our current platform:
This way, we can keep all our platform-specific template images in separate folders, but we don't have to actually care in our code.
By using the platform dependent resource directory, we don't have to deal with platform specific filenames.
The same filename will load the correct template image for the current platform, no further action required! 💪
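The platform-dependent setup described above could be sketched roughly like this. The folder names, the base path and the resourceDirectoryFor helper are assumptions made for the sake of the example; only screen.config.resourceDirectory, mentioned in the closing comment, comes from nut.js itself.

```javascript
const path = require("node:path");

// Hypothetical folder layout: one sub-folder per platform below a common
// base directory, each containing identically named template images.
const PLATFORM_FOLDERS = new Map([
  ["win32", "windows"],
  ["darwin", "macos"],
  ["linux", "linux"],
]);

// Hypothetical helper: picks the template folder for the given platform.
// `platform` defaults to the value Node.js reports for the current process.
function resourceDirectoryFor(baseDir, platform = process.platform) {
  const folder = PLATFORM_FOLDERS.get(platform);
  if (folder === undefined) {
    throw new Error(`No template images for platform '${platform}'`);
  }
  return path.join(baseDir, folder);
}

console.log(resourceDirectoryFor("/path/to/my/template/images", "linux"));

// In an automation script, the result would be assigned once at startup:
// screen.config.resourceDirectory = resourceDirectoryFor("/path/to/my/template/images");
```

Failing loudly for an unknown platform is a design choice of this sketch: a missing folder surfaces at startup instead of as a confusing load error in the middle of a run.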
In case we screwed up, nut.js will let us know by rejecting.
Searching for mouse.png failed. Reason: 'Error: Failed to load /foo/bar/mouse.png. Reason: 'Failed to load image from '/foo/bar/mouse.png''.'
Searching for mouse.png failed. Reason: 'Error: No match with required confidence 0.99. Best match: 0 at (0, 0, 477, 328)'
find is called on the screen instance to search for template images on your screen.
The required match quality is configurable via the confidence property on screen.config.
findAll is used very similarly to find.
The major difference between the two is that findAll returns a list of all detected matches on your main screen.
Everything else mentioned for find applies to findAll as well.
Being able to locate images on our screen is a huge benefit when automating things, but in reality, we have to deal with timing.
waitFor is here to help by allowing us to specify a timeout in which we expect our template image to appear on screen!
Let's tweak the snippet used in the find example just a little:
waitFor basically does the exact same as find, but multiple times over a specified period of time.
It'll scan your main screen for the given template image, but if it fails to find it, it'll simply give it another shot.
The interval in which these retries happen is configurable as well.
In the above snippet, we tell waitFor to look for our template image for at most five seconds, retrying every second.
Everything mentioned for find applies to waitFor as well.
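The retry behaviour described above can be pictured as a simple polling loop. The following is a minimal sketch under our own assumptions, not the actual nut.js implementation: retryUntil, sleep and fakeFind are hypothetical names, and fakeFind merely stands in for a search that succeeds on its third attempt. The timings are scaled down so the example finishes quickly.

```javascript
// Small helper: resolves after the given number of milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper (not nut.js's implementation): calls `action` until it
// resolves, waiting `intervalMs` between attempts, and gives up once another
// attempt would no longer fit into `timeoutMs`.
async function retryUntil(action, timeoutMs, intervalMs) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    try {
      return await action();
    } catch {
      // Not found yet: retry only if the next attempt still fits the timeout.
      if (Date.now() + intervalMs > deadline) {
        throw new Error(`Action timed out after ${timeoutMs} ms`);
      }
      await sleep(intervalMs);
    }
  }
}

// Stand-in for an image search that only succeeds on the third attempt.
let attempts = 0;
async function fakeFind() {
  attempts++;
  if (attempts < 3) throw new Error("No match");
  return { left: 10, top: 20, width: 30, height: 40 };
}

retryUntil(fakeFind, 500, 50).then((region) => {
  console.log(`Found after ${attempts} attempts:`, region);
});
```

A shorter interval notices the image sooner but triggers more searches, which matters because every single search costs time.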
Action timed out after 5000 ms
waitFor will repeatedly search your main screen for the template image and if it finds a match, it'll return the Region it located the template image in.
As we learned earlier, waitFor will repeatedly search our screen for a given template image.
This great flexibility does not come for free, so we might not want to wait for the timeout to fire before we can cancel the ongoing search.
nut.js follows the same approach to cancellation as the browser fetch API, using an AbortController.
Before we can actually look at a sample, we will have to install an additional package to our project:
Now, let's take a look at a (rather artificial) example:
waitFor is configured with a timeout of 5000 milliseconds and retries every 1000 milliseconds. After 2000 milliseconds, however, we call abort() on our AbortController, which cancels the ongoing search:
waitFor is cancelable using an AbortController.
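To illustrate the idea behind signal-based cancellation, here is a minimal sketch of a polling loop that stops as soon as an AbortSignal fires. It is not how nut.js implements waitFor: cancelablePoll and neverFound are hypothetical names, the error messages are our own, and we assume a runtime that ships a global AbortController (for example Node.js 15 or later), so no extra package is needed for this sketch. The timings mirror the ones from the text, scaled down by a factor of ten.

```javascript
// Hypothetical sketch (not nut.js's implementation): retries `action` every
// `intervalMs` for at most `timeoutMs`, and rejects early once `signal` fires.
function cancelablePoll(action, timeoutMs, intervalMs, signal) {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;
    let timer;

    const onAbort = () => {
      clearTimeout(timer); // drop a pending retry, if any
      reject(new Error("Action aborted by signal"));
    };
    if (signal.aborted) return onAbort();
    signal.addEventListener("abort", onAbort, { once: true });

    const attempt = async () => {
      try {
        const result = await action();
        signal.removeEventListener("abort", onAbort);
        resolve(result);
      } catch {
        if (signal.aborted) return; // onAbort has already rejected
        if (Date.now() + intervalMs > deadline) {
          signal.removeEventListener("abort", onAbort);
          reject(new Error(`Action timed out after ${timeoutMs} ms`));
        } else {
          timer = setTimeout(attempt, intervalMs);
        }
      }
    };
    attempt();
  });
}

// Stand-in for a search that never finds its template image.
const neverFound = async () => {
  throw new Error("No match");
};

const controller = new AbortController();
setTimeout(() => controller.abort(), 200); // cancel long before the timeout

cancelablePoll(neverFound, 500, 100, controller.signal).catch((err) => {
  console.log(err.message);
});
```

Passing only controller.signal into the search keeps the roles separate: the caller holds the controller and decides when to abort, while the search can merely observe that decision.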
Especially during development, we might want to visually track what happens when executing our script.
When it comes to image search, it's one thing to see in e.g. the log that we found a match, but a visual indicator would be even better.
highlight is exactly this!
Highlight duration and opacity are once again configurable properties on the screen.config object.
The way the API is structured, it's really easy to highlight regions located by e.g. find:
However, manually adding highlights is not only cumbersome, but also requires additional effort in case we want to remove it again before running our script in production.
Therefore, nut.js provides an auto-highlight mechanism which is toggleable via the autoHighlight property on screen.config.
Highlight during development, disable it in production!
With auto-highlight turned on, we no longer have to care about manually highlighting results.
Whenever find returns a valid Region, it will be highlighted.
Since waitFor relies on find, auto-highlight works there as well!
find, findAll and waitFor accept OptionalSearchParameters to fine-tune the search.
This allows you to, e.g., limit the search space to a certain portion of your screen:
Multi-scale image search gives you resilience when switching between multiple screen resolutions, but also comes with a price.
Compared to searching on a single scale, it might take substantially longer when searching through multiple scales.
Depending on your task at hand, you might not need this additional flexibility and would rather benefit from faster execution.
See this benchmark for an example: