Browser Integration

Selenium Bridge

Use nut.js image search and OCR capabilities inside Selenium-controlled browser viewports.

Overview

The @nut-tree/selenium-bridge plugin integrates nut.js with Selenium WebDriver, redirecting image search and OCR operations to work within browser viewports instead of the entire desktop. This enables visual automation inside web pages controlled by Selenium.

  • Browser-scoped search: image search runs within the viewport only, e.g. screen.find(imageResource("btn.png"))
  • Element conversion: convert regions to WebElements with elementAt(region)
  • WebDriver compatible: works with any Selenium-supported browser (Chrome, Firefox, Edge, Safari)

Installation

bash
npm install @nut-tree/selenium-bridge selenium-webdriver

For TypeScript, also install type definitions:

bash
npm install --save-dev @types/selenium-webdriver

Subscription Required

This package is included in Solo, Team, OCR, and nl-matcher subscription plans.

Quick Reference

useSelenium

useSelenium({ driver: WebDriver }): void

Redirects nut.js operations to a Selenium WebDriver instance.

elementAt

elementAt(region: Region): Promise<WebElement>

Converts a screen region to the Selenium WebElement at that position.


Basic Usage

Initializing the Bridge

Pass your WebDriver instance to redirect all nut.js operations to the browser:

typescript
import { Builder, Browser } from "selenium-webdriver";
import { screen, mouse, imageResource, centerOf, straightTo, Button } from "@nut-tree/nut-js";
import { useNlMatcher } from "@nut-tree/nl-matcher";
import { useSelenium } from "@nut-tree/selenium-bridge";

useNlMatcher();

const driver = await new Builder()
  .forBrowser(Browser.CHROME)
  .build();

try {
  // Initialize the Selenium bridge
  useSelenium({ driver });

  await driver.get("https://example.com");

  // screen.find() now searches within the browser viewport
  const loginButton = await screen.find(imageResource("login-button.png"), {
    confidence: 0.95,
    providerData: { validateMatches: true }
  });
  await mouse.move(straightTo(centerOf(loginButton)));
  await mouse.click(Button.LEFT);

} finally {
  await driver.quit();
}

Converting Regions to WebElements

Use elementAt() to convert screen regions back to Selenium WebElements. This is useful when you need to interact with the element using Selenium's API:

typescript
import { Builder, Browser, WebElement } from "selenium-webdriver";
import { screen, imageResource } from "@nut-tree/nut-js";
import { useNlMatcher } from "@nut-tree/nl-matcher";
import { useSelenium, elementAt } from "@nut-tree/selenium-bridge";

useNlMatcher();

const driver = await new Builder()
  .forBrowser(Browser.CHROME)
  .build();

try {
  useSelenium({ driver });
  await driver.get("https://example.com");

  // Find element by image
  const buttonRegion = await screen.find(imageResource("submit-button.png"), {
    confidence: 0.95,
    providerData: { validateMatches: true }
  });

  // Convert region to WebElement
  const buttonElement: WebElement = await elementAt(buttonRegion);

  // Read properties first: clicking may navigate away and make the element stale
  const text = await buttonElement.getText();
  console.log("Button text:", text);

  // Then interact through the standard Selenium API
  await buttonElement.click();

} finally {
  await driver.quit();
}

Hybrid Testing

Combine Selenium locators for DOM-based navigation with nut.js image search for visual verification:

typescript
import { Builder, Browser, By, until } from "selenium-webdriver";
import { screen, mouse, imageResource, centerOf, straightTo, Button } from "@nut-tree/nut-js";
import { useNlMatcher } from "@nut-tree/nl-matcher";
import { useSelenium, elementAt } from "@nut-tree/selenium-bridge";

useNlMatcher();

async function hybridTest() {
  const driver = await new Builder()
    .forBrowser(Browser.CHROME)
    .build();

  try {
    useSelenium({ driver });
    await driver.get("https://example.com/products");

    // Use Selenium for navigation
    await driver.findElement(By.css('[data-category="electronics"]')).click();
    await driver.wait(until.elementLocated(By.css('.product-grid')), 5000);

    // Use nut.js for visual verification with match validation
    const productCard = await screen.find(imageResource("featured-product.png"), {
      confidence: 0.9,
      providerData: { validateMatches: true }
    });
    console.log("Found featured product at:", productCard);

    // Click using visual position
    await mouse.move(straightTo(centerOf(productCard)));
    await mouse.click(Button.LEFT);

    // Get the element for further Selenium operations
    const productElement = await elementAt(productCard);

    // Use Selenium to get attributes
    const productId = await productElement.getAttribute("data-product-id");
    console.log("Product ID:", productId);

  } finally {
    await driver.quit();
  }
}

Match Validation

The providerData: { validateMatches: true } option makes the matcher perform additional validation of each match before returning it. Enable it for critical interactions, such as clicks, where a false positive would act on the wrong part of the page.

Multiple Windows

The bridge automatically uses the active window when you switch between tabs:

typescript
import { Builder, Browser, By } from "selenium-webdriver";
import { screen, imageResource } from "@nut-tree/nut-js";
import { useNlMatcher } from "@nut-tree/nl-matcher";
import { useSelenium } from "@nut-tree/selenium-bridge";

useNlMatcher();

async function multiWindowTest() {
  const driver = await new Builder()
    .forBrowser(Browser.CHROME)
    .build();

  try {
    useSelenium({ driver });
    await driver.get("https://example.com");

    // Click link that opens new window
    await driver.findElement(By.css('[target="_blank"]')).click();

    // Switch to new window
    const handles = await driver.getAllWindowHandles();
    await driver.switchTo().window(handles[1]);

    // Bridge automatically uses the active window
    const newPageHeader = await screen.find(imageResource("new-page-header.png"));
    console.log("Found header in new window:", newPageHeader);

    // Switch back to original window
    await driver.switchTo().window(handles[0]);

    // Searches now happen in original window
    const originalHeader = await screen.find(imageResource("original-header.png"));

  } finally {
    await driver.quit();
  }
}
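
The example above assumes the new tab is handles[1]. The WebDriver specification does not guarantee any particular order for window handles, so a more defensive variant compares the handles before and after the click. The following is a minimal sketch; findNewHandle is a hypothetical helper, not part of selenium-webdriver or the bridge.

```typescript
// Hypothetical helper (not part of selenium-webdriver or the bridge):
// returns the first handle present in `after` but not in `before`.
function findNewHandle(before: string[], after: string[]): string | undefined {
  const known = new Set(before);
  return after.find((handle) => !known.has(handle));
}

// Usage with Selenium (shown as comments so the sketch runs without a browser):
//   const before = await driver.getAllWindowHandles();
//   await driver.findElement(By.css('[target="_blank"]')).click();
//   const after = await driver.getAllWindowHandles();
//   const newHandle = findNewHandle(before, after);
//   if (newHandle) await driver.switchTo().window(newHandle);
```

If the link opens its target with a delay, the second getAllWindowHandles() call may need to be wrapped in a driver.wait() until the handle count increases.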

Test Framework Integration

Jest Integration

typescript
import { Builder, Browser, WebDriver } from "selenium-webdriver";
import { screen, mouse, imageResource, centerOf, straightTo, Button } from "@nut-tree/nut-js";
import { useNlMatcher } from "@nut-tree/nl-matcher";
import { useSelenium } from "@nut-tree/selenium-bridge";

useNlMatcher();

describe("Visual E2E Tests", () => {
  let driver: WebDriver;

  beforeAll(async () => {
    driver = await new Builder()
      .forBrowser(Browser.CHROME)
      .build();
    useSelenium({ driver });
  });

  afterAll(async () => {
    await driver.quit();
  });

  beforeEach(async () => {
    await driver.get("https://example.com");
  });

  test("should display hero section", async () => {
    const heroSection = await screen.find(imageResource("hero-section.png"));
    expect(heroSection).toBeDefined();
    expect(heroSection.width).toBeGreaterThan(500);
  });

  test("should open navigation menu", async () => {
    const menuButton = await screen.find(imageResource("menu-hamburger.png"));
    await mouse.move(straightTo(centerOf(menuButton)));
    await mouse.click(Button.LEFT);

    // Wait for menu to animate in
    const navMenu = await screen.waitFor(imageResource("nav-menu-open.png"), 3000, 500);
    expect(navMenu).toBeDefined();
  });
});

Mocha Integration

typescript
import { Builder, Browser, WebDriver } from "selenium-webdriver";
import { screen, imageResource } from "@nut-tree/nut-js";
import { useNlMatcher } from "@nut-tree/nl-matcher";
import { useSelenium } from "@nut-tree/selenium-bridge";
import { expect } from "chai";

useNlMatcher();

describe("Product Page", function() {
  this.timeout(30000);

  let driver: WebDriver;

  before(async () => {
    driver = await new Builder()
      .forBrowser(Browser.CHROME)
      .build();
    useSelenium({ driver });
  });

  after(async () => {
    await driver.quit();
  });

  it("should display product images", async () => {
    await driver.get("https://example.com/products/123");

    const productImage = await screen.find(imageResource("product-main-image.png"));
    expect(productImage).to.not.be.null;
  });

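  it("should show thumbnails gallery", async () => {
    // Load the page explicitly so this test does not depend on the previous one
    await driver.get("https://example.com/products/123");

    const thumbnails = await screen.findAll(imageResource("product-thumbnail.png"));
    expect(thumbnails.length).to.be.greaterThan(2);
  });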
});

Using with OCR

Combine the bridge with the OCR plugin to find and read text on the page:

typescript
import { Builder, Browser } from "selenium-webdriver";
import { screen, mouse, centerOf, straightTo, Button, imageResource, singleWord } from "@nut-tree/nut-js";
import { useNlMatcher } from "@nut-tree/nl-matcher";
import { useSelenium } from "@nut-tree/selenium-bridge";
import { useOcrPlugin, configure, LanguageModelType } from "@nut-tree/plugin-ocr";

useNlMatcher();
useOcrPlugin();

// Configure OCR
configure({
  dataPath: "./ocr-data",
  languageModelType: LanguageModelType.BEST
});

async function extractPrices() {
  const driver = await new Builder()
    .forBrowser(Browser.CHROME)
    .build();

  try {
    useSelenium({ driver });
    await driver.get("https://example.com/pricing");

    // Find element by its text content
    const addToCart = await screen.find(singleWord("Add"));
    await mouse.move(straightTo(centerOf(addToCart)));
    await mouse.click(Button.LEFT);

    // Find price section by image
    const priceSection = await screen.find(imageResource("pricing-table.png"));

    // Read text from the price area
    const priceText = await screen.read({ searchRegion: priceSection });
    console.log("Pricing text:", priceText);

  } finally {
    await driver.quit();
  }
}
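
The text returned by screen.read() is a plain string, so turning it into numeric prices is a separate step. The sketch below shows one way to do that, assuming prices are written in a US-style format such as $19.99 or $1,299.00. parsePrices is a hypothetical helper, and real OCR output may contain noise that needs extra cleanup.

```typescript
// Hypothetical helper: pulls dollar amounts such as "$9", "$19.99" or
// "$1,299.00" out of OCR text and returns them as numbers.
// Assumes a US-style format (comma as thousands separator, dot as decimal).
function parsePrices(text: string): number[] {
  const matches: string[] = text.match(/\$\s?\d[\d,]*(?:\.\d{1,2})?/g) ?? [];
  // Strip the currency sign, separators and whitespace before converting
  return matches.map((m) => Number(m.replace(/[$,\s]/g, "")));
}

// Sample input standing in for the result of screen.read(...)
const sampleText = "Starter $9 / month\nPro $19.99 / month\nEnterprise $1,299.00 / year";
console.log(parsePrices(sampleText));
```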

Canvas and Chart Testing

Visual automation is particularly useful for testing canvas elements and interactive charts where standard selectors don't work:

typescript
import { Builder, Browser } from "selenium-webdriver";
import { screen, mouse, imageResource, centerOf, straightTo, Point } from "@nut-tree/nut-js";
import { useNlMatcher } from "@nut-tree/nl-matcher";
import { useSelenium } from "@nut-tree/selenium-bridge";

useNlMatcher();

async function testInteractiveChart() {
  const driver = await new Builder()
    .forBrowser(Browser.CHROME)
    .build();

  try {
    useSelenium({ driver });
    await driver.get("https://example.com/analytics");

    // Wait for chart to render
    await screen.waitFor(imageResource("chart-container.png"), 10000, 1000);

    // Find a specific data point on the chart
    const dataPoint = await screen.find(imageResource("chart-peak-marker.png"));

    // Hover to show tooltip
    await mouse.move(straightTo(centerOf(dataPoint)));
    await new Promise(r => setTimeout(r, 500));

    // Verify tooltip appears
    await screen.waitFor(imageResource("chart-tooltip.png"), 3000, 500);

    // Check chart color at specific position (verify data series)
    const chartArea = await screen.find(imageResource("chart-area.png"));
    const color = await screen.colorAt(new Point(
      chartArea.left + 100,
      chartArea.top + 50
    ));

    // Verify expected color for primary data series
    if (color.B > 200 && color.R < 100) {
      console.log("Primary data series is displayed correctly");
    }

  } finally {
    await driver.quit();
  }
}
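
The channel thresholds in the example above are a coarse check. When the expected series color is known, comparing each channel against it with a tolerance can be more precise while still absorbing anti-aliasing and rendering differences. This is a minimal sketch; isCloseTo, the sample color values, and the default tolerance of 10 are assumptions for illustration.

```typescript
// Minimal shape of the color value used above (channels R, G, B, A).
interface RgbaLike {
  R: number;
  G: number;
  B: number;
  A: number;
}

// Hypothetical helper: true when every RGB channel of `actual` is within
// `tolerance` of the corresponding channel of `expected`. Alpha is ignored.
function isCloseTo(actual: RgbaLike, expected: RgbaLike, tolerance = 10): boolean {
  return (
    Math.abs(actual.R - expected.R) <= tolerance &&
    Math.abs(actual.G - expected.G) <= tolerance &&
    Math.abs(actual.B - expected.B) <= tolerance
  );
}

const seriesBlue: RgbaLike = { R: 33, G: 150, B: 243, A: 255 }; // assumed series color
const sampled: RgbaLike = { R: 36, G: 147, B: 240, A: 255 };    // stands in for screen.colorAt(...)
console.log(isCloseTo(sampled, seriesBlue));
```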

Remote Execution

The Selenium Bridge supports remote Selenium servers. Pass a remote WebDriver instance to run visual automation on a remote machine:

typescript
import { Builder, Browser } from "selenium-webdriver";
import { screen, imageResource } from "@nut-tree/nut-js";
import { useNlMatcher } from "@nut-tree/nl-matcher";
import { useSelenium } from "@nut-tree/selenium-bridge";

useNlMatcher();

const driver = await new Builder()
  .forBrowser(Browser.CHROME)
  .usingServer("http://selenium-hub:4444/wd/hub")
  .build();

useSelenium({ driver });

await driver.get("https://example.com");
const header = await screen.find(imageResource("header.png"));

Browser Configuration

Chrome Options

typescript
import { Builder, Browser } from "selenium-webdriver";
import chrome from "selenium-webdriver/chrome";
import { useSelenium } from "@nut-tree/selenium-bridge";

const options = new chrome.Options();

// Set consistent window size for reliable image matching
options.addArguments("--window-size=1920,1080");

// Disable GPU for consistent rendering
options.addArguments("--disable-gpu");

// Disable animations for stable screenshots
options.addArguments("--disable-animations");

const driver = await new Builder()
  .forBrowser(Browser.CHROME)
  .setChromeOptions(options)
  .build();

useSelenium({ driver });

Firefox Options

typescript
import { Builder, Browser } from "selenium-webdriver";
import firefox from "selenium-webdriver/firefox";
import { useSelenium } from "@nut-tree/selenium-bridge";

const options = new firefox.Options();

// Set window size
options.addArguments("--width=1920");
options.addArguments("--height=1080");

const driver = await new Builder()
  .forBrowser(Browser.FIREFOX)
  .setFirefoxOptions(options)
  .build();

useSelenium({ driver });
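
Chrome and Firefox take the window size through different arguments, which makes it easy for the two configurations to drift apart. A small helper can derive both from one viewport definition. This is a sketch only; windowSizeArgs and the SupportedBrowser type are hypothetical names, and the argument strings are the ones used in the examples above.

```typescript
type SupportedBrowser = "chrome" | "firefox";

// Hypothetical helper: maps one viewport definition to the command-line
// arguments used in the Chrome and Firefox examples above.
function windowSizeArgs(browser: SupportedBrowser, width: number, height: number): string[] {
  return browser === "chrome"
    ? [`--window-size=${width},${height}`]
    : [`--width=${width}`, `--height=${height}`];
}

console.log(windowSizeArgs("chrome", 1920, 1080));
console.log(windowSizeArgs("firefox", 1920, 1080));

// Usage with Selenium options (as a comment so the sketch runs standalone):
//   options.addArguments(...windowSizeArgs("chrome", 1920, 1080));
```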

Best Practices

Image Preparation

  • Capture reference images at the same resolution you'll test at
  • Use consistent browser window sizes across test runs
  • Avoid images containing dynamic content (timestamps, random IDs)
  • Store images in version control with your tests
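
One way to keep reference images tied to the resolution they were captured at is to store them in one directory per viewport size. The sketch below assumes such a layout. referenceDir and the directory naming scheme are hypothetical conventions, not something the bridge requires.

```typescript
// Hypothetical convention: one directory of reference images per viewport
// size, e.g. ./images/1920x1080/login-button.png
function referenceDir(baseDir: string, width: number, height: number): string {
  const trimmed = baseDir.replace(/\/+$/, ""); // drop trailing slashes
  return `${trimmed}/${width}x${height}`;
}

console.log(referenceDir("./images", 1920, 1080));

// Usage with nut.js (as a comment so the sketch runs standalone):
//   screen.config.resourceDirectory = referenceDir("./images", 1920, 1080);
```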

Headless Mode

Visual automation typically requires the browser to be visible. For CI environments, use Xvfb on Linux or similar virtual display solutions.

Debugging Tips

  • Use screen.capture("debug.png") to save viewport screenshots
  • Lower screen.config.confidence for fuzzy matching (default: 0.99)
  • Check browser window size matches your reference images
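
While debugging, it can help to learn at which confidence an image starts to match instead of lowering the global setting blindly. The sketch below tries a list of confidence values in order and reports the first one that succeeds. findWithFallback and its default list of values are hypothetical, and the search itself is passed in as a callback so the helper does not depend on any particular nut.js call.

```typescript
// Hypothetical helper: calls `find` with each confidence value in order and
// returns the first successful result together with the confidence that
// matched. If every attempt fails, the last error is rethrown.
async function findWithFallback<T>(
  find: (confidence: number) => Promise<T>,
  confidences: number[] = [0.99, 0.95, 0.9]
): Promise<{ result: T; confidence: number }> {
  let lastError: unknown;
  for (const confidence of confidences) {
    try {
      return { result: await find(confidence), confidence };
    } catch (error) {
      lastError = error; // no match at this confidence, try the next one
    }
  }
  throw lastError ?? new Error("No confidence values provided");
}

// Usage with nut.js (as comments so the sketch runs without a browser):
//   const { result, confidence } = await findWithFallback((c) =>
//     screen.find(imageResource("login-button.png"), { confidence: c })
//   );
//   console.log(`Matched at confidence ${confidence}`, result);
```

A match that only succeeds at a much lower confidence usually points to a stale or mis-sized reference image, so recapturing the image is often a better fix than keeping the lower value.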

CI/CD Integration

For running visual tests in CI environments, use a virtual display:

yaml
# GitHub Actions example
- name: Start Xvfb (Linux)
  run: Xvfb :99 -screen 0 1920x1080x24 &

- name: Run visual tests
  run: npm test
  env:
    DISPLAY: ":99"

Playwright vs Selenium Bridge

Choose the bridge that matches your existing infrastructure:

| Feature | Selenium Bridge | Playwright Bridge |
| --- | --- | --- |
| WebDriver support | Native | N/A |
| Element conversion | elementAt() | locateByPosition |
| Test matchers | Manual assertions | Built-in matchers |
| Page tracking | Manual | trackPageChanges |
| Browser support | All Selenium browsers | Chromium, Firefox, WebKit |
