Understanding WebDriver API
Selenium WebDriver provides a powerful API to control browsers programmatically. Let’s master the fundamentals.
WebDriver Interface
WebDriver is the core interface for browser automation:
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
WebDriver driver = new ChromeDriver();
Navigation Methods
Control browser navigation:
// Navigate to URL
driver.get("https://example.com");
// Alternative navigation
driver.navigate().to("https://example.com");
// Browser history
driver.navigate().back();
driver.navigate().forward();
driver.navigate().refresh();
// Get current URL
String currentUrl = driver.getCurrentUrl();
// Get page title
String title = driver.getTitle();
// Get page source
String pageSource = driver.getPageSource();
Locator Strategies
WebDriver provides 8 locator strategies:
1. ID (Most Reliable)
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
WebElement element = driver.findElement(By.id("username"));
element.sendKeys("testuser");
2. Name
WebElement email = driver.findElement(By.name("email"));
email.sendKeys("test@example.com");
3. Class Name
WebElement button = driver.findElement(By.className("btn-primary"));
button.click();
4. Tag Name
WebElement heading = driver.findElement(By.tagName("h1"));
String text = heading.getText();
5. Link Text
WebElement link = driver.findElement(By.linkText("Sign Up"));
link.click();
6. Partial Link Text
WebElement partialLink = driver.findElement(By.partialLinkText("Sign"));
7. CSS Selector (Powerful & Fast)
// By ID
WebElement elem1 = driver.findElement(By.cssSelector("#username"));
// By class
WebElement elem2 = driver.findElement(By.cssSelector(".btn-primary"));
// By attribute
WebElement elem3 = driver.findElement(By.cssSelector("input[type='email']"));
// By hierarchy
WebElement elem4 = driver.findElement(By.cssSelector("div.container > button"));
// Complex selectors
WebElement elem5 = driver.findElement(By.cssSelector("input[name='username'][type='text']"));
8. XPath (Most Flexible)
// Absolute XPath (avoid - brittle)
WebElement elem1 = driver.findElement(By.xpath("/html/body/div[1]/form/input[1]"));
// Relative XPath (preferred)
WebElement elem2 = driver.findElement(By.xpath("//input[@id='username']"));
// Text-based
WebElement elem3 = driver.findElement(By.xpath("//button[text()='Submit']"));
// Contains
WebElement elem4 = driver.findElement(By.xpath("//div[contains(@class, 'error')]"));
// Axes
WebElement elem5 = driver.findElement(By.xpath("//label[@for='email']/following-sibling::input"));
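Text-based XPath such as `//button[text()='Submit']` breaks as soon as the text itself contains a quote, because XPath 1.0 string literals have no escape character. Here is a minimal sketch of a quoting helper in plain Java. The names `XPathText`, `literal`, and `buttonWithText` are hypothetical and not part of Selenium; the resulting string would be passed to `By.xpath(...)`.

```java
// Hypothetical helper, not part of Selenium: builds XPath 1.0 string literals.
final class XPathText {

    // Returns an XPath expression that evaluates to the given text.
    static String literal(String text) {
        if (!text.contains("'")) {
            return "'" + text + "'";       // no single quotes: wrap in single quotes
        }
        if (!text.contains("\"")) {
            return "\"" + text + "\"";     // no double quotes: wrap in double quotes
        }
        // Both quote types present: split on single quotes and join with concat()
        StringBuilder sb = new StringBuilder("concat(");
        String[] parts = text.split("'", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(", \"'\", ");
            }
            sb.append("'").append(parts[i]).append("'");
        }
        return sb.append(")").toString();
    }

    // Relative XPath for a button whose text matches exactly.
    static String buttonWithText(String text) {
        return "//button[text()=" + literal(text) + "]";
    }

    public static void main(String[] args) {
        System.out.println(buttonWithText("Submit"));
        System.out.println(buttonWithText("Don't save"));
    }
}
```

With a helper like this, `driver.findElement(By.xpath(XPathText.buttonWithText(label)))` keeps working when `label` comes from test data that contains apostrophes.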
Element Interaction Methods
Input Elements
WebElement input = driver.findElement(By.id("search"));
// Type text
input.sendKeys("Selenium WebDriver");
// Clear text
input.clear();
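// Get value (newer Selenium 4 releases deprecate getAttribute; getDomProperty("value") is the replacement there)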
String value = input.getAttribute("value");
// Submit form
input.submit();
Buttons and Links
WebElement button = driver.findElement(By.id("submitBtn"));
// Click
button.click();
// Get text
String text = button.getText();
// Check if enabled
boolean isEnabled = button.isEnabled();
// Check if displayed
boolean isDisplayed = button.isDisplayed();
Checkboxes and Radio Buttons
WebElement checkbox = driver.findElement(By.id("agree"));
// Check the box if it is not already selected
if (!checkbox.isSelected()) {
    checkbox.click();
}
// Uncheck the box if it is selected
if (checkbox.isSelected()) {
    checkbox.click();
}
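The check-then-click pattern above can be folded into one idempotent helper: click only when the current state differs from the one we want. The sketch below uses a hypothetical `Toggle` interface as a stand-in for the two `WebElement` methods involved, so it runs without a browser. `CheckboxState`, `setChecked`, and `FakeCheckbox` are illustrative names, not Selenium APIs.

```java
// Hypothetical stand-in for the two WebElement methods used here.
interface Toggle {
    boolean isSelected();
    void click();
}

final class CheckboxState {

    // Clicks only when the current state differs from the desired one,
    // so calling it twice with the same value is harmless.
    static void setChecked(Toggle box, boolean desired) {
        if (box.isSelected() != desired) {
            box.click();
        }
    }

    // In-memory fake used for the demo below.
    static final class FakeCheckbox implements Toggle {
        private boolean selected;
        int clicks;

        public boolean isSelected() {
            return selected;
        }

        public void click() {
            selected = !selected;
            clicks++;
        }
    }

    public static void main(String[] args) {
        FakeCheckbox box = new FakeCheckbox();
        setChecked(box, true);
        setChecked(box, true); // already checked, so no second click
        System.out.println(box.isSelected() + " after " + box.clicks + " click(s)");
        // prints "true after 1 click(s)"
    }
}
```

In a real test the same method body would work with a `WebElement` parameter, since `WebElement` also exposes `isSelected()` and `click()`.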
Dropdowns
import java.util.List;
import org.openqa.selenium.support.ui.Select;
WebElement dropdown = driver.findElement(By.id("country"));
Select select = new Select(dropdown);
// Select by visible text
select.selectByVisibleText("United States");
// Select by value
select.selectByValue("us");
// Select by index
select.selectByIndex(1);
// Get selected option
WebElement selected = select.getFirstSelectedOption();
String text = selected.getText();
// Get all options
List<WebElement> options = select.getOptions();
Finding Multiple Elements
// Find all matching elements
List<WebElement> links = driver.findElements(By.tagName("a"));
for (WebElement link : links) {
    System.out.println(link.getAttribute("href"));
}
Wait Strategies
Implicit Wait
import java.time.Duration;
// Applied globally to all findElement calls; avoid mixing with explicit waits, which can cause unpredictable wait times
driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10));
Explicit Wait
import org.openqa.selenium.support.ui.WebDriverWait;
import org.openqa.selenium.support.ui.ExpectedConditions;
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
// Wait for element to be clickable
WebElement element = wait.until(
    ExpectedConditions.elementToBeClickable(By.id("submitBtn"))
);
// Wait for visibility
WebElement visible = wait.until(
    ExpectedConditions.visibilityOfElementLocated(By.className("success"))
);
// Wait for text to be present
wait.until(
    ExpectedConditions.textToBePresentInElementLocated(
        By.id("message"), "Success"
    )
);
Fluent Wait
import org.openqa.selenium.support.ui.FluentWait;
import org.openqa.selenium.NoSuchElementException;
FluentWait<WebDriver> fluentWait = new FluentWait<>(driver)
    .withTimeout(Duration.ofSeconds(30))
    .pollingEvery(Duration.ofSeconds(2))
    .ignoring(NoSuchElementException.class);
// The lambda parameter must not be named "driver" when driver is a local variable
WebElement element = fluentWait.until(d ->
    d.findElement(By.id("dynamicElement"))
);
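To see what the timeout and polling interval actually do, it helps to write the loop by hand. The following is a plain-Java sketch of a polling wait, not Selenium's implementation. `PollingWait` and `until` are hypothetical names, and it throws `IllegalStateException` on timeout where Selenium would throw its own `TimeoutException`.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Plain-Java sketch of a polling wait; this is not Selenium's FluentWait code.
final class PollingWait {

    // Polls the supplier until it returns a non-null value or the timeout elapses.
    // Simplification: every RuntimeException from the supplier counts as "not ready yet",
    // whereas FluentWait ignores only the exception types it was configured with.
    static <T> T until(Supplier<T> condition, Duration timeout, Duration interval) {
        long start = System.nanoTime();
        while (true) {
            try {
                T value = condition.get();
                if (value != null) {
                    return value;
                }
            } catch (RuntimeException notReadyYet) {
                // fall through and poll again
            }
            if (System.nanoTime() - start >= timeout.toNanos()) {
                throw new IllegalStateException("Condition not met within " + timeout);
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical condition that becomes ready on the third poll
        String result = until(() -> ++calls[0] >= 3 ? "ready" : null,
                Duration.ofSeconds(2), Duration.ofMillis(50));
        System.out.println(result + " after " + calls[0] + " polls");
    }
}
```

The key difference from `Thread.sleep(5000)` is that the loop returns as soon as the condition holds, so a fast page costs only one or two polls.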
Window and Frame Handling
Window Management
import java.util.Set;
import org.openqa.selenium.Dimension;
// Get current window handle
String mainWindow = driver.getWindowHandle();
// Get all window handles
Set<String> allWindows = driver.getWindowHandles();
// Switch to new window
for (String window : allWindows) {
    if (!window.equals(mainWindow)) {
        driver.switchTo().window(window);
        break;
    }
}
// Switch back
driver.switchTo().window(mainWindow);
// Window size
driver.manage().window().maximize();
driver.manage().window().setSize(new Dimension(1920, 1080));
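Switching to "the first handle that is not the main window" becomes ambiguous once more than two windows are open. One way around this, sketched below under the assumption that we snapshot `getWindowHandles()` before and after the action that opens the window, is to diff the two sets. `WindowHandles` and `newHandle` are hypothetical names, and the handle strings in `main` are made up.

```java
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper: works on the handle strings returned by getWindowHandles().
final class WindowHandles {

    // Returns the handle that is in 'after' but not in 'before',
    // or empty when zero or several new handles appeared.
    static Optional<String> newHandle(Set<String> before, Set<String> after) {
        Set<String> added = new LinkedHashSet<>(after);
        added.removeAll(before);
        return added.size() == 1
                ? Optional.of(added.iterator().next())
                : Optional.empty();
    }

    public static void main(String[] args) {
        Set<String> before = Set.of("main");          // made-up handle values
        Set<String> after = Set.of("main", "popup");
        System.out.println(newHandle(before, after).orElse("no single new window"));
        // prints "popup"
    }
}
```

The returned handle can then go straight into `driver.switchTo().window(...)`.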
Frame Handling
// Switch to frame by index
driver.switchTo().frame(0);
// Switch to frame by name or ID
driver.switchTo().frame("frameName");
// Switch to frame by WebElement
WebElement frameElement = driver.findElement(By.id("myFrame"));
driver.switchTo().frame(frameElement);
// Switch back to default content
driver.switchTo().defaultContent();
// Switch to parent frame
driver.switchTo().parentFrame();
Alert Handling
import org.openqa.selenium.Alert;
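// Switch to the alert
Alert alert = driver.switchTo().alert();
// Get the alert text (read it before closing the alert)
String alertText = alert.getText();
// Accept the alert (OK)
alert.accept();
// Or dismiss it (Cancel) instead; an alert can be closed only once
// alert.dismiss();
// Type into a prompt (switch again, since the previous alert is closed)
Alert prompt = driver.switchTo().alert();
prompt.sendKeys("input text");
prompt.accept();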
JavaScript Execution
import org.openqa.selenium.JavascriptExecutor;
JavascriptExecutor js = (JavascriptExecutor) driver;
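// Execute JavaScript (this opens an alert, so close it before sending further commands)
js.executeScript("alert('Hello from Selenium');");
driver.switchTo().alert().accept();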
// Scroll to element
WebElement element = driver.findElement(By.id("footer"));
js.executeScript("arguments[0].scrollIntoView(true);", element);
// Click hidden element
js.executeScript("arguments[0].click();", element);
// Get return value
Long height = (Long) js.executeScript("return document.body.scrollHeight;");
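The `(Long)` cast above works because `executeScript` hands back whole numbers as `Long`; decimal results arrive as `Double`, and the cast would then fail with a `ClassCastException`. A small conversion helper avoids depending on which one arrives. This is a sketch with hypothetical names (`ScriptResults`, `toLong`), and the values in `main` are stand-ins for real script results.

```java
// Hypothetical helper for values returned by JavascriptExecutor.executeScript.
final class ScriptResults {

    // Converts a numeric script result to long, whichever Number subtype arrived.
    // Fractions are truncated; non-numeric results (including null) are rejected.
    static long toLong(Object result) {
        if (result instanceof Number) {
            return ((Number) result).longValue();
        }
        throw new IllegalArgumentException("Expected a number but got: " + result);
    }

    public static void main(String[] args) {
        Object whole = Long.valueOf(1200L);          // shape of an integer result
        Object fractional = Double.valueOf(1200.5);  // shape of a decimal result
        System.out.println(toLong(whole));
        System.out.println(toLong(fractional));
    }
}
```

Usage would look like `long height = ScriptResults.toLong(js.executeScript("return document.body.scrollHeight;"));`.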
Screenshot Capture
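import java.io.File;
import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.OutputType;
import org.apache.commons.io.FileUtils; // requires the Apache Commons IO dependency
// Take screenshot (copyFile throws IOException, so declare or catch it)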
TakesScreenshot ts = (TakesScreenshot) driver;
File source = ts.getScreenshotAs(OutputType.FILE);
File destination = new File("screenshots/test.png");
FileUtils.copyFile(source, destination);
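If we would rather not add Commons IO just to copy one file, the JDK's `java.nio.file` API can store the image as well. The sketch below assumes the screenshot is requested as bytes, which Selenium supports via `OutputType.BYTES`; to stay runnable without a browser it uses stand-in bytes. `ScreenshotFiles` and `save` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: stores image bytes with the JDK only, no Commons IO needed.
final class ScreenshotFiles {

    // Writes the bytes to <dir>/<name>.png, creating the directory when it is missing.
    static Path save(byte[] png, Path dir, String name) {
        try {
            Files.createDirectories(dir);
            return Files.write(dir.resolve(name + ".png"), png);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in bytes; with Selenium they would come from getScreenshotAs(OutputType.BYTES)
        byte[] fakeImage = {1, 2, 3};
        Path dir = Paths.get(System.getProperty("java.io.tmpdir"), "screenshots-demo");
        Path saved = save(fakeImage, dir, "test");
        System.out.println("Saved " + saved);
    }
}
```

In a test this could be called as `ScreenshotFiles.save(ts.getScreenshotAs(OutputType.BYTES), Paths.get("screenshots"), "test")`.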
Best Practices
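- Prefer CSS Selectors over XPath where both work - they are usually shorter, easier to read, and typically at least as fast
- Use Explicit Waits instead of Thread.sleep()
- Unique Locators - ID is best, followed by CSS
- Avoid Absolute XPath - use relative XPath
- Handle Exceptions deliberately with try-catch, for example NoSuchElementException and TimeoutException
- Close Resources - always call driver.quit(), ideally in a finally block or test teardown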
Common Pitfalls
// ❌ Bad - Hard-coded sleep
Thread.sleep(5000);
// ✅ Good - Explicit wait
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
wait.until(ExpectedConditions.visibilityOfElementLocated(By.id("element")));
// ❌ Bad - Absolute XPath
driver.findElement(By.xpath("/html/body/div[1]/div[2]/form/input[1]"));
// ✅ Good - Relative XPath
driver.findElement(By.xpath("//input[@id='username']"));
Next Steps
In the next lesson, we’ll implement the Page Object Model pattern to organize our code better.
Key Takeaways
✅ Master 8 locator strategies for element identification
✅ Use explicit waits for dynamic content
✅ Handle windows, frames, and alerts effectively
✅ Execute JavaScript when native methods fail