How To Find All Broken Links On The Page With Selenium

Leyla GORMEL
2 min read · May 9, 2021

Sometimes you need to check every link in your project. You can do this with Postman or another API testing tool, but the approach described here is easier. With an API testing tool you have to write a request for each link one by one, and whenever the links change you have to edit those tests one by one again.

With this Java code, you can check all the links on a page at once. The links can point to a PDF, an image, a video, or any other resource.

Step-1: In HTML, a link is written as <a href="Address"></a>. This means we can collect every link on the web page by searching for <a> tags. We use this code for that:

List<WebElement> allLinks = driver.findElements(By.tagName(LINKS_TAG));

LINKS_TAG is "a". The complete code is at the end of the article.

Step-2: Identifying and Validating URL

String urlLink = link.getAttribute(LINKS_ATTRIBUTE);

LINKS_ATTRIBUTE is "href", and link is one element of the allLinks list from Step-1.
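Not every href can be checked over HTTP: an <a> tag may have no href at all, or it may hold a mailto: or javascript: value. The sketch below shows one way to filter those out before opening a connection. The LinkFilter class and the isCheckable method are hypothetical names used only for illustration; they are not part of Selenium.

```java
import java.util.Locale;

public class LinkFilter {

    // Returns true only for href values that can be checked with an HTTP request.
    // A null or blank href, or a scheme such as mailto: or javascript:, returns false.
    public static boolean isCheckable(String href) {
        if (href == null || href.trim().isEmpty()) {
            return false;
        }
        String lower = href.trim().toLowerCase(Locale.ROOT);
        return lower.startsWith("http://") || lower.startsWith("https://");
    }

    public static void main(String[] args) {
        System.out.println(isCheckable("https://www.bbc.com/sport"));   // true
        System.out.println(isCheckable("mailto:someone@example.com"));  // false
        System.out.println(isCheckable(null));                          // false
    }
}
```

In the loop over allLinks, a call such as isCheckable(urlLink) could decide whether to skip the element before creating the URL object.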

Step-3: Send HTTP request and Read HTTP response codes

We open an HttpURLConnection for the URL. I also set a connection timeout of 5000 milliseconds (5 seconds).

URL url = new URL(urlLink);
HttpURLConnection httpURLConnect=(HttpURLConnection)url.openConnection();
httpURLConnect.setConnectTimeout(5000);
httpURLConnect.connect();

The response code returned by the server falls into one of these ranges:

  • Informational Response Codes: 100–199
  • Successful Response Codes: 200–299
  • Redirect Codes: 300–399
  • Client Error Codes: 400–499
  • Server Error Codes: 500–599

In short, if the response code is greater than or equal to 400, the link is broken.
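The ranges and the "400 or above" rule can be written as a small helper. This is only a sketch: the StatusCategory class and its categorize and isBroken methods are hypothetical names chosen for this example.

```java
public class StatusCategory {

    // Maps an HTTP status code to the name of its range.
    public static String categorize(int code) {
        if (code >= 100 && code <= 199) return "Informational";
        if (code >= 200 && code <= 299) return "Successful";
        if (code >= 300 && code <= 399) return "Redirect";
        if (code >= 400 && code <= 499) return "Client Error";
        if (code >= 500 && code <= 599) return "Server Error";
        return "Unknown";
    }

    // A link counts as broken when the server answers with a client or server error.
    public static boolean isBroken(int code) {
        return code >= 400;
    }

    public static void main(String[] args) {
        int[] samples = {200, 301, 404, 503};
        for (int code : samples) {
            System.out.println(code + " - " + categorize(code) + " - broken: " + isBroken(code));
        }
    }
}
```

Note that with this rule a redirect (300–399) is reported as a working link, which matches the behavior of the code in this article.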

Here is the complete code:

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.testng.annotations.AfterClass;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

public class FindAllBrokenLinks {
    public final String DRIVER_PATH = "Drivers/chromedriver";
    public final String DRIVER_TYPE = "webdriver.chrome.driver";
    public WebDriver driver;
    public final String BASE_URL = "https://www.bbc.com/";
    public final String LINKS_ATTRIBUTE = "href";
    public final String LINKS_TAG = "a";

    // Start Chrome once for this class and open the page under test
    @BeforeClass
    public void beforeTest() {
        ChromeOptions options = new ChromeOptions();
        options.addArguments("--disable-notifications", "--ignore-certificate-errors", "--disable-extensions");
        System.setProperty(DRIVER_TYPE, DRIVER_PATH);
        driver = new ChromeDriver(options);
        driver.manage().window().maximize();
        driver.get(BASE_URL);
    }

    @Test
    public void findAllBrokenLinks() {
        // Step-1: collect every <a> element on the page
        List<WebElement> allLinks = driver.findElements(By.tagName(LINKS_TAG));
        for (WebElement link : allLinks) {
            String urlLink = null;
            try {
                // Step-2: read the href; skip missing values and non-HTTP links such as mailto:
                urlLink = link.getAttribute(LINKS_ATTRIBUTE);
                if (urlLink == null || !urlLink.startsWith("http")) {
                    continue;
                }
                // Step-3: send the request and read the response code
                URL url = new URL(urlLink);
                HttpURLConnection httpURLConnect = (HttpURLConnection) url.openConnection();
                httpURLConnect.setConnectTimeout(5000);
                httpURLConnect.setReadTimeout(5000);
                httpURLConnect.connect();
                if (httpURLConnect.getResponseCode() >= 400) {
                    System.out.println(urlLink + " - " + httpURLConnect.getResponseMessage() + " is a broken link");
                } else {
                    System.out.println(urlLink + " - " + httpURLConnect.getResponseMessage());
                }
                httpURLConnect.disconnect();
            } catch (Exception e) {
                // Report links that could not be checked instead of silently ignoring them
                System.out.println(urlLink + " - could not be checked: " + e.getMessage());
            }
        }
    }

    // quit() closes every browser window and ends the driver session
    @AfterClass
    public void closeDriver() {
        if (driver != null) {
            driver.quit();
        }
    }
}

I used the BBC home page as the base URL, and the code took 1 minute and 49 seconds to run. :) You may want to choose another website.

Here are some of the test results:

https://www.bbc.com/sport - OK
https://www.bbc.com/reel - OK
https://www.bbc.com/worklife - OK
https://www.bbc.com/travel - Moved Temporarily
https://www.bbc.com/future - OK
https://www.bbc.com/culture - OK
https://www.bbc.com/culture/music - OK

http://www.bbc.co.uk/worldserviceradio/ - Moved Permanently
http://www.bbc.co.uk/programmes/p00wf2qw - Moved Permanently
https://www.bbc.com/news/world-europe-57039362 - OK
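On a page with many links, a run like this can be slow. Two changes that may shorten it are checking each distinct URL only once and sending the requests in parallel. The sketch below illustrates both, and also uses HEAD requests so that no response body is requested; some servers answer HEAD differently from GET, so treat that part as an assumption to verify. The FastLinkCheck class and its uniqueUrls and headStatus methods are hypothetical names for this example, and I have not measured how much time this saves.

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class FastLinkCheck {

    // Drops null, empty, and duplicate URLs while keeping the first-seen order.
    public static Set<String> uniqueUrls(Collection<String> urls) {
        Set<String> unique = new LinkedHashSet<>();
        for (String url : urls) {
            if (url != null && !url.isEmpty()) {
                unique.add(url);
            }
        }
        return unique;
    }

    // Sends a HEAD request and returns the status code, or -1 if the request failed.
    public static int headStatus(String urlLink) {
        try {
            HttpURLConnection connection = (HttpURLConnection) new URL(urlLink).openConnection();
            connection.setRequestMethod("HEAD");
            connection.setConnectTimeout(5000);
            connection.setReadTimeout(5000);
            int code = connection.getResponseCode();
            connection.disconnect();
            return code;
        } catch (Exception e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // In the real test, these values would come from the href attributes collected with Selenium.
        List<String> hrefs = Arrays.asList(
                "https://www.bbc.com/sport",
                "https://www.bbc.com/sport",
                "https://www.bbc.com/culture");

        // parallelStream() checks several links at the same time; output order is not guaranteed.
        uniqueUrls(hrefs).parallelStream()
                .forEach(url -> System.out.println(url + " - " + headStatus(url)));
    }
}
```

The href values should be read from the WebElement objects first, on the test thread, and only the plain strings passed to the parallel part, because a WebDriver session is not meant to be shared across threads.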
