How to Scan a WordPress Blog for Bad External Links

James Parsons • Search Engine Optimization • Published March 02, 2017

Bad links can hurt your website, both giving and receiving. I’m not here to talk about your backlink profile today, though; rather, I’m going to focus on the links you’ve posted on your site, leading out to other sites.

There are a few ways links in this category can hurt you. The one I’m primarily focusing on is links to low-quality or spam sites. These sorts of links can get you in trouble because, if they’re followed links, it looks like you’re endorsing a spam site. Even if it doesn’t read as an endorsement, it can make you look like part of a private blog network, and Google has a history of deindexing those networks when it finds them.

The tricky part with these links is that, even if you’re vetting links as you publish them, you have to keep on top of them over time. If your site is five years old, do you know the status of all of those links in five-year-old blog posts? Many of the destination pages may have moved, changed, or been deleted. Hell, one method of building backlinks to a site is to find sites that have disappeared, register the expired domain, and rebuild the old pages as a microsite that links to yours. It’s part of broken link building. The links you’ve published on your site might no longer go to a legitimate piece of content, and it’s not like anyone is going to tell you when they change.

Thankfully, auditing your links is pretty easy, if a bit time consuming, when you have WordPress as a framework for your site. All you need are a few cheap tools and some judgment over the quality of a site. Here’s the process I use, and I’ve used it to great effect, boosting my SEO by negating links that were holding me down.

Step 1: Get LinkPatrol to Scan All Links

LinkPatrol is a premium WordPress plugin that will scan your entire site for links. It can be used as a tool to change links as well, and we’ll use it for that later in the process, but for now it’s essentially a scraper to pull and download a CSV or other form of spreadsheet of all the links on your site.

If you’re auditing one single site, all you need is the Blogger license for LinkPatrol. That will run you a one-time fee of $50 and can be used indefinitely. You can use it to audit your site once every six months if you so desire, and never have to pay another cent. If you need it on more than one site, you need a larger license. Five sites will run you $100, and 20 sites will run you $200, after which you’ll need to talk to them directly for larger licenses. The tool is made by Search Engine Journal, by the way, so you know it’s pretty high quality.

Simply buy and install the plugin the same way you would install any other plugin on WordPress. Once you have it installed, you will see a LinkPatrol entry on the left column of your admin dashboard. Click it and it will show you a scanner with 0% and a “start scan” button. Click that and it will start to scan your website. Now, keep in mind that this has to scan every page, post, and piece of content on your site where a link may be hiding. If your site is exceptionally large – tens of thousands of pages or more – this can take a while. For anything under 50,000 posts, though, it should be done in under a minute.

Once it’s done, you will find a reports page with four tabs. Each tab has different functions, some of which we’ll use later. Right now, what you want to do is choose one of the reports with data you care about and export the data as a CSV. I like the domain report: I don’t care who the author was and I don’t need to run keyword searches; I just want every link. The domain report will show you the root domain, the number of posts the domain is linked from, the number of links to that domain, and the number of authors who have linked to it.
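
If you’re curious what a domain report boils down to, here’s a rough Python sketch that builds a similar summary from raw post HTML. To be clear, this is not how LinkPatrol works internally, and the `posts` dictionary and `own_host` argument are hypothetical inputs I made up for illustration. It also groups by hostname with a leading "www." stripped, which is only an approximation of a true root domain.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def domain_report(posts, own_host):
    """Summarize external links per host.

    posts: dict mapping a post ID to that post's HTML (hypothetical input).
    own_host: your own hostname without "www.", so internal links are skipped.
    Returns {host: {"links": total link count, "posts": number of posts}}.
    """
    link_counts = defaultdict(int)
    post_ids = defaultdict(set)
    for post_id, html in posts.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            host = (urlparse(href).hostname or "").lower()
            if host.startswith("www."):
                host = host[4:]
            if not host or host == own_host:
                continue  # relative, non-web, or internal link
            link_counts[host] += 1
            post_ids[host].add(post_id)
    return {
        host: {"links": link_counts[host], "posts": len(post_ids[host])}
        for host in link_counts
    }
```

Feed it something like `domain_report({1: post_one_html, 2: post_two_html}, "yoursite.com")` and you get one entry per external host, which is the same shape of data you’ll be working with in the exported CSV.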

Step 2: Use ScrapeBox to Pull the DA and Indexation of All Links

Once you have your spreadsheet of all of your links, you want to download ScrapeBox. ScrapeBox is an incredibly powerful tool that is often used for some gray and black hat techniques, but has numerous legitimate uses as well. It’s also a one-time fee and is generally going to run you about $100, though the price will vary, and you can often find discounts from affiliates around the web (like Scrapebox.com/bhw, where you can get it for only $67).

In order to use ScrapeBox to audit your links, you will also need to register a Moz account. Specifically, you need access to the MozScape API, which you sign up for through Moz. They have a free account that allows you to scan 25,000 links per month, at up to 6 per minute. If that’s too low, or too slow, you can pay the rather steep $250 for a month of faster service, which bumps your quantity limit to 120K rows and your rate to 200 per second. In either case, you need to add your MozScape API login information to your ScrapeBox app.
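
To put those limits in perspective, here’s the arithmetic as a tiny Python sketch. It assumes you hit the stated rate the whole time and use up the full monthly quota, which is a best case rather than a promise.

```python
def scan_minutes(link_count, rate_per_minute):
    """Minutes needed to look up link_count URLs at a fixed rate."""
    return link_count / rate_per_minute


free_hours = scan_minutes(25_000, 6) / 60        # full free quota at 6 per minute
paid_minutes = scan_minutes(120_000, 200 * 60)   # full paid quota at 200 per second

print(f"free tier: about {free_hours:.1f} hours")     # free tier: about 69.4 hours
print(f"paid tier: about {paid_minutes:.0f} minutes") # paid tier: about 10 minutes
```

In other words, a big site on the free account is a leave-it-running-for-a-few-days job, while the paid tier finishes in the time it takes to get a coffee.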

In ScrapeBox, you’re looking for the Page Authority add-on, which is included free. This allows you to upload a list – your link CSV – and will scan each link for its Page Authority, Domain Authority, and MozRank. Page Authority isn’t all that important here, since you’re looking more at the individual domains rather than specific links.

You can also scan your list with ScrapeBox to find out whether or not a page is indexed in Google. This doesn’t require the Moz API, and it simply gives you a yes or no answer. Keep this data handy for the next step. Pull the Domain Authority from your ScrapeBox results and get ready to analyze some data.

Step 3: Nofollow All Links Not Indexed by Google

The first and simplest bit of auditing you can do is ridding yourself of sites that are no longer indexed by Google. Any page not indexed by Google is likely to be a spam page, a penalized page, or a page with steep manual actions taken against it that it hasn’t resolved. In each case, it does you no good to send a bit of your PageRank to that site, so nofollow all of those links. You can, if you prefer, simply remove them all, though that’s potentially more dangerous if those pages do have value despite not being indexed.

To nofollow or remove the links to a given domain, go back to LinkPatrol and find the domain in the domain list. There are two controls for each domain, one per option, so pick the one that fits your plan.

One interesting thing to note is that LinkPatrol doesn’t directly edit the pages on your website. The buttons in question are check boxes, which you can uncheck later if you want to restore some links, in case your SEO takes a hit. LinkPatrol actually just adds a minor script to the rendering of your page that will block links from displaying or add a nofollow attribute to them dynamically. This way it makes a minimum of calls to your server, a minimum of edits to your data, and can be easily undone by unchecking the box later.
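
The general idea of rewriting links at render time, rather than editing the stored post, can be sketched in a few lines of Python. This is only an illustration of the technique and not LinkPatrol’s actual code, which I haven’t seen. The function name and the `flagged_hosts` set are my own assumptions, and the regular expressions only handle ordinary quoted attributes.

```python
import re
from urllib.parse import urlparse

# Patterns for an opening <a> tag and for quoted href / rel attributes.
A_TAG = re.compile(r"<a\b[^>]*>", re.IGNORECASE)
HREF = re.compile(r"""(?<![\w-])href\s*=\s*["']([^"']*)["']""", re.IGNORECASE)
REL = re.compile(r"""(?<![\w-])rel\s*=\s*["']([^"']*)["']""", re.IGNORECASE)


def add_nofollow(html, flagged_hosts):
    """Return a copy of html with nofollow added to links to flagged hosts.

    flagged_hosts: set of hostnames without "www." (hypothetical input).
    The html passed in is left untouched, so the change is easy to undo.
    """

    def rewrite(match):
        tag = match.group(0)
        href = HREF.search(tag)
        if not href:
            return tag
        host = (urlparse(href.group(1)).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host not in flagged_hosts:
            return tag
        rel = REL.search(tag)
        if rel is None:
            # No rel attribute yet: add one just before the closing ">".
            return tag[:-1].rstrip() + ' rel="nofollow">'
        values = rel.group(1).split()
        if "nofollow" in (value.lower() for value in values):
            return tag  # already nofollowed
        values.append("nofollow")
        return tag[:rel.start(1)] + " ".join(values) + tag[rel.end(1):]

    return A_TAG.sub(rewrite, html)
```

Because the stored content never changes, taking a domain out of `flagged_hosts` is all it takes to get the original link back, which is the same appeal as unchecking a box.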

Step 4: Review All Links Under DA60 and Nofollow Bad Links

The most time-consuming part of this audit is manually reviewing the domains and links that are left. Anything with a Domain Authority above 60 is likely an authoritative, trustworthy site. You can leave those alone. This will probably cover a majority of the links on your site, though maybe not, depending on the age of your site and your linking habits.

Anything under DA 60 should be audited manually. Sometimes the domain is one you know and trust, and you can leave those links alone as well. Sometimes the link is to a site that just isn’t doing well, for one reason or another. If the content at the other end of the link is still viable and valuable, it’s a link worth keeping around, but it still might be hurting you to pass some of your PageRank to their site. You can nofollow the domain but keep the links around.

Anything with an exceptionally low or invisible DA might just not be a worthwhile site. Old sites that haven’t been updated in years, sites that have been hacked or compromised, sites that have been replaced or parked: these will all show up. At this point you need to go through each one and decide whether the link is worth keeping. If it is, nofollow it, so you aren’t losing value to a site that doesn’t matter. If it’s not, remove the link entirely, again using the LinkPatrol check boxes.
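
If you’d rather pre-sort the spreadsheet before you start clicking, the rules from Steps 3 and 4 can be written down as a small Python function. This is a sketch under my own assumptions: the `triage` name and its arguments are hypothetical, the sample scores are made up, and since the rule above doesn’t say what to do with a DA of exactly 60, I send it to manual review.

```python
def triage(domain_authority, indexed, threshold=60):
    """Suggest an action for one linked domain.

    domain_authority: DA as a number, or None if no score came back.
    indexed: True if the index check found the page in Google.
    """
    if not indexed:
        return "nofollow"  # Step 3: not indexed by Google
    if domain_authority is not None and domain_authority > threshold:
        return "keep"      # above the DA threshold, left alone
    return "review"        # everything else gets a manual look


# Made-up example rows: (domain, DA, indexed)
rows = [
    ("bigsite.example", 85, True),
    ("smallblog.example", 22, True),
    ("vanished.example", 0, False),
]
actions = {domain: triage(da, indexed) for domain, da, indexed in rows}
```

The "review" pile is the one that still needs your judgment, so the function only narrows the list; it doesn’t decide anything for you.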

Step 5: Use a Broken Link Checker to Replace Broken Links

The final step of my audit process is to rid my site of broken links. Broken links will probably show up with a DA of zero, so you might have come across them already, but you can also use a dedicated tool to scan for them. I don’t like using LinkPatrol for this one because I don’t just remove or nofollow broken links. Rather, I prefer to replace them with new links to similar content, if such content exists.

First, I use Broken Link Checker to scan my site and show me all of the links that go to broken pages. Then I sort them into two categories: those with obvious destination content and those that aren’t so obvious.
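
For a sense of what a broken link check involves, here’s a minimal Python sketch using only the standard library. It is not the plugin’s code, and the function names and user agent string are my own inventions. It treats any 4xx or 5xx response, or no response at all, as broken, which is a simplification: some servers reject HEAD requests even when the page is fine.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_status(url, timeout=10):
    """Return the HTTP status for url, or None if no response came back."""
    request = Request(url, method="HEAD", headers={"User-Agent": "link-audit-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code      # the server answered with an error status
    except (OSError, ValueError):
        return None            # DNS failure, timeout, refused, or bad URL


def is_broken(status):
    """Treat missing responses and 4xx/5xx statuses as broken."""
    return status is None or status >= 400


def broken_links(urls, fetch=fetch_status):
    """Return the subset of urls that look broken, in their original order."""
    return [url for url in urls if is_broken(fetch(url))]
```

The `fetch` argument is there so you can swap in a lookup of statuses you’ve already collected instead of hitting every site again.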

Obvious destination content is easy to replace. If I link to a basic guide to SEO and that guide disappears, I just find a new, up to date guide to basic SEO and replace the link.

If the content was not obvious – as is the case with some joke links and with links with generic anchor text and not enough context in the paragraph itself – I will run the link through Google or the Wayback Machine to find out what it used to be. Then I’ll decide whether the link needs to be replaced, or if it can be removed without issue. Often it can be removed just fine, but sometimes I’ll find a decent replacement and be able to swap it out right away.
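
If you have a long list of dead links, the Wayback Machine lookup can be scripted. The sketch below uses the Internet Archive’s availability endpoint and the response shape as I understand them from its public documentation; treat both as assumptions and check the current docs before relying on it. The function names are my own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(dead_url, timestamp=None):
    """Build the lookup URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": dead_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(payload):
    """Return the archived URL from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(dead_url, timestamp=None):
    """Ask the API for the snapshot closest to timestamp (makes a network call)."""
    with urlopen(availability_url(dead_url, timestamp), timeout=10) as response:
        return closest_snapshot(json.load(response))
```

Passing the date of the post that contains the link as the timestamp should get you a snapshot from around the time you linked to it, which is usually the version you want to read.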

This is a pretty tedious process, so I recommend keeping up on it. Leave Broken Link Checker installed and it will periodically check your site’s links to see if their status has changed. If it has, the plugin will notify you via your dashboard or via email, and you can address the change right away.

Once you’re done with all of the above, you’ll have a cleaner outbound link profile on your site and should see a modest improvement in your rankings. It might not be a huge increase, but if you’re in a competitive niche you might see yourself rise in the ranks somewhat. If you see a decrease in ranking, you might have too few external links left on your site, and you should restore or replace some of the links you hid or removed.

I also highly recommend doing a moderate-depth scan of your site links about once every six months or once a year. The longer you let it go, the more negative impact it can have, and the more work it will take to audit, edit, and analyze all of those links. The initial audit is always the hardest, but ideally you’ll be able to keep on top of things so it never gets bad again.

Written by James Parsons

James is a content marketing and SEO professional who enjoys the challenge of driving sales through blogging while creating awesome and useful content.