Websites don’t last forever. Even the most reputable websites can go offline, change URLs, or put their content behind a paywall. If your site has a lot of hyperlinks to external websites, chances are that some of those links have become the Internet equivalent of a dead end. Rather than pointing to the intended page, these links can lead to 404 errors, permanent redirects, or a “Site Not Found” message.
This phenomenon is called link rot, and it happens all the time. The older your site is, the more likely it is to have broken links. Due to the flexible and decentralized nature of the Internet, link rot is an unavoidable inconvenience for virtually any website. What’s worse, too many dead links on your website can negatively affect your SEO rankings.
Finding link rot can be as simple as running a web crawler through all of your links and verifying that each one still leads somewhere that exists. If you use a tool like Google Analytics, you may already have a report with these details. WordPress users can install Broken Link Checker, a plugin that crawls your website looking for broken links. Once you have identified the links that lead nowhere, here are some strategies you can use to patch them up.
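If you would rather script the crawl yourself, the idea can be sketched in a few lines of Python using only the standard library. This is a rough illustration, not a replacement for a real crawler: the names `LinkCollector`, `external_links`, and `check_link` are made up for this example, and it assumes you already have the HTML of one of your pages as a string.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin, urlparse
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page's own URL.
            self.links.append(urljoin(self.base_url, href))


def external_links(html, base_url):
    """Return absolute http(s) links that point away from base_url's host."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    own_host = urlparse(base_url).netloc
    return [
        link
        for link in collector.links
        if urlparse(link).scheme in ("http", "https")
        and urlparse(link).netloc != own_host
    ]


def check_link(url, timeout=10):
    """Return the final HTTP status code for url, or None if unreachable."""
    request = Request(url, method="HEAD", headers={"User-Agent": "link-rot-check"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code  # the server answered with an error such as 404 or 410
    except OSError:
        return None  # DNS failure, refused connection, or timeout
```

Calling `check_link` on each URL returned by `external_links` gives you a status code per link. Anything that comes back as `None`, 404, or 410 is a candidate for the fixes below. Some servers reject HEAD requests, so treat an unexpected error code as a prompt to open the link by hand rather than as proof that it is dead.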
Remove the link
The simplest way of fixing a link is to remove it. No link equals no link rot. This may be necessary if the originally linked content no longer exists, or if the link is no longer relevant.
But given that you want to keep as many outgoing links as possible, here are some ways you can get started on patching those rotted links.
Find out where the content moved
Have you ever had a friend who changed phone numbers, but never told you their new phone number? This happens all the time with content on the web. The publisher changes the URL, and does not bother leaving behind a redirect from the old URL to the new one.
In situations like this, you should first try searching for the link on your favorite search engine. For example, I have the following broken link on one of my older posts:
Because the link comes from a news website, the URL suggests that the page was titled “All signs point to snow for morning commute”. I ran a Bing search for that exact phrase, which yielded the following link:
The title this time was “Snow accumulates in Seattle for first time in nearly 2 years”. That is close enough to the original, and the publication date matched that of the original link.
Sometimes, you won’t find the exact content that you were hoping for, but this is perfectly normal. In any case, it’s better to link to closely related content than to link to a dead end.
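If you have many broken links to work through, turning each URL into a search phrase can be automated. The sketch below assumes the last path segment of the URL is a hyphen-separated slug, which is common on news sites but by no means universal. The function name and the example URL are hypothetical.

```python
import re
from urllib.parse import unquote, urlparse


def search_phrase_from_url(url):
    """Guess a page title from the slug at the end of a URL."""
    path = urlparse(url).path.rstrip("/")
    slug = unquote(path.rsplit("/", 1)[-1])
    # Strip a trailing file extension such as .html or .php.
    slug = re.sub(r"\.(html?|php|aspx?)$", "", slug, flags=re.IGNORECASE)
    words = re.split(r"[-_+]+", slug)
    # Drop empty pieces and long all-digit tokens, which are usually article IDs.
    words = [w for w in words if w and not (w.isdigit() and len(w) >= 5)]
    return " ".join(words)


# Hypothetical URL, shaped like a typical news-site permalink.
phrase = search_phrase_from_url(
    "https://news.example.com/local/all-signs-point-to-snow-for-morning-commute-12345.html"
)
print(f'"{phrase}"')  # wrap in quotes to search for the exact phrase
```

The output is only a guess at the title, so it works best as a starting query that you refine by hand.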
Link to an archived version of the content
If you’ve rigorously searched for days without food or sleep (let’s hope not) and have come up empty, you can turn to Archive.org’s vast collection of archived web pages. Its Wayback Machine crawls much of the public web, saving pages in its ever-growing collection of yesteryear.
Using the Wayback Machine is straightforward. Paste the broken URL into the search box at web.archive.org, and it shows you all the snapshots it took of the content at that particular URL. For example, in a previous post I linked to the now-defunct website iNezha.com. I looked it up in the Wayback Machine, and here are its results:
Next, click through the dates in the calendar until you find a capture that matches what you expect. I decided to use the Sep 17, 2009 capture of this web page, which looks like this:
For WordPress users with the Broken Link Checker plugin, there is actually a feature for replacing an old link with one from Archive.org. However, I would still recommend checking the suggested link, as web pages can change content quite frequently.
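The same lookup can be scripted through the Wayback Machine’s availability endpoint at `https://archive.org/wayback/available`, which takes a `url` and an optional `timestamp` and returns JSON describing the closest snapshot. The following is a minimal sketch under that assumption. The function names are my own, and the response handling covers only the `archived_snapshots.closest` fields used here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url, timestamp=None):
    """Build the query URL; timestamp is a string such as YYYYMMDD."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def snapshot_url(payload):
    """Pull the closest snapshot's URL out of a decoded response, if any."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None  # nothing archived for this URL


def find_snapshot(url, timestamp=None, timeout=10):
    """Ask the Wayback Machine for the snapshot closest to timestamp."""
    query = availability_query(url, timestamp)
    with urlopen(query, timeout=timeout) as response:
        return snapshot_url(json.load(response))
```

For the example above, `find_snapshot("inezha.com", "20090917")` would ask for the capture closest to Sep 17, 2009. As with the plugin, open the returned link and confirm it shows the content you meant to cite before swapping it in.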
Beware hyperlinked media link rot
In HTML, it is very simple to embed an external image or video into your content. You just need to supply the correct URL. Unfortunately, the problem with media link rot is that it sticks out like a sore thumb. Users only notice that a text link is dead when they click on it. With dead media, your readers will see anything from a blank space with an ugly X to a placeholder image with the words “This image has been permanently deleted.”
The best course of action in this scenario is to find another website that hosts the original media, and use that link in place of the dead one. If you can get hold of the original image through Archive.org, you can run a reverse image search on Google Images to find a live copy and its URL.
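Before you can fix dead media, you need a list of what your pages embed. Here is a small sketch, again with the Python standard library, that pulls the `src` of common media tags out of a page’s HTML. The tag list and the names `MediaCollector` and `embedded_media` are assumptions made for this example, and it ignores attributes such as `srcset` or `poster`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Tags that commonly embed external media through a src attribute.
MEDIA_TAGS = {"img", "video", "audio", "source", "iframe", "embed"}


class MediaCollector(HTMLParser):
    """Collects (tag, absolute URL) pairs for embedded media (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.media = []

    def handle_starttag(self, tag, attrs):
        if tag not in MEDIA_TAGS:
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the page's own URL.
            self.media.append((tag, urljoin(self.base_url, src)))


def embedded_media(html, base_url):
    """Return every embedded media URL found in html, in document order."""
    collector = MediaCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.media
```

Each URL in the result can then be checked the same way as a text link, which lets you catch a dead image before your readers do.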
What strategies or tools do you use to mitigate link rot? Let us know in the comments!