Help! My developer won’t implement / wants to remove / objects to my redirects.
11th December, 2014
A client recently asked me to chip in on a debate they were having with an internal tech team about retiring old 301 redirects. Concerns had been raised about ageing and legacy CMS functionality which needed to be replaced. As part of this process, the IT folks pushed to remove (and not replace) the thousands of redirects which the system was powering.
The only hint as to why they’d think this would be a good/necessary idea was an anecdotal quote from the proponents. They’d decided that as Google had seen, visited, and clearly indexed the destination URLs for some time, the redirects were no longer necessary. Their removal shouldn’t cause any harm. At best, they’d concede that the SEO guys might need to maintain a smaller list of redirects which were specifically set up ‘for SEO purposes’ (such as forcing http/s and lowercase, and a handful of specific page-level redirects), but that otherwise, they’d need to ‘trim the fat’.
Of course, this instantly set alarm bells ringing. The business dominates their market; largely because of the performance of the destination URLs for the redirects in question. Any negative impact whatsoever on the equity flowing to those pages could have huge real-world, commercial implications.
This is a type of scenario which I see frequently. There’s always huge reluctance from the IT/developer community to implement, support or maintain ‘bulky’ redirect lists – either on a day-to-day basis, or as part of website launches or significant overhauls. It’s invariably a painful experience fighting to justify why “let’s just redirect the top 100 most visited URLs on the website” isn’t a viable option. Because it’s hard to quantify the tangible opportunity cost of failing to implement redirects, it’s an incredibly challenging argument to have. It often devolves into simply pleading for people to take leaps of faith.
Given how many times I’ve fought this battle, I wanted to do my bit to help my erstwhile client, and provide some outlines of strong arguments in favour of keeping redirects. Here’s the (slightly edited to preserve anonymity) transcript.
Don’t ever remove redirects. There’s no reason to, and no benefit in doing so. The performance gloom and doom your developers preach is nonsense.
Don’t ever remove redirects. URLs still get hit/crawled/found/requested years after they’re gone. This might impact equity flow, quality, and crawl stuff.
Don’t ever remove redirects. It gives a bad user experience, which impacts the bottom line.
There are a few core considerations from my perspective:
- Tech teams are always highly sceptical of redirects, as I’m sure you’re well-aware. They’re often hesitant to implement even ‘foundational’ global redirects (such as case-forcing, http/s, etc) due to either:
- A lack of understanding/consideration of the user experience and/or the way in which Google discovers, crawls and indexes (and the way in which equity flows in this process)
- Concern over server performance resulting from high levels of redirect lookups on each request
- A desire to ‘trim the fat’ as a part of general housekeeping because it’s good practice to do that kind of thing in other scenarios and dev processes
- In each of these cases, I’ve found that focusing on user experience considerations is a good solution. Getting an agreement that any scenario in which a user enters the site on a 404 error page has a negative commercial impact on the business removes a lot of the ‘SEO politics’.
- You can even go so far as to quantify this through surveying (e.g., “would seeing an error page on our site affect how much you trust the brand?”) and ascribe a £loss value to each resultant 404 per hit, based on estimated impact on conversion rates, average order values, etc. This gives you lots of ammunition to justify the continued existence of the redirects in a context which they can understand and buy into.
- Performance arguments (“Having 10,000 redirects in our .htaccess file will slow our site down!”) can usually be easily tested, and done so under simulated strain. I’ve yet to see any meaningful performance impact following the successful deployment of redirects numbering up to the ~20k mark; though at that stage, you might want to be moving that logic into load balancers, maps, and/or caching systems.
- It’s also worth considering that almost all of the major CMS platforms operate by looking up the requested URL against a database to find a pattern-matched result; as such, depending on the approach to implementation, having a large list of redirects to check against needn’t represent an extra lookup, process, or performance hit (unless the quantities are ridiculous, or the request forms obscenely complex).
- If performance concerns are still a barrier, I typically resort to identifying equivalent performance opportunities, load speed improvements, and technical optimisations which can offset the perceived cost. There’s never any shortage of opportunities for simple things like improved resource caching, client-side rendering efficiencies, etc. If they can estimate the ‘damage’, it should be possible to ‘offset’.
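To illustrate why a large redirect list needn’t mean a per-request scan, here’s a minimal, hypothetical Python sketch (the paths are invented, and this isn’t any particular CMS): with exact-match rules held in a hash map — much like a CMS’s database lookup — resolving a request costs roughly the same whether there are 100 rules or 20,000.

```python
# Hypothetical sketch: exact-match redirect rules in a hash map. Checking
# 20,000 entries is one lookup, not a 20,000-line scan of a config file.
REDIRECTS = {f"/old/page-{i}": f"/new/page-{i}" for i in range(20_000)}

def resolve(path):
    """Return (status, location) if a redirect rule matches, else None."""
    target = REDIRECTS.get(path)  # single hash lookup, regardless of list size
    return (301, target) if target is not None else None
```

The same principle is why moving bulky rule lists out of per-request `.htaccess` parsing and into a keyed map (or load balancer) tends to make the performance objection evaporate.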
It’s also worth demonstrating that URLs never really stop getting hit. Ideally, you’d have a solution in place which monitors how many times a redirect has been hit, and the last time/date at which this occurred; I use a system like this on a number of my own sites. I periodically review the list and retire redirects that have been ‘stale’ for over 12 months and have a low overall hit count, but there are always surprisingly few of these.
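A minimal sketch of that monitoring idea might look like the following — the function names and the 5-hit threshold are my own invented illustrations; only the 12-month ‘stale’ window comes from the review policy described above.

```python
import time

# Hypothetical sketch of redirect hit monitoring: keep a count and a
# last-hit timestamp per rule, so 'stale' rules can be reviewed later.
STALE_AFTER_SECONDS = 365 * 24 * 3600  # 12 months, per the policy above

hits = {}  # path -> {"count": int, "last_hit": float (unix timestamp)}

def record_hit(path, now=None):
    """Call whenever a redirect rule fires for a request."""
    now = time.time() if now is None else now
    entry = hits.setdefault(path, {"count": 0, "last_hit": 0.0})
    entry["count"] += 1
    entry["last_hit"] = now

def stale_candidates(now=None, max_hits=5):
    """Redirects untouched for 12+ months with a low overall hit count."""
    now = time.time() if now is None else now
    return sorted(p for p, e in hits.items()
                  if now - e["last_hit"] > STALE_AFTER_SECONDS
                  and e["count"] <= max_hits)
```

In practice you’d persist this to a log or database rather than memory, but even this much is enough to turn “surely nobody hits these any more” into a measurable question.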
What I’ve found interesting from running this system for a number of years over multiple sites is that:
- Any URLs which used to have any real levels of equity, links and/or performance keep getting hit for years
- Redirects which have been removed and return a 404 keep getting hit by search engines, indefinitely
- I run ~8k redirects on one of my own sites at present, with no measurable performance hit for turning them all on/off
- Removing redirects in these scenarios has a definite impact on SEO, even if only indirectly, through reduced discovery, crawlability, and crawl equity distribution when a bot enters the site at a dead URL
- People still link to dead content; much of the web is stale. We also know that URLs which used to be linked to still maintain some degree of equity, so removing any of these is liable to negatively impact the destination page’s equity
- Any marginal performance overhead for running this kind of system (the performance fears resurface here in the context of maintaining these kinds of logs) is vastly offset by the value retention and improved user experience
I think that crawl quota is a particularly relevant point for [your large site]. With a site of your size and scale, any activity which is likely to result in Google allocating lower crawl resource is liable to have enormous consequences for indexation levels and speed, which is obviously to be avoided at all costs.
I’d expect a site of your scale to want to maintain as many of these legacy redirects as possible, for as long as possible, until the impact of them is measurably and demonstrably detrimental and this cannot be offset elsewhere. Upwards of 6,000 sounds like a lot, but I think that it’s a reasonable and realistic volume given your site, legacy, etc. Having said that, I’d definitely want to be putting processes in place in the future to minimise the likelihood of URL structures expiring, and planning for more future-proofed patterning (obviously easier said than done!). Good practice suggests that no URL should ever change or die, which is a card you might be able to play, but I suspect that may lead to blame games around who-chose/designed-which-legacy-URL-structure, which may cause more harm than good at this stage!
If there’s continued pressure to ‘trim the fat’, I’d personally want to investigate, query and quantify every single rule which is proposed for deletion to understand whether there’s still anything linking to it, whether the URL has/had any social shares, and whether server-logs indicate that it’s been hit recently. But even if the vast majority of URLs turn out to be ‘duds’, all of the above rationales around user experience and equity management still apply.
I wonder – as part of the migration process, is there any opportunity to combine any of the redirects into pattern matching rules? E.g., if there are 100 URLs with a shared folder root, it may be prudent to craft a single rule based on matching that pattern. This should reduce quantities significantly.
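As a hypothetical sketch of that consolidation (the paths and destination structure here are invented for illustration), 100 individual rules sharing a folder root can collapse into a single pattern rule:

```python
import re

# Hypothetical sketch: one regex rule replaces 100 per-URL rules that all
# share the /archive/2009/ folder root. Paths are invented examples.
PATTERN_RULES = [
    (re.compile(r"^/archive/2009/([\w-]+)/?$"), r"/blog/\1"),
]

def match_redirect(path):
    """Return the rewritten destination if a pattern rule matches, else None."""
    for pattern, replacement in PATTERN_RULES:
        if pattern.match(path):
            return pattern.sub(replacement, path)
    return None
```

The same idea translates directly to `RewriteRule`/`RedirectMatch` directives in Apache, or map files in nginx, if the redirects live at the server layer rather than the application.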
From a completely alternate perspective, if deletion is to proceed, I’d potentially consider returning a 410 rather than 404 status on the URLs in question. This may help in sending a stronger signal that, rather than a large section of your website having broken/vanished (and the negative connotations associated with this), a deliberate decision has been made to remove those pages. I’m not convinced that this would make any real difference, but it feels like a low-risk, low-effort option which may help out.
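The distinction is simple to express in code; here’s a hypothetical sketch (the retired paths are invented examples) of serving 410 Gone for deliberately retired URLs and 404 Not Found for everything else:

```python
# Hypothetical sketch: 410 = intentionally removed, 404 = unknown/never
# existed. The retired path list is an invented example.
RETIRED_PATHS = {"/old-campaign", "/discontinued-range"}

def status_for_unmatched(path):
    """Status code for a request that matches no page and no redirect."""
    return 410 if path in RETIRED_PATHS else 404
```
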
Hopefully some of this can help bolster your arguments.
As a predominantly technical person, I’ve often argued in favour of redirects from an equity-preservation perspective; it’s only in writing the email that I realised that the two biggest points here are the tangible impact on user experience and the expected (and somewhat measurable) impact on crawl quotas.
I think that next time I run into this scenario, I might not talk about ‘link juice’ or the difference between different HTTP status codes at all, and just focus on these elements. Who’d have thought?