Using a content delivery network (CDN) can provide a better user experience with faster page and resource loading and less downtime. It can also save online businesses a lot of money and minimize damage from DoS attacks.
However, many SEOs understandably get concerned when they find out the website they’re in charge of is going to start hosting its images and other types of content on another domain. I’d like to share four SEO best practices for using a CDN that should allow you to get the benefit of a CDN without the risk of losing all of your traffic from Google Image Search and elsewhere.
Tip #1 – Use Your Own Subdomain
Set up the CNAME so that the CDN exists on a subdomain of your own site (e.g. cdn.yoursite.com). This is better than serving files from the CDN company's domain (e.g. yoursite.CDNCompany.com). If that sounds confusing to you, just be sure to ask the CDN company if you can have the service exist on your own subdomain instead of one of theirs. I think most services can be set up that way. Some services, like Cloudflare, act more like a reverse proxy for your entire domain, so nothing really changes URL-wise anyway.
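One way to audit this after the switch is to check that every asset URL on your pages still points at a host under your own domain. The snippet below is a minimal sketch of that check. The function name and the yoursite.com default are placeholders for illustration, and the https:// scheme is an assumption.

```python
from urllib.parse import urlsplit


def is_on_own_domain(asset_url: str, site_domain: str = "yoursite.com") -> bool:
    """Return True when the asset is served from the site domain or one of its subdomains."""
    host = (urlsplit(asset_url).hostname or "").lower()
    site_domain = site_domain.lower()
    # Match the bare domain or anything ending in ".yoursite.com",
    # so "notyoursite.com" is not mistaken for a subdomain.
    return host == site_domain or host.endswith("." + site_domain)


if __name__ == "__main__":
    for url in (
        "https://cdn.yoursite.com/images/image1.jpg",
        "https://yoursite.CDNCompany.com/images/image1.jpg",
    ):
        print(url, is_on_own_domain(url))
```

Any URL that fails the check is being served from a domain you do not control.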
Tip #2 – Retain File Names and Paths
Keep the same file naming conventions you had before the CDN. For example, if an image was at www.yoursite.com/images/image1.jpg and you go to a CDN that renames it (e.g. cdn.yoursite.com/images/000123.jpg) or changes the folder structure (e.g. cdn.yoursite.com/assets/image1.jpg), you could run into problems if the old URLs are not redirected. That’s what happened to this guy. If the pre-CDN path to the file was www.yoursite.com/images/image1.jpg, you want the file to be here: cdn.yoursite.com/images/image1.jpg.
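The rule comes down to swapping the host and nothing else. Here is a rough sketch of that mapping, plus a check you could run over a list of old and new URLs. The helper names and the https:// scheme are assumptions made for the example, and cdn.yoursite.com is the placeholder subdomain from Tip #1.

```python
from urllib.parse import urlsplit, urlunsplit

CDN_HOST = "cdn.yoursite.com"  # placeholder CDN subdomain, as in Tip #1


def to_cdn_url(original_url: str, cdn_host: str = CDN_HOST) -> str:
    """Swap only the host; keep the scheme, folders, file name, and query as they were."""
    parts = urlsplit(original_url)
    return urlunsplit((parts.scheme, cdn_host, parts.path, parts.query, parts.fragment))


def keeps_path(original_url: str, cdn_url: str) -> bool:
    """Return True when the CDN URL has the same folder structure and file name."""
    return urlsplit(original_url).path == urlsplit(cdn_url).path


if __name__ == "__main__":
    old = "https://www.yoursite.com/images/image1.jpg"
    print(to_cdn_url(old))
    print(keeps_path(old, "https://cdn.yoursite.com/assets/image1.jpg"))
```

Any pair that fails keeps_path is a URL that would need a redirect from the old location.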
Tip #3 – Set Up the CDN in GWT
You should set up and verify your CDN subdomain in Google Webmaster Tools (GWT) and Bing Webmaster Tools. This allows you to do several things. For one, if you are concerned that the CDN's IP address is outside your target country, you can set the geographic target in GWT. It also allows you to use the URL removal tool if needed, among other things.
Tip #4 – Use Rel Canonical on CDN-Hosted Pages
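For the simple version of this tip, a page that can be reached on the CDN subdomain carries a rel canonical tag pointing back to the same path on the main site. The snippet below is a minimal sketch of building that tag. The function name, the www.yoursite.com host, and the choice to drop the query string are assumptions for illustration, not a prescribed setup.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

MAIN_HOST = "www.yoursite.com"  # placeholder for the host you want indexed


def canonical_link_tag(request_url: str, main_host: str = MAIN_HOST) -> str:
    """Build a rel canonical tag that points at the same path on the main host."""
    parts = urlsplit(request_url)
    # Assumption: the canonical URL drops the query string and fragment.
    canonical = urlunsplit((parts.scheme, main_host, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


if __name__ == "__main__":
    print(canonical_link_tag("https://cdn.yoursite.com/blog/post-1"))
```

The tag goes in the head of the page, so the same template can emit it no matter which host served the request.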
Well, we found out the hard way, when Amazon introduced CDN for pages, that we were having a duplicate content issue. So we took the following steps to stop it from happening:
Every page view for Moz.com runs through our CMS, so we look for the CloudFront User Agent and throw a 404. Since Amazon won’t cache a 404, there are no duplicate pages within CloudFront, only our images. The images don’t go through the CMS, so they don’t get blocked and work fine through the CDN.
Once we put this in place we had to invalidate every page that was cached and then request they be removed from Google.
Amazon’s CloudFront User Agent is “Amazon CloudFront”.
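The user agent check described above could look roughly like this as a WSGI wrapper around the CMS. This is a sketch under assumptions, not the actual Moz code: block_cloudfront and cms_app are hypothetical names, and the only detail taken from the text is the “Amazon CloudFront” user agent string.

```python
CLOUDFRONT_USER_AGENT = "Amazon CloudFront"  # user agent string quoted in the text above


def block_cloudfront(app):
    """Wrap a WSGI app so that requests made by CloudFront receive a 404."""

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if CLOUDFRONT_USER_AGENT in user_agent:
            body = b"Not Found"
            headers = [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ]
            start_response("404 Not Found", headers)
            return [body]
        # Every other visitor gets the normal page from the CMS.
        return app(environ, start_response)

    return middleware


def cms_app(environ, start_response):
    """Hypothetical stand-in for the CMS that renders pages."""
    body = b"<html>page</html>"
    headers = [
        ("Content-Type", "text/html"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


application = block_cloudfront(cms_app)
```

Images that are served outside the CMS never reach this wrapper, which matches the setup described above where only pages are blocked.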
As you can see, solutions to the issue of duplicate content being indexed on a CDN subdomain can range from the relatively simple use of rel canonical tags to a complicated process put in place by some very bright developers. Overall, however, using a CDN should be safe for SEO and could even improve your page load speed, which could have a positive effect on rankings. Just follow best practices and you should be fine. Here are a few more resources to check out…
Other Useful Articles
What The Heck is a CDN Infographic on WPBeginner.com
Google on Content Delivery Networks & Search Rankings by Barry Schwartz at SEL
More Tips from Google on Content on CDNs by Barry Schwartz at SEL
How Does Google Treat Sites Behind a CDN on Google Product Forums
Real World Performance Comparison of CDNs by Dirk Paessler
Using Amazon S3 Without SEO Issues by Barry Schwartz at RustyBrick.com
Setting Up a CDN in WordPress by Martin Brinkman on Ghacks.net
Multicast vs Unicast Streaming (credit for image above)