How to do a health check on your hotel website using the server log file

A sound technical SEO setup is crucial to the success of your hotel website's SEO strategy: if the backend isn't set up properly, you may jeopardize your site's search ranking. Conducting a server log file analysis can help ensure that your site is indexable (meaning that search engines can correctly detect the content on your site), and it also allows you to spot other potential issues.

What is a server log?

A server log is a file on your web host that records every request made to your website during a particular period of time. So every time a search engine crawler accesses your website to discover its pages and content, those requests should be recorded in your server logs.

How do I access my server logs?

The ways in which you obtain access to your website’s server logs depend on the technology used on the backend of your website and who is responsible for managing it. If you have an IT team at your hotel responsible for hosting the site, then they’d likely be the ones you should ask to obtain access to these logs. If you’re working with a web agency to manage your site, then the agency should already be keeping an eye on the server logs for you.

If your site is self-hosted and you can access the cPanel software, then it's quite simple: just look for the server logs icon.

[Image: cPanel log files]

If you are able to get access to a server log, you'll then need a program to view the file. We recommend Screaming Frog Log Analyser, but you can also check out DeepCrawl.

What can server logs tell me about the state of my website?

Server logs can help you better understand how your website is being accessed and by whom—from rogue third parties looking to scrape your hotel’s rates, to genuine search engine crawlers such as Googlebot or Bingbot. Here are the top 10 things that your log file can tell you about the health of your website:

1 – Determine speed bottlenecks on your website

When you have access to your site’s server log, you can easily check the amount of time it takes to load resources on your website through the Average Response Time. This metric tells you how long it took for your web server to respond to the requests it received. The lower the figure, the faster the response time, and the better experience your website delivers to the end user, which ultimately helps to improve search ranking.
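To make this concrete, here is a minimal Python sketch of pulling per-URL average response times out of a log file. It assumes your logs use the common nginx/Apache combined format with the response time (nginx's $request_time, in seconds) appended as the last field; that field isn't logged by default, so check your server's log_format configuration first. The function name is our own.

```python
import re
from collections import defaultdict

# Matches a combined-format line with the response time appended at the end:
# 1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "GET /rooms HTTP/1.1" 200 5120 "-" "Mozilla/5.0" 0.342
LINE_RE = re.compile(
    r'"(?:GET|POST|HEAD) (?P<url>\S+) HTTP/[\d.]+" \d{3} \d+ ".*" ".*" (?P<time>[\d.]+)$'
)

def average_response_times(log_lines):
    """Return {url: average response time in seconds} for each URL seen."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in log_lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip lines without a trailing response-time field
        totals[m.group("url")] += float(m.group("time"))
        counts[m.group("url")] += 1
    return {url: totals[url] / counts[url] for url in totals}
```

Sorting the resulting dictionary by value immediately surfaces your slowest URLs.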

2 – Find bandwidth-intensive resources

Having issues with downtime on your web host, or intermittent site access issues? It could be that some specific resources on your server are taking too long to load, or have been requested heavily.

You can use your server log to identify these URLs with a large file size (perhaps uncompressed images or video files). Anything with a really large file size or with a slow response time should be assessed for potential changes: can the files be compressed? Are the heavy elements essential to your site or are there more efficient alternatives?
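A quick way to find those heavy resources is to total up the bytes served per URL, since the response size is the seventh field of the standard Combined Log Format. A rough sketch (function name is ours):

```python
import re
from collections import Counter

# In the Combined Log Format the field after the status code is the response
# size in bytes: ... "GET /hero.jpg HTTP/1.1" 200 4815162 ...
REQ_RE = re.compile(r'"(?:GET|POST|HEAD) (?P<url>\S+) HTTP/[\d.]+" \d{3} (?P<bytes>\d+)')

def heaviest_urls(log_lines, top=10):
    """Return (url, total bytes served) pairs, largest first."""
    totals = Counter()
    for line in log_lines:
        m = REQ_RE.search(line)
        if m:
            totals[m.group("url")] += int(m.group("bytes"))
    return totals.most_common(top)
```

URLs near the top of this list that turn out to be uncompressed images or videos are prime candidates for compression or removal.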

3 – Check for missing redirects and review status codes

Although an analytics tracking tool such as Google Analytics can show you the most accessed pages on your site, it likely won't pick up every page access, such as visits from non-human visitors or from users with JavaScript or cookies disabled (Google Analytics relies on both).

In contrast, a log file analysis can show you a list of all URLs accessed from your web host. For example, if your site went through an SSL migration at some point, you'll likely see plenty of activity on your old http://www site address, as well as the redirect status (which should hopefully be a 301 redirect to the HTTPS equivalent).

[Image: 404 error codes]

Checking on your redirects is an important step in ensuring that your site is well optimized for search engines. Try to keep redirects to a minimum wherever you can. For example, don’t link internally to a page that subsequently redirects—instead, edit the link to the new URL of the page. Also, make sure that no important pages are returning a 404 error code or are intermittently serving timeout codes.

Similarly, you should check that key pages on your website consistently return a 200 status code, which tells search crawlers that a page exists at the designated location. A key page that only intermittently returns 200 warrants further investigation.
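The status-code review above can be sketched as a small Python helper that groups every requested URL by the status code it returned, again assuming Combined Log Format lines (the function name is ours):

```python
import re
from collections import defaultdict

# Captures the requested URL and the three-digit HTTP status code.
STATUS_RE = re.compile(r'"(?:GET|POST|HEAD) (?P<url>\S+) HTTP/[\d.]+" (?P<status>\d{3})')

def urls_by_status(log_lines):
    """Group requested URLs by status code, e.g. {200: {...}, 301: {...}, 404: {...}}."""
    buckets = defaultdict(set)
    for line in log_lines:
        m = STATUS_RE.search(line)
        if m:
            buckets[int(m.group("status"))].add(m.group("url"))
    return buckets
```

The 404 bucket shows you broken pages to fix, and the 301 bucket shows you redirecting URLs whose internal links should be updated to point straight at the final destination.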

4 – Find out if your site has been shifted to the mobile-first index

Until recently, it was surprisingly difficult to check whether your website had yet been shifted to Google's mobile-first index, where Google ranks your entire website (desktop and mobile versions) based on how your mobile version is performing. If your site is slow on mobile, or doesn't function well for mobile users, then don't be surprised to see a drop in your search rankings.

Google recently stated that it will send out notifications within Google Search Console when a site is shifted to the mobile-first index, but you can stay ahead of the game by keeping an eye on your server logs. If you see a big surge in visits from Google's mobile bot, then your site has likely already been switched to the mobile-first index.

[Image: Google mobile-first indexing notification email]

At the time of writing, the mobile-first switch has taken place for many of our hotel clients, but many others are still waiting to be moved over.

5 – Discover the IP addresses of your traffic and block nuisance IPs

From your log file, you can easily find the IP addresses that have accessed your website during a set period. You’ll likely find real visitors (such as people looking to book a room at your hotel), as well as search engine crawlers and other bots.

For instance, you may find SEO crawling tools such as Screaming Frog or DeepCrawl, or search marketing tools such as SEMrush or Ahrefs, appearing in your log files.

You’ll probably also have various unidentified IPs that may belong to good and bad scripts set to automatically crawl links across the internet and perform various functions (e.g. scraping content, monitoring site changes, and so on). Since some of these scripts may be looking to exploit vulnerabilities on your site, it’s a good idea to block their IP addresses if you are certain that they shouldn’t be on your site. There are various online IP address lookup tools to assist you with this task.

Blocking IPs can be done from your server but should be used with extreme caution. For example, if you start blocking Googlebot from accessing your site, your site would eventually disappear from Google’s index and your organic traffic would drop to zero. (Google documents a reverse DNS method for verifying that a visit claiming to be Googlebot is genuine.)
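Google's recommended verification (reverse-DNS the IP, check the hostname is under googlebot.com or google.com, then forward-resolve it back to the same IP) can be sketched in Python. The lookup functions are injectable parameters here purely so the check can be tested without network access; the function name is ours.

```python
import socket

def is_real_googlebot(ip,
                      reverse=lambda ip: socket.gethostbyaddr(ip)[0],
                      forward=socket.gethostbyname):
    """Verify a claimed Googlebot IP: reverse DNS must point at a
    googlebot.com/google.com hostname, and that hostname must forward-resolve
    back to the same IP (so a spoofed PTR record isn't enough)."""
    try:
        host = reverse(ip)
    except OSError:
        return False
    if not (host.endswith(".googlebot.com") or host.endswith(".google.com")):
        return False
    try:
        return forward(host) == ip
    except OSError:
        return False
```

Run a few of the busiest "Googlebot" IPs from your log through a check like this before deciding whether to block them.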

An alternative way to block specific crawlers from accessing your site would be to deny them access via your website’s robots.txt file. However, you must be extremely careful, as a misplaced character may end up blocking important search crawlers.

[Image: Booking.com robots.txt file]
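As a sketch of what such a rule looks like (the crawler name BadCrawlerBot is a made-up example), a robots.txt that denies one crawler while leaving everything else open could be:

```text
# Block a single (hypothetical) nuisance crawler from the whole site
User-agent: BadCrawlerBot
Disallow: /

# All other crawlers may access everything
User-agent: *
Disallow:
```

Keep in mind that robots.txt is only a request: well-behaved crawlers honor it, but malicious scripts usually ignore it, which is why server-level IP blocking remains the stronger option.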

6 – Determine a regular crawl pattern from search engines

Depending on how often you update your site, Google might try to save its resources by crawling your site on a specific schedule, say every few weeks or so. By reviewing your log files, you can see whether Googlebot, Bingbot, or other search engine crawlers visit at regular intervals.

From an SEO and user-experience standpoint, you should make sure that when Google returns to your site, it can easily find and detect any new content, so the site gets indexed quickly without slowness or server errors. Determining the search engines’ crawl schedule can help with that, but unless your hotel has a huge amount of content and updates on a near-daily basis, it’s unlikely to be a big issue if you don’t know the schedule.
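Spotting that schedule is just a matter of counting visits per bot per day. A minimal sketch, assuming standard log lines with a [10/Oct/2023:... timestamp and the crawler names in the user-agent string (the function name and bot list are ours):

```python
import re
from collections import Counter

# Pull the day portion (e.g. "10/Oct/2023") out of the bracketed timestamp.
DATE_RE = re.compile(r'\[(\d{2}/\w{3}/\d{4})')
BOTS = ("Googlebot", "bingbot", "Baiduspider", "YandexBot")

def crawl_calendar(log_lines):
    """Count visits per (bot, day) so a crawl schedule becomes visible."""
    calendar = Counter()
    for line in log_lines:
        m = DATE_RE.search(line)
        if not m:
            continue
        for bot in BOTS:
            if bot in line:  # crawler name appears in the user-agent field
                calendar[(bot, m.group(1))] += 1
    return calendar
```

Plot or eyeball the counts per day and recurring spikes (say, every two weeks) reveal the crawler's rhythm.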

7 – Discover the search engines accessing your site

By looking at the User Agents that have visited your website via the server logs, you’ll get a good idea about the search engines that are interested in indexing your content.

These User Agents are usually clearly labelled within the log file software you use, but you can take the additional step to verify that they are indeed accurate. Some scripts can mimic another User Agent, spoofing the traffic so that it appears to come from a different location—often to avoid detection.

[Image: User agents from the server log]

Screaming Frog’s Log Analyser has the option to verify these User Agents when you first import the log file, but we caution you that this check can take a while to complete, especially on sites with high traffic.

You can also use the User Agent report to identify issues such as a complete absence of visits from a given User Agent. If you notice Baidu hasn’t accessed your site at all, then perhaps there’s an issue with serving your content in China. You can investigate by using a VPN to change your location and accessing your site in a browser, or by using a tool to change your browser’s user-agent header.

[Image: Changing the browser user agent]
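Checking for absent crawlers is a simple scan over the log. A sketch (function name and the expected-bot defaults are ours; adjust the list to the engines you care about):

```python
def missing_search_bots(log_lines, expected=("Googlebot", "bingbot", "Baiduspider")):
    """Return the expected crawler names that never appear in the logs.
    A missing one (e.g. Baiduspider) may mean that engine can't reach you."""
    seen = set()
    for line in log_lines:
        for bot in expected:
            if bot in line:  # crawler name appears in the user-agent field
                seen.add(bot)
    return [bot for bot in expected if bot not in seen]
```

An empty result means every crawler you expect has visited; anything returned deserves a closer look.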

Since the Chinese travel market is growing extremely quickly, you want to ensure that Baidu (China’s leading search engine) isn’t avoiding your website for whatever reason.

8 – Keep an eye on referral traffic

Besides Google Analytics and other analytics tools, you can also see referral traffic in your server log.
From an SEO perspective, referral traffic shows which links to your website are providing real value, mainly because these are the links that Googlebot, or another search engine crawler, actually “follows” to find its way onto your website.

When it comes to SEO, you’ll want to obtain as many backlinks as you can from relevant and authoritative sources across the web, especially those with a high readership of engaged viewers. A link that is actually visited by genuine users (and is therefore likely crawled with higher priority by Googlebot) is more valuable than one that doesn’t provide any referral traffic.

By reviewing your logs, you can find the links that provide genuine “human” traffic. You’ll also get insight into how your content marketing campaigns are performing, beyond just link metrics. You can then prioritize the links that generate the most human visits.
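In the Combined Log Format, the referrer is the first quoted field after the response size, so counting human referral sources is a short script. This is a rough sketch (function name is ours) with a crude bot filter that simply skips user agents containing "bot"; real traffic classification needs more care.

```python
import re
from collections import Counter

# ... "GET /x HTTP/1.1" 200 2326 "https://example-blog.com/page" "Mozilla/5.0 ..."
REF_RE = re.compile(r'\d{3} \d+ "(?P<ref>[^"]*)" "(?P<agent>[^"]*)"')

def top_human_referrers(log_lines, top=10):
    """Count referrers for requests that don't look like bot traffic."""
    refs = Counter()
    for line in log_lines:
        m = REF_RE.search(line)
        if not m:
            continue
        ref, agent = m.group("ref"), m.group("agent")
        if ref in ("-", "") or "bot" in agent.lower():
            continue  # skip direct visits and obvious crawlers
        refs[ref] += 1
    return refs.most_common(top)
```

The sites at the top of this list are the backlinks sending you real visitors, and therefore the relationships most worth nurturing.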

9 – Assess Google’s mobile bot technology

Considering the recent buzz around Google’s switch to mobile-first indexing, server logs provide valuable insight into the User Agent information recorded with each visit.

For example, you can find out from the Google mobilebot User Agent the specific browser and device (e.g. Google Nexus 5X with Chrome browser version 41) that Google is using to assess your site’s performance. For technical SEO specialists and web developers, this is vital information—they now know which device/browser they should be using to audit their website ahead of the switch to mobile-first indexing.

[Image: Googlebot mobile crawler technology]

From the same server logs, you can also see that Bing’s mobile bingbot is using an iPhone OS and is currently crawling and rendering sites using a version of Safari (9537.53).

Armed with this knowledge, web developers can stay ahead of the game by assessing the essential mobile SEO elements such as site speed and usability by emulating Google’s visit from a Google Nexus 5X.

Using the above info can also help when it comes to identifying fetch and render issues within Google Search Console, where Google can’t “see” a website exactly as a user does. This is often the case when JavaScript is used heavily on a site. It’s a key technical SEO check to make, especially considering the number of scripts most hotels have running on their websites.

10 – Stay secure by checking for rogue URLs

We recently identified an issue on a hotel’s spa page in which Google had indexed a huge number of “spam” pages that don’t actually exist.

It turned out that the hotel’s WordPress site had been compromised at some point, with a script installed that served content to Googlebot while hiding it from users, presumably to avoid detection. As a result, a number of suspicious websites pointed links at this “non-existent” content, which likely misled Google into crawling and indexing it.

By accessing your server log file, you can detect this kind of “negative SEO” attack, and hopefully fix the issue before it escalates. In this case, the WordPress install and web host needed to be cleaned and reinstalled, with stronger security measures added—another reason to regularly check your server logs!

[Image: Referring domain spam]
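One simple log-based signal of such an attack is search crawlers repeatedly requesting URLs that don't exist on your site, because spam links are pointing at them. A rough sketch of that check, assuming Combined Log Format lines (the function name is ours, and the "bot" substring match is a deliberately crude crawler filter):

```python
import re
from collections import Counter

# Captures the requested URL and the three-digit status code.
LINE_RE = re.compile(r'"(?:GET|POST|HEAD) (?P<url>\S+) HTTP/[\d.]+" (?P<status>\d{3})')

def suspicious_crawled_urls(log_lines, top=20):
    """Non-existent (404) URLs that crawlers keep requesting.
    A spike here can point to spam links or injected content."""
    hits = Counter()
    for line in log_lines:
        m = LINE_RE.search(line)
        if m and m.group("status") == "404" and "bot" in line.lower():
            hits[m.group("url")] += 1
    return hits.most_common(top)
```

If this list fills up with URLs you never published (pharmaceutical spam, gambling keywords, and the like), treat it as a possible compromise and audit your CMS install.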

Hopefully with the above information, you’ll feel more confident tackling your server log files. Although this topic falls under the umbrella of technical SEO, it’s still good to be aware of what can take place undetected on your hotel’s website, so you can share that knowledge with the people responsible for managing, maintaining, and optimizing your site.

 

Matt Tutt

Matt is an SEO Specialist at Travel Tripper with extensive knowledge of optimizing hotel websites for maximum visibility online. He loves helping hotels improve their organic search rankings and grow their direct bookings. You can get in touch with him at matt@traveltripper.com
