How to Fix JavaScript Errors Caused by Googlebot?

I’ve recently been working on logging JavaScript errors on my production server, and I encountered an interesting challenge: errors originating from bots, like Googlebot, were cluttering my error logs. I want to share my journey from my original implementation to an enhanced solution that filters out these non-critical errors.

Original Code and Its Explanation

The Code I Started With

Here’s the code I originally implemented to catch and log JavaScript errors:

<script type="text/javascript">
window.onerror = function(msg, url, line) {
    // Build an XHR object, falling back to ActiveX for very old Internet Explorer
    var xmlhttp;
    if (window.XMLHttpRequest) {
        xmlhttp = new XMLHttpRequest();
    } else {
        xmlhttp = new ActiveXObject('Microsoft.XMLHTTP');
    }
    // Post the error details to the logging endpoint
    xmlhttp.open('POST', '/logJSerrorsHere', true);
    xmlhttp.setRequestHeader('Content-type', 'application/x-www-form-urlencoded');
    xmlhttp.send('msg=' + encodeURIComponent(msg) + '&url=' + encodeURIComponent(url) + '&line=' + line);
    return true;
};
</script>

How It Works

  • Global Error Handler:
    I assign a function to window.onerror, which is triggered whenever an uncaught JavaScript error occurs.
  • Error Details:
    The function receives three parameters:
    • msg: The error message.
    • url: The URL of the script where the error occurred.
    • line: The line number in the script where the error was thrown.
  • Creating an AJAX Request:
    The code checks if the browser supports XMLHttpRequest (modern browsers) or falls back to ActiveXObject for older versions of Internet Explorer.
  • Logging the Error:
    An asynchronous POST request is sent to /logJSerrorsHere with the error details. The error data is URL-encoded and passed as parameters.
  • Returning true:
    Returning true prevents the browser from further handling the error, which is useful to avoid additional error messages in the console.
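
A quick way to confirm the handler is wired up is to throw a deliberate error once the page has loaded and watch for the POST to /logJSerrorsHere in the browser's network tab. A minimal test sketch (the undefined function call is intentional and should be removed after testing):

<script type="text/javascript">
// Trigger an uncaught error on purpose to exercise window.onerror
window.addEventListener('load', function () {
    setTimeout(function () {
        someFunctionThatDoesNotExist(); // ReferenceError -> window.onerror -> POST to /logJSerrorsHere
    }, 0);
});
</script>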

Key Enhancements in the Code (a consolidated sketch follows this list):

  1. Bot Filtering:
// Detect common crawlers and CLI tools
const isBot = /googlebot|bingbot|.../i.test(userAgent);
  2. False Positive Handling:
// Ignore browser extension errors
const isExtensionError = url.startsWith('chrome-extension://');

// Ignore jQuery errors if not actually used
const isjQueryError = msg.includes('$...') && !jQueryInPage;
  3. Rate Limiting:
// Prevent error storms using sessionStorage
if (lastError && now - lastError < 60000) return;
  4. Enhanced Context:
// Track JS environment state
jsLoaded: {
    jQuery: typeof jQuery !== 'undefined',
    documentReady: document.readyState
}
  5. Reliable Transport:
// Prefer the Beacon API for better reliability
navigator.sendBeacon('/logJSerrorsHere', blob);
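
For completeness, here is a minimal sketch of how those five pieces could fit together in a single handler. It posts to the same /logJSerrorsHere endpoint as before; the exact bot regex, the one-minute rate-limit window, and the sessionStorage key name are illustrative choices rather than my exact production values:

<script type="text/javascript">
window.onerror = function (msg, url, line) {
    var ua = navigator.userAgent || '';

    // 1. Bot filtering: skip crawlers and CLI tools entirely
    if (/googlebot|bingbot|spiderbot|crawler|curl|wget/i.test(ua)) {
        return true;
    }

    // 2. False-positive handling: browser extension scripts, and jQuery errors
    //    on pages that never load jQuery in the first place
    var isExtensionError = typeof url === 'string' && url.indexOf('chrome-extension://') === 0;
    var isjQueryError = typeof msg === 'string' && msg.indexOf('$ is not defined') !== -1 &&
        typeof jQuery === 'undefined';
    if (isExtensionError || isjQueryError) {
        return true;
    }

    // 3. Rate limiting: at most one report per minute per tab, tracked in sessionStorage
    var now = Date.now();
    var lastError = parseInt(sessionStorage.getItem('lastJsErrorAt') || '0', 10);
    if (lastError && now - lastError < 60000) {
        return true;
    }
    sessionStorage.setItem('lastJsErrorAt', String(now));

    // 4. Enhanced context: capture a little environment state alongside the error
    var context = {
        jQuery: typeof jQuery !== 'undefined',
        documentReady: document.readyState
    };

    // 5. Reliable transport: prefer the Beacon API, fall back to XMLHttpRequest
    var body = 'msg=' + encodeURIComponent(msg) +
        '&url=' + encodeURIComponent(url) +
        '&line=' + line +
        '&jsLoaded=' + encodeURIComponent(JSON.stringify(context));
    if (navigator.sendBeacon) {
        navigator.sendBeacon('/logJSerrorsHere',
            new Blob([body], { type: 'application/x-www-form-urlencoded' }));
    } else {
        var xmlhttp = new XMLHttpRequest();
        xmlhttp.open('POST', '/logJSerrorsHere', true);
        xmlhttp.setRequestHeader('Content-type', 'application/x-www-form-urlencoded');
        xmlhttp.send(body);
    }
    return true;
};
</script>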

Recommended Actions:

  1. Server-Side Filtering:
# Example Python middleware (Django-style view; adjust for your framework)
import re
from django.http import HttpResponse

BOT_AGENTS = re.compile(r'Googlebot|Bingbot', re.I)

def log_error(request):
    if BOT_AGENTS.search(request.headers.get('User-Agent', '')):
        return HttpResponse(status=204)  # No content
    # Log the real error here (write to your error store)
  2. Error Dashboard (see the alert-threshold sketch after this list):
    • Separate bot errors from user errors
    • Track frequency and unique occurrences
    • Alert only on user-facing errors with >0.5% occurrence
  3. SEO Health Check:
# Use Google Search Console's URL Inspection tool, or spot-check how the page responds to Googlebot's user agent:
curl -H "User-Agent: Googlebot" https://your-site.com/important-page
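
The 0.5% threshold in the dashboard point is simply the share of page views that hit a given user-facing error, so the alert decision itself is a one-line calculation. A small illustrative sketch (the counts would come from whatever store the dashboard reads; the function name is my own):

// Decide whether a user-facing error should trigger an alert.
// errorCount and pageViews would come from the error store behind the dashboard.
function shouldAlert(errorCount, pageViews, isBotError) {
    if (isBotError || pageViews === 0) {
        return false; // bot errors never alert; also avoids dividing by zero
    }
    return (errorCount / pageViews) > 0.005; // alert only above 0.5% occurrence
}

console.log(shouldAlert(42, 10000, false)); // 0.42% -> false
console.log(shouldAlert(80, 10000, false)); // 0.80% -> true
console.log(shouldAlert(500, 10000, true)); // bot error -> false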

Extended Code with Additional Functionality

After a while, I noticed that errors like “$ is not defined” were being logged for requests from Googlebot (a well-known crawler) and other spider bots. Since bots don’t execute JavaScript the way real users’ browsers do, these errors are rarely relevant in a production environment, so I decided to refine my approach and filter out these non-critical errors.

The Enhanced Code

Below is the enhanced version of my code that includes checks to ignore errors from bot user agents and specific error messages:

<script type="text/javascript">
window.onerror = function(msg, url, line) {
    // Check if the user agent is Googlebot or another spider bot
    if (navigator.userAgent && /googlebot|spiderbot|crawler/i.test(navigator.userAgent)) {
        console.log("Ignoring error from bot: " + navigator.userAgent);
        return true; // Skip logging errors from bots
    }

    // Optionally, ignore specific error messages (e.g., "$ is not defined")
    if (msg && msg.indexOf("$ is not defined") !== -1) {
        console.log("Ignoring '$ is not defined' error: " + msg);
        return true;
    }

    // Create an AJAX request to log errors
    var xmlhttp;
    if (window.XMLHttpRequest) {
        xmlhttp = new XMLHttpRequest();
    } else {
        xmlhttp = new ActiveXObject('Microsoft.XMLHTTP');
    }
    xmlhttp.open('POST', '/logJSerrorsHere', true);
    xmlhttp.setRequestHeader('Content-type', 'application/x-www-form-urlencoded');
    xmlhttp.send('msg=' + encodeURIComponent(msg) + '&url=' + encodeURIComponent(url) + '&line=' + line);
    return true;
};
</script>

What’s New in the Extended Code

Bot Detection

I added a check using navigator.userAgent to test if the user agent string contains “googlebot”, “spiderbot”, or “crawler”. If it does, I log a message to the console for debugging and then ignore the error. This ensures that errors generated by bots do not clutter my logs.
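
For reference, Googlebot announces itself with a user-agent string that contains the token “Googlebot”, so the case-insensitive pattern above matches it directly. A quick sanity check you can run in the browser console:

// Sanity-check the bot pattern against Googlebot's well-known user-agent string
var botPattern = /googlebot|spiderbot|crawler/i;
var googlebotUA = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
console.log(botPattern.test(googlebotUA));         // true
console.log(botPattern.test(navigator.userAgent)); // false in a normal desktop browser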

Specific Error Filtering

In addition to detecting bots, I included a condition to ignore errors that specifically mention “$ is not defined”. This error is common when a bot attempts to run code that assumes a library like jQuery is available. Since these errors do not affect real users, filtering them out helps me keep my error logs clean and focused on issues that really matter.
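
One caveat with this filter: if the page is supposed to load jQuery and the script fails to download, “$ is not defined” is a genuine regression rather than bot noise. A small helper (illustrative; pageShipsJQuery is my own naming) can tell the two cases apart by checking whether a jQuery script tag is present at all; the “$ is not defined” branch would then only return early when it reports false:

// Does the page include a jQuery script tag at all?
// If it does, "$ is not defined" means jQuery failed to load and is worth logging.
function pageShipsJQuery() {
    var scripts = document.getElementsByTagName('script');
    for (var i = 0; i < scripts.length; i++) {
        if (/jquery/i.test(scripts[i].src || '')) {
            return true;
        }
    }
    return false;
}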

Preserving the Original Functionality

For all other errors that are not filtered out by these new conditions, the code continues to create an AJAX request and send the error details to the server, just as before.

Should I Deal With Googlebot Errors?

After some thought and testing, I realized that errors originating from bots like Googlebot generally aren’t a concern. Bots might not execute JavaScript in the same way as browsers do, or they may load pages in environments where certain scripts aren’t available. By filtering out these errors, I can focus on genuine issues that impact real users. In my production system, I decided it was more beneficial to log only those errors that have a direct effect on user experience.

Final Verdict:

I am pleased with the progress I’ve made in handling JavaScript errors more effectively on my client project. Through this work, I learned that not every error is critical and that understanding your user base, including distinguishing between real users and bots, is essential for effective error logging. Filtering out non-essential errors allows me to prioritize debugging efforts and maintain a clear, actionable error log.
