Robots Exclusion Checker
Overview
Robots Exclusion Checker is designed to visually indicate whether any robots exclusions are preventing your page from being crawled or indexed by Search Engines.
## The extension reports on 5 elements:
1. Robots.txt
2. Meta Robots tag
3. X-robots-tag
4. Rel=Canonical
5. UGC, Sponsored and Nofollow attribute values
- Robots.txt
If a URL you are visiting is affected by an "Allow" or "Disallow" rule within robots.txt, the extension will show you the specific rule, making it easy to copy or visit the live robots.txt. You will also be shown the full robots.txt with the specific rule highlighted (if applicable). Cool, eh!
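The rule-picking behaviour above can be sketched in a few lines. This is a minimal illustration of how a checker might select the winning robots.txt rule for a URL path under the longest-match behaviour defined in RFC 9309 (the most specific rule wins; on a tie, Allow beats Disallow). Wildcards and `$` anchors are omitted, and all names are illustrative, not the extension's actual code:

```javascript
// rules: array of { type: "allow" | "disallow", path: "/some/prefix" }
// Returns the winning rule for a URL path, or null if no rule applies
// (crawling is allowed by default when nothing matches).
function winningRule(rules, urlPath) {
  let best = null;
  for (const rule of rules) {
    if (!urlPath.startsWith(rule.path)) continue; // simple prefix match only
    if (
      best === null ||
      rule.path.length > best.path.length || // longer match is more specific
      (rule.path.length === best.path.length && rule.type === "allow") // tie: allow wins
    ) {
      best = rule;
    }
  }
  return best;
}
```

For example, with `Disallow: /shop` and `Allow: /shop/sale`, the path `/shop/sale/item` is allowed because the Allow rule is the longer match.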
- Meta Robots Tag
Any Robots Meta tags that direct robots to "index", "noindex", "follow" or "nofollow" will trigger the appropriate Red, Amber or Green icon. Directives that won't affect Search Engine indexation, such as "nosnippet" or "noodp", will be shown but won't be factored into the alerts. The extension makes it easy to view all directives, and shows in full any HTML meta robots tags that appear in the source code.
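The directive split described above can be sketched as follows. This is an illustrative parser for a meta robots `content` value, assuming the indexation-affecting directive names documented by Google; the function and set names are not the extension's actual code:

```javascript
// Directives that affect whether/how a page is indexed or its links followed.
const INDEXATION_DIRECTIVES = new Set(["index", "noindex", "follow", "nofollow", "none"]);

// Parse a comma-separated meta robots content value, e.g. "noindex, nosnippet".
function parseMetaRobots(content) {
  const directives = content
    .split(",")
    .map((d) => d.trim().toLowerCase())
    .filter((d) => d.length > 0);
  return {
    directives, // everything found, including e.g. "nosnippet"
    affectsIndexation: directives.filter((d) => INDEXATION_DIRECTIVES.has(d)),
    blocked: directives.includes("noindex") || directives.includes("none"), // Red-icon case
  };
}
```

So `"noindex, nosnippet"` would flag as blocked, while `"nosnippet"` alone would be shown but not factored into the alert.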
- X-robots-tag
Spotting robots directives in the HTTP header has been a bit of a pain in the past, but no longer with this extension. Any specific exclusions are made very visible, as is the full HTTP header, with the specific exclusions highlighted too!
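Extracting those header directives can be sketched like this. The header may appear multiple times in a response and may carry an optional user-agent prefix (e.g. `googlebot: noindex`); the function name is illustrative:

```javascript
// Pull every X-Robots-Tag value out of a raw HTTP response header block.
function extractXRobotsTag(rawHeaders) {
  return rawHeaders
    .split(/\r?\n/)
    .filter((line) => /^x-robots-tag:/i.test(line.trim())) // header names are case-insensitive
    .map((line) => line.trim().replace(/^x-robots-tag:\s*/i, ""));
}
```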
- Canonical Tags
Although the canonical tag doesn't directly impact indexation, it can still affect how your URLs behave within SERPs (Search Engine Results Pages). If the page you are viewing is allowed to bots but a canonical mismatch has been detected (the current URL is different from the canonical URL), the extension will flag an Amber icon. Canonical information is collected on every page from both the HTML <head> and the HTTP header response.
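A mismatch check of this kind can be sketched as below. This is an assumption about how such a comparison might work, not the extension's code: the fragment (`#hash`) is dropped before comparing, since it never reaches the server (the changelog notes a fix for exactly this in 1.0.6), and the canonical href is resolved against the current URL in case it is relative:

```javascript
// Returns true when the canonical URL differs from the current URL,
// i.e. the case where an Amber icon would be flagged.
function canonicalMismatch(currentUrl, canonicalHref) {
  const normalize = (u) => {
    const parsed = new URL(u);
    parsed.hash = ""; // the fragment is irrelevant to indexation
    return parsed.href;
  };
  const current = normalize(currentUrl);
  // Resolve a possibly-relative canonical href against the current URL.
  const canonical = normalize(new URL(canonicalHref, currentUrl).href);
  return current !== canonical;
}
```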
- UGC, Sponsored and Nofollow
A newer addition to the extension gives you the option to highlight any visible links that use a "nofollow", "ugc" or "sponsored" rel attribute value. You can control which links are highlighted and set your preferred colour for each. If you'd prefer this disabled, you can switch it off entirely.
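Classifying a link into those highlight categories can be sketched as follows. `rel` is a space-separated token list, so a single link can fall into several categories at once; the function name is illustrative:

```javascript
// Return which highlight categories a link's rel attribute value matches.
function linkHighlightCategories(relValue) {
  const tokens = new Set(
    (relValue || "").toLowerCase().split(/\s+/).filter(Boolean)
  );
  return ["nofollow", "ugc", "sponsored"].filter((t) => tokens.has(t));
}
```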
## User-agents
Within settings, you can choose one of the following user-agents to simulate what each Search Engine has access to:
1. Googlebot
2. Googlebot News
3. Bing
4. Yahoo
## Benefits
This tool will be useful for anyone working in Search Engine Optimisation (SEO) or digital marketing, as it gives a clear visual indication if the page is being blocked by robots.txt (many existing extensions don't flag this). Crawl or indexation issues have a direct bearing on how well your website performs in organic results, so this extension should be part of your SEO toolkit for Google Chrome, and an alternative to some of the common robots.txt testers available online.
This extension is useful for:
- Faceted navigation review and optimisation (useful to see the robot control behind complex / stacked facets)
- Detecting crawl or indexation issues
- General SEO review and auditing within your browser
## Avoid the need for multiple SEO Extensions
Within the realm of robots and indexation, there is no better extension available. In fact, by installing Robots Exclusion Checker you will avoid having to run multiple extensions within Chrome that will slow down its functionality.
Similar plugins include:
NoFollow
https://chrome.google.com/webstore/detail/nofollow/dfogidghaigoomjdeacndafapdijmiid
Seerobots
https://chrome.google.com/webstore/detail/seerobots/hnljoiodjfgpnddiekagpbblnjedcnfp
NoIndex,NoFollow Meta Tag Checker
https://chrome.google.com/webstore/detail/noindexnofollow-meta-tag/aijcgkcgldkomeddnlpbhdelcpfamklm
CHANGELOG:
1.0.2: Fixed a bug preventing meta robots from updating after a URL update.
1.0.3: Various bug fixes, including better handling of URLs with encoded characters. Robots.txt expansion feature to allow the viewing of extra-long rules. Now JavaScript history.pushState() compatible.
1.0.4: Various upgrades. Canonical tag detection added (HTML and HTTP Header) with Amber icon alerts. Robots.txt is now shown in full, with the appropriate rule highlighted. X-robots-tag now highlighted within full HTTP header information. Various UX improvements, such as "Copy to Clipboard” and “View Source” links. Social share icons added.
1.0.5: Forces a background HTTP header call when the extension detects a URL change but no new HTTP header info - mainly for sites heavily dependent on JavaScript.
1.0.6: Fixed an issue with the hash part of the URL when doing a canonical check.
1.0.7: Forces a background body response call in addition to HTTP headers, to ensure a non-cached view of the URL for JavaScript heavy sites.
1.0.8: Fixed an error that occurred when multiple references to the same user-agent were detected within robots.txt file.
1.0.9: Fixed an issue with the canonical mismatch alert.
1.1.0: Various UI updates, including a JavaScript alert when the extension detects a URL change with no new HTTP request.
1.1.1: Added additional logic for handling Meta robots user-agent rule conflicts.
1.1.2: Added a German language UI.
1.1.3: Added UGC, Sponsored and Nofollow link highlighting.
1.1.4: Switched off nofollow link highlighting by default on new installs and fixed a bug related to HTTP header canonical mismatches.
1.1.5: Bug fixes to improve robots.txt parser.
1.1.6: Extension now flags 404 errors in Red.
1.1.7: No longer sending cookies when making a background request to fetch a page that was navigated to with pushState.
1.1.8: Improvements to the handling of relative vs absolute canonical URLs and unencoded URL messaging.
1.2.0.11: Updating to Google's new manifest V3 and fixing small bugs.
1.2.0.12: Added a Spanish language version and made improvements to existing translations. Linking to new website https://www.checkrobots.com
1.2.0.13: Fixed pushState navigation data extraction, resolved inconsistent icon display, and added security protections to prevent logout issues with enterprise websites.
Found a bug or want to make a suggestion? Please email extensions @ samgipson.com
Security Analysis — Robots Exclusion Checker
What This Extension Does
The Robots Exclusion Checker extension helps SEO professionals and developers identify issues that may prevent web pages from being crawled or indexed by search engines. It analyzes robots.txt, meta tags, HTTP headers, canonical links, and link attributes to highlight potential problems visually. This tool is designed for users who want a clear overview of how search engine crawlers might perceive their site.
Permissions Explained
- storage (expected): Allows the extension to save user settings like selected user-agents and preferences between sessions.
Technical: Uses Chrome's chrome.storage API, which stores data locally in the browser. If compromised, this could allow attackers to modify or read saved configuration values such as custom user-agent choices or highlight settings.
- https://www.checkrobots.com/* (check this): Grants access to a third-party domain used for checking robots exclusion rules and possibly sending data about the current page.
Technical: This permission allows network requests to checkrobots.com, potentially including page content or URL information. If exploited, it could enable tracking of browsing behavior or exfiltration of sensitive metadata from pages visited by users.
Your Data
The extension sends data to external domains like checkrobots.com and other services for analysis. It may also store user preferences locally on the device.
Technical Details
Network activity includes requests to www.checkrobots.com (HTTP/HTTPS), w3.org, fonts.googleapis.com, vuejs.org, google.com, twitter.com, facebook.com, linkedin.com, ko-fi.com. Data sent likely includes page URLs and possibly HTML content or headers for analysis. No explicit encryption details provided; data is transmitted over standard HTTP(S). Local storage may contain user settings such as selected user-agents.
Code Findings
The extension creates new JavaScript scripts dynamically, which can be a security concern if not handled carefully. If an attacker controls part of the input used to generate these scripts, they might inject malicious code.
Technical: Code uses document.createElement('script') and sets .src or .text, potentially allowing injection of untrusted content into script execution contexts. This is especially risky if inputs from web pages are not sanitized before use in dynamic script generation.
💡 Commonly used by extensions that fetch remote resources or inject code for analysis purposes, such as SEO tools checking external data sources.
The extension uses character manipulation functions to hide parts of its code. While not inherently malicious, this technique is often used in malware or adware to evade detection.
Technical: Code contains patterns like String.fromCharCode(...) and .charCodeAt() that are typically used for encoding strings (e.g., hiding URLs or API keys). If these functions are misused to obfuscate harmful behavior, it could make analysis harder.
💡 Used in legitimate extensions to encode internal configuration data or prevent easy reverse-engineering of logic without malicious intent.
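As an illustration of the pattern the report describes (not code taken from the extension), this is what character-code encoding looks like in practice and why it defeats a casual grep of the source. The char codes below spell out an ordinary hostname; the same pattern can hide any string:

```javascript
// The literal "checkrobots" never appears in this source text,
// yet it is reconstructed at runtime from its character codes.
const encoded = [99, 104, 101, 99, 107, 114, 111, 98, 111, 116, 115];
const decoded = String.fromCharCode(...encoded); // "checkrobots"
```

This is why static scanners flag `String.fromCharCode`/`.charCodeAt()` patterns: the behaviour is only visible once the code runs.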
The extension injects HTML content into the page using innerHTML. If not properly sanitized, this could allow attackers to insert harmful scripts if they control part of the injected data.
Technical: Code uses element.innerHTML = ... which can introduce XSS vulnerabilities if values come from untrusted sources (e.g., user input or external APIs). In particular, when injecting content based on server responses or page metadata, this presents a risk unless strict sanitization is applied.
💡 Common in extensions that display structured data like robots.txt rules or meta tags directly within the browser UI.
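The mitigation the report implies can be sketched as a simple escaping step, applied before any untrusted text is concatenated into HTML (or, better, by using `textContent`, which never parses markup at all). The function name and the usage line are illustrative, not the extension's actual code:

```javascript
// Escape the characters that let text break out into markup.
function escapeHtml(untrusted) {
  return untrusted
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical usage when rendering a robots.txt rule into the UI:
// element.innerHTML = `<code>${escapeHtml(rule)}</code>`;
```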
The extension communicates with other origins using postMessage, which is normal for extensions but can be misused if not carefully managed to avoid leaking sensitive data.
Technical: Uses window.postMessage() for inter-frame or cross-origin communication. If messages are sent without proper origin checks or validation of message content, it could expose internal state or allow unauthorized access to extension functionality.
💡 Standard practice in extensions that interact with web pages or external services where secure messaging is required.
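The origin check the report calls for can be sketched like this: only act on messages from origins you expect, and validate the payload shape before touching it. The allow-listed origin and message shape below are assumptions for illustration, based on the domains listed in this report:

```javascript
// Hypothetical allow-list of origins this extension trusts.
const TRUSTED_ORIGINS = new Set(["https://www.checkrobots.com"]);

// Accept only messages from a trusted origin with the expected shape.
function isTrustedMessage(origin, data) {
  return (
    TRUSTED_ORIGINS.has(origin) &&
    typeof data === "object" &&
    data !== null &&
    typeof data.type === "string"
  );
}

// Hypothetical listener wiring:
// window.addEventListener("message", (e) => {
//   if (!isTrustedMessage(e.origin, e.data)) return; // drop everything else
//   handleMessage(e.data);
// });
```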
The extension listens for changes to its own stored data, which may be used to detect when settings are modified or accessed by other scripts.
Technical: Uses chrome.storage.onChanged.addListener() to monitor updates in local storage. While useful for syncing preferences across tabs, it could also serve as a covert channel if misused to track user behavior or detect unauthorized access attempts.
💡 Standard functionality in extensions that need real-time updates of settings or state changes.
The extension does not define a Content Security Policy (CSP), which could leave it vulnerable to certain types of attacks like XSS if the code is injected into insecure contexts.
Technical: No CSP header or meta tag found in manifest or injected scripts. This increases risk from third-party script injection, especially since the extension injects content and makes XHR/fetch requests.
💡 Some extensions do not include CSP due to complexity of managing it across multiple environments; however, this is a known weakness if not mitigated by other protections.
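For reference, Manifest V3 lets an extension declare a CSP for its own pages via the `content_security_policy` manifest key; a minimal fragment might look like the sketch below. Note that since this extension's changelog mentions a move to Manifest V3, a default policy of `script-src 'self'` already applies to extension pages even without an explicit entry, so the absence of one is less severe than it would be under Manifest V2:

```json
{
  "content_security_policy": {
    "extension_pages": "script-src 'self'; object-src 'self'"
  }
}
```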
The extension fetches information from various domains to analyze robots exclusions and related metadata. This is expected behavior but requires careful handling of responses.
Technical: Uses fetch() or XMLHttpRequest (XHR) APIs to request data from checkrobots.com, w3.org, etc., likely for validating robots.txt rules or checking canonical tags. If response parsing isn't robust, it could lead to unexpected behaviors or errors in analysis.
💡 Standard practice in SEO tools that rely on external validation services or standards-based APIs like W3C's robot protocol documentation.
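The defensive response handling the report recommends can be sketched as a pre-parse check: confirm the status and content type of a fetched robots.txt before treating the body as rules, so an error page or HTML placeholder is never parsed as a policy. The function name is illustrative:

```javascript
// Decide whether a fetched robots.txt response is safe to parse as rules.
function isUsableRobotsResponse(status, contentType) {
  if (status !== 200) return false; // 4xx/5xx bodies are not a policy
  // robots.txt is served as text/plain; an HTML body is likely an error page.
  return /^text\/plain\b/i.test((contentType || "").trim());
}
```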
The Robots Exclusion Checker extension provides useful functionality for analyzing how search engines crawl and index web pages. However, it has several concerning behaviors including dynamic script creation, obfuscation techniques, and reliance on a third-party domain that may collect user data. While the core purpose aligns with its stated goals, users should be cautious about granting access to checkrobots.com due to potential privacy implications. For those who strictly require this tool for SEO work, consider reviewing settings regularly and monitoring network activity.