If you have been experiencing issues when trying to crawl websites using the Chrome Crawler, please follow the steps below to diagnose and (hopefully) fix the problem.
You can self-diagnose the issue by checking whether Chrome runs correctly in a different part of Sitebulb - the Single Page Analysis tool.
This is located in the top navigation menu - head here and try a URL that you know works fine in the browser:
If it works, you'll see Sitebulb will have collected a bunch of data about the URL. If it does not work, you'll see a message like this:
Try another URL from a different website, and check if you see the same thing. If you get another error, move on to the next section.
If the Single Page Analysis is indeed working, then please contact support and we will figure out how to resolve the situation.
It is amazing how many problems can be solved by a simple restart.
And yes, we know it's annoying to have to shut all your programs down and interrupt your work, but it's the most straightforward of these resolution steps, so please make sure you do it.
Once your computer has restarted, open up Sitebulb and head back to the Single Page Analysis tool and try one of those URLs again.
If it works...huzzah! You've fixed it. Now you can go back and run your audit again.
If it doesn't work, move on to step 3:
If you Google something like 'avg blocking chrome' you'll see how prevalent it is that anti-virus software decides to block a browser you use every single day - and this isn't even the headless version!
Anti-virus software can be both aggressive and inconsistent, particularly when it comes to something like headless Chromium, which certainly CAN be used in malware or adware (even though Sitebulb absolutely does not do anything dodgy).
In order to check this you will need to go into your anti-virus settings and find 'Blocked Apps' (or similar), like this in AVG:
Or you might find it in quarantine:
If it is, then remove any blocks, and add the entire Sitebulb folder as an exception:
This should then look something like this:
This should stop Sitebulb from being targeted by the anti-virus software in future. HOWEVER, in our experience, one of the most common things that anti-virus software does is actually delete the installed Chromium .exe file out of the Sitebulb folder.
So before you proceed, reinstall the latest version of Sitebulb. And do not worry, you will not lose any of your old audits or anything - this is just like applying an update.
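If you want to confirm whether the Chromium executable has gone missing before you reinstall, a short script can look for it. This is a minimal Python sketch, not a Sitebulb feature. The install path and the idea that the file name contains 'chrom' are both assumptions made for illustration, so adjust them to match your own machine.

```python
from pathlib import Path


def find_executables(root, name_fragment="chrom"):
    """Return .exe files under `root` whose file name contains `name_fragment`.

    The match is case-insensitive. An empty list means nothing was found,
    or that `root` is not a folder at all.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    fragment = name_fragment.lower()
    return sorted(p for p in root_path.rglob("*.exe") if fragment in p.name.lower())


if __name__ == "__main__":
    # Hypothetical install location - change this to wherever Sitebulb is installed.
    install_dir = r"C:\Program Files\Sitebulb"
    matches = find_executables(install_dir)
    if matches:
        for path in matches:
            print(f"Found: {path}")
    else:
        print("No Chromium executable found - a reinstall is probably needed.")
```

If the script finds nothing in a folder that definitely contains Sitebulb, that is consistent with the anti-virus having deleted the file, and reinstalling should put it back.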
Once you have reinstalled Sitebulb, open it up and head back to the Single Page Analysis tool and try one of those URLs again.
If it works...huzzah! You've fixed it. Now you can go back and run your audit again.
If it doesn't work, move on to step 4:
Similar to anti-virus, your firewall could be blocking Sitebulb from making outgoing connections (which it needs, in order to crawl websites).
Check that Sitebulb is in your 'allowed' list:
If not, add it as an allowed app. Also check that ports 10401 and 10402 are allowed - as Sitebulb needs these ports to communicate.
Once you have adjusted your firewall settings, open up Sitebulb and head back to the Single Page Analysis tool and try one of those URLs again.
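For a rough check from outside the firewall settings screen, you can test whether a TCP connection to a given port succeeds. This is a minimal Python sketch under stated assumptions: it assumes Sitebulb is running and accepting connections on 10401 and 10402 on the local machine (127.0.0.1), which this article does not confirm, so treat a failed result as a hint rather than proof.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from the step above; the host is an assumption.
    for port in (10401, 10402):
        state = "reachable" if port_is_open("127.0.0.1", port) else "not reachable"
        print(f"127.0.0.1:{port} is {state}")
```

The same function can give a crude test of outgoing connections in general, for example `port_is_open("example.com", 443)`, although that tests the Python process rather than Sitebulb itself, and a firewall may treat the two differently.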
If it works...huzzah! You've fixed it. Now you can go back and run your audit again.
If it doesn't work, move on to step 5:
If you've tried everything listed above, but Sitebulb STILL will not crawl properly, it is probably something we have never seen before. In which case we'll need to work with you to get to the bottom of the issue (which we will!).
Please email [email protected] and provide the following information:
We'll look into it and figure out what we need to do to make it work!
This next section is purely informational, but might help you understand a bit better what is going on.
Sitebulb's Chrome Crawler uses the latest stable version of Chromium, in headless mode, which allows it to closely mimic the way that Google renders web pages.
However, using Chrome in this way can occasionally raise red flags with anti-virus software or firewalls, which mistakenly class Sitebulb as some sort of trojan or adware. Or, more specifically, it is the headless Chromium that they think is nefarious, and they take steps to block it, like this:
A totally generic 'threat' - they don't know what it is but they suggest blocking it anyway. Le sigh.
It's also not always as clear cut and obvious as this, as often your anti-virus will take steps in the background to 'protect' you, so we must remain vigilant!
It is quite easy to spot when Chrome does not work properly. Quite simply - you won't be able to crawl with Chrome properly! You might see audits that look like this:
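For context, 'headless' simply means the browser runs without a visible window and is driven by another program. The Python sketch below shows the general idea using Chrome's own `--headless` and `--dump-dom` command-line switches. It is an illustration only: how Sitebulb actually launches Chromium is not described here, and `chrome_path` is a placeholder you would need to point at a Chrome or Chromium executable on your machine.

```python
import subprocess


def headless_dump_command(chrome_path, url):
    """Build a command that renders `url` headlessly and prints the resulting DOM."""
    return [chrome_path, "--headless", "--disable-gpu", "--dump-dom", url]


def render_dom(chrome_path, url, timeout=30):
    """Run headless Chrome and return the rendered HTML as a string.

    Raises subprocess.CalledProcessError if Chrome exits with an error,
    FileNotFoundError if `chrome_path` does not exist, and
    subprocess.TimeoutExpired if rendering takes longer than `timeout` seconds.
    """
    result = subprocess.run(
        headless_dump_command(chrome_path, url),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout
```

A program that silently starts a browser and loads pages in the background is doing roughly what some adware does, which helps explain why security software can be suspicious of it.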
Not a great start to an audit!
Since v5.6 we have actually added checks and warning messages to various points in the auditing process:
Sitebulb will check that Chrome is running ok when you go to start a new Project, and if it isn't you will see this message:
Chromium might be installed ok, but then get blocked at the point at which Sitebulb tries to do something with it - for instance during the pre-audit. If this is the case, you would see a fugly error message on the Audit setup page:
You might be able to get this far and still progress to actually running an audit, and then experience the failure, in which case you'll see a message like this: