Denis Potschien June 9th, 2017

JavaScript and Search Engines: What You Should Keep in Mind

Just a few years ago, JavaScript was highly controversial. Annoying ad popups were the reason the language was often blocked by default. Today, it's hard to imagine modern web design without JavaScript. It is especially important on the mobile web - as a means of playing media, but also for geolocation and navigation. But how well do JavaScript and search engines get along? What should you keep in mind?

Search Engines Have Learned

First things first: the days when search engines couldn't handle JavaScript are over. While content loaded via JavaScript used to be invisible to search engines, they have learned a lot since then. And when talking about search engines, I mostly mean Google; the search giant is still the measure of all things. To Google, JavaScript is no longer a hazard by default. However, that doesn't mean you can use JavaScript without any hesitation. As Google keeps the inner workings of its search algorithm secret, the following tips should be taken with a grain of salt.

Loading Content Via Load Events Instead of User Events

Oftentimes, events are used to alter a website's content via JavaScript. Search engines, however, usually only consider content loaded via the load event, which the browser fires as soon as a page's DOM tree has been loaded. Search engines like Google execute load events during crawling, so a page's content is typically indexed only after the load handlers have run. User events, on the other hand, are not triggered. Thus, any changes triggered by click or touch events, for instance, will not be considered during indexing.
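To illustrate the difference, here is a minimal sketch; the endpoints and element IDs are assumptions made up for this example:

```javascript
// Content loaded on the load event is usually picked up by crawlers,
// because Google fires load events while rendering the page.
window.addEventListener('load', function () {
  fetch('/api/teaser.html') // hypothetical endpoint
    .then(function (response) { return response.text(); })
    .then(function (html) {
      document.querySelector('#teaser').innerHTML = html; // assumed element
    });
});

// Content behind a user event is NOT triggered during crawling,
// so anything loaded here stays invisible to the index.
document.querySelector('#more-button').addEventListener('click', function () {
  fetch('/api/more.html') // hypothetical endpoint
    .then(function (response) { return response.text(); })
    .then(function (html) {
      document.querySelector('#more').innerHTML = html; // assumed element
    });
});
```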

Push-States and URLs

In order for Google to index a page, it always has to be accessible via a URL. Content that only appears after a click by an individual user therefore can't be indexed either. Thanks to the History API's "pushState()" method, it is possible to change a website's URL from JavaScript. This allows you to implement a website's entire navigation in JavaScript: "pushState()" alters the URL displayed in the browser, while the content is loaded and replaced via JavaScript at the same time.

As Google can't index URLs that exist exclusively through the push state API, every URL created with "pushState()" also needs a "real", server-side counterpart. By the way, this is not only relevant for search engines, but for social networks as well: you can only share pages that have a "real" URL. Facebook and Twitter also need to extract content from a page, which only works if such a URL exists. It's important to always use "pushState()" with correct URLs whose matching content is then actually loaded via JavaScript. A wrong push state URL that doesn't load any new content can lead to duplicate content, and search engines don't like that either.
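A minimal sketch of what such a navigation could look like; the "data-ajax" attribute and the markup are assumptions for this example. The important part is that every "href" points to a URL the server can also answer on its own:

```javascript
// JavaScript navigation that keeps "real" URLs: each href (e.g. "/about")
// must also exist on the server and return the full page, so crawlers
// and shared links get the same content.
document.querySelectorAll('a[data-ajax]').forEach(function (link) {
  link.addEventListener('click', function (event) {
    event.preventDefault();
    var url = link.getAttribute('href'); // a real, server-side URL

    fetch(url)
      .then(function (response) { return response.text(); })
      .then(function (html) {
        document.querySelector('main').innerHTML = html; // swap the content ...
        history.pushState({ url: url }, '', url);        // ... and update the URL
      });
  });
});

// Restore the right content when the user navigates back or forward.
window.addEventListener('popstate', function (event) {
  if (event.state && event.state.url) {
    location.reload(); // simplest fallback: let the server deliver the page
  }
});
```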

Don't Exclude JavaScript

It may be self-evident, but it should still be mentioned: make sure your JavaScript files are not blocked for search engines. If your "robots.txt" generally prohibits access to JavaScript files, search engines have no way of executing them. Because JavaScript itself doesn't contain indexable content, it is often deliberately hidden from search engines, with the result that any content rendered by it stays invisible as well.
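For illustration, here is a made-up "robots.txt" that causes exactly this problem, followed by a variant that avoids it (the directory names are assumptions):

```
# Don't do this - it hides your scripts from crawlers:
User-agent: *
Disallow: /js/

# Better: let crawlers fetch the scripts (and stylesheets) needed for rendering
User-agent: *
Allow: /js/
Allow: /css/
```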

Progressive Enhancement

Regardless of all the options that Google and other search engines provide for crawling content created via JavaScript, the safest method is still so-called "Progressive Enhancement". This principle takes the approach that content has to be prepared in a way that makes it available regardless of browser or crawler. Concretely, this means that texts, images, and other content that is supposed to be found by a search engine should work without JavaScript, if possible. However, this often means significant additional effort for the web developer, since the website basically has to provide all content even without JavaScript. Depending on the type and preparation of the content, a compromise where only the important content is available without JavaScript can be a viable option. Here, you have to judge which effort is reasonable for your project.
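A small sketch of the principle, assuming a server-rendered list with the ID "articles": the complete content is already in the HTML, and JavaScript merely adds a convenience on top.

```javascript
// Progressive enhancement: crawlers and no-JS visitors see the full,
// server-rendered list; JavaScript only adds a live filter field.
var list = document.querySelector('#articles'); // assumed <ul id="articles">

if (list) {
  var input = document.createElement('input');
  input.placeholder = 'Filter articles ...';
  list.parentNode.insertBefore(input, list);

  input.addEventListener('input', function () {
    var term = input.value.toLowerCase();
    list.querySelectorAll('li').forEach(function (item) {
      // Hide entries that don't match; without JavaScript, the full list remains.
      item.hidden = term !== '' && item.textContent.toLowerCase().indexOf(term) === -1;
    });
  });
}
```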

Correct Semantics

With and without JavaScript: in any case, it's important that your content is marked up semantically correctly. Headings loaded via JavaScript need to be marked up with the appropriate HTML elements as well; the same rules apply as for static content. In the end, what counts for Google is what the HTML source looks like after JavaScript has been executed. Thus, choosing the right elements is crucial.
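A short example of the difference, with an assumed "article" container:

```javascript
// The crawler evaluates the DOM after JavaScript has run, so dynamically
// inserted content needs the same semantic elements as static markup.

// Avoid: a styled <div> carries no meaning for the search engine.
var bad = document.createElement('div');
bad.className = 'looks-like-a-heading';
bad.textContent = 'New Section';

// Better: a real heading element, just as in hand-written HTML.
var good = document.createElement('h2');
good.textContent = 'New Section';

document.querySelector('article').appendChild(good); // assumed <article> container
```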

Testing Crawler View

If you choose to load content exclusively via JavaScript (not following the "Progressive Enhancement" principle), you should test whether search engines can see and crawl your content correctly and completely. Google's "Search Console" is one tool that helps here: under "Crawl", you'll find the "Fetch as Google" feature, which displays a website for mobile and desktop devices the way Google actually crawls it. In addition, there are other, mostly paid, tools such as SEO.JS and prerender.io that specialize in checking whether a website's JavaScript content is displayed properly during crawling. These can be a sensible addition for complex projects.
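If you want to approximate the crawler's view locally, a headless browser can output the DOM after JavaScript execution. Here is a small sketch using Node.js and the "puppeteer" package; the URL is a placeholder, and any comparable headless browser will do:

```javascript
// Approximate what a crawler sees: render the page headlessly and
// print the HTML *after* JavaScript has run.
// Requires: npm install puppeteer
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com/', { waitUntil: 'networkidle0' });
  console.log(await page.content()); // the rendered DOM, not the raw source
  await browser.close();
})();
```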

Denis Potschien

Denis has worked as a freelance web designer since 2005.
