Full Transcription
Full transcription of John Mueller’s Office Hours Hangout on May 28, 2021.
Q 1:08 – My first basic question is about web hosting and domain registration. Suppose I want to launch a site in Germany. Is it required that I register a .de domain in Germany and host the website in Germany for better SEO?
A 1:33 – “No, it’s not required. In general, if you want to use geotargeting, there are two ways to do that. One is to use the country-level top-level domains, which would be .de for Germany in that case. The other is to use a generic top-level domain and to use a geotargeting setting in Search Console. So that could be, for example, a .com website, or .net, or .info, or .eu, or whatever. Any of those would also work, and then you just set geotargeting for Germany. The hosting location is also not required. That’s something where, way in the early days before we had the setting in Search Console, we used the hosting location as a way to guess which country the website might be targeting. But nowadays I don’t think that’s used at all. And with a content delivery network, if you have a fancy international website, then the hosting location doesn’t matter anyway, because you always have some local presence automatically.”
Q 2:45 – The next question is also on the Germany side. I have an English website, and now I want to make a German website, so I want to re-ask the duplicate content question in that context. Suppose I use a translator like Google Translate to translate the content into German. Will Google tell me that it is duplicate?
A 3:13 – “No. If it’s translated content, it’s not duplicated content.”
Q 3:18 – But what if I have translated it using Google Translate only?
A 3:21 – “I think that’s a different problem then, though. So just in general, translated content is unique content. It’s different words, different letters on the page, so it’s different content. Depending on how you translate it, that would be more of a quality issue. So if you use an automatic translating tool and you just translate your whole website automatically into a different language, then probably we would see that as a lower quality website, because often the translations are not that great. But if you take a translation tool, and then you rework it with maybe translators who know the language, and you create a better version of that content, then that’s perfectly fine. And I imagine, over time, the translation tools will get better so that it works a little bit better. But at least for the moment, if you just automatically translate it, from a quality point of view, that would be problematic. And even a step further, if that’s something that is done at scale, then the webspam team might step in and say, this is automatically generated content. We don’t want to index it.”
Reply 4:36 – So what if I do it with human intervention, with intelligence? This question was asked way back in 2011, and the answer is already in Google Search Central. So in 10 years, Google Translate has maybe become a million times better.
John 5:01 – “Yeah.”
Reply 5:02 – Yeah. So it’s just a million times better. I will not generate the whole site that way, but I will take a piece of the content and translate it, because if I go to a translator, they will charge me in euros, and it will be very costly. An expression like ‘what is your name’, suppose, can be easily translated. But there are complex situations where grammar and other things are involved, say a long sentence of 15 or 20 words. So I could translate some of it and then hire somebody, a freelancer, who could look into it and make changes. Would that be good?
John 5:43 – “I think that’s a good start. I think you have to consider the quality aspect.”
Reply 5:50 – If it’s good quality, then there are no issues?
John 5:52 – “Yeah.”
Reply 5:53 – Good quality content meaning the grammar should be good and the sentences should be meaningful?
John 6:03 – “Just like what kind of content you would expect in your own language. Like if you’re searching in your language, and you find a page, and you read it, and it’s like ‘I don’t know who wrote this, this doesn’t make much sense’ then you wouldn’t trust that page, right? Essentially, it’s the same thing. You’re creating content for German users, and if they look at it and say ‘oh, this doesn’t make much sense’, then they’re going to go somewhere else.”
Q 6:38 – I have some more questions. On the sitemap side, I have one doubt. The sitemap priority attribute is not used. Does Google’s algorithm use the lastmod attribute?
A 6:55 – “Yes.”
Q 6:56 – About the time? Suppose I have not updated the sitemap, but I keep on updating my content. My content is updated, but the sitemap is not hooked up to Yoast or any other plugin. So I have been updating my content, but I forgot to update my sitemap: the sitemap dates from January, but my content is current as of May. What do I do now?
A 7:17 – “So we use the sitemap to help us to crawl better, but it doesn’t replace crawling. So if your sitemap is old and your content is updated, we will crawl your website normally in addition to the sitemap. But if you tell us in a sitemap file that these pages are new or these pages have changed, then we can crawl more efficiently. So the sitemap helps us, but it doesn’t replace the normal crawling.”
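For reference, this is what the lastmod attribute John mentions looks like in a sitemap file – a minimal sketch with placeholder URLs and dates:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- lastmod tells Google when this page last changed -->
    <loc>https://example.com/guide/</loc>
    <lastmod>2021-05-20</lastmod>
  </url>
  <url>
    <loc>https://example.com/contact/</loc>
    <lastmod>2021-01-10</lastmod>
  </url>
</urlset>
```

Keeping these dates accurate is what lets Google prioritize recrawling the pages that actually changed.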
Q 7:43 – And in the sitemap, do I need to keep the support and contact-us links, or can I leave them out? Because these links are already on the website.
A 7:54 – “Up to you. You don’t have to list all URLs in a sitemap.”
Reply 7:59 – So all URLs that I wish to be indexed should be in the sitemap?
John 8:03 – “You can do it however you want. Some people list everything. Some people just list the important parts. Some people just list one section of the website in the sitemap. It’s totally up to you.”
Q 8:39 – Is using superscripts to give out links a good thing to do? For example, I have seen blogs where they write the sentences and attach a reference to a particular word. Say there is a sentence like ‘I am an SEO’, and there is a superscript on SEO in square brackets. When you click on that superscript, you land further down the page, where the definition of SEO, or things related to SEO, are listed, and there you will find the link. So is it a good practice to use superscripts to give out links?
A 9:15 – “I think, on the one hand, we can find the link. So it’s not problematic. But we do get a lot of value out of the context of individual links. So in particular, we look at the anchor text for the link. We look at the part before and after the links as well. So with a superscript, you’re essentially just listing the URL, and you’re listing the URL out of the context of the rest of the page. And that makes it a lot harder for us to understand the connection between your content and what you’re linking to. So I would try to avoid that if you can. There might be usability reasons why that makes sense in certain cases. But in general, we do like to try to find the link within the context of your pages.”
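To illustrate the difference John describes, a minimal sketch (URLs are placeholders): in the footnote style, the anchor text and surrounding sentence carry almost no information about the link target, whereas an inline link does.

```html
<!-- Footnote style: the clickable text is just "[1]", far from any context -->
<p>I am an SEO<sup><a href="#note1">[1]</a></sup>.</p>
<p id="note1">[1] <a href="https://example.com/what-is-seo">https://example.com/what-is-seo</a></p>

<!-- Inline style: anchor text and surrounding words describe the target -->
<p>I am an <a href="https://example.com/what-is-seo">SEO</a>, a search engine optimizer.</p>
```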
Q 10:21 – I have a Dutch university that I’m working for and I want to implement the hreflang tags for that university. We have a lot of Dutch pages, and English is our second language. That’s what we want to target all the other countries with. The only problem that we have, or that I face, is that not all the pages have a translated version. Because some pages are only in English, some pages are only in Dutch. But how do you deal with the hreflang tags in that scenario? Do you just simply skip the hreflang on the pages that do not have a translation, or do you keep it there but not point to other pages? What is the best practice in that?
A 11:03 – “So the hreflang annotations are on a per-page basis. So you don’t have to do that across the whole website. So there are two, I guess, approaches that you can take there. One is to skip it on those pages. The other is just to leave them there and just have one hreflang tag. Like, you just have the Dutch version, or you just have the English version of a page. And we will look at that and say, oh, there is no hreflang pair, so we will skip that hreflang. So it’s kind of like, depending on what your process is, I would say both of those are essentially equivalent for us.”
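A sketch of the two options for untranslated pages, with placeholder URLs:

```html
<!-- Page that exists in both languages: annotate the pair -->
<link rel="alternate" hreflang="nl" href="https://example.edu/nl/toelating/" />
<link rel="alternate" hreflang="en" href="https://example.edu/en/admissions/" />

<!-- English-only page: either omit hreflang entirely, or leave a single
     self-referencing annotation – with no pair, Google skips it -->
<link rel="alternate" hreflang="en" href="https://example.edu/en/research-news/" />
```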
Q 11:52 – I actually have two questions. The first question we received from our client. When someone searches for their brand name, they can see the home page and other pages linked. The organic search shows their Twitter account, their Facebook account, and one of their product review site accounts. Then it shows a website which is from the USA but has a similar brand name to our client’s. Now, my client’s question: they have an Instagram page, which they are updating regularly, but that Instagram page link is appearing on the second page of results. Why is the USA site appearing on the first page and not the Instagram page? And their Twitter page, which they have not used for the last four years, is also getting on the first page, but not Instagram. What is the logic behind this, and what can we do?
A 12:51 – “I don’t think there is any kind of special logic behind it. It’s essentially just kind of normal ranking as we would do. So that’s something where sometimes your pages will show up for queries, sometimes other people’s pages will show up for queries. And that kind of happens naturally. I think there might be two things which they could do which could help, potentially. One is to use the structured data on the pages to say that certain other pages are the same as the home page and to link to the different profiles that they have. I’m not sure exactly what the name is of the structured data type. But you can link to different kinds of social media profiles that you have for a company from your primary website. And the effect there is that it helps us to better understand that these belong together. And then maybe, we will rank them a little bit closer together. I think it’s always tricky because maybe that other person who has a similar name is like, why don’t I show up for these queries more in comparison. So that’s, for example, something that I see a lot with my name. So if you search for my name, you can find me, but you can also find lots of other people who are called John Mueller. And some of them are pretty famous. And I imagine they’re also like, who is this guy from Google, and what is he doing in my search results, and how can I get rid of him? And that’s something where it’s like, sometimes there just are different pages that have a similar name or that have the same name, and they show up in the same search results. The other thing that you could do is if there is a knowledge graph entry or a knowledge panel on the side for the company that we show, you can verify that, and you can add the other social media profiles there as well. So that’s something that wouldn’t necessarily affect the ranking of those pages, but it makes it a little bit easier that if we show the knowledge graph entry for the business, then you can go to the social media profiles directly and kind of view whatever they’re posting on social media. So those are kind of the two aspects that I would look at there.”
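John doesn’t recall the name of the structured data type, but what he describes matches the sameAs property, typically used on Organization markup. A minimal sketch with placeholder names and profile URLs:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Brand",
  "url": "https://www.example.com/",
  "sameAs": [
    "https://www.instagram.com/examplebrand/",
    "https://twitter.com/examplebrand",
    "https://www.facebook.com/examplebrand"
  ]
}
</script>
```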
Q 15:21 – OK, the second question. I’m not quite sure actually whom to ask this question, because it is a little bit tricky for me. For our company, when we search on Google Maps, it shows the Google Business listing automatically. There is no problem with that. But when we search on Google organic search, it shows all the results related to our company, but it does not show the Google Business listing on the right side of the results. I’m not sure whom to ask this question, whether it is you or the Google Business listing team.
A 15:58 – “I don’t know. I feel that’s also more of a ranking question in terms of we can find your business and show it in the normal search results, but maybe we’re not showing the Google My Business entry for a business like that. And I don’t know the details of how the Google My Business profile ranking side of things work. I could imagine it also takes into account things like the location, where maybe if your business is not actually local or nearby, then maybe we wouldn’t show the Google My Business entry in the side. But I don’t know. My feeling is there is nothing technical that you’re doing wrong if the listing is available, and it’s just not ranking in the search results. But you can still ask, I think the Google My Business team has a separate help forum specifically for kind of companies that are listed in Google My Business. So I would double-check with them to see if there is something that you can do. Maybe there is a simple trick to make it easier to kind of connect those two worlds. But my feeling is that probably, this is just a ranking question.”
Q 18:28 – My question is regarding title tags. Traditionally, I was used to having just one title tag at the top of the page, which would describe the subject of the page. But I’ve recently been using a WordPress page plugin that’s been adding title tags to SVG images. So when I look at the source code, I end up with approximately 19 title tags. Now, designers are telling us it’s perfectly OK, Google will recognize them. But I feel really uncomfortable about that, and what I’d like is some clarification on title tags within SVG images. Do you ignore them altogether, or do you treat them as something completely different from the main title tag?
A 18:18 – “I think this was an issue a couple of years back. But I think we’ve since resolved that. And in general, I wouldn’t worry about this, especially if you’re seeing the search results are OK. So that’s something where I would do things like go into Search Console and look at the top queries that people are using to find your website and then just try those searches out. And then look at the titles that are shown there and if that looks OK, then I wouldn’t worry about it.”
Reply 18:51 – Well, the title tags are showing correctly. But the reason I’ve looked in the code is that there are pages that I feel have been indexed by Google but are not ranking where I think they should be for the quality of the content that’s there. They’re not ranking in the top 100. Really well-written, unique content. I think it should be appearing somewhere, and I’m looking for the reason. Do you think it’s possible that the title tags might contribute to confusion for Google?
A 19:23 – “I don’t think so, no. It’s something where, at least at the time when I first saw this and we had problems with that, it was more a matter of we were accidentally taking these titles and showing them as titles in the search results. And that’s kind of annoying, but it wouldn’t cause any issues with the ranking of the pages. So just because you have multiple titles on a page, especially if they’re in SVG files, that’s something where we would say, like, whatever. That’s perfectly fine. The other place I might watch out for this is in image search results if those SVG files are images that you want to have appear in image results.”
Reply 20:08 – No, they’re Facebook and Instagram icons.
John 20:14 – “Yeah. I wouldn’t worry about that. That’s perfectly fine.”
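For context, the extra title tags in question are `<title>` elements inside inline SVG icons, which act as accessible names for the graphics and are distinct from the page’s `<title>` in the `<head>`. A sketch (URLs and shapes are placeholders):

```html
<head>
  <!-- the one title Google uses for the search result snippet -->
  <title>Contact Us – Example Site</title>
</head>
<body>
  <!-- each inline SVG icon can carry its own <title> for accessibility -->
  <a href="https://facebook.com/examplebrand">
    <svg role="img" viewBox="0 0 24 24" width="24" height="24">
      <title>Facebook icon</title>
      <path d="M4 4h16v16H4z" />
    </svg>
  </a>
</body>
```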
Q 20:43 – We’ve recently made large changes to our sitemap file, both adding and removing pages, in an effort to improve our SEO. To our surprise, each change significantly reduced our impressions and/or clicks. Does Google penalize directly or indirectly for large sitemap changes? Or on the other hand, does Google boost more long-standing content?
A 21:09 – “No. So I think for both of those questions, it’s a clear no. In particular, sitemap file changes are perfectly fine. And some websites have a lot of sitemap changes that they do because they make changes to their pages a lot. And that’s perfectly fine. The sitemap file helps us to crawl more efficiently. The sitemap file is not a ranking factor on its own. So just because it exists, because you’re changing something, because it stays stable, that is not a ranking factor. Also, with regards to long-standing content, we don’t kind of boost long-standing content, or evergreen content, or content that hasn’t changed. It’s something where I think the effects that most people see is more around what people actually are expecting to find in the search results. And sometimes, you want something more like a stable reference to find more information on a topic. Sometimes, people want to find kind of the newest updates on a specific topic. And that kind of intent can change over time. Or usually, it does change over time. Like, for example, if you’re looking at, I don’t know, a vacation place for example, then maybe, if nothing is happening, if everything is fine, then you would expect to find kind of more stable content about that location. Whereas if some event took place in that location just recently, then you expect to find more news, kind of new updated content about that location. And then over time again, it’ll shift more towards the evergreen content again. And these are things that our systems try to recognize, and they do change the rankings of these things over time depending on what we expect that users are trying to find.”
Q 23:09 – We see a spike in traffic shortly after introducing new types of pages followed by tapering off, though we don’t expect our users to behave any differently based on how long the content has been live. Our content isn’t very time based, nor at all newsy. Do you have any thoughts on why we might see this sort of release spike?
A 23:30 – “So I guess this is almost like the opposite of the previous question, where the previous person was like ‘oh when I make some new changes, then everything goes down’. And this person is like ‘when I make changes, everything goes up’. I think probably what is happening in this particular case is that we’re seeing new content for a website. And especially when it comes to new content on a website or new websites overall, there’s kind of this period where we recognize the new content. We can crawl and index the new content. But we don’t have a lot of signals for that new content yet, and then we have to make assumptions. And our systems try to make assumptions where they think this is probably in line with the rest of the website. But sometimes, those assumptions are kind of on the high side where we say ‘oh, this is fantastic content, probably’. And sometimes the assumptions are more on the lower side, we’re a little bit more conservative. And we’re like, ‘ah, we have to be careful with showing this new content’. And that’s something where you will see that sometimes new content performs particularly well for a period of time and then it settles down again. Sometimes it performs kind of badly initially and then settles down in a higher state. This is something which is essentially just our systems kind of trying to figure out where this new content should fit in before we have a lot of signals about the content. And in the SEO world, this is sometimes called kind of like a sandbox where Google is keeping things back to prevent new pages from showing up, which is not the case. Or some people call it the honeymoon period where new content comes out and Google really loves it and tries to promote it. And it’s, again, not the case that we’re explicitly trying to promote new content or demote new content. It’s just, we don’t know, and we have to make assumptions. And then sometimes, those assumptions are right, and nothing really changes over time. Sometimes, things settle down a little bit lower. Sometimes a little bit higher.”
Q 25:45 – I’m trying to add my sitemap to Search Console and receiving a 403. We have a firewall blocking the request. Can we get the region or IP addresses Google uses to access the sitemap, so that we can whitelist them?
A 26:02 – “So we don’t have a list of the IP addresses for Googlebot because they can change over time. But we do have a method of you verifying whether or not a Googlebot request is legitimate or not. And we have that documented in our Help Center. So in a case like this where you really need to kind of block everyone else and only let Google look at the content, that’s something that you can do. Specifically for sitemap files, we’ve said that’s OK. I would, of course, kind of like watch out for the other search engines as well so that you don’t block them from accessing your content because otherwise, they wouldn’t be able to access it. But essentially, for sitemap files, you can verify the request. Usually, the way that this is done is you look at the user agent. And if it’s a Googlebot user agent, then you take the IP address that the request has and you do a reverse lookup to find the hostname for that IP address. And then you do another lookup of the hostname to double-check that the IP address is correct. And that’s something that you can do at scale. You can cache these for a while. You can kind of keep them like that for a bit and collect those IP addresses if you want. I would just be careful with the assumption that one IP address will always be the same one because we do have different data centers in different locations. And sometimes, you end up with different IP addresses that are used for crawl. So I would regularly, at least, update your lists if you’re generating these lists on your own.”
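A minimal Python sketch of that reverse-then-forward DNS check (the googlebot.com/google.com suffixes are the ones Google documents; the sample IP is from Google’s own example):

```python
import socket

def is_googlebot(ip: str) -> bool:
    """Verify a request claiming to be Googlebot via reverse + forward DNS."""
    try:
        # Reverse lookup: IP -> hostname
        hostname, _, _ = socket.gethostbyaddr(ip)
        # Genuine Googlebot hosts end in googlebot.com or google.com
        if not hostname.endswith((".googlebot.com", ".google.com")):
            return False
        # Forward lookup: the hostname must resolve back to the same IP
        _, _, addresses = socket.gethostbyname_ex(hostname)
        return ip in addresses
    except (socket.herror, socket.gaierror):
        # No reverse DNS entry, or the forward lookup failed
        return False

# Cache positive results for a while rather than re-verifying every request,
# and refresh the cache regularly, since Googlebot IPs change over time.
print(is_googlebot("66.249.66.1"))  # True for a genuine Googlebot address
```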
Q 27:50 – If we have a blog post about the finances of a company which is only relevant for a few months per year until the next report comes out, is it better to update that blog post or to create a new one in terms of SEO for Google? Does it hurt SEO to have a page change every month or every three months?
A 28:09 – That’s something that comes up regularly. Our recommendations are essentially, this is assuming that you want these financial pages to show up in search, maybe taking a step back if you’re just like, “Well, we put these pages out there, but we don’t really care if people find these pages. We care about the rest of our website.” In a case like that, you can do whatever you want anyway but if you do want these pages to be found, then usually, the approach is to have one stable URL for this content. Usually you would want to keep the older versions of the content and move those to an archived section. Essentially you would have the current report on a clear URL like “financialreport.html”, and you would take the last report that you have and move it to something like “2020 financial report”, like an archive section of your site. The goal here is essentially that primary page that you have, the normal financial report page, that page is the page that collects the signals over time, where over time we will see if someone is searching for the financial report for your company, this is the place to go. If someone ends up searching for the financial report for 2019 for your company, then we can still dig up those archive pages as well but essentially, the primary financial report page would not be relevant there. You can apply the same technique for anything that you update regularly. If you have an event that takes place once a year or every couple of months, then you have one stable event landing page and then from there, you link to an archive section with the older versions of that event. If you have products where you say, “I have an Android phone 27”, and it replaces the Android phone 25 or whatever, then people will be searching for “Android phone” and they want to find that “Android phone” general stable landing page. That’s the approach I would take there. Just have one stable landing page for the overall topic that you keep updating and move the old content into an archive section so that if people are searching for the old content, they can still find it. Most people, when they’re just searching for the topic overall, they can find that stable page and overall, over time, that stable page will just kind of gain in importance across your website anyway. That’s kind of the approach I would take.
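Sketched as a URL structure (paths are placeholders):

```
/financial-report/               ← stable URL, always holds the current report
/archive/financial-report-2020/  ← previous reports move here
/archive/financial-report-2019/
```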
Q 31:14 – A question with regards to iframes. My site has a page with multiple iframes, which are updated monthly. The source is graphs that I make with Plotly JavaScript, which I then host as separate pages on the site; they’re not available anywhere else on the website. Should I put a nofollow on those pages so that they don’t get ranked?
A 31:38 – I don’t think you can do a nofollow on an iframe, so probably that doesn’t matter so much. What you could do is use a rel=canonical on the iframed content to point back to your general page. That makes it a little bit easier, but you can’t put a nofollow on the iframe.
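Concretely, that would be a canonical link in the `<head>` of each document that gets loaded inside an iframe, pointing at the page that embeds it (placeholder URL):

```html
<!-- In the <head> of the iframed graph page -->
<link rel="canonical" href="https://example.com/monthly-stats/" />
```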
Q 32:00 – The second question refers to the PageSpeed Insights test. The pages appear to be very slow due to having to load the Plotly library for each frame separately instead of just caching it. Do you think I would be better off forgoing the iframes’ interactivity and the graphs altogether and just switching them out for pictures in order to rank better?
A 32:25 – Possibly. I don’t know. On the one hand, I would try to figure out what you can do to make the JavaScript faster in terms of caching. There are a lot of things that you can do with regards to how you serve content from your server. With caching you can probably do some things to help improve the speed of those JavaScript libraries. I wouldn’t say it’s absolutely necessary to get rid of all of the JavaScript but probably, you can do quite a bit to improve the speed as well. The other thing is, with regards to the content of the iframes, one of the things that happens here is if you’re generating these graphs with JavaScript, we would not be able to index them as images. If for example, you have fancy images that you’re creating and people are actually looking for those images in image search, then we wouldn’t be able to index those as images because we would essentially process the JavaScript and see “oh, it does something fancy with the canvas or whatever.” but we wouldn’t see that this is actually an image that we would be able to index. If you’re seeing that people are looking for these graphs using Google Images, then I would at least make it possible that people can view them as images. The final thing here with regards to speed in particular, it is possible that you can achieve a significant speed boost by swapping out the JavaScript libraries against static images. Sometimes there are things that you can do to make it so that they’re still interactive, such as having the images as placeholders, and when you click on them, then you go to the JavaScript version of the page, and you can interact with those images a little bit more directly. It depends a little bit on your pages themselves and whether or not that actually makes sense. If everyone is always interacting with those JavaScript pages, then probably you want to kind of keep it interactive by default however, if most people are just looking at those images, then maybe you don’t need to make them interactive by default, and you can just cache the image. Make it so that when people click on it that they can go to the JavaScript version, and then do the interactive bit there. Those are kind of the aspects that I would look at there but again, just with regard to JavaScript library, there are a lot of things you can do with JavaScript and with the serving content from your server that can make it fast too. I think there are probably also some other aspects with regards to speed and ranking that you could look at here. In particular, if this is a small part of your website, then probably the overall aggregate data for your website with regards to speed will be independent of these particular pages. If this is the primary part of your website, then maybe it is a stronger aspect of the overall speed of your website.
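A minimal sketch of the placeholder approach described at the end of that answer: serve a static snapshot by default and swap in the interactive Plotly page on click (all paths and sizes are placeholders):

```html
<div id="graph">
  <!-- cheap static snapshot; also indexable in Google Images -->
  <img src="/graphs/sales-2021.png" alt="Monthly sales 2021"
       width="640" height="480" style="cursor: pointer;">
</div>
<script>
  // On click, replace the snapshot with the interactive Plotly version
  document.querySelector("#graph img").addEventListener("click", () => {
    const frame = document.createElement("iframe");
    frame.src = "/graphs/sales-2021.html"; // the Plotly page
    frame.width = "640";
    frame.height = "480";
    document.getElementById("graph").replaceChildren(frame);
  });
</script>
```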
Q 35:37 – A question about responsive web design and display:none under mobile-first indexing. Can I assume that content which is in the HTML and seen on desktop, but can’t be seen or accessed from a smartphone because the responsive design uses display:none, is indexed and evaluated equally with other visible and accessible content under mobile-first indexing? The question has a little bit more detail on what’s happening here.
A 36:28 – In general, we would be able to index this content, and we would be able to show it in the search results. I would, however, go with the caveat that you can test this yourself, and you can double check to see if this is actually ranking well or the way that you want it to rank in the search results just by searching for your content yourself. In that regard, it’s something where you don’t have to take my word but rather, you can try it out and see what actually happens there. Purely from an indexing point of view, we would be able to pick that up and we might not recognize that on mobile, you would never be able to access this content. If it’s accessible on desktop, from our point of view, that’s also OK. It’s not the case that the mobile version has to be one to one exactly the same as what is visible on the desktop version. It can be the case that there are actual content differences there. Most sites at the moment are on mobile first indexing already so perhaps your site is on there as well and then you kind of see exactly what’s happening already. We’re still kind of struggling with some of the last sites with mobile first indexing so hopefully, we’ll be able to shift more of those over the next couple of months or so.
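For clarity, the pattern under discussion looks something like this (class name and breakpoint are placeholders): the content is in the HTML for all devices, and a media query hides it on small screens.

```html
<div class="desktop-only">
  Detailed comparison table that only desktop visitors can see…
</div>
<style>
  /* Present in the HTML everywhere, hidden on viewports under 768px */
  @media (max-width: 767px) {
    .desktop-only { display: none; }
  }
</style>
```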
Q 38:02 – One part of this I also sometimes get confused about. If a site is on mobile-first indexing, then Google will take the content which is appearing for the mobile device. That is one thing for indexing, and the next step is ranking. You have also said that you consider mobile content for ranking and indexing purposes. Suppose we serve some content on desktop, but not on mobile. How will Google take this kind of setup for ranking and indexing purposes?
A 38:49 – If your site is on mobile first indexing and you have content that is only in the HTML on desktop, we would not be able to index that. That would be essentially invisible to us. The way that you can test is by using the inspect URL tool in Search Console to do a live fetch with the mobile version and to look at the HTML to see if it’s actually in there. If it’s not in the HTML on the mobile version, if it’s only visible for desktop users for example, then we would not be able to index it, and we would not be able to use that for ranking.
Q 39:37 – There’s the upcoming Google Page Experience update in mid-June. I want to optimize my page speed, but there is one issue, which is the reCAPTCHA JavaScript. How do I optimize that?
A 40:01 – I don’t know. It’s something where if you’re using JavaScript on your pages which is slowing things down for your users, then we would see that as your page being slow. It’s not the case that we would say, “oh, this is reCAPTCHA, we will ignore it”, or “we don’t count it against your website”, or “this is Google Ads, so we won’t count it against your website”. It’s really, users are seeing that things are slow, and that’s what we use for the speed of–
Reply 40:38 – We’re really facing issues with the reCAPTCHA JavaScript showing as a third-party script. Is there any way to defer that particular JavaScript?
John 40:52 – I don’t know how the reCAPTCHA script is embedded. That’s something where sometimes, the way that you embed a third-party script can make a big difference in how the speed is seen by users. Sometimes scripts are just slow, and I don’t know the reCAPTCHA script in particular. I could imagine that it has to do something to recognize, is this a user or is this not an actual user. Maybe some of that is more processing intensive and just takes time. If that’s the case, then I would just think about what you can do to improve the speed of your site overall. Maybe that includes removing some of these scripts or removing some of these scripts from pages that don’t necessarily need that script. I don’t like to say, “oh, you should remove Google’s scripts from your pages” but if these things are slowing your pages down, then that’s something that sometimes you have to do. Sometimes it’s also something where you can contact the team that is implementing the script and say, “hey, your script is so slow. I’m going to remove it unless you make it faster.” and maybe if they get this feedback, they’ll be able to improve things. I don’t know.
Q 42:36 – Does getting more Google reviews or responding to them increase your SEO ranking in any way?
A 42:44 – As far as I know, no. It might be that it affects something in Google My Business in the local listings, in the Maps listings but at least from an SEO point of view, we don’t look at the number of Google reviews that you have or the back and forth that you have with regards to Google reviews.
Q 43:06 – How do you deal with hreflang when there’s not a translation for every page?
A 43:09 – I think we touched on this.
Q 43:11 – I implemented custom tracking for my website that is triggered by JavaScript on page load and hits a subdomain. Looking at the crawl stats for my domain property, I identified that one third of all crawls are on that tracking subdomain. Could this be interfering with the crawling of the main subdomain? What would be the best approach to keep this tracking subdomain from being crawled?
A 43:35 – I guess, first of all, if it’s a tracking script and it’s not necessary for your content, you can block it by robots.txt. If you block it by robots.txt, we would not crawl that part of your site, so we wouldn’t take that into account. That’s kind of the practical effect there. My guess is that this wouldn’t be negatively affecting your website anyway. It is very possible that you’re seeing this as something that is visible in the reports but not necessarily as something that is actually causing issues on your website. In particular, if we can crawl your normal content quickly enough, if we can pick all of the changes, the new things that you put out, if we can pick them up quickly enough for you, then I would not worry about this. I think it’s like something you could optimize with robots.txt to block the tracking subdomain. If you’re not absolutely sure what the tracking subdomain all does– maybe there is also an API there that pulls in content, or I don’t know– then I wouldn’t necessarily just block it by default and say, “oh, I don’t want any of this crawl” because it could have other effects as well. If you’re sure that this is just a tracking script, it doesn’t affect the rest of your pages, everything loads normally without it, go for it.
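Since robots.txt rules apply per hostname, the blocking would go in a robots.txt served on the tracking subdomain itself (hostname is a placeholder):

```
# Served at https://tracking.example.com/robots.txt
User-agent: *
Disallow: /
```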
Q 45:05 – Search Console mobile usability report says the text is too small to read. What is the minimum font size we should maintain?
A 45:13 – I don’t know. I looked briefly for our documentation on this. I don’t think we have the exact font size documented. However, in the Lighthouse tool, where there’s also the mobile friendliness test that you can do, it looks out for a 12-point font. That might be something to aim at, or at least try out the Lighthouse tool and see how that works out there. One of the things I’ve seen with the font-too-small flag in the mobile friendliness test is that sometimes it is triggered when the CSS for your pages can’t be loaded properly. Then we think, “oh, it’s a big desktop-size website, so we have to shrink the whole page down”, and then the text looks a lot smaller. That might be something that throws things off a little bit. In particular, if you’re seeing this issue being flagged for individual URLs across your website and not for the whole website overall, then it might just be that, “oh, we weren’t able to get the CSS file once” and next time, we can get the CSS file, and it’ll be fine again.
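Two things worth checking based on that answer (values are illustrative): a base font size comfortably above the roughly 12px threshold that Lighthouse flags, and a viewport meta tag, since without one the page is rendered at desktop width and scaled down, which shrinks the text.

```html
<head>
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <style>
    /* 16px base keeps body text well above the legibility threshold */
    body { font-size: 16px; }
  </style>
</head>
```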
Q 46:19 – I’m running a website using WordPress, and I have a question about structured data for non-AMP pages. Is there anything else that I should be setting besides what’s in the Search Central guidelines? And if I’m concerned about structured data, is it better for SEO to make my site AMP?
A 46:36 – So AMP or not AMP is essentially independent of all structured data questions. For structured data, I would primarily focus on the aspects that you can see in the search results. That means if you have content that matches a special format that we would show in the search results, I would try to use the right structured data to be visible like that. For example, if you have recipes on your site, then using the recipe markup makes it possible for us to pick up nutritional information, to show a nice thumbnail image, all of that. If you don’t have recipes on your website, then obviously, there’s no need to add recipe markup to your pages. That wouldn’t make much sense. My recommendation here would be to see the AMP or not AMP question as something separate. A lot of times, using AMP is a great way to make a site very fast. If you’re struggling with speed with your non-AMP version, then maybe shifting to an AMP version or going with the AMP plugin would be an approach. Independently of that, on the structured data side, I would go through the structured data that we have documented on our site and think about the different types that we show there and think about which types might make sense for your website, and based on that, then decide the actual markup that you put on your pages.
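For example, the recipe markup John mentions, expressed as JSON-LD (all field values are placeholders; Search Central documents the full set of properties):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Recipe",
  "name": "Simple Pancakes",
  "image": "https://example.com/images/pancakes.jpg",
  "recipeIngredient": ["2 cups flour", "1 egg", "1 cup milk"],
  "nutrition": {
    "@type": "NutritionInformation",
    "calories": "270 calories"
  }
}
</script>
```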
Q 48:26 – I have a website consisting of a million pages. We sell automotive parts, so putting unique static content on every page is hard for us. We just change up the description and part number on every page. My question is, first, does Google consider this duplicate content? And secondly, is there any other way we can create dynamic content for my pages?
John 49:00 – What kind of dynamic content is that?
Reply 49:04 – We are selling automotive parts only. They’re like catalog pages. We take the information from the user, and then we capture a lead from that.
John 49:18 – OK. So it’s basically a big catalog of parts that you have and you’re creating web pages for that, right?
Reply 49:23 – Correct.
A 49:31 – I think that’s perfectly fine. It’s essentially a database-driven website, which tends to be generated dynamically like that. From our side, we wouldn’t necessarily differentiate between you generating the HTML pages from a database or generating the HTML pages from static files. We don’t see that difference. We essentially just look at the content that you’re providing, and if the content is useful, if it matches what people are searching for, then that’s perfectly fine. How you get to that content is up to you. Sometimes, it makes sense to go through a database. Sometimes, there are separate text files. That’s totally up to you. I think the important part here really is that the content itself has to be useful and compelling. For example, say you’re a reseller for a catalog of products. Maybe for a big car manufacturer, you have all of their parts, you get their database of parts, and you just put that database of parts online as well. Then essentially, you’re not providing a lot of unique value by just putting the same database online. You really want to focus a little bit more on the additional value that you can provide past just taking someone else’s database and putting it online.
Reply 51:15 – Currently, we are facing de-indexing of these pages. We are not taking this data from other sites; it’s in-house content only. We are just taking the part numbers and so on. We are facing de-indexing issues now. Could this be a reason for that?
John 51:40 – It’s hard to say without knowing more about your website. I think you’re probably on the right track. The thing with indexing a large database like that is that our systems are also very dynamic in that they will kind of re-evaluate how much content we need to have indexed from individual websites over time. Especially with larger websites, it’s completely normal that sometimes a lot of content is indexed. Sometimes, less content is indexed. These kinds of fluctuations happen all the time. What I would watch out for here is that within your website structure, you structure it in a way that the critical pages are recognizable for Google as being critical pages. That could be things like your category pages, or your subcategory pages, or the blog posts or content that you have written about your products. In that sense, we don’t need to have all of the individual products indexed if we have your category pages indexed, for example, because on the category pages, you also list the products, and if someone is looking for a specific part number, they can find your category page, and from there, they can still go to the individual part. What I would watch out for with a larger site is to assume that some of this content will not be indexed, and to assume that with the kind of structure that you provide for your website, you can guide Google to focus on the parts that you think are important. As long as those important parts are indexed, then the rest can fluctuate a bit. If you have a million products and, let’s say, a thousand category pages, and those 1,000 category pages are all indexed, then the fluctuations with those million products – that’s a big number of fluctuations, but it doesn’t matter so much for your overall business, because people can still find the content that they’re looking for.
Q 54:13 – I just wanted to know, what are the best practices to handle JS render blocking?
John 54:24 – To handle what?
Reply 54:25 – I’m sorry. JS render blocking. JavaScript render blocking.
John 54:30 – JavaScript render blocking. So to block JavaScript from being rendered, or to fix it so that it–
Reply 54:38 – Optimize. To fix it.
A 54:46 – Yeah, I think there’s a big range of things that you can look at here, so it’s really hard to say what the best practices are. I would recommend, as a start, at least going through all of our documentation on JavaScript and SEO, because we have a lot of things documented. We have a lot of videos about this as well. And use the tools that we have to double-check your pages. It might be that your pages are all OK in that regard, and you don’t have to do anything special. It might be that you recognize some issues that you have to focus on. The other thing that I would recommend doing here, if you run into specific technical issues with regards to JavaScript and rendering and SEO, is to make sure to join the office hours with Martin. He does them, I think, every other week or so, and he’s our expert on everything around JavaScript and SEO. If you come with a technical challenge to him, then he will be super happy to help you try to figure that out. That’s kind of the approach that I would take there.
Reply 56:01 – OK. If I am going to defer my JavaScript from loading, is that workable?
John 56:08 – It can certainly be OK. These are techniques that you can do to improve the speed of your pages. Depending on what actually happens on your pages in that regard, it can have an effect on improving the speed for users without necessarily causing issues for SEO. Deferred loading, for example, is something that essentially takes the JavaScript and processes it a little bit later. If that JavaScript is not critical for your pages, if it doesn’t generate any special content, then it doesn’t matter for SEO purposes if it’s loaded later or earlier. However, if that JavaScript generates the primary content of your pages and you’re deferring it from being loaded until after Google processes your pages, then we wouldn’t have that content for indexing. That’s something where you kind of have to make a judgment call on your side and say, this makes sense, or this doesn’t make sense, or this improves speed but causes technical SEO issues or maybe things improve speed and they work well for SEO as well. All of these are things that you can test out.
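In the simplest case, the deferred loading John describes is the script element’s defer attribute (filenames are placeholders):

```html
<!-- Render-blocking by default: HTML parsing pauses while this loads and runs -->
<script src="/js/app.js"></script>

<!-- Downloaded in parallel, executed only after parsing finishes – fine for
     non-critical scripts, risky if the script renders the primary content -->
<script src="/js/analytics.js" defer></script>
```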
Reply 57:26 – Yeah, because we find that 50% of this JavaScript is unused.
John 57:31 – Yeah. I mean, a lot of sites have tons of JavaScript on them that is never used. That’s something where I suspect by focusing on the JavaScript that you have on your site, cleaning all of that up, you can save quite a bit of time or rendering time as well for users. I think there are multiple approaches that you could take there. You could, for example, go into the JavaScript files and tweak things line by line. Sometimes, you can just take a tool and run it over your JavaScript files, and it does everything for you. Sometimes, it’s a matter of switching to a more lightweight library. Rather than including everything, just include the parts that you actually need. I don’t think there’s one answer that would fix it for all websites.
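As one example of the “more lightweight” option John mentions, many libraries let you import just the function you need instead of the whole bundle (lodash-es via a CDN is used here purely as an illustration):

```html
<script type="module">
  // Whole-library import: ships everything, most of it unused
  // import _ from "https://cdn.jsdelivr.net/npm/lodash-es/lodash.js";

  // Per-function import: ships only what the page actually uses
  import debounce from "https://cdn.jsdelivr.net/npm/lodash-es/debounce.js";

  window.addEventListener("resize", debounce(() => {
    console.log("window resized");
  }, 250));
</script>
```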