Once again the SEO world is all ablaze because of a Twitter conversation with a Google employee. Gary Illyes from Google made a comment yesterday implying that it is not a good idea to remove thin content when trying to recover from Panda. Here is the tweet that started off the discussion:
@jenstar We don't recommend removing content in general for Panda, rather add more highQ stuff @shendison
— Gary Illyes (@methode) October 7, 2015
While it can be difficult to carry on a detailed discussion like this via Twitter, I do think that there are things to be learned when a Google employee comments on something like this. But, I don't think that Gary was saying that a Panda hit site should never remove thin content. Rather, I think that what he was saying was that we should improve upon thin content wherever possible, improve upon our overall website quality wherever possible, and then, where it's not possible, remove that content. I have indeed seen sites that have made Panda recoveries by removing thin content.
Here is an example of a site that hosted a large forum. The forum had all of its user profiles indexed as well as thousands of pages of relatively useless forum posts with which no one would ever engage. We removed the forum completely and with the next Panda update the site saw a nice increase.
The site saw a doubling of traffic and revenue in December of 2013 as compared to 2012.
Here is a large site that was affected by the Panda update in May of 2014. The site owner reviewed his content, and deemed 20% of it to be potentially thin. He removed this content from Google's index and saw a nice recovery with the next Panda update:
What does Google say about Panda and thin content?
Google employee John Mueller has said several times in Google hangouts that removing thin content is a good idea. Here are some quotes from him:
This question was about a Panda hit site that had lost 30% of their traffic.
Answer: There are quality issues there that you can address. I'd see what you could do to significantly improve the overall quality of your pages...Usually what I recommend in situations like this is to look at the blog post from two years ago from Amit Singhal about 23 Questions you can ask to assess the quality of your site...Sometimes a site like that might have a lot of thin content...On the one hand if I want to keep these articles, then maybe prevent these from appearing in search. Maybe use a noindex tag for these things. Maybe you could remove them completely.
Question: We're a classified website with several thousand listings. We can't control the content so some things might be thin. Is it a good idea to noindex the thin page?
Answer: That sounds like a really good idea. If you can recognize that some of these pages are thin. Putting a noindex on them is a great thing to do.
But here's why I agree with Gary Illyes
I do think that in many cases removing thin content is vital when it comes to Panda recovery. But, I think that sometimes we put too much emphasis on removing thin content. I believe that Panda is getting better and better at determining whether a site is a high quality site or not. If you have a mediocre site that contains thousands of pages of thin content, removing that thin content is not going to make you a great site. But, I do think that some great sites can be perceived by Panda as lower quality if they also contain sections that are thin. Gary said that they see many sites that wrongly remove good content in an effort to recover from Panda:
@Marie_Haynes twitter is not the right medium for this discussion. we see way too many people cut the good. Careful what you trim #defcon1 — Gary Illyes (@methode) October 8, 2015
I think that an example of this would be a site that perhaps removes years of blog posts that once got traffic but no longer do. Another example would be a site that removes all posts that are under a certain word count. Not all short content is thin content. The problem is that it's often unclear what content is thin and what is helpful.
Removing thin content is often not enough
I also think that Gary was trying to prevent people from thinking that all they need to do to improve the quality of their site is to remove thin content. There are some cases, such as the examples that I gave above where removing thin content alone seems to do the trick. But for many sites that are hit by Panda, there are a boatload of quality issues that also need to be addressed.
I personally believe that Panda is evolving to go far beyond the technical aspects of a site. While it is always good to fix page speed issues, optimize your crawl budget and have a good URL structure, in some cases that may not be enough. I have seen a number of e-Commerce sites that have been hit by Panda. Some of these sites are decent. But, when you look at their competitors it is clear that a customer would prefer to buy from the competitors. For example, the competitors' sites might have incredible reviews, videos, buying guides and a Q&A section. I believe that Panda is recognizing that sites like that are of much higher quality than others. If your site is not significantly better than your competitors and users consistently prefer them over you, then Panda may be an issue. For cases like this, removing thin content from the site is not likely to make a big difference.
Another example that I think fits with Gary's comments would be a site that has 100 different short articles all talking about one particular subject. While you could remove those articles, a much better solution would be to consolidate them into one massive and awesome article on the subject. This is a case where beefing up thin content would be much better than removing it.
Gary gave us a nice little tidbit
At the end of our Twitter discussion, Gary said this:
@pauledmondson use search analytics: look for pages that don't satisfy users' information need for the queries they rank for @Marie_Haynes
— Gary Illyes (@methode) October 8, 2015
Google's new Search Analytics report, which you can find in Google Search Console (formerly known as Webmaster Tools), can be quite helpful when it comes to finding content that could be improved upon. For example, suppose you see that a lot of "how to install" searches are leading to a particular product page, but that page doesn't contain install information. Those users are going to return to the search results and end up on someone else's site. If that happens enough times, then you could be perceived as a lower quality site. By beefing up your product page and adding thorough install instructions, you will retain readers who engage with your content.
Mining Search Analytics data for ways that we can improve our content to make it as useful as possible is a great idea.
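Here is a rough sketch of how that kind of mining could be done once the data is out of Search Console. This is only an illustration under my own assumptions: the row layout (page, query, clicks, impressions), the `find_content_gaps` function, the tiny stop word list and the impressions threshold are all hypothetical, and the sample data is made up. It simply flags queries that send impressions to a page whose text is missing one of the meaningful words in the query.

```python
from collections import defaultdict

# Hypothetical stop words to ignore when comparing a query to page text.
STOP_WORDS = {"how", "to", "a", "an", "the", "for", "of", "in"}


def find_content_gaps(rows, page_text, min_impressions=100):
    """Map each page to the queries it ranks for but does not seem to answer.

    rows: iterable of (page, query, clicks, impressions) tuples. This layout
          is an assumption, not the actual Search Console export format.
    page_text: dict of page -> visible text of that page.
    """
    gaps = defaultdict(list)
    for page, query, clicks, impressions in rows:
        if impressions < min_impressions:
            continue  # too little data to act on
        text = page_text.get(page, "").lower()
        # Simple substring check: a query word counts as covered if it
        # appears anywhere in the page text.
        missing = [
            word
            for word in query.lower().split()
            if word not in STOP_WORDS and word not in text
        ]
        if missing:
            gaps[page].append((query, missing))
    return dict(gaps)


# Made-up sample data for illustration only.
rows = [
    ("/widget", "widget price", 40, 500),
    ("/widget", "how to install widget", 5, 900),
    ("/gadget", "gadget review", 2, 30),
]
page_text = {
    "/widget": "Widget price: $20 with free shipping.",
    "/gadget": "Gadget specs.",
}

gaps = find_content_gaps(rows, page_text)
print(gaps)
```

In this made-up example the widget page is flagged for "how to install widget" because the word "install" never appears on the page. A word match like this is a crude stand-in for what Gary described, so treat the output as a list of pages to review by hand rather than a verdict on quality.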
What do you think?
I think that this whole conversation can be summed up like this:
If you have thin content that would be useful to users if it were improved, then do all that you can to make that content awesome. But, if you have thin content that can't be improved upon, remove it from Google's index.
What do you think? Have you had a site recover from Panda as the result of removing thin content?
Hello, I worked on a project that was a real case like this. Thin content was all over the website, so we experimented with different approaches to improve the site's rankings across 4,000 terms. On about 100 pages we wrote new high-quality content without removing the thin content. On 50 pages we removed the thin or old content entirely and added new content. On another 100 pages we rewrote the old content and added 100-200 words to it. After the recent updates in October and the fluctuations a few days ago on 11/19, we saw a big gain on the pages where we added new content without removing the old thin content. We also saw some gain on the rewritten pages. Unfortunately, we didn’t have success on the pages where we removed the old content and replaced it with original content. That’s one of my cases right now.