A Strange Sitemap Parsing Error and the Simple Solution

Notice: My blog is in Turkish, but I wanted to share the solution to the sitemap parsing error, so this will be a short post.

The Problem:

Google Search Console shows a “Parsing Error” in the Sitemaps section.
Already tried: searched Google, read Common XML Sitemap Errors, validated your XML (it is OK), re-created your sitemap, then blamed Yoast or your security and caching plugins for blocking Googlebot from reaching your sitemap, but nothing helped. Bing and Yandex are fine and show no errors for your sitemap.

The cause of the problem: a hacked WordPress website. An unwanted URL injection affects nearly all pages of your website, so your “sitemap.xml” or “sitemap_index.xml”, whatever its name is, has the URL injection too.

The weird thing is that if you go to yoursitename/sitemap.xml and view the source, you cannot see any malicious URLs or hidden stuff. But when I tried “Fetch as Google” and “Fetch and Render”, I saw weird Asian dating URLs at the end of the sitemap! Bang! There is also an “X-Robots-Tag: noindex, follow” HTTP header in the response (a header, not a meta tag). Maybe it was inserted by Cloudflare, maybe not.
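If you do not have Search Console at hand, a rough way to look for this kind of cloaking is to request the sitemap twice, once with a browser-like User-Agent and once with Googlebot's, and compare what comes after the closing root tag. The following is only a minimal Python sketch. It assumes the injection keys on the User-Agent header; some malware checks Google's IP ranges instead, and then this test shows nothing. The function names and the browser User-Agent string are my own choices for illustration.

```python
import re
import urllib.request

# Googlebot's published desktop User-Agent and an arbitrary browser one.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0"

# Closing tag of a sitemap (<urlset>) or sitemap index (<sitemapindex>).
ROOT_CLOSE = re.compile(r"</(?:urlset|sitemapindex)\s*>", re.IGNORECASE)
XML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def trailing_markup(body):
    """Return whatever follows the last closing root tag, minus XML comments."""
    matches = list(ROOT_CLOSE.finditer(body))
    if not matches:
        return ""  # no sitemap root tag found, nothing to check
    tail = body[matches[-1].end():]
    return XML_COMMENT.sub("", tail).strip()


def fetch(url, user_agent):
    """Download url with the given User-Agent; return (X-Robots-Tag, body)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        body = response.read().decode("utf-8", errors="replace")
        return response.headers.get("X-Robots-Tag", ""), body


def compare(url):
    """Print what a browser and a (pretend) Googlebot receive for url."""
    for label, agent in (("browser", BROWSER_UA), ("googlebot", GOOGLEBOT_UA)):
        robots, body = fetch(url, agent)
        print(label, "| X-Robots-Tag:", robots or "(none)")
        print(label, "| after root tag:", trailing_markup(body) or "(nothing)")
```

Calling `compare("http://www.website.com/sitemap_index.xml")` with your own sitemap URL prints both views. If only the Googlebot view has links after the root tag, the site is serving different content to Google.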

Here is the Fetch as Google output:
http://www.website.com/sitemap_index.xml
Indexing requested for URL and linked pages
Googlebot type: Desktop (render requested)
Complete on Wednesday, October 17, 2018 at 3:08:01 PM PDT
Downloaded HTTP response:

HTTP/1.1 200 OK
Date: Wed, 17 Oct 2018 22:08:02 GMT
Content-Type: text/xml; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=….1539814082; expires=Thu, 17-Oct-19 22:08:02 GMT; path=/; domain=.website.com; HttpOnly
Cache-Control: no-store, no-cache, must-revalidate
Cf-Railgun: 0e497a84f9 0.25 0.498083 0030 cc99
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Pragma: no-cache
Set-Cookie: PHPSESSID=d6XDG0kItbR01S%2Ca4uw-x2; path=/
Vary: Accept-Encoding
X-Robots-Tag: noindex, follow
Server: cloudflare
CF-RAY: 46b614dd70948339-ATL

<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="//www.website……xsl"?>
...
<lastmod>2018-01-22T22:32:21+03:00</lastmod>
</sitemap>
</sitemapindex>
<!-- XML Sitemap generated by Yoast SEO -->
<center><a href='http://www.website/kristen-dating-indonesia/'>kristen dating indonesia</a>, <a href='http://www.website.com/dating-for-drug-addicts/'>dating for drug addicts</a>, <a href='http://www.website.com/online-dating-advice-chat-room/'>http://www.website.com/online-dating-advice-chat-room/</a></center>

Download Time: 0.535 seconds
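
The injected <center> block sits after the closing </sitemapindex> tag. An XML document may only have comments, processing instructions and whitespace after its root element, so a strict parser rejects the whole file, which would explain the “Parsing Error”. A small Python sketch with the standard library shows the effect. The two sample strings are shortened stand-ins for the response above, and the post-sitemap.xml URL is made up for the example.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for a clean sitemap index (the <loc> URL is hypothetical).
CLEAN = (
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<sitemap><loc>http://www.website.com/post-sitemap.xml</loc></sitemap>"
    "</sitemapindex>"
    "<!-- XML Sitemap generated by Yoast SEO -->"
)

# The same document with injected links appended after the root element.
HACKED = CLEAN + (
    "<center><a href='http://www.website.com/dating-for-drug-addicts/'>"
    "dating for drug addicts</a></center>"
)


def parses_as_xml(text):
    """Return True if text is well-formed XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


print(parses_as_xml(CLEAN))   # True: a trailing comment is allowed
print(parses_as_xml(HACKED))  # False: markup after the root element
```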

Solution:

Clean up your hacked WordPress site. My favorite plugins are listed under “Conclusion” below.

Plugins that failed for me:

  • Wordfence: Unable to finish a scan. I am using a dedicated server, and even the minimum settings didn’t help: too much server load.
  • Sucuri: The free version has nothing. There is no malware scan, and the firewall, the famous WAF, is not available in the free version. I am still surprised that people recommend Sucuri so much.
  • All in One Security for WordPress: Not a bad plugin, but the malware scan is an external service that is not included in the free version. They explain that detecting and cleaning malware is a complex and dynamic issue.
  • Others: Lots of time spent, but no luck.

Conclusion:

  • Anti-Malware Security and Brute-Force Firewall https://wordpress.org/plugins/gotmls/
  • Security & Malware scan by CleanTalk
    https://wordpress.org/plugins/security-malware-firewall/  (Be quick to clean up your site before the trial period ends)

Both are good plugins. Anti-Malware Security and Brute-Force Firewall deserves a donation (sorry, PayPal is not available in Turkey). CleanTalk is worth its price; I liked the “Send for Analyze” option.

With the help of this combination I was able to scan for malware and injections and do a flawless cleanup. Then I resubmitted the sitemap. When I saw that there were no errors, I was very happy, and also angry about the hours of useless research.

I wrote this blog post because I see many posts about the sitemap parsing error in the Google Product Forums and the WordPress forums. As you may have noticed, every WordPress website is under constant attack, and the number of hacked WordPress websites is growing very fast.

Take your security measures and try not to get hacked.