Google just announced that it is scanning millions of pages of old newspapers and will make that content available through Google News Archives. With this, the company also gets one step closer to its mission of "organizing the world’s information."

Google's scanners will capture everything that appeared on the newspaper page, including vintage advertisements, crosswords, cartoon strips, illustrations and even those rare photographs from your great-grandpa’s era. See example.

Also see: Search Old Newspapers with Google News

Spam Sneaks into Google News Archives

I was playing around with Google News Archives and was surprised to see that a lot of spam content has managed to sneak into Google News. For instance, a search for "mughal rule" returned pages that were either commercial websites or blog posts (see this screenshot).

The same happened with another search query, "rani laxmi bai": none of the top results can be classified as "news sources"; they are purely commercial websites.

This is surprising because, according to Google's terms, only newspaper publishers and aggregators with historical content can submit their content, yet it looks like spammers have managed to trick the process.
