Remove a pdf From Google

Google doesn’t just show links to standard html web pages; it also crawls and indexes non-html content, including pdf files.

There are three options for those wishing to remove pdf links from Google:

  1. Get the pdf taken down from the original website so you see a true “404 page not found error” (preferred)
  2. Ask the site owner to block Google by adding a “robots.txt” directive (acceptable)
  3. Keep the pdf in place but edit its contents so the privacy issue is resolved at the source (least preferred)

Let’s go through these options one-by-one:

1) Get the pdf removed from the site so you see a true 404 error

This is the best option and there is no need to wait for Google to revisit the pdf. You can expedite the removal process with the Google removal tool.

Important Note: The pdf must return a true page not found response (404). Redirects (302) or other responses (200) may result in a denial, and the pdf may linger in Google for some time.

Expected result: Google’s automated tool will check the url and, if it returns a 404, remove the pdf. This may take slightly longer than for normal web pages; expect up to one week.
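
Before submitting a removal request, it can help to confirm what the server actually returns for the pdf. The following is a minimal sketch, not an official Google check: the function names `fetch_status` and `removal_outlook` are made up for illustration, and it assumes the server answers a HEAD request the same way it answers a normal page view, which is not always the case. Redirects are deliberately not followed, so a 302 shows up as a 302.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the raw HTTP status code for url without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # HEAD asks for the headers only, so the pdf itself is not downloaded.
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def removal_outlook(status: int) -> str:
    """Map a status code onto the rule described above (hypothetical helper)."""
    if status == 404:
        return "ready: true 404, the removal request should go through"
    if 300 <= status < 400:
        return "not ready: redirect, the request may be denied"
    if status == 200:
        return "not ready: the pdf is still live, the request may be denied"
    return "not ready: unexpected response, the request may be denied"


# Example usage (needs network access; the URL is a placeholder):
# print(removal_outlook(fetch_status("http://www.example.com/files/report.pdf")))
```

If this reports anything other than a 404, the file or a redirect is still being served, and the site owner needs to fix that first.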

2) Ask the site owner to block Google

Any site owner can add a piece of code to the site instructing Google not to visit the pdf file. Once that code is in place, you or anyone else can remove the pdf by going to the Google removal tool.

The code is called a “robots.txt” directive: a Disallow rule placed in the robots.txt file at the root of the site.

Expected result: Google’s automated tool will check the url and, if it’s blocked by robots.txt, remove the pdf. This may take slightly longer than for normal web pages; expect up to one week.
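
As a rough illustration of what such a directive looks like and how to check it, the sketch below parses a sample robots.txt with Python’s standard `urllib.robotparser` module. The pdf path, the example.com URLs, and the helper name `is_blocked_for_googlebot` are assumptions made for the example, and Python’s parser is only an approximation of how Google itself reads the file.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt content the site owner might add (hypothetical path).
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /files/report.pdf
"""


def is_blocked_for_googlebot(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text disallows url for Googlebot."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("Googlebot", url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/files/report.pdf",
        "http://www.example.com/index.html",
    ):
        state = "blocked" if is_blocked_for_googlebot(ROBOTS_TXT, url) else "allowed"
        print(url, "->", state)
```

Only the pdf named in the Disallow line is blocked; the rest of the site stays open to Google.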

3) Edit the contents of the pdf

In theory, if you edit the contents of a pdf file so the privacy issue is resolved at the source, it is then just a case of waiting for Google to update things at their end.

And here lies the problem: Google doesn’t crawl (visit) pdf files as frequently as normal html web pages. Choosing this option may result in long delays before the search results pages are updated.

Expected result: Google will update the search results only after they see the updated pdf. This could take one day or it could take weeks.


One Response to Remove a pdf From Google

  1. M. Vimal kumar varma January 19, 2012 at 4:03 am #

    Sir,
    I have a problem removing a pdf from Google search.
    The article was removed from the website, but it still appears in Google search.
    With very limited knowledge about software, I am unable to remove the pdf.
    The page appears to be live and the status shown is 200.
    How come the page is live even after it got removed from the website?
    Kindly guide me in this matter for the successful removal of the same.
    This is the url which I want to remove and the journal website from which the article was removed.
    Kindly help me in this issue.
    regards,
    Vimal
    url: http://www.technicaljournalsonline.com/ijpsr/current%20issue/IJPSR%20VOL%20II%20ISSUE%20I%20Article%2015.pdf
    website: http://www.technicaljournalsonline.com