How long does it take for customized Google Site Search to show results from pdf files?

Lauroca

The site in question is http://www.ejmh.eu

I am pretty unsatisfied with the results I am getting from the Site Search provided by Google.

We have over 160 pdf files in this subfolder: http://www.ejmh.eu/mellekletek

The files are the digital versions of articles. When I search for content in those pdf files, Google does not show results. It does show results from older pages, dating back 1-2 years, but it is not showing anything from the pdf files that I put up 3 weeks ago.

My questions:

If I place a Google Search on a site, does it not automatically display results from ALL the content in the root domain?

Is there any correlation between how the Site Search is indexing the files and how Google is indexing the urls in general?

Should I just wait and see whether site search performance improves, or should I switch to other search software such as Zoom Search?

It is vital to have a proper, high-quality search functioning on that site in the very near future.

What are your experiences? Any tips are greatly appreciated.

Lauroca

Hi, everyone: problem solved.

Here is what I did: I created a separate XML sitemap that links to all the new pdfs.

I updated the general sitemap.xml and linked to the new sitemap as well.

I (re)submitted both sitemaps via the Webmaster Tools.

Within a few hours, most of the pdfs got indexed and the overall quality of search has improved dramatically. Thanks for all your help.
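For anyone repeating these steps, here is a minimal Python sketch of how the two files could be generated. It is not the script used on this site. The file names (`example-1.pdf`, `sitemap-pdfs.xml`, `sitemap-pages.xml`) are hypothetical, and it assumes the "general" sitemap acts as a sitemap index, which is how the sitemaps.org protocol lets one sitemap file point at others.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(urls):
    """Return a <urlset> sitemap (as a string) with one <url><loc> per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> (as a string) that points at other sitemap files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(index, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical pdf names; the folder is the one mentioned in the question.
    pdf_urls = [
        "http://www.ejmh.eu/mellekletek/example-1.pdf",
        "http://www.ejmh.eu/mellekletek/example-2.pdf",
    ]
    print(build_urlset(pdf_urls))

    # Hypothetical sitemap file names, listed in one index file.
    print(build_sitemap_index([
        "http://www.ejmh.eu/sitemap-pages.xml",
        "http://www.ejmh.eu/sitemap-pdfs.xml",
    ]))
```

The output has no XML declaration, so you would add `<?xml version="1.0" encoding="UTF-8"?>` at the top when saving each file, then upload both and submit them in Webmaster Tools as described above.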

Lauroca

It may be a good idea to include all the pdf files on the sitemap, even if it is a troublesome process.

Otherwise it just takes too long for Google to index them.

What still surprises me is that even for a site search, you need to win the 'indexing battle'. I thought that Google indexes everything within the sitemap for the 'sake of the site search' and displays the results when a visitor is searching within the site. Less fancy software actually does the job. I thought a Google Site Search would provide something even better.

Lauroca

Last crawl - thanks, great info.

Yes, all new pdfs are linked from the html files.

This is the summary page of one article: http://www.ejmh.eu/5archives_ppr_jaggle_061.html

In the middle of the page, you see 'download full text' - this is from where the individual papers (pdf) are linked.

wissamdandan

Do you have the new PDFs linked from pages like the old ones?

Try creating a page that lists all the new PDFs. Google may simply need time to recrawl your site and add these new PDFs (by the way, the last copy saved in Google Cache is from Feb 11).

Lauroca

You are great, thanks for your time. Yeah, I did check things out with this Google command: there are pdfs listed, but these are all old pdfs I put up a long time ago. None of the pdfs I have put up recently are among those indexed.

Do you think that only those urls that are indexed by Google come up through a customized site search? Does Google not crawl the site and make a list of urls purely for the sake of the search? (Zoom Search does it, for example.) In theory, there could be two different types of 'crawls': one for the site search and one for the larger world, searching in the browser.

As for the settings... can you please help me further: what exactly would you change?

wissamdandan

http://www.google.com/search?rlz=1C1GGGE_enUS359US359&sourceid=chrome&ie=UTF-8&q=site:www.ejmh.eu+pdf

If you check here, all the PDFs are indexed in Google.

So I would check the settings on the CSE.

reference here http://www.google.com/cse/docs/resultsxml.html#wsQueryTerms
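As a rough illustration of the check above, the sketch below builds a `site:` query URL and optionally narrows it with Google's `filetype:` operator so that only PDFs are listed. The helper name `site_query_url` is made up for this example, and it only constructs the URL to open in a browser; it does not fetch or parse results.

```python
from urllib.parse import urlencode


def site_query_url(domain, filetype=None, terms=""):
    """Build a Google search URL restricted to one domain.

    filetype and terms are optional extras appended to the query.
    """
    parts = [f"site:{domain}"]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if terms:
        parts.append(terms)
    return "https://www.google.com/search?" + urlencode({"q": " ".join(parts)})


if __name__ == "__main__":
    # Lists the PDFs Google has indexed for the site in question.
    print(site_query_url("www.ejmh.eu", filetype="pdf"))
```

Comparing that list against the files in the folder shows which PDFs have not been indexed yet.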

Lauroca

Thanks for the tip, it's a good one. But they are all 100% texts.

wissamdandan

If a search engine cannot read the text because it is a graphic rather than text, it won't be able to fully index the words in the document.

So make sure all your PDFs are 100% text that was converted to PDF, and not a "scan" (image) of the original document that was saved as a PDF.
