How long does it take for customized Google Site Search to show results from pdf files?
-
The site in question is http://www.ejmh.eu
I am pretty unsatisfied with the results I am getting from the Site Search provided by Google.
We have over 160 pdf files in this subfolder: http://www.ejmh.eu/mellekletek
The files are the digital versions of articles. When I search for content in those PDF files, Google does not show results. It does show results from older pages, dating back 1-2 years, but it is not showing anything from the PDF files that I put up 3 weeks ago.
My questions:
If I place a Google Search on a site, does it not automatically display results from ALL the content in the root domain?
Is there any correlation between how the Site Search is indexing the files and how Google is indexing the urls in general?
Should I just wait and see whether site search performance improves or should I switch to another Search software like Zoom Search?
It is vital to have a proper, high-quality search functioning on that site in the very near future.
What are your experiences? Any tips are greatly appreciated.
-
Hi, everyone: problem solved.
Here is what I did: I created a separate XML sitemap and linked to all the new PDFs from it.
I updated the general sitemap.xml and linked to the new sitemap as well.
I (re)submitted both sitemaps via Webmaster Tools.
Within a few hours, most of the PDFs got indexed and the overall quality of the search improved dramatically. Thanks for all your help.
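For anyone repeating these steps, here is a minimal sketch of how such sitemap files could be generated. It is not the script used on the site; the PDF file names and the `sitemap-pdfs.xml` name are hypothetical, and it assumes the "link to the new sitemap" step is done with a sitemap index file as described by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>\n'


def build_sitemap(urls):
    """Return a <urlset> sitemap (as a string) with one <url> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return XML_HEADER + ET.tostring(urlset, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document pointing at several sitemap files."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return XML_HEADER + ET.tostring(index, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    pdf_urls = [
        "http://www.ejmh.eu/mellekletek/example-article-1.pdf",
        "http://www.ejmh.eu/mellekletek/example-article-2.pdf",
    ]
    print(build_sitemap(pdf_urls))
    print(build_sitemap_index([
        "http://www.ejmh.eu/sitemap.xml",
        "http://www.ejmh.eu/sitemap-pdfs.xml",  # hypothetical name
    ]))
```

The generated files still have to be uploaded and submitted in Webmaster Tools by hand, as described above.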
-
It may be a good idea to include all the PDF files in the sitemap, even if it is a tedious process.
Otherwise it just takes too long for Google to index them.
What still surprises me is that even for a site search, you need to win the 'indexing battle'. I thought that Google indexed everything in the sitemap for the sake of the site search and displayed the results when a visitor searched within the site. Less fancy software actually does the job. I thought Google Site Search would provide something even better.
-
Last crawl - thanks, great info.
Yes, all new PDFs are linked from the HTML files.
This is the summary page of one article: http://www.ejmh.eu/5archives_ppr_jaggle_061.html
In the middle of the page, you see 'download full text' - this is where the individual papers (PDFs) are linked from.
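To double-check that a summary page really exposes its PDF as a plain, crawlable link, something like the following sketch could be used. The HTML snippet and the `sample.pdf` name are made up for illustration; the real page markup may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkCollector(HTMLParser):
    """Collects absolute URLs of <a href="..."> links that point at .pdf files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the file extension.
        if href and href.split("?")[0].lower().endswith(".pdf"):
            self.pdf_urls.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    """Return the PDF links found in an HTML string, resolved against base_url."""
    collector = PdfLinkCollector(base_url)
    collector.feed(html)
    return collector.pdf_urls


if __name__ == "__main__":
    # Hypothetical markup standing in for the 'download full text' link.
    page = '<p><a href="mellekletek/sample.pdf">download full text</a></p>'
    base = "http://www.ejmh.eu/5archives_ppr_jaggle_061.html"
    print(find_pdf_links(page, base))
```

Links that only appear through JavaScript would not be found this way, which is roughly the situation in which a crawler may also miss them.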
-
Do you have the new PDFs linked from pages, like the old ones?
Try creating a page that lists all the new PDFs. Google may simply need time to recrawl your site and add them (by the way, the last copy saved in Google's cache is from Feb 11).
-
You are great, thanks for your time. Yeah, I did check things out with this Google command: there are PDFs listed, but these are all old PDFs that I put up a long time ago. None of the PDFs I have put up recently are among those indexed.
Do you think that a customized site search only returns URLs that Google has already indexed? Does Google not crawl the site and build a list of URLs purely for the sake of the site search? (Zoom Search does this, for example.) In theory, there could be two different types of crawl: one for the site search and one for the wider web search.
As for the settings... can you please help me further: what exactly would you change?
-
If you check here, all the PDFs are indexed in Google,
so I will check the settings on the CSE.
Reference: http://www.google.com/cse/docs/resultsxml.html#wsQueryTerms
-
Thanks for the tip, it's a good one. But they are all 100% texts.
-
If a search engine cannot read the text because it is a graphic rather than text, it won't be able to fully index the words in the document.
So make sure all your PDFs are 100% text that was converted to a PDF, and not a "scan" (image) of the original document that was saved as a PDF.
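A rough way to screen many files for this is sketched below. It is only a heuristic, based on the assumption that a PDF containing real text references at least one font, while an image-only scan usually does not; it is not a full PDF parser and can be fooled (for example by encrypted files or unusual stream filters).

```python
import re
import zlib

# Matches the raw bytes between "stream" and "endstream" in a PDF file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def pdf_probably_has_text(data: bytes) -> bool:
    """Heuristic: return True if the PDF bytes reference a font anywhere."""
    if b"/Font" in data:
        return True
    # Newer PDFs may keep font dictionaries inside Flate-compressed streams.
    for match in STREAM_RE.finditer(data):
        try:
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data, e.g. a JPEG image stream
        if b"/Font" in inflated:
            return True
    return False


if __name__ == "__main__":
    import sys

    # Usage: python check_pdfs.py file1.pdf file2.pdf ...
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            verdict = pdf_probably_has_text(handle.read())
        print(path, "text found" if verdict else "possibly image-only")
```

A file flagged as "possibly image-only" is worth opening by hand and trying to select text in it before drawing conclusions.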