Information & Instructions for Web Site Creators

If you plan to use a Stanford Google Custom Search box on your Stanford web site, please subscribe to search-partners@lists.stanford.edu for notifications of service changes, updates, etc.

How to...

Put a search box on your site

You can add a search box to your web site to help visitors find content on your site. You can restrict the search to a specified directory, or search the entire Stanford web site.

The visitor enters a search term and clicks the Search button; the browser then leaves your site and displays the search results on a formatted page.

Insert one of the following HTML snippets into your web page where you want the search box to appear:

Stanford Web Search Box
<form action="http://www.stanford.edu/search" id="cse-search-box">
  <div>
    <input type="hidden" name="cx" value="003265255082301896483:sq5n7qoyfh8" />
    <input type="hidden" name="cof" value="FORID:9" />
    <input type="hidden" name="ie" value="UTF-8" />
    <input type="text" name="q" size="31" />
    <input type="submit" name="sa" value="Search" />
  </div>
</form>

Site Specific Search Box (include or exclude a web directory)
<form action="http://www.stanford.edu/search" id="cse-search-box">
  <div>
    <input type="hidden" name="cx" value="003265255082301896483:sq5n7qoyfh8" />
    <input type="hidden" name="cof" value="FORID:9" />
    <input type="hidden" name="ie" value="UTF-8" />
    <input type="text" name="q" size="31" />
    <input type="hidden" name="as_dt" value="i" />
    <input type="hidden" name="as_sitesearch" value="<yoururl>" />
    <input type="submit" name="sa" value="Search" />
  </div>
</form>
Customize the following required parameters:
- name="q" size="31"
  Sets the width (in number of characters) of the search box. You can change the size to suit your site's layout.
Customize the following optional parameters:

If you want to restrict your search feature to one specific directory (and its subdirectories), include the following two parameters (as_dt and as_sitesearch). Restriction to multiple site URLs is not supported.

If you want the search feature on your site to search the entire Stanford collection, remove these two parameters from your HTML.
- name="as_dt" value="i"
  This setting determines whether your search should include or exclude the directory specified in "as_sitesearch". Values can be:
  - "i" (include only results in the web directory specified by as_sitesearch)
  - "e" (exclude all results in the web directory specified by as_sitesearch)
- name="as_sitesearch" value="<yoururl>"
  Pages in the specified directory will be included in or excluded from your search (according to the value of "as_dt").
  e.g.: name="as_sitesearch" value="web.stanford.edu/dept/classics"
  - You must specify the name of the host server followed by the path of the directory.
    e.g.:
    - web.stanford.edu/dept/classics, not www.stanford.edu/dept/classics
    - *.stanford.edu/dept/anthropology (for sites hosted on AFS that have been previously indexed at www.stanford.edu)
  - If the web directory path ends with a slash ("/"), only files directly within that directory are searched; files in its sub-directories are not.
    e.g.:
    - web.stanford.edu/dept/classics to include sub-directories
    - web.stanford.edu/dept/classics/ to exclude sub-directories
  - as_sitesearch allows you to specify one directory (and all its sub-directories) as the domain to be searched; you cannot specify multiple disparate directories using this option.
  - If you want the search feature on your site to search the entire Stanford web site, delete this parameter.
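
As an illustration, the site-specific form might be filled in as follows. This is a sketch, not a definitive configuration: the as_sitesearch value is the classics directory used in the examples in this section and stands in for your own host and path.

```html
<!-- Sketch: search only web.stanford.edu/dept/classics and its sub-directories.
     The as_sitesearch value is an example; substitute your own host and path. -->
<form action="http://www.stanford.edu/search" id="cse-search-box">
  <div>
    <input type="hidden" name="cx" value="003265255082301896483:sq5n7qoyfh8" />
    <input type="hidden" name="cof" value="FORID:9" />
    <input type="hidden" name="ie" value="UTF-8" />
    <input type="text" name="q" size="31" />
    <!-- "i" includes only the directory below; use "e" to exclude it instead -->
    <input type="hidden" name="as_dt" value="i" />
    <!-- No trailing slash, so sub-directories are searched as well -->
    <input type="hidden" name="as_sitesearch" value="web.stanford.edu/dept/classics" />
    <input type="submit" name="sa" value="Search" />
  </div>
</form>
```

Changing as_dt to "e" would instead return results from the rest of the Stanford collection while leaving out that directory. Appending "/" to the path would limit the search to files directly in the directory.
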
If you need to search more than one directory or Stanford subdomain, we recommend that you create your own Google Custom Search Engine. This is a free service.

Get pages into the index

Google Custom Search uses the Google index. All you need to do to get your web pages into the Stanford/Google index is:

- put the pages up in a web space
- make sure your pages don't contain meta tags that prevent the robot from indexing them
- submit your pages for indexing by Google's crawler, and/or have them linked to by other pages in Google's index, such as the Stanford A-Z index

The Google crawler picks up changed, new, and removed pages automatically when it visits Stanford web sites. How often content is crawled depends on how important Google's algorithm judges it to be. For instance, pages that Google considers important and quickly changing are crawled frequently, while others are crawled less often (up to two weeks may pass before they are revisited).

If a page is not in the index, search for all pages that link to it. The syntax for this search is "link:yourdomain.com". For example, to see whether any pages link to your personal page at Stanford, you would enter "link:http://web.stanford.edu/~mypage" into the search box at http://www.stanford.edu. The results list all pages that link to your page.

If you would like your page(s) to be listed in the Stanford A-Z index, visit http://www.stanford.edu/atoz/ and click on the "suggestions" link.

Note that if no other pages in the Stanford search collection link to your web pages, the Google crawler won't pick them up.

Keep pages out of the index

If you don't want a page to be indexed, insert this <meta> tag within your page's <head> tag:

<meta name="robots" content="noindex, nofollow" />

This will prevent crawlers (robots) from indexing the page, and from following any links from the page. If the page has already been indexed, it will be removed from the index the next time Google crawls the page.
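
A minimal sketch of where such a tag sits in a page, assuming the standard robots <meta> tag with the noindex and nofollow values, which matches the behavior described above. The title and body text are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Example page (placeholder)</title>
    <!-- Assumed tag: asks crawlers not to index this page or follow its links -->
    <meta name="robots" content="noindex, nofollow" />
  </head>
  <body>
    <p>Page content that should stay out of the search index.</p>
  </body>
</html>
```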

You can prevent the pages in a directory from being indexed by restricting access to the directory with WebAuth.

Stanford's configuration of Google custom search

Search domains

Stanford's search collection includes all the web pages in these domains:

http://www.stanford.edu
http://web.stanford.edu
http://*.stanford.edu (including most virtual URLs such as medicine.stanford.edu)
http://www.stanfordalumni.org
http://www.stanfordmag.org
http://gostanford.com

...that are not specifically excluded by:

the search administrator
a noindex <meta> tag in the page's HTML
password (including webauth) protection
restricted-access files and/or directories

Web pages excluded by the search administrator

Web pages in the following directories (and their subdirectories) are excluded from the Stanford search collection:

- URLs being phased out of use
  e.g.: http://www-leland.stanford.edu
- webauth-protected (or otherwise restricted-access) pages and directories
- specific pages kept out of the index at the request of their owners

These pages have been excluded for a variety of system performance, copyright, license, and University policy reasons.

Additional directories or pages not listed here may have been excluded by the search administrator. If you think your page has been excluded and you want it included, submit a Help ticket.

Crawling schedule

Google crawls Stanford web sites at different paces depending on how its algorithm handles different factors like relevancy, quality, type, frequency of update, and what other pages link to the content. Please read the Web Search FAQ (linked from the right sidebar) for more information about getting your page or web site indexed.
