Integrating Yahoo! BOSS with Your Ruby on Rails Application

What is Yahoo! BOSS? The Yahoo! developer website describes it as:

"Yahoo! Search BOSS (Build your Own Search Service) is an initiative in Yahoo! Search to open up Yahoo!'s search infrastructure and enable third parties to build revolutionary search products leveraging their own data, content, technology, social graph, or other assets. This release includes Web, News, and Image Search as well as Spelling Suggestions."

Possible uses of Yahoo! BOSS include finding related posts for an article, suggesting tags or categories for an article or query, correcting misspelled words by offering suggestions, fetching the latest news on a topic dynamically, searching over Delicious tags, and running language-specific searches. Some of these have already been implemented successfully; one example is TechCrunch's search engine.

Here is a concise guide to using this service in your next Rails application.

Obtain API key

Get an application key from http://developer.yahoo.com/search/boss/.

Install BOSSMan Gem

BOSSMan is a gem, written by John Pignata, for interacting with the Yahoo! BOSS web service. Install it using the following commands:

Usage

Put this in your environment.rb
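A minimal sketch of that setup, assuming the gem is required as `bossman` and exposes a `BOSSMan.application_id=` setter as its README describes. The `YAHOO_BOSS_APP_ID` environment variable is a hypothetical name, used here only to keep the key out of source control.

```ruby
# config/environment.rb (excerpt)
# Assumption: the gem is required as 'bossman' and provides
# BOSSMan.application_id=; YAHOO_BOSS_APP_ID is a hypothetical variable name.
begin
  require 'rubygems'
  require 'bossman'
  BOSSMan.application_id = ENV['YAHOO_BOSS_APP_ID']
rescue LoadError
  # The gem is missing, so leave search unconfigured instead of failing at boot.
  warn 'bossman gem not found; Yahoo! BOSS searches are disabled'
end
```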

Web Search

Use the following code to run a web search:

In addition to the universal arguments listed at the end of this document, web search accepts the following arguments:
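A minimal sketch of a web search, assuming `BOSSMan::Search.web(query, options)` returns a response whose `results` respond to `title` and `url` (method names taken from the gem's README as best recalled, so treat them as assumptions). The helper name `web_search_links` and its third argument are hypothetical; the argument only lets the client be swapped for a stand-in.

```ruby
# Runs a web search and returns [title, url] pairs, one per result.
# Assumption: BOSSMan::Search.web(query, options) returns an object with
# #results, and each result responds to #title and #url.
# The default for `search` is only evaluated when no client is passed in.
def web_search_links(query, options = {}, search = BOSSMan::Search)
  response = search.web(query, options)
  response.results.map { |result| [result.title, result.url] }
end

# Usage (needs the gem and a configured application ID):
#   web_search_links("ruby on rails", :count => 5).each do |title, url|
#     puts "#{title} - #{url}"
#   end
```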

Parameter    Values/Description
:filter      Filters out adult or hate content. Syntax: :filter => "-hate, -porn"
:type        Specifies document formats (pdf, msoffice, etc.)
:view        Syntax: :view => "view1,view2", etc.
             :view => "keyterms" retrieves related words and phrases for each search result.
             :view => "searchmonkey_feed" retrieves structured data markup, if available, for the search result in DataRSS format.
             :view => "searchmonkey_rdf" retrieves structured data markup, if available, for the search result in RDF format.
             :view => "delicious_toptags" retrieves the top public Delicious tags for a document and the count associated with each tag.
             :view => "delicious_saves" retrieves the number of times a document was saved in Delicious.
             :view => "language" identifies the language of the document.
:abstract    :abstract => "long" retrieves an abstract of the web document of up to 300 characters, giving the requester more text to work with. The default is an abbreviated description.
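Because :view takes a comma-separated string, it can be convenient to assemble it from a list. The sketch below shows one way to do that; `web_search_options` is a hypothetical helper, not part of BOSSMan.

```ruby
# Builds an options hash for a web search.
# `views` is a list of view names; `extras` holds any other arguments
# (for example :abstract, :type, or :filter) and is copied, not modified.
def web_search_options(views = [], extras = {})
  options = extras.dup
  options[:view] = views.join(",") unless views.empty?
  options
end

options = web_search_options(%w[keyterms delicious_toptags],
                             :abstract => "long", :type => "pdf")
# options now holds :view => "keyterms,delicious_toptags",
# :abstract => "long" and :type => "pdf"
```

The resulting hash can be passed as the second argument of the search call.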

See the Yahoo! BOSS documentation for the list of response fields returned by web search.

Images Search

To search for images on the web, use the following code:

See the Yahoo! BOSS documentation for the additional arguments accepted by image search.

The same documentation lists the response fields returned by image search.
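A minimal sketch of an image search, assuming `BOSSMan::Search.images(query, options)` exists and that each result exposes the `thumbnail_url` and `url` fields of the BOSS image response. Both the method and the field names are assumptions; `image_thumbnails` is a hypothetical helper.

```ruby
# Returns a thumbnail/full-size URL pair for each image result.
# Assumption: BOSSMan::Search.images(query, options) returns an object with
# #results, and each result responds to #thumbnail_url and #url.
def image_thumbnails(query, options = {}, search = BOSSMan::Search)
  search.images(query, options).results.map do |image|
    { :thumbnail => image.thumbnail_url, :full => image.url }
  end
end

# Usage (needs the gem and a configured application ID):
#   image_thumbnails("brooklyn bridge", :count => 12)
```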

News Search

Yahoo! BOSS also provides news search capabilities.

See the Yahoo! BOSS documentation for the additional arguments accepted by news search.

The same documentation lists the response fields returned by news search.
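A news search can be sketched the same way, assuming `BOSSMan::Search.news(query, options)` exists and that each result exposes `title` and `source`. These names are assumptions based on the BOSS news response format; `news_headlines` is a hypothetical helper.

```ruby
# Returns one "Title (Source)" line per news result.
# Assumption: BOSSMan::Search.news(query, options) returns an object with
# #results, and each result responds to #title and #source.
def news_headlines(query, options = {}, search = BOSSMan::Search)
  search.news(query, options).results.map do |story|
    "#{story.title} (#{story.source})"
  end
end

# Usage (needs the gem and a configured application ID):
#   puts news_headlines("ruby on rails", :count => 5)
```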

Spelling Search

Correct misspelled words using the spelling suggestion feature:
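A minimal sketch, assuming `BOSSMan::Search.spelling(word)` returns an object with a `suggestion` reader, as the gem's README appears to describe. How the response looks when there is nothing to suggest is not covered here; `spelling_suggestion` is a hypothetical helper.

```ruby
# Returns the spelling suggestion for a word.
# Assumption: BOSSMan::Search.spelling(word) returns an object that
# responds to #suggestion.
def spelling_suggestion(word, search = BOSSMan::Search)
  search.spelling(word).suggestion
end

# Usage (needs the gem and a configured application ID):
#   spelling_suggestion("diabretes")
```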

Site Explorer

You can also search for pages on other sites that link to pages on your site.

To list all pages belonging to a domain in the Yahoo! index, use the se_pagedata method.
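Both Site Explorer lookups can be sketched as below. `se_pagedata` is the method named above; `se_inlink` as the inbound-link counterpart, the `(target, options)` signature, and the `url` field on each result are assumptions. The helper names are hypothetical.

```ruby
# Returns the URLs of pages that link to the given page.
# Assumption: BOSSMan::Search.se_inlink(url, options) exists and its
# results respond to #url.
def inbound_links(url, options = {}, search = BOSSMan::Search)
  search.se_inlink(url, options).results.map { |link| link.url }
end

# Returns the URLs of the indexed pages that belong to a domain.
# Assumption: se_pagedata takes (domain, options) and its results
# respond to #url.
def indexed_pages(domain, options = {}, search = BOSSMan::Search)
  search.se_pagedata(domain, options).results.map { |page| page.url }
end

# Usage (needs the gem and a configured application ID):
#   inbound_links("www.example.com/blog", :count => 20)
#   indexed_pages("example.com")
```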

Universal Arguments for Web, Images and News

The following arguments are the ones most commonly used with web, image, and news search. A more comprehensive list is available in the Yahoo! BOSS documentation.

Argument Options/Details
:start Ordinal position of first result. First position is 0. Default sets start to 0.
:count Total number of results to return. Maximum value is 50. Default sets count to 10.
:format The data format of the response. Value can be set to either "xml" or "json". Default sets format to "json".
:callback The name of the callback function to wrap the result. Parameter is valid only if format is set to "json". No default value exists.
:sites Restrict BOSS search results to a set of pre-defined sites. Multiple sites must be comma separated. Example: (:sites => "abc.com,cnn.com"). The Images service does not yet support multiple sites.
:view Retrieve additional search data provided by the respective BOSS service. Please see individual chapters to see what view options are available.
:style By default, web search result titles and abstracts contain bold HTML tags around the search terms. Use :style => "raw" to remove them.
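Paging through results comes down to choosing :start and :count. The sketch below turns a 1-based page number into those two arguments and caps :count at the documented maximum of 50; `page_options` and `BOSS_MAX_COUNT` are hypothetical names introduced for illustration.

```ruby
# Maximum :count accepted per request, per the table above.
BOSS_MAX_COUNT = 50

# Returns the :start and :count arguments for a 1-based page number.
def page_options(page, per_page = 10)
  raise ArgumentError, "page must be 1 or greater" if page < 1
  count = [per_page, BOSS_MAX_COUNT].min
  { :start => (page - 1) * count, :count => count }
end

page_options(3)      # => { :start => 20, :count => 10 }
page_options(2, 80)  # => { :start => 50, :count => 50 }
```

The returned hash can be merged with other arguments such as :sites or :style before being passed to a search call.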