Note: This is a draft version of the documentation for the Microsoft Office Research Service SDK and may contain inaccuracies. The complete, detailed version of the documentation for the Microsoft Office Research Service SDK will be available in late 2003.
Abstract: The Research and Reference feature in Office 2003 applications provides rich, integrated search functionality. Out of the box, Office 2003 includes a number of Microsoft and third party services. Research and Reference is also a platform for organizations to build their own research and reference services and for third party research providers to build subscription services. This paper describes the technology in more detail and discusses how administrators can ensure that their organizations get its maximum value.
Published: April 2003
Service/Source - A service or source is a collection of research data displayed in the Research and Reference task pane. In this document, we will use the term 'service.'
Provider - A provider is a source of research and reference content typically accessed via an external URL or internal server address. A provider may offer one or more research sources.
Query - Query is the general term for a Research and Reference search.
XML - eXtensible Markup Language is a metadata definition language used to describe data in a structured open format.
SOAP - Simple Object Access Protocol is an XML/HTTP-based protocol for accessing services, objects and servers in a platform-independent manner.
The Research and Reference feature in Microsoft® Office 2003 lets information workers quickly locate and use the information they need without leaving the application in which they are working. Research and Reference expands upon the search implementation in Microsoft Office XP, which provides integrated Windows® Explorer-based search functionality from within Office XP applications. It is powerful and broad enough to provide information workers with a search experience that they will choose over web-based research and reference sites.
Before Office 2003, information workers had to do similar research and quick reference tasks by switching repeatedly between applications. For research, information workers could open a browser, retrieve relevant information, and return to an Office application to incorporate that information. For quick reference tasks, such as looking up a definition, information workers could, again, use a browser-based dictionary service, or they could use Microsoft's Bookshelf®. The first design priority for Research and Reference within Office 2003 was to dramatically streamline this process by combining and improving existing tools and by providing:
The Office 2003 product includes numerous sources of Research and Reference "right out of the box", including Dictionary, Thesaurus, MSN Search, and Encarta® Encyclopedia, and a number of third party services. Research and Reference is also a platform for organizations to build their own research and reference services and for third party research providers to build subscription services.
This paper describes the Research and Reference feature in more detail, including its architecture and infrastructure requirements. The paper goes on to describe how organizations and third parties can create custom services, and finishes with a discussion of the options for configuring clients to use both built-in and custom services.
All Office 2003 applications provide the same Research and Reference task pane. Figure 1 shows the task pane in Microsoft Office Word 2003, with a few results from a keyword search. This search was performed by right-clicking the word "remedy" within the document and clicking Look Up, one of several methods for initiating a search.
Notice in the figure that the Thesaurus results provide smart tags that allow the information worker to copy a word, insert it into the document (replacing the current selection and taking on the proper formatting), or do a look up of that word. Developers can integrate smart tag functionality into their research and reference services in a variety of ways. Given the dramatic improvements in smart tag technology in Office 2003, this integration is a powerful aspect of Research and Reference.
When using Research and Reference, the information worker specifies via a pull-down menu which services to search. In this example, All Reference Books was selected. Selecting All Research Sites returns results from all installed internally and externally hosted research services. Again, the results are well organized (with collapsible sections) and, as mentioned above, can have smart tag intelligence built in.
Figure 2 shows a simple query for the term "Blood Pressure", with results from eLibrary, MSN® Search, Factiva News Search, and Encarta Encyclopedia.
Figure 1: Task pane
Figure 2: Simple research search
Research and Reference uses XML for all communications and for the display and manipulation of search results. The layout of results is very flexible, because developers can use XML and smart tags to provide rich formatting, collapsible lists, intelligent content-based actions, and so on.
Because Research and Reference is built into Office 2003, it works "out of the box" with no specific customization required. From a network perspective, all communications take place over HTTP (as XML or, more specifically, SOAP), so no special firewall configuration is required. Research and Reference services can be hosted either internally or externally.
The following sections describe the architecture from the perspective of IT professionals concerned with how Research and Reference functions within their existing IT infrastructure:
Research and Reference services are made available through a provider, which can host multiple services. Office 2003 applications connect to a provider via its URL and receive from the provider a list of available services. By default, all Office 2003 clients are configured to check Microsoft's provider (http://office.microsoft.com/research/query.asmx) for new Microsoft services and for third party services that Microsoft lists. Organizations can also create their own providers, exposing whatever services they wish.
All client/provider communications, as well as client/service communications, take place over HTTP. Hence, as far as clients are concerned, it makes no difference whether the provider or service is located within the firewall or on the Internet (see figure 3).
There are three basic scenarios in which a research service can be used: in an intranet, through the Internet, or on a client machine (running a service locally on a machine has limitations that are discussed in the SDK). Information workers positioned behind a corporate firewall can access services on the Internet directly through a client application, such as Word 2003 or Microsoft Office Excel 2003, or can access research services indirectly through a server within their corporate intranet.
Figure 3: Possible client/provider locations
The basic sequence of events for service installation is as follows:
HKEY_CURRENT_USER\Software\Microsoft\Office\11.0\Common\Research\Sources\<servicename>
and consist of entries such as those shown in figure 4 (the MSN Search Service).
Figure 4: Registry entries for a service
For smart tag integration, the service provider incorporates a separate setup process from within a search result in the Research task pane.
Note that IT professionals can and should incorporate Research and Reference service installation into their deployment strategy for the company. See the "Service Deployment" section below for more details.
Once a service is registered, information workers can initiate searches for that service. During a search, Office 2003 sends query packets to the service, which sends a response packet containing search results. All communications take place with formatted XML packets, and each segment of the communication adheres to a schema. Figure 5 shows the order of the XML schema packets that pass between client and service:
Figure 5: client/service communications
When the Office 2003 application receives a response from a service with the results of the search, it displays the results in the Research and Reference task pane.
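The query and response exchange described above can be pictured with a small Python sketch. It only illustrates the general idea of wrapping a search term in a SOAP envelope and reading results back out of an XML response. The element names used here (Query, QueryText, Title) are hypothetical placeholders and are not the schemas defined by the SDK; only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET
from typing import List

# Standard SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_query_envelope(query_text: str) -> str:
    """Wrap a search term in a SOAP 1.1 envelope.

    <Query> and <QueryText> are hypothetical element names used for
    illustration; the real packet layout is defined by the SDK schemas.
    """
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    query = ET.SubElement(body, "Query")
    ET.SubElement(query, "QueryText").text = query_text
    return ET.tostring(envelope, encoding="unicode")


def read_result_titles(response_xml: str) -> List[str]:
    """Collect the text of every (hypothetical) <Title> element in a response."""
    root = ET.fromstring(response_xml)
    return [element.text or "" for element in root.iter("Title")]
```

Because the packets are plain XML carried over HTTP, a client of this shape needs nothing beyond an XML parser and an HTTP library, which is consistent with the point that no special firewall configuration is required.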
Research service providers are able to specify custom actions for the content they return. These actions are presented to information workers using the same mechanism used to expose the built-in content actions such as inserting and copying.
The actions themselves are carried out via smart tags provided by the service provider. This way the Research and Reference framework does not need to maintain information about smart tags. For example, a research service may return a response containing a smart tag that gives information workers the ability to grab additional live data, transform the response text, or some other action. An Insert action can also place content into Word 2003 and Excel 2003 documents as XML, and then, for example, additional intra-document actions may become available.
Office 2003 includes a rich offering of research and reference services out of the box. There are a number of Microsoft services as well as third party services provided by partners. Figure 6 shows the Research Options pane, which displays installed services and allows information workers to activate and deactivate services.
Figure 6: Installed research and reference services
The following services are installed by default:
The Thesaurus and Translation services, listed above in the Reference Books section, are locally installed, which means that offline searches will yield results. All other default services are not locally installed, so offline searches against them will not yield results.
In addition to providing better, broader, and more integrated searching via the built-in services described above, Research and Reference is also a platform for organizations to build their own research and reference services and for third parties to build subscription services.
For example, a pharmaceutical company with a huge internal database (or multiple databases) containing information on their products (R&D information, insurance information, sales information, and so on) could create a service that makes all this information available to designated information workers in a powerful way while they work within Office 2003 applications. The service would consist of a SOAP function named "Query" that handles client queries, retrieves the information from the database, and returns the information to the client. The Research and Reference Solution Developers Kit provides detailed information on building a custom service.
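As a rough illustration of the "Query" function described above, the following Python sketch parses an incoming query, looks the term up in a data store, and returns the matches as XML. It is a minimal sketch under stated assumptions: the in-memory dictionary stands in for the company database, its contents are invented sample data, and the element names (Query, QueryText, Response, Result) are hypothetical rather than taken from the SDK schemas. A real service would expose this function as a SOAP web method.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Stand-in for the company's product database (hypothetical sample data).
PRODUCT_NOTES: Dict[str, List[str]] = {
    "blood pressure": [
        "Product A: internal R&D summary",
        "Product B: sales overview",
    ],
}


def query(query_xml: str) -> str:
    """Handle one client query and return a response packet as XML text.

    All element names here are illustrative placeholders, not the SDK schema.
    """
    request = ET.fromstring(query_xml)
    term = (request.findtext("QueryText") or "").strip().lower()

    response = ET.Element("Response")
    # One <Result> element per matching record; no matches yields an empty response.
    for note in PRODUCT_NOTES.get(term, []):
        ET.SubElement(response, "Result").text = note
    return ET.tostring(response, encoding="unicode")
```

Keeping the lookup behind a single function means the data source can change (one database, several databases, another web service) without affecting what the Office client sees.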
Continuing the example from figure 2 above, our sample company (Contoso Pharmaceuticals) has created a custom-built research service with information on its products. An information worker can add the service from the Research Options pane by clicking Add Services and typing in the URL, as shown in figure 7. The company could also use one of the deployment methods described in the "Service Deployment" section below.
Figure 7: Adding a service manually
After adding the new service, the information worker's search for "Blood Pressure" yields the results shown in figure 8. Key corporate data is now available where the information worker needs it.
Figure 8: Results from the custom service
All services are defined by registry entries (figure 4 shows the registry entries for the MSN Search Service), and service deployment consists of getting those registry entries onto the client machine. There are a few ways to do this:
With these pointers in place, Office 2003 will frequently check whether new services are available and, based on how the administrator has configured clients, either notify the user or automatically install the new services. See "Controlling Service Installation Options" for more details.
If information workers are to have full control over which services they want to install, administrators do not need to do anything more.
Optionally, administrators can control which services are installed by default, whether information workers will be able to add services manually, and whether they will automatically connect to an internal Arbitrary Discovery Server.
First, to specify which services are installed by default, the administrator can create a set of keys under HKEY_LOCAL_MACHINE that define the services. By default, no keys exist at HKEY_LOCAL_MACHINE\Software\Microsoft\Office\11.0\Common\Research\Sources\.
Creating a list of keys, each representing a service (as shown in figure 9), will cause Office to bypass its normal procedure of initially installing services. Specifically, instead of looking to Microsoft's discovery server and installing default services, the Office client will find these keys written to HKEY_LOCAL_MACHINE and copy these to HKEY_CURRENT_USER, with the result that the services specified by the administrator are now installed for the user. This process is called "propagation."
Figure 9: HKEY_LOCAL_MACHINE keys will propagate to users
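The propagation step can be modeled in a few lines of Python. This sketch does not touch the real registry: it uses plain dictionaries as stand-ins for the Sources keys under HKEY_LOCAL_MACHINE and HKEY_CURRENT_USER, and the service and value names in the usage example are hypothetical. The text does not say what happens when the user already has a key with the same name, so keeping the user's copy is an assumption of this sketch.

```python
from copy import deepcopy
from typing import Dict

# service key name -> {value name: data}; a stand-in for a registry subtree
Hive = Dict[str, Dict[str, str]]


def propagate(machine_sources: Hive, user_sources: Hive) -> Hive:
    """Return the user's sources after copying administrator-defined services.

    Each service key found under the machine hive is copied into the user
    hive. Existing user keys are left untouched (an assumption, see above).
    """
    result = deepcopy(user_sources)
    for service, values in machine_sources.items():
        result.setdefault(service, deepcopy(values))
    return result


# Usage: one administrator-defined service propagates to a new user.
hklm_sources = {"ContosoProducts": {"Name": "Contoso Products"}}
hkcu_sources = propagate(hklm_sources, {})
```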
There are a few additional registry entries that allow administrators to further control user options. The three available keys (which do not exist by default) are:
HKEY_LOCAL_MACHINE\Software\Microsoft\Office\11.0\Common\Research\Options\NoAdd
This key should be a DWORD value that behaves as a Boolean but allows for other values. When given a value of 1, NoAdd will block all forms of adding services EXCEPT:
HKEY_LOCAL_MACHINE\Software\Microsoft\Office\11.0\Common\Research\Options\NoDiscovery
This key should also be a DWORD value. When given a value of 1, NoDiscovery will prevent Office applications from looking to either Microsoft's discovery server or any Arbitrary Discovery Server.
HKEY_LOCAL_MACHINE\Software\Microsoft\Office\11.0\Common\Research\Discovery
This key will cause Office applications to query the specified address for available services. As mentioned above, up to 5 entries are possible, and take the form of:
Figure 10 shows the new keys, with NoAdd turned on and NoDiscovery turned off.
Figure 10: New keys
Administrators may use the Office Resource Kit to deploy these registry settings, or they may manually install them or create a batch file.
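For the manual or batch-file route, one option is to generate a .reg file and import it on each client. The Python sketch below renders the configuration from figure 10 (NoAdd turned on, NoDiscovery turned off) in standard .reg syntax. It assumes that NoAdd and NoDiscovery are DWORD values stored under the Options key, which is how the paths above read; the helper name build_reg_file is hypothetical.

```python
from typing import Dict

# Assumed location of the option values, based on the paths given in the text.
OPTIONS_KEY = r"HKEY_LOCAL_MACHINE\Software\Microsoft\Office\11.0\Common\Research\Options"


def build_reg_file(key_path: str, dword_values: Dict[str, int]) -> str:
    """Render DWORD values for one registry key in .reg file syntax."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key_path}]"]
    for name, value in dword_values.items():
        # DWORD data is written as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"


# NoAdd on, NoDiscovery off, matching figure 10.
reg_text = build_reg_file(OPTIONS_KEY, {"NoAdd": 1, "NoDiscovery": 0})
```

The resulting text could be saved with a .reg extension and applied from a batch file with the reg import command, or handed to whatever deployment tool the organization already uses.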
There are no special security considerations for Research and Reference usage, per se. Since a registered service may only transmit data that adheres to the Research and Reference schema, there is no danger of malicious content within the XML stream that is displayed in the task pane. However, a response may contain a link to an installation program for integrated smart tag functionality. As with any code, it is highly recommended that only signed code be allowed.
Some services may require authentication, for which Research and Reference provides the following models:
As mentioned in the Architecture section, communications take place over HTTP, but developers may also use HTTPS for secure connections. Developers can place all communications over HTTPS or they can specify that some specific action uses HTTPS (for example, submitting a form).
Research and Reference provides a powerful, integrated, and extensible solution for gathering information. As more and more companies build their own services and more and more third parties build subscription services, Research and Reference will become even more powerful.
Smart documents are another feature of Office 2003. Both features use XML to make data available to information workers. One key functional difference is that smart documents are more suited to manipulating data (both in retrieving data and in saving data back to a database or other location), while Research and Reference is suited more to gathering information. Another key difference is that smart document solutions are attached to a specific document, while Research and Reference is independent of any specific document.
Of the built-in services, the Thesaurus and Translation services store their information on the client so that, even without an internet connection, they will work. Other built-in services will not yield results offline. Depending on how they are developed, third party services may also store data locally and work without an internet connection.
Yes. Administrators can prevent users from adding services, and they can configure Office's server side sources (including third party ones).
For more information, see the following web sites:
Office http://www.microsoft.com/office