WWW 2008 / Poster Paper

April 21-25, 2008 · Beijing, China

The World Wide Telecom Web Browser
Sheetal Agarwal, Arun Kumar, Amit Anil Nanavati, Nitendra Rajput
IBM India Research Laboratory, 4, Block C, Vasant Kunj, Institutional Area, New Delhi - 110070, India.

{sheetaga, kkarun, namit, rnitendra}@in.ibm.com

ABSTRACT
As the number of telephony voice applications grows, there will be a need for a browser to surf the Web of interconnected voice applications (called VoiceSites). These VoiceSites are accessed through a telephone over an audio channel. We present the concept and architecture of the T-Web Browser, a World Wide Telecom Web browser that enables browsing the Web of voice applications through an ordinary phone. This browser will support rich browsing features such as history and bookmarking.

Categories and Subject Descriptors: H.4.3 [Communications Applications]: Information browsers
General Terms: Design, Human Factors
Keywords: Developing Regions, Voice Browser, WWTW, HSTP

1. INTRODUCTION

Currently, voice applications are accessed using a telephone, as shown in Figure 1. A Voice Browser is used to access multiple applications A1, A2, ..., An that are hosted on the same Application Server (App Server in the figure). Since all the voice applications are on the same server and are accessed by the same Voice Browser, it is possible for the Application Server to maintain a browsing history of the applications being browsed. The applications A1 to An can link to each other [1].
Figure 1: The current deployment and access model.
(Diagram labels: two sites, each with a Voice Browser, ASR, TTS and an App Server; one hosts applications A1, A2, ..., An and the other hosts B1, B2, ..., Bn.)

However, in this architecture, the Voice Browser at the first site cannot follow a link from an application in set A to an application in set B. How to enable surfing such cross-browser links is the problem we address in this paper. By allowing a voice application on one Voice Browser to link to another voice application deployed on another Voice Browser, the reachability of these applications can be significantly increased. This is analogous to web applications in the World Wide Web, where any website can connect to any other website, regardless of where they are deployed.

Continuing the Web analogy, we have shown that creating and deploying voice applications (called VoiceSites) is as easy as creating a webpage [3]. This is especially important for people in developing countries where Internet/PC availability is low, since VoiceSites can be created by speaking through an ordinary telephone. We also described the process of enabling a hyperspeech link to another voice application, and presented a Hyperspeech Transfer Protocol (HSTP) to support hyperspeech links [2]. Such an interconnection of VoiceSites opens several possibilities for telephony voice applications and can create a web parallel to the WWW, known as the WWTW [4].

As an example, a fisherman can create a VoiceSite that gives information about, and prices of, the fish he has available. He can link his VoiceSite to a payment gateway VoiceSite to enable transactions. Villagers can call his VoiceSite, order fish and make a payment while the fisherman is busy at his lake farm. In a practical scenario, it is likely that these two VoiceSites are hosted at different Voice Browsers.

In this paper, we present the concept and architecture of a T-Web Browser. This browser will enable navigation of interlinked voice applications that are deployed across different Voice Browsers. The architecture illustrates that the Browser can be implemented as a special VoiceSite (Section 2) and can support standard browsing features such as go-back and bookmark. The T-Web Browser can also be implemented on the device itself, but that would require speech recognition support on the device.

2. WWTW BROWSER ARCHITECTURE
We implement the T-Web Browser as a special VoiceSite. In addition to a dialog flow (authored as VoiceXML-jsp), the T-Web Browser VoiceSite uses a database to maintain a history of the current user session. It uses an additional database to store the caller's bookmarks. The history consists of the titles and the phone numbers of the VoiceSites called by the user. The bookmarks contain the phone number and a name tag for each bookmarked VoiceSite. The operational model of the T-Web Browser is shown in Figure 2. We describe the working of the T-Web Browser using the circled steps shown in the figure.
1. The user calls the T-Web Browser to access a VoiceSite.
2. The T-Web Browser transfers the call to the phone number of the VoiceSite through HSTP.
Copyright is held by the author/owner(s). WWW 2008, April 21-25, 2008, Beijing, China. ACM 978-1-60558-085-2/08/04.


Figure 2: The T-Web Browser interaction with VoiceSites.
(Diagram labels: the World Wide Telecom Web of VoiceSites, each with an HSTP layer; HSTP Transfer; the T-Web Browser with Dialog Flow, History, Bookmarks, Back, ASR, TTS and App Server on a Voice Browser; circled steps 1 to 7.)

3. When the user selects a hyperspeech link to browse to another VoiceSite, the session is transferred to the target VoiceSite and HSTP passes the call transfer information to the T-Web Browser.
4. This information is stored in the Browser history.
5. The user issues a Browser command, e.g. go-back.
6. The T-Web Browser instructs the HSTP layer on the current VoiceSite to initiate a transfer to the phone number of the earlier VoiceSite.
7. At any time, the user can say bookmark to bookmark the currently browsed VoiceSite.

The HSTP layer implements the protocol for transferring a user session from one VoiceSite to another. We will have to modify HSTP and its message format to support browsing features. An additional field will have to be added to the HSTP message: the phone number of the next VoiceSite to which the user will surf. The current VoiceSite application context will be transmitted to the T-Web Browser in the application context field of the HSTP message.

The above architecture requires that a user can access the VoiceSite and the T-Web Browser simultaneously. This requires two simultaneous voice channels from the user's phone. This can be realised in two ways: (a) through a three-party conference call between the user, the T-Web Browser and the VoiceSite, or (b) by keeping both voice channels active and letting the user put one channel on hold while talking through the other. The first approach requires that the VoiceSite and the T-Web Browser be able to disambiguate the user utterance and identify whether it is a command to the browser or an interaction with the VoiceSite. The latter approach needs a phone (and a service provider) that can provide two simultaneously active calls. Since neither of the two requirements is insurmountable, either method can be used in the implementation of the T-Web Browser.
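The seven steps and the extended HSTP message described above can be illustrated with a minimal in-memory sketch. It is not the HSTP library or the Browser prototype: the class names, the field names (`source_number`, `next_number`, `next_title`), the use of Python lists and dictionaries in place of the history and bookmark databases, and the example phone numbers are all assumptions made for illustration. In particular, carrying the target VoiceSite title in the message is an assumption; the text only states that the next phone number and the application context are carried.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HstpMessage:
    """Hypothetical transfer notification; all field names are assumptions."""
    source_number: str   # VoiceSite the user is leaving
    next_number: str     # the additional field: next VoiceSite phone number
    next_title: str      # assumed here so the history can store a title
    application_context: dict = field(default_factory=dict)


@dataclass
class HistoryEntry:
    title: str
    phone_number: str


class TWebBrowserSession:
    """In-memory stand-in for the history and bookmark databases."""

    def __init__(self) -> None:
        self.history: List[HistoryEntry] = []
        self.bookmarks: Dict[str, str] = {}  # name tag -> phone number

    def open_site(self, title: str, phone_number: str) -> str:
        """Steps 1-2: record the requested site, return the number to call."""
        self.history.append(HistoryEntry(title, phone_number))
        return phone_number

    def on_transfer(self, message: HstpMessage) -> None:
        """Steps 3-4: a hyperspeech link was followed; store the target."""
        self.history.append(
            HistoryEntry(message.next_title, message.next_number))

    def go_back(self) -> Optional[str]:
        """Steps 5-6: return the earlier site's number, or None if none."""
        if len(self.history) < 2:
            return None
        self.history.pop()  # drop the site the user is leaving
        return self.history[-1].phone_number

    def bookmark(self, name_tag: str) -> None:
        """Step 7: bookmark the currently browsed site under a name tag."""
        if not self.history:
            raise ValueError("no VoiceSite is currently being browsed")
        self.bookmarks[name_tag] = self.history[-1].phone_number


session = TWebBrowserSession()
session.open_site("Fish prices", "555-0101")
session.on_transfer(
    HstpMessage("555-0101", "555-0102", "Payment gateway", {"order": "fish"}))
session.bookmark("payments")
previous_number = session.go_back()  # number the HSTP layer would transfer to
```

In this sketch the history behaves as a stack, so go-back discards the site being left. A real Browser might instead keep a cursor into the history so that a go-forward command remains possible.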
This architecture can support rich browsing features similar to multi-tabbed browsing on the WWW by enabling simultaneous active calls to different VoiceSites. A user can put other VoiceSites on hold while interacting with one.
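The tab analogy can be made concrete with a small sketch of the bookkeeping involved: several calls are open, exactly one is active, and the rest are on hold. The class and method names below are hypothetical, and the sketch models only the state, not the telephony operations that would place or hold the calls.

```python
from typing import List, Optional


class TabbedCalls:
    """Tracks simultaneous calls; at most one is active, the rest are held."""

    def __init__(self) -> None:
        self.calls: List[str] = []          # phone numbers of open VoiceSites
        self.active: Optional[int] = None   # index of the active call

    def open(self, phone_number: str) -> None:
        """Open a new call; it becomes active and the others are on hold."""
        self.calls.append(phone_number)
        self.active = len(self.calls) - 1

    def switch_to(self, phone_number: str) -> None:
        """Make an already open call active; ValueError if it is not open."""
        self.active = self.calls.index(phone_number)

    def active_number(self) -> Optional[str]:
        """Phone number of the call the user is talking to, if any."""
        return None if self.active is None else self.calls[self.active]

    def on_hold(self) -> List[str]:
        """Phone numbers of all open calls other than the active one."""
        return [n for i, n in enumerate(self.calls) if i != self.active]


tabs = TabbedCalls()
tabs.open("555-0101")
tabs.open("555-0102")       # the first call is now on hold
tabs.switch_to("555-0101")  # the second call is now on hold
```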

2.1 Implementation
We have implemented the HSTP layer that will enable the call and context transfer required by the T-Web Browser. HSTP has been implemented as a Java class library, and the API can be accessed from any VoiceXML application through a Java Bean. We are in the process of developing the prototype of the T-Web Browser VoiceSite. The VoiceSite will be authored in VoiceXML-jsp and deployed in Apache Tomcat. It will be accessed through phone calls that are intercepted by a Dialogic telephony card. We use the Genesys Voice Browser to interpret VoiceXML.

3. CONCLUSION
In this paper, we briefly presented the requirement for a Browser that can browse VoiceSites deployed across domains. We described the concept and a high-level architecture of a T-Web Browser that is implemented as a VoiceSite. We are in the process of implementing this browser, and keenly await user feedback through field studies.

4. REFERENCES
[1] 1800-555-TELL. http://www.tellme.com.
[2] S. Agarwal, D. Chakraborty, A. Kumar, A. A. Nanavati, and N. Rajput. HSTP: Hyperspeech Transfer Protocol. In ACM Hypertext, UK, September 2007.
[3] A. Kumar, N. Rajput, D. Chakraborty, S. Agarwal, and A. A. Nanavati. Voiserv: Creation and delivery of converged services through voice for emerging economies. In IEEE WoWMoM, Finland, June 2007.
[4] A. Kumar, N. Rajput, D. Chakraborty, S. Agarwal, and A. A. Nanavati. WWTW: A World Wide Telecom Web for Developing Regions. In ACM SIGCOMM Workshop on Networked Systems For Developing Regions, Japan, August 2007.
