WWW 2008 / Refereed Track: Security and Privacy - Misc
April 21-25, 2008 · Beijing, China

Privacy-Enhanced Sharing of Personal Content on the Web

Mohammad Mannan, Paul C. van Oorschot
School of Computer Science, Carleton University
Ottawa, Ontario, Canada
{mmannan, paulv}@scs.carleton.ca

ABSTRACT

Publishing personal content on the web is gaining increased popularity with dramatic growth in social networking websites, and availability of cheap personal domain names and hosting services. Although the Internet enables easy publishing of any content intended to be generally accessible, restricting personal content to a selected group of contacts is more difficult. Social networking websites partially enable users to restrict access to a selected group of users of the same network by explicitly creating a "friends' list." While this limited restriction supports users' privacy on those (few) selected websites, personal websites must still largely be protected manually by sharing passwords or obscure links. Our focus is the general problem of privacy-enabled web content sharing from any user-chosen web server. By leveraging the existing "circle of trust" in popular Instant Messaging (IM) networks, we propose a scheme called IM-based Privacy-Enhanced Content Sharing (IMPECS) for personal web content sharing. IMPECS enables a publishing user's personal data to be accessible only to her IM contacts. A user can put her personal web page on any web server she wants (vs. being restricted to a specific social networking website), and maintain privacy of her content without requiring site-specific passwords. Our prototype of IMPECS required only minor modifications to an IM server, and PHP scripts on a web server. The general idea behind IMPECS extends beyond IM and IM circles of trust; any equivalent scheme, (ideally) containing pre-arranged groups, could similarly be leveraged.

1. INTRODUCTION

Through social networking and photo-sharing websites, and personal blogs, it is becoming increasingly popular to make personal content available on the Internet. For some users, these sites provide a textual and/or pictorial documentary of life. Primarily because it is the easiest mode of operation, many users of these services allow their personal web content to be accessed by all other Internet users, often with the false impression that none other than their family or friends would look into their personal online posts [29]. Privacy concerns are largely being ignored (sometimes unknowingly) in the current rush to online lifecasting.

Social networking websites such as Facebook and MySpace provide access control mechanisms for partially restricting personal content to a known circle of contacts; photo-sharing websites such as Flickr and Shutterfly provide similar mechanisms. A user can invite her friends and family to be added to her permitted list, and can authorize only such people to view her web content, but only if they create accounts at the publishing user's social networking site. Although users reportedly disclose personal data in abundance at these social networking sites, a relatively small number of users limit access to their profiles only to a friends' circle; several studies provide evidence of such behaviour [21, 29, 39, 51]. While this limited restriction might help users' privacy, this applies only for the content on those (few) sites.

We focus on the general problem of privacy-enabled web content sharing from any user-chosen web server. Many users now own domain names for hosting personal websites, facilitated by the very low price; as of October 2007, a top-level domain name may cost less than $6/year, with $4/month commercial hosting fees. Most ISPs also offer free web spaces for home users.
It is thus cheap and easy to make any personal data available to anyone around the globe through a website; however, restricting such content to a selected group of people is more difficult. Currently this is achieved primarily by either (i) advertising an obscure link through personal email, i.e., a URL which is not linked from any other web page; or (ii) protecting a web page with a password, and distributing that password among chosen contacts through email, instant messaging (IM), or phone. Emailing an obscure URL to many contacts (friends and family members) is a rather cumbersome approach, especially if the shared URLs are often updated. Password protection (e.g. HTTP Authentication [18], forcing a login dialogue/page) is not uncommon among the more technically inclined, but this leads to yet one more password to share and maintain, and once a password is shared with someone, the access grant cannot be retracted without changing the password (which also requires distributing the new password to all other contacts). Also, anyone who learns the shared password can view the protected content without the publishing user's consent; anyone knowing the password can pass it on to others, and such transitive access is not generally preventable.

Categories and Subject Descriptors: K.6.5 [Management of Computing and Information Systems]: Security and Protection--Authentication, Unauthorized access; K.4.1 [Computers and Society]: Public Policy Issues--Privacy
General Terms: Security, Human Factors
Keywords: access control, sharing, circle of trust, privacy
Version: February 24, 2008. Copyright is held by the International World Wide Web Conference Committee (IW3C2). Distribution of these papers is limited to classroom use, and personal use by others. WWW 2008, April 21-25, 2008, Beijing, China. ACM 978-1-60558-085-2/08/04.
Relying on the immense popularity of public instant messaging (IM) networks,¹ we propose a scheme called IM-based Privacy-Enhanced Content Sharing (IMPECS) to disseminate personal web content by leveraging the established "circle of trust" on IM networks. We assume both publishing and viewing users can use, or already do use, IM. A user's web content can be viewed only by her IM contacts. Further restrictions can be applied depending on which group of users (e.g. family, friends, co-workers) a specific contact is placed in by a publishing user, i.e., one who originally makes personal content available for her IM contacts. A viewing user is one who wants to view such content. We assume that a web server and an IM server share a user-specific content sharing key; a 'ticket' (similar to a session cookie) is generated by the IM server for a viewing user using the content key of a publishing user, and the web server validates the ticket before serving data from a user's web folder (cf. Kerberos [31]).

Our primary goal is to enhance privacy (i.e. confidentiality) of users' personal web content; we do not aim for very high-end or military-grade security, as the security of IMPECS is limited by the underlying IM and web communication protocols, which in current practice transfer most content in plaintext although authentication passwords are generally sent over SSL (cf. [26, 11]). The main intended feature of IMPECS is that total strangers are precluded from direct access to a user's personal web content, but "friends" as designated by the user's IM contact list are allowed access (without requiring any special shared password). IMPECS also prevents large-scale web crawlers and auto-indexers from tagging personal data and pictures (see e.g. [4, 27]). However, malicious IM contacts of a publishing user may of course re-post the user's private content to a public web forum, and we are not proposing any form of digital rights management (DRM) control.
In summary, our proposal for privacy-enhanced personal web content sharing offers the following features and benefits.

1. Privacy-Enhanced Sharing. A publishing user's personal web content can be viewed only by the IM contacts that she pre-approves. Thus access to a user's web content is restricted to a designated group. For many existing IM users, such groups can be leveraged without additional setup costs.

2. Usable Security. The privacy enhancement does not require a viewing user to separately update his IM client, or remember the publishing user's URL, or have access to a site-specific password to view the publisher's content. Similarly, the publishing user need not carry out any extra steps beyond existing management of an IM contact list, although finer granularity lists can optionally be created by advanced users.

3. Interoperability. In contrast to social networking websites, a user can publish her web content at any web server of her choice, and yet be able to maintain greater access control on her content.

4. Decreased Risks Related to Sharing. By restricting open access to personal details, IMPECS reduces opportunities for launching context-aware, targeted phishing attacks [30, 44, 50].

5. Protection Against Web Server Compromise. A variant of IMPECS (Section 4) can prevent en masse drive-by-downloads [36, 47] as currently being enabled by the compromise of a hosting provider with a large number of customers.

To test our design, we built a prototype of IMPECS using the IETF standardized Extensible Messaging and Presence Protocol (XMPP [40, 41], i.e., the Jabber IM protocol). This required only minor modifications to the IM server, and PHP scripts on a web server. Our implementation source code is available on request.

¹ For example, according to one estimate [6], there are about 350 million user accounts in MSN and Yahoo! IM networks in total.

Organization.
In Section 2, we discuss the proposed IMPECS scheme, threat model and operational assumptions. Our prototype implementation is discussed in Section 3, along with brief comments on deployment issues. A variant of IMPECS is discussed in Section 4. Section 5 provides further motivation, an overview of existing and proposed work related to personal content sharing, and a comparison of IMPECS with these in terms of user convenience and usability. Section 6 concludes.

2. IM-BASED PRIVACY-ENHANCED CONTENT SHARING (IMPECS)

In this section, we describe the proposed IMPECS scheme, threat model and operational assumptions. Table 1 summarizes our notation. We assume readers are familiar with basic IM definitions such as presence and contact list (e.g. see [25]).

Table 1: Notation used in IMPECS

A, B: Two IM users Alice and Bob, both members of each other's respective contact lists. A is the publishing user; B is the viewing user.
Si, Sw: IM and web servers, respectively. Both A and B have accounts with Si, and A maintains an account with Sw.
IDAw: A's user ID at Sw (unique in Sw's domain).
KAw: A's content sharing key, shared with both Sw and Si.
{data}K: Authenticated encryption [20, 10] of data using symmetric key K.
URLA: The URL of A's publishing web folder on Sw.
R: Access restrictions on URLA as imposed by A.
Tiw: An access control ticket for viewing URLA (generated by Si, and validated by Sw).
URLAR: A 'registration' URL generated by Sw when requested by A. The content sharing key and restrictions are shared between Sw and Si through this URL.
URLAT: A 'viewing' URL (for accessing URLA) containing a ticket Tiw, generated by Si at the request of B.

Figure 1: Registering a URL in IMPECS. [Message-flow diagram between Publisher (A), IM Server (Si) and Web Server (Sw): authentication between A and Sw; A to Sw: request a registration URL for URLA, specifying restrictions R; Sw to A: URLAR; authentication between A and Si; A to Si: URLAR.]

Overview of IMPECS. Assume user A maintains a website on a web server Sw. A registers her site with an IM server Si, and sets permission for the site, e.g., which contacts can access which pages/folders. For example, contacts in the group "friends" may have different permissions than the group "family." Sw and Si share a user-specific content sharing key for A. IM contacts of A can see (through their IM clients) whether A offers any personal URL which they are permitted to view. When a contact B wants to visit A's advertised personal website (or any pages thereon), B sends a request to Si to visit the website. Depending on restrictions R (e.g. duration, frequency) for viewing web pages at URLA, Si generates a 'ticket' (similar to a session cookie), and sends a special URL to B along with the ticket. B receives the URL instantly (e.g. as an IM text message) from Si, and can visit URLA within a time period as specified by the ticket. Note that A need not be online to provide this permission. We now describe the scheme in greater detail.

Setup. A and B are two IM users who maintain IM accounts at the same IM server Si. (Note that A and B may use different IM servers, as long as their IM servers facilitate communication between the users, e.g., as in distributed XMPP [40], Windows Live/Yahoo! Messenger.) Both users have added each other into their contact lists; adding someone to a contact list requires explicit permission from the user being added (a common practice in most IM networks).
A also puts B into an appropriate group of her contact list (e.g. "family", "friends", "co-workers"). A maintains an account with a web server Sw, and uploads some personal pictures or files under a web folder URLA at Sw. A wants to share URLA with a select group of IM contacts including B.

Registering a URL with the IM server. We now describe the steps for publishing a content-hosting URL in IMPECS. Figure 1 outlines the following steps.

1. A logs into Sw (e.g. using a pre-established password over SSL).

2. A uploads her personal files and sets restrictions on URLA, e.g., the length of time a ticket will remain valid after being generated by Si (using e.g. HTML check-boxes or drop-down lists). A then requests Sw to generate a registration URL for URLA.

3. Sw generates a random content sharing key KAw (e.g. 128 bits, sufficient for precluding offline dictionary attacks) and stores it in a protected database, or in a file under A's private space. Sw constructs the registration URL, URLAR = http://<URLA>/?userid=IDAw&key=KAw&restrictions=R, and sends URLAR to A (e.g. through HTTPS). Here, by <URLA> we mean the actual URL (without the 'scheme name'), not a label for that URL (i.e. not the string "URLA").

4. A logs into Si (e.g. using her regular IM password over SSL).

5. A forwards URLAR to Si, for the purpose of registering this information with Si. Si stores URLA, IDAw, KAw and R for future ticket generation.

Viewing a protected URL via an IM server. We now describe the steps for viewing a content-hosting URL in IMPECS. Figure 2 outlines these steps.

1. B logs into Si (e.g. using his regular IM password over SSL), and receives his contact list as usual in IM. As part of IMPECS, B also receives a list of private URLs, offered by his contacts, which B is authorized to access.

2. B sends a request to Si for a ticket to view one of these URLs, say A's web content at URLA.

3. Si generates a ticket Tiw = {IDAw, R}KAw, constructs URLAT = http://<URLA>/?userid=IDAw&ticket=Tiw, and sends URLAT to B.

4. B forwards URLAT to Sw. Sw retrieves KAw using IDAw as embedded in B's request. Then Sw decrypts the ticket Tiw, and checks whether A's user ID in the URL is the same as the one inside the ticket. Sw also checks the restrictions; e.g., R could be as simple as a timestamp, in which case Si encrypts the current timestamp into the ticket and Sw accepts that ticket if received within a specific time period (e.g. 60 seconds, as set by A).

5. Sw sends the content hosted at URLA to B after validating B's ticket Tiw in URLAT (as in step 4). If a valid ticket is not supplied, Sw denies access to URLA.

Figure 2: Viewing a personal URL in IMPECS. [Message-flow diagram between Viewer (B), IM Server (Si) and Web Server (Sw): authentication between B and Si; B to Si: request to access URLA; Si to B: URLAT; B to Sw: URLAT; Sw to B: content hosted at URLA.]

Caveats. A malicious user B can compromise the privacy of content hosted at URLA, by making a copy of the website and posting it on a publicly accessible site, or sending a valid ticket to anyone B wants. Although A cannot stop copying of her personal content, she may limit (to some extent) forwarding of a valid ticket with the help of Si and Sw in the following way. Si can encrypt B's current IP address into the ticket, and Sw can check whether it receives the ticket from the specified IP address as embedded inside the ticket (assuming both Si and Sw have access to the same IP address of B).

If a content key KAw is leaked, anyone can generate valid tickets with that key, and thus compromise the privacy of content hosted at URLA. If A changes her content key KAw, this threat can be minimized. Note that A's modifications to her web content, and key updates, are transparent to viewing users.
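The ticket steps and the IP-address caveat above can be sketched roughly as follows. This is a minimal illustration, not the prototype's code: because the Python standard library has no authenticated encryption, the sketch substitutes an HMAC-SHA256 tag, so the ticket contents are integrity-protected but not hidden, unlike {IDAw, R}KAw. The JSON layout and the function names `issue_ticket` and `validate_ticket` are assumptions made for the example.

```python
import hashlib
import hmac
import json
import time
from typing import Optional

MAC_LEN = 32  # length in bytes of an HMAC-SHA256 tag


def issue_ticket(key: bytes, user_id: str, viewer_ip: str,
                 now: Optional[float] = None) -> bytes:
    """Si side: bind IDAw, a timestamp and B's IP address under KAw.

    Assumption: HMAC stands in for authenticated encryption, so the
    payload is readable by anyone holding the ticket.
    """
    issued = time.time() if now is None else now
    payload = json.dumps(
        {"id": user_id, "ts": issued, "ip": viewer_ip}, sort_keys=True
    ).encode()
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def validate_ticket(key: bytes, ticket: bytes, url_user_id: str,
                    client_ip: str, max_age: float = 60.0,
                    now: Optional[float] = None) -> bool:
    """Sw side: verify the tag, then the ID, IP and freshness checks."""
    if len(ticket) <= MAC_LEN:
        return False
    tag, payload = ticket[:MAC_LEN], ticket[MAC_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False  # forged or modified ticket, or wrong content key
    fields = json.loads(payload)
    current = time.time() if now is None else now
    return (
        fields["id"] == url_user_id               # ID in URL matches ticket
        and fields["ip"] == client_ip             # optional IP binding
        and 0 <= current - fields["ts"] <= max_age  # e.g. 60 seconds
    )
```

With this layout, a ticket forwarded to a host with a different IP address, or replayed after the validity window, is rejected without Sw keeping any per-ticket state.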
Although valid tickets can be generated with a compromised KAw, this key does not enable access to modify A's content on Sw.

Most IM and web accounts are currently authenticated by user-chosen (generally weak) passwords. A compromised IM account enables an attacker to add any malicious link (as personal URLs) to that account. A compromised web account enables an attacker to post any content on the compromised user's web space, and modify content keys (although he cannot update the content key at Si). However, these threats exist currently for both IM and web accounts; IMPECS does not increase these existing risks nor does it attempt to address them.

If user content is distributed across many different hosting sites (rather than being concentrated at only a few sites as in current social networking sites), then an adversary cannot easily track users by collecting their personal web content from only a few selected sites. However, in IMPECS if the IM server Si is compromised (or cooperates with the adversary), privacy of user content is lost for all IMPECS users of Si even if their content is hosted at different providers; from compromised content keys, anyone can generate valid tickets for accessing user data. Thus the IM server is a potential single point of privacy breach (if compromised or hostile).

If attackers can compromise the web server of a publishing user A, they can display whatever content they want from A's site, or spread malware to users visiting the site [36]. Compromise of a web server that hosts content from a large number of users is particularly risky, and has been reported in the past (e.g. [47]). We briefly outline a variant of IMPECS to mitigate such a large scale compromise in Section 4.

Threat model and operational assumptions. We assume that the circle of trust as built into IM networks is reliable, i.e., a viewing user is not malicious.
A publishing user A cannot be added to anyone's contact list without being explicitly approved by A (as is the common practice in most IM networks). To achieve fine-grained access control, we also assume that a publishing user groups contacts appropriately, and authorizes access to these groups conscientiously (e.g. which group can access which URLs). IMPECS trusts that the IM server checks publishing user A's permissions properly, and only sends tickets to authorized users. The web server is trusted to deliver A's content only after validating an appropriate access control ticket. The availability of usable site maintenance tools (e.g. HTML editing, file uploading) is also assumed for publishing users.

If a publishing user A's IM client offers a user interface for setting a personal URL (which is the norm in many IM clients, e.g., Yahoo! Messenger), we can use that to send the registration URL (containing the content key and restrictions), and thus may avoid changing A's IM client. A viewing user B's IM client can also remain the same if it offers viewing IM contacts' personal URLs (e.g. the 'View Profile' option in Yahoo! Messenger provides a 'Home Page' field in a profile webpage). We require only minor modifications to a web server through server-side scripts (assuming the server allows such scripts). The web server may optionally maintain a database of user-specific content keys; otherwise, the content key of a user must be stored in the user's private space on that web server.

For an IM server, enforcing restrictions (in ticket generation) is easy; the server already restricts text (and other request) messages sent to a user from any other IM users according to the receiving user's preferences. However, users must register their URLs with the IM server; most IM services currently enable users to register personal URLs on their profiles.
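The group-based authorization that the IM server is trusted to perform could be modelled along the following lines. The paper does not specify a data structure, so the `Publisher` class, its fields and the example URLs are purely hypothetical; the sketch only shows the check "B belongs to a group that A has authorized for this URL" applied before any ticket is issued.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Publisher:
    """Hypothetical per-user record kept by the IM server for A."""
    # contact group name -> IM IDs that A has placed in that group
    groups: Dict[str, Set[str]] = field(default_factory=dict)
    # registered URL -> groups that A has authorized to view it
    url_access: Dict[str, Set[str]] = field(default_factory=dict)

    def permitted_urls(self, viewer: str) -> List[str]:
        """URLs advertised to `viewer` along with the contact list."""
        viewer_groups = {
            name for name, members in self.groups.items() if viewer in members
        }
        return sorted(
            url for url, allowed in self.url_access.items()
            if allowed & viewer_groups
        )

    def may_request_ticket(self, viewer: str, url: str) -> bool:
        """Gate applied before the server generates a ticket."""
        return url in self.permitted_urls(viewer)


alice = Publisher(
    groups={"family": {"bob"}, "co-workers": {"carol"}},
    url_access={
        "example.org/alice/photos": {"family"},
        "example.org/alice/talks": {"family", "co-workers"},
    },
)
print(alice.permitted_urls("carol"))  # ['example.org/alice/talks']
```

A user who is in none of A's groups gets an empty list and is never issued a ticket, which matches the assumption that strangers cannot reach the content through Si.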
Leaking these URLs (without the corresponding content keys) will not by itself authorize access to any web content; they are inaccessible unless someone gets a valid ticket from the IM server.

Communication in most public IM networks (client-server and client-client) and web servers (client-server) is plaintext, although a password for authentication is generally sent over SSL. Note that our design involves the content key KAw (i.e. URLAR) being sent over SSL. An attacker with access to the communication link may eavesdrop on private content of a user when the user uploads content to the web server, or when content is served to a (valid) viewing user. Using a variant of IMPECS (see Section 4), or at the added cost of SSL, these attacks can be addressed.

Figure 3: A viewing URL instance in IMPECS. [Screenshot not reproduced.]

3. IMPLEMENTATION

In this section we discuss our prototype implementation, and computational and deployment costs of IMPECS. We implemented a prototype of IMPECS using the Extensible Messaging and Presence Protocol (XMPP [40, 41], based on the popular Jabber² IM protocol). As XMPP server and client, we chose jabberd2 [22] and Pidgin [34] (previously known as Gaim) respectively, on a Linux platform. For the cryptographic library, we use OpenSSL and the PHP mcrypt module; we use AES-CBC-128 for symmetric encryption, and /dev/urandom for random number generation. MySQL is used for database support. Our implementation source code for the prototype is available on request.

We assume that the publishing user A can run PHP scripts on the web server Sw. Sw also stores A's content sharing key in a database. We create a web folder for A on Sw which is accessible for writing (and viewing) when A logs into Sw. Other than login as A, for viewing any content of the folder, one must supply a ticket containing a valid timestamp (and IDAw) encrypted under A's content key.
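As a rough illustration of the key generation and registration URL from Section 2, the following sketch draws a 128-bit content key from the operating system's random source and assembles and parses URLAR in the form given there. The prototype itself uses PHP; the function names, the hex encoding of the key inside the URL, and the `ttl=60` restriction string are assumptions made for this example.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


def new_content_key() -> str:
    """Generate a 128-bit KAw (16 random bytes), hex-encoded."""
    return secrets.token_hex(16)


def build_registration_url(url_a: str, user_id: str, key_hex: str,
                           restrictions: str) -> str:
    """Sw side: URLAR = http://<URLA>/?userid=..&key=..&restrictions=.."""
    query = urlencode(
        {"userid": user_id, "key": key_hex, "restrictions": restrictions}
    )
    return f"http://{url_a}/?{query}"


def parse_registration_url(url_ar: str) -> dict:
    """Si side: recover URLA, IDAw, KAw and R for later ticket generation."""
    parts = urlsplit(url_ar)
    query = parse_qs(parts.query)
    return {
        "url_a": parts.netloc + parts.path.rstrip("/"),
        "user_id": query["userid"][0],
        "key": bytes.fromhex(query["key"][0]),
        "restrictions": query["restrictions"][0],
    }
```

Since URLAR carries the content key itself, it should only travel over protected channels, as the scheme assumes for the exchanges between A and Sw and between A and Si.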
We assume that system clocks of Si and Sw are (more or less) synchronized. Sw checks whether a requesting URL contains a valid ticket; we accept a timestamp as valid if it arrives within 60 seconds of being generated by Si. A and B also add each other to their respective contact lists.

XMPP uses the vCard [15] format for personal profile information storage, which facilitates advertising one's personal URL. We use this field in vCard for storing a user-specified URL, and added one field called content-key into the vCard table for storing a user's content sharing key (along with IDAw).³ Ideally an XMPP user can set vCard values from any XMPP client. However, as the Pidgin implementation we used (version 2.0.1) lacks any such user interface for setting vCard values, we directly inserted URLA and KAw into A's vCard table on the jabberd2 server database. For viewing a contact's vCard, a user can select the contact from the Pidgin contact list, and choose the "Get Info" option from the context menu. When Si receives such a request for A's vCard from B, Si retrieves A's content key KAw, and generates a ticket by encrypting the current time and IDAw with the key. Si then constructs a URL using URLA as the base, and IDAw and the (hexadecimal encoded) ticket as parameters. Figure 3 shows one example of Si's response to B. Then B can click on the link and view URLA, if validated by Sw.

² www.jabber.org
³ Instead of inserting the content-key field, IDAw and KAw could be embedded into the URL field, allowing Si to remain in conformance with the vCard standard.

Computational and deployment costs. In addition to retrieving A's vCard information from a database (as required by a regular XMPP server), IMPECS requires one symmetric-key encryption by Si. One symmetric-key decryption is required by Sw when a viewing URL is received (for ticket validation).
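The viewing URL that Si returns, with URLA as the base and IDAw and the hexadecimal-encoded ticket as parameters, might be assembled and unpacked as below. The ticket bytes are treated as opaque here, and the function names and example host are hypothetical rather than taken from the prototype.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlencode, urlsplit


def build_viewing_url(url_a: str, user_id: str, ticket: bytes) -> str:
    """Si side: URLAT = http://<URLA>/?userid=IDAw&ticket=<hex ticket>."""
    query = urlencode({"userid": user_id, "ticket": ticket.hex()})
    return f"http://{url_a}/?{query}"


def parse_viewing_url(url_at: str) -> Optional[Tuple[str, bytes]]:
    """Sw side: extract IDAw and the raw ticket, or None if malformed."""
    query = parse_qs(urlsplit(url_at).query)
    try:
        return query["userid"][0], bytes.fromhex(query["ticket"][0])
    except (KeyError, ValueError):
        # A missing parameter or non-hex ticket is treated as no ticket.
        return None


url = build_viewing_url("example.org/~alice/private", "alice", b"\x00\xffticket")
print(url)
# http://example.org/~alice/private/?userid=alice&ticket=00ff7469636b6574
```

Hex encoding keeps the binary ticket safe to paste into an IM text message, at the cost of doubling its length.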
Sw also must generate a 128-bit random number when A requests a registration URL (for the content key generation). These operations are relatively light-weight for the IM and web servers; no practical deployment barrier in terms of performance is expected.

In a distributed IM service such as XMPP or Windows Live/Yahoo! Messenger, where A and B may have accounts with different IM servers, IMPECS does not require any changes to B's server or client software. (Note that as of Feb. 2008, XMPP is supported by several large IM services, e.g., Google Talk, IBM Lotus Sametime, and AOL/ICQ.) We require changes to A's IM and web servers. The changes in Sw are mostly achieved through PHP scripts. A's content key and restrictions can be stored in a file under a private folder (on A's web space), or in a database if Sw provides database access. Also, B remains anonymous to Sw in IMPECS; i.e., B does not need an account at Sw for viewing A's content, as opposed to social networking websites (although a ticket is required in IMPECS). Note that all publishing users at Sw can reuse the same PHP scripts for our scheme; i.e., users are not required to write or modify the PHP scripts (these scripts may be provided by, e.g., Sw or the open-source community).

Why not implement IMPECS as a Facebook application? For ease of deployment, we could implement IMPECS in Facebook Platform⁴ or Google OpenSocial.⁵ Instead we chose to base our IMPECS design and implementation on IM for the following reason.

⁴ http://developers.facebook.com/
⁵ http://code.google.com/apis/opensocial/
We b elieve that storing relationship information and user data at the same site may undermine privacy; for example, a single entity then learns too much ab out users and may use that knowledge to launch unfriendly (in regard to users' privacy) campaigns such as targeted advertisements, sharing user data with government agencies and third-party businesses. This also makes such sites an attractive target to compromise. These threats are quite evident from the short history of Faceb ook and MySpace. IM networks have also b een targeted for malicious purp oses such as spreading worms and phishing URLs; however, such attacks generally compromise relationship information (i.e. email addresses) but not user content. April 21-25, 2008 · Beijing, China technique. The publishing user A may up date Kenc in a similar way to the content key KAw . However, an up date to Kenc does not mandate up dating KAw or vice-versa, and b oth key up dates are transparent to viewing users. 5. MOTIVATION, RELATED WORK AND COMPARISON TO IMPECS In this section we discuss existing and prop osed work related to p ersonal web publishing, and contrast the IMPECS scheme with these in terms of privacy and user convenience. Popular IM networks, e.g., Yahoo!, AOL, and Windows Live enable users to maintain a profile accessible as a webpage. Microsoft offers free web spaces for sharing p ersonal web content (e.g. profile, photos, blogs, guestb ook) through its Windows Live Spaces social networking website at www. spaces.live.com. Live Spaces is integrated with the Windows Live Messenger IM client. User A can control who may view her Live Spaces' webpage. A can invite friends to join the Windows Live Messenger network to view her content. A may authorize only her IM contacts (or a subset of the contacts) to view her space. Alternatively, A may make her space accessible to anyone on the web. 
If A's space is restricted to IM contacts, a contact B (from A's contact list) can login to Live Spaces using B 's Windows Live Messenger login credential for viewing A's space. If logged into the IM network, B can also select A's profile from a context menu from the Live Messenger client; from A's profile, B can access A's space without further authentication. Yahoo! is also extending its IM service to offer a social networking site called Mash (mash.yahoo.com).6 However, in either case, similar to the common social networking practice (e.g. as in Faceb ook or MySpace), B must join A's network to view any access-restricted content. In contrast, when using IMPECS, B does not need to know where his (IMPECS-enabled) IM contacts host their content. To partially relieve users from the necessity of creating multiple web credentials, Microsoft p ermits third-party businesses to use its Windows Live ID Web Authentication7 (previously known as Microsoft Passp ort). Similarly, Yahoo! offers the Browser-Based Authentication8 (BBAuth) service that enables third-party web applications to b e authenticated through widely used Yahoo! IDs. Op enID (openid. net) is an initiative from the op en source community to unify online authentication, also reducing the burden of creating multiple web credentials. AOL has enabled the use of Op enID (through openid.aol.com) for its IM service and AOL Pages social network. Op enID can also b e used for Yahoo! login (through openid.yahoo.com). Lib erty Alliance (projectliberty.org) is another `holistic' approach to establish an op en standard for online identity. If any such unified identification framework b ecomes widely accepted in the long-run, IMPECS would b ecome even more app ealing (e.g. through a common login credential). 
However, IMPECS does not address user authentication across websites p er se, but rather focuses on how the existing trust network and interactiveness of a p opular service like IM can b e leveraged to offer privacy-enhanced p ersonal content sharing on the web. As of Feb. 10, 2008, this is an invitation-only `b eta' service. http://msdn2.microsoft.com/en- us/library/ bb676633.aspx 8 http://developer.yahoo.com/auth/ 7 6 4. A VARIANT OF IMPECS In this section, we briefly outline a variant of IMPECS that can prevent malware-spread from a compromised web hosting provider. We have not implemented this variant yet. Some large hosting providers (e.g. godaddy.com) currently facilitate web hosting for thousands of p ersonal and corp orate sites. If many IMPECS users host their content at such a provider, a successful attack against the provider might p ossibly affect all those IMPECS users. The compromised user sites could b e used for malicious purp oses, e.g., hosting malware for drive-by-downloads [36, 47]. This could b e particularly bad for IMPECS users as private URLs as shared through IMPECS may app ear to b e more trustworthy. Here we outline a prop osal that can guard against such en masse exploits. Additional steps during URL registration. The following additional steps are required from a publishing user. 1. A uses a local application (in-browser JavaScript plugin or an indep endent content editing application) to generate an encryption key Kenc , 128 bits long. A then uses Kenc to encrypt her p ersonal files and upload the result (i.e. {dataf iles}Kenc ) to the web server Sw . This is done at the b eginning of step 2 in URL registration of IMPECS (see Fig. 1 in Section 2). 2. A app ends Kenc to the registration URL received from Sw b efore sending the URL to the IM server Si . This is done at the end of step 3 in URL registration of IMPECS (see Fig. 1 in Section 2). Additional steps for a viewing user. 
The following additional steps (although transparent) are required from a viewing user.

1. When Si generates URLAT (step 3 in Fig. 2; see Section 2), it also appends Kenc to the URL as a URL fragment, i.e., #Kenc. When B visits this URL, URLAT is forwarded to Sw but not the fragment, i.e., Sw does not receive Kenc (cf. [2]).
2. In step 5 (see Fig. 2 in Section 2), Sw sends the requested (encrypted) content. B's browser uses Kenc as received from Si to display the decrypted content.

The encryption key Kenc is not accessible to Sw at any time. Thus by compromising Sw, an attacker cannot control what is served to the visiting IMPECS users. Note, however, that regular visitors to such a site are not protected by this

Most IM networks offer file sharing from user machines, generally through custom-built file transfer protocols. An IM user can restrict which contacts in her IM contact list can access the shared files. However, IM file transfer protocols may not work in some cases (e.g. due to firewall restrictions), and a publishing user must be online to make her files available to others.

YouServ [9] is an end-user P2P application designed by IBM to enable people to easily share personal content (e.g. photos, music, presentations, work documents) with little to no cost (footnote 9). Instead of a specialized P2P protocol, all YouServ content is served through standard web protocols (i.e. DNS with HTTP). An implementation of YouServ was used by thousands of users internally at IBM and Carnegie Mellon University (apparently the web interface for this service at YouServ.com is now defunct). YouServ requires two centralized components called YouServ Coordinator (for authentication and peer coordination) and YouServ Dynamic DNS (for finding a peer site's dynamic IP address).
A user's YouServ content remains available even when the user's PC is offline (through a peer-hosted site), or firewalled (through a proxy site). Authentication is provided using a single sign-on password scheme (valid for any YouServ site). Access to any specific file can be limited to certain members of the YouServ community. Using YouServ, Bayardo et al. [8] proposed a technique to make IM file transfer easier by making local files available through transient web links; the web link of a file is sent to the recipient simply as an IM text message. In contrast to YouServ, publishing users in IMPECS make their personal content available from a third-party hosting site (as is the current common practice) instead of their own PC (or any of their peers' PCs).

The popularity of social networking websites, e.g., Facebook, MySpace, Twitter, Bebo, is apparently comparable to the early years of large-scale IM networks. By joining Facebook or MySpace, users can search for and connect with friends, and share personal content such as photos, videos, blogs, contact information, and preferences. In Facebook, users generally locate friends from groups, e.g., classmates from the same school or university, co-workers, geographical locations. MySpace generally categorizes user groups by interests, e.g., music, photography. To add to the interactive power of IM, MySpace offers its own IM client called MySpaceIM (accessible only to MySpace users). Facebook has also recently (Oct. 2007) added IM capability through the FriendVox browser-based IM client. Twitter enables users to send short messages to selected friends through the web, SMS messages, or IM. Most social networking sites enable limited access control through explicitly creating a "friends' list." The online photo-sharing website Flickr offers creation of a list of friends through Yahoo! login credentials. Other photo-sharing websites such as Shutterfly offer similar privacy-enhancing mechanisms.
We discuss the effectiveness of such access control mechanisms below.

Footnote 9: Note that when this research [9] was published in 2002, the cost of hosting a personal website at a third-party hosting company was much higher than today.

Privacy issues in social networking websites. Although social networking sites enable publishing users to partially restrict access to their personal content, privacy concerns are emerging quickly regarding the use of these networks. People have been denied or lost jobs because of their comments on MySpace or Facebook profiles (e.g. [32, 33]), a grocery chain dismissed employees for comments on Facebook (e.g. [19]), and students were suspended for their Facebook comments (e.g. [13]). Government agencies such as the CIA are suspected of tracking users with special interests (e.g. [35]); apparently under the U.S. Patriot Act, state agencies can look into a job interviewee's Facebook profile, even if the profile is "privacy-protected," i.e., permitted to be viewed only by the publisher's circle of friends (e.g. [28]). If a user removes content from his/her profile that may be deemed offensive or was posted as a momentary emotional response, or even if the user deletes the entire profile, personal content may still reside in (incremental) archives for a long time (cf. [27]).

Many users of social networking sites keep their profiles and friends list publicly accessible. A user survey [29] of social networking websites reported that 74% of adult users of those sites exposed their personal information such as email address, name, birthday, home and work address, and even Social Security Number (SSN). Only 39% of respondents chose to restrict their personal profiles only to friends. Initial results from another survey [48] of Facebook users reported that 67% of the participants kept their personal profile open for all.
Another study [37] of the LinkedIn social networking website (used mostly for business purposes, e.g., to find potential clients, service providers, business opportunities, job listings) reported that people generally expose detailed and (possibly) confidential information on their profiles. Dwyer et al. [16] compared information disclosure and perceptions of trust and privacy in an online survey of Facebook and MySpace users. Facebook users were reported to reveal more identifying information than MySpace users. For example, real name, email address, and IM screen name were disclosed by 100%, 94%, and 71% of Facebook users respectively (in contrast to 66.7%, 40%, and 49.8% of MySpace users respectively).

Gross and Acquisti [21] investigated patterns of personal information revelation and associated privacy implications using more than 4,000 publicly available Carnegie Mellon University (CMU) users' Facebook profiles. Most users provided (seemingly highly accurate) personal information including profile image, full birth date, hometown, current residence, and phone number. Personal preferences, interests, and political views were also disclosed by the majority of CMU users. Although Facebook offers privacy control, most users did not change the default privacy preferences, which grant access to a user's full profile to any member of the user's groups/networks (e.g. place, institution, interest); only three CMU users' profiles (0.06%) were precluded from view by unconnected users (i.e. not a friend or friend-of-a-friend). Based on the revealed personal information, the authors outlined a number of privacy implications including online and real-world stalking, digital dossiers of participants (compiled by any third party), and demographics and face re-identification (i.e. relating seemingly anonymous data to explicitly identifying information).
The authors also discussed how a user's SSN may be estimated from disclosed birth date, hometown, current residence and phone number. A similar study [17] on 20,000 MySpace user profiles reported that 68% of users kept their personal profiles open for all. Almost half of a randomly selected group of 1000 users provided global access to all elements of their personal profile. Rosenblum [39] analyzed privacy risks of social networking sites, including privacy options as provided by major networking sites and limitations of such privacy settings. In addition to highlighting privacy issues of social networking sites, Barnes [7] emphasizes that a significant educational effort from parents, schools, social networking sites, and government agencies is required to address the emerging privacy issues related to these sites.

Jagatic et al. [23] collected publicly available "circles of friends" data from several social networking websites by using web crawlers; this enabled the researchers to quickly build a database of tens of thousands of relationships. When a (benign) phishing attack was launched by using the collected social network database, 72% of social networking targets fell victim to the phishing attack, while only 16% of regular users were fooled by the attack. In fact, social networking websites are specifically being targeted for launching context-aware phishing attacks (see e.g. [30, 44, 5, 50]), spreading spyware [12] and malware [45], and even for building botnets [43]. Cross-site scripting flaws in the MySpace website have been reported [49] in the past which could have been exploited to disclose even privacy-protected user content. Social networking websites with personal details of millions of users would also seem to be lucrative targets for online attackers (e.g. for targeted phishing or identity theft), and government agencies (e.g. for tracking citizens' digital identities).
Equifax, a leading consumer credit reporting firm, has recently (July 2007) warned [38] that user profiles on social networking sites are a "goldmine" for ID thieves. MySpace acknowledged [1] that as of July 2007, it had removed more than 29,000 registered sex offenders' profiles from the MySpace website, indicating that criminals with other than monetary motives are also exploiting the abundance of personal information freely available at social networking sites.

Ahern et al. [3] examined privacy decisions in mobile and online photo sharing using Flickr. Most interviewed users in the study showed little or no concern regarding exposure of aggregated contextual information, e.g., time, location (embedded with some uploaded photo files), arising from their photo-sharing habits. In addition to manual photo-tagging as offered by common photo-sharing websites such as Flickr and Shutterfly, Polar Rose (www.polarrose.com) uses facial recognition algorithms for tagging unknown images of a subject if there is a tagged image of the subject in Polar Rose's image database (see [4] regarding the inadequacy of current privacy laws in this regard). Search engines customized for finding personal profiles posted at different websites, e.g., Spock (www.spock.com), may provide even easier access to personal web content. Since September 2007, Facebook has allowed non-members to search for user profiles that are not access-restricted; third-party search engines such as Google and Yahoo! are also authorized to index such profiles (as of Feb. 2008).

Convenience and usability of IMPECS. IM contact lists are already in place for IM users, whereas social networking sites require users to invite friends and family members through, e.g., email to join a user's "friends' list"; sometimes these standardized, impersonal invitation emails simply irritate the recipients. IM is more interactive than social networking sites despite the immense recent popularity of those sites.
For many IM users, IM clients start automatically after users log into their PC, and many IM users remain signed on to an IM network as long as they use their computer. Social networking sites require a user to open a web browser, load a site, and sign into that site for maintenance or to view a friend's profile. IM users can view and control more effectively what content is being shared at any given time; information regarding who viewed what, and how frequently, may also be gathered from the IM server's ticket-issuing statistics. We believe the following factors make IMPECS appealing.

The viewing user B's role in IMPECS is simplified in comparison to the current social networking practice. B need only log into his IM client, and select an intended contact's URL for viewing. In contrast to social networking sites, B can remain unaware of who hosts his contact's web content. B need not even store or memorize A's URL; in fact, a bookmarked URL may not work depending on A's restrictions. However, B must realize that private URLs as shared through IMPECS are different from regular static URLs.

The publishing user A's content sharing key KAw must be shared between Si and Sw. This can be accomplished by any of the following means (in increasing order of convenience): (i) A manually copies the registration URL (containing KAw) from Sw to Si using an interface provided by her IM client; (ii) Sw forms an XMPP URI (xmpp: [42]) embedding the key with URLA, and A activates the URI (e.g. by a mouse click) to be processed by a locally installed XMPP client (footnote 10); the client sends URLA and KAw to Si; or, (iii) Sw forwards KAw to Si if there exists a pre-established relationship between the servers. A content key update is also similar to updating a URL link at Si. To revoke B's viewing permission, A can simply place B in a separate IM contact group which does not have access to URLA (or remove B from her contact list).
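As an illustration of option (ii), the following Python sketch shows one way Sw could embed URLA and KAw in an xmpp: URI, and how a locally installed client could recover both values before relaying them to Si. RFC 4622 [42] defines the general xmpp: URI syntax (a query type followed by semicolon-separated key=value pairs), but the query type impecs-register, the field names url and key, the hex encoding of the key, and the host names used here are illustrative assumptions, not the format of the prototype.

```python
from urllib.parse import quote, unquote

# Hypothetical query type: RFC 4622 fixes the xmpp: URI syntax, but this
# particular action name and its fields are assumptions for illustration.
QUERY_TYPE = "impecs-register"


def build_registration_uri(im_server: str, url_a: str, key_hex: str) -> str:
    """Form an xmpp: URI carrying URL_A and K_Aw (as would be done at S_w)."""
    pairs = [("url", url_a), ("key", key_hex)]
    # Percent-encode every reserved character so that ';' and '=' inside
    # a value cannot be confused with the pair delimiters.
    query = ";".join(f"{name}={quote(value, safe='')}" for name, value in pairs)
    return f"xmpp:{im_server}?{QUERY_TYPE};{query}"


def parse_registration_uri(uri: str) -> dict:
    """Recover the target server, URL_A and K_Aw (as would be done by the client)."""
    scheme, _, rest = uri.partition(":")
    if scheme != "xmpp":
        raise ValueError("not an xmpp: URI")
    target, _, query = rest.partition("?")
    query_type, *pairs = query.split(";")
    if query_type != QUERY_TYPE:
        raise ValueError("unexpected query type")
    fields = {"server": target}
    for pair in pairs:
        name, _, value = pair.partition("=")
        fields[name] = unquote(value)
    return fields
```

For example, build_registration_uri("im.example.org", "https://www.example.com/alice/", "00ff") yields a single clickable link; a client registered as the xmpp: handler can pass it to parse_registration_uri and forward the recovered url and key fields to Si. Because the key travels inside the URI, such a link should be treated as sensitive as the key itself.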
Thus it is natural to expect that IMPECS is more convenient than current content sharing/limiting techniques on the web (e.g. password protection, obscure links). However, we hesitate to make any stronger usability claims without formal user testing (cf. [14]).

Once published on the Internet, private content may become permanent, e.g., through archived search engine queries and web crawlers [27]; in essence, the Internet does not forget anything published on it, although much of the personal information on the web (e.g. blogs, emotional responses, criticisms of friends and authorities) is meant to be transient. Unfortunately, momentary emotional responses to an event, if posted as text or image on the publicly accessible Internet, may bring unpleasant consequences at a later time. Our approach can enhance "forgetfulness" of the web by not making personal content public in the first place (cf. [11]). Web pages meant for certain personal contacts, friends and family will remain among the pre-established circle of trust as long as none of the trusted IM contacts make copies of a web page and republish it on the public Internet.

Footnote 10: Most popular IM protocols provide custom URI handlers, e.g., ymsgr: (Yahoo! Messenger), aim: (AOL Instant Messenger).

6. CONCLUDING REMARKS

Privacy is typically violated as a consequence of any of a number of factors. These seem to include: (i) oppressive administrations or large corporations (sometimes by exploiting the common misconception of "I've got nothing to hide" [46]); (ii) a shortage of usable tools to guard online privacy; (iii) apathy towards privacy; and (iv) a misunderstanding of the implications of lost privacy. In our opinion, easy access to usable privacy tools may change the actions of ordinary web users towards online privacy; IMPECS is designed to be such a tool to enhance privacy of personal web content (i.e. we focus on addressing factor (ii) as listed above). We leverage the existing circles of trust among IM contacts, as well as encourage further refinements of trust in popular IM networks. Unlike with current social networking websites, users do not need to (re-)build a "friends' list" in parallel to IM contact lists. In addition, users can publish their content at any website of their choice, and still be able to maintain privacy of their content (without being limited to a particular social networking site). Note that the general idea behind IMPECS extends beyond IM and IM circles of trust; any equivalent scheme, (ideally) containing pre-arranged groups, could similarly be leveraged (cf. Liberty Alliance People Service [24]).

As reported in a user survey [29], even most adult users of social networking websites keep their personal profiles open for all. We believe that such behaviour results largely from practical issues such as difficulties in ensuring close contacts join the same social networking site as the publishing user (just to view a friend's profile), or simply ignorance of the privacy implications of posting personal details on the Internet. IM is a very popular Internet application with a greater user base than social networking sites. Distributed IM services such as XMPP and Windows Live/Yahoo! networks enable IM communication between users of different IM networks. Therefore, we believe that IMPECS has significant deployment advantages over other personal content sharing techniques (e.g. password protection). By restricting personal content to a closed group of IM contacts, we believe IMPECS reduces opportunities for launching context-aware, targeted phishing attacks [30, 44, 50], where fraudsters collect the social context of a target victim from seemingly innocuous unprotected personal data, and enhances "forgetfulness" [27] of transient personal content on the web.
Acknowledgements

We thank anonymous reviewers for their comments and members of Carleton's Digital Security Group for enthusiastic discussion on this topic. The first author is supported in part by an NSERC CGS. The second author is Canada Research Chair in Network and Software Security, and is supported in part by an NSERC Discovery Grant, and the Canada Research Chairs Program.

7. REFERENCES

[1] ABC News. MySpace finds 29,000 sex offenders. News article (July 25, 2007). http://www.abcnews.go.com/Technology/wireStory?id=3409947.
[2] B. Adida. Beamauth: Two-factor web authentication with a bookmark. In ACM Computer and Communications Security (CCS), 2007.
[3] S. Ahern, D. Eckles, N. Good, S. King, M. Naaman, and R. Nair. Over-exposed? Privacy patterns and considerations in online and mobile photo sharing. In ACM Computer/Human Interaction (CHI), 2007.
[4] Anonymous. In the face of danger: Facial recognition and the limits of privacy law. Harvard Law Review, 120(7), May 2007.
[5] Anti-Phishing Working Group. Phishing activity trends report for April 2007. http://www.antiphishing.org/reports/apwg_report_april_2007.pdf.
[6] ArsTechnica.com. Yahoo Messenger and Windows Live Messenger get together. News article (Sep. 27, 2006). http://arstechnica.com/news.ars/post/20060927-7846.html.
[7] S. B. Barnes. A privacy paradox: Social networking in the United States. First Monday: Peer-reviewed Journal on the Internet, 11(9), 2006.
[8] R. J. Bayardo and S. Thomschke. Exploiting the web for point-in-time file sharing (poster). In World Wide Web (WWW) Conference, 2005.
[9] R. J. Bayardo Jr., R. Agrawal, D. Gruhl, and A. Somani. YouServ: A web hosting and content sharing tool for the masses. In World Wide Web (WWW) Conference, 2002.
[10] M. Bellare and C. Namprempre. Authenticated encryption: Relations among notions and analysis of the generic composition paradigm. In AsiaCrypt, 2000.
[11] N. Borisov, I. Goldberg, and E. Brewer. Off-the-record communication, or, why not to use PGP. In ACM Workshop on Privacy in the Electronic Society (WPES), 2004.
[12] BusinessWeek. Social-networking sites a 'hotbed' for spyware. News article (Aug. 18, 2006). http://www.msnbc.msn.com/default.aspx/id/14413906/.
[13] CBC.ca. 4 charged after school protest over Facebook suspensions. News article (Mar. 23, 2007). http://www.cbc.ca/canada/toronto/story/2007/03/23/protest-birchmount.html.
[14] S. Chiasson, P. van Oorschot, and R. Biddle. A usability study and critique of two password managers. In USENIX Security, 2006.
[15] F. Dawson and T. Howes. vCard MIME directory profile, 1998. RFC 2426, Status: Standards Track.
[16] C. Dwyer, S. Hiltz, and K. Passerini. Trust and privacy concern within social networking sites: A comparison of Facebook and MySpace. In Americas Conference on Information Systems (AMCIS), Keystone, Colorado, USA, Aug. 2007.
[17] R. Feizy. An evaluation of identity on online social networking: MySpace (poster). In ACM Hypertext and Hypermedia (HT), 2007.
[18] J. Franks, P. Hallam-Baker, J. Hostetler, S. Lawrence, P. Leach, A. Luotonen, and L. Stewart. HTTP authentication: Basic and digest access authentication, June 1999. RFC 2617, Status: Standards Track.
[19] M. Geist. Facing up to Facebook fears. BBC news article (May 9, 2007). http://news.bbc.co.uk/2/hi/technology/6639417.stm.
[20] V. D. Gligor and P. Donescu. Fast encryption and authentication: XCBC encryption and XECB authentication modes. In Workshop on Fast Software Encryption, Yokohama, Japan, Apr. 2001.
[21] R. Gross and A. Acquisti. Information revelation and privacy in online social networks. In ACM Workshop on Privacy in the Electronic Society (WPES), 2005.
[22] jabberd project. jabberd2 XMPP server. Version 2.1.6. http://jabberd.jabberstudio.org/2/.
[23] T. Jagatic, N. Johnson, M. Jakobsson, and F. Menczer. Social phishing. Communications of the ACM, 50(10), Oct. 2007.
[24] Liberty Alliance. Liberty ID-WSF People Service - federated social identity. White paper (Dec. 5, 2005). http://www.projectliberty.org.
[25] M. Mannan and P. C. van Oorschot. Secure public instant messaging: A survey. In Privacy, Security and Trust (PST), Fredericton, NB, Canada, Oct. 2004.
[26] M. Mannan and P. C. van Oorschot. A protocol for secure public instant messaging. In Financial Cryptography and Data Security (FC), Anguilla, British West Indies, 2006.
[27] V. Mayer-Schönberger. Useful void: The art of forgetting in the age of ubiquitous computing. Harvard KSG Faculty Research Working Paper Series, article number RWP07-022, Apr. 2007.
[28] NACE Spotlight Online. The issues surrounding college recruiting and social networking web sites. News article (June 22, 2006). http://career.studentaffairs.duke.edu/undergrad/find_job/consider/nace_socialnetworks.html.
[29] National Cyber Security Alliance. CA/NCSA social networking cyber security survey. Online article (Sep. 2006). http://staysafeonline.org/features/SocialNetworkingReport.ppt.
[30] Netcraft.com. MySpace accounts compromised by phishers. News article (Oct. 27, 2006). http://news.netcraft.com/archives/2006/10/27/myspace_accounts_compromised_by_phishers.html.
[31] B. C. Neuman and T. Ts'o. Kerberos: An authentication service for computer networks. IEEE Communications, 32(9), Sept. 1994.
[32] New York Times. For some, online persona undermines a résumé. News article (June 11, 2006). http://www.nytimes.com/2006/06/11/us/11recruit.html.
[33] New York Times. How to lose your job on your own time. News article (Dec. 30, 2007). http://www.nytimes.com/2007/12/30/business/30digi.html.
[34] Pidgin project. Pidgin: A multi-protocol IM client. Version 2.0.1. http://www.pidgin.im/.
[35] PrisonPlanet.com. The Facebook.com: Big brother with a smile. News article (June 9, 2005). http://www.prisonplanet.com/articles/june2005/090605thefacebook.htm.
[36] N. Provos, D. McNamee, P. Mavrommatis, K. Wang, and N. Modadugu. The ghost in the browser: Analysis of web-based malware. In USENIX HotBots, 2007.
[37] D. Rand. Threats when using online social networks. CSIS Security Group (a Danish IT security company; article published on May 16, 2007). http://www.csis.dk/dk/forside/LinkedIn.pdf.
[38] Reuters UK. Networking sites a goldmine for ID fraudsters. News article (July 19, 2007). http://uk.reuters.com/article/personalFinanceNews/idUKHIL95513120070719.
[39] D. Rosenblum. What anyone can know: The privacy risks of social networking sites. IEEE Security and Privacy, 5(3), May 2007.
[40] P. Saint-Andre. Extensible messaging and presence protocol (XMPP): Core, Oct. 2004. RFC 3920, Status: Standards Track.
[41] P. Saint-Andre. Extensible messaging and presence protocol (XMPP): Instant messaging and presence, 2004. RFC 3921, Status: Standards Track.
[42] P. Saint-Andre. Internationalized resource identifiers (IRIs) and uniform resource identifiers (URIs) for the extensible messaging and presence protocol (XMPP), July 2006. RFC 4622, Status: Standards Track.
[43] SANS Internet Storm Center. MySpace phish and drive-by attack vector propagating Fast Flux network growth. SANS handler's diary (June 26, 2007). http://isc.sans.org/diary.html?storyid=3060.
[44] SecurityFocus.com. Image attack on MySpace boosts phishing exposure. News article (June 11, 2007). http://www.securityfocus.com/brief/522.
[45] SecurityFocus.com. QuickTime worm uses MySpace to spread. News article (Apr. 12, 2006). http://www.securityfocus.com/brief/375.
[46] D. J. Solove. 'I've got nothing to hide' and other misunderstandings of privacy. San Diego Law Review, 44, 2007.
[47] StopBadware.org. StopBadware.org identifies companies hosting large numbers of websites that can infect internet users with badware. Press release (May 3, 2007). http://www.stopbadware.org/home/pr_050307.
[48] K. Strater and H. Richter. Examining privacy and disclosure in a social networking community (poster). In Symposium on Usable Privacy and Security (SOUPS), Pittsburgh, PA, USA, July 2007.
[49] Toronto Star. Social networking sites hacker targets. News article (Aug. 3, 2007). http://www.thestar.com/sciencetech/Technology/article/243096.
[50] Wired.com. Fraudsters target Facebook with phishing scam. News article (Jan. 3, 2008). http://www.wired.com/politics/security/news/2008/01/facebook_phish.
[51] Wired.com. Private Facebook pages are not so private. News article (June 28, 2007). http://www.wired.com/software/webservices/news/2007/06/facebookprivacysearch.