TY - JOUR T1 - Estimating Functional Agent-Based Models: An Application to Bid Shading in Online Markets Format JF - Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2011) Y1 - 2011 A1 - Guo,Wei A1 - Jank,Wolfgang A1 - Rand, William KW - Agent-based modeling KW - business KW - Calibration KW - Genetic algorithms KW - internet auctions KW - simulation AB - Bid shading is a common strategy in online auctions to avoid the "winner’s curse". While almost all bidders shade their bids, at least to some degree, it is impossible to infer the degree and volume of shaded bids directly from observed bidding data. In fact, most bidding data only allows us to observe the resulting price process, i.e. whether prices increase fast (due to little shading) or whether they slow down (when all bidders shade their bids). In this work, we propose an agent-based model that simulates bidders with different bidding strategies and their interaction with one another. We calibrate that model (and hence estimate properties about the propensity and degree of shaded bids) by matching the emerging simulated price process with that of the observed auction data using genetic algorithms. From a statistical point of view, this is challenging because we match functional draws from simulated and real price processes. We propose several competing fitness functions and explore how the choice alters the resulting ABM calibration. We apply our model to the context of eBay auctions for digital cameras and show that a balanced fitness function yields the best results. UR - http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1846639 ER - TY - CONF T1 - When does simulated data match real data? T2 - Proceedings of the 13th annual conference companion on Genetic and evolutionary computation Y1 - 2011 A1 - Stonedahl,Forrest A1 - Anderson,David A1 - Rand, William KW - Agent-based modeling KW - business KW - Calibration KW - Genetic algorithms KW - information search KW - network analysis AB - Agent-based models can replicate real-world patterns, but finding parameters that achieve the best match can be difficult. To validate a model, a real-world dataset is often divided into a training set (to calibrate the parameters) and a test set (to validate the calibrated model). The difference between the training and test data and the simulated data is determined using an error measure. In the context of evolutionary computation techniques, the error measure also serves as a fitness function, and thus affects evolutionary search dynamics. We survey the effect of five different error measures on both a toy problem and a real world problem of matching a model to empirical online news consumption behavior. We use each error measure separately for calibration on the training dataset, and then examine the results of all five error measures on both the training and testing datasets. We show that certain error measures sometimes serve as better fitness functions than others, and in fact using one error measure may result in better calibration (on a different measure) than using the different measure directly. For the toy problem, the Pearson's correlation measure dominated all other measures, but no single error measure was Pareto dominant for the real world problem. JA - Proceedings of the 13th annual conference companion on Genetic and evolutionary computation T3 - GECCO '11 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0690-4 UR - http://doi.acm.org/10.1145/2001858.2001988 M3 - 10.1145/2001858.2001988 ER - TY - CONF T1 - Evolving viral marketing strategies T2 - Proceedings of the 12th annual conference on Genetic and evolutionary computation Y1 - 2010 A1 - Stonedahl,Forrest A1 - Rand, William A1 - Wilensky,Uri KW - Agent-based modeling KW - business KW - diffusion KW - Genetic algorithms KW - simulation KW - social networks KW - viral marketing AB - One method of viral marketing involves seeding certain consumers within a population to encourage faster adoption of the product throughout the entire population. However, determining how many and which consumers within a particular social network should be seeded to maximize adoption is challenging. We define a strategy space for consumer seeding by weighting a combination of network characteristics such as average path length, clustering coefficient, and degree. We measure strategy effectiveness by simulating adoption on a Bass-like agent-based model, with five different social network structures: four classic theoretical models (random, lattice, small-world, and preferential attachment) and one empirical (extracted from Twitter friendship data). To discover good seeding strategies, we have developed a new tool, called BehaviorSearch, which uses genetic algorithms to search through the parameter-space of agent-based models. This evolutionary search also provides insight into the interaction between strategies and network structure. Our results show that one simple strategy (ranking by node degree) is near-optimal for the four theoretical networks, but that a more nuanced strategy performs significantly better on the empirical Twitter-based network. We also find a correlation between the optimal seeding budget for a network, and the inequality of the degree distribution. JA - Proceedings of the 12th annual conference on Genetic and evolutionary computation T3 - GECCO '10 PB - ACM CY - New York, NY, USA SN - 978-1-4503-0072-8 UR - http://doi.acm.org/10.1145/1830483.1830701 M3 - 10.1145/1830483.1830701 ER - TY - CONF T1 - Automated cluster-based Web service performance tuning T2 - 13th IEEE International Symposium on High performance Distributed Computing, 2004. Proceedings Y1 - 2004 A1 - Chung,I. -H A1 - Hollingsworth, Jeffrey K KW - Active Harmony system KW - automated performance tuning KW - business KW - cluster-based Web service system KW - Clustering algorithms KW - Computer science KW - Educational institutions KW - electronic commerce KW - Internet KW - Middleware KW - performance evaluation KW - scalability KW - Throughput KW - Transaction databases KW - Web server KW - Web services KW - workstation clusters AB - Active harmony provides a way to automate performance tuning. We apply the Active Harmony system to improve the performance of a cluster-based web service system. The performance improvement cannot easily be achieved by tuning individual components for such a system. The experimental results show that there is no single configuration for the system that performs well for all kinds of workloads. By tuning the parameters, Active Harmony helps the system adapt to different workloads and improve the performance up to 16%. For scalability, we demonstrate how to reduce the time when tuning a large system with many tunable parameters. Finally an algorithm is proposed to automatically adjust the structure of cluster-based web systems, and the system throughput is improved up to 70% using this technique. JA - 13th IEEE International Symposium on High performance Distributed Computing, 2004. Proceedings PB - IEEE SN - 0-7695-2175-4 M3 - 10.1109/HPDC.2004.1323484 ER - TY - CONF T1 - Strategies for exploring large scale data T2 - Parallel Architectures, Algorithms and Networks, 2004. Proceedings. 7th International Symposium on Y1 - 2004 A1 - JaJa, Joseph F. KW - algorithms; KW - association; KW - asymptotic KW - bounds; KW - business KW - data KW - data; KW - database KW - databases; KW - demographic KW - discovery; KW - Indexing KW - indexing; KW - information KW - knowledge KW - large KW - linear KW - mining; KW - multidimensional KW - objects; KW - optimal KW - Parallel KW - pattern KW - processing; KW - query KW - querying; KW - range KW - scale KW - scientific KW - search KW - serial KW - series KW - series; KW - simulation KW - space KW - structure; KW - structures; KW - techniques; KW - temporal KW - TIME KW - value KW - very KW - window; AB - We consider the problem of querying large scale multidimensional time series data to discover events of interest, test and validate hypotheses, or to associate temporal patterns with specific events. This type of data currently dominates most other types of available data, and will very likely become even more prevalent in the future given the current trends in collecting time series of business, scientific, demographic, and simulation data. The ability to explore such collections interactively, even at a coarse level, will be critical in discovering the information and knowledge embedded in such collections. We develop indexing techniques and search algorithms to efficiently handle temporal range value querying of multidimensional time series data. Our indexing uses linear space data structures that enable the handling of queries in I/O time that is essentially the same as that of handling a single time slice, assuming the availability of a logarithmic number of processors as a function of the temporal window. A data structure with provably almost optimal asymptotic bounds is also presented for the case when the number of multidimensional objects is relatively small. These techniques improve significantly over standard techniques for either serial or parallel processing, and are evaluated by extensive experimental results that confirm their superior performance. JA - Parallel Architectures, Algorithms and Networks, 2004. Proceedings. 7th International Symposium on M3 - 10.1109/ISPAN.2004.1300447 ER - TY - JOUR T1 - Rover: scalable location-aware computing JF - Computer Y1 - 2002 A1 - Banerjee,S. A1 - Agarwal,S. A1 - Kamel,K. A1 - Kochut, A. A1 - Kommareddy,C. A1 - Nadeem,T. A1 - Thakkar,P. A1 - Trinh,Bao A1 - Youssef,A. A1 - Youssef, M. A1 - Larsen,R.L. A1 - Udaya Shankar,A. A1 - Agrawala, Ashok K. KW - amusement KW - application-specific KW - architecture; KW - automation; KW - business KW - business; KW - computing; KW - data KW - entertainment; KW - handheld KW - humanities; KW - integration KW - LAN; KW - location-aware KW - malls; KW - mobile KW - museums; KW - office KW - parks; KW - processing; KW - resource KW - Rover; KW - scalability; KW - scalable KW - scheduling; KW - shopping KW - software KW - system KW - theme KW - units; KW - user; KW - wireless AB - All the components necessary for realizing location-aware computing are available in the marketplace today. What has hindered the widespread deployment of location-based systems is the lack of an integration architecture that scales with user populations. The authors have completed the initial implementation of Rover, a system designed to achieve this sort of integration and to automatically tailor information and services to a mobile user's location. Their studies have validated Rover's underlying software architecture, which achieves system scalability through high-resolution, application-specific resource scheduling at the servers and network. The authors believe that this technology will greatly enhance the user experience in many places, including museums, amusement and theme parks, shopping malls, game fields, offices, and business centers. They designed the system specifically to scale to large user populations and expect its benefits to increase with them. VL - 35 SN - 0018-9162 CP - 10 M3 - 10.1109/MC.2002.1039517 ER - TY - CONF T1 - Dynamic queries and brushing on choropleth maps T2 - Fifth International Conference on Information Visualisation, 2001. Proceedings Y1 - 2001 A1 - Dang,G. A1 - North,C. A1 - Shneiderman, Ben KW - brushing KW - business KW - cartography KW - choropleth maps KW - color coding KW - colour graphics KW - complex data sets KW - Computational Intelligence Society KW - Computer science KW - data visualisation KW - Data visualization KW - demographic data KW - Demography KW - Dynamaps KW - dynamic queries KW - economic data KW - Educational institutions KW - geographic information system KW - geographic information systems KW - Joining processes KW - map representations KW - map-based information visualization KW - Query processing KW - Scattering KW - scatterplot view KW - tabular representations KW - user interface KW - User interfaces KW - World Wide Web AB - Users who must combine demographic, economic or other data in a geographic context are often hampered by the integration of tabular and map representations. Static, paper-based solutions limit the amount of data that can be placed on a single map or table. By providing an effective user interface, we believe that researchers, journalists, teachers, and students can explore complex data sets more rapidly and effectively. This paper presents Dynamaps, a generalized map-based information visualization tool for dynamic queries and brushing on choropleth maps. Users can use color coding to show a variable on each geographic region, and then filter out areas that do not meet the desired criteria. In addition, a scatterplot view and a details-on-demand window support overviews and specific fact-finding JA - Fifth International Conference on Information Visualisation, 2001. Proceedings PB - IEEE SN - 0-7695-1195-3 M3 - 10.1109/IV.2001.942141 ER - TY - CONF T1 - The processing of form documents T2 - Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on Y1 - 1993 A1 - David Doermann A1 - Rosenfeld, A. KW - AUTOMATIC KW - business KW - detectors; KW - document KW - documents; KW - extraction; KW - feature KW - form KW - forms; KW - generation; KW - generic KW - handling; KW - known KW - markings; KW - model KW - modeling; KW - non-form KW - optimal KW - properties; KW - reconstruction; KW - set; KW - specialized KW - stroke KW - width AB - An overview of an approach to the generic modeling and processing of known forms is presented. The system provides a methodology by which models are generated from regions in the document based on their usage. Automatic extraction of an optimal set of features to be used for registration is proposed, and it is shown how specialized detectors can be designed for each feature based on their position, orientation and width properties. Registration of the form with the model is accomplished using probing to establish correspondence. Form components which are corrupted by markings are detected and isolated, the intersections are interpreted and the properties of the non-form markings are used to reconstruct the strokes through the intersections. The feasibility of these ideas is demonstrated through an implementation of key components of the system JA - Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on M3 - 10.1109/ICDAR.1993.395687 ER -