Diego Reforgiato Recupero



I work with Prof. Alfio Lombardo and Prof. Giovanni Schembra of the University of Catania - Department of Computer Science and Telecommunications Engineering (DIIT). I also collaborate with Professor V.S. Subrahmanian, Prof. Antonio Picariello, Dr. Massimiliano Albanese and Dr. Carmine Cesarano of the University of Naples Federico II. I have also collaborated with Professor Alfredo Ferro, Dr. Alfredo Pulvirenti and Dr. Rosalba Giugno of the University of Catania - Department of Computer Science. I also have work in progress with Professor Dennis Shasha and Dr. Rodrigo Gutierrez of New York University.

My research interests include the following topics:

  • PeerSim: A Peer-to-Peer Simulator
    Peer-to-peer systems can be very large in scale, with millions of nodes that typically join and leave continuously. These properties are very challenging to deal with. Evaluating a new protocol in a real environment, especially in its early stages of development, is not feasible. PeerSim has been developed with extreme scalability and support for dynamicity in mind. We use it in our everyday research and chose to release it to the public under the GPL open source license. It is composed of two simulation engines, a simplified (cycle-based) one and an event-driven one. The engines are supported by many simple, extendable, and pluggable components, with a flexible configuration mechanism. To allow for scalability, the cycle-based engine makes some simplifying assumptions, such as ignoring the details of the transport layer in the communication protocol stack. The event-based engine is less efficient but more realistic. Among other things, it supports transport layer simulation as well. In addition, cycle-based protocols can also be run by the event-based engine. PeerSim started under the EU project BISON and continues to be used under the EU project DELIS. PeerSim is written entirely in Java.

    (For students only. Click here for the updated Google Doc with info, questions and remarks).
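To make the cycle-based idea above concrete, here is a minimal sketch in Python. It is not PeerSim code and does not use the PeerSim API (PeerSim itself is written in Java). The function name `run_cycle_based`, the averaging protocol and the random scheduling order are assumptions chosen only for illustration. What it shows is the simplification described above: in each cycle every node runs its protocol once and reads its peer's state directly, with no transport layer in between.

```python
import random


def run_cycle_based(values, cycles, seed=0):
    """Run a toy averaging protocol for a fixed number of cycles.

    Hypothetical sketch, not PeerSim code. Requires at least two nodes.
    There is no transport layer: a node reads and writes the state of
    its peer directly.
    """
    rng = random.Random(seed)
    state = list(values)
    n = len(state)
    for _ in range(cycles):
        order = list(range(n))
        rng.shuffle(order)            # assumption: random scheduling order
        for i in order:
            j = rng.randrange(n - 1)  # pick a random peer other than i
            if j >= i:
                j += 1
            avg = (state[i] + state[j]) / 2.0
            state[i] = state[j] = avg  # pairwise averaging step
    return state


if __name__ == "__main__":
    # One node starts with value 10, the others with 0.
    print(run_cycle_based([0.0] * 9 + [10.0], cycles=30))
```

Each pairwise step preserves the sum of the values, so all nodes drift towards the global average. An event-driven version would instead schedule message deliveries with delays, which is the extra realism (and cost) mentioned above.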


  • Intrusion Detection - Home Security
    An intrusion detection system (IDS) is software and/or hardware designed to detect unwanted attempts at accessing, manipulating, and/or disabling computer systems, mainly through a network such as the Internet. These attempts may take the form of attacks by, for example, crackers, malware and/or disgruntled employees. An IDS cannot directly detect attacks within properly encrypted traffic. An intrusion detection system is used to detect several types of malicious behavior that can compromise the security and trust of a computer system. This includes network attacks against vulnerable services, data-driven attacks on applications, host-based attacks such as privilege escalation, unauthorized logins and access to sensitive files, and malware (viruses, trojan horses, and worms). An IDS can be composed of several components: Sensors, which generate security events; a Console, to monitor events and alerts and to control the sensors; and a central Engine, which records the events logged by the sensors in a database and uses a system of rules to generate alerts from the security events received. There are several ways to categorize an IDS, depending on the type and location of the sensors and the methodology used by the engine to generate alerts. In many simple IDS implementations all three components are combined in a single device or appliance. In this project we are focussing on the development of a library (written in C#) which takes as input the values of different kinds of sensors (e.g. cameras, infrared). It then allows the generation of function blocks to process such inputs. These blocks can use any function defined by users. Moreover, users decide how to link the different blocks they create. Finally, the output is computed and sent back to the sensors.

    (For students only. Click here for the updated Google Doc with info, questions and remarks).
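The function-block idea described above can be sketched as follows. This is an illustration in Python, not the project's C# library: the names `Block` and `evaluate`, the sensor names and the thresholds are all hypothetical. It shows blocks that wrap user-defined functions, links chosen by the user through input names, and an output computed from the sensor values. The sketch assumes the links form no cycles.

```python
class Block:
    """A function block: applies a user-defined function to its inputs."""

    def __init__(self, name, func, inputs):
        self.name = name
        self.func = func
        self.inputs = inputs  # names of sensors or of other blocks


def evaluate(blocks, sensor_values, output):
    """Compute the value of block `output` from the given sensor values.

    Linked blocks are resolved recursively; links are assumed acyclic.
    """
    by_name = {block.name: block for block in blocks}
    cache = dict(sensor_values)  # sensor readings are the starting values

    def value_of(name):
        if name not in cache:
            block = by_name[name]
            cache[name] = block.func(*[value_of(i) for i in block.inputs])
        return cache[name]

    return value_of(output)


# Hypothetical wiring: raise an alarm only if both sensors agree.
blocks = [
    Block("motion", lambda level: level > 0.5, ["infrared"]),
    Block("person", lambda score: score > 0.8, ["camera"]),
    Block("alarm", lambda m, p: m and p, ["motion", "person"]),
]

if __name__ == "__main__":
    print(evaluate(blocks, {"infrared": 0.9, "camera": 0.95}, "alarm"))
```

Keeping the links as plain names lets the user rewire blocks without touching the functions themselves.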


  • Second Life
    Second Life is a virtual world in which every blade of grass, every building, every piece of clothing, and every character (avatar) is created by the residents. Residents combine prims, or primitive geometric shapes. Objects such as chairs, houses, and animals are built through the Second Life GUI software. There is an in-world scripting language, the Linden Scripting Language (LSL), which gives complete control of each object. With it you can also create vehicles and use all the available physics. Another ongoing project, libsecondlife, provides a set of APIs for exchanging packets between client and server; with it you can create and program automated avatars (bots).


  • Annotated RDF
    Many extensions to RDF have been proposed, including temporal reasoning, reasoning about pedigree, reasoning about uncertainty, etc. ARDF is one in which RDF triples are annotated by members of a partially ordered set (with bottom element) that can be selected in any way desired by the user. In this project we present a formal declarative semantics for ARDF and develop algorithms to check the consistency of ARDF theories and to answer queries. Check out the aRDF system at aRDF.


  • Opinion analysis in Document Databases
    There are numerous applications in which we would like to assess what opinions are being expressed in text documents. For example, Martha Stewart's company may have wished to assess the degree of criticism of a proposed dam in Bangladesh. The ability to gauge opinion on a given topic is therefore of critical interest. In this project, we developed a suite of algorithms which take as input a set D of documents and a topic t, and gauge the degree of opinion expressed about topic t in the set D. Our algorithms can return both a number (the larger the number, the more positive the opinion; the lower the number, the more negative the opinion) and a qualitative opinion. We assessed the accuracy of these algorithms via human experiments and showed that the best of them can accurately reflect human opinions. We developed Oasys (Opinion Analysis SYStem) (http://oasys.umiacs.umd.edu/oasys), a system which crawls documents from the web to find new interesting topics and, using our opinion analysis algorithms, gives opinions on them.


  • Antipole indexing to efficiently support range search and k-nearest-neighbor search
    Range and k-nearest-neighbor searching are core problems in pattern recognition. Given a database S of objects in a metric space M and a query object q in M, in a range searching problem the goal is to find the objects of S within some threshold distance of q, whereas in a k-nearest-neighbor searching problem the k elements of S closest to q must be produced. These problems can obviously be solved with a linear number of distance calculations, by comparing the query object against every object in the database. However, the goal is to solve such problems much faster. We combine and extend ideas from the M-Tree, the Multivantage Point structure, and the FQ-Tree to create a new structure in the "bisector tree" class, called the Antipole Tree. Bisection is based on the proximity to an "Antipole" pair of elements generated by a suitable linear randomized tournament. The final winners a, b of such a tournament are far enough apart to approximate the diameter of the set to be split. If dist(a,b) is larger than the chosen cluster diameter threshold, then the cluster is split. The proposed data structure is an indexing scheme suitable for (exact and approximate) best match searching on generic metric spaces. The Antipole Tree outperforms existing structures such as List of Clusters, M-Trees, and others by a factor of approximately two and, in many cases, it achieves better clustering properties.
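For reference, the linear-scan baseline mentioned above, together with the split test on an Antipole pair, can be sketched in Python as follows. This is not the Antipole Tree itself and does not implement the randomized tournament; the function names and the Euclidean metric used in the example are assumptions made for illustration. Any metric `dist` can be passed in.

```python
import heapq
import math


def euclidean(p, q):
    """Example metric; any function satisfying the metric axioms works."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def range_search(objects, query, radius, dist):
    """Return all objects within `radius` of `query` (one distance each)."""
    return [o for o in objects if dist(o, query) <= radius]


def knn_search(objects, query, k, dist):
    """Return the k objects closest to `query`, nearest first."""
    return heapq.nsmallest(k, objects, key=lambda o: dist(o, query))


def should_split(a, b, diameter_threshold, dist):
    """Split rule from the text: split when dist(a, b) exceeds the threshold."""
    return dist(a, b) > diameter_threshold


if __name__ == "__main__":
    points = [(0, 0), (1, 0), (3, 4), (6, 8)]
    print(range_search(points, (0, 0), 1.0, euclidean))
    print(knn_search(points, (0, 0), 2, euclidean))
    print(should_split((0, 0), (3, 4), 4.0, euclidean))
```

Both searches compute one distance per database object, which is exactly the cost an index such as the Antipole Tree is meant to reduce.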


  • Antipole clustering for fast texture synthesis
    In this project, analysis and synthesis of textures follow a non-parametric multi-resolution approach that efficiently reproduces the generative stochastic process of a wide class of real texture images. The approach is realized through the Antipole Tree and a suitable search strategy that outperforms both the classical linear full-search heuristic and the TSVQ (Tree-Structured Vector Quantization) acceleration used in previous works. Experimental results on an extensive set of textures [VisTex] show the effectiveness of the proposed approach (see http://alpha.dmi.unict.it/~texture for further details).


  • Fast Colorization of gray images
    In this project we introduce a technique for "colorizing" images by transferring color from a source image (colored) to a destination image (gray-scale). The general problem of inverting a gray palette to a color palette is underdetermined and generally has no unique solution. For this reason, accomplishing this task, for example in the restoration of old photos, requires the (costly!) semantic knowledge of an expert. Our method transfers the color from a source image to a target image by matching luminance information between the images. This approach, inspired by a recently published algorithm by Welsh et al., hence belongs among the "similarity based" image enhancing techniques. The main improvement of the new technique is the adoption of the Antipole data structure to quickly retrieve "color words" from a very large vocabulary.
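The luminance-matching step described above can be illustrated with a deliberately simplified Python sketch. It is not the project's method and not the algorithm of Welsh et al.: it matches single pixels on luminance only, uses the YIQ color space from Python's standard `colorsys` module, and replaces the Antipole lookup with a sorted list and binary search. The function name `colorize` and the flat pixel lists are assumptions made for illustration.

```python
import bisect
import colorsys


def colorize(source_rgb, target_gray):
    """Transfer chroma from source pixels to gray target pixels.

    source_rgb:  non-empty list of (r, g, b) tuples with values in [0, 1]
    target_gray: list of luminance values in [0, 1]
    Returns one (r, g, b) tuple per target pixel.
    """
    # Index the source pixels by luminance (the Y channel of YIQ).
    indexed = sorted(colorsys.rgb_to_yiq(r, g, b) for r, g, b in source_rgb)
    lumas = [y for y, _, _ in indexed]

    result = []
    for y in target_gray:
        pos = bisect.bisect_left(lumas, y)
        # The closest luminance is next to the insertion point.
        candidates = [p for p in (pos - 1, pos) if 0 <= p < len(indexed)]
        best = min(candidates, key=lambda p: abs(lumas[p] - y))
        _, i, q = indexed[best]
        # Keep the target luminance, borrow the chroma of the match.
        result.append(colorsys.yiq_to_rgb(y, i, q))
    return result


if __name__ == "__main__":
    source = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]  # one red, one blue pixel
    print(colorize(source, [0.30, 0.11]))
```

A realistic version would compare neighborhood statistics rather than single pixels, which is where a fast similarity index over a large vocabulary becomes important.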


  • Graph matching (exact and approximate) for querying semistructured databases
    Next-generation database systems dealing with biomedical data, web relationships, network directories and structured documents often model the data as graphs. In this project we investigate GraphBlast, a subgraph searching system for large graph databases that incorporates efficient graph searching algorithms such as GraphGrep (go here to see a demo of GraphGrep) and VF, together with new efficient data storage and filtering techniques. GraphBlast outperforms the other main graph searching methods on synthetic and real graph databases.


  • Graph clustering
    Any application that represents data as graphs may be interested in finding patterns in those graphs. Doing this in an unsupervised fashion requires the ability to find subgraphs that are similar to one another. That is the purpose of GraphClust. GraphClust is an application that clusters directed or undirected labeled graphs by using substructures. Thanks to these substructures, the graphs in the dataset can be represented as feature vectors and can therefore be clustered.
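The step from graphs to feature vectors can be sketched as follows. This Python example is not GraphClust: it uses the simplest possible substructure, a single labeled edge, as an assumed stand-in for the substructures the application actually uses. The function names and the input layout (an edge list plus a node-label dictionary) are also assumptions.

```python
from collections import Counter


def edge_features(edges, labels, directed=True):
    """Count labeled edges, the simplest kind of substructure.

    edges:  list of (u, v) node pairs
    labels: dict mapping each node to its label
    """
    counts = Counter()
    for u, v in edges:
        pair = (labels[u], labels[v])
        if not directed:
            pair = tuple(sorted(pair))  # ignore orientation
        counts[pair] += 1
    return counts


def to_vectors(feature_counts):
    """Turn per-graph feature counts into vectors over a shared vocabulary."""
    vocabulary = sorted(set().union(*feature_counts))
    vectors = [[counts[f] for f in vocabulary] for counts in feature_counts]
    return vocabulary, vectors


if __name__ == "__main__":
    g1 = edge_features([(0, 1), (1, 2)], {0: "C", 1: "O", 2: "C"})
    g2 = edge_features([(0, 1)], {0: "C", 1: "O"})
    print(to_vectors([g1, g2]))
```

Once every graph is a vector over the same vocabulary, any standard vector clustering algorithm can be applied to the dataset.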


  • Computation of graph centroid and graph nodes clustering
    We are investigating an efficient way to compute an approximate graph centroid in a weighted graph. The idea is to give importance to the nodes through which the maximum number of shortest paths pass during the Dijkstra visit. The main procedure starts by finding the node u with the greatest degree. Then the number of Dijkstra shortest paths passing through the nodes adjacent to u is computed. If the sum of distances from u to every other node is the least computed so far, the visit moves to the node adjacent to u with the maximum number of Dijkstra shortest paths; otherwise the visit ends. We introduced a method to overcome the local minima that may be encountered: a stratification of the nodes according to their distance from the temporary solution u is created.
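As a point of comparison for the approximate procedure above, the exact centroid can be computed by brute force: run Dijkstra from every node and keep the node with the smallest sum of distances. The Python sketch below shows only this exhaustive baseline, not the approximate method or its stratification step. The function names and the adjacency-dictionary layout are assumptions; edge weights are assumed non-negative and node names mutually comparable (for example, all strings).

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from `source`.

    graph: dict mapping node -> dict of neighbor -> non-negative weight
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def exact_centroid(graph):
    """Node minimizing the sum of shortest-path distances to all nodes."""
    def total_distance(u):
        dist = dijkstra(graph, u)
        # Unreachable nodes count as infinitely far away.
        return sum(dist.get(v, float("inf")) for v in graph)

    return min(graph, key=total_distance)


if __name__ == "__main__":
    path = {"a": {"b": 1}, "b": {"a": 1, "c": 1}, "c": {"b": 1}}
    print(exact_centroid(path))
```

This baseline needs one full Dijkstra run per node, which is the cost the approximate procedure tries to avoid by visiting only a few candidate nodes.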


Last update on Tue Feb 21 13:21:45 2006