Research Interests
· Approximate Processing Techniques
· Sensor Networks
· Data Warehousing, OLAP
· Data Streams
· XML

Education
2001-May 2005 University of Maryland, Dept. of Computer Science
Ph.D. Computer Science, GPA: 4.00
Thesis: Accurate Data Approximation in Constrained Environments
Advisor: Prof. Nick Roussopoulos

1999-2001 University of Maryland, Dept. of Computer Science
M.Sc. Computer Science, GPA: 4.00
Advisor: Prof. Nick Roussopoulos

1994-1999 National Technical University of Athens, Dept. of Electrical and Computer Engineering
Diploma of Electrical and Computer Engineering, GPA: 3.832, Rank in Class: 3rd
Thesis: SISYPHUS: A Chunk-Based Storage Manager for OLAP Data Cubes
Advisor: Prof. Timos Sellis

Honors and Awards
1999-2001 Recipient of one of two UMIACS fellowships for first-year graduate students, University of Maryland
1996-1999 Recipient of Annual National Fellowship Foundation (IKY) Awards for academic excellence, Athens, Greece (4 times)

Research Experience
June-August, 2003 Research Intern, AT&T Labs - Research
Mentor: Dr. Yannis Kotidis
Worked on designing protocols for bandwidth-efficient data aggregation of continuous queries in sensor networks. Designed and implemented a versioning file system on top of a relational engine, optimized for lightweight cgi-bin processes. My libraries are at the core of the Virtual Integration Prototype (VIP), a search engine over AT&T's Legacy Applications (ordering, billing, provisioning, inventory). In 2003 VIP served around 25 million user queries.

September 2002 Visitor, AT&T Labs and Center for Discrete Mathematics & Theoretical Computer Science (DIMACS)
Worked on designing algorithms for the efficient approximation of multiple signals in sensor network and network management applications, by exploiting piece-wise correlations between parts of the collected signals.

2001-2005 Graduate Research Assistant, University of Maryland, Dept. of Computer Science
· Approximate Processing Techniques
My work on approximate processing techniques mainly involves the approximation of data sets containing multiple measures (multiple numeric entries for each table cell). Such data sets arise in many application domains, from network management and time-series analysis/correlation systems to On-Line Analytical Processing (OLAP) environments. I introduced extended wavelet coefficients as a flexible storage technique for wavelet coefficients in such multi-measure data sets and proposed both optimal and provably approximate algorithms for selecting which extended wavelet coefficients to retain under a given storage constraint. These algorithms can be applied to minimize various error metrics, such as the weighted sum-squared, relative, and absolute error of the obtained approximation. While my techniques were developed for multi-measure data sets, the algorithms that I proposed for constructing probabilistic wavelet data synopses are significantly faster than previously proposed techniques even for the single-measure case.
· Sensor Networks
The first part of my work involved the design of protocols for bandwidth-efficient evaluation of aggregate continuous queries over sensor networks, for applications that are willing to tolerate a specified maximum error in the obtained answer. I then studied the dual problem, where the application specifies the desired average bandwidth consumption for the posed query, and the goal is to maximize the accuracy of the reported results. I also considered the case of periodic transmission of historical information in such networks. To exploit the correlations that are typically expected between different observed quantities, I proposed the Self-Based Regression (SBR) algorithm for efficiently compressing the transmitted measurements.
The SBR algorithm constructs a base signal that contains prominent features of the data, and uses this base signal as a dictionary to encode the data using linear regression.
· Data Warehousing, OLAP
Calculating and storing data cubes for high-dimensional hierarchical data sets has long been a difficult task due to the dimensionality curse: the number of views in a data cube is exponential in the number of dimensions. I helped design Dwarf, a highly compressed data structure for computing, storing and indexing data cubes. Dwarf identifies prefix and (most importantly) suffix structural redundancies and factors them out by coalescing their store. What makes Dwarf practical is that it discovers the prefix and suffix redundancies automatically, in a single pass over the fact table and without user involvement or knowledge of the value distributions, and eliminates them before the redundant areas of the cube are computed.

1999-2001 University of Maryland Institute for Advanced Computer Studies (UMIACS) Fellow

1998-1999 Research and Teaching Assistant, National Technical University of Athens, Dept. of Electrical and Computer Engineering
Designed and helped implement the storage manager of the Eratosthenes OLAP system for hierarchical data sets, developed at the National Technical University of Athens. Worked as a research assistant to help develop a new tool for software testing.

Teaching Experience
1998-1999 Courses: "Introduction to Compilers" and "Introduction to Software Engineering", National Technical University of Athens
Gave lectures, designed projects and wrote 2 chapters in a book involving the use of the flex and bison tools.

2001-2004 DBChat organizer, University of Maryland
DBChat is a weekly seminar-type database group meeting. My duties included compiling the list of papers to be discussed, arranging class schedules, and putting the class material on the Web. DBChat was offered as a course in the Fall 2001 semester.

Publications

Talks and Conference Presentations
[12] Query Processing for the Semantic Sensor Web [** Invited Talk **]
1st International Workshop on the Semantic Sensor Web (SemSensWeb 2009), June 2009
[11] Another Outlier Bites the Dust: Computing Meaningful Aggregates in Sensor Networks
25th International Conference on Data Engineering (IEEE ICDE), Shanghai, China, April 2009
[10] Efficient Query Processing in Sensor Networks
Technical University of Crete, May 2006
[9] A Fast Approximation Scheme for Probabilistic Wavelet Synopses
4th Hellenic Data Management Symposium, Athens, Greece, August 2005
[8] Efficient Query Processing in Sensor Networks
4th Hellenic Data Management Symposium, Athens, Greece, August 2005
[7] A Fast Approximation Scheme for Probabilistic Wavelet Synopses
17th International Conference on Scientific and Statistical Database Management (SSDBM), Santa Barbara, June 2005
[6] Sensor Networks: Applications and Ongoing Research
National Technical University of Athens, January 2005
[5] Data Approximation in Multi-Measure Data Sets
National Technical University of Athens, July 2004