George Papastefanatos
Researcher
Information Management Systems Institute
ATHENA Research Center, Artemidos 6 & Epidavrou, Athens, Greece
Find me on LinkedIn, Google Scholar, DBLP, ORCID
About
I am a Principal Researcher at the Information Management Systems Institute (IMSI)
of the ATHENA Research and Innovation Centre.
My research interests are in the area of big data management and
analytics, working on problems related to scalable visual analytics, data integration, knowledge graphs and data evolution.
I obtained my Diploma in Electrical and Computer Engineering and my PhD in Computer Science
from the
Department of Electrical and Computer Engineering
of the National Technical University of Athens (NTUA). Before joining ATHENA R.C., I was an adjunct researcher at NTUA, the University of Athens, the National Centre for Social Research and the University of Ioannina, and worked as an external IT expert for various private and public organizations on the design and implementation of large-scale IT projects. I have been an adjunct/visiting lecturer at the University of Peloponnese, the Athens University of Economics and Business, the University of the Aegean, the University of Piraeus and the National School of Public Administration. I have co-authored 1 book, 3 book chapters and more than 80 publications in international conferences and journals in the areas of big data management and analytics. Three of my articles have been selected as Best Papers at international conferences.
News
- 2025: PC Member of VLDB 2025, ACM SIGMOD 2025, and EDBT 2025.
- 2024: Read our new article at VLDB 2024, titled
"Visualization-aware Time Series Min-Max Caching with Error Bound Guarantees", for interactive visual exploration of large multi-variate time series data through an in-memory adaptive caching approach, MinMaxCache, that efficiently reuses previous query results to accelerate visualization performance within accuracy constraints.
- 2024: Read our two new articles on resource management and energy efficiency in telco data centers.
"Resource demands in telco data centers" in Nature Scientific Data Journal presents a dataset with demand patterns of applications within telco data centers;
"Dynamic Sizing of Cloud-Native Telco Data Centers With Digital Twin and Reinforcement Learning" in IEEE Access Journal presents
a Dynamic Data Center Sizing approach based on reinforcement learning for enabling strategic network configuration, optimizing data center sizing, and facilitating proactive decision-making for data center operations and energy efficiency.
- 2024: Follow the series of BigVis Workshops. I am co-organizing the 7th International Workshop on Big Data Visual Exploration and Analytics, to be held jointly with the 50th International Conference on Very Large Databases (VLDB 2024), August 25-29, 2024 in Guangzhou, China.
- 2024: PC Member of ACM SOCC 2024, IEEE DSSA 2024, ADBIS 2024,
LOD 2024, and Discovery Science 2024.
- 2023 - 2024: Adjunct Lecturer, teaching Visual Analytics and Big Data Management courses in the Business and Data Analytics (BDA) track of the Cybersecurity and Data Science MSc program of the University of Piraeus.
- 2023: PC Member of:
- 2022: Read our new articles titled "Relational schema optimization for RDF-based knowledge graphs" (open access in Elsevier Information Systems) and
"Resource-Aware Adaptive Indexing for In-situ Visual Exploration and Analytics" (in VLDB Journal)
- 2021: I am a member of the organizing committee of IEEE ICDE 2021, held in Chania, Greece.
- 2020: New Book Published: Our new book entitled "Linked Data Visualization: Techniques, Tools and Big Data" by Laura Po, Nikos Bikakis, Federico Desimoni & George Papastefanatos has been published by Morgan & Claypool (2020).
You can find more details on the linked data visualization website.
A few words about the book: This book covers a wide spectrum of visualization topics, providing an overview of the recent advances in this area, focusing on techniques,
tools, and use cases of visualization and visual analysis of Linked Data (LD). It presents the core concepts related to data visualization and LD technologies, techniques employed
for data visualization based on the characteristics of data, techniques for Big Data visualization, tools and use cases in the LD context, and, finally, a thorough assessment
of the usability of these tools under different scenarios. It offers a complete guide to the evolution of LD visualization for interested readers
from any background and empowers them to get started with the visual analysis of such data. The book can serve as a course textbook or as a primer for everyone who wants to
explore and analyze LD, whether undergraduate and post-graduate students, data scientists, semantic technology developers, or UI & UX designers who wish to gain some practical
experience with LD tools. Previous knowledge of Semantic Web technologies such as RDF, OWL, SPARQL, or programming skills is not required.
- 2020: Best Paper Award: DOLAP 2020 offered a best paper award for the first time. Our paper entitled "Hierarchical Property Set Merging for SPARQL Query Optimization" by Marios Meimaris, George Papastefanatos and Panos Vassiliadis was selected for this award.
- 2018: Best Paper Award: Our paper entitled "RawVis: Visual Exploration over Raw Data" by Nikos Bikakis, Stavros Maroulis, George Papastefanatos, and Panos Vassiliadis was granted the ADBIS 2018 Best Paper Award.
Research
Interests & Projects
I am coordinating the following projects. Please contact me for more details.
- Jan 2023 - Dec 2025: ExtremeXP: EXPeriment driven and user eXPerience oriented analytics for eXtremely Precise outcomes and decisions. ExtremeXP proposes a new paradigm for data analytics: experimentation-driven analytics that provide accurate, precise, fit-for-purpose, and trustworthy data-driven insights by evaluating different complex analytics variants and automatically taking end users' preferences and feedback into account. The ambition is to provide capabilities for learning from experimentation in order to predict user requirements, profile the user, and proactively generate accurate analytics workflows that lead to more precise outcomes and personalized insights for decision making, focusing on the user's experience, requirements, and needs and placing the user at the center of the decision-making process. ExtremeXP will integrate cutting-edge research results from the domains of data integration, machine learning, visual analytics, explainable AI, decentralized trust, knowledge engineering, and model-driven engineering into a common framework (Co-funded by HORIZON-CL4-2022-DATA-01-01, GA:101093164).
- Jul 2022 - Jun 2024: ARCADIA: Autonomous Resource Allocation for Edge Infrastructures. Optimizing resource allocation in cloud computing environments is a crucial research problem with direct application to many commercial settings. The main objective of ARCADIA is to investigate, design, and evaluate ML methods for optimized resource allocation in cloud computing environments, focusing on a) systems exhibiting dynamic workload characteristics, and b) environments with high energy consumption requirements due to the simultaneous and continuous operation of computer clusters and equipment. Both of these features can be found in edge systems, and specifically in edge data centers, which have become a pivotal computing part of next-generation networks. (Funded by: Greece 2.0 - National Recovery and Resilience Plan)
My research interests and some past projects include:
- Data science and Visual Analytics. An active direction in my research concerns the areas of Data visualization, Exploration and Visual Analytics.
- Self-service scalable visual analytics: A recent national project that I am coordinating, called VisualFacts, is related to self-service scalable visual analytics over big data. Self-service visual analytics is a new paradigm, widely promoted in modern corporate environments, in which business users are enabled and encouraged to directly manipulate (explore, blend, analyze) underlying data in rich visual ways, in order to derive insights from business information as quickly and efficiently as possible. The aim of VisualFacts is to develop a cloud-based scalable platform that provides self-service visual analytic capabilities to a wide range of non-corporate users, allowing them to access, explore, and analyze open and privately-held data and to collaborate on the analytic results of their work by sharing, annotating and reusing them in the form of open facts.
- In-Situ Visual Data exploration: We have worked on methods that enable efficient and interactive visual analysis of very large raw data files (e.g., CSV, JSON). Our system employs an in-memory data structure that addresses the visual needs and enables users to perform several visual exploration scenarios.
- Graph Visualization: We have developed GraphVizDB, a tool that enables the visualization and exploration of very large graphs.
- Big Data Management. My research focus is on data management techniques and more specifically on data integration, scalable query processing and visual analytics over big data. Most of these techniques have been applied to various big data scenarios such as:
- Telco Data: My research has recently focused on end-to-end big data solutions for managing massive streams from IoT devices. I am project coordinator of an industry-funded project between IMSI, Intracom Telecom and Ericsson. IMSI has been contracted to design and develop an end-to-end big data solution and machine learning methods for stream analytics on network quality data coming from IoT devices, such as drones and autonomous cars.
- Scholarly Data: A main area of interest concerns Entity resolution in Big Data Integration settings, such as duplicate detection and entity interlinking. I have worked on Blocking/Meta-blocking techniques, Parallelization techniques and Machine Learning techniques in Entity resolution flows for improving the performance and quality of the process. I have been involved in the OpenAire project, where we have developed a scalable framework over Apache Spark for interlinking scholarly data.
- Data from Connected Vehicles: I participate in an EU-funded COST Action, WISE-ACT, studying the wider implications of the deployment of autonomous and connected vehicles on existing road infrastructure in the EU. My interest is in the adoption of novel cloud and data management technologies for online analytics and reaction to events at the edge and the cloud.
- Knowledge Graphs: An active line of ongoing work is in the area of RDF Indexing and Query Processing in Big Knowledge Graphs. We have developed a scalable approach for storing RDF knowledge graphs in relational databases and processing queries over them, based on a novel indexing technique called Extended Characteristic Sets. I have also worked on OLAP analytics on the Web. We have developed an approach that employs data mining techniques to analyze and detect relationships in OLAP data published on the web in the form of multidimensional data.
- Social Data: I have technically coordinated Socioscope and YouWho, two projects that created a visual analysis tool and a chat-based social survey tool, targeting primarily social scientists, for the collection, visualization and exploration of social and political data.
- Web Data Management
- Linked Data: I have coordinated www.linked-statistics.gr, a project that makes socio-economic and socio-demographic data from the Hellenic Statistical Authority available in the form of Linked Data. Using data web technologies for creating and managing Personal dataspaces is an active line of research. We have developed www.linkzoo.gr, a web-based, linked data enabled tool that supports collaborative management of information resources, enabling users to create and manage diverse types of resources, such as files, web documents, people, datasets and calendar events, in common spaces.
- Web Data dynamics: This has been a primary focus of my research, involving problems related to Linked Data Evolution & Archiving, Temporal & Change Modelling, Change Propagation and Synchronization, Proactive design, and Benchmarking. I was actively involved in the FP7 DIACHRON project, which addressed many of the above issues.
- Legal Informatics: A recent project I have coordinated deals with the Semantic Representation of Legal Documents. We have developed a framework for automatic Structuring and Semantic Indexing of Legal Documents, used in the electronic library of the Greek General Secretariat of Public Revenue (in Greek).
- European Open Science Cloud and European Research Infrastructures
- EOSC Core Development: For the last 4 years I have been the technical manager of the catalogue services behind the European Open Science Cloud. It offers a single catalogue for research services and providers offered by e-Infrastructures and research infrastructures in the EU.
- EOSC Service Development: I am the principal investigator for ATHENA RC of the Neanias project, which develops novel research services for emerging Atmosphere, Underwater & Space Research Communities in the context of the European Open Science Cloud.
- EOSC Monitor Services: I have been involved in the design of the Open Science observatory, a framework that monitors Open Science trends and their impact on research and provides insights and KPIs to researchers, funders and academia.
- Data-Centric Ecosystems, Database Quality Metrics
- A long-standing research interest is in Data-Centric Information Systems and issues related to their schema evolution, the representation of dependencies, and the automatic repair of syntactic and semantic inconsistencies due to maintenance operations. I am also interested in the evaluation of design quality metrics and the construction of design patterns for these environments. There is a UoI-IMIS joint project, called Hecataeus, which combines the representation and management of evolution processes into a powerful tool.
Here is a list of my publications and my bio.
Education
- 1995 - 2000: School of Electrical and Computer Engineering, National Technical University of Athens, Greece
- Diploma Thesis: A Quality Assurance Framework for Educational Software
- 2001 - 2009: PhD in Computer Science, National Technical University of Athens, Greece
- Ph.D. Thesis: Policy Regulated Management of Schema Evolution in Database-centric Environments