George Papastefanatos

Information Management Systems Institute
ATHENA Research Center
Artemidos 6 & Epidavrou Athens Greece


Find me on LinkedIn, Google Scholar, DBLP



I am a Principal Researcher at the Information Management Systems Institute (IMSI) of the ATHENA Research and Innovation Centre.

My research interests are in the area of big data management and analytics, working on problems related to scalable visual analytics and interactive exploration, data integration, web data management and data evolution. I obtained my Diploma in Electrical and Computer Engineering and my PhD in Computer Science from the Department of Electrical and Computer Engineering of the National Technical University of Athens (NTUA).


Research Interests & Projects

  • Data science and Visual Analytics. An active direction in my research concerns the areas of Data visualization, Exploration and Visual Analytics.
    • Self-service scalable visual analytics: A recent national project that I am coordinating, called VisualFacts, is related to self-service scalable visual analytics over big data. Self-service visual analytics is a new paradigm, widely promoted in modern corporate environments, in which business users are enabled and encouraged to directly manipulate (explore, blend, analyze) underlying data in rich visual ways, in order to derive insights from business information as quickly and efficiently as possible. The aim of VisualFacts is to develop a cloud-based scalable platform that provides self-service visual analytic capabilities to a wide range of non-corporate users, so that they can access, explore and analyze open and privately-held data and collaborate on the analytic results of their work by sharing, annotating and reusing them in the form of open facts.
    • In-Situ Visual Data Exploration: I am collaborating with the University of Ioannina, as a postdoctoral researcher, in a national project called Ploigia, which aims at enabling efficient and interactive visual analysis of very large raw data files (e.g., CSV, JSON). Our system employs an in-memory data structure that serves the needs of visual operations and enables users to perform several visual exploration scenarios.
    • Graph Visualization: We have developed GraphVizDB, a tool that enables the visualization and exploration of very large graphs.
  • Big Data Management. My research focus is on data management techniques and more specifically on data integration, scalable query processing and visual analytics over big data. Most of these techniques have been applied to various big data scenarios such as:
    • Telco Data: My research has recently focused on end-to-end big data solutions for managing massive streams from IoT devices. I am the project coordinator of an industry-funded project between IMSI, Intracom Telecom and Ericsson. IMSI has been contracted to design and develop an end-to-end big data solution and machine learning methods for stream analytics on network quality data coming from IoT devices, such as drones and autonomous cars.
    • Scholarly Data: A main area of interest concerns Entity Resolution in Big Data Integration settings, such as duplicate detection and entity interlinking. I have worked on Blocking / Meta-blocking techniques, Parallelization techniques and Machine Learning techniques in Entity Resolution flows for improving the performance and quality of the process. I have been involved in the OpenAIRE project, where we have developed a scalable framework over Apache Spark for interlinking scholarly data.
    • Data from Connected Vehicles: I participate in an EU-funded COST Action, WISE-ACT, studying the wider implications of the deployment of autonomous and connected vehicles on existing road infrastructure in the EU. My interest is in the adoption of novel cloud and data management technologies for online analytics and reaction to events at the edge and in the cloud.
    • Knowledge Graphs: An active line of ongoing work is in the area of RDF Indexing and Query Processing in Big Knowledge Graphs. We have developed a scalable approach for storing RDF knowledge graphs in relational databases and processing queries over them, based on a novel indexing technique called Extended Characteristic Sets. I have also worked on OLAP analytics on the Web. We have developed an approach that employs data mining techniques to analyze and detect relationships in OLAP data published on the web in the form of multidimensional data.
    • Social Data: I have technically coordinated Socioscope and YouWho, two projects that created a visual analysis tool and a chat-based social survey tool, targeting primarily social scientists, for the collection, visualization and exploration of social and political data.
  • Web Data Management
    • Linked Data: I have coordinated a project that makes socio-economic and socio-demographic data from the Hellenic Statistical Authority available in the form of Linked Data. Using data web technologies for creating and managing Personal Dataspaces is an active line of research. We have developed a web-based, linked data enabled tool that supports collaborative management of information resources, enabling users to create and manage diverse types of resources, such as files, web documents, people, datasets and calendar events, in common spaces.
    • Web Data Dynamics: This has been a primary focus of my research, involving problems related to Linked Data Evolution & Archiving, Temporal & Change Modelling, Change Propagation and Synchronization, Proactive Design, and Benchmarking. I was actively involved in the FP7 DIACHRON project, which addressed many of the above issues.
    • Legal Informatics: A recent project I have coordinated deals with the Semantic Representation of Legal Documents. We have developed a framework for automatic Structuring and Semantic Indexing of Legal Documents, used in the electronic library of the Greek General Secretariat of Public Revenue (in Greek).
  • European Open Science Cloud and European Research Infrastructures
    • EOSC Core Development: For the last 4 years I have been the technical manager of the catalogue services behind the European Open Science Cloud. These services offer a single catalogue of the research services and providers offered by e-Infrastructures and research infrastructures in the EU.
    • EOSC Service Development: I am the principal investigator for ATHENA RC in the Neanias project, which develops novel research services for emerging Atmosphere, Underwater & Space Research Communities in the context of the European Open Science Cloud.
    • EOSC Monitor Services: I have been involved in the design of the Open Science observatory, a framework that monitors Open Science trends and their impact on research and provides insights and KPIs to researchers, funders and academia.
  • Data-Centric Ecosystems, Database Quality Metrics
    • A long-standing research interest is in Data-Centric Information Systems and issues related to their schema evolution, the representation of dependencies, and the automatic repair of syntactic and semantic inconsistencies caused by maintenance operations. I am also interested in the evaluation of design quality metrics and the construction of design patterns for these environments. A joint UoI-IMIS project, called Hecataeus, combines the representation and management of evolution processes into a powerful tool.

Here is a list of my publications and my bio.

  • 1995-2000: Undergraduate Student, National Technical University of Athens, Greece
    • Diploma Thesis: A Quality Assurance Framework for Educational Software
  • 2001-2009: PhD Candidate, National Technical University of Athens, Greece
    • Ph.D. Thesis: Policy Regulated Management of Schema Evolution in Database-centric Environments