Axel Polleres

Publications

You can also check my publication records at the following services:

My ORCID iD is: orcid.org/0000-0001-5670-1146

If you need bibtex entries, check out my publications in BibBase.


2024


[DP24] Daniil Dobryi and Axel Polleres. SMW Cloud: A corpus of domain-specific knowledge graphs from Semantic MediaWikis. In Proceedings of the 21st European Semantic Web Conference (ESWC2024), Hersonissos, Greece, May 2024. To appear.
Semantic wikis have become an increasingly popular means of collaboratively managing Knowledge Graphs. They are powered by platforms such as Semantic MediaWiki and Wikibase, both of which enable MediaWiki to store and publish structured data. While there are many semantic wikis currently in use, there has been little effort to collect and analyse their structured data, nor to make it available for the research community. This paper seeks to address this gap by systematically collecting structured data from an extensive corpus of Semantic-MediaWiki-powered portals and providing an in-depth analysis of the ontological diversity and re-use amongst these wikis using a variety of ontological metrics. Our paper aims to demonstrate that semantic wikis are a valuable and extensive part of Linked Open Data and, in fact, may be considered an active “sub-cloud” within the Linked Open Data ecosystem, which can provide useful insights into the evolution of small and medium-sized domain-specific Knowledge Graphs.
[APFA24] Amr Azzam, Axel Polleres, Javier D. Fernandez, and Maribel Acosta. smart-KG: Partition-based linked data fragments for querying knowledge graphs. Semantic Web -- Interoperability, Usability, Applicability (SWJ), 2024. To appear, accepted for publication. [ http ]
RDF and SPARQL provide a uniform way to publish and query billions of triples in open knowledge graphs (KGs) on the Web. Yet, provisioning a fast, reliable, and responsive live querying solution for open KGs is still hardly possible through SPARQL endpoints alone: while such endpoints provide remarkable performance for single queries, they typically cannot cope with highly concurrent query workloads from multiple clients. To mitigate this, the Linked Data Fragments (LDF) framework sparked the design of alternative low-cost interfaces such as Triple Pattern Fragments (TPF), which partially offload the query processing workload to the client side. On the downside, such interfaces still come at the expense of unnecessarily high network load due to the necessary transfer of intermediate results to the client, leading to degraded query performance compared with endpoints. To address this problem, in the present work we investigate alternative interfaces, refining and extending the original TPF idea, which also aim at reducing server-resource consumption by shipping query-relevant partitions of KGs from the server to the client. To this end, we first align formal definitions and notations of the original LDF framework to uniformly present existing LDF implementations as well as such “partition-based” LDF approaches. These novel LDF interfaces retrieve, instead of the exact triples matching a particular query pattern, a subset of pre-materialized, compressed partitions of the original graph, containing all answers to a query pattern, to be further evaluated on the client side. As a concrete representative of partition-based LDF, we present smart-KG+, extending and refining our prior work [1] in several respects. 
Our proposed approach is a step towards a better-balanced share of the query processing load between clients and servers: it ships graph partitions driven by the structure of RDF graphs, grouping entities described with the same sets of properties and classes, which results in a significant reduction of data transfer. Our experiments demonstrate that smart-KG+ significantly outperforms existing Web SPARQL interfaces, both on pre-existing benchmarks for highly concurrent query execution and on a custom query workload inspired by query logs of existing SPARQL endpoints.
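The partitioning idea above, grouping entities that are described by the same sets of properties, can be illustrated with a toy sketch. This is not the smart-KG+ implementation; the data and function names are hypothetical, and the grouping shown corresponds to what the literature calls "characteristic sets":

```python
from collections import defaultdict

# Toy RDF graph as (subject, predicate, object) triples.
triples = [
    ("alice", "type", "Person"), ("alice", "name", "Alice"),
    ("bob",   "type", "Person"), ("bob",   "name", "Bob"),
    ("acme",  "type", "Company"), ("acme", "label", "ACME"),
]

def characteristic_set_partitions(triples):
    """Group subjects by the set of properties describing them,
    then bundle each group's triples into one shippable partition."""
    props = defaultdict(set)
    for s, p, o in triples:
        props[s].add(p)
    partitions = defaultdict(list)
    for s, p, o in triples:
        partitions[frozenset(props[s])].append((s, p, o))
    return dict(partitions)

parts = characteristic_set_partitions(triples)
# alice and bob share {type, name} -> same partition; acme is separate.
```

A client asking for all entities with a `name` would then fetch only the partitions whose property set contains `name`, rather than streaming individual matching triples.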
[FdSAP24] Nicolas Ferranti, Jairo Francisco de Souza, Shqiponja Ahmetaj, and Axel Polleres. Formalizing and validating Wikidata's property constraints using SHACL and SPARQL. Semantic Web -- Interoperability, Usability, Applicability (SWJ), 2024. To appear, accepted for publication. [ http ]
In this paper, we delve into the crucial role of constraints in maintaining data integrity in knowledge graphs, with a specific focus on Wikidata, one of the most extensive collaboratively maintained open data knowledge graphs on the Web. The World Wide Web Consortium (W3C) recommends the Shapes Constraint Language (SHACL) as the constraint language for validating Knowledge Graphs, which comes in two different levels of expressivity, SHACL-Core and SHACL-SPARQL. Despite the availability of SHACL, Wikidata currently represents its property constraints through its own RDF data model, which relies on Wikidata's specific reification mechanism based on authoritative namespaces, and on partially ambiguous natural language definitions. In the present paper, we investigate whether and how the semantics of Wikidata property constraints can be formalized using SHACL-Core, SHACL-SPARQL, as well as directly as SPARQL queries. While the expressivity of SHACL-Core turns out to be insufficient for expressing all Wikidata property constraint types, we present SPARQL queries to identify violations for all 32 current Wikidata constraint types. We compare the semantics of this unambiguous SPARQL formalization with Wikidata's violation reporting system and discuss limitations in terms of evaluation via Wikidata's public SPARQL query endpoint, given its current scalability constraints. Our study, on the one hand, sheds light on the unique characteristics of constraints defined by the Wikidata community, in order to improve the quality and accuracy of data in this collaborative knowledge graph. On the other hand, as a “byproduct”, our formalization extends existing benchmarks for both SHACL and SPARQL with a challenging, large-scale real-world use case.
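As a flavour of the kind of constraint semantics being formalized, the sketch below checks a "single value" constraint (one of Wikidata's actual constraint types) over toy statements in plain Python. The paper itself formalizes the constraint types as SPARQL queries over Wikidata's RDF data model; the flat triple layout here is an assumption for illustration only:

```python
from collections import Counter

# Toy statements (item, property, value). In Wikidata, e.g. P569
# ("date of birth") is subject to a single-value constraint.
statements = [
    ("Q1", "P569", "1879-03-14"),
    ("Q1", "P569", "1880-01-01"),  # a second birth date: a violation
    ("Q2", "P569", "1914-07-12"),
]

def single_value_violations(stmts, prop):
    """Return items that state the given property more than once,
    i.e. that violate a 'single value' constraint on that property."""
    counts = Counter(item for item, p, _ in stmts if p == prop)
    return sorted(item for item, n in counts.items() if n > 1)

violations = single_value_violations(statements, "P569")  # ['Q1']
```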

2023


[DP23b] Daniil Dobriy and Axel Polleres. O2WB: A tool enabling ontology reuse in Wikibase. In Proceedings of the 12th Knowledge Capture Conference (K-CAP '23), pages 101--104, December 2023. [ DOI ]
The Semantic Web initiative has established standards and practices for publishing interconnected knowledge, where RDF Schema and OWL are meant to enable the reuse of ontologies as one of these established practices. However, Wikibase, the software behind Wikidata, which is increasingly gaining popularity among data publishers, lacks the functionality to import and reuse existing RDF Schema and OWL ontologies. To facilitate ontology reuse and FAIR data publishing, and to encourage a tighter connection of existing Linked Data resources with Wikibase instances, we align the Wikibase data model with RDF and present O2WB, a tool for ontology import and export within Wikibase.
[PPB+23] Axel Polleres, Romana Pernisch, Angela Bonifati, Daniele Dell'Aglio, Daniil Dobriy, Stefania Dumbrava, Lorena Etcheverry, Nicolas Ferranti, Katja Hose, Ernesto Jiménez-Ruiz, Matteo Lissandrini, Ansgar Scherp, Riccardo Tommasini, and Johannes Wachs. How does knowledge evolve in open knowledge graphs? TGDK, 1(1):11:1--11:59, December 2023. [ DOI | http ]
Openly available, collaboratively edited Knowledge Graphs (KGs) are key platforms for the collective management of evolving knowledge. The present work aims to provide an analysis of the obstacles related to investigating and processing specifically this central aspect of evolution in KGs. To this end, we discuss (i) the dimensions of evolution in KGs, (ii) the observability of evolution in existing, open, collaboratively constructed Knowledge Graphs over time, and (iii) possible metrics to analyse this evolution. We provide an overview of relevant state-of-the-art research, ranging from metrics developed for Knowledge Graphs specifically to potential methods from related fields such as network science. Additionally, we discuss technical approaches - and their current limitations - related to storing, analysing and processing large and evolving KGs in terms of handling typical KG downstream tasks.
[VRPCBS23] Felipe Vargas-Rojas, Axel Polleres, Llorenç Cabrera-Bosquet, and Danai Symeonidou. PhyQus: Automatic unit conversions for Wikidata physical quantities. In 4th Wikidata Workshop (co-located with ISWC2023), November 2023. Accepted, to appear. [ http ]
Wikidata is gaining attention as a means to address scientific experimental information. In particular, users can exploit the notions of physical quantities and units of measurement already defined in its knowledge graph. However, when users perform queries over such scientific data, they can only retrieve the physical quantities in the units of measurement explicitly stored as statements, although the knowledge to transform these quantity values into the different units required by the query is already (partially) defined in the units' metadata. We propose PhyQus, a query-answering approach that allows retrieving physical quantities in any convertible unit by performing unit conversion on the fly based on the query information. To this end, our approach is based on the advanced features of the W3C recommendation SHACL and leverages QUDT, an ontology of units of measurement. We showcase that the approach is feasible using two main examples: one about the areas of cities and the other about the boiling points of chemical substances.
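QUDT annotates units with conversion multipliers into a common base unit, which is what makes on-the-fly conversion possible. A minimal sketch of that arithmetic follows; the unit names and factors are illustrative, and this is not PhyQus's actual SHACL-based mechanism:

```python
# Hypothetical QUDT-style conversion multipliers into a base unit
# (here: metre); PhyQus reads such factors from the units' metadata.
MULTIPLIER = {"M": 1.0, "CentiM": 0.01, "KiloM": 1000.0}

def convert(value, from_unit, to_unit):
    """Convert between units of the same kind by passing through the
    base unit (ignores offset units such as degrees Celsius)."""
    return value * MULTIPLIER[from_unit] / MULTIPLIER[to_unit]

convert(2.5, "KiloM", "M")  # 2500.0
```

A query asking for a city's area in square kilometres could thus be answered even if the statement stores the value in square metres, as long as both units share a base.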
[ASWP23b] Amin Anjomshoaa, Hannah Schuster, Johannes Wachs, and Axel Polleres. Towards crisis response and intervention using knowledge graphs - CRISP case study. In Marcela Ruiz and Pnina Soffer, editors, Advanced Information Systems Engineering Workshops - CAiSE 2023 International Workshops, Zaragoza, Spain, June 12-16, 2023, Proceedings, volume 482 of Lecture Notes in Business Information Processing, pages 67--73. Springer, June 2023. [ DOI ]
Data plays a critical role in crisis response and intervention efforts by providing decision-makers with timely, accurate, and actionable information. During a crisis, data can help organizations and crisis managers identify the most affected populations, track the spread of the crisis, and monitor the effectiveness of their response efforts. We introduce the CRISP Knowledge Graph, constructed from various data resources provided by different stakeholders involved in crisis and disaster management, which presents a uniform view of infrastructure, networks, and services pertinent to crisis management use cases. We also present preliminary results for network and infrastructure analysis which demonstrate how the CRISP KG can address the requirements of crisis management and urban resilience scenarios.
[KP23] Gerhard Georg Klager and Axel Polleres. Is GPT fit for KGQA? -- preliminary results. In Proceedings of the International Workshop on Knowledge Graph Generation from Text (Text2KG2023), co-located with Extended Semantic Web Conference 2023 (ESWC 2023), May 2023. To appear. [ .pdf ]
In this paper we report on preliminary results from running question answering benchmarks against recently hyped conversational AI services such as ChatGPT: we focus on questions that are known to be answerable from information in existing knowledge graphs such as Wikidata. In a preliminary study we experiment, on the one hand, with questions from established KGQA benchmarks and, on the other hand, present a set of questions collected in a student experiment that should be particularly hard for Large Language Models (LLMs) to answer, mainly focusing on questions about recent events. In a second experiment, we assess how far GPT could be used for query generation in SPARQL. While our results are mostly negative for now, we hope to provide insights for further research in this direction by isolating and discussing the most obvious challenges and gaps, and by providing a research roadmap for a more extensive study planned as an ongoing master thesis project.
[TAD+23] Gary P. Tchat, Amin Anjomshoaa, Daniil Dobriy, Fajar J. Ekaputra, Elmar Kiesling, Axel Polleres, and Marta Sabou. From semantic web to wisdom web: A retrospective on the journey to 2043. In Proceedings of the ESWC2023 “The next 20 years (ESWC 2043)” track, May 2023. [ DOI | .pdf ]
7H15 P4P3r r3V13W5 7H3 3V01U710N 0F 7H3 W15D0M W38 4ND 11NK5 175 D3V310PM3N7 70 r3534rCH 0N 7H3 53M4N71C W38 – Fr0M 175 1NC3P710N 1N 7H3 34r1Y 2157 C3N7UrY 70 175 CUrr3N7 57473 1N 2042. W3 D15CU55 7H3 K3Y M113570N35, CH4113N635 4ND 1NN0V4710N5 7H47 H4V3 5H4P3D 7H3 "W15D0M W38" 14ND5C4P3 0V3r 7H3 P457 D3C4D35, CU1M1N471N6 1N 7H3 H16H1Y 1N73rC0N- N3C73D 1N7311163N7 4ND 3FF1C13N7 6L084L KN0W13D63 5Y573M W3 H4V3 70D4Y. 7H3 P4P3r 5UMM4r1Z35 7H3 F1r57 4U7H0r’5 P3r50N41 V13W5 F0CU51N6 0N 7H3 3V01U710N 0F 7H3 F13LD 51NC3 H3 574r73D H15 PHD 1N 7H3 34r1Y 2020’5. 45 4 r3M1N15C3NC3 70 7H3 3V01U710N 0F 4L50 Wr1773N 14N6U463 51NC3 7H3N, W3 W111 U53 C14551C41 Wr1773N 14N6U463 1N 7H3 r357 0F 7H15 P4P3r.

DISCLAIMER: This paper is a work of fiction, written in 2023 and describing research that may be carried out in and until 2043. For this reason, it includes citations to papers produced in the period 2024-2043, which have not been published (yet); all citations prior to 2024 refer instead to papers already in the literature. Any reference or resemblance to actual events or people or businesses, past, present or future, is entirely coincidental and the product of the authors’ imagination. Even the imaginary 2043 keynote speaker and first author, who started his PhD in the early 2020’s, is fictitious.

[ASWP23a] Amin Anjomshoaa, Hannah Schuster, Johannes Wachs, and Axel Polleres. From data to insights: Constructing spatiotemporal knowledge graphs for city resilience use cases. In Proceedings of the Second International Workshop on Linked Data-driven Resilience Research 2023 (D2R2 2023), co-located with Extended Semantic Web Conference 2023 (ESWC 2023), volume 3401 of CEUR Workshop Proceedings. CEUR-WS.org, May 2023. [ .pdf ]
Data integration plays a crucial role in crisis management and city resilience use cases by enabling the consolidation of information from scattered sources into a unified view, thereby allowing decision-makers to gain a more complete and accurate understanding of the situation at hand. In this paper, we introduce the CRISP Knowledge Graph, constructed from various data resources to present a uniform view of infrastructure networks and services pertinent to crisis management, enabling informed and targeted interventions to address crisis management use cases. We provide a brief explanation of the semantic model and its significance in building a comprehensive knowledge graph, and then outline our approach for incorporating several large spatiotemporal datasets into this framework, considering the unique challenges that arise in this process.
[SPW23] Hannah Schuster, Axel Polleres, and Johannes Wachs. Quantifying road network vulnerability by access to healthcare. In NetSci 2023: International School and Conference on Network Science, 2023. Conference abstract only; paper forthcoming.
The resilience of transportation networks is highly variable. Some components are crucially important: their failure can cause problems throughout the system. One way to probe a system for weak points needing reinforcement is via simulated stress tests. Network scientists have applied node or edge removal simulations to many systems: interbank lending markets, power grids, software networks, etc.

Reliable transit via roads is especially crucial in healthcare: delays in travel to hospitals have a significant negative effect on patient outcomes including mortality. Yet past studies of road network resilience focus on general mobility or specific kinds of events like floods. And it is unclear how classical resilience analysis applies to geographically embedded road networks with homogeneous degree distribution. We address this gap by using a coarse-grained representation of the Austrian road network in which nodes are municipalities and edges connect municipalities directly via roads. We stress this network, observing changes in accessibility when removing individual edges and groups of edges in geographic clusters using a population-weighted measure of hospital accessibility. Under specific scenarios, certain segments play a critical role in accessibility. We observe changes in burdens on individual hospitals as road closures change which hospitals are nearest. These results are valuable for scheduling road maintenance, extending the road network, or evaluating hospital capacity.
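The stress test described above (remove road segments, then re-measure population-weighted hospital accessibility) can be sketched on a toy municipality network. All names, weights, and populations below are hypothetical, and this is not the authors' actual pipeline:

```python
import heapq

def shortest_dists(graph, sources):
    """Multi-source Dijkstra: distance from every reachable node to
    its nearest source (e.g. its nearest hospital)."""
    dist = {s: 0.0 for s in sources}
    pq = [(0.0, s) for s in sources]
    heapq.heapify(pq)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def accessibility(graph, hospitals, population):
    """Population-weighted average travel cost to the nearest hospital."""
    dist = shortest_dists(graph, hospitals)
    total = sum(population.values())
    return sum(population[m] * dist.get(m, float("inf"))
               for m in population) / total

# Toy network: municipalities A, B and hospital town H, with
# undirected weighted edges representing travel times.
edges = [("A", "B", 1.0), ("B", "H", 1.0), ("A", "H", 5.0)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((v, w))
    graph.setdefault(v, []).append((u, w))
pop = {"A": 100, "B": 50, "H": 10}

base = accessibility(graph, {"H"}, pop)   # A reaches H via B in 2.0
# Stress test: close the B-H road segment and re-measure.
cut = {u: [(v, w) for v, w in nbrs if {u, v} != {"B", "H"}]
       for u, nbrs in graph.items()}
stressed = accessibility(cut, {"H"}, pop)  # A now needs 5.0, B 6.0
```

Edges whose removal inflates the accessibility measure most are the critical segments; ranking them is exactly the kind of output useful for maintenance scheduling.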

[DP23a] Daniil Dobriy and Axel Polleres. Crawley: A tool for web platform discovery. In Proceedings of the 22nd International Semantic Web Conference (ISWC2023) -- Posters and Demos Track, 2023. To appear. [ .pdf ]
Crawley, a Python-based command-line tool, provides an automated mechanism for web platform discovery. Incorporating capabilities such as search engine crawling, web platform validation and recursive hyperlink traversal, it facilitates the systematic identification and validation of a variety of web platforms. The tool’s effectiveness and versatility are demonstrated via two successful use cases: the identification of Semantic MediaWiki instances, as well as the discovery of Open Data Portals including OpenDataSoft, Socrata, and CKAN. These empirical results underscore Crawley’s capacity to support web-based research. We further outline potential enhancements, positioning Crawley as a valuable tool in the field of web platform discovery.

2022


[DAvP22] Robert David, Shqiponja Ahmetaj, Mantas Šimkus, and Axel Polleres. Repairing SHACL constraint violations using answer set programming. In Proceedings of the 21st International Semantic Web Conference (ISWC 2022), volume 13489 of Lecture Notes in Computer Science (LNCS), pages 375--391, Virtual Conference (Hangzhou, China), October 2022. Springer. [ DOI | .pdf ]
The Shapes Constraint Language (SHACL) is a recent W3C recommendation for validating RDF graphs against shape constraints to be checked on target nodes of the data graph. The standard also describes the notion of validation reports for data graphs that violate given constraints, which aim to provide feedback on how the data graph can be fixed to satisfy the constraints. Since the specification left it open to SHACL processors to define such explanations, a recent work proposed the use of explanations in the style of database repairs, where a repair is a set of additions to or deletions from the data graph so that the resulting graph validates against the constraints. In this paper, we study such repairs for non-recursive SHACL, the largest fragment of SHACL that is fully defined in the specification. We propose an algorithm to compute repairs by encoding the explanation problem -- using Answer Set Programming (ASP) -- into a logic program, the answer sets of which correspond to (minimal) repairs. We then study a scenario where it is not possible to simultaneously repair all the targets, which may often be the case due to overall unsatisfiability or conflicting constraints. We introduce a relaxed notion of validation, which allows validating a (maximal) subset of the targets, and adapt the ASP translation to take this relaxation into account. Our implementation in Clingo is -- to the best of our knowledge -- the first implementation of a repair generator for SHACL.
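The repair notion itself can be illustrated without ASP. The paper computes (minimal) repairs via an ASP encoding; the toy Python below instead brute-forces candidate additions/deletions against a stand-in for a SHACL minCount shape, with all names hypothetical:

```python
from itertools import chain, combinations

# Toy data graph and a toy constraint standing in for a SHACL shape:
# every node typed as Person must have a "name" triple (minCount 1).
graph = {("alice", "type", "Person"), ("alice", "name", "Alice"),
         ("bob", "type", "Person")}

def validates(g):
    """Check the toy shape: each Person node has at least one name."""
    persons = {s for s, p, o in g if p == "type" and o == "Person"}
    named = {s for s, p, o in g if p == "name"}
    return persons <= named

def minimal_repairs(g, candidates):
    """Enumerate the smallest sets of additions/deletions that make
    the graph validate (the database-repair notion; the paper derives
    these from answer sets instead of brute-force enumeration)."""
    found, best = [], None
    subsets = chain.from_iterable(
        combinations(candidates, k) for k in range(len(candidates) + 1))
    for r in subsets:
        if best is not None and len(r) > best:
            break  # subsets come smallest-first; larger ones aren't minimal
        adds = {t for op, t in r if op == "add"}
        dels = {t for op, t in r if op == "del"}
        if validates((g | adds) - dels):
            best = len(r)
            found.append(set(r))
    return found

# Candidate changes: give bob a name, or retract bob's Person type.
candidates = [("add", ("bob", "name", "Bob")),
              ("del", ("bob", "type", "Person"))]
repairs = minimal_repairs(graph, candidates)  # two minimal repairs, size 1
```

Both repairs are minimal but very different in spirit (adding the missing name versus removing bob from the shape's targets), which is why preferring some repairs over others matters in practice.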
[BMS+22] Sofia Baroncini, Margherita Martorana, Mario Scrocca, Zuzanna Smiech, and Axel Polleres. Analysing the evolution of community-driven (sub-)schemas within Wikidata. In Proceedings of the 3rd Wikidata Workshop (co-located with ISWC2022), October 2022. [ .pdf ]
Wikidata is a collaborative knowledge graph not structured according to predefined ontologies. Its schema instead evolves bottom-up, defined by its users. In this paper, we propose a methodology to investigate how semantics develop in sub-schemas used by particular, domain-specific communities within the Wikidata knowledge graph: (i) we provide an approach to identify a domain-specific sub-schema from a set of given classes, along with its related community; (ii) we propose an approach for analysing the sub-schemas and communities thus identified, including their evolution over time. Finally, we suggest further possible analyses that would give better insights into (i) the communities themselves, and (ii) the accuracy and quality of the KG vocabulary and its evolution over time across domain areas, raising the potential for improving Wikidata and for its re-use by domain experts.
[FPSA22] Nicolas Ferranti, Axel Polleres, Jairo Francisco De Souza, and Shqiponja Ahmetaj. Formalizing property constraints in Wikidata. In Proceedings of the 3rd Wikidata Workshop (co-located with ISWC2022), October 2022. [ .pdf ]
Constraints play an important role in ensuring data integrity. While the Shapes Constraint Language (SHACL) provides a W3C recommendation for validating RDF Knowledge Graphs (KGs) against such constraints, real-world KGs have adopted their own constraint formalisms. Wikidata (WD), one of the largest collaboratively edited Open Data Knowledge Graphs available on the Web, represents property constraints through its own RDF data model, within its own authoritative namespaces, which might be an indication that the nature of WD property constraints differs from that of other Knowledge Graphs. In this paper we investigate the semantics of WD constraints, and unambiguously formalize all current constraints using SPARQL to retrieve violations; we also discuss the expressiveness of the WD constraint language compared with SHACL core and discuss the evolution of constraint violations. We found that, while all current WD property constraint types can be expressed using SPARQL, only 86% (26 out of 30) can be expressed using SHACL core: the rest face issues related to using separator properties and arithmetic expressions.
[DP22] Daniil Dobriy and Axel Polleres. Analysing and promoting ontology interoperability in Wikibase. In Proceedings of the 3rd Wikidata Workshop (co-located with ISWC2022), October 2022. [ .pdf ]
Wikibase, the open-source software behind Wikidata, increasingly gains popularity among third-party Linked Data publishers. However, the platform's unique data model decreases the degree of interoperability with existing Semantic Web standards and tools that underlie Linked Data as codified by the Linked Data principles. In particular, this unique data model also undermines the direct reuse of ontologies and vocabularies in a manner compliant with Semantic Web standards and Linked Data principles. To this end, firstly, we compare the Wikibase data model to the established RDF data model. Secondly, we enumerate a series of challenges for importing existing ontologies into Wikibase. Thirdly, we present practical solutions to these challenges and introduce a tool for importing and re-using ontologies within Wikibase. Thus, the paper aims to promote ontology interoperability in Wikibase and, by doing so, hopes to contribute to a higher degree of inter-linkage of Wikibase instances with Linked Open Data.
[BKR+22] Stefan Bachhofner, Elmar Kiesling, Kate Revoredo, Philipp Waibel, and Axel Polleres. Automated process knowledge graph construction from BPMN models. In 3rd DEXA conferences and workshops (DEXA2022), August 2022. Full paper. [ DOI ]
Enterprise knowledge graphs are increasingly adopted in industrial settings to integrate heterogeneous systems and data landscapes. Manufacturing systems can benefit from knowledge graphs as they contribute towards implementing visions of interconnected, decentralized and flexible smart manufacturing systems. Process knowledge is a key perspective which has so far attracted limited attention in this context, despite its usefulness for capturing the context in which data are generated. Such knowledge is commonly expressed in diagrammatic languages and the resulting models cannot readily be used in knowledge graph construction. We propose BPMN2KG to address this problem. BPMN2KG is a transformation framework from BPMN2.0 process models into knowledge graphs. With this transformation, BPMN2KG creates a frame for process-centric data integration and analysis. We motivate and evaluate our transformation framework with a real-world industrial use case focused on quality management in plastic injection molding for the automotive sector. We use BPMN2KG for process-centric integration of dispersed production systems data, resulting in an integrated knowledge graph that can be queried using SPARQL, a standardized graph-pattern based query language. By means of several example queries, we illustrate how this knowledge graph benefits data contextualization and integrated analysis.
[HPD+22] Armin Haller, Axel Polleres, Daniil Dobriy, Nicolas Ferranti, and Sergio J. Rodríguez Méndez. An analysis of links in Wikidata. In 19th European Semantic Web Conference, ESWC 2022. Springer, May 2022. [ DOI | .pdf ]
Wikidata has become one of the most prominent open knowledge graphs (KGs) on the Web. Relying on a community of users with different expertise, this cross-domain KG is directly related to other data sources. This paper investigates how Wikidata is linked to other data sources in the Linked Data ecosystem. To this end, we adapt previous definitions of ontology links and instance links to the terminological part of the Wikidata vocabulary and perform an analysis of the links in Wikidata to external datasets and ontologies from the Linked Data ecosystem. As a side effect, this reveals insights on the ontological expressiveness of meta-properties used in Wikidata. The results of this analysis show that while Wikidata defines a large number of individuals, classes and properties within its own namespace, they are not (yet) extensively linked. We discuss reasons for this and conclude with some suggestions to increase the interconnectedness of Wikidata with other KGs.
[WNSP22] Johannes Wachs, Mariusz Nitecki, William Schueller, and Axel Polleres. The geography of open source software: Evidence from GitHub. Technological Forecasting and Social Change, 176:121478, March 2022. [ DOI | http ]
Open Source Software (OSS) plays an important role in the digital economy. Yet although software production is amenable to remote collaboration and its outputs are digital, software development seems to cluster geographically in places like Silicon Valley, London, or Berlin. And while OSS activity creates positive externalities which accrue locally through knowledge spillovers and information effects, up-to-date data on the geographic distribution of open source developers is limited. This presents a significant blindspot for policymakers, who often promote OSS at the national level as a cost-saving tool for public sector institutions. We address this gap by geolocating more than half a million active contributors to GitHub in early 2021 at various spatial scales. Compared to results from 2010, we find a significant increase in the share of developers based in Asia, Latin America and Eastern Europe, suggesting a more even spread of OSS developers globally. Within countries, however, we find significant concentration in regions, exceeding the concentration of high-tech employment. Social and economic development indicators predict at most half of regional variation in OSS activity in the EU, suggesting that clusters have idiosyncratic roots. We argue for localized policies to support networks of OSS developers in cities and regions.
[HCP22] Giray Havur, Cristina Cabanillas, and Axel Polleres. Benchmarking answer set programming systems for resource allocation in business processes. Expert Systems with Applications, in press, 2022. [ DOI ]
Declarative logic programming formalisms are well-suited to model various optimization and configuration problems. In particular, Answer Set Programming (ASP) systems have gained popularity, for example, to deal with scheduling problems present in several domains. The main goal of this paper is to devise a benchmark for ASP systems to assess their performance when dealing with complex and realistic resource allocation with objective optimization. To this end, we provide (i) a declarative and compact encoding of the resource allocation problem in ASP (compliant with the ASP Core-2 standard), (ii) a configurable ASP systems benchmark named BRANCH that is equipped with resource allocation instance generators that produce problem instances of different sizes with adjustable parameters (e.g. in terms of process complexity, organizational and temporal constraints), and (iii) an evaluation of four state-of-the-art ASP systems using BRANCH. This solid application-oriented benchmark serves the ASP community with a tool that leads to potential optimizations and improvements in encodings and further drives the development of ASP solvers. On the other hand, resource allocation is an important problem that still lacks adequate automated tool support in the context of Business Process Management (BPM). The ASP problem encoding, ready-to-use ASP systems and problem instance generators benefit the BPM community to tackle the problem at scale and mitigate the lack of openly available problem instance data.

2021


[ADO+21] Shqiponja Ahmetaj, Robert David, Magdalena Ortiz, Axel Polleres, Bojken Shehu, and Mantas Šimkus. Reasoning about explanations for non-validation in SHACL. In Proceedings of the 18th International Conference on Principles of Knowledge Representation and Reasoning (KR 2021), November 2021. [ DOI | .pdf ]
The Shapes Constraint Language (SHACL) is a recently standardized language for describing and validating constraints over RDF graphs. The SHACL specification describes the so-called validation reports, which are meant to explain to the users the outcome of validating an RDF graph against a collection of constraints. Specifically, explaining the reasons why the input graph does not satisfy the constraints is challenging. In fact, the current SHACL standard leaves it open on how such explanations can be provided to the users. In this paper, inspired by works on logic-based abduction and database repairs, we study the problem of explaining non-validation of SHACL constraints. In particular, in our framework non-validation is explained using the notion of a repair, i.e., a collection of additions and deletions whose application on an input graph results in a repaired graph that does satisfy the given SHACL constraints. We define a collection of decision problems for reasoning about explanations, possibly restricting to explanations that are minimal with respect to cardinality or set inclusion. We provide a detailed characterization of the computational complexity of those reasoning tasks, including the combined and the data complexity.
[KP21] Bernhard Krabina and Axel Polleres. Seeding Wikidata with municipal finance data. In Lucie-Aimée Kaffee, Simon Razniewski, and Aidan Hogan, editors, Proceedings of the 2nd Wikidata Workshop (Wikidata 2021) co-located with (ISWC 2021), volume 2982 of CEUR Workshop Proceedings. CEUR-WS.org, October 2021. [ .pdf ]
The paradigm shift from cash-based to accrual accounting in the public finances of Austrian municipalities as of 2020, together with the availability of uniform spending data from Austria, provides an ideal environment to research the potential of Wikidata for improving awareness of public finance information. Publicly available municipal finance information is of significant interest for citizens to ensure trust in public spending and governance at a local level. It is all the more surprising that such spending data is hardly available. The present paper is a first push towards integrating comparable municipal finance data into Wikidata. Our analysis reveals a lack of a joint, standardized representation of common public spending data. Thus we have begun seeding Wikidata with a unified corpus of finance data from 379 Austrian municipalities by batch uploading, re-using already existing properties. Our approach is a first step towards the question whether and how Wikipedia and Wikidata could serve as spaces for information on public finances.
[HCP21] Giray Havur, Cristina Cabanillas, and Axel Polleres. BRANCH: An ASP systems benchmark for resource allocation in business processes. In Proceedings of the Best Dissertation Award, Doctoral Consortium, and Demonstration & Resources Track at BPM 2021, volume 2973 of CEUR Workshop Proceedings, pages 176--180. CEUR-WS.org, September 2021. [ .pdf ]
The goal of BRANCH is to benchmark Answer Set Programming (ASP) systems to test their performance when dealing with the task of automatically allocating resources to business process activities. Like many other scheduling problems, the allocation of resources and starting times to process activities is a challenging optimization problem, yet it is a crucial step for an optimal execution of the processes. BRANCH has been designed as a configurable benchmark equipped with instance generators that produce problem instances of different size and hardness with respect to adjustable parameters. This application-oriented benchmark supports the BPM community to find the ASP systems and implementations that perform better in solving the resource allocation problem.
[HBC+21] Aidan Hogan, Eva Blomqvist, Michael Cochez, Claudia d'Amato, Gerard de Melo, Claudio Gutierrez, José Emilio Labra Gayo, Sabrina Kirrane, Sebastian Neumaier, Axel Polleres, Roberto Navigli, Axel-Cyrille Ngonga Ngomo, Sabbir M. Rashid, Anisa Rula, Lukas Schmelzeisen, Juan Sequeda, Steffen Staab, and Antoine Zimmermann. Knowledge graphs. ACM Computing Surveys (CSUR), 54(4):1--37, July 2021. Extended pre-print available at https://arxiv.org/abs/2003.02320. [ DOI ]
In this paper we provide a comprehensive introduction to knowledge graphs, which have recently garnered significant attention from both industry and academia in scenarios that require exploiting diverse, dynamic, large-scale collections of data. After some opening remarks, we motivate and contrast various graph-based data models, as well as languages used to query and validate knowledge graphs. We explain how knowledge can be represented and extracted using a combination of deductive and inductive techniques. We conclude with high-level future research directions for knowledge graphs.
[FKP21] Erwin Filtz, Sabrina Kirrane, and Axel Polleres. The linked legal data landscape: linking legal data across different countries. Artificial Intelligence and Law, 29:485--539, February 2021. [ DOI | .pdf ]
The European Union is working towards harmonizing legislation across Europe, in order to improve cross-border interchange of legal information. This goal is supported for instance via standards such as the European Law Identifier (ELI) and the European Case Law Identifier (ECLI), which provide technical specifications for Web identifiers and suggestions for vocabularies to be used to describe metadata pertaining to legal documents in a machine readable format. Notably, these ECLI and ELI metadata standards adhere to the RDF data format which forms the basis of Linked Data, and therefore have the potential to form a basis for a pan-European legal Knowledge Graph. Unfortunately, to date said specifications have only been partially adopted by EU member states. In this paper we describe a methodology to transform the existing legal information system used in Austria to such a legal knowledge graph covering different steps from modeling national specific aspects, to population, and finally the integration of legal data from other countries through linked data. We demonstrate the usefulness of this approach by exemplifying practical use cases from legal information search, which are not possible in an automated fashion so far.
[AAM+21] Amr Azzam, Christian Aebeloe, Gabriela Montoya, Ilkcan Keles, Axel Polleres, and Katja Hose. WiseKG: Balanced Access to Web Knowledge Graphs. In Proceedings of the Web Conference 2021, pages 1422--1434, Ljubljana, Slovenia, 2021. ACM / IW3C2. [ DOI | .pdf ]
SPARQL query services that balance processing between clients and servers become more and more essential to handle the increasing load for open and decentralized knowledge graphs on the Web. To this end, Linked Data Fragments (LDF) have introduced a foundational framework that has sparked research exploring a spectrum of potential Web querying interfaces in between server-side query processing via SPARQL endpoints and client-side query processing of data dumps. Current proposals in between typically suffer from imbalanced load on either the client or the server. In this paper, to the best of our knowledge, we present the first work that combines both client-side and server-side query optimization techniques in a truly dynamic fashion: we introduce WiseKG, a system that employs a cost model that dynamically delegates the load between servers and clients by combining client-side processing of shipped partitions with efficient server-side processing of star-shaped sub-queries, based on current server workload and client capabilities. Our experiments show that WiseKG significantly outperforms state-of-the-art solutions in terms of average total query execution time per client, while at the same time decreasing network traffic and increasing server-side availability.

2020


[WMNP20] Thomas Weber, Johann Mitlöhner, Sebastian Neumaier, and Axel Polleres. ODArchive -- creating an archive for structured data from open data portals. In Proceedings of the 19th International Semantic Web Conference (ISWC 2020), volume 12507 of Lecture Notes in Computer Science (LNCS), pages 311--327, Virtual Conference (Athens, Greece), November 2020. Springer. [ DOI | .pdf ]
We present ODArchive, a large corpus of structured data collected from over 260 Open Data portals worldwide, along with curated, integrated metadata. Furthermore we enrich the harvested datasets by heuristic annotations using the type hierarchies in existing Knowledge Graphs. We both (i) present the underlying distributed architecture to scale up regular harvesting and monitoring changes on these portals, and (ii) make the corpus available via different APIs. Moreover, we (iii) analyse the characteristics of tabular data within the corpus. Our APIs can be used to regularly run such analyses or to reproduce experiments from the literature that have worked on static, not publicly available corpora.
[KSF+20] Sabrina Kirrane, Marta Sabou, Javier D. Fernández, Francesco Osborne, Cécile Robin, Paul Buitelaar, Enrico Motta, and Axel Polleres. A decade of semantic web research through the lenses of a mixed methods approach. Semantic Web -- Interoperability, Usability, Applicability (SWJ), 11(6):979--1005, October 2020. [ DOI | http ]
The identification of research topics and trends is an important scientometric activity, as it can help guide the direction of future research. In the Semantic Web area, initially topic and trend detection was primarily performed through qualitative, top-down style approaches, that rely on expert knowledge. More recently, data-driven, bottom-up approaches have been proposed that offer a quantitative analysis of the evolution of a research domain. In this paper, we aim to provide a broader and more complete picture of Semantic Web topics and trends by adopting a mixed methods methodology, which allows for the combined use of both qualitative and quantitative approaches. Concretely, we build on a qualitative analysis of the main seminal papers, which adopt a top-down approach, and on quantitative results derived with three bottom-up data-driven approaches (Rexplore, Saffron, PoolParty), on a corpus of Semantic Web papers published between 2006 and 2015. In this process, we both use the latter for “fact-checking” on the former and also to derive key findings in relation to the strengths and weaknesses of top-down and bottom up approaches to research topic identification. Although we provide a detailed study on the past decade of Semantic Web research, the findings and the methodology are relevant not only for our community but beyond the area of the Semantic Web to other research fields as well.
[PFNP20] Jan Portisch, Omaima Fallatah, Sebastian Neumaier, and Axel Polleres. Challenges of linking organizational information in open government data to knowledge graphs. In 22nd International Conference on Knowledge Engineering and Knowledge Management (EKAW 2020), volume 12387 of Lecture Notes in Computer Science (LNCS), pages 271--286, Bozen-Bolzano, Italy, September 2020. Springer. [ DOI | http ]
Open Government Data (OGD) is being published by various public administration organizations around the globe. Within the metadata of OGD data catalogs the publishing organizations (1) are not uniquely and unambiguously identifiable and, even worse, (2) change over time, as public administration units are merged or restructured. In order to enable fine-grained analyses or searches of Open Government Data at the level of publishing organizations, linking those from OGD portals to publicly available knowledge graphs (KGs) such as Wikidata and DBpedia seems like an obvious solution. Still, as we show in this position paper, organization linking faces significant challenges, both in terms of the available (portal) metadata and in terms of the data quality and completeness of the KGs. We herein specifically highlight five main challenges, namely regarding (1) temporal changes in organizations and in the portal metadata, (2) lack of a base ontology for describing organizational structures and changes in public knowledge graphs, (3) metadata and KG data quality, (4) multilinguality, and (5) disambiguating public sector organizations. Based on available OGD portal metadata from the Open Data Portal Watch, we provide an in-depth analysis of these issues, make suggestions for concrete starting points on how to tackle them along with a call to the community to jointly work on these open challenges.
[HFKP20] Armin Haller, Javier D. Fernández, Maulik R. Kamdar, and Axel Polleres. What are links in linked open data? A characterization and evaluation of links between knowledge graphs on the web. ACM Journal of Data and Information Quality (JDIQ), 2(2):1--34, May 2020. Pre-print available at https://epub.wu.ac.at/7193/. [ DOI ]
Linked Open Data promises to provide guiding principles to publish interlinked knowledge graphs on the Web in the form of findable, accessible, interoperable and reusable datasets. We argue that while, as such, Linked Data may be viewed as a basis for instantiating the FAIR principles, there are still a number of open issues that cause significant data quality problems even when knowledge graphs are published as Linked Data. Firstly, in order to define boundaries of single coherent knowledge graphs within Linked Data, a principled notion of what a dataset is, or, respectively, what links within and between datasets are, has been missing. Secondly, we argue that in order to enable FAIR knowledge graphs, Linked Data misses standardised findability and accessibility mechanisms, via a single entry link. In order to address the first issue, we (i) propose a rigorous definition of a naming authority for a Linked Data dataset, (ii) define different link types for data in Linked datasets, (iii) provide an empirical analysis of linkage among the datasets of the Linked Open Data cloud, and (iv) analyse the dereferenceability of those links. We base our analyses and link computations on a scalable mechanism implemented on top of the HDT format, which allows us to analyse quantity and quality of different link types at scale.
[FKPS20] Javier D. Fernández, Sabrina Kirrane, Axel Polleres, and Simon Steyskal. HDTcrypt: Compression and Encryption of RDF Datasets. Semantic Web -- Interoperability, Usability, Applicability (SWJ), 11(2):337--359, 2020. [ DOI | http ]
The publication and interchange of RDF datasets online has experienced significant growth in recent years, promoted by different but complementary efforts, such as Linked Open Data, the Web of Things and RDF stream processing systems. However, the current Linked Data infrastructure does not cater for the storage and exchange of sensitive or private data. On the one hand, data publishers need means to limit access to confidential data (e.g. health, financial, personal, or other sensitive data). On the other hand, the infrastructure needs to compress RDF graphs in a manner that minimises the amount of data that is both stored and transferred over the wire. In this paper, we demonstrate how HDT - a compressed serialization format for RDF - can be extended to cater for supporting encryption. We propose a number of different graph partitioning strategies and discuss the benefits and tradeoffs of each approach.
[HP20] Armin Haller and Axel Polleres. Are we better off with just one ontology on the web? Semantic Web -- Interoperability, Usability, Applicability (SWJ), 11(1):87--99, January 2020. SWJ 10-years special issue. [ DOI | http ]
Ontologies have been used on the Web to enable semantic interoperability between parties that publish information independently of each other. They have also played an important role in the emergence of Linked Data. However, many ontologies on the Web do not see much use beyond their initial deployment and purpose in one dataset and therefore should rather be called what they are -- (local) schemas, which per se do not provide any interoperable semantics. Only few ontologies are truly used as a shared conceptualization between different parties, mostly in controlled environments such as the BioPortal. In this paper, we discuss open challenges relating to true re-use of ontologies on the Web and raise the question: “are we better off with just one ontology on the Web?”
[PKF+20] Axel Polleres, Maulik Rajendra Kamdar, Javier D. Fernández, Tania Tudorache, and Mark A. Musen. A more decentralized vision for linked data. Semantic Web -- Interoperability, Usability, Applicability (SWJ), 11(1):101--113, January 2020. SWJ 10-years special issue. [ DOI | http ]
In this deliberately provocative position paper, we claim that more than ten years into Linked Data there are still (too?) many unresolved challenges towards arriving at a truly machine-readable and decentralized Web of data. We take a deeper look at key challenges in usage and adoption of Linked Data from the ever-present “LOD cloud” diagram. Herein, we try to highlight and exemplify both key technical and non-technical challenges to the success of LOD, and we outline potential solution strategies. We hope that this paper will serve as a discussion basis for a fresh start towards more actionable, truly decentralized Linked Data, and as a call to the community to join forces.
[AFA+20] Amr Azzam, Javier D. Fernández, Maribel Acosta, Martin Beno, and Axel Polleres. SMART-KG: Hybrid shipping for SPARQL querying on the web. In The Web Conference 2020, Taipei, Taiwan, 2020. Pre-print available at https://epub.wu.ac.at/7428. [ DOI ]
While Linked Data (LD) provides standards for publishing (RDF) and querying (SPARQL) Knowledge Graphs (KGs) on the Web, serving, accessing and processing such open, decentralized KGs is often practically impossible, as query timeouts on publicly available SPARQL endpoints show. Alternative solutions such as Triple Pattern Fragments (TPF) attempt to tackle the problem of availability by pushing query processing workload to the client side, but suffer from unnecessary transfer of irrelevant data on complex queries with large intermediate results. In this paper we present smart-KG, a novel approach to share the load between servers and clients, while significantly reducing data transfer volume, by combining TPF with shipping compressed KG partitions. Our evaluations show that smart-KG outperforms state-of-the-art client-side solutions and increases server-side availability towards more cost-effective and balanced hosting of open and decentralized KGs.
[FNS+20] Erwin Filtz, María Navas-Loro, Cristiana Santos, Axel Polleres, and Sabrina Kirrane. Events matter: Extraction of events from court decisions. In Serena Villata, Jakub Harasta, and Petr Kremen, editors, Legal Knowledge and Information Systems - JURIX 2020: The Thirty-third Annual Conference, Brno, Czech Republic, December 9-11, 2020, volume 334 of Frontiers in Artificial Intelligence and Applications, pages 33--42. IOS Press, 2020. [ DOI | http ]
The analysis of court decisions and associated events is part of the daily life of many legal practitioners. Unfortunately, since court decision texts can often be long and complex, bringing all events relating to a case in order, to understand their connections and durations is a time-consuming task. Automated court decision timeline generation could provide a visual overview of what happened throughout a case by representing the main legal events, together with relevant temporal information. Tools and technologies to extract events from court decisions however are still underdeveloped. To this end, in the current paper we compare the effectiveness of three different extraction mechanisms, namely deep learning, conditional random fields, and rule-based method, to facilitate automated extraction of events and their components (i.e., the event type, who was involved, and when it happened). In addition, we provide a corpus of manually annotated decisions of the European Court of Human Rights, which shall serve as a gold standard not only for our own evaluation, but also for the research community for comparison and further experiments.
[MDE+20] Bernhard Moser, Georg Dorffner, Thomas Eiter, Wolfgang Faber, Günther Klambauer, Robert Legenstein, Bernhard Nessler, Axel Polleres, and Stefan Woltran. Österreichische AI Strategie aus Sicht der Wissenschaft: Forderungen der ASAI zu einer konkreten AI Strategie in Österreich. OCG Journal, 01/2020:14--17, 2020. Invited article (in German). [ http ]
The recent government reshuffle and COVID-19 have interrupted the AIM AT 2030 process for an AI strategy for Austria that was initiated in 2018. As a result, over the past two years necessary impulses, in particular in research funding in the areas of Machine Learning (ML) and Artificial Intelligence (AI), but also in other research fields relevant to Austria's digitalisation, have not been set, in contrast to the European environment.

2019


[VFP+19] Svitlana Vakulenko, Javier Fernández, Axel Polleres, Maarten de Rijke, and Michael Cochez. Message passing for complex question answering over knowledge graphs. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (CIKM 2019), pages 1431--1440, Beijing, China, November 2019. ACM. [ DOI | http ]
Question answering over knowledge graphs (KGQA) has evolved from simple single-fact questions to complex questions that require graph traversal and aggregation. We propose a novel approach for complex KGQA that uses unsupervised message passing, which propagates confidence scores obtained by parsing an input question and matching terms in the knowledge graph to a set of possible answers. Our approach outperforms the state-of-the-art on the LC-QuAD benchmark. Moreover, our error analysis reveals correct answers missing from the benchmark dataset and inconsistencies in the DBpedia knowledge graph.
[PPB+19] Harshvardhan J. Pandit, Axel Polleres, Bert Bos, Rob Brennan, Bud Bruegger, Fajar J. Ekaputra, Javier D. Fernández, Roghaiyeh Gachpaz Hamed, Elmar Kiesling, Mark Lizar, Eva Schlehahn, Simon Steyskal, and Rigo Wenning. Creating a vocabulary for data privacy -- the first-year report of Data Privacy Vocabularies and Controls Community Group (DPVCG). In On the Move to Meaningful Internet Systems: OTM 2019 Conferences Confederated International Conferences: CoopIS, ODBASE, C&TC, volume 11877 of Lecture Notes in Computer Science (LNCS), pages 714--730, Rhodes, Greece, October 2019. Springer. 18th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE 2019). [ DOI | .pdf ]
Managing privacy and understanding handling of personal data has turned into a fundamental right, at least within the European Union, with the General Data Protection Regulation (GDPR) being enforced since May 25th 2018. This has led to tools and services that promise compliance to GDPR in terms of consent management and keeping track of personal data being processed. The information recorded within such tools, as well as that for compliance itself, needs to be interoperable to provide sufficient transparency in its usage. Additionally, interoperability is also necessary towards addressing the right to data portability under GDPR as well as creation of user-configurable and manageable privacy policies. We argue that such interoperability can be enabled through agreement over vocabularies using linked data principles. The W3C Data Privacy Vocabulary and Controls Community Group (DPVCG) was set up to jointly develop such vocabularies towards interoperability in the context of data privacy. This paper presents the resulting Data Privacy Vocabulary (DPV), along with a discussion on its potential uses, and an invitation for feedback and participation.
[FKPW19] Erwin Filtz, Sabrina Kirrane, Axel Polleres, and Gerhard Wohlgenannt. Exploiting EuroVoc's hierarchical structure for classifying legal documents. In On the Move to Meaningful Internet Systems: OTM 2019 Conferences Confederated International Conferences: CoopIS, ODBASE, C&TC, volume 11877 of Lecture Notes in Computer Science (LNCS), pages 164--181, Rhodes, Greece, October 2019. Springer. 27th International Conference on COOPERATIVE INFORMATION SYSTEMS (CoopIS 2019). [ DOI | .pdf ]
Multi-label document classification is a challenging problem because of the potentially huge number of classes. Furthermore, real-world datasets often exhibit a strongly varying number of labels per document, and a power-law distribution of those class labels. Multi-label classification of legal documents is additionally complicated by long document texts and domain-specific use of language. In this paper we use different approaches to compare the performance of text classification algorithms on existing datasets and corpora of legal documents, and contrast those experiments with results on general-purpose multi-label text classification datasets. Moreover, for the EUR-Lex legal datasets, we show that exploiting the hierarchy of the EuroVoc thesaurus helps to improve classification performance by reducing the number of potential classes while retaining the informative value of the classification itself.
[KFP+19] Maulik R. Kamdar, Javier D. Fernández, Axel Polleres, Tania Tudorache, and Mark A. Musen. Enabling web-scale data integration in biomedicine through linked open data. npj Digital Medicine, 2(1):90, September 2019. [ DOI | http ]
The biomedical data landscape is fragmented with several isolated, heterogeneous data and knowledge sources, which use varying formats, syntaxes, schemas, and entity notations, existing on the Web. Biomedical researchers face severe logistical and technical challenges to query, integrate, analyze, and visualize data from multiple diverse sources in the context of available biomedical knowledge. Semantic Web technologies and Linked Data principles may aid toward Web-scale semantic processing and data integration in biomedicine. The biomedical research community has been one of the earliest adopters of these technologies and principles to publish data and knowledge on the Web as linked graphs and ontologies, hence creating the Life Sciences Linked Open Data (LSLOD) cloud. In this paper, we provide our perspective on some opportunities proffered by the use of LSLOD to integrate biomedical data and knowledge in three domains: (1) pharmacology, (2) cancer research, and (3) infectious diseases. We will discuss some of the major challenges that hinder the widespread use and consumption of LSLOD by the biomedical research community. Finally, we provide a few technical solutions and insights that can address these challenges. Eventually, LSLOD can enable the development of scalable, intelligent infrastructures that support artificial intelligence methods for augmenting human intelligence to achieve better clinical outcomes for patients, to enhance the quality of biomedical research, and to improve our understanding of living systems.
[HSP+19] Giray Havur, Simon Steyskal, Oleksandra Panasiuk, Anna Fensel, Víctor Mireles, Tassilo Pellegrini, Thomas Thurner, Axel Polleres, and Sabrina Kirrane. Automatic license compatibility checking. In Mehwish Alam, Ricardo Usbeck, Tassilo Pellegrini, Harald Sack, and York Sure-Vetter, editors, Proceedings of the Posters and Demo Track of the 15th International Conference on Semantic Systems (SEMANTiCS 2019), volume 2451 of CEUR Workshop Proceedings, Karlsruhe, Germany, September 2019. CEUR-WS.org. [ .pdf ]
In this paper, we introduce the Data Licenses Clearance Center system, which not only provides a library of machine-readable licenses but also allows users to compose their own license. A demonstrator can be found at https://www.dalicc.net/.
[BFKP19] Martin Beno, Erwin Filtz, Sabrina Kirrane, and Axel Polleres. Doc2RDFa: Semantic annotation for web documents. In Mehwish Alam, Ricardo Usbeck, Tassilo Pellegrini, Harald Sack, and York Sure-Vetter, editors, Proceedings of the Posters and Demo Track of the 15th International Conference on Semantic Systems (SEMANTiCS 2019), volume 2451 of CEUR Workshop Proceedings, Karlsruhe, Germany, September 2019. CEUR-WS.org. [ .pdf ]
Ever since its conception, the amount of data published on the world-wide web has been rapidly growing to the point where it has become an important source of both general and domain-specific information. However, the majority of documents published online are not machine readable by default. Many researchers believe that the answer to this problem is to semantically annotate these documents, and thereby contribute to the linked “Web of Data”. Yet, the process of annotating web documents remains an open challenge. While some efforts towards simplifying this process have been made in recent years, there is still a lack of semantic content creation tools that integrate well with information worker toolsets. Towards this end, we introduce Doc2RDFa, an HTML rich text processor with the ability to automatically and manually annotate domain-specific content.
[PDDP19] Harshvardhan J. Pandit, Javier D. Fernández, Christophe Debruyne, and Axel Polleres. Towards cataloguing potential derivations of personal data. In The Semantic Web: ESWC 2019 Satellite Events, Portorož, Slovenia, June 2019. Poster abstract. [ DOI ]
The General Data Protection Regulation (GDPR) has established far reaching rights to transparency and accountability in the context of personal data usage and collection. While GDPR obligations clearly apply to data explicitly obtained from or provided by data subjects, the situation becomes less clear for data derived from existing personal data. In this paper, we address this issue with an approach for identifying potential data derivations using a rule-based formalisation of examples documented in the literature using Semantic Web standards. Our approach is useful for identifying risks of potential data derivations from given data and provides a starting point towards an open catalogue to document known derivations for the privacy community, but also for data controllers, in order to raise awareness in which sense their data collections could become problematic.
[BDE+19] Horst Bischof, Georg Dorffner, Alexander Egyed, Wolfgang Faber, Bernhard Moser, Bernhard Nessler, Axel Polleres, Stefan Woltran, Gerhard Friedrich, and others. Positionspapier zur österreichischen Artificial Intelligence Strategie AIM AT 2030, June 2019. Forum Forschung der uniko (Österreichischen Universitätenkonferenz), available at https://uniko.ac.at/positionen/. [ .pdf ]
The universities welcome the initiative to develop an Austrian AI strategy. The universities are key carriers of AI research and a resource for essential competencies in this field. Support for AI research in Austria must substantially involve the Austrian universities. The Austrian universities propose a series of concrete measures to further strengthen Austria's competence in AI and machine learning (ML) and, above all, to anchor Austria as a research location within the international AI community, in particular within the European networks ELLIS (ellis.eu) and CLAIRE (www.claire-ai.org). The proposed measures fall into three core areas: 1. international networking, 2. national networking, 3. creation and expansion of infrastructure. Prepared within the Forum Forschung of uniko (the Austrian Universities Conference) by the signatories of the attached AI declaration of the academic working group in support of the Austrian AI strategy.
[NP19] Sebastian Neumaier and Axel Polleres. Enabling spatio-temporal search in open data. Journal of Web Semantics (JWS), 55:21--36, March 2019. [ DOI | http ]
Intuitively, most datasets found on governmental Open Data portals are organized by spatio-temporal criteria, that is, single datasets provide data for a certain region, valid for a certain time period. Likewise, for many use cases (such as, for instance, data journalism and fact checking) a predominant need is to scope down the relevant datasets to a particular period or region. Rich spatio-temporal annotations are therefore a crucial need to enable semantic search for (and across) Open Data portals along those dimensions, yet -- to the best of our knowledge -- no working solution exists. To this end, in the present paper we (i) present a scalable approach to construct a spatio-temporal knowledge graph that hierarchically structures geographical as well as temporal entities, (ii) annotate a large corpus of tabular datasets from open data portals with entities from this knowledge graph, and (iii) enable structured, spatio-temporal search and querying over Open Data catalogs, both via a search interface as well as via a SPARQL endpoint, available at http://data.wu.ac.at/odgraphsearch/
[FUPK19] Javier D. Fernandez, Jürgen Umbrich, Axel Polleres, and Magnus Knuth. Evaluating query and storage strategies for RDF archives. Semantic Web -- Interoperability, Usability, Applicability (SWJ), 10(2):247--291, 2019. [ http ]
There is an emerging demand for efficiently archiving and (temporally) querying different versions of evolving semantic Web data. As novel archiving systems are starting to address this challenge, foundations/standards for benchmarking RDF archives are needed to evaluate their storage space efficiency and the performance of different retrieval operations. To this end, we provide theoretical foundations on the design of data and queries to evaluate emerging RDF archiving systems. Then, we instantiate these foundations along a concrete set of queries on the basis of a real-world evolving dataset. Finally, we perform an extensive empirical evaluation of current archiving techniques and querying strategies, which is meant to serve as a baseline for future developments on querying archives of evolving RDF data.
[BDPP19] Piero Andrea Bonatti, Stefan Decker, Axel Polleres, and Valentina Presutti. Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web (Dagstuhl Seminar 18371). Dagstuhl Reports, 8(9):29--111, 2019. [ DOI | http ]
The increasingly pervasive nature of the Web, expanding to devices and things in everyday life, along with new trends in Artificial Intelligence call for new paradigms and a new look on Knowledge Representation and Processing at scale for the Semantic Web. The emerging, but still to be concretely shaped concept of “Knowledge Graphs” provides an excellent unifying metaphor for this current status of Semantic Web research. More than two decades of Semantic Web research provides a solid basis and a promising technology and standards stack to interlink data, ontologies and knowledge on the Web. However, neither are applications for Knowledge Graphs as such limited to Linked Open Data, nor are instantiations of Knowledge Graphs in enterprises — while often inspired by — limited to the core Semantic Web stack. This report documents the program and the outcomes of Dagstuhl Seminar 18371 “Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web”, where a group of experts from academia and industry discussed fundamental questions around these topics for a week in early September 2018, including the following: What are knowledge graphs? Which applications do we see emerging? Which open research questions still need to be addressed, and which technology gaps still need to be closed?
[NLFD+19] María Navas-Loro, Erwin Filtz, Víctor Rodríguez Doncel, Axel Polleres, and Sabrina Kirrane. TempCourt: Evaluation of temporal taggers on a new corpus of court decisions. Knowledge Engineering Review, 34:E24, 2019. [ DOI ]
The extraction and processing of temporal expressions in textual documents has been extensively studied in several domains; however, for the legal domain it remains an open challenge. This is possibly due to the scarcity of corpora in the domain and the particularities found in legal documents that are highlighted in this paper. Considering the pivotal role played by temporal information when it comes to analyzing legal cases, this paper presents TempCourt, a corpus of manually annotated temporal expressions in 30 judgments from the European Court of Human Rights, the European Court of Justice and the United States Supreme Court. The corpus contains two different temporal annotation sets that adhere to the TimeML standard, the first one capturing all temporal expressions and the second dedicated to temporal expressions that are relevant for the case under judgment (thus excluding dates of previous court decisions). The proposed gold standards are subsequently used to compare ten state-of-the-art cross-domain temporal taggers, and to identify not only the limitations of cross-domain temporal taggers but also limitations of the TimeML standard when applied to legal documents. Finally, the paper identifies the need for dedicated resources, the adaptation of existing tools, and specific annotation guidelines that can be adapted to different types of legal documents.

2018


[PSPM18a] Peter F. Patel-Schneider, Axel Polleres, and David Martin. Comparative preferences in SPARQL. In Catherine Faron-Zucker, Chiara Ghidini, Amedeo Napoli, and Yannick Toussaint, editors, Knowledge Acquisition, Modeling and Management (EKAW2018) -- 21st International Conference, volume 11313 of Lecture Notes in Computer Science (LNCS), pages 289--305, Nancy, France, November 2018. Springer. [ DOI | .pdf ]
Sometimes one does not want all the solutions to a query but instead only those that are most desirable according to user-specified preferences. If a user-specified preference relation is acyclic then its specification and meaning are straightforward. In many settings, however, it is valuable to support preference relations that are not acyclic and that might not even be transitive, in which case though their handling involves some open questions. We discuss a definition of desired solutions for arbitrary preference relations and show its desirable properties. We modify a previous extension to SPARQL for simple preferences to correctly handle any preference relation and provide translations of this extension back into SPARQL that can compute the desired solutions for all preference relations that are acyclic or transitive. We also propose an additional extension that returns solutions at multiple levels of desirability, which adds additional expressiveness over prior work. However, for the latter we conjecture that an effective translation to a single (non-recursive) SPARQL query is not possible.
[VdRC+18] Svitlana Vakulenko, Maarten de Rijke, Michael Cochez, Vadim Savenkov, and Axel Polleres. Measuring semantic coherence of a conversation. In Denny Vrandečić, Kalina Bontcheva, Mari Carmen Suárez-Figueroa, Valentina Presutti, Irene Celino, Marta Sabou, Lucie-Aimée Kaffee, and Elena Simperl, editors, Proceedings of the 17th International Semantic Web Conference (ISWC 2018), volume 11136 of Lecture Notes in Computer Science (LNCS), pages 634--651, Monterey, CA, October 2018. Springer. [ DOI | .pdf ]
Conversational systems have become increasingly popular as a way for humans to interact with computers. To be able to provide intelligent responses, conversational systems must correctly model the structure and semantics of a conversation. In this paper, we introduce the task of measuring semantic (in)coherence in a conversation with respect to background knowledge, which relies on the identification of semantic relations between concepts introduced during a conversation. We propose and evaluate graph-based and machine learning approaches for measuring semantic coherence using knowledge graphs, their vector space embeddings and word embedding models, as sources of background knowledge. We demonstrate in our evaluation results how these approaches are able to uncover different coherence patterns in conversations on the Ubuntu Dialogue Corpus.
[BBD+18] Piero Bonatti, Bert Bos, Stefan Decker, Javier D. Fernández, Sabrina Kirrane, Vassilios Peristeras, Axel Polleres, and Rigo Wenning. Data privacy vocabularies and controls: Semantic web for transparency and privacy. In Semantic Web for Social Good Workshop (SWSG) co-located with ISWC2018, volume 2182 of CEUR Workshop Proceedings. CEUR-WS.org, October 2018. [ .pdf ]
Managing privacy and understanding the handling of personal data has turned into a fundamental right - at least for Europeans - since May 25th with the coming into force of the General Data Protection Regulation. Yet, whereas many different tools by different vendors promise companies to guarantee their compliance with GDPR in terms of consent management and keeping track of the personal data they handle in their processes, interoperability between such tools as well as uniform user-facing interfaces will be needed to enable true transparency, user-configurable and -manageable privacy policies and data portability (as also - implicitly - promised by GDPR). We argue that such interoperability can be enabled by agreed-upon vocabularies and Linked Data.
[PKF+18] Axel Polleres, Maulik R. Kamdar, Javier D. Fernández, Tania Tudorache, and Mark A. Musen. A more decentralized vision for linked data. In Decentralizing the Semantic Web (Workshop of ISWC2018), volume 2165 of CEUR Workshop Proceedings. CEUR-WS.org, October 2018. An extended technical report of this paper is available at http://epub.wu.ac.at/6371/. [ .pdf ]
We claim that ten years into Linked Data there are still many unresolved challenges towards arriving at a truly machine-readable and decentralized Web of data. With a focus on the biomedical domain---currently one of the most promising “adopters” of Linked Data---we highlight and exemplify key technical and non-technical challenges to the success of Linked Data, and we outline potential solution strategies.
[PSPM18b] Peter F. Patel-Schneider, Axel Polleres, and David Martin. Fixing comparative preferences for SPARQL. In ISWC 2018 Posters & Demos, October 2018. Poster abstract. [ .pdf ]
Preferences have been part of the goal of the Semantic Web from its inception, but are not currently part of any Semantic Web standards, such as SPARQL. Several proposals have been made to add comparative preferences to SPARQL. Comparative preferences are based on comparing solutions to a query and eliminating ones that come out worse in the comparison, as in searching for gas stations and eliminating any for which there is a closer station serving the same brand of gasoline. The proposals each add an extra construct to SPARQL filtering out non-preferred query solutions. Their preference constructs are of different expressive power but they can each be thought of as providing a skyline operator. In this poster we fix several technical problems of these existing proposals.
[NSP18] Sebastian Neumaier, Vadim Savenkov, and Axel Polleres. Geo-semantic labelling of open data. In Anna Fensel, Victor de Boer, Tassilo Pellegrini, Elmar Kiesling, Bernhard Haslhofer, Laura Hollink, and Alexander Schindler, editors, Proceedings of the 14th International Conference on Semantic Systems 10th (SEMANTiCS), volume 137 of Procedia Computer Science, pages 9--20, Vienna, Austria, September 2018. Elsevier. [ .pdf ]
In the past years Open Data has become a trend among governments to increase transparency and public engagement by opening up national, regional, and local datasets. However, while many of these datasets come in semi-structured file formats, they use different schemata and lack geo-references or semantically meaningful links and descriptions of the corresponding geo-entities. We aim to address this by detecting and establishing links to geo-entities in the datasets found in Open Data catalogs and their respective metadata descriptions and link them to a knowledge graph of geo-entities. This knowledge graph does not yet readily exist, though, or at least, not a single one: so, we integrate and interlink several datasets to construct our (extensible) base geo-entities knowledge graph: (i) the openly available geospatial data repository GeoNames, (ii) the map service OpenStreetMap, (iii) country-specific sets of postal codes, and (iv) the European Union's classification system NUTS. As a second step, this base knowledge graph is used to add semantic labels to the open datasets, i.e., we heuristically disambiguate the geo-entities in CSV columns using the context of the labels and the hierarchical graph structure of our base knowledge graph. Finally, in order to interact with and retrieve the content, we index the datasets and provide a demo user interface. Currently we indexed resources from four Open Data portals, and allow search queries for geo-entities as well as full-text matches at http://data.wu.ac.at/odgraph/.
[FKP18] Erwin Filtz, Sabrina Kirrane, and Axel Polleres. Interlinking legal data. In Proceedings of the Posters and Demos Track of the 14th International Conference on Semantic Systems (SEMANTiCS2018), Vienna, Austria, September 2018. Poster abstract. [ .pdf ]
In recent years, the European Union has been working towards harmonizing legislation, thus allowing for easier cross-border access to, exchange and reuse of legal information. This initiative is supported via standardization activities such as the European Law Identifier (ELI) and the European Case Law Identifier (ECLI), which provide technical specifications for web identifiers and vocabularies that can be used to describe metadata pertaining to legal documents. Unfortunately, to date said initiatives have only been partially adopted by EU member states, possibly due to the manual effort involved in curating the metadata. As a first step towards streamlining this process, we propose a cross-jurisdictional legal framework that demonstrates how legal information stored in national databases can be linked at a European level, using Natural Language Processing together with external knowledge bases to automatically populate the knowledge base.
[HSP+18] Giray Havur, Simon Steyskal, Oleksandra Panasiuk, Anna Fensel, Víctor Mireles, Tassilo Pellegrini, Thomas Thurner, Axel Polleres, and Sabrina Kirrane. Dalicc: A framework for publishing and consuming data assets legally. In Proceedings of the Posters and Demos Track of the 14th International Conference on Semantic Systems (SEMANTiCS2018), Vienna, Austria, September 2018. Poster abstract.
In this paper we introduce the Data Licenses Clearance Center, which provides a library of machine readable standard licenses and allows users to compose arbitrary licenses. In addition, the system supports the clearance of rights issues by providing users with information about the equivalence, similarity and compatibility of licenses. A beta version of the system is available at https://www.dalicc.net/.
[APK18] Amr Azzam, Axel Polleres, and Sabrina Kirrane. Towards making distributed RDF processing FLINKer. In 4th International Conference on Big Data Innovations and Applications (Innovate-Data), pages 9--16, Barcelona, Spain, August 2018. IEEE. [ DOI | http ]
In the last decade, the Resource Description Framework (RDF) has become the de-facto standard for publishing semantic data on the Web. This steady adoption has led to a significant increase in the number and volume of available RDF datasets, exceeding the capabilities of traditional RDF stores. This scenario has introduced severe big semantic data challenges when it comes to managing and querying RDF data at Web scale. Despite the existence of various off-the-shelf Big Data platforms, processing RDF in a distributed environment remains a significant challenge. In this position paper, based on an in-depth analysis of the state of the art, we propose to manage large RDF datasets in Flink, a well-known scalable distributed Big Data processing framework.
[FMPPR18] Javier D. Fernandez, Miguel A. Martínez-Prieto, Axel Polleres, and Julian Reindorf. HDTQ: Managing RDF datasets in compressed space. In Aldo Gangemi, Roberto Navigli, Maria-Esther Vidal, Pascal Hitzler, Raphaël Troncy, Laura Hollink, Anna Tordai, and Mehwish Alam, editors, Proceedings of the 15th European Semantic Web Conference (ESWC2018), volume 10843 of Lecture Notes in Computer Science (LNCS), pages 191--208, Heraklion, Greece, June 2018. Springer. [ DOI | .pdf ]
HDT (Header-Dictionary-Triples) is a well-known compressed representation of RDF data that supports retrieval features without prior decompression. Yet, RDF datasets often contain additional graph information, such as the origin, version or validity time of a triple. Traditional HDT is not capable of handling these additional parameters. This work introduces HDTQ (HDT Quads), an extension of HDT, which is able to represent quadruples (or quads) while still being highly compact and queryable. Two approaches of this extension, Annotated Triples and Annotated Graphs, are introduced and their performance is compared to the leading open-source RDF stores on the market. Results show that HDTQ achieves the best compression rates and is a competitive alternative to well-established systems.
[KDD+18] Sabrina Kirrane, Javier D. Fernández, Wouter Dullaert, Uros Milosevic, Axel Polleres, Piero Bonatti, Rigo Wenning, Olha Drozd, and Philip Raschke. A scalable consent, transparency and compliance architecture. In Aldo Gangemi, Anna Lisa Gentile, Andrea Giovanni Nuzzolese, Sebastian Rudolph, Maria Maleshkova, Heiko Paulheim, Jeff Z Pan, and Mehwish Alam, editors, The Semantic Web: ESWC 2018 Satellite Events, number 11155 in Lecture Notes in Computer Science (LNCS), pages 131--136, Heraklion, Greece, June 2018. Demo abstract. [ .pdf ]
In this demo we present the SPECIAL consent, transparency and compliance system. The objective of the system is to afford data subjects more control over personal data processing and sharing, while at the same time enabling data controllers and processors to comply with consent and transparency obligations mandated by the European General Data Protection Regulation. A short promotional video can be found at https://purl.com/specialprivacy/demos/ESWC2018.
[BHK+18] Stefan Bischof, Andreas Harth, Benedikt Kämpgen, Axel Polleres, and Patrik Schneider. Enriching integrated statistical open city data by combining equational knowledge and missing value imputation. Journal of Web Semantics (JWS), 48:22--47, January 2018. [ DOI | .pdf ]
Several institutions collect statistical data about cities, regions, and countries for various purposes. Yet, while access to high quality and recent such data is both crucial for decision makers and a means for achieving transparency to the public, all too often such collections of data remain isolated and not re-usable, let alone comparable or properly integrated. In this paper we present the Open City Data Pipeline, a focused attempt to collect, integrate, and enrich statistical data collected at city level worldwide, and re-publish the resulting dataset in a re-usable manner as Linked Data. The main features of the Open City Data Pipeline are: (i) we integrate and cleanse data from several sources in a modular and extensible, always up-to-date fashion; (ii) we use both Machine Learning techniques and reasoning over equational background knowledge to enrich the data by imputing missing values, (iii) we assess the estimated accuracy of such imputations per indicator. Additionally, (iv) we make the integrated and enriched data, including links to external data sources, such as DBpedia, available both in a web browser interface and as machine-readable Linked Data, using standard vocabularies such as QB and PROV. Apart from providing a contribution to the growing collection of data available as Linked Data, our enrichment process for missing values also contributes a novel methodology for combining rule-based inference about equational knowledge with inferences obtained from statistical Machine Learning approaches. While most existing works about inference in Linked Data have focused on ontological reasoning in RDFS and OWL, we believe that these complementary methods and particularly their combination could be fruitfully applied also in many other domains for integrating Statistical Linked Data, independent from our concrete use case of integrating city data.
[BBP18] Cristina Baroglio, Olivier Boissier, and Axel Polleres, editors. Transactions on Internet Technology, Special Issue: Computational Ethics and Accountability, volume 18. ACM, 2018. Editorial. [ http ]
Computational Ethics and Accountability are becoming topics of increasing societal impact; in particular, in the context of recent advances in AI and machine-learning techniques, people and organizations accept decisions made for them by machines, be they buy-sell decisions, pre-filtering of applications, deciding which content users are presented, which personal data are shared and used by third parties, up to automated driving. In each of these application scenarios, where algorithms and machines support or even replace human decisions, ethical issues may arise. In the present special issue we address the question of whether Intelligent Systems and AI themselves can help to enable accountability and transparency, and thus act as technologies to enable rather than endanger ethically compliant, accountable, and eventually sustainable computing. Multi-agent systems, Semantic Web and Agreement Technologies, and Value-Sensitive Design are just some of the research areas whose methods and results can fruitfully support business ethics and social responsibility. In this special issue, you will find a collection of articles that aim to make computational advances by approaching these challenges from different angles.
[SWM+18] Sherif Sakr, Marcin Wylot, Raghava Mutharaju, Danh Le Phuoc, and Irini Fundulaki. Linked Data: Storing, Querying, and Reasoning. Springer, 2018. Foreword by Axel Polleres.

2017


[SHS+17] Simon Sperl, Giray Havur, Simon Steyskal, Cristina Cabanillas, Axel Polleres, and Alois Haselböck. Resource utilization prediction in decision-intensive business processes. In Paolo Ceravolo, Maurice van Keulen, and Kilian Stoffel, editors, Proceedings of the 7th International Symposium on Data-driven Process Discovery and Analysis (SIMPDA 2017), volume 2016 of CEUR Workshop Proceedings, pages 128--141, Neuchâtel, Switzerland, December 2017. CEUR-WS.org. [ .pdf ]
An appropriate resource utilization is crucial for organizations in order to avoid, among other things, unnecessary costs (e.g. when resources are under-utilized) and too long execution times (e.g. due to excessive workloads, i.e. resource over-utilization). However, traditional process control and risk measurement approaches do not address resource utilization in processes. We studied an often-encountered industry case for providing large-scale technical infrastructure, which requires rigorous testing for the systems deployed, and identified the need for projecting resource utilization as a means for measuring the risk of resource under- and over-utilization. Consequently, this paper presents a novel predictive model for resource utilization in decision-intensive processes, present in many domains. In particular, we predict the utilization of resources for a desired period of time given a decision-intensive business process that may include nested loops, and historical data (i.e. order and duration of past activity executions, resource profiles and their experience etc.). We have applied our method using a real business process with multiple instances and presented the outcome.
[BFUP17] Martin Beno, Kathrin Figl, Jürgen Umbrich, and Axel Polleres. Perception of key barriers in using and publishing open data. eJournal of eDemocracy and Open Government (JeDEM), 9(2):134--165, December 2017. [ .pdf ]
There is a growing body of literature recognizing the benefits of Open Data. However, many potential data providers are unwilling to publish their data and at the same time, data users are often faced with difficulties when attempting to use Open Data in practice. Despite various barriers in using and publishing Open Data still being present, studies which systematically collect and assess these barriers are rare. Based on this observation we present a review on prior literature on barriers and the results of an empirical study aimed at assessing both the users’ and publishers’ views on obstacles regarding Open Data adoption. We collected data with an online survey in Austria and internationally. Using a sample of 183 participants, we draw conclusions about the relative importance of the barriers reported in the literature. In comparison to a previous conference paper presented at the conference for E-Democracy and Open Government, this article includes new additional data from participants outside Austria, reports new analyses, and substantially extends the discussion of results and of possible strategies for the mitigation of Open Data barriers.
[AEF+17] Clemens Appl, Andreas Ekelhart, Natascha Fenz, Peter Kieseberg, Hannes Leo, Sabrina Kirrane, Axel Polleres, Alfred Taudes, Veronika Treitl, Christian Singer, and Martin Winner. Big Data, Innovation und Datenschutz, December 2017. Studie im Auftrag des Bundesministeriums für Verkehr, Innovation und Technologie (BMVIT). [ .pdf ]
With the introduction of the GDPR in May 2018, European organisations that fall under it must observe data protection and the restrictions that follow from it. This is particularly problematic for Big Data as a driver of innovation, since for such applications it is not clear a priori which data should be collected and how they will feed into the corresponding applications. In a world without data protection, such applications would emerge through the broadest possible, purpose-free collection of data, data analysis based on it to discover hidden patterns, and the development of new functions offered to users on the basis of their data. The legal analysis of the GDPR carried out in the present study shows that, due to the legal requirements, this approach to developing Big Data applications is not possible: data collection and data-mining analyses involving personal data are not permitted without consent. As a starting point for developing a corresponding bundle of measures, this study devises a GDPR-compliant procedure for developing a Big Data application. This proposal is based on a legal analysis of the GDPR with a focus on Big Data, a technical analysis of the technologies available for implementing its requirements, and interviews with companies and public authorities. The core idea of the procedure is to obtain consent for anonymisation and/or data analysis already during the development of the data-generating system, a consent tailored to testing, and an opt-in when the Big Data application is rolled out. Furthermore, the study proposes strategic measures in the areas of education, research, funding, legal framework conditions, and participation in international initiatives (such as MyData).
[BKPW17] Piero Bonatti, Sabrina Kirrane, Axel Polleres, and Rigo Wenning. Transparent personal data processing: The road ahead. In TELERISE: 3rd International Workshop on TEchnical and LEgal aspects of data pRIvacy and SEcurity @ SAFECOMP2017, volume 10489 of Lecture Notes in Computer Science (LNCS), pages 337--349, Trento, Italy, September 2017. [ .pdf ]
The European General Data Protection Regulation defines a set of obligations for personal data controllers and processors. Primary obligations include: obtaining explicit consent from the data subject for the processing of personal data, providing full transparency with respect to the processing, and enabling data rectification and erasure (albeit only in certain circumstances). At the core of any transparency architecture is the logging of events in relation to the processing and sharing of personal data. The logs should enable verification that data processors abide by the access and usage control policies that have been associated with the data based on the data subject’s consent and the applicable regulations. In this position paper, we: (i) identify the requirements that need to be satisfied by such a transparency architecture, (ii) examine the suitability of existing logging mechanisms in light of said requirements, and (iii) present a number of open challenges and opportunities.
[SMUP17] Vadim Savenkov, Qaiser Mehmood, Jürgen Umbrich, and Axel Polleres. Counting to k, or how SPARQL 1.1 could be efficiently enhanced with top k shortest path queries. In 13th International Conference on Semantic Systems (SEMANTiCS), pages 97--103, Amsterdam, the Netherlands, September 2017. ACM. [ .pdf ]
While graph data on the Web represented in RDF is growing, SPARQL, the standard query language for RDF, still remains largely unusable for the most typical graph query task: finding paths between selected nodes through the graph. Property Paths, as introduced in SPARQL 1.1, turn out to be unfit for this task, as they can only be used to test path existence and do not even allow counting the number of paths between nodes. While such a feature has been shown to be theoretically highly intractable, particularly in graphs with a high degree of cyclicity, practical use cases still demand a solution. A common restriction, in fact, is to ask not for all paths, but only for the k shortest paths between two nodes, in order to obtain at least the most important of the potentially infeasibly many possible paths. In this paper, we extend SPARQL 1.1 property paths in a manner that allows computing and returning the k shortest paths matching a property path expression between two nodes. We present an algorithm and implementation and demonstrate in our evaluation that a relatively straightforward solution works (in fact, more efficiently than other, tailored solutions in the literature) in practical use cases.
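As an illustration of the primitive this extension exposes (a minimal sketch, not the paper's algorithm or implementation), the k shortest loopless paths between two nodes can be enumerated best-first over an adjacency list; the toy graph below is hypothetical:

```python
import heapq

def k_shortest_paths(graph, source, target, k):
    """Best-first enumeration of the k shortest loopless paths from
    source to target in an unweighted directed graph (adjacency dict)."""
    # Priority queue of (path length, path); paths never revisit a node.
    queue = [(0, [source])]
    found = []
    while queue and len(found) < k:
        cost, path = heapq.heappop(queue)
        node = path[-1]
        if node == target:
            found.append(path)
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:  # loopless: skip already-visited nodes
                heapq.heappush(queue, (cost + 1, path + [nxt]))
    return found

# Hypothetical toy graph standing in for RDF triples with one predicate.
g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(k_shortest_paths(g, "a", "d", 2))  # → [['a', 'b', 'd'], ['a', 'c', 'd']]
```

Because the queue is ordered by path length, the first k paths popped at the target are exactly the k shortest; asking for more paths than exist simply returns all of them.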
[NPSU17] Sebastian Neumaier, Axel Polleres, Simon Steyskal, and Jürgen Umbrich. Data integration for open data on the web. In Giovambattista Ianni, Domenico Lembo, Leopoldo E. Bertossi, Wolfgang Faber, Birte Glimm, Georg Gottlob, and Steffen Staab, editors, Reasoning Web. Semantic Interoperability on the Web (Reasoning Web 2017), volume 10370 of Lecture Notes in Computer Science (LNCS), pages 1--28. Springer, London, United Kingdom, July 2017. [ DOI | .pdf ]
In this lecture we will discuss and introduce the challenges of integrating openly available Web data and how to solve them. Firstly, while we will address this topic from the viewpoint of Semantic Web research, not all data is readily available as RDF or Linked Data, so we will give an introduction to different data formats prevalent on the Web, namely, standard formats for publishing and exchanging tabular, tree-shaped, and graph data. Secondly, not all Open Data is really completely open, so we will discuss and address issues around licences and terms of usage associated with Open Data, as well as documentation of data provenance. Thirdly, we will discuss (meta-)data quality issues associated with Open Data on the Web and how Semantic Web techniques and vocabularies can be used to describe and remedy them. Fourthly, we will address issues of searchability and integration of Open Data and discuss to what extent semantic search can help to overcome these. We close by briefly summarizing further issues not covered explicitly herein, such as multi-linguality, temporal aspects (archiving, evolution, temporal querying), as well as how and whether OWL and RDFS reasoning on top of integrated open data could help.
[FPKH17] Erwin Filtz, Axel Polleres, Roman Karl, and Bernhard Haslhofer. The evolution of the bitcoin graph. In Proceedings of the 1st International Data Science Conference (iDSC2017), Salzburg, Austria, June 2017. [ .pdf ]
Bitcoin as a virtual currency provides means to execute payments anonymously, without regulation by central authorities. In this paper we analyze structural properties of the Bitcoin graph and investigate how users behave in the Bitcoin system over time. Our analysis shows, for instance, that Bitcoin has a highly volatile exchange rate, which probably makes it uninteresting for long-term investments; moreover, we show how transaction “patterns” have evolved over time.
[BFPU17] Martin Beno, Kathrin Figl, Axel Polleres, and Jürgen Umbrich. Open data hopes and fears: Determining the barriers of open data. In 2017 Conference for E-Democracy and Open Government (CeDEM 2017), Krems, Austria, May 2017. Nominated for best paper award. [ .pdf ]
In recent years, Open Data has gained considerable attention: a steady growth in the number of openly published datasets -- mainly by governments and public administrations -- can be observed as the demand for Open Data rises. However, many potential providers are still hesitant to open their datasets, and at the same time users often face difficulties when attempting to use this data in practice. This indicates that various barriers are still present, both regarding the usage and the publishing of Open Data, but studies that systematically collect and assess these barriers regarding their impact are rare. Based on this observation we survey prior literature on barriers and have developed a questionnaire aimed at assessing both the users' and publishers' views on obstacles regarding Open Data adoption. Using a sample of over 100 participants from Austria who completed our online survey, we draw conclusions about the relative importance of the barriers reported in the literature. The empirical findings presented in this study shall serve as a solid foundation for future research on the mitigation of Open Data barriers.
[AFPS17] Albin Ahmeti, Javier Fernández, Axel Polleres, and Vadim Savenkov. Updating wikipedia via DBpedia mappings and SPARQL. In Eva Blomqvist, Diana Maynard, Aldo Gangemi, Rinke Hoekstra, Pascal Hitzler, and Olaf Hartig, editors, Proceedings of the 14th European Semantic Web Conference (ESWC2017), volume 10249 of Lecture Notes in Computer Science (LNCS), pages 485--501, Portorož, Slovenia, May 2017. Springer. [ DOI | .pdf ]
DBpedia crystallized most of the concepts of the Semantic Web using simple mappings to convert Wikipedia articles to RDF data. This “semantic view” of wiki content has rapidly become the focal point of the Linked Open Data cloud, but its impact on the original Wikipedia source is limited. In particular, little attention has been paid to the benefits that the semantic infrastructure can bring to maintaining the wiki content, for instance to ensure that the effects of a wiki edit are consistent across infoboxes. In this paper, we present a framework for handling ontology-based updates of wiki content. Starting from DBpedia-like mappings converting infoboxes to a fragment of OWL 2 RL ontology, we discuss various issues associated with translating SPARQL updates on top of semantic data to the underlying wiki content. On the one hand, we provide a formalization of DBpedia as an Ontology Based Data Management framework and study its computational properties. On the other hand, we provide a novel approach to the inherently intractable update translation problem, leveraging the pre-existing data for disambiguating updates.
[FKPS17] Javier Fernández, Sabrina Kirrane, Axel Polleres, and Simon Steyskal. Self-enforcing access control for encrypted RDF. In Proceedings of the 14th European Semantic Web Conference (ESWC2017), volume 10249 of Lecture Notes in Computer Science (LNCS), pages 607--622, Portorož, Slovenia, May 2017. Springer. [ .pdf ]
The amount of raw data exchanged via web protocols is steadily increasing. Although the Linked Data infrastructure could potentially be used to selectively share RDF data with different individuals or organisations, the primary focus remains on the unrestricted sharing of public data. In order to extend the Linked Data paradigm to cater for closed data, there is a need to augment the existing infrastructure with robust security mechanisms. At the most basic level both access control and encryption mechanisms are required. In this paper, we propose a flexible and dynamic architecture for securely storing and maintaining RDF datasets. By employing an encryption strategy based on Functional Encryption (FE), in which data access is enforced by the cryptographic approach itself, we allow for fine-grained access control over encrypted RDF data while at the same time reducing the administrative overhead associated with access control management.
[NUP17] Sebastian Neumaier, Jürgen Umbrich, and Axel Polleres. Lifting data portals to the web of data. In 10th Workshop on Linked Data on the Web (LDOW2017), Perth, Australia, April 2017. [ .pdf ]
Data portals are central hubs for freely available (governmental) datasets. These portals use different software frameworks to publish their data and the metadata descriptions of these datasets come in different schemas according to the used framework. The present work aims at re-exposing and connecting the metadata descriptions of currently 854k datasets on 261 data portals to the Web of Linked Data by mapping and publishing their homogenized metadata in standard vocabularies such as DCAT and Schema.org. Additionally, we publish existing quality information about the datasets and further enrich their descriptions by automatically generated metadata for CSV resources. In order to make all this information traceable and trustworthy, we annotate the generated data using W3C’s provenance vocabulary. The dataset descriptions are harvested weekly and we offer access to the archived data by providing APIs compliant to the Memento framework. All this data -- a total of about 120 million triples per weekly snapshot -- is queryable at the SPARQL endpoint at http://data.wu.ac.at/portalwatch/sparql.

2016


[FKK+16] Javier Fernández, Elmar Kiesling, Sabrina Kirrane, Julia Neuschmid, Mika Mizerski, Axel Polleres, Marta Sabou, Thomas Thurner, and Peter Wetz. Propelling the Potential of Enterprise Linked Data in Austria: Roadmap and Report. edition mono/monochrom, Zentagasse 31/8, A-1050 Vienna, Austria, December 2016. [ .pdf ]
The PROPEL project – Propelling the Potential of Enterprise Linked Data in Austria – surveyed technological challenges, entrepreneurial opportunities, and open research questions on the use of Linked Data in a business context and developed a roadmap and a set of recommendations for policy makers, industry, and the research community. Results are summarized in the present book.
[NUP16a] Sebastian Neumaier, Jürgen Umbrich, and Axel Polleres. Automated quality assessment of metadata across open data portals. ACM Journal of Data and Information Quality (JDIQ), 8(1):2, November 2016. [ DOI | .pdf ]
The Open Data movement has become a driver for publicly available data on the Web. More and more data -- from governments and public institutions but also from the private sector -- is made available online, mainly published in so-called Open Data portals. However, with the increasing number of published resources, there are a number of concerns with regard to the quality of the data sources and the corresponding metadata, which compromise the searchability, discoverability and usability of resources. In order to get a more complete picture of the severity of these issues, the present work aims at developing a generic metadata quality assessment framework for various Open Data portals: we treat data portals independently from the portal software frameworks by mapping the specific metadata of three widely used portal software frameworks (CKAN, Socrata, OpenDataSoft) to the standardized DCAT metadata schema. We subsequently define several quality metrics, which can be evaluated automatically and in an efficient manner. Finally, we report findings based on monitoring a set of over 260 Open Data portals with 1.1M datasets. This includes the discussion of general quality issues, e.g. the retrievability of data, and the analysis of our specific quality metrics.
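One of the simplest metrics of this kind, a completeness score over required metadata keys, can be sketched as follows; the key set and example record are hypothetical illustrations, not the paper's actual DCAT mapping:

```python
# Hypothetical set of required DCAT-style metadata keys.
REQUIRED = ["title", "description", "license", "accessURL"]

def completeness(metadata):
    """Fraction of required metadata keys that are present and non-empty."""
    present = sum(bool(metadata.get(k)) for k in REQUIRED)
    return present / len(REQUIRED)

# Empty or missing values both count as absent.
print(completeness({"title": "Air quality", "license": "CC-BY", "description": ""}))  # → 0.5
```

Such per-dataset scores can then be aggregated per portal to compare quality across portal software frameworks.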
[NUP16b] Sebastian Neumaier, Jürgen Umbrich, and Axel Polleres. Challenges of mapping current CKAN metadata to DCAT. In W3C Workshop on Data and Services Integration, Amsterdam, the Netherlands, November 2016. [ http ]
This report describes our experiences mapping the metadata of CKAN-powered Open Data portals to the DCAT model. CKAN is the most prominent portal software framework used for publishing Open Data, used by several governmental portals including data.gov.uk and data.gov. We studied the actual usage of DCAT in 133 existing Open Data portals and report the key findings.
[NUPP16] Sebastian Neumaier, Jürgen Umbrich, Josiane Parreira, and Axel Polleres. Multi-level semantic labelling of numerical values. In Proceedings of the 15th International Semantic Web Conference (ISWC 2016) - Part I, volume 9981 of Lecture Notes in Computer Science (LNCS), pages 428--445, Kobe, Japan, October 2016. Springer. Nominated for best student paper award. [ DOI | .pdf ]
With the success of Open Data a huge amount of tabular data sources became available that could potentially be mapped and linked into the Web of (Linked) Data. Most existing approaches to “semantically label” such tabular data rely on mappings of textual information to classes, properties, or instances in RDF knowledge bases in order to link -- and eventually transform -- tabular data into RDF. However, as we will illustrate, Open Data tables typically contain a large portion of numerical columns and/or non-textual headers; therefore solutions that solely focus on textual “cues” are only partially applicable for mapping such data sources. We propose an approach to find and rank candidates of semantic labels and context descriptions for a given bag of numerical values. To this end, we apply a hierarchical clustering over information taken from DBpedia to build a background knowledge graph of possible “semantic contexts” for bags of numerical values, over which we perform a nearest neighbour search to rank the most likely candidates. Our evaluation shows that our approach can assign fine-grained semantic labels, when there is enough supporting evidence in the background knowledge graph. In other cases, our approach can nevertheless assign high level contexts to the data, which could potentially be used in combination with other approaches to narrow down the search space of possible labels.
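The nearest-neighbour ranking step can be sketched as follows, using simple distributional features over bags of numbers; the features, labels, and example bags are hypothetical illustrations, not the paper's DBpedia-derived background knowledge graph:

```python
import math
import statistics

def features(values):
    """Summarise a bag of numbers by simple distributional features."""
    return (min(values), max(values), statistics.mean(values), statistics.pstdev(values))

def rank_labels(bag, labelled_bags):
    """Rank candidate labels by Euclidean distance between feature vectors."""
    f = features(bag)
    ranked = sorted(
        labelled_bags.items(),
        key=lambda kv: math.dist(f, features(kv[1])),
    )
    return [label for label, _ in ranked]

# Hypothetical background knowledge: label -> example bag of values.
background = {
    "population": [1_800_000, 550_000, 9_000_000],
    "elevation_m": [170, 520, 12],
}
print(rank_labels([260, 430, 35], background))  # → ['elevation_m', 'population']
```

The unknown bag's value range immediately rules out the population label here; richer feature sets and a hierarchical context graph, as in the paper, refine this basic idea.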
[PBP16] Mónica Posada-Sánchez, Stefan Bischof, and Axel Polleres. Extracting geo-semantics about cities from openstreetmap. In Proceedings of the Posters and Demos Track of the 12th International Conference on Semantic Systems (SEMANTiCS2016), Leipzig, Germany, September 2016. [ .pdf ]
Access to high-quality and updated data is crucial to assess and contextualize a city's state of affairs. The City Data Pipeline uses diverse Open Data sources to integrate statistical information about cities. The resulting incomplete dataset is not directly usable for data analysis. We exploit data from a geographic information system, namely OpenStreetMap, to obtain new indicators for cities with better coverage. We show that OpenStreetMap is a promising data source for statistical data about cities.
[FGUKP16] Javier David Fernández Garcia, Jürgen Umbrich, Magnus Knuth, and Axel Polleres. Evaluating query and storage strategies for RDF archives. In 12th International Conference on Semantic Systems (SEMANTiCS), ACM International Conference Proceedings Series, pages 41--48. ACM, September 2016. [ .pdf ]
There is an emerging demand for efficiently archiving and (temporally) querying different versions of evolving Semantic Web data. As novel archiving systems are starting to address this challenge, foundations/standards for benchmarking RDF archives are needed to evaluate their storage space efficiency and the performance of different retrieval operations. To this end, we provide theoretical foundations on the design of data and queries to evaluate emerging RDF archiving systems. Then, we instantiate these foundations along a concrete set of queries on the basis of a real-world evolving dataset. Finally, we perform an empirical evaluation of various current archiving techniques and querying strategies on this data. Our work comprises -- to the best of our knowledge -- the first benchmark for querying evolving RDF data archives.
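A change-based (delta) storage strategy, one of the archiving techniques such benchmarks compare, can be sketched as follows; the triples and versions are hypothetical:

```python
# Hypothetical change-based (delta) storage for an evolving set of RDF triples:
# a base snapshot plus one (added, removed) pair per subsequent version.
base = {("ex:a", "ex:p", "ex:b"), ("ex:a", "ex:p", "ex:c")}
deltas = [
    ({("ex:a", "ex:p", "ex:d")}, set()),        # version 1: one triple added
    (set(), {("ex:a", "ex:p", "ex:b")}),        # version 2: one triple removed
]

def materialise(version):
    """Rebuild a full version by replaying deltas on top of the base snapshot."""
    triples = set(base)
    for added, removed in deltas[:version]:
        triples = (triples | added) - removed
    return triples

print(sorted(materialise(2)))  # version 2: objects ex:c and ex:d remain
```

Deltas keep storage proportional to the amount of change, but a version-focused query must replay the delta chain; fully materialised snapshots invert that trade-off, which is exactly the kind of tension the benchmark quantifies.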
[HCMP16] Giray Havur, Cristina Cabanillas, Jan Mendling, and Axel Polleres. Resource allocation with dependencies in business process management systems. In Business Process Management Forum - BPM Forum 2016, volume 260 of Lecture Notes in Business Information Processing, pages 3--19, Rio de Janeiro, Brazil, September 2016. Springer. [ .pdf ]
Business Process Management Systems (BPMS) facilitate the execution of business processes by coordinating all involved resources. Traditional BPMS assume that these resources are independent from one another, which justifies a greedy allocation strategy of offering each work item as soon as it becomes available. In this paper, we develop a formal technique to derive an optimal schedule for work items that have dependencies and resource conflicts. We build our work on Answer Set Programming (ASP), which is supported by a wide range of efficient solvers. We apply our technique in an industry scenario and evaluate its effectiveness. In this way, we contribute an explicit notion of resource dependencies within BPMS research and a technique to derive optimal schedules.
[MNUP16] Johann Mitlöhner, Sebastian Neumaier, Jürgen Umbrich, and Axel Polleres. Characteristics of open data CSV files. In 2nd International Conference on Open and Big Data, August 2016. Invited paper. [ DOI | .pdf ]
This work analyzes an Open Data corpus containing 200K tabular resources with a total file size of 413GB from a data consumer perspective. Our study shows that ∼10% of the resources in Open Data portals are labelled as tabular data, of which only 50% can be considered CSV files. The study inspects the general shape of these tabular data, reports on column and row distributions, and analyses the availability of (multiple) header rows and whether a file contains multiple tables. In addition, we inspect and analyze the table column types, detect missing values, and report on the distribution of the values.
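A basic shape inspection of the kind described can be sketched with Python's standard csv module; the sample resource below is hypothetical:

```python
import csv
import io

def table_shape(text):
    """Report basic shape statistics of a CSV resource: delimiter,
    row/column counts, raggedness, and missing (empty) cells."""
    dialect = csv.Sniffer().sniff(text)  # guess the delimiter
    rows = list(csv.reader(io.StringIO(text), dialect))
    widths = [len(r) for r in rows]
    return {
        "delimiter": dialect.delimiter,
        "rows": len(rows),
        "columns": max(widths),
        "ragged": len(set(widths)) > 1,  # rows of differing length
        "empty_cells": sum(cell.strip() == "" for r in rows for cell in r),
    }

# Hypothetical Open Data resource using ';' as delimiter, with one missing value.
sample = "city;year;value\nVienna;2015;42\nLinz;2015;\n"
print(table_shape(sample))
```

Run over a corpus, such per-file statistics aggregate into exactly the kind of distributions (rows, columns, missing values) the study reports.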
[ACPS16] Albin Ahmeti, Diego Calvanese, Axel Polleres, and Vadim Savenkov. Handling inconsistencies due to class disjointness in SPARQL updates. In Harald Sack, Eva Blomqvist, Mathieu d'Aquin, Chiara Ghidini, Simone Paolo Ponzetto, and Christoph Lange, editors, Proceedings of the 13th European Semantic Web Conference (ESWC2016), volume 9678 of Lecture Notes in Computer Science (LNCS), pages 387--404, Heraklion, Greece, June 2016. Springer. [ .pdf ]
The problem of updating ontologies has received increased attention in recent years. In the approaches proposed so far, either the update language is restricted to (sets of) atomic updates, or, where the full SPARQL update language is allowed, the TBox language is restricted so that no inconsistencies can arise. In this paper we discuss directions to overcome these limitations. Starting from a DL-Lite fragment covering RDFS and concept disjointness axioms, we define three semantics for SPARQL update: under cautious semantics, inconsistencies are resolved by rejecting updates potentially introducing conflicts; under brave semantics, instead, conflicts are overridden in favor of new information where possible; finally, the fainthearted semantics is a compromise between the former two approaches, designed to accommodate as much of the new information as possible, as long as consistency with the prior knowledge is not violated. We show how these semantics can be implemented in SPARQL via rewritings of polynomial size and draw first conclusions from their practical evaluation.
[GHHP16] Claudio Gutierrez, Aidan Hogan, Daniel Hernández, and Axel Polleres. Certain answers for SPARQL? In Alberto Mendelzon International Workshop on Foundations of Data Management (AMW2016), Panama City, Panama, June 2016. Short paper. [ .pdf ]
The standard semantics of SPARQL and the standard semantics of RDF differ fundamentally, sometimes leading to unintuitive answers. In this paper, we thus motivate an alternative semantics for SPARQL based on certain answers, taking into account the existential nature of blank nodes, the open-world assumption of RDF, and perhaps even the lack of a unique name assumption. We propose that SPARQL is a natural use-case for applying existing techniques that approximate certain answers in relational database settings.
[PRK16] Axel Polleres, Juan Reutter, and Egor V. Kostylev. Nested constructs vs. sub-selects in SPARQL. In Alberto Mendelzon International Workshop on Foundations of Data Management (AMW2016), Panama City, Panama, June 2016. [ .pdf ]
The issue of subqueries in SPARQL has appeared in different papers as an extension point to the original SPARQL query language. In particular, nested CONSTRUCT queries in FROM clauses are a feature that was discussed as a potential input for SPARQL 1.1 but resolved to be left out in favour of select subqueries, under the -- unproven -- conjecture that such subqueries can express nested construct queries. In this paper, we show that it is indeed possible to unfold nested SPARQL construct queries into subqueries in SPARQL 1.1; our transformation, however, requires an exponential blowup in the nesting depth. This suggests that nested construct queries are indeed a useful syntactic feature in SPARQL that cannot compactly be replaced by subqueries.
[AFPS16] Albin Ahmeti, Javier D. Fernández, Axel Polleres, and Vadim Savenkov. Towards updating Wikipedia via DBpedia mappings and SPARQL. In Alberto Mendelzon International Workshop on Foundations of Data Management (AMW2016), Panama City, Panama, June 2016. Short paper. [ .pdf ]
DBpedia is a community effort that has created the most important cross-domain datasets in RDF, a focal point of the Linked Open Data (LOD) cloud. At its core there is a set of declarative mappings extracting the data from Wikipedia infoboxes and tables into RDF. However, while DBpedia focuses on publishing knowledge in a machine-readable way, little attention has been paid to the benefits of supporting machine updates. This greatly restricts the possibilities of automatic curation of the DBpedia data that could be semi-automatically propagated to Wikipedia, and also prevents maintainers from evaluating the impact of their edits on the consistency of knowledge. Excluding the DBpedia taxonomy from the editing cycle is a major drawback which we aim to address. This paper starts a discussion of DBpedia, making a case for a benchmark for Ontology-Based Data Management (OBDM). As we show, although based on fairly restricted mappings (which we cast as a variant of nested tgds here) and a minimalistic TBox language, accommodating DBpedia updates is intricate from different perspectives, ranging from conceptual questions (what is an adequate semantics for DBpedia SPARQL updates?) to challenges related to user interface design.
[BDP+16] Marta Borriello, Christian Dirschl, Axel Polleres, Phil Ritchie, Frank Salliau, Felix Sasaki, and Giannis Stoitsis. From xml to rdf step by step: approaches for leveraging xml workflows with linked data. In XML Prague 2016 -- Conference Proceedings, pages 121--138, Prague, Czech Republic, February 2016. [ http ]
There have been many discussions about benefits and drawbacks of XML vs. RDF. In practice more and more XML and linked data technologies are being used together. This leads to opportunities and uncertainties: for years companies have invested heavily in XML workflows. They are not willing to throw them away for the benefits of linked data. This paper aims to start a discussion on approaches for integrating XML and RDF workflows. This should help the incremental adoption of linked data, without the need to throw away XML tooling and to start content processing from scratch.

2015


[CMPH15] Cristina Cabanillas, Jan Mendling, Axel Polleres, and Alois Haselböck. Safety-critical human- and data-centric process management in engineering projects. In Proceedings of the 5th International Symposium on Data-driven Process Discovery and Analysis (SIMPDA 2015), volume 1527 of CEUR Workshop Proceedings, pages 145--148, Vienna, Austria, December 2015. CEUR-WS.org. [ .pdf ]
Complex technical systems, industrial systems and infrastructure systems are rich in customizable features and raise high demands on quality and safety-critical aspects. The activities to create complete, valid and reliable planning and customization process data for a product deployment are part of an overarching engineering process that is crucial for the successful completion of a project and, particularly, for verifying compliance with existing regulations in a distributed, heterogeneous environment. In this paper, we discuss the challenges that process management needs to address in such complex engineering projects, and present an architecture that comprises the required functionality, together with findings and results already obtained for its different components.
[BWP15] Stefan Belk, Gerhard Wohlgenannt, and Axel Polleres. Exploring and exploiting(?) the awkward connections between SKOS and OWL. In Jeff Z. Pan and Serena Villata, editors, ISWC 2015 Posters & Demos, volume 1486 of CEUR Workshop Proceedings, Bethlehem, Pennsylvania, October 2015. CEUR-WS.org. Poster abstract. [ .pdf ]
In the Semantic Web, the Web Ontology Language (OWL) vocabulary is used for the representation of formal ontologies, while the Simple Knowledge Organisation System (SKOS) is a vocabulary designed for thesauri or concept taxonomies without formal semantics. Despite their different nature, on the Web these two vocabularies are often used together. Here, we try to explore and exploit the joint usage of OWL and SKOS. More precisely, we first define usage patterns to detect problematic modeling from connections between SKOS and OWL. Next, we also investigate whether additional information can be inferred from joint usage with SKOS in order to enrich the semantic inferences obtained through OWL alone -- although SKOS was designed without formal semantics, we argue for this heretic approach by applicability “in the wild”: the patterns for modeling errors and inference of new information are transformed to SPARQL queries and applied to real-world data from the Billion Triple Challenge 2014; we manually evaluate this corpus and assess the quality of the defined patterns empirically.
[BMPS15a] Stefan Bischof, Christoph Martin, Axel Polleres, and Patrik Schneider. Collecting, integrating, enriching and republishing open city data as linked data. In Marcelo Arenas, Oscar Corcho, Elena Simperl, Markus Strohmaier, Mathieu d'Aquin, Kavitha Srinivas, Paul T. Groth, Michel Dumontier, Jeff Heflin, Krishnaprasad Thirunarayan, and Steffen Staab, editors, Proceedings of the 14th International Semantic Web Conference (ISWC 2015) - Part II, volume 9367 of Lecture Notes in Computer Science (LNCS), pages 57--75, Bethlehem, Pennsylvania, October 2015. Springer. [ .pdf ]
Access to high quality and recent data is crucial both for decision makers in cities as well as for the public. Likewise, infrastructure providers could offer more tailored solutions to cities based on such data. However, even though there are many data sets containing relevant indicators about cities available as open data, it is cumbersome to integrate and analyze them, since the collection is still a manual process and the sources are not connected to each other upfront. Further, disjoint indicators and cities across the available data sources lead to a large proportion of missing values when integrating these sources. In this paper we present a platform for collecting, integrating, and enriching open data about cities in a reusable and comparable manner: we have integrated various open data sources and present approaches for predicting missing values, where we use standard regression methods in combination with principal component analysis (PCA) to improve quality and amount of predicted values. Since indicators and cities only have partial overlaps across data sets, we particularly focus on predicting indicator values across data sets, where we extend, adapt, and evaluate our prediction model for this particular purpose: as a “side product” we learn ontology mappings (simple equations and sub-properties) for pairs of indicators from different data sets. Finally, we republish the integrated and predicted values as linked open data.
[SP15] Simon Steyskal and Axel Polleres. Towards formal semantics for ODRL policies. In 9th International Web Rule Symposium (RuleML2015), number 9202 in Lecture Notes in Computer Science (LNCS), pages 360--375, Berlin, Germany, August 2015. Springer. [ DOI | .pdf ]
Most policy-based access control frameworks explicitly model whether execution of certain actions (read, write, etc.) on certain assets should be permitted or denied and usually assume that such actions are disjoint from each other, i.e., there does not exist any explicit or implicit dependency between actions of the domain. This in turn means that conflicts among rules or policies can only occur if those contradictory rules or policies constrain the same action. In the present paper - motivated by the example of ODRL 2.1 as a policy expression language - we follow a different approach and shed light on possible dependencies among actions of access control policies. We propose an interpretation of the formal semantics of general ODRL policy expressions and motivate rule-based reasoning over such policy expressions taking both explicit and implicit dependencies among actions into account. Our main contributions are (i) an exploration of different kinds of ambiguities that might emerge based on explicit or implicit dependencies among actions, and (ii) a formal interpretation of the semantics of general ODRL policies based on a defined abstract syntax for ODRL, which shall eventually enable rule-based reasoning over a set of such policies.
[UNP15a] Jürgen Umbrich, Sebastian Neumaier, and Axel Polleres. Quality assessment & evolution of open data portals. In IEEE International Conference on Open and Big Data, Rome, Italy, August 2015. Best paper award. [ .pdf ]
Despite the enthusiasm caused by the availability of a steadily increasing amount of openly available, structured data, the first critical voices have appeared, addressing the emerging issue of low quality in the metadata and data sources of Open Data portals, which poses a serious risk that could disrupt the Open Data project. However, no comprehensive reports about the actual quality of Open Data portals exist. In this work, we present our efforts to monitor and assess the quality of 82 active Open Data portals, run by organisations across 35 different countries. We discuss our quality metrics and report comprehensive findings from analysing the data and the evolution of the portals since September 2014. Our results include findings about a steady growth of information, a high heterogeneity across the portals in various aspects, as well as insights on openness, contactability, and the availability of metadata.
[BCM+15] Saimir Bala, Cristina Cabanillas, Jan Mendling, Andreas Rogge-Solti, and Axel Polleres. Mining project-oriented business processes. In 13th International Conference on Business Process Management (BPM 2015), volume 9253 of Lecture Notes in Computer Science, pages 425--440, Innsbruck, Austria, August 2015. Springer. [ .pdf ]
Large engineering processes need to be monitored in detail regarding what was done and when, in order to prove compliance with rules and regulations. A typical problem of these processes is the lack of control that a central process engine would provide, such that it is difficult to track the actual course of work even if data is stored in version control systems (VCS). In this paper, we address this problem by defining a mining technique that helps to generate models that visualize the work history as GANTT charts. To this end, we formally define the notion of a project-oriented business process and a corresponding mining algorithm. Our evaluation based on a prototypical implementation demonstrates the benefits in comparison to existing process mining approaches for this specific class of processes.
[HCPM15] Giray Havur, Cristina Cabanillas, Axel Polleres, and Jan Mendling. Automated Resource Allocation in Business Processes with Answer Set Programming. In 11th International Workshop on Business Process Intelligence 2015, Innsbruck, Austria, August 2015. [ .pdf ]
Human resources are of central importance for executing and supervising business processes. An optimal resource allocation can dramatically reduce the undesirable consequences of resource shortages. However, existing approaches for resource allocation have some limitations, e.g., they do not consider concurrent process instances or loops in business processes, which may greatly alter resource requirements. This paper introduces a novel approach for automatically allocating resources to process activities in a time-optimal way that is designed to tackle the aforementioned shortcomings. We achieve this by representing the resource allocation problem in Answer Set Programming (ASP), which allows us to model the problem in an extensible, modular, and thus maintainable way, and which is supported by various efficient solvers.
[BKPR15] Stefan Bischof, Markus Krötzsch, Axel Polleres, and Sebastian Rudolph. Schema-agnostic query rewriting for OWL QL. In 28th International Workshop on Description Logics (DL2015), Athens, Greece, June 2015. Extended Abstract (full paper at ISWC2014). [ .pdf ]
In this extended abstract, we review our recent research on ontology-based query answering in OWL QL, first published at the International Semantic Web Conference 2014. OWL QL is a popular member of the DL-Lite family that is part of the W3C OWL 2 standard. Typical implementations use the OWL QL TBox to rewrite a conjunctive query into an equivalent set of queries, to be answered against the ABox of the ontology. With the adoption of the recent SPARQL 1.1 standard, however, RDF databases are capable of answering much more expressive queries directly, and we ask how this can be exploited in query rewriting. We find that SPARQL 1.1 is powerful enough to “implement” a full-fledged OWL QL reasoner in a single query. Using additional SPARQL 1.1 features, we develop a new method of schema-agnostic query rewriting, where arbitrary conjunctive queries over OWL QL are rewritten into equivalent SPARQL 1.1 queries in a way that is fully independent of the actual schema. This allows us to query RDF data under OWL QL entailment without extracting or preprocessing OWL.
[ACSP15] Albin Ahmeti, Diego Calvanese, Vadim Savenkov, and Axel Polleres. Dealing with Inconsistencies due to Class Disjointness in SPARQL Update. In 28th International Workshop on Description Logics (DL2015), Athens, Greece, June 2015. [ .pdf ]
The problem of updating ontologies has received increased attention in recent years. In the approaches proposed so far, either the update language is restricted to (sets of) atomic updates, or, where the full SPARQL update language is allowed, the TBox language is restricted so that no inconsistencies can arise. In this paper we discuss directions to overcome these limitations. Starting from a DL-Lite fragment covering RDFS and concept/class disjointness axioms, we define two semantics for SPARQL update: under cautious semantics, inconsistencies are resolved by rejecting updates potentially introducing conflicts; under brave semantics, instead, conflicts are overridden in favor of new information where possible. The latter approach builds upon existing work on the evolution of DL-Lite knowledge bases, setting it in the context of generic SPARQL updates.
[Pol15b] Axel Polleres. Integrating open data: (how) can description logics help me? In 28th International Workshop on Description Logics (DL2015), Athens, Greece, June 2015. Abstract/Invited Talk. [ .pdf ]
In this talk, we will report on experiences and obstacles in collecting and integrating Open Data across various data sets. We will discuss how methods both from knowledge representation and reasoning and from statistics and data mining can be used to tackle some of the issues we encountered.
[BMPS15b] Stefan Bischof, Christoph Martin, Axel Polleres, and Patrik Schneider. Open City Data Pipeline: Collecting, Integrating, and Predicting Open City Data. In 4th Workshop on Knowledge Discovery and Data Mining Meets Linked Open Data (Know@LOD), Portoroz, Slovenia, May 2015. [ .pdf ]
Having access to high-quality and recent data is crucial both for decision makers in cities and for informing the public; likewise, infrastructure providers could offer more tailored solutions to cities based on such data. However, even though there are many data sets containing relevant indicators about cities available as open data, it is cumbersome to integrate and analyze them, since the collection is still a manual process and the sources are not connected to each other upfront. Further, disjoint indicators and cities across the available data sources lead to a large proportion of missing values when integrating these sources. In the present paper we present a platform for collecting, integrating, and enriching open data about cities in a re-usable and comparable manner: we have integrated various open data sources and present approaches for predicting missing values, where we use different standard regression methods in combination with principal component analysis to improve the quality and amount of predicted values. Further, we re-publish the integrated and predicted values as linked open data.
[FPU15] Javier D. Fernández, Axel Polleres, and Jürgen Umbrich. Towards efficient archiving of dynamic linked open data. In Managing the Evolution and Preservation of the Data Web - First Diachron Workshop at ESWC 2015, pages 34--49, Portorož, Slovenia, May 2015. [ .pdf ]
The Linked Data paradigm has enabled a huge shared infrastructure for connecting data from different domains which can be browsed and queried together as a huge knowledge base. However, structured interlinked datasets in this Web of data are not static but continuously evolving, which suggests the investigation of approaches to preserve Linked Data across time. In this article, we survey and analyse current techniques addressing the problem of archiving different versions of Semantic Web data, with a focus on their space efficiency, the retrieval functionality they serve, and the performance of such operations.
[UMP15] Jürgen Umbrich, Nina Mrzelj, and Axel Polleres. Towards capturing and preserving changes on the web of data. In Managing the Evolution and Preservation of the Data Web - First Diachron Workshop at ESWC 2015, pages 50--65, Portorož, Slovenia, May 2015. [ .pdf ]
Existing Web archives aim to capture and preserve the changes of documents on the Web and provide data corpora of high value which are used in various areas (e.g., to optimise algorithms or to study the Zeitgeist of a generation). So far, Web archives have concentrated their efforts on capturing the large Web of documents with periodic snapshot crawls. Little attention has been paid to preserving the continuously growing Web of Data and to actually keeping track of the real frequency of changes. In this work we present our efforts to capture and archive the changes on the Web of Data. We describe our infrastructure and focus on evaluating strategies to accurately capture the changes of data and to estimate the crawl time for a given set of URLs, with the aim of optimally scheduling revisits of URLs with limited resources.
[UNP15b] Jürgen Umbrich, Sebastian Neumaier, and Axel Polleres. Towards assessing the quality evolution of open data portals. In ODQ2015: Open Data Quality: from Theory to Practice Workshop, Munich, Germany, March 2015. [ .pdf ]
In this work, we present the Open Data Portal Watch project, a public framework to continuously monitor and assess the (meta-)data quality in Open Data portals. We critically discuss the objectiveness of various quality metrics. Further, we report on early findings based on 22 weekly snapshots of 90 CKAN portals and highlight interesting observations and challenges.
[UHPD15] Jürgen Umbrich, Aidan Hogan, Axel Polleres, and Stefan Decker. Link traversal querying for a diverse web of data. Semantic Web -- Interoperability, Usability, Applicability (SWJ), 6(6):585--624, 2015. [ http ]
Traditional approaches for querying the Web of Data often involve centralised warehouses that replicate remote data. Conversely, Linked Data principles allow for answering queries live over the Web by dereferencing URIs to traverse remote data sources at runtime. A number of authors have looked at answering SPARQL queries in such a manner; these link-traversal based query execution (LTBQE) approaches for Linked Data offer up-to-date results and decentralised (i.e., client-side) execution, but must operate over incomplete dereferenceable knowledge available in remote documents, thus affecting response times and “recall” for query answers. In this paper, we study the recall and effectiveness of LTBQE, in practice, for the Web of Data. Furthermore, to integrate data from diverse sources, we propose lightweight reasoning extensions to help find additional answers. From the state-of-the-art which (1) considers only dereferenceable information and (2) follows rdfs:seeAlso links, we propose extensions to consider (3) owl:sameAs links and reasoning, and (4) lightweight RDFS reasoning. We then estimate the recall of link-traversal query techniques in practice: we analyse a large crawl of the Web of Data (the BTC’11 dataset), looking at the ratio of raw data contained in dereferenceable documents vs. the corpus as a whole and determining how much more raw data our extensions make available for query answering. We then stress-test LTBQE (and our extensions) in real-world settings using the FedBench and DBpedia SPARQL Benchmark frameworks, and propose a novel benchmark called QWalk based on random walks through diverse data. We show that link-traversal query approaches often work well in uncontrolled environments for simple queries, but need to retrieve an infeasible number of sources for more complex queries. We also show that our reasoning extensions increase recall at the cost of slower execution, often increasing the rate at which results are returned; conversely, we show that reasoning aggravates performance issues for complex queries.
[Pol15a] Axel Polleres. Das neue Berufsbild “Data Scientist”. OCG Journal, 03/2015:13--16, 2015. Invited article (in German). [ http ]
Not only in computer science, but in more and more disciplines and industries, ever-growing volumes of data play a decisive role. Understanding data yields decisive competitive advantages in many business areas, and the ability to technically process ever larger amounts of data opens up new opportunities, but also poses new challenges for industry and education. In light of these developments, the profession of the “Data Scientist” was dubbed the “Sexiest Job of the 21st Century” around three years ago, and already voices are emerging arguing that it no longer is. This despite the fact that “Data Scientist” is still a fuzzy term, and we have to ask ourselves: What does a Data Scientist actually do? What qualifications does someone need in order to work as a Data Scientist? In which direction will this profession evolve in the future? Are there dedicated educational programmes for it, and where? In the following, we briefly address each of these questions and attempt to answer them, at least from the author's point of view.

2014


[BKPR14] Stefan Bischof, Markus Krötzsch, Axel Polleres, and Sebastian Rudolph. Schema-agnostic query rewriting in SPARQL 1.1. In Proceedings of the 13th International Semantic Web Conference (ISWC 2014), Lecture Notes in Computer Science (LNCS). Springer, October 2014. [ .pdf ]
SPARQL 1.1 supports the use of ontologies to enrich query results with logical entailments, and OWL 2 provides a dedicated fragment OWL QL for this purpose. Typical implementations use the OWL QL schema to rewrite a conjunctive query into an equivalent set of queries, to be answered against the non-schema part of the data. With the adoption of the recent SPARQL 1.1 standard, however, RDF databases are capable of answering much more expressive queries directly, and we ask how this can be exploited in query rewriting. We find that SPARQL 1.1 is powerful enough to “implement” a full-fledged OWL QL reasoner in a single query. Using additional SPARQL 1.1 features, we develop a new method of schema-agnostic query rewriting, where arbitrary conjunctive queries over OWL QL are rewritten into equivalent SPARQL 1.1 queries in a way that is fully independent of the actual schema. This allows us to query RDF data under OWL QL entailment without extracting or preprocessing OWL axioms.
[BAPU14] Carlos Buil-Aranda, Axel Polleres, and Jürgen Umbrich. Strategies for executing federated queries in SPARQL1.1. In Proceedings of the 13th International Semantic Web Conference (ISWC 2014), Lecture Notes in Computer Science (LNCS). Springer, October 2014. [ .pdf ]
A common way for exposing RDF data on the Web is by means of SPARQL endpoints, i.e., Web services that implement the SPARQL protocol and allow end users and applications to query just the RDF data they want. However, servers hosting SPARQL endpoints typically restrict access to the data by limiting the number of results returned per query or the number of queries a client may issue within a given time period. To address these problems, we analysed different strategies for obtaining complete results for federated queries using SPARQL 1.1's federated query extension by rewriting the original query. We show that some seemingly intuitive “recipes” for decomposing federated queries to circumvent server limitations provide unsound results in the general case, and provide fixes or discuss under which restrictions these recipes are still applicable. Finally, we evaluate the different proposed strategies in order to check their feasibility in practice.
[ACP14b] Albin Ahmeti, Diego Calvanese, and Axel Polleres. Updating RDFS ABoxes and TBoxes in SPARQL. In Proceedings of the 13th International Semantic Web Conference (ISWC 2014), Lecture Notes in Computer Science (LNCS). Springer, October 2014. [ .pdf ]
Updates in RDF stores have recently been standardised in the SPARQL 1.1 Update specification. However, computing answers entailed by ontologies in triple stores is usually treated orthogonally to updates. Even the W3C's recent SPARQL 1.1 Update language and SPARQL 1.1 Entailment Regimes specifications explicitly exclude a standard behaviour for how SPARQL endpoints should treat entailment regimes other than simple entailment in the context of updates. In this paper, we outline different routes to close this gap. We define a fragment of SPARQL basic graph patterns corresponding to (the RDFS fragment of) DL-Lite and the corresponding SPARQL update language, dealing with updates both of ABox and of TBox statements. We discuss possible semantics along with potential strategies for implementing them. We treat both (i) materialised RDF stores, which store all entailed triples explicitly, and (ii) reduced RDF stores, that is, redundancy-free RDF stores that do not store any RDF triples (corresponding to DL-Lite ABox statements) entailed by others already.
[DPLB14] Daniele Dell'Aglio, Axel Polleres, Nuno Lopes, and Stefan Bischof. Querying the web of data with XSPARQL 1.1. In ISWC2014 Developers Workshop, volume 1268 of CEUR Workshop Proceedings. CEUR-WS.org, October 2014. [ .pdf ]
On the Web and in corporate environments there exists a lot of XML data in various formats. XQuery and XSLT serve as query and transformation languages for XML. But as RDF is also becoming a mainstream format for the Web of Data, transformation languages between these formats are required. XSPARQL is a hybrid language that provides an integration framework for XML, RDF, but also JSON and relational data, by partially combining several languages such as XQuery, SPARQL 1.1 and SQL. In this paper we present the latest open source release of the XSPARQL engine, which is based on standard software components (Jena and Saxon), and outline possible applications of XSPARQL 1.1 to address Web data integration use cases.
[SBPS14] Gottfried Schenner, Stefan Bischof, Axel Polleres, and Simon Steyskal. Integrating distributed configurations with RDFS and SPARQL. In 16th International Configuration Workshop, Novi Sad, Serbia, September 2014. [ .pdf ]
Large interconnected technical systems (e.g. railway networks, power grids, computer networks) are often configured with the help of multiple configurators, which store their configurations in separate databases based on heterogeneous domain models (ontologies). When users want to ask queries over several distributed configurations, these domain models need to be aligned. To this end, standard mechanisms for ontology and data integration are required that enable combining query answering with reasoning about these distributed configurations. In this paper we describe our experience with using standard Semantic Web technologies (RDFS and SPARQL) in such a context.
[SP14] Simon Steyskal and Axel Polleres. Defining expressive access policies for linked data using the ODRL ontology 2.0. In Proceedings of the SEMANTiCS 2014, ACM International Conference Proceedings Series, Leipzig, Germany, September 2014. ACM. Short paper. [ .pdf ]
Together with the latest efforts in publishing Linked (Open) Data, legal issues around publishing and consuming such data are gaining increased interest. Particular areas of interest include (i) how to define more expressive access policies which go beyond common licenses, (ii) how to introduce pricing models for online datasets (for non-open data), and (iii) how to realize (i)+(ii) while providing descriptions of the respective metadata that are both human-readable and machine-processable. In this paper, we show based on different examples that the Open Digital Rights Language (ODRL) Ontology 2.0 is able to address all previously mentioned issues, i.e., it is suitable to express a large variety of different access policies for Linked Data. By defining policies as ODRL in RDF we aim for (i) higher flexibility and simplicity in usage, (ii) machine/human readability, and (iii) fine-grained policy expressions for Linked (Open) Data.
[ACP14a] Albin Ahmeti, Diego Calvanese, and Axel Polleres. SPARQL Update for Materialized Triple Stores under DL-LiteRDFS Entailment. In 27th International Workshop on Description Logics (DL2014), Vienna, Austria, July 2014. [ .pdf ]
Updates in RDF stores have recently been standardised in the SPARQL 1.1 Update specification. However, computing answers entailed by ontologies in triple stores is usually treated orthogonally to updates. Even W3C's SPARQL 1.1 Update language and SPARQL 1.1 Entailment Regimes specifications explicitly exclude a standard behaviour for entailment regimes other than simple entailment in the context of updates. In this paper, we take a first step to close this gap. We define a fragment of SPARQL basic graph patterns corresponding to (the RDFS fragment of) DL-Lite and the corresponding SPARQL update language, dealing with updates both of ABox and of TBox statements. We discuss possible semantics along with potential strategies for implementing them. Particularly, we treat materialised RDF stores, which store all entailed triples explicitly, and preservation of materialisation upon ABox and TBox updates.
[BAP14] Carlos Buil-Aranda and Axel Polleres. In Georg Gottlob and Jorge Pérez, editors, Alberto Mendelzon International Workshop on Foundations of Data Management (AMW2014), volume 1189 of CEUR Workshop Proceedings. CEUR-WS.org, June 2014.
The most common way for exposing RDF data on the Web is by means of SPARQL endpoints. These endpoints are Web services that implement the SPARQL protocol and allow end users and applications to query just the RDF data they want. However, the servers hosting SPARQL endpoints restrict access to the data by limiting the number of results returned per query or the number of queries a client may issue within a given time period. To address these problems, we analysed different alternatives for obtaining complete query result sets from SPARQL endpoints by rewriting the original query using SPARQL 1.1's federated query extension. We show that some of the commonly used SPARQL query patterns for this task provide unsound results while other patterns are more suitable. We provide equivalent query patterns that help users obtain complete result sets while circumventing the limitations imposed by servers.
[PS14] Axel Polleres and Simon Steyskal. Semantic web standards for publishing and integrating open data. In Miguel-Angel Sicilia and Pablo Serrano-Balazote, editors, Handbook of Research on Advanced ICT Integration for Governance and Policy Modeling, pages 28--47. IGI Global, June 2014. [ http ]
The World Wide Web Consortium (W3C) as the main standardization body for Web standards has set a particular focus on publishing and integrating Open Data. In this chapter, the authors explain various standards from the W3C's Semantic Web activity and the—potential—role they play in the context of Open Data: RDF, as a standard data format for publishing and consuming structured information on the Web; the Linked Data principles for interlinking RDF data published across the Web and leveraging a Web of Data; RDFS and OWL to describe vocabularies used in RDF and for describing mappings between such vocabularies. The authors conclude with a review of current deployments of these standards on the Web, particularly within public Open Data initiatives, and discuss potential risks and challenges.
[UKPS14] Jürgen Umbrich, Marcel Karnstedt, Axel Polleres, and Kai-Uwe Sattler. Index-based source selection and optimization. In Katja Hose, Ralf Schenk, and Andreas Harth, editors, Linked Data Management, pages 311--337. Chapman and Hall/CRC, May 2014. [ http ]
[HAMP14] Aidan Hogan, Marcelo Arenas, Alejandro Mallea, and Axel Polleres. Everything you always wanted to know about blank nodes. Journal of Web Semantics (JWS), 27:42--69, 2014. [ http ]
In this paper we thoroughly cover the issue of blank nodes, which have been defined in RDF as `existential variables'. We first introduce the theoretical precedent for existential blank nodes from first order logic and incomplete information in database theory. We then cover the different (and sometimes incompatible) treatment of blank nodes across the W3C stack of RDF-related standards. We present an empirical survey of the blank nodes present in a large sample of RDF data published on the Web (the dataset), where we find that 25.7% of unique RDF terms are blank nodes, that 44.9% of documents and 66.2% of domains featured use of at least one blank node, and that aside from one Linked Data domain whose RDF data contains many “blank node cycles”, the vast majority of blank nodes form tree structures that are efficient to compute simple entailment over. With respect to the RDF-merge of the full data, we show that 6.1% of blank-nodes are redundant under simple entailment. The vast majority of non-lean cases are isomorphisms resulting from multiple blank nodes with no discriminating information being given within an RDF document or documents being duplicated in multiple Web locations. Although simple entailment is NP-complete and leanness-checking is coNP-complete, in computing this latter result, we demonstrate that in practice, real-world RDF graphs are sufficiently “rich” in ground information for problematic cases to be avoided by non-naive algorithms.
[HLP14] Pascal Hitzler, Jens Lehmann, and Axel Polleres. Logics for the semantic web. In Dov M. Gabbay, Jörg H. Siekmann, and John Woods, editors, Computational Logic, volume 9 of Handbook of the History of Logic, pages 679--710. Elsevier, 2014. [ .pdf ]
This chapter summarizes the developments of Semantic Web standards such as RDF, OWL, RIF and SPARQL and their foundations in Logics. It aims at providing an entry point particularly for logicians to these standards.
[Pol14] Axel Polleres. SPARQL. In Reda Alhajj and Jon G. Rokne, editors, Encyclopedia of Social Network Analysis and Mining, pages 1960--1966. Springer, 2014. [ DOI ]

2013


[GPM13] Marina Gueroussova, Axel Polleres, and Sheila A. McIlraith. SPARQL with qualitative and quantitative preferences. In Emanuele Della Valle, Markus Krötzsch, Stefan Schlobach, and Irene Celino, editors, 2nd International Workshop on Ordering and Reasoning (OrdRing 2013), CEUR Workshop Proceedings, Sydney, Australia, October 2013. CEUR-WS.org. Position Paper. Technical Report version available at: ftp://ftp.cs.toronto.edu/csrg-technical-reports/619/619.pdf. [ .pdf ]
The volume and diversity of data that is queryable via SPARQL and its increasing integration motivate the desire to query SPARQL information sources via the specification of preferred query outcomes. Such preference-based queries support the ordering of query outcomes with respect to a user's measure of the quality of the response. In this position paper we argue for the incorporation of preference queries into SPARQL. We propose an extension to the SPARQL query language that supports the specification of qualitative and quantitative preferences over query outcomes and examine the realization of the resulting preference-based queries via off-the-shelf SPARQL engines.
[AP13] Albin Ahmeti and Axel Polleres. SPARQL update under RDFS entailment in fully materialized and redundancy-free triple stores. In 2nd International Workshop on Ordering and Reasoning (OrdRing 2013), CEUR Workshop Proceedings, Sydney, Australia, October 2013. CEUR-WS.org. [ .pdf ]
Processing the dynamic evolution of RDF stores has recently been standardized in the SPARQL 1.1 Update specification. However, computing answers entailed by ontologies in triple stores is usually treated orthogonally to updates. Even the W3C's recent SPARQL 1.1 Update language and SPARQL 1.1 Entailment Regimes specifications explicitly exclude a standard behavior for how SPARQL endpoints should treat entailment regimes other than simple entailment in the context of updates. In this paper, we take a first step to close this gap, by drawing from query rewriting techniques explored in the context of DL-Lite. We define a fragment of SPARQL basic graph patterns corresponding to (the RDFS fragment of) DL-Lite and the corresponding SPARQL Update language, discussing possible semantics along with potential strategies for implementing them. We treat both (i) reduced RDF stores, that is, redundancy-free RDF stores that do not store any RDF triples (corresponding to DL-Lite ABox statements) entailed by others already, and (ii) materialized RDF stores, which store all entailed triples explicitly.
[SP13b] Simon Steyskal and Axel Polleres. Mix'n'match: Iteratively combining ontology matchers in an anytime fashion. In 8th International Workshop on Ontology Matching, volume 111 of CEUR Workshop Proceedings, Sydney, Australia, October 2013. CEUR-WS.org. Extended version available at: http://www.steyskal.info/om2013/extendedversion.pdf. [ .pdf ]
We present a novel architecture for combining off-the-shelf ontology matchers based on iterative calls and exchanging information in the form of reference alignments. Unfortunately though, only a few of the matchers contesting in the past years' OAEI campaigns actually allow the provision of reference alignments in the standard OAEI alignment format to support such a combined matching process. We bypass this lacking functionality by using simple URI replacement to “emulate” reference alignments in the aligned ontologies. While some matchers still consider classes and properties in ontologies aligned in such fashion as different, we experimentally prove that our iterative approach benefits from this emulation, achieving the best results in terms of F-measure on parts of the OAEI benchmark suite, compared to the single results of the competing matchers as well as their combined results. The new combined matcher -- Mix'n'Match -- integrates different matchers in a multi-threaded architecture and provides an anytime behavior in the sense that it can be stopped anytime with the best combined matchings found so far.
[PFSF13] Axel Polleres, Melanie Frühstück, Gottfried Schenner, and Gerhard Friedrich. Debugging non-ground ASP programs with choice rules, cardinality constraints and weight constraints. In Pedro Cabalar and Tran Cao Son, editors, Proceedings of the 12th International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR-2013), volume 8148 of Lecture Notes in Computer Science (LNCS), pages 452--464, Corunna, Spain, September 2013. Springer. [ .pdf ]
When deploying Answer Set Programming (ASP) in an industrial context, for instance for (re-)configuration (Friedrich et al., 2011), knowledge engineers need debugging support on non-ground programs. Current approaches to ASP debugging, however, do not cover extended modeling features of ASP, such as choice rules, conditional cardinality and weight constraints. To this end, we encode non-ground ASP programs using extended modeling features into normal logic programs; this encoding extends existing encodings for the case of ground programs to the non-ground case. We subsequently deploy this translation in order to extend ASP debugging for non-ground normal logic programs. We have implemented and tested the approach and provide evaluation results.
[SP13a] Simon Steyskal and Axel Polleres. Mix'n'match: An alternative approach for combining ontology matchers. In 12th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE 2013), Lecture Notes in Computer Science (LNCS), Graz, Austria, September 2013. Springer. Short paper. [ http ]
The existence of a standardized ontology alignment format promoted by the Ontology Alignment Evaluation Initiative (OAEI) potentially enables different ontology matchers to be combined and used together. Along these lines, we present a novel architecture for combining ontology matchers based on iterative calls of off-the-shelf matchers that exchange information in the form of reference mappings in this standard alignment format. However, we argue that only a few of the matchers contesting in the past years' OAEI campaigns actually allow the provision of reference alignments to support the matching process. We bypass this lacking functionality by introducing an alternative approach for aligning results of different ontology matchers using simple URI replacement in the aligned ontologies. We experimentally prove that our iterative approach benefits from this emulation of reference alignments.
[BPS13] Stefan Bischof, Axel Polleres, and Simon Sperl. City data pipeline - a system for making open data useful for cities. In Steffen Lohmann, editor, Proceedings of the I-SEMANTICS 2013 Posters & Demonstrations Track, volume 1026 of CEUR Workshop Proceedings, pages 45--49, Graz, Austria, September 2013. CEUR-WS.org. [ .pdf ]
Some cities publish data in an open form. But even more cities can profit from the data that is already available as open or linked data. Unfortunately, open data from different sources is usually provided in different, heterogeneous data formats. With the City Data Pipeline we aim to integrate data about cities in a common data model by using Semantic Web technologies. Eventually we want to support city officials with their decisions by providing automated analytics support.
[PHDU13] Axel Polleres, Aidan Hogan, Renaud Delbru, and Jürgen Umbrich. RDFS & OWL reasoning for linked data. In Sebastian Rudolph, Georg Gottlob, Ian Horrocks, and Frank van Harmelen, editors, Reasoning Web. Semantic Technologies for Intelligent Data Access (Reasoning Web 2013), volume 8067 of Lecture Notes in Computer Science (LNCS), pages 91--149. Springer, Mannheim, Germany, July 2013. [ DOI | .pdf ]
Linked Data promises that a large portion of Web Data will be usable as one big interlinked RDF database against which structured queries can be answered. In this lecture we will show how reasoning -- using RDF Schema (RDFS) and the Web Ontology Language (OWL) -- can help to obtain more complete answers for such queries over Linked Data. We first look at the extent to which RDFS and OWL features are being adopted on the Web. We then introduce two high-level architectures for query answering over Linked Data and outline how these can be enriched by (lightweight) RDFS and OWL reasoning, enumerating the main challenges faced and discussing reasoning methods that make practical and theoretical trade-offs to address these challenges. In the end, we also ask whether or not RDFS and OWL are enough and discuss numeric reasoning methods that are beyond the scope of these standards but that are often important when integrating Linked Data from several, heterogeneous sources.
[Pol13b] Axel Polleres. Building blocks for a linked data ecosystem. In Andreas Harth, Craig A. Knoblock, Kai-Uwe Sattler, and Rudi Studer, editors, Report from Dagstuhl Seminar 13252 Interoperation in Complex Information Ecosystems, pages 116--118. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, June 2013. Position Statement. [ http ]
It is probably a good moment to take a step back and critically reflect on which puzzle pieces might be missing to achieve (even more) widespread adoption of Linked Data. Particularly, it seems that more than a few publishing principles and community enthusiasm will be necessary to keep the idea of a Web-scale data ecosystem afloat. We outline some challenges and missing building blocks to complement the available standards for Linked Data towards a fully functioning ecosystem.
[BP13] Stefan Bischof and Axel Polleres. RDFS with attribute equations via SPARQL rewriting. In Philipp Cimiano, Oscar Corcho, Valentina Presutti, Laura Hollink, and Sebastian Rudolph, editors, The Semantic Web: Semantics and Big Data -- Proceedings of the 10th ESWC (ESWC2013), volume 7882 of Lecture Notes in Computer Science (LNCS), pages 335--350, Montpellier, France, May 2013. Springer. [ .pdf ]
In addition to taxonomic knowledge about concepts and properties typically expressible in languages such as RDFS and OWL, implicit information in an RDF graph may be likewise determined by arithmetic equations. The main use case here is exploiting knowledge about functional dependencies among numerical attributes expressible by means of such equations. While some of this knowledge can be encoded in rule extensions to ontology languages, we provide an arguably more flexible framework that treats attribute equations as first class citizens in the ontology language. The combination of ontological reasoning and attribute equations is realized by extending query rewriting techniques already successfully applied for ontology languages such as (the DL-Lite-fragment of) RDFS or OWL, respectively. We deploy this technique for rewriting SPARQL queries and discuss the feasibility of alternative implementations, such as rule-based approaches.
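To give a flavour of the idea behind attribute equations (a toy sketch under invented names such as `tempF`/`tempC`; the paper's actual machinery rewrites SPARQL queries, not Python lookups): an equation declared at the ontology level lets a request for a derived attribute be rewritten into arithmetic over the attributes actually stored.

```python
# Illustrative sketch: an ontology-level equation tempF = tempC * 9/5 + 32
# allows answering queries for tempF even when only tempC is stored, by
# rewriting the lookup into arithmetic over available attributes.

equations = {
    # derived attribute -> (attributes it depends on, arithmetic to apply)
    "tempF": (["tempC"], lambda c: c * 9 / 5 + 32),
}

def lookup(entity, attr, data):
    if attr in data[entity]:
        return data[entity][attr]          # stored explicitly
    if attr in equations:                  # rewrite via the equation
        deps, f = equations[attr]
        return f(*(lookup(entity, d, data) for d in deps))
    raise KeyError(attr)

data = {"vienna": {"tempC": 25.0}}
print(lookup("vienna", "tempF", data))     # 77.0
```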
[BCL+13] Olivier Boissier, Marco Colombetti, Michael Luck, John-Jules Meyer, and Axel Polleres. Norms, organizations, and semantics. The Knowledge Engineering Review, 28(1):107--116, March 2013. [ DOI ]
This paper integrates the responses to a set of questions from a distinguished set of panelists involved in a discussion at the Agreement Technologies workshop in Cyprus in December 2009. The panel was concerned with the relationship between the research areas of semantics, norms, and organizations, and the ways in which each may contribute to the development of the others in support of next generation agreement technologies.
[SZF+13] Ratnesh Sahay, Antoine Zimmermann, Ronan Fox, Axel Polleres, and Manfred Hauswirth. A formal investigation of semantic interoperability of HCLS systems. In Miguel-Angel Sicilia and Pablo Serrano-Balazote, editors, Interoperability in Healthcare Information Systems: Standards, Management, and Technology. IGI Global, February 2013. [ http ]
Semantic interoperability facilitates Health Care and Life Sciences (HCLS) systems in connecting stakeholders (e.g., patients, physicians, pharmacies) at various levels as well as ensures seamless use of healthcare resources (e.g., data, schema, applications). Their scope ranges from local (within, e.g., hospitals or hospital networks) to regional, national and cross-border. The use of semantics in delivering interoperable solutions for HCLS systems is weakened by the fact that an Ontology Based Information System (OBIS) has restrictions in modeling, aggregating, and interpreting global knowledge (e.g., terminologies for disease, drug, clinical event) in conjunction with local information (e.g., policy, profiles). This chapter presents an example scenario that shows such limitations and recognizes that enabling two key features, namely the type and scope of knowledge, within a knowledge base could enhance the overall effectiveness of an OBIS. We introduce the idea of separating knowledge bases by type (e.g., general or constraint knowledge) and scope (e.g., global or local) of applicability. Then, we propose two concrete solutions based on this general notion. Finally, we describe open research issues that may be of interest to knowledge system developers and the broader research community.
[PPSW13] Reinhard Pichler, Axel Polleres, Sebastian Skritek, and Stefan Woltran. Complexity of redundancy detection on RDF graphs in the presence of rules, constraints, and queries. Semantic Web -- Interoperability, Usability, Applicability (SWJ), 4(4), 2013. [ DOI | .pdf ]
Based on practical observations on rule-based inference on RDF data, we study the problem of redundancy elimination on RDF graphs in the presence of rules (in the form of Datalog rules) and constraints (in the form of so-called tuple-generating dependencies), and with respect to queries (ranging from conjunctive queries up to more complex ones, particularly covering features of SPARQL, such as union, negation, or filters). To this end, we investigate the influence of several problem parameters (like restrictions on the size of the rules, the constraints, and/or the queries) on the complexity of detecting redundancy. The main result of this paper is a fine-grained complexity analysis of both graph and rule minimisation in various settings.
[BAACP13] Carlos Buil-Aranda, Marcelo Arenas, Oscar Corcho, and Axel Polleres. Federating queries in SPARQL1.1: Syntax, semantics and evaluation. Journal of Web Semantics (JWS), 18(1), 2013. [ DOI | http ]
Given the sustained growth that we are experiencing in the number of SPARQL endpoints available, the need to be able to send federated SPARQL queries across these has also grown. To address this use case, the W3C SPARQL working group is defining a federation extension for SPARQL 1.1 which allows for combining graph patterns that can be evaluated over several endpoints within a single query. In this paper, we describe the syntax of that extension and formalize its semantics. Additionally, we describe how a query evaluation system can be implemented for that federation extension, describing some static optimization techniques and reusing a query engine used for data-intensive science, so as to deal with large amounts of intermediate and final results. Finally, we carry out a series of experiments that show that our optimizations speed up the federated query evaluation process.
[FMPG+13] Javier D. Fernández, Miguel A. Martinez-Prieto, Claudio Gutiérrez, Axel Polleres, and Mario Arias. Binary RDF Representation for Publication and Exchange (HDT). Journal of Web Semantics (JWS), 19(2), 2013. [ DOI | .pdf ]
The current Web of Data is producing increasingly large RDF data sets. Massive publication efforts of RDF data driven by initiatives like the Linked Open Data movement, and the need to exchange large data sets has unveiled the drawbacks of traditional RDF representations, inspired and designed by a document-centric and human-readable Web. Among the main problems are high levels of verbosity/redundancy and weak machine-processable capabilities in the description of these data sets. This scenario calls for efficient formats for publication and exchange. This article presents a binary RDF representation addressing these issues. Based on a set of metrics that characterizes the skewed structure of real-world RDF data, we develop a proposal of an RDF representation that modularly partitions and efficiently represents three components of RDF data sets: Header information, a Dictionary, and the actual Triples structure (thus called HDT). Our experimental evaluation shows that data sets in HDT format can be compacted by more than fifteen times as compared to current naive representations, improving both parsing and processing while keeping a consistent publication scheme. Specific compression techniques over HDT further improve these compression rates and prove to outperform existing compression solutions for efficient RDF exchange.
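The Dictionary/Triples split at the heart of HDT can be sketched roughly as follows (a simplification for illustration only; the actual HDT format uses compact bitmap and succinct-structure encodings, not Python dicts): terms are replaced by integer IDs so the triples component becomes a compact ID stream, and repeated terms are stored once.

```python
# Rough sketch of a dictionary + triples split in the spirit of HDT:
# each distinct term gets an integer ID; the triples component then
# only stores ID tuples, removing the verbosity of repeated terms.

def encode(triples):
    dictionary = {}                       # term -> integer ID
    def tid(term):
        if term not in dictionary:
            dictionary[term] = len(dictionary) + 1
        return dictionary[term]
    ids = [(tid(s), tid(p), tid(o)) for s, p, o in triples]
    return dictionary, ids

def decode(dictionary, ids):
    rev = {i: t for t, i in dictionary.items()}
    return [(rev[s], rev[p], rev[o]) for s, p, o in ids]

data = [(":alice", "foaf:knows", ":bob"),
        (":bob",   "foaf:knows", ":alice")]
d, ids = encode(data)
assert decode(d, ids) == data             # lossless round trip
print(ids)                                # [(1, 2, 3), (3, 2, 1)]
```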
[Pol13a] Axel Polleres. Agreement technologies and the semantic web. In Sascha Ossowski, editor, Agreement Technologies, volume 8 of Law, Governance and Technology Series, pages 57--68. Springer, January 2013. [ http ]
In this chapter we discuss the relationship between Agreement Technologies and the Semantic Web, especially focusing on how Semantic Web standards play a role in the Agreement Technologies stack, but also issues related to Linked Data and the Web of Data. We start the chapter with an account of Semantic Web standards. Then the scientific foundations of Semantic Web standards are discussed. Finally, we relate the work on semantic technologies to other fields of Agreement Technologies, from the point of view of Semantic Web standards.
[PW13] Axel Polleres and Johannes Wallner. On the relation between SPARQL1.1 and answer set programming. Journal of Applied Non-Classical Logics (JANCL), 23(1--2):159--212, 2013. Special issue on Equilibrium Logic and Answer Set Programming. [ DOI ]
In the context of the emerging Semantic Web and the quest for a common logical framework underpinning its architecture, the relation of rule-based languages such as Answer Set Programming (ASP) and ontology languages such as OWL has attracted a lot of attention in the literature over the past years. With its roots in Deductive Databases and Datalog though, ASP shares much more commonality with another Semantic Web standard, namely the query language SPARQL. In this paper, we take the forthcoming approval of the SPARQL1.1 standard by the World Wide Web consortium (W3C) as an opportunity to introduce this standard to the Logic Programming community by providing a translation of SPARQL1.1 into ASP. In this translation, we explain and highlight peculiarities of the new W3C standard. Along the way, we survey existing literature on foundations of SPARQL and SPARQL1.1, and also combinations of SPARQL with ontology and rules languages. Thereby, apart from providing means to implement and support SPARQL natively within Logic Programming engines and particularly ASP engines, we hope to pave the way for further research on a common logical framework for Semantic Web languages, including query languages, from an ASP point of view.

2012


[LKZ+12] Nuno Lopes, Sabrina Kirrane, Antoine Zimmermann, Axel Polleres, and Alessandra Mileo. A Logic Programming approach for Access Control over RDF. In Agostino Dovier and Vítor Santos Costa, editors, Technical Communications of the ICLP 2012, volume 17 of LIPIcs, pages 381--392, Budapest, Hungary, September 2012. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik. Short paper. [ .pdf ]
The Resource Description Framework (RDF) is an interoperable data representation format suitable for interchange and integration of data, especially in Open Data contexts. However, RDF is also becoming increasingly attractive in scenarios involving sensitive data, where data protection is a major concern. At its core, RDF does not support any form of access control and current proposals for extending RDF with access control do not fit well with the RDF representation model. Considering an enterprise scenario, we present a modelling that caters for access control over the stored RDF data in an intuitive and transparent manner. For this paper we rely on Annotated RDF, which introduces concepts from Annotated Logic Programming into RDF. Based on this model of the access control annotation domain, we propose a mechanism to manage permissions via application-specific logic rules. Furthermore, we illustrate how our Annotated Query Language (AnQL) provides a secure way to query this access control annotated RDF data.
[Pol12] Axel Polleres. How (well) do datalog, SPARQL and RIF interplay? In Pablo Barceló and Reinhard Pichler, editors, Datalog in Academia and Industry -- Second International Workshop, Datalog 2.0, volume 7494 of Lecture Notes in Computer Science (LNCS), pages 27--30. Springer, September 2012. Invited tutorial, slides available at http://www.polleres.net/presentations/20120913Datalog20_Tutorial.pdf. [ http ]
In this tutorial we will give an overview of the W3C standard query language for RDF -- SPARQL -- and its relation to Datalog as well as on the interplay with another W3C standard closely related to Datalog, the Rule Interchange Format (RIF). As we will learn -- while these three interplay nicely on the surface and in academic research papers -- some details within the W3C specs impose challenges on seamlessly integrating Datalog rules and SPARQL.
[UHPD12] Jürgen Umbrich, Aidan Hogan, Axel Polleres, and Stefan Decker. Improving the recall of live linked data querying through reasoning. In Markus Krötzsch and Umberto Straccia, editors, Web Reasoning and Rule Systems -- 6th International Conference, RR2012, volume 7497 of Lecture Notes in Computer Science (LNCS), pages 188--204, Vienna, Austria, September 2012. Springer. [ DOI | .pdf ]
Linked Data principles allow for processing SPARQL queries on-the-fly by dereferencing URIs. Link-traversal query approaches for Linked Data have the benefit of up-to-date results and decentralised execution, but operate only on explicit data from dereferenced documents, affecting recall. In this paper, we show how inferable knowledge -- specifically that found through owl:sameAs and RDFS reasoning -- can improve recall in this setting. We first analyse a corpus featuring 7 million Linked Data sources and 2.1 billion quadruples: we (1) measure expected recall by only considering dereferenceable information, (2) measure the improvement in recall given by considering rdfs:seeAlso links as previous proposals did. We further propose and measure the impact of additionally considering (3) owl:sameAs links, and (4) applying lightweight RDFS reasoning for finding more results, relying on static schema information. We evaluate different configurations for live queries covering different shapes and domains, generated from random walks over our corpus.
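The effect of owl:sameAs reasoning on recall can be sketched with a toy union-find example (illustrative only; the URIs and data below are made up, and the paper's link-traversal setting is far richer): a triple published under an alias of the queried URI becomes reachable once equivalent identifiers are merged.

```python
# Toy sketch: merging owl:sameAs-equivalent URIs with union-find so that
# a subject lookup also returns triples published under an alias.

def find(parent, x):
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]     # path halving
        x = parent[x]
    return x

def union(parent, a, b):
    parent[find(parent, a)] = find(parent, b)

same_as = [("dbpedia:Vienna", "wikidata:Q1741")]
triples = [("wikidata:Q1741", "ex:population", "2000000")]

parent = {}
for a, b in same_as:
    union(parent, a, b)

def match_subject(uri):
    canon = find(parent, uri)
    return [t for t in triples if find(parent, t[0]) == canon]

# Without sameAs reasoning, the dbpedia URI would match nothing:
print(match_subject("dbpedia:Vienna"))
# [('wikidata:Q1741', 'ex:population', '2000000')]
```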
[RPF+12] Anna Ryabokon, Axel Polleres, Gerhard Friedrich, Andreas Falkner, Alois Haselböck, and Herwig Schreiner. (Re)Configuration using Web Data: a case study on the reviewer assignment problem. In Markus Krötzsch and Umberto Straccia, editors, Web Reasoning and Rule Systems -- 6th International Conference, RR2012, volume 7497 of Lecture Notes in Computer Science (LNCS), pages 258--261, Vienna, Austria, September 2012. Springer. Short paper. [ DOI | .pdf ]
Constraint-based configuration is -- on the one hand -- one of the classical problem domains in AI and also in industrial practice. Additional problems arise when configuration objects come from an open environment such as the Web, or in the case of a reconfiguration. On the other hand, (re)configuration is a reasoning task very much ignored in the current (Semantic) Web reasoning literature, despite (i) the increased availability of structured data on the Web, particularly due to movements such as the Semantic Web and Linked Data, and (ii) the fact that numerous practically relevant tasks involving Web data require (re)configuration. To bridge these gaps, we discuss the challenges and possible approaches for reconfiguration in an open Web environment, based on a practical use case leveraging Linked Data as a “component catalog” for configuration. In this paper, we present techniques to enhance existing review management systems with (re)configuration facilities and provide a practical evaluation.
[HUH+12] Aidan Hogan, Jürgen Umbrich, Andreas Harth, Richard Cyganiak, Axel Polleres, and Stefan Decker. An empirical survey of linked data conformance. Journal of Web Semantics (JWS), 14:14--44, July 2012. [ .pdf ]
There has been a recent, tangible growth in RDF published on the Web in accordance with the Linked Data principles and best practices, the result of which has been dubbed the “Web of Data”. Linked Data guidelines are designed to facilitate ad hoc re-use and integration of conformant structured data across the Web by consumer applications; however, thus far, systems have yet to emerge that convincingly demonstrate the potential applications for consuming currently available Linked Data. Herein, we compile a list of fourteen concrete guidelines as given in the “How to Publish Linked Data on the Web” tutorial. Thereafter, we evaluate conformance of current RDF data providers with respect to these guidelines. Our evaluation is based on quantitative empirical analyses of a crawl of around 4 million RDF/XML documents constituting over 1 billion quadruples, where we also look at the stability of hosted documents for a corpus consisting of nine monthly snapshots from a sample of 151 thousand documents. Backed by our empirical survey, we provide insights into the current level of conformance with respect to various Linked Data guidelines, enumerating lists of the most (non-)conformant data providers. We show that certain guidelines are broadly adhered to (esp. use HTTP URIs, keep URIs stable), whilst others are commonly overlooked (esp. provide licensing and human-readable meta-data). We also compare PageRank scores for the data providers and their conformance to Linked Data guidelines, showing that both factors negatively correlate for guidelines restricting use of RDF features, while positively correlating for guidelines encouraging external linkage and vocabulary re-use. Finally, we present a summary of conformance for the different guidelines, and present the top-ranked data providers in terms of a combined PageRank and Linked Data conformance score.
[UKP+12] Jürgen Umbrich, Marcel Karnstedt, Josiane Xavier Parreira, Axel Polleres, and Manfred Hauswirth. Linked Data and Live Querying for Enabling Support Platforms for Web Dataspaces. In Proceedings of the 3rd International Workshop on Data Engineering Meets the Semantic Web (DESWEB), co-located with ICDE2012, Washington DC, USA, April 2012. [ .pdf ]
Enabling the “Web of Data” has recently gained increased attention, particularly driven by the success of Linked Data. The agreed need for technologies from the database domain is therein often referred to as the “Web as a Database”, a concept that is still more a vision than a reality. Meanwhile, the database community proposed the notion of dataspaces managed by support platforms, as an alternative view on the data management problem for small-scale, loosely connected environments of heterogeneous data sources. The Web of Data can actually be seen as a collection of inter-connected dataspaces. In this work, we propose a combination of Linked Data and database technologies to provide support platforms for these Web dataspaces. We argue that while separated, Linked Data still lacks database technology and the dataspace idea lacks openness and scale. We put particular focus on the challenge of how to index, search and query structured data on the Web in a way that is appropriate for its dynamic, heterogeneous, loosely connected, and open character. Based on an empirical study, we argue that none of the two extremes on its own -- centralised repositories vs. on-demand distributed querying -- can meet all requirements. We propose and discuss an alternative hybrid approach combining the best of both sides to find a better tradeoff between result freshness and fast query response times.
[KUHP12] Tobias Käfer, Jürgen Umbrich, Aidan Hogan, and Axel Polleres. Towards a dynamic linked data observatory. In WWW2012 Workshop on Linked Data on the Web (LDOW2012), Lyon, France, April 2012. [ .pdf ]
We describe work-in-progress on the design and methodology of the Dynamic Linked Data Observatory: a framework to monitor Linked Data over an extended period of time. The core goal of our work is to collect frequent, continuous snapshots of a subset of the Web of Data that is interesting for further study and experimentation, with an aim to capture raw data about the dynamics of Linked Data. The resulting corpora will be made openly and continuously available to the Linked Data research community. Herein, we (1) motivate the importance of such a corpus; (2) outline some of the use-cases and requirements for the resulting snapshots; (3) discuss different “views” of the Web of Data which affect how we define a sample to monitor; (4) detail how we select the scope of the monitoring experiment through sampling, (5) discuss the final design of the monitoring framework which will capture regular snapshots of (subsets of) the Web of Data over the coming months and years.
[GHKP12] Birte Glimm, Aidan Hogan, Markus Krötzsch, and Axel Polleres. OWL: Yet to arrive on the web of data? In WWW2012 Workshop on Linked Data on the Web (LDOW2012), Lyon, France, April 2012. [ .pdf ]
Seven years on from OWL becoming a W3C recommendation, and two years on from the more recent OWL 2 W3C recommendation, OWL has still experienced only patchy uptake on the Web. Although certain OWL features (like owl:sameAs) are very popular, other features of OWL are largely neglected by publishers in the Linked Data world. This may suggest that despite the promise of easy implementations and the proposal of tractable profiles suggested in OWL's second version, there is still no “right” standard fragment for the Linked Data community. In this paper, we (1) analyse uptake of OWL on the Web of Data, (2) gain insights into the OWL fragment that is actually used/usable on the Web, where we arrive at the conclusion that this fragment is likely to be a simplified profile based on OWL RL, (3) propose and discuss such a new fragment, which we call OWL LD (for Linked Data).
[ZLPS12] Antoine Zimmermann, Nuno Lopes, Axel Polleres, and Umberto Straccia. A general framework for representing, reasoning and querying with annotated semantic web data. Journal of Web Semantics (JWS), 12:72--95, March 2012. [ .pdf ]
We describe a generic framework for representing and reasoning with annotated Semantic Web data, a task becoming more important with the recent increased amount of inconsistent and non-reliable meta-data on the web. We formalise the annotated language, the corresponding deductive system and address the query answering problem. Previous contributions on specific RDF annotation domains are encompassed by our unified reasoning formalism as we show by instantiating it on (i) temporal, (ii) fuzzy, and (iii) provenance annotations. Moreover, we provide a generic method for combining multiple annotation domains allowing to represent, e.g., temporally-annotated fuzzy RDF. Furthermore, we address the development of a query language -- AnQL -- that is inspired by SPARQL, including several features of SPARQL 1.1 (subqueries, aggregates, assignment, solution modifiers) along with the formal definitions of their semantics.
[HZU+12] Aidan Hogan, Antoine Zimmermann, Jürgen Umbrich, Axel Polleres, and Stefan Decker. Scalable and distributed methods for entity matching, consolidation and disambiguation over linked data corpora. Journal of Web Semantics (JWS), 10:76--110, January 2012. [ DOI ]
With respect to large-scale, static, Linked Data corpora, in this paper we discuss scalable and distributed methods for: (i) entity consolidation---identifying entities that signify the same referent, aka. smushing, entity resolution, object consolidation, etc.---using explicit owl:sameAs relations; (ii) extended entity consolidation based on a subset of OWL 2 RL/RDF rules---particularly over inverse-functional properties, functional-properties and (max-)cardinality restrictions with value one; (iii) deriving weighted concurrence measures between entities in the corpus based on shared inlinks/outlinks and attribute values using statistical analyses; (iv) disambiguating (initially) consolidated entities based on inconsistency detection using OWL 2 RL/RDF rules. Our methods are based upon distributed sorts and scans of the corpus, where we purposefully avoid the requirement for indexing all data. Throughout, we offer evaluation over a diverse Linked Data corpus consisting of 1.118 billion quadruples derived from a domain-agnostic, open crawl of 3.985 million RDF/XML Web documents, demonstrating the feasibility of our methods at that scale, and giving insights into the quality of the results for real-world data.
[BDK+12] Stefan Bischof, Stefan Decker, Thomas Krennwallner, Nuno Lopes, and Axel Polleres. Mapping between RDF and XML with XSPARQL. Journal on Data Semantics (JoDS), 1(3):147--185, 2012. [ http ]
One promise of Semantic Web applications is to seamlessly deal with heterogeneous data. The Extensible Markup Language (XML) has become widely adopted as an almost ubiquitous interchange format for data, along with transformation languages like XSLT and XQuery to translate data from one XML format into another. However, the more recent Resource Description Framework (RDF) has become another popular standard for data representation and exchange, supported by its own query language SPARQL, that enables extraction and transformation of RDF data. Being able to work with XML and RDF using a common framework eliminates several unnecessary steps that are currently required when handling both formats side by side. In this paper we present the XSPARQL language that, by combining XQuery and SPARQL, allows to query XML and RDF data using the same framework and transform data from one format into the other. We focus on the semantics of this combined language and present an implementation, including discussion of query optimisations along with benchmark evaluation.

2011


[HNP+11] Andreas Harth, Barry Norton, Axel Polleres, Brahmananda Sapkota, Sebastian Speiser, Steffen Stadtmüller, and Osma Suominen. Towards uniform access to web data and services. In W3C Workshop on Data and Services Integration, Bedford, MA, USA, October 2011. [ .pdf ]
A sizable amount of data on the Web is currently available via Web APIs that expose data in formats such as JSON or XML. Combining data from different APIs and data sources requires glue code which is typically not shared and hence not reused. We derive requirements for a mechanism that brings data and functionality currently available via ad-hoc APIs into a coherent framework. Such standardised access to content and functionality would reduce the effort for data integration and the combination of service functionality, leading to reduced effort in composing data and services from multiple providers.
[LBP11] Nuno Lopes, Stefan Bischof, and Axel Polleres. On the semantics of heterogeneous querying of relational, XML, and RDF data with XSPARQL. In Proceedings of the 15th Portuguese Conference on Artificial Intelligence (EPIA2011) -- Computational Logic with Applications Track, Lisbon, Portugal, October 2011. [ .pdf ]
XSPARQL is a transformation and query language that caters for heterogeneous sources: in its present status it is possible to transform data between XML and RDF formats due to the integration of the XQuery and SPARQL query languages. In this paper we propose an extension of the XSPARQL language to incorporate data contained in relational databases by integrating a subset of SQL in the syntax of XSPARQL. Exposing data contained in relational databases as RDF is a necessary step towards the realisation of the Semantic Web and Web of Data. We present the syntax of an extension of the XSPARQL language catering for the inclusion of the SQL query language along with the semantics based on the XQuery formal semantics and sketch how this extended XSPARQL language can be used to expose RDB2RDF mappings, as currently being discussed in the W3C RDB2RDF Working Group.
[MAHP11] Alejandro Mallea, Marcelo Arenas, Aidan Hogan, and Axel Polleres. On Blank Nodes. In Proceedings of the 10th International Semantic Web Conference (ISWC 2011), volume 7031 of Lecture Notes in Computer Science (LNCS), Bonn, Germany, October 2011. Springer. Nominated for best paper award. [ .pdf ]
Blank nodes are defined in RDF as “existential variables”, in the same way the term has been used in mathematical logic. However, evidence suggests that actual usage of RDF does not follow this definition. In this paper we thoroughly cover the issue of blank nodes, from incomplete information in database theory, through the different treatments of blank nodes across the W3C stack of RDF-related standards, to an empirical analysis of RDF data publicly available on the Web. We then summarize alternative approaches to the problem, weighing up advantages and disadvantages, and also discussing proposals for Skolemization.
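The Skolemization idea discussed in the abstract can be pictured with a small sketch. This is only an illustration: the `.well-known/genid/` base IRI follows the convention later standardised in RDF 1.1, the example base URL is hypothetical, and deriving the Skolem IRI from the blank node label is a simplification (real Skolem IRIs must be globally unique across documents).

```python
# Illustrative sketch of Skolemization: replace each blank node with a
# fresh IRI so the graph no longer contains existential variables.
# Deriving the IRI from the blank node label is a toy simplification;
# real Skolem IRIs must be globally unique.
def skolemize(triples, base="https://example.org/.well-known/genid/"):
    def sk(term):
        return base + term[2:] if term.startswith("_:") else term
    return [(sk(s), sk(p), sk(o)) for s, p, o in triples]

graph = [("_:b1", "foaf:knows", "_:b2"),
         ("_:b1", "foaf:name", '"Alice"')]
skolemized = skolemize(graph)
# Both occurrences of _:b1 map to the same Skolem IRI.
assert skolemized[0][0] == skolemized[1][0]
```

After Skolemization the graph can be merged with other graphs without the blank-node renaming issues the paper analyses, at the price of no longer being semantically equivalent to the original (only satisfiability-preserving).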
[DMZ+11] Anca Dumitrache, Alessandra Mileo, Antoine Zimmermann, Axel Polleres, Philipp Obermeier, and Owen Friel. Enabling privacy-preserving semantic presence in instant messaging systems. In 7th International and Interdisciplinary Conference on Modeling and Using Context 2011 (CONTEXT'11), Karlsruhe, Germany, September 2011. [ .pdf ]
In pervasive environments, presence-based application development via Presence Management Systems (PMSs) is a key factor in optimising the management of communication channels, driving productivity increases. Solutions for presence management should satisfy interoperability requirements while providing context-centric presence analysis and privacy management. In order to push PMSs towards flexible, open and context-aware presence management, we propose adaptations of two extensions to the standard XML-based XMPP protocol for message exchange in online communication systems. The contribution allows for more complex specification and management of nested group and privacy lists, where semantic technologies are used to map all messages into RDF vocabularies and pave the way for a broader semantic integration of heterogeneous and distributed presence information sources into the standard PMS framework.
[SFZ+11] Ratnesh Sahay, Ronan Fox, Antoine Zimmermann, Axel Polleres, and Manfred Hauswirth. A methodological approach for ontologising and aligning health level seven (HL7) applications. In MISI (Massive Information Sharing and Integration) Conference - Special Track on Electronic Healthcare (SALUS 2011), Vienna, Austria, August 2011. [ .pdf ]
Healthcare applications are complex in the way data and schemas are organised in their internal systems. Widely deployed healthcare standards like Health Level Seven (HL7) V2 are designed using flexible schemas which allow several choices when constructing clinical messages. The recently emerged HL7 V3 has a centrally consistent information model that controls terminologies and concepts shared by V3 applications. V3 information models are arranged in several layers (abstract to concrete layers). V2 and V3 systems raise interoperability challenges: firstly, how to exchange clinical messages between V2 and V3 applications, and secondly, how to integrate globally defined clinical concepts with locally constructed concepts. The use of ontologies for interoperable healthcare applications has been advocated by domain and knowledge representation specialists. This paper addresses two main areas of an ontology-based integration framework: (1) an ontology building methodology for the HL7 standard where ontologies are developed in separate global and local layers; and (2) aligning V2 and V3 ontologies. We propose solutions that: (1) provide a semi-automatic mechanism to build HL7 ontologies; (2) provide a semi-automatic mechanism to align HL7 ontologies and transform the underlying clinical messages. The proposed methodology has produced HL7 ontologies averaging 300 concepts for each version. These ontologies and their alignments are deployed and evaluated under a semantically-enabled healthcare integration framework.
[BLP11] Stefan Bischof, Nuno Lopes, and Axel Polleres. Improve efficiency of mapping data between XML and RDF with XSPARQL. In Web Reasoning and Rule Systems -- Fifth International Conference, RR2011, volume 6902 of Lecture Notes in Computer Science (LNCS), pages 232--237, Galway, Ireland, August 2011. Springer. Short paper. [ .pdf ]
XSPARQL is a language to transform data between the tree-based XML format and the graph-based RDF format. XML is a widely adopted data exchange format which comes with its own query language, XQuery. RDF is the standard data format of the Semantic Web, with SPARQL being the corresponding query language. XSPARQL combines XQuery and SPARQL into a unified query language which provides a more intuitive and maintainable way to translate data between the two data formats. A naive implementation of XSPARQL can be inefficient when evaluating nested queries. However, such queries occur often in practice when dealing with XML data. We present and compare several approaches to optimise nested queries. By implementing these optimisations we improve efficiency by up to two orders of magnitude in a practical evaluation.
[DTP11] Renaud Delbru, Giovanni Tummarello, and Axel Polleres. Context-dependent OWL reasoning in sindice - experiences and lessons learnt. In Web Reasoning and Rule Systems -- Fifth International Conference, RR2011, volume 6902 of Lecture Notes in Computer Science (LNCS), pages 46--60, Galway, Ireland, August 2011. Springer. [ .pdf ]
The Sindice Semantic Web index today provides search capabilities over more than 220 million documents. Reasoning over web data allows implicit knowledge to be made explicit: it adds value to the information and enables Sindice to ultimately be more competitive in terms of precision and recall. However, due to the scale and heterogeneity of web data, a reasoning engine for the Sindice system must (1) scale out through parallelisation over a cluster of machines; and (2) cope with unexpected data usage. In this paper, we report our experiences and lessons learnt in building a large scale reasoning engine for Sindice. The reasoning approach has been deployed, used and improved since 2008 within Sindice and has enabled Sindice to reason over billions of triples. First, we introduce our notion of context-dependent reasoning for RDF entities published on the Web according to the linked data principles. We then illustrate an efficient methodology to perform context-dependent RDFS and partial OWL inference based on a persistent TBox composed of a network of web ontologies. Finally we report performance evaluation results of our implementation underlying the Sindice web data index.
[HPPR11] Aidan Hogan, Jeff Z. Pan, Axel Polleres, and Yuan Ren. Scalable OWL 2 reasoning for linked data. In Axel Polleres, Claudia D'Amato, Marcelo Arenas, Siegfried Handschuh, Paula Kroner, Sascha Ossowski, and Peter Patel-Schneider, editors, Reasoning Web. Semantic Technologies for the Web of Data. (Reasoning Web 2011), volume 6848 of Lecture Notes in Computer Science (LNCS), pages 250--325. Springer, Galway, Ireland, August 2011. [ .pdf ]
The goal of the Scalable OWL 2 Reasoning for Linked Data lecture is twofold: first, to introduce scalable reasoning and querying techniques to Semantic Web researchers as powerful tools to make use of Linked Data and large-scale ontologies, and second, to present interesting research problems for the Semantic Web that arise in dealing with TBox and ABox reasoning in OWL 2. The lecture consists of three parts. The first part will begin with an introduction and motivation for reasoning over Linked Data, including a survey of the use of RDFS and OWL on the Web. The second part will present a scalable, distributed reasoning service for instance data, applying a custom subset of OWL 2 RL/RDF rules (based on a tractable fragment of OWL 2). The third part will present recent work on faithful approximate reasoning for OWL 2 DL. The lecture will include our implementation of the mentioned techniques as well as their evaluations. These notes provide complementary reference material for the lecture, and follow the three-part structure and content of the lecture.
[PDA+11] Axel Polleres, Claudia D'Amato, Marcelo Arenas, Siegfried Handschuh, Paula Kroner, Sascha Ossowski, and Peter Patel-Schneider, editors. Reasoning Web. Semantic Technologies for the Web of Data. (Reasoning Web 2011), volume 6848 of Lecture Notes in Computer Science (LNCS). Springer, Galway, Ireland, August 2011. [ http ]
The Reasoning Web Summer School has become a well-established event in the area of applications of reasoning techniques on the Web both targeting scientific discourse of established researchers and attracting young researchers to this emerging field. After the previous successful editions in Malta (2005), Lisbon (2006), Dresden (2007 and 2010), Venice (2008), and Bressanone-Brixen (2009), this year's edition moved to the west of Ireland, hosted by the Digital Enterprise Research Institute (DERI) at the National University of Ireland, Galway. By co-locating this year's summer school with the 5th International Conference on Web Reasoning and Rule Systems (RR2011) we hope to have further promoted interaction between researchers, practitioners and students. The 2011 school programme focused around the central topic of applications of Reasoning for the emerging “Web of Data”, with twelve exciting lectures.
[AHP11] Aidan Hogan, Andreas Harth, and Axel Polleres. Scalable authoritative OWL reasoning for the Web. In Amit Sheth, editor, Semantic Services, Interoperability and Web Applications: Emerging Concepts, pages 131--177. IGI Global, June 2011. Invited re-publication. [ http ]
In this chapter, the authors discuss the challenges of performing reasoning on large scale RDF datasets from the Web. Using ter-Horst's pD* fragment of OWL as a base, the authors compose a rule-based framework for application to Web data: they argue their decisions using observations of undesirable examples taken directly from the Web. The authors further temper their OWL fragment through consideration of “authoritative sources” which counteracts an observed behaviour which they term “ontology hijacking”: new ontologies published on the Web re-defining the semantics of existing entities resident in other ontologies. They then present their system for performing rule-based forward-chaining reasoning which they call SAOR: Scalable Authoritative OWL Reasoner. Based upon observed characteristics of Web data and reasoning in general, they design their system to scale: the system is based upon a separation of terminological data from assertional data and comprises a lightweight in-memory index, on-disk sorts and file-scans. The authors evaluate their methods on a dataset in the order of a hundred million statements collected from real-world Web sources and present scale-up experiments on a dataset in the order of a billion statements collected from the Web. In this republished version, the authors also present extended discussion reflecting upon recent developments in the area of scalable RDFS/OWL reasoning, some of which has drawn inspiration from the original publication (Hogan, et al., 2009).
[FMPGP11] Javier D. Fernández, Miguel A. Martínez-Prieto, Claudio Gutierrez, and Axel Polleres. Binary RDF Representation for Publication and Exchange (HDT), March 2011. W3C member submission. [ http ]
RDF HDT (Header-Dictionary-Triples) is a binary format for publishing and exchanging RDF data at large scale. RDF HDT represents RDF in a compact manner, natively supporting splitting huge RDF graphs into several chunks. It is designed to allow high compression rates. This is achieved by organizing and representing the RDF graph in terms of two main components: Dictionary and Triples structure. The Dictionary organizes all vocabulary present in the RDF graph in a manner that permits rapid search and high levels of compression. The Triples component comprises the pure structure of the underlying graph in a compressed form. An additional and RECOMMENDED Header component includes extensible metadata describing the RDF data set and its organization. Further, the document specifies how to efficiently translate between HDT and other RDF representation formats, such as Notation 3.
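The Dictionary/Triples split at the heart of HDT can be pictured with a toy sketch. This only illustrates the idea of ID-encoded triples over a shared term dictionary; it is not the actual HDT binary layout, which additionally compresses both components and sorts IDs for fast search.

```python
# Toy illustration of HDT's Dictionary + Triples idea (NOT the real
# binary format): terms get compact integer IDs, and the graph
# structure is stored purely as integer triples.
def build_dictionary(triples):
    """Assign a compact integer ID to every distinct term."""
    terms = sorted({t for triple in triples for t in triple})
    return {term: i for i, term in enumerate(terms)}

def encode(triples, dictionary):
    """The 'Triples' component: each term replaced by its ID."""
    return [(dictionary[s], dictionary[p], dictionary[o]) for s, p, o in triples]

def decode(id_triples, dictionary):
    """Recover the original graph from IDs plus the dictionary."""
    inverse = {i: term for term, i in dictionary.items()}
    return [(inverse[s], inverse[p], inverse[o]) for s, p, o in id_triples]

graph = [("ex:alice", "foaf:knows", "ex:bob"),
         ("ex:alice", "foaf:name", '"Alice"')]
d = build_dictionary(graph)
assert decode(encode(graph, d), d) == graph
```

Because repeated terms are stored once in the dictionary and the structure is pure integers, both components compress well, which is what enables the high compression rates the submission describes.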
[Pol11] Axel Polleres. Semantic Web Technologies: From Theory to Practice. Habilitation thesis, Vienna University of Technology, Vienna, Austria, March 2011. Cumulative habilitation thesis (“Kumulative Habilitationsschrift”) for obtaining the teaching qualification in “Information Systems”. [ .pdf ]
The Semantic Web is about to grow up. Over the last few years technologies and standards to build up the architecture of this next generation of the Web have matured and are being deployed at large scale in many live Web sites. The underlying technology stack of the Semantic Web consists of several standards endorsed by the World Wide Web Consortium (W3C) that provide the formal underpinnings of a machine-readable “Web of Data”: (i) the eXtensible Markup Language (XML) as a uniform exchange syntax; (ii) the Resource Description Framework (RDF) as a uniform data exchange format; (iii) RDF Schema and the Web Ontology Language (OWL) for describing ontologies; (iv) the Rule Interchange Format (RIF) to exchange rules; (v) XQuery and SPARQL as query and transformation languages. The present habilitation thesis comprises a collection of articles reflecting the author's contribution in addressing a number of relevant research problems to close gaps in the Semantic Web architecture regarding the theoretical and practical interplay of these standards.
[dBEPT11] Jos de Bruijn, Thomas Eiter, Axel Polleres, and Hans Tompits. Embedding non-ground logic programs into autoepistemic logic for knowledge base combination. ACM Transactions on Computational Logic, 12(3), 2011. [ .pdf ]
In the context of the Semantic Web, several approaches to the combination of ontologies, given in terms of theories of classical first-order logic and rule bases, have been proposed. They either cast rules into classical logic or limit the interaction between rules and ontologies. Autoepistemic logic (AEL) is an attractive formalism which allows these limitations to be overcome by serving as a uniform host language in which to embed ontologies and nonmonotonic logic programs. For the latter, so far only the propositional setting has been considered. In this paper, we present three embeddings of normal and three embeddings of disjunctive non-ground logic programs under the stable model semantics into first-order AEL. While the embeddings all correspond with respect to objective ground atoms, differences arise when considering non-atomic formulas and combinations with first-order theories. We compare the embeddings with respect to stable expansions and autoepistemic consequences, considering the embeddings by themselves, as well as combinations with classical theories. Our results reveal differences and correspondences of the embeddings and provide useful guidance in the choice of a particular embedding for knowledge combination.
[UHK+11] Jürgen Umbrich, Katja Hose, Marcel Karnstedt, Andreas Harth, and Axel Polleres. Comparing data summaries for processing live queries over linked data. World Wide Web Journal, 14(5--6):495--544, 2011. [ http ]
A growing amount of Linked Data -- graph-structured data accessible at sources distributed across the Web -- enables advanced data integration and decision-making applications. Typical systems operating on Linked Data collect (crawl) and pre-process (index) large amounts of data, and evaluate queries against a centralised repository. Given that crawling and indexing are time-consuming operations, the data in the centralised index may be out of date at query execution time. An ideal query answering system for querying Linked Data live should return current answers in a reasonable amount of time, even on corpora as large as the Web. In such a live query system source selection -- determining which sources contribute answers to a query -- is a crucial step. In this article we propose to use lightweight data summaries for determining relevant sources during query evaluation. We compare several data structures and hash functions with respect to their suitability for building such summaries, stressing benefits for queries that contain joins and require ranking of results and sources. We elaborate on join variants, join ordering and ranking. We analyse the different approaches theoretically and provide results of an extensive experimental evaluation.
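The idea of lightweight per-source summaries for source selection can be sketched as follows. The concrete data structure (one hash set per triple position) and the hash function are illustrative assumptions; the article compares several structures and hash functions, and any such summary admits false positives by design.

```python
# Toy sketch of hash-based data summaries for live source selection:
# a source is contacted only if every bound term of a triple pattern
# hashes into that source's summary. False positives are possible.
import hashlib

def h(term, bits=16):
    """Hash a term into a small ID space (the summary's resolution)."""
    return int(hashlib.sha1(term.encode()).hexdigest(), 16) % (1 << bits)

def summarise(triples):
    """Per-position sets of hashed terms seen at one source."""
    summary = {"s": set(), "p": set(), "o": set()}
    for s, p, o in triples:
        summary["s"].add(h(s))
        summary["p"].add(h(p))
        summary["o"].add(h(o))
    return summary

def may_answer(summary, pattern):
    """Skip a source if some bound pattern term misses its summary.
    Pattern positions are (s, p, o); None marks a variable."""
    return all(term is None or h(term) in summary[pos]
               for pos, term in zip("spo", pattern))

src = summarise([("ex:alice", "foaf:knows", "ex:bob")])
assert may_answer(src, (None, "foaf:knows", None))  # source is relevant
```

Shrinking `bits` makes the summary smaller but raises the false-positive rate; that size/precision trade-off is exactly what the compared summary structures navigate.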
[HBPS11] Aidan Hogan, Piero Bonatti, Axel Polleres, and Luigi Sauro. Robust and scalable linked data reasoning incorporating provenance and trust annotations. Journal of Web Semantics (JWS), 9(2):165--201, 2011. [ .pdf ]
In this paper, we leverage annotated logic programs for tracking indicators of provenance and trust during reasoning, specifically focussing on the use-case of applying a scalable subset of OWL 2 RL/RDF rules over static corpora of arbitrary Linked Data (Web data). Our annotations encode three facets of information: (i) blacklist: a (possibly manually generated) boolean annotation which indicates that the referent data are known to be harmful and should be ignored during reasoning; (ii) ranking: a numeric value derived by a PageRank-inspired technique---adapted for Linked Data---which determines the centrality of certain data artefacts (such as RDF documents and statements); (iii) authority: a boolean value which uses Linked Data principles to conservatively determine whether or not some terminological information can be trusted. We formalise a logical framework which annotates inferences with the strength of derivation along these dimensions of trust and provenance; we formally demonstrate some desirable properties of the deployment of annotated logic programming in our setting, which guarantees (i) a unique minimal model (least fixpoint); (ii) monotonicity; (iii) finitariness; and (iv) decidability. In so doing, we also give some formal results which reveal strategies for scalable and efficient implementation of various reasoning tasks one might consider. Thereafter, we discuss scalable and distributed implementation strategies for applying our ranking and reasoning methods over a cluster of commodity hardware; throughout, we provide evaluation of our methods over 1 billion Linked Data quadruples crawled from approximately 4 million individual Web documents, empirically demonstrating the scalability of our approach, and how our annotation values help ensure a more robust form of reasoning.
We finally sketch, discuss and evaluate a use-case for a simple repair of inconsistencies detectable within OWL 2 RL/RDF constraint rules using ranking annotations to detect and defeat the “marginal view”, and in so doing, infer an empirical “consistency threshold” for the Web of Data in our setting.
[HHU+11] Aidan Hogan, Andreas Harth, Jürgen Umbrich, Sheila Kinsella, Axel Polleres, and Stefan Decker. Searching and browsing linked data with SWSE: The semantic web search engine. Journal of Web Semantics (JWS), 9(4):365--401, 2011. [ http ]
In this paper, we discuss the architecture and implementation of the Semantic Web Search Engine (SWSE). Following traditional search engine architecture, SWSE consists of crawling, data enhancing, indexing and a user interface for search, browsing and retrieval of information; unlike traditional search engines, SWSE operates over RDF Web data -- loosely also known as Linked Data -- which implies unique challenges for the system design, architecture, algorithms, implementation and user interface. In particular, many challenges exist in adopting Semantic Web technologies for Web data: the unique challenges of the Web -- in terms of scale, unreliability, inconsistency and noise -- are largely overlooked by the current Semantic Web standards. Herein, we describe the current SWSE system, initially detailing the architecture and later elaborating upon the function, design, implementation and performance of each individual component. In so doing, we also give an insight into how current Semantic Web standards can be tailored, in a best-effort manner, for use on Web data. Throughout, we offer evaluation and complementary argumentation to support our design choices, and also offer discussion on future directions and open research questions. Later, we also provide candid discussion relating to the difficulties currently faced in bringing such a search engine into the mainstream, and lessons learnt from roughly six years working on the Semantic Web Search Engine project.

2010


[HPPD10] Aidan Hogan, Jeff Z. Pan, Axel Polleres, and Stefan Decker. SAOR: Template rule optimisations for distributed reasoning over 1 billion linked data triples. In Proceedings of the 9th International Semantic Web Conference (ISWC 2010), volume 6496 of Lecture Notes in Computer Science (LNCS), Shanghai, China, November 2010. Springer. [ .pdf ]
In this paper, we discuss generic optimisations of rule-based materialisation approaches for reasoning over large static RDF datasets. We generalise and re-formalise what we call the "partial-indexing" approach to scalable rule-based materialisation: the approach is based on a separation of terminological data, which has been shown in previous and related works to enable highly scalable and distributable reasoning for specific rulesets; in so doing, we provide some completeness propositions with respect to semi-naive evaluation. We then show how related work on template rules -- T-Box-specific dynamic rulesets created by binding the terminological patterns in the static ruleset -- can be incorporated in the partial-indexing approach, and optimisations that are possible thereafter. We demonstrate our methods using LUBM(10) for RDFS, pD* (OWL Horst) and OWL 2 RL, and thereafter demonstrate pragmatic distributed reasoning over 1.12b Linked Data triples for a subset of OWL 2 RL we argue to be suitable for the Web use-case.
[LPSZ10] Nuno Lopes, Axel Polleres, Umberto Straccia, and Antoine Zimmermann. AnQL: SPARQLing up annotated RDFS. In Proceedings of the 9th International Semantic Web Conference (ISWC 2010), volume 6496 of Lecture Notes in Computer Science (LNCS), Shanghai, China, November 2010. Springer. [ .pdf ]
Starting from the general framework for Annotated RDFS which we presented in previous work (extending Udrea et al.'s Annotated RDF), we address the development of a query language -- AnQL -- that is inspired by SPARQL, including several features of SPARQL 1.1. As a side effect we propose formal definitions of the semantics of these features (subqueries, aggregates, assignment, solution modifiers) which could serve as a basis for the ongoing work in SPARQL 1.1. We demonstrate the value of such a framework by comparing our approach to previously proposed extensions of SPARQL and show that AnQL generalises and extends them.
[PC10] Axel Polleres and Huajun Chen, editors. Proceedings of the ISWC 2010 Posters & Demonstrations Track: Collected Abstracts, volume 658 of CEUR Workshop Proceedings, Shanghai, China, November 2010. CEUR-WS.org. [ http ]
The posters and demonstrations track of ISWC 2010 continues the established tradition of providing an interaction and connection opportunity for researchers and practitioners to present and demonstrate their new and innovative work-in-progress. The track gives conference attendees a way to learn about novel on-going research projects that might not yet be complete, but whose preliminary results are already interesting. The track also provides presenters with an excellent opportunity to obtain feedback from their peers in an informal setting from knowledgeable sources. New this year, we also encouraged authors of accepted full research or in-use papers to present a practical demonstration or poster with additional results.
[HEF+10] Manfred Hauswirth, Jérôme Euzenat, Owen Friel, Keith Griffin, Pat Hession, Brendan Jennings, Tudor Groza, Siegfried Handschuh, Ivana Podnar Zarko, Axel Polleres, and Antoine Zimmermann. Towards consolidated presence. In The 6th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2010), Chicago, Illinois, USA, October 2010. IEEE Computer Society. Invited paper. [ .pdf ]
Presence management, i.e., the ability to automatically identify the status and availability of communication partners, is becoming an invaluable tool for collaboration in enterprise contexts. In this paper, we argue for efficient presence management by means of a holistic view of both physical context and virtual presence in online communication channels. We sketch the components for enabling presence as a service integrating both online information as well as physical sensors, discussing benefits, possible applications on top, and challenges of establishing such a service.
[OMP10] Philipp Obermeier, Marco Marano, and Axel Polleres. Processing RIF and OWL2RL within DLVHEX. In Pascal Hitzler and Thomas Lukasiewicz, editors, Web Reasoning and Rule Systems -- Fourth International Conference, RR 2010, volume 6333 of Lecture Notes in Computer Science (LNCS), pages 244--250, Bressanone, Italy, September 2010. Springer. Demo Paper. [ .pdf ]
We present an extension of the DLVHEX system to support RIF-Core, a dialect of W3C's Rule Interchange Format (RIF), as well as combinations of RIF-Core and OWL2RL ontologies. DLVHEX is a plugin system on top of DLV, a disjunctive Datalog engine, which enables higher-order and external atoms as well as input rewriting capabilities; these are provided as plugins and enable DLVHEX to bidirectionally exchange data with external knowledge bases and to consume input in different Semantic Web languages. In fact, there already exist plugins for languages such as RDF and SPARQL. Our new plugin facilitates consumption and processing of RIF rulesets, as well as OWL2RL reasoning by a 2-step-reduction to DLVHEX via embedding in RIF-Core. The current version implements the translation from OWL2RL to RIF by a static rule set and supports the RIF built-ins mandatory for this reduction through external atoms in DLVHEX. For the future we plan to switch to a dynamic approach for RIF embedding of OWL2RL and to extend the RIF reasoning capabilities to more features of RIF-BLD. We provide a description of our current system and its development status as well as an illustrative example, and conclude with future plans to complete the Semantic Web library of plugins for DLVHEX.
[Pol10b] Axel Polleres. SPARQL1.1: new features and friends (OWL2, RIF). In Pascal Hitzler and Thomas Lukasiewicz, editors, Web Reasoning and Rule Systems -- Fourth International Conference, RR2010, volume 6333 of Lecture Notes in Computer Science (LNCS), pages 23--26, Bressanone, Italy, September 2010. Springer. Slides available at http://www.polleres.net/RR2010_SPARQL11_Tutorial/. [ DOI | .pdf ]
In this tutorial we will give an overview of new features in SPARQL 1.1, which the W3C is currently working on, as well as of the interplay with its “neighbour standards”, OWL2 and RIF. We will also give a rough overview of existing implementations to play around with.
[PPSW10b] Reinhard Pichler, Axel Polleres, Sebastian Skritek, and Stefan Woltran. Redundancy elimination on RDF graphs in the presence of rules, constraints, and queries. In Pascal Hitzler and Thomas Lukasiewicz, editors, Web Reasoning and Rule Systems -- Fourth International Conference, RR2010, volume 6333 of Lecture Notes in Computer Science (LNCS), pages 133--148, Bressanone, Italy, September 2010. Springer. Best paper award, technical report version available at http://polleres.net/publications/DERI-TR-2010-04-23.pdf. [ .pdf ]
Based on practical observations on rule-based inference on RDF data, we study the problem of redundancy elimination on RDF graphs in the presence of rules (in the form of Datalog rules) and constraints (in the form of so-called tuple-generating dependencies), and with respect to queries (ranging from conjunctive queries up to more complex ones, particularly covering features of SPARQL, such as union, negation, or filters). To this end, we investigate the influence of several problem parameters (like restrictions on the size of the rules, the constraints, and/or the queries) on the complexity of detecting redundancy. The main result of this paper is a fine-grained complexity analysis of both graph and rule minimisation in various settings.
[Pol10a] Axel Polleres. Semantic web technologies: From theory to standards. In 21st National Conference on Artificial Intelligence and Cognitive Science (AICS2010), Galway, Ireland, August 2010. Review paper (appeared in the informal conference proceedings). [ .pdf ]
This paper summarises the evolution of W3C standards in the area of Semantic Web technologies, as well as gaps within these standards still to be filled in terms of standardisation. Moreover, we give a subjective survey of the most influential scientific works which have contributed to the development of these standards and to closing the gaps between them. The Semantic Web is proving to be an interesting application field for Artificial Intelligence; we aim here both at giving an overview of our own work in the area and at providing an entry point for researchers interested in the foundations of Semantic Web standards and technologies.
[SLLP10] Umberto Straccia, Nuno Lopes, Gergely Lukácsy, and Axel Polleres. A general framework for representing and reasoning with annotated semantic web data. In Proceedings of the 24th AAAI Conference on Artificial Intelligence (AAAI 2010), Special Track on Artificial Intelligence and the Web, Atlanta, Georgia, USA, July 2010. [ .pdf ]
We describe a generic framework for representing and reasoning with annotated Semantic Web data, a task becoming more important with the recently increased amount of inconsistent and unreliable metadata on the Web. We formalise the annotated language, the corresponding deductive system and address the query answering problem. Our work extends previous contributions on RDF annotations by providing a unified reasoning formalism and allowing the seamless combination of different annotation domains. We show that current RDF stores can easily be extended to our framework. We demonstrate the feasibility of our method by instantiating it on (i) temporal RDF; (ii) fuzzy RDF; and (iii) their combination. A prototype shows that implementing and combining new domains is easy.
[LBE+10] Nuno Lopes, Stefan Bischof, Orri Erling, Axel Polleres, Alexandre Passant, Diego Berrueta, Antonio Campos, Jérôme Euzenat, Kingsley Idehen, Stefan Decker, Stéphane Corlosquet, Jacek Kopecký, Janne Saarela, Thomas Krennwallner, Davide Palmisano, and Michal Zaremba. RDF and XML: Towards a unified query layer. In W3C Workshop on RDF Next Steps, Stanford, Palo Alto, CA, USA, June 2010. [ http ]
One of the requirements of current Semantic Web applications is to deal with heterogeneous data. The Resource Description Framework (RDF) is the W3C recommended standard for data representation, yet data represented and stored using the Extensible Markup Language (XML) is almost ubiquitous and remains the standard for data exchange. While RDF has a standard XML representation, XML Query languages are of limited use for transformations between natively stored RDF data and XML. Being able to work with both XML and RDF data using a common framework would be a great advantage and eliminate unnecessary intermediate steps that are currently used when handling both formats.
[LZH+10] Nuno Lopes, Antoine Zimmermann, Aidan Hogan, Gergely Lukácsy, Axel Polleres, Umberto Straccia, and Stefan Decker. RDF needs annotations. In W3C Workshop on RDF Next Steps, Stanford, Palo Alto, CA, USA, June 2010. [ http ]
While the current mechanism of reification in RDF is without semantics and widely considered inappropriate and cumbersome, some form of reification -- speaking about triples themselves -- is needed in RDF for many reasonable applications: in particular, reification allows for enhancing triples with annotations relating to provenance, spatio-temporal validity, degrees of trust, fuzzy values and/or other contextual information. In this position paper, we argue that -- besides resolving the issue of how to syntactically represent reification in the future (i.e., whether to stick with the current reification mechanism or standardise a different mechanism such as Named Graphs) -- it is time to agree on certain core annotations that are widely needed. We summarise existing work and provide a possible direction towards handling reification by means of a general annotation framework that can be instantiated for those major use cases we currently see arising.
[PPSW10a] Reinhard Pichler, Axel Polleres, Sebastian Skritek, and Stefan Woltran. Minimising RDF graphs under rules and constraints revisited. In 4th Alberto Mendelzon Workshop on Foundations of Data Management, volume 494 of CEUR Workshop Proceedings. CEUR-WS.org, May 2010. [ .pdf ]
Based on practical observations on rule-based inference on RDF data, we study the problem of redundancy elimination in RDF in the presence of rules (in the form of Datalog rules) and constraints (in the form of so-called tuple-generating dependencies). To this end, we investigate the influence of several problem parameters (like restrictions on the size of the rules and/or the constraints) on the complexity of detecting redundancy. The main result of this paper is a fine-grained complexity analysis of both graph and rule minimisation in various settings.
[KOPP10] Philipp Kärger, Daniel Olmedilla, Alexandre Passant, and Axel Polleres, editors. Proceedings of the Second Workshop on Trust and Privacy on the Social and Semantic Web (SPOT2010), volume 576 of CEUR Workshop Proceedings, Heraklion, Greece, May 2010. CEUR-WS.org. [ http ]
Workshop Proceedings. This workshop was co-located with ESWC 2010.
[HPUZ10] Aidan Hogan, Axel Polleres, Jürgen Umbrich, and Antoine Zimmermann. Some entities are more equal than others: statistical methods to consolidate linked data. In Workshop on New Forms of Reasoning for the Semantic Web: Scalable & Dynamic (NeFoRS10), Heraklion, Greece, May 2010. [ .pdf ]
We propose a method for consolidating entities in RDF data on the Web. Our approach is based on a statistical analysis of the use of predicates and their associated values to identify “quasi”-key properties. Compared to a purely symbolic approach, we obtain promising results, retrieving more identical entities with a high precision. We also argue that our technique scales well -- possibly to the size of the current Web of Data -- as opposed to more expensive existing approaches.
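The core idea of scoring predicates by how discriminating their values are can be sketched in a few lines of stand-alone Python. This is a minimal illustration, not the paper's actual method: the predicate names, the sample triples, and the 0.75 threshold are invented for the example.

```python
from collections import defaultdict

def quasi_key_predicates(triples, threshold=0.75):
    # predicate -> list of (subject, object) uses
    pairs = defaultdict(list)
    for s, p, o in triples:
        pairs[p].append((s, o))
    # score = share of distinct values among all uses: close to 1.0 means
    # almost every statement carries a different value, i.e. key-like
    return {p for p, ps in pairs.items()
            if len({o for _, o in ps}) / len(ps) >= threshold}

def consolidate(triples, key_preds):
    # merge subjects that agree on a value of any quasi-key predicate
    owner, same_as = {}, {}
    for s, p, o in triples:
        if p in key_preds:
            canon = owner.setdefault((p, o), s)  # first subject seen wins
            if canon != s:
                same_as[s] = canon
    return same_as

triples = [
    ("ex:a", "foaf:mbox", "mailto:x@example.org"),
    ("ex:b", "foaf:mbox", "mailto:x@example.org"),  # shared mailbox
    ("ex:c", "foaf:mbox", "mailto:c@example.org"),
    ("ex:d", "foaf:mbox", "mailto:d@example.org"),
    ("ex:e", "foaf:mbox", "mailto:e@example.org"),
    ("ex:a", "foaf:name", "Alex"),
    ("ex:b", "foaf:name", "Alex"),
    ("ex:c", "foaf:name", "Alex"),  # names are too common to be keys
]

keys = quasi_key_predicates(triples)
merged = consolidate(triples, keys)
```

On this toy data, `foaf:mbox` scores 4/5 and is kept as a quasi-key, while `foaf:name` scores 1/3 and is rejected, so only the two subjects sharing a mailbox are merged.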
[PPH10] Jeff Z. Pan, Axel Polleres, and Aidan Hogan. Scalable OWL reasoning for linked data, May 2010. Slides available at http://www.abdn.ac.uk/~csc280/tutorial/eswc2010/. [ http ]
Tutorial at the 7th Extended Semantic Web Conference (ESWC2010)
[VRL+10] María-Esther Vidal, Edna Ruckhaus, Tomas Lampo, Amadís Martínez, Javier Sierra, and Axel Polleres. On the efficiency of joining group patterns in SPARQL queries. In Proceedings of the 7th European Semantic Web Conference (ESWC2010), Heraklion, Greece, May 2010. Springer. [ .pdf ]
In SPARQL queries, the combination of triple patterns is expressed by using shared variables across patterns. Based on this characterization, basic graph patterns in a SPARQL query can be partitioned into groups of acyclic pattern combinations that share exactly one variable, or star-shaped groups. We observe that the number of triples in a group is proportional to the number of individuals that play the role of the subject or the object; however, depending on the degree of participation of the subject individuals in the properties, a group could be not much larger than a class or type to which the subject or object belongs. Thus, it may be significantly more efficient to independently evaluate each of the groups, and then merge the resulting sets, than linearly joining all triples in a basic graph pattern. Based on these properties of star-shaped groups, we have developed query optimization and evaluation techniques. We have conducted an empirical analysis on the benefits of the optimization and evaluation techniques in several SPARQL query engines. We observe that our proposed techniques are able to speed up query evaluation time for join queries with star-shaped patterns by at least one order of magnitude.
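The partitioning step described in this abstract can be illustrated with a small stand-alone sketch, under the simplifying assumption that a star-shaped group is just the set of triple patterns sharing the same subject; the `?`-prefixed variable names and the example query are invented for illustration.

```python
from collections import defaultdict

def star_groups(patterns):
    """Partition triple patterns into star-shaped groups: all patterns
    with the same subject are evaluated together, and the per-group
    results are merged afterwards instead of joining triple by triple."""
    groups = defaultdict(list)
    for s, p, o in patterns:
        groups[s].append((s, p, o))
    return dict(groups)

# basic graph pattern of a query asking for people and their projects
bgp = [
    ("?person", "foaf:name", "?name"),
    ("?person", "foaf:mbox", "?mbox"),
    ("?person", "ex:worksOn", "?proj"),
    ("?proj", "ex:title", "?title"),
    ("?proj", "ex:budget", "?budget"),
]

stars = star_groups(bgp)
```

Here the five patterns fall into two stars, one around `?person` and one around `?proj`, which can then be joined on the single shared variable `?proj`.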
[HHK+10] Andreas Harth, Katja Hose, Marcel Karnstedt, Axel Polleres, Kai-Uwe Sattler, and Jürgen Umbrich. Data summaries for on-demand queries over linked data. In Proceedings of the 19th World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 2010. Technical report version available at http://polleres.net/publications/DERI-TR-2009-11-17.pdf. [ .pdf ]
Typical approaches for search and querying over structured Web Data collect (crawl) and pre-process (index) large amounts of data before allowing for query answering in a central data warehouse. This time-consuming pre-processing phase decreases the freshness of query results and only uses to a limited degree the benefits of Linked Data, where structured data is accessible live and up-to-date at distributed Web resources that may change constantly. An ideal query answering system for Linked Data should always return current answers in a reasonable amount of time, even on corpora as large as the Web. Query processors evaluating queries directly on the live sources require knowledge of the contents of data sources. In the current paper we develop and evaluate a probabilistic index structure for covering graph-structured content of sources adhering to Linked Data principles, provide an algorithm for answering conjunctive queries over Linked Data on the Web exploiting this structure, and evaluate the system using synthetically generated queries. We find that our lightweight index structure enables more complete query results over Linked Data compared to direct lookup approaches, while keeping the overhead for additional lookups and index maintenance low.
[UHH+10] Jürgen Umbrich, Michael Hausenblas, Aidan Hogan, Axel Polleres, and Stefan Decker. Towards dataset dynamics: Change frequency of linked open data sources. In 3rd International Workshop on Linked Data on the Web (LDOW2010) at WWW2010, Raleigh, USA, April 2010. [ .pdf ]
Datasets in the LOD cloud are far from being static in their nature and how they are exposed. As resources are added and new links are set, applications consuming the data should be able to deal with these changes. In this paper we investigate how LOD datasets change and what sensible measures there are to accommodate dataset dynamics. We compare our findings with traditional, document-centric studies concerning the “freshness” of the document collections and propose metrics for LOD datasets.
[HHP+10] Aidan Hogan, Andreas Harth, Alexandre Passant, Stefan Decker, and Axel Polleres. Weaving the pedantic web. In 3rd International Workshop on Linked Data on the Web (LDOW2010) at WWW2010, Raleigh, USA, April 2010. [ .pdf ]
Over a decade after RDF was published as a W3C recommendation, publishing open and machine-readable content on the Web has recently received a lot more attention, including from corporate and governmental bodies; notably thanks to the Linked Open Data community, there now exists a rich vein of heterogeneous RDF data published on the Web (the so-called “Web of Data”) accessible to all. However, RDF publishers are prone to making errors which compromise the effectiveness of applications leveraging the resulting data. In this paper, we discuss common errors in RDF publishing, their consequences for applications, along with possible publisher-oriented approaches to improve the quality of structured, machine-readable and open data on the Web.
[dBPPV10] Jos de Bruijn, David Pearce, Axel Polleres, and Agustín Valverde. A semantical framework for hybrid knowledge bases. Knowledge and Information Systems (KAIS), 25(1):81--104, 2010. [ .pdf ]
In the ongoing discussion about combining rules and ontologies on the Semantic Web, a recurring issue is how to combine first-order classical logic with nonmonotonic rule languages. Whereas several modular approaches to define a combined semantics for such hybrid knowledge bases focus mainly on decidability issues, we tackle the matter from a more general point of view. In this paper we show how Quantified Equilibrium Logic (QEL) can function as a unified framework which embraces classical logic as well as disjunctive logic programs under the (open) answer set semantics. In the proposed variant of QEL we relax the unique names assumption, which was present in earlier versions of QEL. Moreover, we show that this framework elegantly captures the existing modular approaches for hybrid knowledge bases in a unified way.
[PAV+10] Julian Padget, Alexander Artikis, Wamberto Vasconcelos, Kostas Stathis, Viviane Torres da Silva, Eric Matson, and Axel Polleres, editors. Coordination, Organizations, Institutions, and Norms in Agent Systems V, volume 6069 of Lecture Notes in AI (LNAI). Springer, 2010.
COIN 2009 International Workshops: COIN@AAMAS 2009, Budapest, Hungary, May 2009; COIN@IJCAI 2009, Pasadena, USA, July 2009; COIN@MALLOW 2009, Turin, Italy, September 2009; Revised Selected Papers. This book constitutes the thoroughly refereed post-workshop proceedings of the International Workshop on Coordination, Organizations, Institutions, and Norms in Agent Systems, COIN 2009.
[PHHD10] Axel Polleres, Aidan Hogan, Andreas Harth, and Stefan Decker. Can we ever catch up with the web? Semantic Web -- Interoperability, Usability, Applicability (SWJ), 1(1-2):45--52, 2010. [ .pdf ]
The Semantic Web is about to grow up. By efforts such as the Linking Open Data initiative, we finally find ourselves at the edge of a Web of Data becoming reality. Standards such as OWL 2, RIF and SPARQL 1.1 shall allow us to reason with and ask complex structured queries on this data, but still they do not play together smoothly and robustly enough to cope with huge amounts of noisy Web data. In this paper, we discuss open challenges relating to querying and reasoning with Web data and raise the question: can the burgeoning Web of Data ever catch up with the now ubiquitous HTML Web?

2009


[ZSFP09] Antoine Zimmermann, Ratnesh Sahay, Ronan Fox, and Axel Polleres. Heterogeneity and context in semantic-web-enabled HCLS systems. In Robert Meersman, Tharam S. Dillon, and Pilar Herrero, editors, OTM 2009, Part II: Proceedings of the 8th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE 2009), volume 5871 of Lecture Notes in Computer Science (LNCS), pages 1165--1182, Vilamoura, Algarve, Portugal, November 2009. Springer. [ .pdf ]
The need for semantics preserving integration of complex data has been widely recognized in the healthcare domain. While standards such as Health Level Seven (HL7) have been developed in this direction, they have mostly been applied in limited, controlled environments, still being used incoherently across countries, organizations, or hospitals. In a more mobile and global society, data and knowledge are going to be commonly exchanged between various systems at Web scale. Specialists in this domain have increasingly argued in favor of using Semantic Web technologies for modeling healthcare data in a well formalized way. This paper provides a reality check on how far current Semantic Web standards can tackle interoperability issues arising in such systems, driven by the modeling of concrete use cases on exchanging clinical data and practices. Recognizing the insufficiency of standard OWL to model our scenario, we survey theoretical approaches to extend OWL by modularity and context towards handling heterogeneity in Semantic-Web-enabled health care and life sciences (HCLS) systems. We come to the conclusion that none of these approaches addresses all of our use case heterogeneity aspects in its entirety. We finally sketch paths on how better approaches could be devised by combining several existing techniques.
[CDC+09] Stéphane Corlosquet, Renaud Delbru, Tim Clark, Axel Polleres, and Stefan Decker. Produce and consume linked data with drupal! In Abraham Bernstein, David R. Karger, Tom Heath, Lee Feigenbaum, Diana Maynard, Enrico Motta, and Krishnaprasad Thirunarayan, editors, Proceedings of the 8th International Semantic Web Conference (ISWC 2009), volume 5823 of Lecture Notes in Computer Science (LNCS), pages 763--778, Washington DC, USA, October 2009. Springer. Best paper award In-Use track. [ .pdf ]
Currently a large number of Web sites are driven by Content Management Systems (CMS) which manage textual and multimedia content but also - inherently - carry valuable information about a site's structure and content model. Exposing this structured information to the Web of Data has so far required considerable expertise in RDF and OWL modelling and additional programming effort. In this paper we tackle one of the most popular CMS: Drupal. We enable site administrators to export their site content model and data to the Web of Data without requiring extensive knowledge on Semantic Web technologies. Our modules create RDFa annotations and -- optionally -- a SPARQL endpoint for any Drupal site out of the box. Likewise, we add the means to map the site data to existing ontologies on the Web with a search interface to find commonly used ontology terms. We also allow a Drupal site administrator to include existing RDF data from remote SPARQL endpoints on the Web in the site. When brought together, these features allow networked RDF Drupal sites that reuse and enrich Linked Data. We finally discuss the adoption of our modules and report on a use case in the biomedical field and the current status of its deployment.
[IKMP09a] Giovambattista Ianni, Thomas Krennwallner, Alessandra Martello, and Axel Polleres. Dynamic querying of mass-storage RDF data with rule-based entailment regimes. In Abraham Bernstein, David R. Karger, Tom Heath, Lee Feigenbaum, Diana Maynard, Enrico Motta, and Krishnaprasad Thirunarayan, editors, Proceedings of the 8th International Semantic Web Conference (ISWC 2009), volume 5823 of Lecture Notes in Computer Science (LNCS), pages 310--327, Washington DC, USA, October 2009. Springer. [ .pdf ]
RDF Schema (RDFS) as a lightweight ontology language is gaining popularity and, consequently, tools for scalable RDFS inference and querying are needed. SPARQL has recently become a W3C standard for querying RDF data, but it mostly provides means for querying simple RDF graphs only, whereas querying with respect to RDFS or other entailment regimes is left outside the current specification. In this paper, we show that SPARQL faces certain unwanted ramifications when querying ontologies in conjunction with RDF datasets that comprise multiple named graphs, and we provide an extension for SPARQL that remedies these effects. Moreover, since RDFS inference has a close relationship with logic rules, we generalize our approach to select a custom ruleset for specifying inferences to be taken into account in a SPARQL query. We show that our extensions are technically feasible by providing benchmark results for RDFS querying in our prototype system, which uses Datalog coupled with a persistent relational database as a back-end for implementing SPARQL with dynamic rule-based inference. By employing different optimization techniques like magic set rewriting, our system remains competitive with state-of-the-art RDFS querying systems.
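The close relationship between RDFS inference and logic rules mentioned here can be made concrete with a toy forward-chaining sketch. The two rules shown correspond to the standard RDFS entailment patterns rdfs11 (subclass transitivity) and rdfs9 (type propagation); the naive fixpoint loop and the example data are our own illustration, not the paper's Datalog-based implementation.

```python
def rdfs_closure(triples):
    """Naive forward chaining of two RDFS rules until fixpoint:
    rdfs11: (A subClassOf B), (B subClassOf C) -> (A subClassOf C)
    rdfs9 : (X type A), (A subClassOf B)       -> (X type B)"""
    SC, TYPE = "rdfs:subClassOf", "rdf:type"
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        sub = {(s, o) for s, p, o in closed if p == SC}
        typ = {(s, o) for s, p, o in closed if p == TYPE}
        new = {(a, SC, c) for a, b in sub for b2, c in sub if b == b2}
        new |= {(x, TYPE, b) for x, a in typ for a2, b in sub if a == a2}
        for t in new:
            if t not in closed:
                closed.add(t)
                changed = True
    return closed

data = {
    ("ex:Cat", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
    ("ex:felix", "rdf:type", "ex:Cat"),
}
inferred = rdfs_closure(data) - data
```

From the three input triples, the fixpoint derives that `ex:Cat` is a subclass of `ex:Animal` and that `ex:felix` has the types `ex:Mammal` and `ex:Animal`.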
[PS09] Axel Polleres and Terrance Swift, editors. Web Reasoning and Rule Systems -- Third International Conference, RR2009, volume 5837 of Lecture Notes in Computer Science (LNCS), Chantilly, VA, USA, October 2009. Springer. [ http ]
This book constitutes the refereed proceedings of the Third International Conference on Web Reasoning and Rule Systems, RR 2009, held in Chantilly, VA, USA, in October 2009. The 15 revised full papers presented together with 3 invited papers were carefully reviewed and selected from 41 submissions. The papers address all current topics in Web reasoning and rule systems such as proof/deduction procedures, scalability, uncertainty, knowledge amalgamation and querying, and rules for decision support and production systems.
[BBB+09] Matteo Baldoni, Cristina Baroglio, Jamal Bentahar, Guido Boella, Massimo Cossentino, Mehdi Dastani, Barbara Dunin-Keplicz, Giancarlo Fortino, Marie-Pierre Gleizes, João Leite, Viviana Mascardi, Julian Padget, Juan Pavón, Axel Polleres, Amal El Fallah Seghrouchni, Paolo Torroni, and Rineke Verbrugge, editors. Proceedings of the Second Multi-Agent Logics, Languages, and Organisations Federated Workshops (MALLOW'009), volume 494 of CEUR Workshop Proceedings, Torino, Italy, September 2009. CEUR-WS.org. [ http ]
The Multi-Agent Logics, Languages, and Organisations Federated Workshops (MALLOW for short), in its second edition this year after the success of MALLOW'007 held in Durham (UK), is a forum for researchers interested in sharing their experiences in agents and multi-agent systems. MALLOW'009 was held at the Educatorio della Provvidenza, in Torino (Italy), from September 7th, 2009 through September 10th, 2009. This volume contains the proceedings of the five workshops, for a total of forty-seven high quality papers, which were selected by the programme committees of the workshops for presentation. Each workshop has an introductory essay, authored by the organizers, which presents the workshop.
[HKO+09] Michael Hausenblas, Philipp Kärger, Daniel Olmedilla, Alexandre Passant, and Axel Polleres, editors. Proceedings of the First Workshop on Trust and Privacy on the Social and Semantic Web (SPOT2009), volume 447 of CEUR Workshop Proceedings, Heraklion, Greece, June 2009. CEUR-WS.org. [ http ]
Workshop Proceedings. This workshop was co-located with ESWC 2009.
[IKMP09b] Giovambattista Ianni, Thomas Krennwallner, Alessandra Martello, and Axel Polleres. A rule system for querying persistent RDFS data. In Proceedings of the 6th European Semantic Web Conference (ESWC2009), Heraklion, Greece, May 2009. Springer. Demo Paper. [ .pdf ]
In this demo we present GiaBATA, a system for storing, aggregating, and querying Semantic Web data, based on declarative logic programming technology, namely on the dlvhex system, which allows us to implement a fully SPARQL compliant semantics, and on DLVDB , which extends the DLV system with persistent storage capabilities. Compared with off-the-shelf RDF stores and SPARQL engines, we offer more flexible support for rule-based RDFS and other higher entailment regimes by enabling custom reasoning via rules, and the possibility to choose the reference ontology on a per query basis. Due to the declarative approach, GiaBATA gains the possibility to apply well-known logic-level optimization features of logic programming (LP) and deductive database systems. Moreover, our architecture allows for extensions of SPARQL by non-standard features such as aggregates, custom built-ins, or arbitrary rulesets. With the resulting system we provide a flexible toolbox that embeds Semantic Web data and ontologies in a fully declarative LP environment.
[CCPD09] Stéphane Corlosquet, Richard Cyganiak, Axel Polleres, and Stefan Decker. RDFa in Drupal: Bringing cheese to the Web of Data. In 5th Workshop on Scripting and Development for the Semantic Web, Heraklion, Greece, May 2009. [ .pdf ]
A large number of web sites are driven by content management systems (CMS), which manage not only textual content but also structured data related to the site's topic. Exposing this information to the Web of Data has so far required considerable expertise in RDF modelling and programming. We present a plugin for the popular CMS Drupal that enables high-quality RDF output with minimal effort from site administrators. This has the potential of greatly increasing the amount and topical range of information available on the Web of Data.
[PPT+09] Danh Le Phuoc, Axel Polleres, Giovanni Tummarello, Christian Morbidoni, and Manfred Hauswirth. Rapid semantic web mashup development through semantic web pipes. In Proceedings of the 18th World Wide Web Conference (WWW2009), pages 581--590, Madrid, Spain, April 2009. ACM Press. [ .pdf ]
The use of RDF data published on the Web for applications is still a cumbersome and resource-intensive task due to the limited software support and the lack of standard programming paradigms to deal with everyday problems such as combination of RDF data from different sources, object identifier consolidation, ontology alignment and mediation or plain querying and processing tasks. While in a lot of other areas such tasks are supported by excellent libraries and component-oriented toolboxes of basic processing functionalities, RDF-based Web applications are still largely customized programs for a specific purpose, with little potential for reuse. This increases development costs and incurs a more error-prone development process. Speaking in software engineering terms, this means that a good standard architectural style with good support for rapid application development is still missing. In this paper we present a framework based on the classical abstraction of pipes which tries to remedy this problem and support the fast implementation of software, while preserving desirable properties such as abstraction, encapsulation, component-orientation, code re-usability and maintainability, which are common and well supported in other application areas.
[PM09] Axel Polleres and Malgorzata Mochol. Expertise bewerben und finden im Social Semantic Web. In Andreas Blumauer and Tassilo Pellegrini, editors, Social Semantic Web, pages 175--206. Springer, 2009. in German. [ .pdf ]
In this contribution we discuss conditions for combining, reusing, and extending existing RDF vocabularies in the Social Semantic Web. We focus on the application scenario of finding and advertising experts on the Web or in an intranet. We present how RDF vocabularies that are increasingly widespread in the Semantic Web on the one hand, and de facto standard formats used by everyday applications on the other hand (e.g. vCard, iCal or Dublin Core), can be combined to solve concrete use cases of expert search and expertise management. Our focus is on showing that practical application scenarios do not necessarily require the development of new ontologies; rather, the key lies in integrating existing, widely used, and mutually complementary formats into a coherent network of ontologies. This approach guarantees both direct applicability and low entry barriers for Semantic Web technologies, as well as easy integration into existing applications. The RDF formats available and used on the Web cover a large range of aspects for describing persons and expertise, but also show significant overlaps. So far there are few systematic approaches to connect these vocabularies, be it in the form of generally accepted practices defining when to use which format, or in the form of rules that formalise the overlaps between individual formats. This article analyses how existing formats for describing persons, organisations, and their expertise can be combined and, where necessary, extended. Furthermore, we discuss rule languages for describing format overlaps, as well as their practical usability for creating a network of ontologies for describing experts.
[HHP09] Aidan Hogan, Andreas Harth, and Axel Polleres. Scalable authoritative owl reasoning for the web. International Journal on Semantic Web and Information Systems (IJSWIS), 5(2):49--90, 2009. [ .pdf ]
In this paper we discuss the challenges of performing reasoning on large scale RDF datasets from the Web. Using ter-Horst's pD* fragment of OWL as a base, we compose a rule-based framework for application to web data: we argue our decisions using observations of undesirable examples taken directly from the Web. We further temper our OWL fragment through consideration of “authoritative sources” which counter-acts an observed behaviour which we term “ontology hijacking”: new ontologies published on the Web re-defining the semantics of existing entities resident in other ontologies. We then present our system for performing rule-based forward-chaining reasoning which we call SAOR: Scalable Authoritative OWL Reasoner. Based upon observed characteristics of web data and reasoning in general, we design our system to scale: our system is based upon a separation of terminological data from assertional data and comprises a lightweight in-memory index, on-disk sorts and file-scans. We evaluate our methods on a dataset in the order of a hundred million statements collected from real-world Web sources and present scale-up experiments on a dataset in the order of a billion statements collected from the Web.
[PKC+09] Alexandre Passant, Jacek Kopecký, Stéphane Corlosquet, Diego Berrueta, Davide Palmisano, and Axel Polleres. XSPARQL: Use cases, January 2009. W3C member submission. [ http ]
XSPARQL is a query language combining XQuery and SPARQL for transformations between RDF and XML. This document contains an overview of XSPARQL use cases within various scenarios.
[LKP+09] Nuno Lopes, Thomas Krennwallner, Axel Polleres, Waseem Akhtar, and Stéphane Corlosquet. XSPARQL: Implementation and Test-cases, January 2009. W3C member submission. [ http ]
XSPARQL is a query language combining XQuery and SPARQL for transformations between RDF and XML. This document provides a description of a prototype implementation of the language based on off-the-shelf XQuery and SPARQL engines. Along with a high-level description of the prototype the document presents a set of test queries and their expected output which are to be understood as illustrative help for possible other implementers.
[KLP09] Thomas Krennwallner, Nuno Lopes, and Axel Polleres. XSPARQL: Semantics, January 2009. W3C member submission. [ http ]
XSPARQL is a query language combining XQuery and SPARQL for transformations between RDF and XML. This document defines the semantics of XSPARQL.
[PKL+09] Axel Polleres, Thomas Krennwallner, Nuno Lopes, Jacek Kopecký, and Stefan Decker. XSPARQL Language Specification, January 2009. W3C member submission. [ http ]
XSPARQL is a query language combining XQuery and SPARQL for transformations between RDF and XML. XSPARQL subsumes XQuery and most of SPARQL (excluding ASK and DESCRIBE). This document defines the XSPARQL language.
[BDH+09] John Breslin, Stefan Decker, Manfred Hauswirth, Gearoid Hynes, Danh Le Phuoc, Alexandre Passant, Axel Polleres, Cornelius Rabsch, and Vinny Reynolds. Integrating social networks and sensor networks. In W3C Workshop on the Future of Social Networking, Barcelona, Spain, January 2009. [ .html ]
Sensors have begun to infiltrate people's everyday lives. They can provide information about a car's condition, can enable smart buildings, and are being used in various mobile applications, to name a few. Generally, sensors provide information about various aspects of the real world. Online social networks, another emerging trend over the past six or seven years, can provide insights into the communication links and patterns between people. They have enabled novel developments in communications as well as transforming the Web from a technical infrastructure to a social platform, very much along the lines of the original Web as proposed by Tim Berners-Lee, which is now often referred to as the Social Web. In this position paper, we highlight some of the interesting research areas where sensors and social networks can fruitfully interface, from sensors providing contextual information in context-aware and personalized social applications, to using social networks as “storage infrastructures” for sensor information.
[PKH+09] Alexandre Passant, Philipp Kärger, Michael Hausenblas, Daniel Olmedilla, Axel Polleres, and Stefan Decker. Enabling trust and privacy on the social web. In W3C Workshop on the Future of Social Networking, Barcelona, Spain, January 2009. [ .html ]
Based on our recent observations at the 7th International Semantic Web Conference and related workshops such as “Social Data on the Web”, as well as other frequent discussion threads on the Web, trust and privacy on the Social Web remain a hot, yet unresolved topic. Indeed, while Web 2.0 helped people to easily produce data, it led to various issues regarding how to protect and trust this data, especially when it comes to personal data. On the one hand, we wonder how to protect our private information online, above all when this information is re-used to our disadvantage. On the other hand, information should not only be protected when being published by its owners, but tools should also help users to assess the trustworthiness of third-party information online. Based on our recent research work, both from a theoretical and practical point of view, we think that Semantic Web technologies can provide at least partial solutions to enable a 'trust and privacy layer' on top of the Social Web. Hence, this position paper presents our work on the topic, which is, in our opinion, also particularly relevant to the mobile Web community, given the advances of ubiquitous social networking with, e.g., microblogging from mobile devices.
[PH09] Axel Polleres and David Huynh, editors. Journal of Web Semantics, Special Issue: The Web of Data, volume 7(3). Elsevier, 2009. Editorial.

2008


[HHP08a] Aidan Hogan, Andreas Harth, and Axel Polleres. SAOR: Authoritative Reasoning for the Web. In John Domingue and Chutiporn Anutariya, editors, Proceedings of the 3rd Asian Semantic Web Conference (ASWC 2008), volume 5367 of Lecture Notes in Computer Science (LNCS), pages 76--90, Bangkok, Thailand, December 2008. [ .pdf ]
In this paper we discuss the challenges of performing reasoning on large scale RDF datasets from the Web. We discuss issues and practical solutions relating to reasoning over web data using a rule-based approach to forward-chaining; in particular, we identify the problem of ontology hijacking: new ontologies published on the Web re-defining the semantics of existing concepts resident in other ontologies. Our solution introduces consideration of authoritative sources. Our system is designed to scale, comprising file-scans and selected lightweight on-disk indices. We evaluate our methods on a dataset in the order of a hundred million statements collected from real-world Web sources.
[KLOP08] Philipp Kärger, Nuno Lopes, Daniel Olmedilla, and Axel Polleres. Towards logic programs with ordered and unordered disjunction. In Workshop on Answer Set Programming and Other Computing Paradigms (ASPOCP 2008), December 2008. [ .pdf ]
Logic Programming paradigms that allow for expressing preferences have drawn a lot of research interest over the last few years. Among them, the principle of ordered disjunction was developed to express totally ordered preferences for the alternatives in rule heads. In this paper we introduce an extension of this approach called Disjunctive Logic Programs with Ordered Disjunction (DLPOD) that combines ordered disjunction with common disjunction in rule heads. By this extension, we enhance the preference notions expressible with totally ordered disjunctions to partially ordered preferences. Furthermore, we show that computing optimal stable models for DLPODs still stays in Σ^p_2 for head-cycle free programs and establish Σ^p_3 upper bounds for the general case.
[dBHP+08] Jos de Bruijn, Stijn Heymans, David Pearce, Axel Polleres, and Edna Ruckhaus, editors. Proceedings of the 3rd International Workshop on Applications of Logic Programming to the (Semantic) Web and Web Services (ALPSWS2008), volume 434 of CEUR Workshop Proceedings, Udine, Italy, December 2008. CEUR-WS.org. [ http ]
Workshop Proceedings. This workshop was co-located with ICLP 2008.
[EPS08b] Jérôme Euzenat, Axel Polleres, and François Scharffe. SPARQL extensions for processing alignments. IEEE Intelligent Systems, 23(6):82--84, November 2008. Appeared as part of the article “Making Ontologies Talk: Knowledge Interoperability in the Semantic Web”, Monika Lanzenberger and Jennifer Sampson (eds.). [ www: ]
We propose to extend the SPARQL query language to express mapping between ontologies. We use SPARQL queries as a mechanism for translating RDF data of one ontology to another. Such functionality lets users exploit instance data described in one ontology while they work with an application that's been designed for another. An example translation of FOAF (friend-of-a-friend) files into vCards shows how to use queries to extract data from the source ontology and generate new data for the target ontology.
[FHL+08] Alberto Fernandez, Conor Hayes, Nikos Loutas, Vassilios Peristeras, Axel Polleres, and Konstantinos Tarabanis. Closing the Service Discovery Gap by Collaborative Tagging and Clustering Techniques. In Proceedings of the 2nd International Workshop on Service Matchmaking and Resource Retrieval in the Semantic Web (SMR2 2008), Karlsruhe, Germany, October 2008. [ .pdf ]
Whereas the number of services that are provided online is growing rapidly, current service discovery approaches seem to have problems fulfilling their objectives. These existing approaches are hampered by the complexity of underlying semantic service models and by the fact that they try to impose a technical vocabulary on users. This leads to what we call the service discovery gap. In this paper we envision an approach that allows users first to query or browse services using free text tags, thus providing an interface in terms of the users' vocabulary instead of the service's vocabulary. Unlike simple keyword search, we envision tag clouds associated with services themselves as semantic descriptions carrying collaborative knowledge about the service that can be clustered hierarchically, forming lightweight “ontologies”. Besides tag-based discovery, which describes a service only at a global level, we envision refined tags and refined search/discovery in terms of the concepts common to all current semantic service description models, i.e. input, output, and operation. We argue that service matching can be achieved by applying tag-cloud-based service similarity on the one hand and by clustering services using case-based indexing and retrieval techniques on the other.
[DPTD08] Renaud Delbru, Axel Polleres, Giovanni Tummarello, and Stefan Decker. Context dependent reasoning for semantic documents in Sindice. In Proceedings of the 4th International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS 2008), Karlsruhe, Germany, October 2008. [ .pdf ]
The Sindice Semantic Web index currently provides search capabilities over more than 30 million documents. A scalable reasoning mechanism for real-world web data is important in order to increase the precision and recall of the Sindice index by inferring useful information (e.g. RDF Schema features, equality, property characteristics such as inverse functional properties, or annotation properties from OWL). In this paper, we introduce our notion of context dependent reasoning for RDF documents published on the Web according to the linked data principle. We then illustrate an efficient methodology to perform context dependent RDFS and partial OWL inference based on a persistent TBox composed of a network of web ontologies. Finally we report preliminary evaluation results of our implementation underlying the Sindice web data index.
[MZN+08] Malgorzata Mochol, Anna V. Zhdanova, Lyndon Nixon, John Breslin, and Axel Polleres, editors. Proceedings of the 3rd Expert Finder Workshop on Personal Identification and Collaborations: Knowledge Mediation and Extraction (PICKME 2008), volume 403 of CEUR Workshop Proceedings, Karlsruhe, Germany, October 2008. CEUR-WS.org. [ http ]
Workshop Proceedings. This workshop was co-located with ISWC 2008. The Semantic Web, Social Networks and other emerging technology streams promise to enable finding experts more efficiently on a Web scale, across boundaries. To leverage synergies among these streams, the ExpertFinder Initiative started in 2006 with the aim of devising vocabularies, rule extensions (e.g., for FOAF and SIOC) and best practices to annotate and extract expertise-relevant information from personal and organizational web pages, blogs, wikis, conferences, publication indexes, etc. Following two previous workshops - EFW and FEWS - PICKME2008 solicited new research contributions from the Semantic Web community towards the tasks of formally representing and reusing knowledge of skills and collaborations on the Web and, consequently, finding people according to their expertise.
[HHP08b] Aidan Hogan, Andreas Harth, and Axel Polleres. Scalable authoritative OWL reasoning on a billion triples. In ISWC2008 Semantic Web Challenge 2008 -- Billion Triples Track, Karlsruhe, Germany, October 2008. [ .pdf ]
In this paper we present a scalable algorithm for performing a subset of OWL reasoning over web data using a rule-based approach to forward-chaining; in particular, we identify the problem of ontology hijacking: new ontologies published on the Web re-defining the semantics of existing concepts resident in other ontologies. Our solution introduces consideration of authoritative sources. We present the results of applying our methods on a re-crawl of the billion triple challenge dataset.
[EIKP08] Thomas Eiter, Giovambattista Ianni, Thomas Krennwallner, and Axel Polleres. Rules and ontologies for the semantic web. In Cristina Baroglio, Piero A. Bonatti, Jan Maluszynski, Massimo Marchiori, Axel Polleres, and Sebastian Schaffert, editors, Reasoning Web 2008, volume 5224 of Lecture Notes in Computer Science (LNCS), pages 1--53. Springer, San Servolo Island, Venice, Italy, September 2008. [ .pdf ]
Rules and ontologies play a key role in the layered architecture of the Semantic Web, as they are used to ascribe meaning to, and to reason about, data on the Web. While the Ontology Layer of the Semantic Web is quite developed, and the Web Ontology Language (OWL) has been a W3C recommendation for several years already, the rules layer is far less developed and an active area of research; a number of initiatives and proposals have been made so far, but no standard has been released yet. Many rule engine implementations exist that deal with Semantic Web data in one way or another. This article gives a comprehensive, although not exhaustive, overview of such systems, describes their supported languages, and sets them in relation with theoretical approaches for combining rules and ontologies as foreseen in the Semantic Web architecture. In the course of this, we identify desired properties and common features of rule languages and evaluate existing systems against their support. Furthermore, we review technical problems underlying the integration of rules and ontologies, and classify representative proposals for theoretical integration approaches into different categories.
[KPP08] Matthias Klusch, Michal Pechoucek, and Axel Polleres, editors. Cooperative Information Agents XII, volume 5180 of Lecture Notes in Computer Science (LNCS), Prague, Czech Republic, September 2008. Springer. [ http ]
The objective of the international workshop series on cooperative information agents (CIA), since its establishment in 1997, has been to provide a distinguished, interdisciplinary forum for researchers, programmers, and managers to get informed about, present, and discuss the latest high-quality results in research and development of agent-based intelligent and cooperative information systems and applications for the Internet, Web and Semantic Web. Each event in the series offers regular and invited talks of excellence given by renowned experts in the field and a selected set of system demonstrations, and honors innovative research and development of information agents by means of a best paper award and a system innovation award, respectively. The proceedings of the series are regularly published as volumes of the Lecture Notes in Artificial Intelligence (LNAI) series of Springer. In keeping with its tradition, this year's workshop featured a sequence of regular and invited talks of excellence given by leading researchers covering a broad range of topics of interest. In particular, CIA 2008 featured five invited and nineteen regular papers selected from thirty-eight submissions. The result of the peer review of all contributions is included in this volume, which is, we think, again rich in interesting, inspiring, and advanced work on the research and development of intelligent information agents worldwide.
[BBM+08] Cristina Baroglio, Piero A. Bonatti, Jan Maluszynski, Massimo Marchiori, Axel Polleres, and Sebastian Schaffert, editors. Reasoning Web 2008, volume 5224 of Lecture Notes in Computer Science (LNCS). Springer, San Servolo Island, Venice, Italy, September 2008. [ http ]
The Reasoning Web summer school series is a well-established event, attracting experts from academia and industry as well as PhD students interested in foundational and applicative aspects of the Semantic Web. This volume contains the lecture notes of the fourth edition, which took place in Venice, Italy, in September 2008. This year, the school focused on some important application domains where Semantic Web techniques proved to be particularly effective or promising in tackling application needs.
[TDB+08] Hong-Linh Truong, Schahram Dustdar, Dino Baggio, Stéphane Corlosquet, Christoph Dorn, Giovanni Giuliani, Robert Gombotz, Yi Hong, Pete Kendal, Christian Melchiorre, Sarit Moretzky, Sébastien Peray, Axel Polleres, Stephan Reiff-Marganiec, Daniel Schall, Simona Stringa, Marcel Tilly, and Hong Qing Yu. inContext: a pervasive and collaborative working environment for emerging team forms. In International Symposium on Applications and the Internet (SAINT 2008), pages 118--125, Turku, Finland, July 2008. IEEE Computer Society. [ .pdf ]
Participants in current team collaborations belong to different organizations, work on multiple objectives at the same time, and frequently change locations. They use different devices and infrastructures in collaboration processes that can last from a few hours to several years. All these factors pose new challenges to the development of collaborative working environments (CWEs). Existing CWEs are unable to support emerging teams because diverse collaboration services are neither well integrated nor adapted to the team context. We present the inContext approach to providing a novel pervasive CWE infrastructure for emerging team forms. inContext aggregates disparate collaboration services using Web services and Semantic Web technologies and provides a platform that captures diverse dynamic aspects of team collaborations. By utilizing runtime and historical context and interaction information, adaptation techniques can be deployed to cope with the changes of emerging teams.
[TDC+08] Hong-Linh Truong, Christoph Dorn, Giovanni Casella, Axel Polleres, Stephan Reiff-Marganiec, and Schahram Dustdar. inContext: On coupling and sharing context for collaborative teams. In 14th International Conference of Concurrent Enterprising (ICE 2008), pages 225--232, Lisboa, Portugal, June 2008. [ .pdf ]
Today's team members have difficulties keeping track of the relations between their various, concurrent activities due to the lack of suitable tools supporting context coupling and sharing. Furthermore, collaboration services are hardly aware of the related context of team members and their activities. Such awareness is required to adapt to the dynamics of collaborative teams. In this paper, we discuss the context coupling techniques provided by the inContext project. Utilizing the concept of activity-based context and Web services techniques, we can couple individual and team contexts at runtime, thus improving the context-awareness and adaptation of collaboration services such as email, shared calendars, instant messaging and document management.
[MPP+08] Christian Morbidoni, Danh Le Phuoc, Axel Polleres, Matthias Samwald, and Giovanni Tummarello. Previewing semantic web pipes. In Proceedings of the 5th European Semantic Web Conference (ESWC2008), pages 843--848, Tenerife, Spain, June 2008. Springer. Demo Paper. [ .pdf ]
In this demo we present a first implementation of Semantic Web Pipes, a powerful tool to build RDF-based mashups. Semantic Web pipes are defined in XML and when executed they fetch RDF graphs on the Web, operate on them, and produce an RDF output which is itself accessible via a stable URL. Humans can also use pipes directly thanks to HTML wrapping of the pipe parameters and outputs. The implementation we will demo includes an online AJAX pipe editor and execution engine. Pipes can be published and combined thus fostering collaborative editing and reuse of data mashups.
[PPWW08] Reinhard Pichler, Axel Polleres, Fang Wei, and Stefan Woltran. Entailment for domain-restricted RDF. In Proceedings of the 5th European Semantic Web Conference (ESWC2008), volume 5021 of Lecture Notes in Computer Science (LNCS), pages 200--214, Tenerife, Spain, June 2008. Springer. [ .pdf ]
We introduce domain-restricted RDF (dRDF), which allows associating an RDF graph with a fixed, finite domain that interpretations for it may range over. We show that dRDF is a real extension of RDF and discuss impacts on the complexity of entailment in dRDF. The entailment problem represents the key reasoning task for RDF and is well known to be NP-complete. Remarkably, we show that the restriction of domains in dRDF raises the complexity of entailment from NP- to Π₂ᵖ-completeness. In order to lower the complexity of entailment for both domain-restricted and unrestricted graphs, we take a closer look at the graph structure. For cases where the structure of RDF graphs is restricted via the concept of bounded treewidth, we prove that entailment is tractable for unrestricted graphs and coNP-complete for domain-restricted graphs.
[AKKP08] Waseem Akhtar, Jacek Kopecky, Thomas Krennwallner, and Axel Polleres. XSPARQL: Traveling between the XML and RDF worlds -- and avoiding the XSLT pilgrimage. In Proceedings of the 5th European Semantic Web Conference (ESWC2008), volume 5021 of Lecture Notes in Computer Science (LNCS), pages 432--447, Tenerife, Spain, June 2008. Springer. Nominated for best paper award. [ .pdf ]
With currently available tools and languages, translating between an existing XML format and RDF is a tedious and error-prone task. The importance of this problem is acknowledged by the W3C GRDDL working group who faces the issue of extracting RDF data out of existing HTML or XML files, as well as by the Web service community around SAWSDL, who need to perform lowering and lifting between RDF data from a semantic client and XML messages for a Web service. However, at the moment, both these groups rely solely on XSLT transformations between RDF/XML and the respective other XML format at hand. In this paper, we propose a more natural approach for such transformations based on merging XQuery and SPARQL into the novel language XSPARQL. We demonstrate that XSPARQL provides concise and intuitive solutions for mapping between XML and RDF in either direction, addressing both the use cases of GRDDL and SAWSDL. We also provide and describe an initial implementation of an XSPARQL engine, available for user evaluation.
[EPS08a] Jérôme Euzenat, Axel Polleres, and François Scharffe. Processing ontology alignments with SPARQL. In International Workshop on Ontology Alignment and Visualization - OnAV'08, Proceedings of the Second International Conference on Complex, Intelligent and Software Intensive Systems, pages 913--917, Barcelona, Spain, March 2008. IEEE Computer Society. [ .pdf ]
Solving problems raised by heterogeneous ontologies can be achieved by matching the ontologies and processing the resulting alignments. This is typical of data mediation in which the data must be translated from one knowledge source to another. In this position paper we propose to solve the data translation problem, i.e. the processing part, using the SPARQL query language. Indeed, such a language is particularly adequate for extracting data from one ontology and, through its CONSTRUCT statement, for generating new data. We present examples of such transformations, but we also present a set of example correspondences illustrating the needs for particular representation constructs, such as aggregates, value-generating built-in functions and paths, which are missing from SPARQL. Hence, we advocate the use of two SPARQL extensions providing these missing features.

2007


[MPT07b] Christian Morbidoni, Axel Polleres, and Giovanni Tummarello. Who the FOAF knows Alice? RDF Revocation in DBin 2.0. In 4th Italian Semantic Web Workshop SEMANTIC WEB APPLICATIONS AND PERSPECTIVES (SWAP), Bari, Italy, December 2007. [ .pdf ]
In this paper we take a bottom-up view of RDF(S) reasoning. We discuss some issues and requirements on reasoning towards effectively building Semantic Web Pipes, aggregating and patching RDF data from various distributed sources. Even if we leave out complex description logics reasoning and restrict ourselves to the RDF world, it turns out that some problems, in particular how to deal with contradicting RDF statements and patching RDF graphs, do not yet find proper solutions within the current Semantic Web stack. Besides theoretical solutions which involve full DL reasoning, we believe that more practical and probably more scalable solutions are conceivable, one of which we discuss in this paper: namely, we provide means to express revocations in RDF and resolve such revocations by means of a specialized RDF merge procedure. We have implemented this conflict-resolving merge procedure in the DBin 2.0 system.
[FPO07] Alberto Fernandez, Axel Polleres, and Sascha Ossowski. Towards Fine-grained Service Matchmaking by Using Concept Similarity. In Proceedings of the SMR2 2007 Workshop on Service Matchmaking and Resource Retrieval in the Semantic Web (SMR2 2007), volume 243 of CEUR Workshop Proceedings, pages 31--45, Busan, Korea, November 2007. CEUR-WS.org. [ .pdf ]
Several description frameworks to semantically describe and match services on the one hand and service requests on the other have been presented in the literature. Many of the current proposals for defining notions of match between service advertisements and requests are based on subsumption checking in more or less expressive Description Logics, thus providing boolean match functions rather than a fine-grained, numerical degree of match. By contrast, concept similarity measures investigated in the DL literature explicitly include such a quantitative notion. In this paper we try to take a step forward in this area by means of an analysis of existing approaches from both semantic web service matching and concept similarity, and provide preliminary ideas on how to combine these two building blocks in a unified service selection framework.
[dNLP+07] Tommaso di Noia, Rubén Lara, Axel Polleres, Ioan Toma, Takahiro Kawamura, Matthias Klusch, Abraham Bernstein, Massimo Paolucci, Alain Leger, and David Martin, editors. Proceedings of the SMR2 2007 Workshop on Service Matchmaking and Resource Retrieval in the Semantic Web (SMR2 2007), volume 243 of CEUR Workshop Proceedings, Busan, Korea, November 2007. CEUR-WS.org. [ http ]
Workshop Proceedings. This workshop was co-located with ISWC 2007 + ASWC 2007.
[MPT07a] Christian Morbidoni, Axel Polleres, and Giovanni Tummarello. Who the FOAF knows Alice? A needed step towards Semantic Web Pipes. In ISWC 2007 Workshop on New forms of Reasoning for the Semantic Web: Scaleable, Tolerant and Dynamic, Busan, Korea, November 2007. [ .pdf ]
In this paper we take a bottom-up view of RDF(S) reasoning. We discuss some issues and requirements on reasoning towards effectively building Semantic Web Pipes, aggregating RDF data from various distributed sources. If we leave out complex description logics reasoning and restrict ourselves to the RDF world, it turns out that some problems, in particular how to deal with contradicting RDF statements, do not yet find proper solutions within the current Semantic Web stack. Besides theoretical solutions which involve full DL reasoning, we believe that more practical and probably more scalable solutions are conceivable, one of which we discuss in this paper: namely, expressing and resolving conflicting RDF statements by means of a specialized RDF merge procedure. We implemented this conflict-resolving merge procedure in the DBin system.
[PSS07] Axel Polleres, François Scharffe, and Roman Schindlauer. SPARQL++ for mapping between RDF vocabularies. In OTM 2007, Part I : Proceedings of the 6th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE 2007), volume 4803 of Lecture Notes in Computer Science (LNCS), pages 878--896, Vilamoura, Algarve, Portugal, November 2007. Springer. [ .pdf ]
Lightweight ontologies in the form of RDF vocabularies such as SIOC, FOAF, vCard, etc. are increasingly being used and exported by “serious” applications. Such vocabularies, together with query languages like SPARQL, also allow resulting RDF data to be syndicated from arbitrary Web sources and open the path to finally bringing the Semantic Web to operation mode. Considering, however, that many of the promoted lightweight ontologies overlap, the lack of suitable standards to describe these overlaps in a declarative fashion becomes evident. In this paper we argue that one does not necessarily need to delve into the huge body of research on ontology mapping for a solution: SPARQL itself might --- with extensions such as external functions and aggregates --- serve as a basis for declaratively describing ontology mappings. We provide the semantic foundations and a path towards implementation for such a mapping language by means of a translation to Datalog with external predicates.
[HPD07] Andreas Harth, Axel Polleres, and Stefan Decker. Towards a social provenance model for the web. In Workshop on Principles of Provenance (PrOPr), Edinburgh, Scotland, November 2007. [ .pdf ]
In this position paper we firstly present the established notion of provenance on the Semantic Web (also referred to as named graphs or contexts), and secondly argue for the benefit of adding to the pure technical notion of provenance a social dimension to associate provenance with the originator (typically a person) of a given piece of information.
[BKPP07] Harold Boley, Michael Kifer, Paula-Lavinia Pătrânjan, and Axel Polleres. Rule interchange on the web. In Reasoning Web 2007, volume 4636 of Lecture Notes in Computer Science (LNCS), pages 269--309. Springer, September 2007. [ .pdf ]
Rules play an increasingly important role in a variety of Semantic Web applications as well as in traditional IT systems. As a universal medium for publishing information, the Web is envisioned to become the place for publishing, distributing, and exchanging rule-based knowledge. Realizing the importance and the promise of this vision, the W3C has created the Rule Interchange Format Working Group (RIF WG) and chartered it to develop an interchange format for rules in alignment with the existing standards in the Semantic Web architecture stack. However, creating a generally accepted interchange format is by no means a trivial task. First, there are different understandings of what a “rule” is. Researchers and practitioners distinguish between deduction rules, normative rules, production rules, reactive rules, etc. Second, even within the same category of rules, systems use different (often incompatible) semantics and syntaxes. Third, existing Semantic Web standards, such as RDF and OWL, show incompatibilities with many kinds of rule languages at a conceptual level. This article discusses the role that different kinds of rule languages and systems play on the Web, illustrates the problems and opportunities in exchanging rules through a standardized format, and provides a snapshot of the current work of the W3C RIF WG.
[PS07] Axel Polleres and Roman Schindlauer. dlvhex-sparql: A SPARQL-compliant query engine based on dlvhex. In 2nd International Workshop on Applications of Logic Programming to the Web, Semantic Web and Semantic Web Services (ALPSWS2007), volume 287 of CEUR Workshop Proceedings, pages 3--12, Porto, Portugal, September 2007. CEUR-WS.org. [ .pdf ]
This paper describes the dlvhex SPARQL plugin, a query processor for the upcoming Semantic Web query language standard by W3C. We report on the implementation of this language using dlvhex, a flexible plugin system on top of the DLV solver. This work advances our earlier translation based on the semantics by Perez et al. towards an engine which is fully compliant to the official SPARQL specification. As it turns out, the differences between these two definitions of SPARQL, which might seem moderate at first glance, need some extra machinery. We also briefly report the status of the implementation and extensions currently being implemented, such as handling of aggregates, nested CONSTRUCT queries in the spirit of networked RDF graphs, and partial support of RDFS entailment. For such extensions a tight integration of SPARQL query processing and Answer-Set Programming, the underlying logic programming formalism of our engine, turns out to be particularly useful, as the resulting programs can actually involve unstratified negation.
[HPP+07] Stijn Heymans, David Pearce, Axel Polleres, Edna Ruckhaus, and Gopal Gupta, editors. ALPSWS2007: 2nd International Workshop on Applications of Logic Programming in the Semantic Web and Semantic Web Services. Proceedings, volume 287 of CEUR Workshop Proceedings, Porto, Portugal, September 2007. CEUR-WS.org. [ http ]
Workshop Proceedings. This workshop was co-located with ICLP 2007.
[AGP+07] Marcelo Arenas, Claudio Gutierrez, Bijan Parsia, Jorge Pérez, Axel Polleres, and Andy Seaborne. SPARQL -- where are we? Current state, theory and practice, June 2007. Slides available at http://www.polleres.net/sparqltutorial/. [ http ]
Tutorial at the 4th European Semantic Web Conference (ESWC2007)
[AMBB+07] Boanerges Aleman-Meza, Uldis Bojars, Harold Boley, John G. Breslin, Malgorzata Mochol, Lyndon J.B. Nixon, Axel Polleres, and Anna V. Zhdanova. Combining RDF vocabularies for expert finding. In Enrico Franconi, Michael Kifer, and Wolfgang May, editors, Proceedings of the 4th European Semantic Web Conference (ESWC2007), volume 4519 of Lecture Notes in Computer Science (LNCS), pages 235--250, Innsbruck, Austria, June 2007. Springer. Slides available at http://www.polleres.net/publications/alem-etal-2007eswc-slides.pdf. [ .pdf ]
This paper presents a framework for the reuse and extension of existing, established vocabularies in the Semantic Web. Driven by the primary application of expert finding, we will explore the reuse of vocabularies that have attracted a considerable user community already (FOAF, SIOC, etc.) or are derived from de facto standards used in tools or industrial practice (such as vCard, iCal and Dublin Core). This focus guarantees direct applicability and low entry barriers, unlike when devising a new ontology from scratch. The Web is already populated with several vocabularies which complement each other (but also have considerable overlap) in that they cover a wide range of necessary features to adequately describe the expert finding domain. Little effort has been made so far to identify and compare existing approaches, and to devise best practices on how to use and extend various vocabularies conjointly. It is the goal of the recently started ExpertFinder initiative to fill this gap. In this paper we present the ExpertFinder framework for reuse and extension of existing vocabularies in the Semantic Web. We provide a practical analysis of overlaps and options for combined use and extensions of several existing vocabularies, as well as a proposal for applying rules and other enabling technologies to the expert finding task.
[dBPPV07] Jos de Bruijn, David Pearce, Axel Polleres, and Agustín Valverde. Quantified equilibrium logic and hybrid rules. In Massimo Marchiori, Jeff Z. Pan, and Christian de Sainte Marie, editors, First International Conference on Web Reasoning and Rule Systems (RR2007), volume 4524 of Lecture Notes in Computer Science (LNCS), pages 58--72, Innsbruck, Austria, June 2007. Springer. [ .pdf ]
In the ongoing discussion about combining rules and ontologies on the Semantic Web, a recurring issue is how to combine first-order classical logic with nonmonotonic rule languages. Whereas several modular approaches to define a combined semantics for such hybrid knowledge bases focus mainly on decidability issues, we tackle the matter from a more general point of view. In this paper we show how Quantified Equilibrium Logic (QEL) can function as a unified framework which embraces classical logic as well as disjunctive logic programs under the (open) answer set semantics. In the proposed variant of QEL we relax the unique names assumption, which was present in earlier versions of QEL. Moreover, we show that this framework elegantly captures the existing modular approaches for hybrid knowledge bases in a unified way.
[BBB+07] Uldis Bojars, John G. Breslin, Diego Berrueta, Dan Brickley, Stefan Decker, Sergio Fernández, Christoph Görn, Andreas Harth, Tom Heath, Kingsley Idehen, Kjetil Kjernsmo, Alistair Miles, Alexandre Passant, Axel Polleres, Luis Polo, and Michael Sintek. SIOC Core Ontology Specification, June 2007. W3C member submission. [ http ]
The SIOC (Semantically-Interlinked Online Communities) Core Ontology provides the main concepts and properties required to describe information from online communities (e.g., message boards, wikis, weblogs, etc.) on the Semantic Web. This document contains a detailed description of the SIOC Core Ontology.
[Pol07] Axel Polleres. From SPARQL to rules (and back). In Proceedings of the 16th World Wide Web Conference (WWW2007), pages 787--796, Banff, Canada, May 2007. ACM Press. Extended technical report version available at http://www.polleres.net/TRs/GIA-TR-2006-11-28.pdf, slides available at http://www.polleres.net/publications/poll-2007www-slides.pdf. [ DOI | http ]
As the data and ontology layers of the Semantic Web stack have achieved a certain level of maturity in standard recommendations such as RDF and OWL, the current focus lies on two related aspects. On the one hand, the definition of a suitable query language for RDF, SPARQL, is close to recommendation status within the W3C. The establishment of the rules layer on top of the existing stack on the other hand marks the next step to be taken, where languages with their roots in Logic Programming and Deductive Databases are receiving considerable attention. The purpose of this paper is threefold. First, we discuss the formal semantics of SPARQL extending recent results in several ways. Second, we provide translations from SPARQL to Datalog with negation as failure. Third, we propose some useful and easy to implement extensions of SPARQL, based on this translation. As it turns out, the combination serves for direct implementations of SPARQL on top of existing rules engines as well as a basis for more general rules and query languages on top of RDF.
[PPVW07] David Pearce, Axel Polleres, Agustín Valverde, and Stefan Woltran, editors. Workshop on Correspondence and Equivalence for Nonmonotonic Theories (CENT 2007) Working Notes, volume 265 of CEUR Workshop Proceedings, Tempe, AZ, May 2007. CEUR-WS.org. [ http ]
Workshop Proceedings. This workshop was co-located with LPNMR 2007.
[BFM+07] Martin Brain, Wolfgang Faber, Marco Maratea, Axel Polleres, Torsten Schaub, and Roman Schindlauer. What should an ASP solver output? A multiple position paper. In Marina De Vos and Torsten Schaub, editors, First International Workshop on Software Engineering for Answer Set Programming 2007 (SEA'07), pages 26--37, Tempe, AZ, May 2007. [ .pdf ]
This position paper raises some issues regarding the output of solvers for Answer Set Programming and discusses experiences made in several different settings. The first set of issues was raised in the context of the first ASP system competition, which led to a first suggestion for a standardised yet miniature output format. We then turn to experiences made in related fields, like Satisfiability Checking, and finally adopt an application point of view by investigating interface issues both with simple tools and in the context of the Semantic Web and query answering.
[dBEPT07] Jos de Bruijn, Thomas Eiter, Axel Polleres, and Hans Tompits. Embedding non-ground logic programs into autoepistemic logic for knowledge-base combination. In Twentieth International Joint Conference on Artificial Intelligence (IJCAI'07), pages 304--309, Hyderabad, India, January 2007. AAAI. [ .pdf ]
In the context of the Semantic Web, several approaches to the combination of ontologies, given in terms of theories of classical first-order logic, and rule bases have been proposed. They either cast rules into classical logic or limit the interaction between rules and ontologies. Autoepistemic logic (AEL) is an attractive formalism which allows these limitations to be overcome, by serving as a uniform host language into which ontologies and nonmonotonic logic programs can be embedded. For the latter, so far only the propositional setting has been considered. In this paper, we present several embeddings of normal and disjunctive non-ground logic programs under the stable-model semantics into first-order AEL, and compare them in combination with classical theories, with respect to stable expansions and autoepistemic consequences. Our results reveal differences and correspondences of the embeddings and provide a useful guidance in the choice of a particular embedding for knowledge combination.
[BBAM+07] John G. Breslin, Uldis Bojars, Boanerges Aleman-Meza, Harold Boley, Malgorzata Mochol, Lyndon J.B. Nixon, Axel Polleres, and Anna V. Zhdanova. Finding experts using internet-based discussions in online communities and associated social networks. In 1st International ExpertFinder Workshop, January 2007. [ .pdf ]
This position paper on expert finding presents a conceptual framework for the reuse and interlinking of existing, well-established vocabularies in the Semantic Web. Such a framework can be used to connect people with people, based on joint or complementing interests (e.g. the need to develop specific new or existing skills for upcoming projects). Driven by a requirement to find experts using the profiles of people in social networks and using the content they create in online communities, we are exploring the usage of vocabularies in these domains that have already gained considerable momentum and that have suitable concepts for this application area. We will present the relevant properties of the FOAF ontology for matching people and their skills in social networks, then detail the SIOC project and methods for identifying relevant discussion topics/individuals, and finally we will outline a combinatory scenario that will allow people to find individuals with the desired expertise in a particular domain of interest.
[LLP+07] Holger Lausen, Rubén Lara, Axel Polleres, Jos de Bruijn, and Dumitru Roman. Chapter 7: Description -- semantic annotation for web services. In Rudi Studer, Stephan Grimm, and Andreas Abecker, editors, Semantic Web Services, pages 179--209. Springer, 2007. [ http ]
Web Services have added a new level of functionality to the current Web, making the first step to achieve seamless integration of distributed components. Nevertheless, current Web Service technologies only address the syntactical aspects of a Web Service and, therefore, only provide a set of rigid services that cannot adapt to a changing environment without human intervention. The human programmer has to be kept in the loop and scalability as well as economy of Web Services are limited. The description of Web Services in a machine-understandable fashion is expected to have a great impact in areas of e-Commerce and Enterprise Application Integration, as it can enable dynamic and scalable cooperation between different systems and organisations. These great potential benefits have led to the establishment of an important research activity, both in industry and in academia, which aims at realising Semantic Web Services. This chapter outlines aspects of the description of semantic Web Services.

2006


[PS06] Axel Polleres and Roman Schindlauer. SPAR2QL: From SPARQL to rules. In International Semantic Web Conference (ISWC2006 -- Posters Track), Athens, GA, USA, November 2006. Abstract; the full poster is available at http://www.polleres.net/publications/poll-schi-2006-poster.pdf. [ .pdf ]
As the data and ontology layers of the Semantic Web stack have achieved a certain level of maturity in standard recommendations such as RDF and OWL, the current focus lies on two related aspects. On the one hand, the definition of a suitable query language for RDF, SPARQL, has just reached candidate recommendation status within the W3C. The establishment of the rules layer on top of the existing stack on the other hand marks the next step to be tackled, where especially languages with their roots in Logic Programming and Deductive Databases are receiving considerable attention. In this work we try to bridge the gap between these two efforts by providing translations between SPARQL and Datalog extended with negation and external built-in predicates. It appears that such a combination serves both as an underpinning for a more general rules and query language on top of RDF and SPARQL as well as for direct implementations of SPARQL on top of existing rules engines. Our prototype implementation is based on the datalog engine DLV. As it turns out, features of the language of this system can be fruitfully combined with SPARQL.
[dBPPV06] Jos de Bruijn, David Pearce, Axel Polleres, and Agustín Valverde. A logic for hybrid rules. In RuleML 2006 Workshop: Ontology and Rule Integration, November 2006. [ .pdf ]
In the ongoing discussion about rule extensions for Ontology languages on the Semantic Web a recurring issue is how to combine first-order classical logic with nonmonotonic rule languages. Whereas several modular approaches to define a combined semantics for such hybrid knowledge bases focus mainly on decidability issues, we tackle the matter from a more general point of view. In this paper we show how Quantified Equilibrium Logic (QEL) can function as a unified framework that embraces classical logic as well as disjunctive logic programs under the (open) answer set semantics. In the proposed variant of QEL we relax the unique names assumption from earlier versions. Moreover, we show that this framework elegantly captures several modular approaches to nonmonotonic semantics for hybrid knowledge bases.
[EIP+06] Thomas Eiter, Giovambattista Ianni, Axel Polleres, Roman Schindlauer, and Hans Tompits. Reasoning with rules and ontologies. In P. Barahona et al., editor, Reasoning Web 2006, volume 4126 of Lecture Notes in Computer Science (LNCS), pages 93--127. Springer, September 2006. [ .pdf ]
For realizing the Semantic Web vision, extensive work is underway for getting the layers of its conceived architecture ready. Given that the Ontology Layer has reached a certain level of maturity with W3C recommendations such as RDF and the OWL Web Ontology Language, current interest focuses on the Rules Layer and its integration with the Ontology Layer. Several proposals have been made for solving this problem, which does not have a straightforward solution due to various obstacles. One of them is the fact that evaluation principles like the closed-world assumption, which is common in rule languages, are usually not adopted in ontologies. Furthermore, naively adding rules to ontologies raises undecidability issues. In this paper, after giving a brief overview about the current state of the Semantic-Web stack and its components, we will discuss nonmonotonic logic programs under the answer-set semantics as a possible formalism of choice for realizing the Rules Layer. We will briefly discuss open issues in combining rules and ontologies, and survey some existing proposals to facilitate reasoning with rules and ontologies. We will then focus on description-logic programs (or dl-programs, for short), which realize a transparent integration of rules and ontologies supported by existing reasoning engines, based on the answer-set semantics. We will further discuss a generalization of dl-programs, viz. HEX-programs, which offer access to different ontologies as well as higher-order language constructs.
[dBEPT06] Jos de Bruijn, Thomas Eiter, Axel Polleres, and Hans Tompits. On representational issues about combinations of classical theories with nonmonotonic rules. In Proceedings of the 1st International Conference on Knowledge Science, Engineering and Management (KSEM'06), volume 4092 of Lecture Notes in Computer Science (LNCS), pages 1--22, Guilin, China, August 2006. Springer. Invited paper. [ .pdf ]
In the context of current efforts around Semantic-Web languages, the combination of classical theories in classical first-order logic (and in particular of ontologies in various description logics) with rule languages rooted in logic programming is receiving considerable attention. Existing approaches such as SWRL, dl-programs, and DL+log, differ significantly in the way ontologies interact with (nonmonotonic) rules bases. In this paper, we identify fundamental representational issues which need to be addressed by such combinations and formulate a number of formal principles which help to characterize and classify existing and possible future approaches to the combination of rules and classical theories. We use the formal principles to explicate the underlying assumptions of current approaches. Finally, we propose a number of settings, based on our analysis of the representational issues and the fundamental principles underlying current approaches.
[PDGdB06] Axel Polleres, Stefan Decker, Gopal Gupta, and Jos de Bruijn, editors. ALPSWS2006: Applications of Logic Programming in the Semantic Web and Semantic Web Services. Proceedings, volume 196 of CEUR Workshop Proceedings, Seattle, WA, August 2006. CEUR-WS.org. [ http ]
Workshop Proceedings. This workshop was co-located with ICLP 2006.
[PLL06] Axel Polleres, Holger Lausen, and Rubén Lara. Semantische Beschreibung von Web Services. In Tassilo Pellegrini and Andreas Blumauer, editors, Semantic Web -- Wege zur vernetzten Wissensgesellschaft. Springer, June 2006. (in German). [ .pdf ]
This chapter discusses application areas of, and approaches to, the semantic description of Web services. Existing Web service technologies make a decisive contribution to the development of distributed applications in that widely accepted standards are in place which govern the communication between applications and thereby enable their combination into more complex units. Automated mechanisms for discovering suitable Web services and for composing them, by contrast, are supported by existing technologies only to a comparatively small degree. Similarly to the annotation of static data in the “Semantic Web”, research and industry place great hopes in the semantic description of Web services as a means of largely automating these tasks.
[EIPS06] Thomas Eiter, Giovambattista Ianni, Axel Polleres, and Roman Schindlauer. Answer set programming for the semantic web, June 2006. Slides available at http://asptut.gibbi.com/. [ http ]
Tutorial at the 3rd European Semantic Web Conference (ESWC2006).
[PFH06] Axel Polleres, Cristina Feier, and Andreas Harth. Rules with contextually scoped negation. In Proceedings of the 3rd European Semantic Web Conference (ESWC2006), volume 4011 of Lecture Notes in Computer Science (LNCS), pages 332--347, Budva, Montenegro, June 2006. Springer. [ .pdf ]
Knowledge representation formalisms used on the Semantic Web adhere to a strict open world assumption. Therefore, nonmonotonic reasoning techniques are often viewed with scepticism. Especially negation as failure, which intuitively adopts a closed world view, is often claimed to be unsuitable for the Web where knowledge is notoriously incomplete. Nonetheless, it was suggested in the ongoing discussions around rules extensions for languages like RDF(S) or OWL to allow at least restricted forms of negation as failure, as long as negation has an explicitly defined, finite scope. Yet clear definitions of such “scoped negation” as well as formal semantics thereof are missing. We propose logic programs with contexts and scoped negation and discuss two possible semantics with desirable properties. We also argue that this class of logic programs can be viewed as a rule extension to a subset of RDF(S).
[dBLPF06] Jos de Bruijn, Holger Lausen, Axel Polleres, and Dieter Fensel. The web service modeling language: An overview. In Proceedings of the 3rd European Semantic Web Conference (ESWC2006), volume 4011 of Lecture Notes in Computer Science (LNCS), Budva, Montenegro, June 2006. Springer. Nominated for the 7 Years Most Influential ESWC Paper award at ESWC2013. [ DOI | .pdf ]
The Web Service Modeling Language (WSML) is a language for the specification of different aspects of Semantic Web Services. It provides a formal language for the Web Service Modeling Ontology WSMO which is based on well-known logical formalisms, specifying one coherent language framework for the description of Semantic Web Services, starting from the intersection of Datalog and the Description Logic SHIQ. This core language is extended in the directions of Description Logics and Logic Programming in a principled manner with strict layering. WSML distinguishes between conceptual and logical modeling in order to facilitate users who are not familiar with formal logic, while not restricting the expressive power of the language for the expert user. IRIs play a central role in WSML as identifiers. Furthermore, WSML defines XML and RDF serializations for inter-operation over the Semantic Web.
[Pol06] Axel Polleres. Logic programs with contextually scoped negation. In 20th Workshop on Logic Programming (WLP 2006), Vienna, Austria, February 2006. [ .pdf ]
The Semantic Web community is currently dominated by knowledge representation formalisms adhering to a strict open world assumption. Nonmonotonic reasoning formalisms are viewed with partial scepticism and it is often argued that nonmonotonic reasoning techniques which adopt a closed world assumption are invalid in an open environment such as the Web where knowledge is notoriously incomplete. Nonetheless, in the ongoing discussion about rule extensions for Semantic Web Languages like RDF(S) or OWL several proposals have been made to partly break with this view and to allow a restricted form of negation as failure. Recently, the term “scoped negation” emerged in discussions around this topic, yet a clear definition about the meaning of “scope” and “scoped negation” and a formal semantics are still missing. In this paper we provide preliminary results towards these missing definitions and define two possible semantics for logic programs with contextually scoped negation, which we propose as an extension of RDFS.
[EP06] Thomas Eiter and Axel Polleres. Towards automated integration of guess and check programs in answer set programming: A meta-interpreter and applications. Theory and Practice of Logic Programming (TPLP), 6(1-2):23--60, 2006. [ http ]
Answer set programming (ASP) with disjunction offers a powerful tool for declaratively representing and solving hard problems. Many NP-complete problems can be encoded in the answer set semantics of logic programs in a very concise and intuitive way, where the encoding reflects the typical “guess and check” nature of NP problems: The property is encoded in a way such that polynomial size certificates for it correspond to stable models of a program. However, the problem-solving capacity of full disjunctive logic programs (DLPs) is beyond NP, and captures a class of problems at the second level of the polynomial hierarchy. While these problems also have a clear “guess and check” structure, finding an encoding in a DLP reflecting this structure may sometimes be a non-obvious task, in particular if the “check” itself is a coNP-complete problem; usually, such problems are solved by interleaving separate guess and check programs, where the check is expressed by inconsistency of the check program. In this paper, we present general transformations of head-cycle free (extended) disjunctive logic programs into stratified and positive (extended) disjunctive logic programs based on meta-interpretation techniques. The answer sets of the original and the transformed program are in simple correspondence, and, moreover, inconsistency of the original program is indicated by a designated answer set of the transformed program. Our transformations facilitate the integration of separate “guess” and “check” programs, which are often easy to obtain, automatically into a single disjunctive logic program. Our results complement recent results on meta-interpretation in ASP, and extend methods and techniques for a declarative “guess and check” problem solving paradigm through ASP.
[FLP+06] Dieter Fensel, Holger Lausen, Axel Polleres, Jos de Bruijn, Michael Stollberg, Dumitru Roman, and John Domingue. Enabling Semantic Web Services : The Web Service Modeling Ontology. Springer, 2006. [ http ]
The goal of this book is to provide an insight into and an understanding of the problems faced by Web services and service-oriented architectures, as well as the promises and solutions of the Semantic Web. We focus particularly on the Web Service Modeling Ontology (WSMO), which provides a comprehensive conceptual framework for the fruitful combination of Semantic Web technologies and Web services. With the present book we want to give an overall understanding of the WSMO framework and show how it can be applied to the problems of service-oriented architectures. It is not a ready-to-install “user manual” for Semantic Web services that is provided with this book, but rather an in-depth introduction. While many of the related technologies and standards are still under development we nevertheless think it is not too early for such a book: it is important to create an awareness of this technology and think about it today rather than tomorrow. The technology might not be at an industrial strength maturity yet, but the problems are already.

2005


[HPvHG05] Martin Hepp, Axel Polleres, Frank van Harmelen, and Michael R. Genesereth, editors. Proceedings of the First International Workshop on Mediation in Semantic Web Services (MEDIATE 2005), volume 168 of CEUR Workshop Proceedings, Amsterdam, The Netherlands, December 2005. CEUR-WS.org. [ http ]
Workshop Proceedings. This workshop was co-located with ICSOC 2005.
[KHP+05] Reto Krummenacher, Martin Hepp, Axel Polleres, Christoph Bussler, and Dieter Fensel. WWW or What is Wrong with Web Services. In Welf Löwe and Jean-Philippe Martin-Flatin, editors, Proceedings of the 3rd European Conference on Web Services (ECOWS 2005), pages 235--243, Växjö, Sweden, November 2005. IEEE Computer Society. [ .pdf ]
A core paradigm of the Web is information exchange via persistent publication, i.e., one party publishes a piece of information on the Web, and any other party who knows the location of the resource can retrieve and process the information at any later point in time and without the need for synchronization with the original publisher. This functionality significantly contributed to the scalability of the Web, since it reduced the amount of interaction between the sender and receiver. Current approaches of extending the World Wide Web from a collection of human-readable information, connecting humans, into a network that connects computing devices based on machine-processable semantics of data lack this feature and are instead based on tightly-coupled message exchange. In this paper, we (1) show that Web services based on the message-exchange paradigm are not fully compliant with core paradigms of the Web itself, (2) outline how the idea of persistent publication as a communication paradigm can be beneficially applied to Web services, and (3) propose a minimal architecture for fully Web-enabled Semantic Web services based on publication in shared information spaces, which we call Triple Space Computing.
[ABdB+05] Jürgen Angele, Harold Boley, Jos de Bruijn, Dieter Fensel, Pascal Hitzler, Michael Kifer, Reto Krummenacher, Holger Lausen, Axel Polleres, and Rudi Studer. Web Rule Language (WRL), September 2005. W3C member submission. [ http ]
The Web Rule Language WRL is a rule-based ontology language for the Semantic Web. The language is located in the Semantic Web stack next to the Description Logic based Ontology language OWL. WRL defines three variants, namely Core, Flight and Full. The Core variant marks the common fragment between WRL and OWL. WRL-Flight is a Datalog-based rule language. WRL-Full is a full-fledged rule language with function symbols and negation under the Well-Founded Semantics.
[FDP+05] Cristina Feier, Dumitru Roman, Axel Polleres, John Domingue, Michael Stollberg, and Dieter Fensel. Towards intelligent web services: The web service modeling ontology (WSMO). In Proceedings of the 2005 International Conference on Intelligent Computing (ICIC'05), Hefei, China, August 2005. [ .pdf ]
The Semantic Web and Semantic Web Services form a natural application area for Intelligent Agents, namely querying and reasoning about structured knowledge and semantic descriptions of services and their interfaces on the Web. This paper provides an overview of the Web Service Modeling Ontology, a conceptual framework for the semantic description of Web services.
[FKL+05] Dieter Fensel, Uwe Keller, Holger Lausen, Axel Polleres, and Ioan Toma. What is wrong with Web service discovery. In W3C Workshop on Frameworks for Semantics in Web Services, Innsbruck, Austria, June 2005. [ .pdf ]
[dBFK+05] Jos de Bruijn, Dieter Fensel, Uwe Keller, Michael Kifer, Reto Krummenacher, Holger Lausen, Axel Polleres, and Livia Predoiu. Web Service Modeling Language (WSML), June 2005. W3C member submission. [ http ]
In this document, we introduce the Web Service Modeling Language WSML which provides a formal syntax and semantics for the Web Service Modeling Ontology WSMO. WSML is based on different logical formalisms, namely, Description Logics, First-Order Logic and Logic Programming, which are useful for the modeling of Semantic Web services. WSML consists of a number of variants based on these different logical formalisms, namely WSML-Core, WSML-DL, WSML-Flight, WSML-Rule and WSML-Full. WSML-Core corresponds with the intersection of Description Logic and Horn Logic. The other WSML variants provide increasing expressiveness in the direction of Description Logics and Logic Programming. Finally, both paradigms are unified in WSML-Full, the most expressive WSML variant. WSML is specified in terms of a normative human-readable syntax. Besides the human-readable syntax, WSML has an XML and an RDF syntax for exchange over the Web and for interoperation with RDF-based applications. Furthermore, we provide a mapping between WSML ontologies and OWL ontologies for interoperation with OWL ontologies through a common semantic subset of OWL and WSML.
[dBBD+05] Jos de Bruijn, Christoph Bussler, John Domingue, Dieter Fensel, Martin Hepp, Uwe Keller, Michael Kifer, Birgitta König-Ries, Jacek Kopecky, Rubén Lara, Holger Lausen, Eyal Oren, Axel Polleres, Dumitru Roman, James Scicluna, and Michael Stollberg. Web Service Modeling Ontology (WSMO), June 2005. W3C member submission. [ http ]
The potential to achieve dynamic, scalable and cost-effective infrastructure for electronic transactions in business and public administration has driven recent research efforts towards so-called Semantic Web services, that is enriching Web services with machine-processable semantics. Supporting this goal, the Web Service Modeling Ontology (WSMO) provides a conceptual framework and a formal language for semantically describing all relevant aspects of Web services in order to facilitate the automation of discovering, combining and invoking electronic services over the Web. This document describes the overall structure of WSMO by its four main elements: ontologies, which provide the terminology used by other WSMO elements, Web service descriptions, which describe the functional and behavioral aspects of a Web service, goals that represent user desires, and mediators, which aim at automatically handling interoperability problems between different WSMO elements. Along with introducing the main elements of WSMO, the syntax of the formal logic language used in WSMO is provided. The semantics and computationally tractable subsets of this logical language are defined and discussed in a separate document of the submission, the Web Service Modeling Language (WSML) document.
[Pol05] Axel Polleres. Semantic web languages and semantic web services as application areas for answer set programming. In The Dagstuhl Seminar 05171 -- Nonmonotonic Reasoning, Answer Set Programming and Constraints, May 2005. Extended Abstract. [ http ]
In the Semantic Web and Semantic Web Services areas there are still unclear issues concerning an appropriate language. Answer Set Programming and ASP engines can be particularly interesting for Ontological Reasoning, especially in the light of ongoing discussions of non-monotonic extensions for Ontology Languages. Previously, the main concern of discussions was around OWL and Description Logics. Recently, many extensions and suggestions for Rule Languages and Semantic Web Languages have popped up, particularly in the context of Semantic Web Services, which involve the meta-data description of Services instead of static data on the Web only. These languages include SWRL, WSML, SWSL-Rules, etc. I want to give an outline of languages, challenges and initiatives in this area and where I think Answer Set Programming research can hook in.
[BPLF05] Jos De Bruijn, Axel Polleres, Rubén Lara, and Dieter Fensel. OWL DL vs. OWL Flight: Conceptual modeling and reasoning for the semantic web. In Proceedings of the 14th World Wide Web Conference (WWW2005), pages 623--632, Chiba, Japan, May 2005. ACM Press. [ .pdf ]
The Semantic Web languages RDFS and OWL have been around for some time now. However, the presence of these languages has not brought the breakthrough of the Semantic Web the creators of the languages had hoped for. OWL has a number of problems in the area of interoperability and usability in the context of many practical application scenarios which impede the connection to the Software Engineering and Database communities. In this paper we present OWL Flight, which is loosely based on OWL, but the semantics is grounded in Logic Programming rather than Description Logics, and it borrows the constraint-based modeling style common in databases. This results in different types of modeling primitives and enforces a different style of ontology modeling. We analyze the modeling paradigms of OWL DL and OWL Flight, as well as reasoning tasks supported by both languages. We argue that different applications on the Semantic Web require different styles of modeling and thus both types of languages are required for the Semantic Web.
[KLL+05] Uwe Keller, Rubén Lara, Holger Lausen, Axel Polleres, and Dieter Fensel. Automatic location of services. In Proceedings of the 2nd European Semantic Web Conference (ESWC2005), May 2005. [ .pdf ]
The automatic location of services that fulfill a given need is seen as a key step towards dynamic and scalable integration. In this paper we present a model for the automatic location of services that considers the static and dynamic aspects of service descriptions and identifies what notions of match and techniques are useful for the matching of both. Our model presents three important features: ease of use for the requester, efficient pre-filtering of relevant services, and accurate contracting of services that fulfill a given requester goal. We further elaborate previous work and results on Web service discovery by analyzing what steps and what kind of descriptions are necessary for an efficient and usable automatic service location. Furthermore, we analyze the intuitive and formal notions of match that are of interest for locating services that fulfill a given goal. Although having a formal underpinning, the proposed model does not impose any restrictions on how to implement it for specific applications, but proposes some useful formalisms for providing such implementation.
[PTF05] Axel Polleres, Ioan Toma, and Dieter Fensel. Modeling services for the semantic grid. In The Dagstuhl Seminar 05271 -- Semantic Grid: The Convergence of Technologies, May 2005. Extended Abstract. [ http ]
The Grid has emerged as a new distributed computing infrastructure for advanced science and engineering aiming at enabling sharing of resources and information towards coordinated problem solving in dynamic environments. Research in Grid Computing and Web Services has recently converged in what is known as the Web Service Resource Framework. While Web Service technologies and standards such as SOAP and WSDL provide the syntactical basis for communication in this framework, a service oriented grid architecture for communication has been defined in the Open Grid Service architecture. Wide agreement that a flexible service Grid is not possible without support by Semantic technologies has led to the term “Semantic Grid”, which is at the moment only vaguely defined. In our ongoing work on the Web Service Modeling Ontology (WSMO) we have so far concentrated on the semantic description of Web services with respect to applications in Enterprise Application Integration and B2B integration scenarios. Although the typical application areas of Semantic Web services have slightly different requirements than the typical application scenarios in the Grid, a big overlap justifies the assumption that most research results in the Semantic Web Services area can similarly be applied in the Semantic Grid. The present abstract summarizes the authors' view on how to fruitfully integrate Semantic Web service technologies around WSMO/WSML and WSMX with Grid technologies in a Semantic Service Grid and gives an outlook on further possible directions and research. The remainder of this abstract is structured as follows. After giving a short overview of the current Grid Service architecture and its particular requirements, we briefly review the basic usage tasks for Semantic Web services. We then point out how these crucial tasks of Semantic Web services are addressed by WSMO. In turn, we try to analyze which special requirements for Semantic Web Services arise with respect to the Grid.
We conclude by giving an outlook on the limitations of current Semantic Web services technologies and how we plan to address these in the future in a common Framework for Semantic Grid services.
[SP05] James Scicluna and Axel Polleres. Semantic web service execution for WSMO based choreographies. In Workshop on Semantic Web Applications at the 11th EUROMEDIA Conference, Toulouse, France, April 2005. [ .pdf ]
The Semantic Web is slowly gaining in importance as both academic and industrial organizations are realizing the potential benefit that might be obtained from it. This is especially true in the area of tourism, in which Semantic Web Services can provide a drastically new way of finding and booking related services such as hotels, flights and taxi transfers. However, many aspects of Semantic Web Services are still under development. This short paper presents issues related to choreography and orchestration representation in the Web Service Modelling Ontology (WSMO) and also how such ideas can be applied to an e-tourism use case.
[dBLPF05] Jos de Bruijn, Holger Lausen, Axel Polleres, and Dieter Fensel. The WSML rule languages for the semantic web. In W3C Workshop on Rule Languages for Interoperability, Washington, D.C., USA, April 2005. [ http ]
The Web Service Modeling Language WSML provides a framework for the modeling of ontologies and semantic Web services based on the conceptual model of the Web Service Modeling Ontology. In this paper we describe the two rule-based WSML-variants and outline our position with respect to a rule language for the Semantic Web. The first rule-based WSML variant, WSML-Flight, semantically corresponds to the Datalog fragment of F-Logic, extended with inequality in the body and locally stratified negation under the Perfect model semantics. The second, WSML-Rule, is an extension of WSML-Flight to the logic programming subset of F-Logic which allows the use of function symbols and unsafe rules (i.e., there may be variables in rule heads which do not occur in the body).
[LdBPF05] Holger Lausen, Jos de Bruijn, Axel Polleres, and Dieter Fensel. WSML - a language framework for semantic web services. In W3C Workshop on Rule Languages for Interoperability, Washington, D.C., USA, April 2005. [ http ]
The Web Service Modeling Language (WSML) provides a framework of different language variants to describe semantic Web services. This paper presents the design rationale and relation with existing language recommendations. WSML is a frame-based language with an intuitive human-readable syntax and XML and RDF exchange syntaxes, as well as a mapping to OWL. It provides different variants, allowing for open- and closed-world modeling; it is a fully-fledged ontology and rule language with defined variants grounded in well-known formalisms, namely Datalog, Description Logic and Frame Logic. Taking the key aspects of WSML as a starting point, we rationalize the design decisions which we consider relevant in designing a proper layering of ontology and rule languages for the Semantic Web and semantic Web services.
[RKL+05] Dumitru Roman, Uwe Keller, Holger Lausen, Jos de Bruijn, Rubén Lara, Michael Stollberg, Axel Polleres, Cristina Feier, Christoph Bussler, and Dieter Fensel. Web service modeling ontology. Applied Ontology, 2005. [ http ]
The potential to achieve dynamic, scalable and cost-effective marketplaces and eCommerce solutions has driven recent research efforts towards so-called Semantic Web Services that enrich Web services with machine-processable semantics. To this end, the Web Service Modeling Ontology (WSMO) provides the conceptual underpinning and a formal language for semantically describing all relevant aspects of Web services in order to facilitate the automation of discovering, combining and invoking electronic services over the Web. In this paper we describe the overall structure of WSMO by its four main elements: ontologies, which provide the terminology used by other WSMO elements; Web services, which provide access to services that, in turn, provide some value in some domain; goals, which represent user desires; and mediators, which deal with interoperability problems between different WSMO elements. Along with introducing the main elements of WSMO, we provide a logical language for defining formal statements in WSMO, together with some motivating examples from practical use cases which demonstrate the benefits of Semantic Web Services.

2004


[KLP+04] Michael Kifer, Rubén Lara, Axel Polleres, Chang Zhao, Uwe Keller, Holger Lausen, and Dieter Fensel. A logical framework for web service discovery. In ISWC 2004 Workshop on Semantic Web Services: Preparing to Meet the World of Business Applications, Hiroshima, Japan, November 2004. [ .pdf ]
[BPLF04] Jos de Bruijn, Axel Polleres, Rubén Lara, and Dieter Fensel. OWL DL vs. OWL Flight: Conceptual modeling and reasoning for the semantic web. Technical Report DERI-TR-2004-11-10, Digital Enterprise Research Institute (DERI), November 2004. [ .pdf ]
[ABK+04] Sinuhé Arroyo, Christoph Bussler, Jacek Kopecký, Rubén Lara, Axel Polleres, and Michal Zaremba. Web service capabilities and constraints in WSMO. In W3C Workshop on Constraints and Capabilities for Web Services, Oracle Conference Center, Redwood Shores, CA, USA, October 2004. [ .html ]
[LRPF04] Rubén Lara, Dumitru Roman, Axel Polleres, and Dieter Fensel. A conceptual comparison of WSMO and OWL-S. In Proceedings of the European Conference on Web Services (ECOWS 2004), volume 3250 of Lecture Notes in Computer Science (LNCS), pages 254--269, Erfurt, Germany, September 2004. [ http ]
[OLPL04] Daniel Olmedilla, Rubén Lara, Axel Polleres, and Holger Lausen. Trust negotiation for semantic web services. In Proceedings of the First International Workshop on Semantic Web Services and Web Process Composition (SWSWPC 2004), San Diego, California, USA, July 2004. [ .pdf ]
[BP04] Jos de Bruijn and Axel Polleres. Towards an ontology mapping specification language for the semantic web. Technical Report DERI-TR-2004-06-30, Digital Enterprise Research Institute (DERI), June 2004. [ .pdf ]
[EFL+04] Thomas Eiter, Wolfgang Faber, Nicola Leone, Gerald Pfeifer, and Axel Polleres. A Logic Programming Approach to Knowledge-State Planning: Semantics and Complexity. ACM Transactions on Computational Logic, 5(2):206--263, April 2004. [ DOI | http ]
[EP04] Thomas Eiter and Axel Polleres. Towards automated integration of guess and check programs in answer set programming. In Vladimir Lifschitz and Ilkka Niemelä, editors, Proceedings of the Seventh International Conference on Logic Programming and Nonmonotonic Reasoning (LPNMR-7), number 2923 in Lecture Notes in AI (LNAI), pages 100--113, Fort Lauderdale, Florida, USA, January 2004. Springer. [ http ]
[EFPP04] Thomas Eiter, Wolfgang Faber, Gerald Pfeifer, and Axel Polleres. Declarative planning and knowledge representation in an action language. In Ioannis Vlahavas and Dimitris Vrakas, editors, Intelligent Techniques for Planning. IDEA Group Publishing, 2004. [ http ]
This chapter introduces planning and knowledge representation in the declarative action language K. Rooted in the area of Knowledge Representation & Reasoning, action languages like K allow the formalization of complex planning problems involving non-determinism and incomplete knowledge in a very flexible manner. By giving an overview of existing planning languages and comparing these against our language, we aim at further promoting the applicability and usefulness of high-level action languages in the area of planning. As opposed to previously existing languages for modeling actions and change, K adopts a logic programming view where fluents representing the epistemic state of an agent might be true, false or undefined in each state. We will show that this view of knowledge states can be fruitfully applied to several well-known planning domains from the literature as well as novel planning domains. Remarkably, K often allows modeling problems more concisely than previous action languages. All the examples given can be tested in an available implementation, the DLVK planning system.
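The three-valued knowledge states described in this abstract can be sketched in a few lines of plain Python (this is an illustrative toy, not DLVK syntax; the fluent names and action effects are hypothetical): each fluent is true, false, or undefined, and actions map one knowledge state to the next.

```python
# A fluent's value in a knowledge state is True, False, or None (undefined),
# mirroring the epistemic view of action language K described above.
UNDEFINED = None

def execute(state, action, effects):
    """Apply an action's effects to a knowledge state (a dict mapping
    fluent -> True | False | None); unaffected fluents stay inert."""
    new_state = dict(state)
    new_state.update(effects.get(action, {}))
    return new_state

# Initially the agent does not know whether the door is locked.
state = {"at_door": True, "locked": UNDEFINED}

# Hypothetical deterministic effects: sensing resolves `locked`,
# unlocking then makes it False. (Real sensing is non-deterministic;
# DLVK handles that via multiple candidate plans.)
effects = {
    "sense_lock": {"locked": True},
    "unlock": {"locked": False},
}

state = execute(state, "sense_lock", effects)  # `locked` becomes known
state = execute(state, "unlock", effects)      # the plan can now proceed
```

The point of the sketch is only the state representation: an undefined fluent is neither assumed true nor false until an action or rule determines it.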

2003


[DEF+03b] Jürgen Dix, Thomas Eiter, Michael Fink, Axel Polleres, and Yingqian Zhang. Monitoring Agents using Declarative Planning. Fundamenta Informaticae, 57(2):345--370, November 2003. [ .ps.gz ]
[DEF+03a] Jürgen Dix, Thomas Eiter, Michael Fink, Axel Polleres, and Yingqian Zhang. Monitoring Agents using Declarative Planning. In Proceedings of the 26th German Conference on Artificial Intelligence (KI2003), volume 2821 of Lecture Notes in Computer Science (LNCS), pages 646--660. Springer, September 2003. [ .pdf ]
[Pol03a] Axel Polleres. Advances in Answer Set Planning. PhD thesis, Institut für Informationssysteme, Technische Universität Wien, Vienna, Austria, September 2003. [ .pdf ]
[EP03] Thomas Eiter and Axel Polleres. Transforming coNP checks to answer set computation by meta-interpretation. In Proceedings of the 2003 Joint Conference on Declarative Programming APPIA-GULP-PRODE 2003, Reggio Calabria, Italy, September 2003. [ .pdf ]
[Pol03b] Axel Polleres. The declarative planning system DLVK: Progress and extensions. In Jeremy Frank and Susanne Biundo, editors, Printed Notes of the ICAPS-03 Doctoral Consortium, pages 94--98, June 2003. [ .pdf ]
[EFL+03a] Thomas Eiter, Wolfgang Faber, Nicola Leone, Gerald Pfeifer, and Axel Polleres. A Logic Programming Approach to Knowledge-State Planning, II: the DLVK System. Artificial Intelligence, 144(1--2):157--211, March 2003. [ .ps.gz ]
[EFL+03b] Thomas Eiter, Wolfgang Faber, Nicola Leone, Gerald Pfeifer, and Axel Polleres. Answer Set Planning under Action Costs. Journal of Artificial Intelligence Research (JAIR), 19:25--71, 2003. [ .pdf ]

2002


[EFL+02b] Thomas Eiter, Wolfgang Faber, Nicola Leone, Gerald Pfeifer, and Axel Polleres. Answer Set Planning under Action Costs. Technical Report INFSYS RR-1843-02-13, Institut für Informationssysteme, Technische Universität Wien, October 2002. Published in Journal of Artificial Intelligence Research. [ .ps.gz ]
[Pol02b] Axel Polleres. Answer Set Planning with DLVK: Planning with Action Costs, September 2002. Poster presented at the PLANET'02 International Summer School on AI Planning 2002. [ .pdf ]
[LPF+02] Nicola Leone, Gerald Pfeifer, Wolfgang Faber, Francesco Calimeri, Tina Dell'Armi, Thomas Eiter, Georg Gottlob, Giovambattista Ianni, Giuseppe Ielpa, Christoph Koch, Simona Perri, and Axel Polleres. The DLV System. In Sergio Flesca, Sergio Greco, Giovambattista Ianni, and Nicola Leone, editors, Proceedings of the 8th European Conference on Logics in Artificial Intelligence (JELIA), volume 2424 of Lecture Notes in Computer Science (LNCS), pages 537--540, Cosenza, Italy, September 2002. (System Description). [ http ]
[EFL+02c] Thomas Eiter, Wolfgang Faber, Nicola Leone, Gerald Pfeifer, and Axel Polleres. The DLVK Planning System: Progress Report. In Sergio Flesca, Sergio Greco, Giovambattista Ianni, and Nicola Leone, editors, Proceedings of the 8th European Conference on Logics in Artificial Intelligence (JELIA), volume 2424 of Lecture Notes in Computer Science (LNCS), pages 541--544, Cosenza, Italy, September 2002. (System Description). [ http ]
[EFL+02a] Thomas Eiter, Wolfgang Faber, Nicola Leone, Gerald Pfeifer, and Axel Polleres. Answer Set Planning under Action Costs. In Sergio Flesca, Sergio Greco, Giovambattista Ianni, and Nicola Leone, editors, Proceedings of the 8th European Conference on Logics in Artificial Intelligence (JELIA), volume 2424 of Lecture Notes in Computer Science (LNCS), pages 186--197, Cosenza, Italy, September 2002. [ http ]
[Pol02f] Axel Polleres. PLANET International Summer School on AI Planning 2002, Chalkidiki, Greece. ÖGAI Journal, 21(4):26--29, 2002. Conference report.
[Pol02d] Axel Polleres. JELIA 2002. ÖGAI Journal, 21(4):23--25, 2002. Conference report.
[Pol02e] Axel Polleres. Planen in der AI. COMPUTER kommunikativ, 5/2002:30--31, 2002. Conference report.
[Pol02c] Axel Polleres. JELIA 2002. COMPUTER kommunikativ, 5/2002:28--29, 2002. Conference report.
[Pol02a] Axel Polleres. Answer Set Planning with DLVK. The PLANET Newsletter, 5:36--37, 2002. [ .pdf ]

2001


[EFL+01a] Thomas Eiter, Wolfgang Faber, Nicola Leone, Gerald Pfeifer, and Axel Polleres. A Logic Programming Approach to Knowledge-State Planning, II: the DLVK System. Technical Report INFSYS RR-1843-01-12, Institut für Informationssysteme, Technische Universität Wien, December 2001. [ .ps.gz ]
[EFL+01b] Thomas Eiter, Wolfgang Faber, Nicola Leone, Gerald Pfeifer, and Axel Polleres. A Logic Programming Approach to Knowledge-State Planning: Semantics and Complexity. Technical Report INFSYS RR-1843-01-11, Institut für Informationssysteme, Technische Universität Wien, December 2001. [ .ps.gz ]
[EFL+01c] Thomas Eiter, Wolfgang Faber, Nicola Leone, Gerald Pfeifer, and Axel Polleres. System Description: The DLVK Planning System. In Thomas Eiter, Wolfgang Faber, and Miroslaw Truszczyński, editors, Logic Programming and Nonmonotonic Reasoning --- 6th International Conference, LPNMR'01, Vienna, Austria, September 2001, Proceedings, number 2173 in Lecture Notes in AI (LNAI), pages 413--416. Springer Verlag, September 2001. [ .ps.gz ]
[EFL+01d] Thomas Eiter, Wolfgang Faber, Nicola Leone, Gerald Pfeifer, and Axel Polleres. The DLVK Planning System. In Alessandro Cimatti, Héctor Geffner, Enrico Giunchiglia, and Jussi Rintanen, editors, IJCAI-01 Workshop on Planning under Uncertainty and Incomplete Information, pages 76--81, August 2001.
[Pol01] Axel Polleres. The DLVK System for Planning with Incomplete Knowledge. Master's thesis, Institut für Informationssysteme, Technische Universität Wien, Vienna, Austria, February 2001. [ .pdf ]

2000


[EFL+00a] Thomas Eiter, Wolfgang Faber, Nicola Leone, Gerald Pfeifer, and Axel Polleres. Planning under Incomplete Knowledge. In John Lloyd, Veronica Dahl, Ulrich Furbach, Manfred Kerber, Kung-Kiu Lau, Catuscia Palamidessi, Luís Moniz Pereira, Yehoshua Sagiv, and Peter J. Stuckey, editors, Computational Logic - CL 2000, First International Conference, Proceedings, number 1861 in Lecture Notes in AI (LNAI), pages 807--821, London, UK, July 2000. Springer Verlag. [ .ps.gz ]
[EFL+00b] Thomas Eiter, Wolfgang Faber, Nicola Leone, Gerald Pfeifer, and Axel Polleres. Using the dlv System for Planning and Diagnostic Reasoning. In François Bry, Ulrich Geske, and Dietmar Seipel, editors, Proceedings of the 14th Workshop on Logic Programming (WLP'99), pages 125--134. GMD -- Forschungszentrum Informationstechnik GmbH, Berlin, January 2000. ISSN 1435-2702. [ .pdf ]

1999


[EFPT99] Uwe Egly, Michael Fink, Axel Polleres, and Hans Tompits. A web-based tutoring tool for calculating default logic extensions. In Proceedings of the World Conference on the WWW and Internet (WEBNET'99). AACE, 1999. [ .pdf ]

This file has been generated by bibtex2html 1.78

This page is maintained by Axel Polleres, last update 2024/02/25
