Bridging ML & Knowledge Graphs

Ensuring that machine learning datasets are not just loadable by code but also semantically understood across research infrastructures.

The Semantic Croissant Framework

Our core strategy is to extend the ML-focused Croissant standard so that it bridges the gap between raw data and domain-specific knowledge.

Ontology Alignment

Mapping ML dataset layers to high-level ontologies (e.g., Schema.org, PROV-O, and domain-specific vocabularies).
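
As a rough illustration of what such a mapping could look like, the sketch below expands prefixed terms into full ontology IRIs, in the style of a JSON-LD context. The field names, the `ALIGNMENT` table, and the `ex:` vocabulary are hypothetical and chosen only for the example; the Schema.org and PROV-O namespaces are the published ones.

```python
# Prefix-to-namespace map, in the style of a JSON-LD @context.
CONTEXT = {
    "sc": "https://schema.org/",
    "prov": "http://www.w3.org/ns/prov#",
    "ex": "https://example.org/vocab/",  # hypothetical domain vocabulary
}

# Hypothetical alignment table: local dataset field -> ontology term.
ALIGNMENT = {
    "title": "sc:name",
    "license": "sc:license",
    "derived_from": "prov:wasDerivedFrom",
    "sensor_type": "ex:sensorType",
}


def expand(curie: str) -> str:
    """Expand a prefixed term such as 'sc:name' into a full IRI."""
    prefix, _, local = curie.partition(":")
    if prefix not in CONTEXT:
        raise KeyError(f"unknown prefix: {prefix}")
    return CONTEXT[prefix] + local


def align(record: dict) -> dict:
    """Re-key a metadata record with ontology IRIs, dropping unmapped fields."""
    return {expand(ALIGNMENT[k]): v for k, v in record.items() if k in ALIGNMENT}


if __name__ == "__main__":
    print(align({"title": "Soil samples", "derived_from": "https://example.org/raw"}))
```

Fields without an entry in the alignment table are dropped here for brevity; a real pipeline would more likely report them so that the mapping can be completed.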

Multilingual Support

Using RDF-native multilingual properties so that metadata can be queried and read in multiple languages.
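
One way to picture this is with language-tagged values in JSON-LD, where a single property carries one value per language. The sketch below assumes that representation; the dataset name, its German translation, and the `label` helper are invented for illustration and are not part of the project's deliverables.

```python
# A dataset description whose name is given in two languages.
# "@value" and "@language" are standard JSON-LD keywords.
metadata = {
    "@context": {"sc": "https://schema.org/"},
    "@type": "sc:Dataset",
    "sc:name": [
        {"@value": "Soil moisture measurements", "@language": "en"},
        {"@value": "Bodenfeuchtemessungen", "@language": "de"},
    ],
}


def label(values, lang, fallback="en"):
    """Return the value tagged with `lang`, else the fallback language, else None."""
    by_lang = {v["@language"]: v["@value"] for v in values}
    return by_lang.get(lang, by_lang.get(fallback))


if __name__ == "__main__":
    print(label(metadata["sc:name"], "de"))
    print(label(metadata["sc:name"], "fr"))  # no French value, falls back to English
```

Because the language tag travels with each value, the same record can serve readers and queries in any of the languages it covers without duplicating the dataset description.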

Data Integration

Alignment and integration with other datasets to broaden research infrastructure compatibility.

Deep Dive into the Croissant Spec

Croissant is a high-level format for machine learning datasets that brings together four rich layers to ensure discoverability, portability, and reproducibility.

Layer 1

Future-Proof Research

By adopting the Semantic Croissant approach, we deliver services that support the next generation of AI-driven research.
