Data model

The central entity types of the data model are lexical_unit and sense. They connect the morphological and semantic data in the data model. In essence, the model is designed to be multilingual; currently, however, it is used as a monolingual model that connects with multilingual data (which does not have the same level of granularity) via special entity types.
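To make the role of the two central entity types more concrete, the following is a minimal sketch in Python. Only the names lexical_unit and sense come from the data model. The attributes (lemma, definition, the identifier fields), the add_sense helper, and the assumption that one lexical unit holds several senses are hypothetical and chosen purely for illustration; the actual schema may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sense:
    # All attribute names here are illustrative assumptions,
    # not the real columns of the sense entity type.
    sense_id: int
    lexical_unit_id: int  # hypothetical reference back to the owning lexical unit
    definition: str


@dataclass
class LexicalUnit:
    # Attribute names are illustrative assumptions as well.
    lexical_unit_id: int
    lemma: str
    senses: List[Sense] = field(default_factory=list)

    def add_sense(self, sense_id: int, definition: str) -> Sense:
        """Create a sense that points back to this lexical unit and store it."""
        sense = Sense(sense_id, self.lexical_unit_id, definition)
        self.senses.append(sense)
        return sense


# Usage: one lexical unit with two senses (assumed one-to-many relation).
unit = LexicalUnit(lexical_unit_id=1, lemma="bank")
unit.add_sense(10, "financial institution")
unit.add_sense(11, "edge of a river")
```

In this sketch the morphological side would attach to the lexical unit and the semantic side to the sense, which is one way to read the statement that the two entity types connect both kinds of data.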

At the top level, the model can be divided into clusters (color-coded in the model):

  • sense data (blue)
  • lexical unit data (yellow-green)
  • morphological data (green)
  • structure data (grey-brown)
  • multilingual data (pink, red)
  • sense frame data (violet)
  • example data (yellow)
  • data pertaining to the division into resources, i.e. different dictionaries (orange)
  • feature data for various entities (grey)
  • entity types that reference other entity types via meta-attributes (white)
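For anyone working with the diagram programmatically, the color legend above can be kept as a small lookup table. This is only a sketch: the dictionary name CLUSTER_COLORS and the helper clusters_for_color are hypothetical, while the cluster names and colors are taken from the list above.

```python
# Cluster name -> color(s) used for it in the model diagram.
CLUSTER_COLORS = {
    "sense data": ("blue",),
    "lexical unit data": ("yellow-green",),
    "morphological data": ("green",),
    "structure data": ("grey-brown",),
    "multilingual data": ("pink", "red"),
    "sense frame data": ("violet",),
    "example data": ("yellow",),
    "resource division data": ("orange",),
    "feature data": ("grey",),
    "entity types referencing other entity types via meta-attributes": ("white",),
}


def clusters_for_color(color: str) -> list:
    """Return the names of all clusters drawn in exactly the given color."""
    return [name for name, colors in CLUSTER_COLORS.items() if color in colors]


# Usage: find out which cluster a red box in the diagram belongs to.
red_clusters = clusters_for_color("red")
```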

The corpus data is not contained in the database itself, but is referenced and accessed via a concordancer. Some parts of the data model (e.g. structure data) are defined as XML. They are used directly in existing processing pipelines, but can be ported to the ER model if necessary.