The representation of concepts (meant here in the broad sense of covering (narrow) concepts (definitions), predicates and integrity rules) should be sufficiently broad to encompass different viewpoints, representations and use cases for the same concept or set of concepts.
NOTE: The perspective and examples presented here are intended to provide one approach to satisfying the SIMF requirements; other approaches are allowed within the scope of SIMF. In addition, the marriage example is not intended to condone or preclude any particular view of marriage. To work from a real, practical example, we consider the federation use case of the EU statistical office (Eurostat), which federates the marriage information of the different European countries. These countries started to collect marriage information independently of each other. It is therefore necessary to specify a Conceptual Domain Model such that the model of each country can be expressed in terms of part of the Conceptual Domain Model.
For example the concepts of “marriage” and “spouse” are tightly tied; in a certain context one could say that every marriage has exactly two spouses. Other interpretations of marriage allow for multiple spouses. In the EU Statistical Office (Eurostat) Marriage Use Case the “exactly two” constraint is selected as this is the case for all associated countries. Hence the concepts of marriage and spouse are clearly related. Such relations will be explicitly described in the SIMF Conceptual Domain Model using MBRs (Model Bridging Relationships). When the relationships between these composite concepts are lost we have trouble understanding what they mean.
It is the position of the SIMF RFP specification team that understanding composite concepts is crucial for capturing the semantics of a domain in a conceptual model. The relationships between composite concepts and their representation in various information models are explicitly described using SIMF MBRs (Model Bridging Relationships).
Eurostat has decided to make an analysis of certain aspects of marriages that are legal within the EU countries. Eurostat will make a website available with all the statistical information and conclusions.
Consider the concepts of “Marriage”, “Spouse”, “Husband”, “Wife” and “Married”. In a dictionary these will sometimes be defined separately. In Webster's we find:
Marriage: relation between husband and wife. [Hence a marriage is a relation between two parties.]
Marriage: the act which unites the two parties.
Husband: a man joined to a woman by marriage.
Husband: a married man.
Wife: a married woman.
Spouse: a husband or wife.
Spouse: either member of a married couple spoken of in relation to the other.
Woman: an adult female human being.
Male: designating or of the sex that fertilizes the ovum.
Female: designating or of the sex that produces ova.
In a SIMF approach we would say: let us specify the CDM (Conceptual Domain Model) for a domain where we want to have a discourse about marriages, spouse, husband, wife and being married.
We should consider the concept definition of marriage by Webster's as well as from other resources. To illustrate the typical SIMF points we consider the view of Eurostat: statistics about marriages in these countries need to cover Different-Sex Marriage (hereafter called Traditional European Marriage) as well as Same-Sex Marriage; we could also include Polygamy, if we so elected. Hence we adopt the view that, at the most abstract level for Eurostat, marriage is an agreement or contract between a pair of persons. In Conceptual Domain Modeling we want to separate the most abstract concept we want to model (the concepts, predicates and integrity rules of the Eurostat marriage, in this case the Conceptual Domain Model) from the many interpretations and constraints that cultures and societies may place on marriage within the EU. The EU is not willing to make the model more general, as this would cost extra money.
Hence we will first focus on the situation that Eurostat has asked that the most general concept is an agreement between two persons. We may later discuss the more abstract set option.
Identification of individuals
The first question then is: if we want to communicate about a specific marriage, how is an individual marriage identified within the collection of all marriages? Please note that at a country level the identification is only part of what is needed at Eurostat level.
The answer depends also on the answer to the question: do we only consider current marriages, or do we want to keep the history of all recorded marriages as well? The Eurostat answer is: we consider both current and all past recorded marriages as this is needed for trend analyses. Thinking of the famous Richard and Liz marriages this means that the identifier of an individual marriage is the combination of husband (spouse-1) and inauguration date, or the combination of wife (spouse-2) and inauguration date. Hence we have here two integrity rules. One can also say that this is an example of synonyms.
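The two identification rules above can be read as uniqueness checks over a population of marriages. The following is a minimal Python sketch under that reading; the `Marriage` record and the `duplicate_keys` helper are hypothetical names chosen for illustration and are not part of SIMF.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Marriage:
    spouse_1: str       # the husband in a Traditional European Marriage
    spouse_2: str       # the wife in a Traditional European Marriage
    inauguration: date


def duplicate_keys(marriages, role):
    """Return the (spouse, inauguration date) pairs that occur more than once for a role."""
    counts = Counter((getattr(m, role), m.inauguration) for m in marriages)
    return [key for key, n in counts.items() if n > 1]


marriages = [
    Marriage("Ronald", "Jane", date(1940, 1, 26)),
    Marriage("Ronald", "Nancy", date(1952, 3, 4)),
]

# An empty list means the identification rule holds for that role.
print(duplicate_keys(marriages, "spouse_1"))
print(duplicate_keys(marriages, "spouse_2"))
```

Either check alone is enough to single out a marriage, which is why the two identifiers behave like synonyms.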
Another integrity rule is that, under the definitions given so far, we need to declare that no husband and no wife can have overlapping marriage periods. This can only be guaranteed if the end date of a marriage is considered within the Conceptual Domain Model. We furthermore need to be able to model that every husband and every wife has at most one marriage with an unknown end date; this represents the common restriction on the Eurostat marriage. Eurostat also wants to know statistics about the termination of marriages, be it by divorce or by death. Of a Traditional European Marriage Eurostat wants to know the number of children born in such marriages.
Examples of marriages: Serge married Claudine on 1988-09-03; this marriage resulted in 3 children. Sjir married Mia on 1964-04-17; this marriage resulted in 2 children. (President Reagan): Ronald married Jane on 1940-01-26; this marriage was terminated on 1948-06-29 by divorce; this marriage resulted in 2 children. Ronald married Nancy on 1952-03-04; this marriage was terminated by death on 2004-06-05; this marriage resulted in 2 children.
For ease of communication in this paper we will identify the persons in this example by first name; of each person we want to know the birthdate and gender, and when known the death date. Examples: Serge, 1960-06-19, male, -; Claudine, 1960-11-15, female, -; Cory, …., male, nn. Cheryl, … , female, nn. Sjir, 1938-10-18, male, nn. Mia, 1939-06-06, female, nn. Ronald, 1911-02-06, male, 2004-06-05. Jane, 1917-01-05, female, 2007-09-10. Nancy, 1921-07-06, female, nn.
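The two rules on marriage periods can be sketched as checks over a small population. This Python sketch is illustrative only: the `Marriage` tuple and the helper names are hypothetical, an unknown end date is modeled as `None`, and periods are treated as half-open (a marriage that ends on the day another begins is assumed not to overlap with it).

```python
from datetime import date
from itertools import combinations
from typing import NamedTuple, Optional


class Marriage(NamedTuple):
    spouse_1: str
    spouse_2: str
    inauguration: date
    termination: Optional[date] = None  # None: end date unknown


def periods_overlap(a, b):
    """True when two marriage periods overlap (assumption: half-open periods)."""
    a_end = a.termination or date.max
    b_end = b.termination or date.max
    return a.inauguration < b_end and b.inauguration < a_end


def overlap_violations(marriages):
    """List (person, marriage_a, marriage_b) for overlapping marriages of one person."""
    by_person = {}
    for m in marriages:
        for person in (m.spouse_1, m.spouse_2):
            by_person.setdefault(person, []).append(m)
    return [
        (person, a, b)
        for person, ms in by_person.items()
        for a, b in combinations(ms, 2)
        if periods_overlap(a, b)
    ]


def open_ended_count(marriages, person):
    """Number of marriages of a person whose end date is unknown."""
    return sum(
        1 for m in marriages
        if person in (m.spouse_1, m.spouse_2) and m.termination is None
    )


marriages = [
    Marriage("Ronald", "Jane", date(1940, 1, 26), date(1948, 6, 29)),
    Marriage("Ronald", "Nancy", date(1952, 3, 4), date(2004, 6, 5)),
    Marriage("Serge", "Claudine", date(1988, 9, 3)),
]

print(overlap_violations(marriages))         # no overlapping periods expected
print(open_ended_count(marriages, "Serge"))  # at most one is allowed
```

Under these assumptions two marriages with unknown end dates always overlap, so the second rule follows from the first; it is kept as a separate check because the text states it as a separate rule.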
The next question is: which predicates are within the selected domain? This matters because the predicates determine what can be communicated about the universe of discourse, and hence they determine the boundary of the system. Put differently: about what do we want to have a discourse, how are the individual instances referred to, and which predicates are we interested in?
A list of predicates so far:
There are various ways to form speech act forms to communicate the above predicates, such as:
1: There exists a person <FirstName>.
2: <Person> is born on <BirthDate>.
3: <Person> is <Gender>.
4: <Person> died on <DeathDate>.
10: There exists a marriage of <Person-1> and <Person-2> inaugurated on <InaugurationDate>.
11: The marriage of <Person-1> inaugurated on <InaugurationDate> is of <MarriageKind>.
12: The marriage of <Person-1> inaugurated on <InaugurationDate> was terminated on <TerminationDate> by <Cause>.
13: The Traditional European Marriage of <Husband> inaugurated on <InaugurationDate> resulted in <NumberOfChildren>.
15: The object type Marriage is the nominalization of the fact type There exists a marriage of <Person-1> and <Person-2> inaugurated on <InaugurationDate>.
20: A Traditional European Marriage is a Marriage with MarriageKind = Traditional European.
21: The Traditional European Marriage of <Husband> inaugurated on <InaugurationDate> resulted in <NumberOfChildren>.
30: A Same-Sex Marriage is a Marriage with Gender of Person-1 = Gender of Person-2.
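To make the idea of a speech act form concrete, the numbered forms above can be treated as fill-in templates that are populated with the example facts. The sketch below is a hypothetical Python rendering, not a SIMF notation; the placeholders `Person1` and `Person2` stand in for `<Person-1>` and `<Person-2>`, and only a few of the predicates are included.

```python
# Hypothetical templates for some of the numbered speech act forms.
PREDICATES = {
    2: "{Person} is born on {BirthDate}.",
    3: "{Person} is {Gender}.",
    4: "{Person} died on {DeathDate}.",
    10: ("There exists a marriage of {Person1} and {Person2} "
         "inaugurated on {InaugurationDate}."),
    12: ("The marriage of {Person1} inaugurated on {InaugurationDate} "
         "was terminated on {TerminationDate} by {Cause}."),
}


def state(predicate_no, **values):
    """Fill the variables of one predicate to produce a fact sentence."""
    return PREDICATES[predicate_no].format(**values)


facts = [
    state(2, Person="Ronald", BirthDate="1911-02-06"),
    state(3, Person="Ronald", Gender="male"),
    state(10, Person1="Ronald", Person2="Jane", InaugurationDate="1940-01-26"),
    state(12, Person1="Ronald", InaugurationDate="1940-01-26",
          TerminationDate="1948-06-29", Cause="divorce"),
]

for fact in facts:
    print(fact)
```

Each populated template is one fact; the set of templates is what delimits the discourse.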
In Logical Information Models (expressed in UML, Relational, XSD, OWL, RDF, ER) we may use one or more of these predicates, or parts of them, as well as associated concepts where possible. Whichever subset is chosen, there is certainly some connection between them.
This diagram (in no particular notation) suggests that the noun concept “marriage” provides a context for a set of associated role concepts such as “spouse”, “husband” and “wife”. It also provides context for verb concepts such as “has spouse” or “is married to”. In fact, each of these verb concepts comes from a particular perspective on marriage. Once you select a certain context or view of marriage you can describe how these various roles and verb phrases are related. Once certain of these parts of the composite concept are known, it is possible to infer the others.
We can also add some more constraints, also called integrity rules in SIMF, such that a “Husband” is a “Male being” and a wife a female in the context of a Traditional European Marriage, etc.
As is the subject of much political debate, some would like to “lock down” the concept of marriage. Let's call this “Traditional European Marriage”. This becomes a specialized concept within some context such as Europe.
Traditional European Marriage (or 'Christian Marriage') is defined within the context of Europe and imposes certain constraints (integrity rules): that a spouse must be a person and, more precisely, that the husband must be a male and the wife a female; that a person may be a spouse zero or one time (at any moment in time); and that within a Traditional European Marriage there is exactly one husband (a man) and one wife (a woman). These constraints may, of course, be represented by textual expression, diagrams, tables or other forms of formal expression. What is consistent is that they are constraints placed on the communication about a Traditional European Marriage, a specialization of marriage in the context of Europe.
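One possible formal expression of the gender constraints is a small validation function. This is a sketch only, written in Python with hypothetical names (`Person`, `traditional_marriage_violations`); it covers the husband and wife rules and leaves out the rule on concurrent marriages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    first_name: str
    gender: str  # "male" or "female", as in the example population


def traditional_marriage_violations(husband, wife):
    """Return the integrity rules a proposed Traditional European Marriage would break."""
    problems = []
    if husband.gender != "male":
        problems.append(f"husband {husband.first_name} must be male")
    if wife.gender != "female":
        problems.append(f"wife {wife.first_name} must be female")
    return problems


serge = Person("Serge", "male")
claudine = Person("Claudine", "female")

# An empty list means both gender rules are satisfied.
print(traditional_marriage_violations(serge, claudine))
```

The same rules could equally be stated as text, as a diagram or as a table; the function is just one more representation of the same constraints.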
The above example is intended to show how composite concepts relate to both natural language statements and information representations that are about the same underlying concept.
A conceptual modeling capability (such as SIMF CDM) should be able to represent both the composite concepts as well as the viewpoint specific concepts within the composite concept.
In some European countries there is a form of marriage called Same-Sex Marriage, in which two persons of the same sex can marry. The rule applies that, at any moment, every spouse in such a marriage can be a spouse in only one marriage.
Once the full extent of a composite concept is understood it also seems clear that such concepts (predicate instances) can have “identity” and that they can have properties and be involved in other relationships (predicates). This is what some call the quality of identification. This provides a good conceptual model of the domain.
Logical Information Models (expressed in UML, Relational, XSD, OWL, RDF, ER) may provide various ways to represent these composite concepts, their properties and relations in a way that is more efficient or understandable for a given purpose, or in this case country. Many Logical Information Models select only one property (e.g. “has spouse”) as a sufficient representation of marriage for a particular purpose.
A composite concept represents a pattern, or type, that can describe instances. Hence a composite concept is a populatable construct. Each such instance may be considered an “assertion” or a “fact”.
This diagram (in no particular notation) shows how an “instance” of marriage, a particular marriage, involves individuals that “play the roles” in that composite concept. A particular Logical Information Model may capture this information as a property or a reified class. In logic the instance of a composite concept is known as a “tuple”; a limited version of a tuple is the “triple” used in RDF.
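The relation between a tuple and triples can be illustrated by taking one marriage instance and breaking it into binary statements about a marriage node. The Python sketch below uses no RDF library; the node identifier `marriage-1` and the helper `to_triples` are hypothetical, and the property names are borrowed from the OWL fragment given later in this section.

```python
from typing import NamedTuple


class MarriageTuple(NamedTuple):
    person_1: str
    person_2: str
    inauguration_date: str


# Role of the tuple -> property name used for the corresponding triple.
PROPERTY = {
    "person_1": "hasPerson1",
    "person_2": "hasPerson2",
    "inauguration_date": "hasInaugurationDate",
}


def to_triples(node_id, fact):
    """Express one n-ary fact as (subject, predicate, object) triples about one node."""
    return [(node_id, PROPERTY[role], value) for role, value in fact._asdict().items()]


fact = MarriageTuple("Ronald", "Jane", "1940-01-26")
for triple in to_triples("marriage-1", fact):
    print(triple)
```

The shared subject is what keeps the three triples together as a single marriage; without it the triples would be unrelated statements.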
The nouns (The marriage of Ronald inaugurated on 1940-01-26) and verbs (There exists a marriage of Ronald inaugurated on 1940-01-26) are “two sides of the same coin”, just different ways to say the same thing. The verb phrases that describe the same concepts are represented by the following diagram.
Here we can see that the same individuals (Cory and Cheryl) are the subjects and objects of verb phrases about the same marriage.
It is important to capture “a marriage” as a single identifiable concept so that we can manage it, understand who asserted it and ascribe properties to it. For example, a marriage may have a property that describes when it happened. Information models that simply capture spouse properties of individuals lose this ability.
This diagram shows that the composite marriage concept can have properties, such as the date-time when it was initiated (when the couple was married). Such a concept could have any number of properties or associations with other concepts.
What frequently happens in information models is that these domain concepts get “flattened” into properties of a single class or type, such as a person. The model bridging relations are required to connect between the conceptual representations and the information models. The SIMF tooling should be able to help users make these distinctions while connecting the various representations.
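The effect of flattening can be shown by projecting identifiable marriages onto a per-person spouse property. This is a minimal Python sketch with hypothetical names and dictionary keys; it is meant only to show which information survives the projection.

```python
# Marriages held as identifiable facts, each with its own properties.
marriages = [
    {"person_1": "Ronald", "person_2": "Jane", "inaugurated": "1940-01-26"},
    {"person_1": "Ronald", "person_2": "Nancy", "inaugurated": "1952-03-04"},
]


def flatten_to_spouse_property(marriages):
    """Project marriages onto a 'spouses' property of each person."""
    spouses = {}
    for m in marriages:
        spouses.setdefault(m["person_1"], []).append(m["person_2"])
        spouses.setdefault(m["person_2"], []).append(m["person_1"])
    return spouses


flat = flatten_to_spouse_property(marriages)
print(flat["Ronald"])  # the spouses remain, the inauguration dates do not
```

The flattened form still answers “who is a spouse of whom”, but the inauguration date has nowhere to live, and there is no single thing left to which a termination or a number of children could be attached.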
Objectified (nominalized) constructs and variable-based constructs each have their advantages and disadvantages, and some user communities prefer the one while others prefer the other. Hence SIMF will provide both options in the information model.
In many languages these noun, verb and role concepts are defined independently, indeed they are just “string names” on the ends of relations or properties. What this fails to do is provide for the connecting semantics of these different views of the same fact or to provide a single identity for that fact. What can we do with a SIMF understanding of composite concepts?
These ideas are not new, but are also not directly supported in many languages. Languages that represent composite concepts directly include:
In this section we will discuss how a SIMF Conceptual Domain Model can be linked to existing logical information models in the most popular modeling languages.
Several information models have been expressed by the individual countries in UML. Below we will give the UML diagram as used in Ireland and the one used in The Netherlands.
The UML-diagram of Ireland
With the SIMF MBR available it is possible to describe which elements of the Conceptual Domain Model correspond with which elements in the UML diagram.
The variable FirstName of predicate 1, the variable Person of predicate 2, the variable Person of predicate 3 and the variable Person of predicate 4 are all four represented in the UML diagram by property FirstName of class Person.
The variable BirthDate of predicate 2 corresponds with property BirthDate of class Person.
The variable Gender of predicate 3 corresponds with property Gender of class Person.
The variable DeathDate of predicate 4 corresponds with property DeathDate of class Person.
The integrity rule that FirstName identifies a specific person cannot be represented in UML.
The variable Person-1 of predicate 10 is represented by the association Person1 between the classes Person and Marriage. Etc etc.
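The correspondences listed above can be collected in a simple lookup table. The Python sketch below is not SIMF MBR notation; it is a hypothetical data structure that records only the mappings stated in the text, keyed by predicate number and variable name.

```python
# (predicate number, variable) -> (UML class or association owner, UML element)
BRIDGE = {
    (1, "FirstName"): ("Person", "FirstName"),
    (2, "Person"): ("Person", "FirstName"),
    (3, "Person"): ("Person", "FirstName"),
    (4, "Person"): ("Person", "FirstName"),
    (2, "BirthDate"): ("Person", "BirthDate"),
    (3, "Gender"): ("Person", "Gender"),
    (4, "DeathDate"): ("Person", "DeathDate"),
    # Association Person1 between the classes Person and Marriage.
    (10, "Person-1"): ("Person/Marriage", "Person1"),
}


def variables_mapped_to(owner, element):
    """All predicate variables that are represented by one UML element."""
    return sorted(key for key, target in BRIDGE.items() if target == (owner, element))


print(variables_mapped_to("Person", "FirstName"))
```

Reading the table in this reverse direction shows that four different predicate variables converge on the single property FirstName of class Person.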
UML-diagram from The Netherlands
UML diagram from the Dept. of Motor Vehicles
In a Dept. of Motor Vehicles (DMV) “Spouse” is captured as a property but the DMV is less interested in the details of marriage. However, the “Spouse” property can be directly related to the spouse verb phrase in our conceptual model. We could, for example, infer that the same person represented in the DMV had the same marriage as that person represented in the model from the Netherlands. The property in the UML information model is a data representation of the concept of spouse in the conceptual model of marriage.
There are two relational solutions, one with two tables and one with three. In the three table solution a separate table for the subtype Christian Marriage is used. We will now give the two table solution.
Person (FirstName, BirthDate, Gender, DeathDate); primary key: FirstName; Mandatory: BirthDate, Gender.
Marriage (Person-1, Person-2, InaugurationDate, TerminationDate, Cause, MarriageKind, NumberOfChildren); primary key: Person-1, InaugurationDate; alternate keys: Person-2, InaugurationDate; Person-1, TerminationDate; Person-2, TerminationDate. Mandatory: MarriageKind.
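The two table solution can be tried out with an in-memory SQLite database. The sketch below uses Python's standard `sqlite3` module and makes several assumptions that are not in the text: the columns are named `Person1` and `Person2` instead of Person-1 and Person-2, dates are stored as ISO text, `Person2` is treated as mandatory, and the foreign keys to Person are added for illustration.

```python
import sqlite3

DDL = """
CREATE TABLE Person (
    FirstName TEXT PRIMARY KEY,
    BirthDate TEXT NOT NULL,
    Gender    TEXT NOT NULL,
    DeathDate TEXT
);
CREATE TABLE Marriage (
    Person1          TEXT NOT NULL REFERENCES Person(FirstName),  -- assumed foreign key
    Person2          TEXT NOT NULL REFERENCES Person(FirstName),  -- assumed foreign key
    InaugurationDate TEXT NOT NULL,
    TerminationDate  TEXT,
    Cause            TEXT,
    MarriageKind     TEXT NOT NULL,
    NumberOfChildren INTEGER,
    PRIMARY KEY (Person1, InaugurationDate),
    UNIQUE (Person2, InaugurationDate),
    UNIQUE (Person1, TerminationDate),
    UNIQUE (Person2, TerminationDate)
);
"""


def build_example():
    """Create the two tables and load the Reagan examples from the text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    conn.executemany("INSERT INTO Person VALUES (?, ?, ?, ?)", [
        ("Ronald", "1911-02-06", "male", "2004-06-05"),
        ("Jane", "1917-01-05", "female", "2007-09-10"),
        ("Nancy", "1921-07-06", "female", None),
    ])
    conn.executemany("INSERT INTO Marriage VALUES (?, ?, ?, ?, ?, ?, ?)", [
        ("Ronald", "Jane", "1940-01-26", "1948-06-29", "Divorce", "Traditional European", 2),
        ("Ronald", "Nancy", "1952-03-04", "2004-06-05", "Death", "Traditional European", 2),
    ])
    conn.commit()
    return conn


conn = build_example()
for row in conn.execute("SELECT Person1, Person2, InaugurationDate FROM Marriage"):
    print(row)
```

Note that SQLite treats NULL values as distinct in a UNIQUE constraint, so the alternate keys on TerminationDate do not by themselves enforce the rule of at most one marriage with an unknown end date; that rule, and the rule on overlapping periods, would need a separate check.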
Here follows a description in OWL of part of the Conceptual Domain Model.
Prefix(:=<http://www.pna-group.nl/exampleMarriage>)
Ontology(<http://www.pna-group.nl/exampleMarriage>
HasKey(:Person () (:isIdentifiedByFirstName))
DataPropertyRange(:isGender DataOneOf("Male" "Female"))
HasKey(:Marriage (:hasPerson1) (:hasInaugurationDate))
HasKey(:Marriage (:hasPerson2) (:hasInaugurationDate))
HasKey(:Marriage (:hasPerson1) (:wasTerminatedOn))
HasKey(:Marriage (:hasPerson2) (:wasTerminatedOn))
DataPropertyRange(:wasTerminatedBecauseOf DataOneOf("Death" "Divorce"))
)
An interesting question is: if part of the Conceptual Domain Model is described in OWL, in which document is the remaining part described?