Is C ⊑ D definitely true, definitely false, or unknown?
Is there a model of our KB such that a given concept is non-empty? Equivalently, is C ⊑ ⊥ true, false, or unknown?
C ≡ D iff C ⊑ D ∧ D ⊑ C
C and D are disjoint if C ⊓ D ⊑ ⊥; equivalently if their intersection is empty in every model.
Organise all the mentioned concepts into an explicit hierarchy. Depends on subsumption.
If we can classify concepts, we can classify individuals.
Can we conclude x ∈ C ∧ x ∉ C for some individual and concept?
Can we conclude (x,y) ∈ r ∧ (x,y) ∉ r for some individuals and rôle?
Could there be a non-empty ABox for a given TBox?
Is x ∈ C definitely true, definitely false, or unknown?
Take all the assertions x ∈ Ai and test whether A1 ⊓ ⋯ ⊓ An ⊑ C.
Find all individuals x such that x ∈ C is known to be true.
Given an individual and a set of concepts, find the most specific concept that the individual belongs to.
Concepts can be
There is no union, and negation cannot be applied to anything but an atomic concept. None of the operations of the algebra of binary relations are available for rôles.
There is a family of languages obtained by extending AL. They are named AL[U][E][N][C] where the name includes
C implies U (why? de Morgan) and E (why?), while U and E imply C (by expanding out to negation normal form). So ALUEC is abbreviated ALC and ALEUNC is abbreviated ALCN.
In AL we can write
Concepts: Person, Pet, Dog, Bird, Male, Female Roles: tends Axioms: Person ⊑ ∀ tends.Pet ∃ tends.⊤ ⊑ Person Person ⊓ Pet ⊑ ⊥ Dog ⊑ Pet Bird ⊑ Pet Dog ⊓ Bird ⊑ ⊥ Male ⊓ Female ⊑ ⊥ ABox: richard ∈ Person ⊓ Male lily ∈ Dog ⊓ Female jazzie ∈ Bird ⊓ Female chassie ∈ Bird ⊓ Male (richard,lily) ∈ tends (richard,jazzie) ∈ tends (richard,chassie) ∈ tends
Note that (richard ∈ Person ⊓ Male) could be simplified to (richard ∈ Male) because (richard,lily) ∈ tends → richard ∈ ∃ tends.⊤ → richard ∈ Person.
In 1991 it was shown that there is a decision algorithm for ALC satisfiability (hence also ALC subsumption). Since then sound and complete algorithms for several extensions of ALC have been found: number restrictions, transitive rôles and transitive closure of rôles.
Decidable is not the same as tractable. For example, the satisfiability problem for propositional calculus is NP-complete. That is, if you N(ondeterministically guess) a model for a formula, you can verify that it satisfies the formula in P(olynomial time), but nobody how to manage without guessing in less than exponential time.
If you have a problem that is NP-complete or worse, you can
There is a description logic in which subsumption is not just decidable, it's polynomial time. This is EL.
Concepts can be
Terminology axioms are
This looks rather weak. On the other hand, I've seen something very like this before. There is a graphical notation for first order logic and a set of operations on such graphs called Conceptual Graphs, described in a book by John Sowa. The formulas that can be modelled simply in this way are the existential-conjunctive ones. There is a research group at Montpellier University who have been studying the existential-conjunctive fragments of first-order logic. If you say that a formula is one of
then “does α entail β” is decidable (unlike the case in general logic), but very very hard. Technically, it is in Π2P.
But if we don't just limit negation to logical atoms, and forbid it completely, the entailment problem is “only” NP complete.
This should remind you of the relationship between ALC and EL. (The entailment problem and the subsumption problem are basically the same.)
Dr Labuschagne has talked to you about belief revision. If you have a knowledge base in existential-conjunctive form, then no matter how many positive facts you add, you will never get a contradiction. Belief revision is simple growth.
This immediately tells us that EL cannot accept axioms like
Person ⊓ Pet ⊑ ⊥
Well, it's obvious: ⊥ is not allowed by EL syntax. But more importantly, there's no other way to express the same intention.
There is another thing we might want to do with formulas in a description logic. Suppose we have a concept C expressed using concept variables V... and a second concept D
Franz Baader and Barbara Morawska have shown that both unification and matching are NP-complete for EL. If you allow ∀ r.C instead of ∃ r.C, unification is still decidable, but it is EXPTIME-complete. We don't know if P=NP and we don't know if NP=EXPTIME, but we do know that P≠EXPTIME, and it's widely suspected that NP≠EXPTIME.
EL is tractable, but it's not very expressive. But it's nearly expressive enough.
EL++ is the logic underlying the SNOMED Clinical Terms ontology and the Gene Ontology. SNOMED contains several hundred thousand concepts and well over a million relationships between them. At this sort of scale, “tractable” matters. (SNOMED is not the largest medical ontology around, but we have a copy of it.)
Here are the forms of EL++, using a, b, ... to stand for individuals, r (with or without subscripts) to stand for rôles, f to stand for features (functions mapping individuals to concrete domains such as numbers or strings, p to stand for predicates over concrete domains (such as ≤), and C, D, ... to stand for concepts.
We haven't seen “concrete domain” before. The interpretation of such a form is {x ∈ Δ | p(f1(x),...,fn(x))}. There's a link here with Entity-Relationship modelling in relational data bases. Consider a tiny fragment of a University data base:
Entities: Student, Paper Relationships: Takes(Student, Paper) Attributes: Name : Student → string Date_Of_Birth : Student → date Debt : Student → money Title : Paper → string Level : Paper → number Points : Paper → number
Some things are modelled as “entities”, rather like objects in an OO domain model. As in this case, sometimes entities refer to concrete physical things like students and sometimes to social constructs like papers. Other things are modelled as mathematical values like strings and numbers. In an E-R model, an entity type like Student or Paper corresponds to a table, while a value type like string or number doesn't. An association between entities is called a relationship and also corresponds to a table (which has foreign keys pointing to the entities). An association between an entity and a value is represented by a column in the entity's table.
In the same way, EL++ and RDF let you model some things as “individuals” (the RDF name is “resource”, but it's the same idea) and other things as values (numbers or strings). And that's what features are for.
Here are the axioms you can put in the TBox:
In rôle inclusions, the case n=0 is allowed; the identity for the composition of binary relations is the identity function, here written ε.
Here are the assertions you can put in the ABox:
Rôle inclusions let us express
Here are some examples.
father ⊑ parent mother ⊑ parent ε ⊑ ancestor parent ⊑ ancestor ancestor ∘ ancestor ⊑ ancestor parent ⊑ looks_after tends ⊑ looks_after
If y is x's father, y is one of x's parents. Being the parent of a child and the owner of a dog are similar (you have to provide your children and your animals with suitable food, drink, and shelter) but different (you can sell your dog but not your daughter). We may need to model the similarities so that we can draw correct inferences (how many creatures is Richard responsible for feeding?) and we may also need to model the differences so that we don't draw incorrect inferences.
The empty concept ⊥ together with concept inclusions gives us disjointness of concepts, as before.
Something we couldn't do before: {a} ⊑ {b} expresses “a and b are identical” and {a} ⊓ {b} ⊑ ⊥ expresses “a and b are different”. This means that we can allow a=b and a≠b in the language without extending its power.
We can now express “Richard has these three pets”:
Concepts: Person, Pet Roles: tends Axioms: dom(tends) ⊑ Person ran(tends) ⊑ Pet Person ⊓ Pet ⊑ ⊥ lily ≠ jazzie, lily ≠ chassie, jazzie ≠ chassie. ABox: (richard,lily) ∈ tends (richard,jazzie) ∈ tends (richard,chassie) ∈ tends
and now you can safely conclude that I have at least three pets, without knowing anything about their sex or genus.
There is an additional limitation on EL++, and it's rather technical. It's a beautiful example (in its own horrible way) of just how delicate the differences between complexity classes can be.
Define T ⊨ r ⊑ s iff there are rôle inclusions r1 ⊑ r2, r2 ⊑ r3, ... rn-1 ⊑ rn in T with r=r1 and rn=s.
Define T ⊨ ran(r) ⊑ C iff there is a rôle name s such that T ⊨ r ⊑ s and the range restriction ran(s) ⊑ C is in T.
We need to impose an extra restriction on the whole TBox.
If r1 ∘ ⋯ ∘ rn ⊑ s is in T, with n≥1
and T ⊨ ran(s) ⊑ C
then T ⊨ ran(rn) ⊑ C.
Without this restriction, subsumption in EL++ is undecidable. With it, subsumption can be done in polynomial time.
Sadly, we can't add rôle inverse or symmetric rôles without losing tractability.
Much of this is cribbed from papers by Franz Baader and his colleagues.