Property P31 – “instance of”
Most Wikidata properties describe features that an item has: the item for Star Wars Episode IV: A New Hope (Q17738) has a director (P57), a duration (P2047), a cost (P2130), and so on. But often we are interested in what something is. Most Wikidata items have at least one statement with the property P31 (instance of), which names the class of which the item is a particular example and member:
- Star Wars Episode IV: A New Hope (Q17738) is an instance of a film (Q11424).
- Star Wars (Q22092344) is an instance of a film series (Q24856).
- Star Wars (Q462) is an instance of a media franchise (Q196600).
Note that an item is not limited to one P31 statement. For example, Star Wars: Episode VIII – The Last Jedi (Q18486021) is an instance of a film (Q11424) and also an instance of a 3D film (Q229390).
Also note that P31 statements aim to make the most general distinctions and relegate other data to other properties:
- George Lucas (Q38222) is an instance of human (Q5).
We could also state that George Lucas is an instance of film director (Q2526255), because Lucas is clearly an example and member of the class of film directors. However, the classification strategy is to set the “instance of” statement to the most general value and to record more specific information with other properties. For example, the fact that Lucas is a film director is recorded in a statement that uses the occupation (P106) property.
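The statements discussed in this section can be pictured as simple (item, property, value) triples. The Python sketch below is only a toy model under that assumption: the names `STATEMENTS` and `values` are hypothetical, and the list is not how Wikidata stores its data. The identifiers are the ones quoted in the text.

```python
# Toy model: each statement is an (item, property, value) triple.
# IDs are those quoted in the text; the storage format is hypothetical.
STATEMENTS = [
    ("Q17738", "P31", "Q11424"),      # A New Hope: instance of film
    ("Q22092344", "P31", "Q24856"),   # Star Wars: instance of film series
    ("Q462", "P31", "Q196600"),       # Star Wars: instance of media franchise
    ("Q18486021", "P31", "Q11424"),   # The Last Jedi: instance of film
    ("Q18486021", "P31", "Q229390"),  # The Last Jedi: instance of 3D film
    ("Q38222", "P31", "Q5"),          # George Lucas: instance of human
    ("Q38222", "P106", "Q2526255"),   # George Lucas: occupation film director
]


def values(item, prop):
    """Return every value stated for `prop` on `item`, in statement order."""
    return [v for (s, p, v) in STATEMENTS if s == item and p == prop]


print(values("Q18486021", "P31"))  # an item may carry several P31 statements
print(values("Q38222", "P31"))     # the general class
print(values("Q38222", "P106"))    # the specific role is kept under occupation
```

In this model, asking for the P31 values of George Lucas yields only human (Q5). Film director (Q2526255) appears only when the occupation (P106) values are requested.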
Property P279 – “subclass of”
So, while Q17738 (Star Wars Episode IV: A New Hope) represents a particular film – it has a particular director (George Lucas), a specific duration (121 minutes), a list of cast members (Carrie Fisher, Harrison Ford, …), and so on – the item film (Q11424) is a general concept. Films can have directors, durations, and cast members, but the general concept “film” does not have any particular director, duration, or cast members.
General concepts are described with the property subclass of (P279), and an item can have more than one such statement. For example:
- Film (Q11424) is a subclass of visual artwork (Q4502142), but also of audiovisual work (Q2431196).
- Film series (Q24856) is a subclass of series of creative works (Q7725310), work of art (Q838948), audiovisual work (Q2431196), and media franchise (Q196600).
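The two examples above can be written down as a small lookup table. This is a minimal sketch in Python, assuming a plain dictionary from a class to its direct superclasses. `SUBCLASS_OF`, `direct_superclasses`, and `shared_superclasses` are hypothetical names, and only the P279 statements listed in the text are included.

```python
# Direct "subclass of" (P279) statements quoted in the text: class -> superclasses.
SUBCLASS_OF = {
    # film -> visual artwork, audiovisual work
    "Q11424": ["Q4502142", "Q2431196"],
    # film series -> series of creative works, work of art,
    #                audiovisual work, media franchise
    "Q24856": ["Q7725310", "Q838948", "Q2431196", "Q196600"],
}


def direct_superclasses(cls):
    """Return the classes that `cls` is directly a subclass of."""
    return SUBCLASS_OF.get(cls, [])


def shared_superclasses(a, b):
    """Return the direct superclasses that `a` and `b` have in common."""
    parents_of_b = direct_superclasses(b)
    return [c for c in direct_superclasses(a) if c in parents_of_b]


print(direct_superclasses("Q11424"))
print(shared_superclasses("Q11424", "Q24856"))  # audiovisual work is shared
```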
The significance of the instance/subclass distinction
Suppose we wanted a list of all the films that take place in the fictional Star Wars universe. We could run the following query:
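A query of this kind might look like the sketch below. The Python helper only assembles the SPARQL text and does not send it anywhere. The function name `build_film_query` is hypothetical, the use of P1434 (takes place in fictional universe) is an assumption about how the universe is recorded, and the Q-identifier of the Star Wars universe item is left as a parameter to be looked up on Wikidata. The `wd:`, `wdt:`, `wikibase:`, and `bd:` prefixes are assumed to be predefined, as they are in the Wikidata Query Service.

```python
def build_film_query(universe_qid):
    """Return SPARQL text selecting films set in the given fictional universe.

    `universe_qid` is the Q-identifier of the universe item, e.g. the item
    for the Star Wars universe (not hard-coded here; look it up on Wikidata).
    """
    return f"""\
SELECT ?film ?filmLabel WHERE {{
  # instance of (P31): film (Q11424), matched exactly
  ?film wdt:P31 wd:Q11424 .
  # assumed property: takes place in fictional universe (P1434)
  ?film wdt:P1434 wd:{universe_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}"""
```

Printing the result of `build_film_query` for the universe item's identifier gives text that can be pasted into the query service. The first triple pattern is the one that matters for the discussion that follows: it matches only items whose P31 value is exactly Q11424.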
The query returns only 10 films. Clearly, some films are missing from the results, such as Star Wars: Episode I – The Phantom Menace (Q165713). Why?
The missing items have “feature film” (Q24869), not “film” (Q11424), as the value of their P31 statement. “Feature film” is a subclass of “film”, but the query does not take that relationship into account: the pattern in the WHERE clause matches only items whose P31 value is exactly “film”. Items that are instances of a subclass of “film” therefore do not match and are not retrieved.
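The mismatch can be reproduced with a toy matcher. The sketch below is Python, not SPARQL, and rests on two assumptions: the dictionaries stand in for Wikidata, and The Phantom Menace carries “feature film” as its P31 value, as the text implies. `P31`, `SUBCLASS_OF`, and `instances_of` are hypothetical names.

```python
# "instance of" (P31) values per item.
P31 = {
    "Q17738": ["Q11424"],   # A New Hope: instance of film
    "Q165713": ["Q24869"],  # The Phantom Menace: instance of feature film
}

# "subclass of" (P279): feature film is a subclass of film.
# The matcher below never consults this table, which is exactly the problem.
SUBCLASS_OF = {"Q24869": ["Q11424"]}


def instances_of(cls):
    """Return items with a P31 statement whose value is exactly `cls`."""
    return [item for item, classes in P31.items() if cls in classes]


print(instances_of("Q11424"))  # only A New Hope matches "film"
print(instances_of("Q24869"))  # The Phantom Menace matches "feature film"
```

Like the triple pattern in the query, `instances_of` compares values literally, so the subclass link between Q24869 and Q11424 has no effect on the result.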
We could use the UNION construction to select films that are either an instance of “film” or an instance of “feature film”:
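One possible shape for such a query is sketched here. As before, the Python helper only builds the SPARQL text. `build_union_query` is a hypothetical name, P1434 (takes place in fictional universe) is an assumed property choice, and the universe item's Q-identifier is passed in, not hard-coded. `DISTINCT` is added so that an item matching more than one branch is listed once.

```python
def build_union_query(universe_qid, classes=("Q11424", "Q24869")):
    """Return SPARQL text with one UNION branch per accepted P31 value.

    The default classes are film (Q11424) and feature film (Q24869).
    """
    # Each branch is a group such as: { ?film wdt:P31 wd:Q11424 . }
    branches = " UNION ".join(f"{{ ?film wdt:P31 wd:{c} . }}" for c in classes)
    return (
        "SELECT DISTINCT ?film ?filmLabel WHERE {\n"
        f"  {branches}\n"
        # assumed property: takes place in fictional universe (P1434)
        f"  ?film wdt:P1434 wd:{universe_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}"
    )
```

Every additional subclass has to be passed explicitly in `classes`, which adds one more branch to the query text.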
This query retrieves more results, but it can still miss relevant items (i.e., films taking place in the Star Wars universe) whose “instance of” value is some other subclass of film: action film, 3D film, epic film… Listing every subclass of film in UNION statements is tedious and easy to leave incomplete, so it is not a good strategy. A more general solution is shown in the next section.