Semantic Web knowledge–An introduction
This is a simple introduction; for more information, read the further-reading links at the bottom, and if you like what you read, consider enrolling in a Masters-level degree in Computer Science/Computing (seriously, this was the most rewarding thing I have ever done)
A semantic web, is that some sort of crazy net that sailors use to catch fish?
No, that is a fishing net. To explain the ideas behind the semantic web I must first do two things:
- Provide an example of a semantically valid sentence, as opposed to one that is merely syntactically valid
- Explain what a semantic web could be and provide an example of semantic reasoning
Spot the difference
If someone were to ask me what my favourite colour was, I might respond with something like:
My favourite colour is green
This is a semantically valid sentence: it has a clear meaning and is constructed in the following manner:
[pronoun] [adjective] [noun] [verb] [noun]
This arguably forms a technically correct English sentence. However, if we were to use a different noun in place of green, for example Kilbeggan, then the sentence would no longer make sense or hold semantic meaning, yet it would still be syntactically valid. We were talking about a colour, not something delicious like dark chocolate, a good steak or a bottle of Kilbeggan, which all represent my favourite elements of different collections (e.g. whiskey).
My favourite colour is Kilbeggan
This is a very simple example of meaning and how it corresponds to understanding of an overall concept. But if a person or intelligent agent did not know that Kilbeggan is not a colour, and is therefore an invalid choice for this sentence, there would be a possibility of this simple non-sentence being accepted as fact by that person or intelligent agent.
Classification via semantic association gives entities related properties of terms and words, so that they can correctly determine whether a sentence is valid and additionally infer meaningful information/facts from a pool of data (known as an ontology).
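To make this concrete, here is a toy sketch in Python of that kind of validity check. The fact pool, the is_a relation and the class names are all illustrative stand-ins for a real ontology, not any actual vocabulary:

```python
# A hand-made fact pool: a toy stand-in for an ontology.
facts = {
    ("green", "is_a", "Colour"),
    ("Kilbeggan", "is_a", "Whiskey"),
    ("dark chocolate", "is_a", "Food"),
}

def valid_favourite_colour(word, facts):
    """The sentence 'My favourite colour is X' only holds semantic
    meaning if X is known to be a Colour in our fact pool."""
    return (word, "is_a", "Colour") in facts

print(valid_favourite_colour("green", facts))      # True
print(valid_favourite_colour("Kilbeggan", facts))  # False
```

An agent with access to such facts can reject the Kilbeggan sentence even though it is syntactically well formed, which is exactly what a person who knows whiskey from colours would do.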
So is the goal to give meaning to every single word on the web?
That’s a tough one. Some would argue that such a scenario would be ideal, but I believe it is not really an immediately feasible goal, and so it would be preferable to build semantic webs that represent knowledge about specific domains — for example, a database of medical terms and conditions that can be independently understood by computer agents.
Semantic webs are based on representing knowledge using a collection of Subject::Predicate::Object / Entity-Attribute-Value statements, often called triples.
Three such SPO triples in RDF notation are:
Statement 0: (#Joe, rdf:type, #Student)
Statement 1: (#Student, rdfs:subClassOf, #Human)
Statement 2: (#Human, rdfs:subClassOf, #Animal)
(Note that membership of a class is expressed with rdf:type, while the relationship between two classes is expressed with rdfs:subClassOf.)
These statements represent the following facts:
Statement 0: Joe is a Student
Statement 1: A Student is a Human
Statement 2: A Human is an Animal
By storing these three facts in a semantic database system we can write queries/rules to infer facts about entities. For example, using just these three statements you can successfully determine that:
Inference 0: Joe is an Animal | which in RDF is | (#Joe, rdf:type, #Animal)
Inference 1: Joe is a Human | which in RDF is | (#Joe, rdf:type, #Human)
The reasoning outcome depends on the structure of the inference rule and the desired goal. One such rule, in Jena rule syntax (where rdf: is the namespace http://www.w3.org/1999/02/22-rdf-syntax-ns# and rdfs: is http://www.w3.org/2000/01/rdf-schema#), could be:
Inference Rule 0: [rule1: (?z rdf:type ?y) (?y rdfs:subClassOf ?x) -> (?z rdf:type ?x)]
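The mechanics of applying such a rule can be sketched in a few lines of Python. This is a minimal forward-chaining illustration, not a real triple store: the facts are the three statements about Joe, and the rule propagates class membership up the class hierarchy until no new facts appear.

```python
# Illustrative predicate names standing in for the full RDF/RDFS URIs.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

facts = {
    ("#Joe", RDF_TYPE, "#Student"),       # Statement 0: Joe is a Student
    ("#Student", SUBCLASS_OF, "#Human"),  # Statement 1: A Student is a Human
    ("#Human", SUBCLASS_OF, "#Animal"),   # Statement 2: A Human is an Animal
}

def infer(facts):
    """Apply (?z rdf:type ?y) (?y rdfs:subClassOf ?x) -> (?z rdf:type ?x)
    repeatedly until no new triples are produced (a fixed point)."""
    facts = set(facts)
    while True:
        new = {
            (z, RDF_TYPE, x)
            for (z, p1, y) in facts if p1 == RDF_TYPE
            for (y2, p2, x) in facts if p2 == SUBCLASS_OF and y2 == y
        }
        if new <= facts:
            return facts
        facts |= new

inferred = infer(facts)
print(("#Joe", RDF_TYPE, "#Human") in inferred)   # True  (Inference 1)
print(("#Joe", RDF_TYPE, "#Animal") in inferred)  # True  (Inference 0)
```

Real reasoners (e.g. those built into Jena) do essentially this at scale, with indexing and far more sophisticated rule languages, which is part of the argument for using dedicated semantic tools below.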
Could this not be achieved in a traditional RDBMS with some inference logic?
Yes, but traditional RDBMS systems are not designed for representing and inferring semantic facts, so they are not really suited to the task. The use of dedicated semantic tools and data stores provides:
- consistent performance
- specialised optimisation of execution
- standard interfaces for modifying and reasoning over facts
- mechanisms to efficiently understand/execute semantic rules
So there is less needless re-implementation than there would be if you used a pre-existing DB system and wrote the inference logic in custom blocks of your chosen programming language.
What does this mean for us?
Ultimately, computer agents that can understand content and human communication will reliably be able to provide facts from large collections of that content. Possible outcomes include a computer agent that could replace every diagnostic expert, a personal assistant application akin to Siri that is 100% reliable and can handle any knowledge request a human could process given a global dataset, or using semantic reasoning to determine whether a webpage contains meaningful information or is just an SEO-optimised spam/link-bait advertisement page (these may fool search engines).
Search engine spam
I eagerly anticipate the day when search engines are fully semantic and can understand and reason about any query to give flawless and succinct responses from the world's knowledge.