Provenance and Semantic Web

The Semantic Web is an extension of the current web that provides machine-readable formats so that machines (computers) can process, analyze, and understand the meaning of data. Several supporting components realize this concept; together they are known as the Semantic Web layer cake, also called the Semantic Cake. The figure below depicts the Semantic Cake. The most widely used formats in the Semantic Web are RDF and ontologies. RDF (Resource Description Framework) is a data model, commonly serialized as XML, that represents data or information on the Web as a directed, labeled graph. It does so by structuring information into subject-predicate-object triples called statements. The subject of one statement can also be the object of another, which creates a rich, interrelated knowledge base. RDF is essential to the Semantic Web because it conveys meaning in a form that computers can process.
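To make the subject-predicate-object idea concrete, here is a minimal sketch in plain Python. It does not use a real RDF library; the `ex:` names and the helper functions are made up for illustration only.

```python
# Each statement is a (subject, predicate, object) triple.
# All resource names are hypothetical examples, not real URIs.
triples = [
    ("ex:report", "ex:createdBy", "ex:alice"),
    ("ex:alice", "ex:worksFor", "ex:lab"),
    ("ex:lab", "ex:locatedIn", "ex:city"),
]


def statements_about(subject, graph):
    """Return every (predicate, object) pair whose subject matches."""
    return [(p, o) for s, p, o in graph if s == subject]


def linked_resources(graph):
    """Resources that are the object of one statement and the subject of another."""
    subjects = {s for s, _, _ in graph}
    objects = {o for _, _, o in graph}
    return subjects & objects


print(statements_about("ex:alice", triples))
print(sorted(linked_resources(triples)))
```

Here `ex:alice` is the object of the first statement and the subject of the second, which is what links the statements into a graph.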


Although RDF is good at structuring data in a machine-readable format and conveying its meaning, it is not enough to represent a model of the world. An ontology is needed to enrich the vocabulary: it lists the types of objects and the relationships that connect them, and it places constraints on how objects and relationships may be combined. In other words, an ontology brings common sense into the knowledge base that has been created and structured.
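As a rough illustration of what an ontology adds, the sketch below defines a tiny class hierarchy and one property with a domain and a range, then checks statements against them. All class, property, and function names are hypothetical. Treating domain and range as validation rules is a simplification: RDFS itself uses them to infer types rather than to reject data.

```python
# Hypothetical mini-ontology: a class hierarchy plus one property rule.
SUBCLASS_OF = {"ex:Student": "ex:Person", "ex:Person": "ex:Agent"}
PROPERTY_RULES = {"ex:createdBy": ("ex:Document", "ex:Agent")}  # property -> (domain, range)

# Declared type of each resource (also made up for this example).
TYPES = {"ex:report": "ex:Document", "ex:alice": "ex:Student"}


def is_a(resource_class, target_class):
    """Walk up the subclass chain to see whether resource_class is a kind of target_class."""
    current = resource_class
    while current is not None:
        if current == target_class:
            return True
        current = SUBCLASS_OF.get(current)
    return False


def is_valid(triple):
    """Check a statement against the domain and range declared for its predicate."""
    s, p, o = triple
    if p not in PROPERTY_RULES:
        return True  # no constraint declared for this property
    domain, range_ = PROPERTY_RULES[p]
    return is_a(TYPES.get(s), domain) and is_a(TYPES.get(o), range_)


print(is_valid(("ex:report", "ex:createdBy", "ex:alice")))
print(is_valid(("ex:alice", "ex:createdBy", "ex:report")))
```

The first statement passes because a student is a person and a person is an agent; the second fails because a person is not a document.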

Once data is represented in RDF or an ontology format, it can be queried using SPARQL. SPARQL (SPARQL Protocol and RDF Query Language) is an RDF query language (a.k.a. a semantic query language) used to retrieve and manipulate data stored in RDF format [1].
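The core of a SPARQL query is a triple pattern in which some positions are variables. The sketch below shows an example query as a string and imitates it with a small pattern matcher in plain Python. It is not a SPARQL engine; the data, the `ex:` names, and the `match` helper are assumptions for illustration.

```python
# The SPARQL query this sketch imitates (shown for reference, not executed):
QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?file ?agent
WHERE { ?file ex:createdBy ?agent . }
"""

# Hypothetical data, stored as (subject, predicate, object) tuples.
triples = [
    ("ex:report", "ex:createdBy", "ex:alice"),
    ("ex:slides", "ex:createdBy", "ex:bob"),
    ("ex:alice", "ex:worksFor", "ex:lab"),
]


def match(pattern, graph):
    """Match one triple pattern; terms starting with '?' are variables."""
    results = []
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


for row in match(("?file", "ex:createdBy", "?agent"), triples):
    print(row["?file"], row["?agent"])
```

A real query engine would join several such patterns and add filters, ordering, and updates, but each pattern is matched against the graph in essentially this way.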

In conjunction with provenance, we can combine some basic Semantic Web concepts (web services, ontologies, triple stores, reasoning, rules, and SPARQL) to represent data and the connections surrounding it, verify it, store it, and query it, ultimately answering challenge queries, including questions about the origin of something. Each web service takes the URI of a file as input and processes it to produce provenance data about that file. The output is stored in a web-accessible environment so that consumers can query it. Consumers can also reason over the data with the ontology and make many logical inferences about the interconnectedness and dependencies of files and processes [2].
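The flow described above can be sketched as follows. This is only an illustration under stated assumptions: `record_provenance` stands in for a provenance web service, the list `store` stands in for a triple store, and the file names, process names, and `ex:` predicates are all hypothetical rather than taken from [2].

```python
def record_provenance(output_uri, process, input_uris):
    """Stand-in for a provenance web service: emit triples describing one file."""
    triples = [(output_uri, "ex:generatedBy", process)]
    triples += [(output_uri, "ex:derivedFrom", src) for src in input_uris]
    return triples


# Stand-in for a web-accessible triple store.
store = []
store += record_provenance("ex:aligned", "ex:alignStep", ["ex:raw"])
store += record_provenance("ex:summary", "ex:averageStep", ["ex:aligned"])


def origins(uri, graph):
    """Follow ex:derivedFrom links transitively to find everything uri depends on."""
    found, stack = set(), [uri]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == "ex:derivedFrom" and o not in found:
                found.add(o)
                stack.append(o)
    return found


print(sorted(origins("ex:summary", store)))
```

Following the `ex:derivedFrom` links transitively is a simple form of the inference mentioned above: the store never states directly that `ex:summary` depends on `ex:raw`, yet the query can conclude it.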

[1] E. Prud’hommeaux and A. Seaborne, “SPARQL Query Language for RDF,” World Wide Web Consortium, W3C Recommendation REC-rdf-sparql-query-20080115, Jan. 2008.
[2] J. Golbeck and J. Hendler, “A semantic web approach to the provenance challenge,” Concurrency and Computation: Practice and Experience, 2008.
