Just as XML became the well-known standard adopted by many software vendors for data exchange, Resource Description Framework (RDF) is heading in the same direction for describing and interchanging metadata. XML describes data using a document type definition (DTD) or an XML Schema Definition (XSD). RDF uses XML syntax and RDF Schema (RDFS) to describe the metadata as a data model.

This article explains how to use custom utilities developed with the Jena RDF API for managing RDF models stored in either a relational database or a file. Developed by HP Labs, the Jena framework is an open source implementation of RDF, RDFS, and OWL (Web Ontology Language) and includes a rule-based inference engine. It provides a Java API for creating and manipulating RDF models. In this article, I explain scripts that I developed using Jena for maintaining RDF database or file models. This article also explains how to use Protégé for creating semantic RDF files that include the schema (.rdfs) and the data file (.rdf).

Software installation
The following software must be installed before using SemanticRDFUtils.bat, a script with several tasks that can be used for maintaining Jena RDF metadata models stored in a relational database or a flat file. The links are included in Resources.

  • Java SDK 1.3 or a more recent version
  • Jena 2.0
  • Oracle 9.2.0.1.0
  • Apache ANT 1.5.4 or a more recent version
  • Protégé 2.1

A quick look at RDF and RDFS files
The following XML listings show the RDF and RDFS files for a sample alphabet cross reference model. They were created using the Protégé 2.1 GUI tool. The RDF file can be used as an input while running the scripts and RDF query tool. The RDFS file is useful when you work with Protégé to add more data to the RDF file.

Listing 1. RDFTest1.rdf

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE rdf:RDF [
<!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<!ENTITY rdfs 'http://www.w3.org/TR/1999/PR-rdf-schema-19990303#'>
<!ENTITY Maana 'http://www.vvasam.com/Maana#'>
]>
<rdf:RDF xmlns:rdf="&rdf;"
xmlns:Maana="&Maana;"
xmlns:rdfs="&rdfs;">
<Maana:ASCII rdf:about="&Maana;RDFTest_Instance_0"
Maana:Name="A"
Maana:value="65"
rdfs:label="A:65">
<Maana:system rdf:resource="&Maana;RDFTest_Instance_2"/>
</Maana:ASCII>
<Maana:System rdf:about="&Maana;RDFTest_Instance_1"
Maana:Name="lowercase"
rdfs:label="lowercase"/>
<Maana:ASCII rdf:about="&Maana;RDFTest_Instance_10000"
Maana:Name="b"
Maana:value="98"
rdfs:label="b:98">
<Maana:system rdf:resource="&Maana;RDFTest_Instance_1"/>
</Maana:ASCII>
<Maana:ASCII rdf:about="&Maana;RDFTest_Instance_10001"
Maana:Name="B"
Maana:value="66"
rdfs:label="B:66">
<Maana:system rdf:resource="&Maana;RDFTest_Instance_2"/>
</Maana:ASCII>
<Maana:AscXRef rdf:about="&Maana;RDFTest_Instance_10002"
rdfs:label="b:98:B:66">
<Maana:keyName rdf:resource="&Maana;RDFTest_Instance_10000"/>
<Maana:keyValue rdf:resource="&Maana;RDFTest_Instance_10001"/>
</Maana:AscXRef>
<Maana:AscXRef rdf:about="&Maana;RDFTest_Instance_10005"
rdfs:label="a:97:A:65">
<Maana:keyValue rdf:resource="&Maana;RDFTest_Instance_0"/>
<Maana:keyName rdf:resource="&Maana;RDFTest_Instance_8"/>
</Maana:AscXRef>
<Maana:System rdf:about="&Maana;RDFTest_Instance_2"
Maana:Name="uppercase"
rdfs:label="uppercase"/>
<Maana:ASCII rdf:about="&Maana;RDFTest_Instance_8"
Maana:Name="a"
Maana:value="97"
rdfs:label="a:97">
<Maana:system rdf:resource="&Maana;RDFTest_Instance_1"/>
</Maana:ASCII>
</rdf:RDF>

Listing 2. RDFTest1.rdfs

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE rdf:RDF [
<!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<!ENTITY system 'http://protege.stanford.edu/system#'>
<!ENTITY Maana 'http://www.vvasam.com/Maana#'>
<!ENTITY rdfs 'http://www.w3.org/TR/1999/PR-rdf-schema-19990303#'>
]>
<rdf:RDF xmlns:rdf="&rdf;"
xmlns:system="&system;"
xmlns:rdfs="&rdfs;"
xmlns:Maana="&Maana;">
<rdf:Property rdf:about="&system;maxCardinality"
rdfs:label="system:maxCardinality"/>
<rdf:Property rdf:about="&system;minCardinality"
rdfs:label="system:minCardinality"/>
<rdf:Property rdf:about="&system;range"
rdfs:label="system:range"/>
<rdfs:Class rdf:about="&Maana;ASCII"
rdfs:label="ASCII">
<rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<rdfs:Class rdf:about="&Maana;AscXRef"
rdfs:label="AscXRef">
<rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<rdf:Property rdf:about="&Maana;Name"
rdfs:label="Name">
<rdfs:domain rdf:resource="&Maana;ASCII"/>
<rdfs:domain rdf:resource="&Maana;System"/>
<rdfs:range rdf:resource="&rdfs;Literal"/>
</rdf:Property>
<rdf:Property rdf:about="&Maana;RDFTest_Slot_10003"
rdfs:label="RDFTest_Slot_10003">
<rdfs:range rdf:resource="&rdfs;Literal"/>
</rdf:Property>
<rdfs:Class rdf:about="&Maana;System"
rdfs:label="System">
<rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<rdf:Property rdf:about="&Maana;keyName"
rdfs:label="keyName">
<rdfs:range rdf:resource="&Maana;ASCII"/>
<rdfs:domain rdf:resource="&Maana;AscXRef"/>
</rdf:Property>
<rdf:Property rdf:about="&Maana;keyValue"
rdfs:label="keyValue">
<rdfs:range rdf:resource="&Maana;ASCII"/>
<rdfs:domain rdf:resource="&Maana;AscXRef"/>
</rdf:Property>
<rdf:Property rdf:about="&Maana;system"
rdfs:label="system">
<rdfs:domain rdf:resource="&Maana;ASCII"/>
<rdfs:range rdf:resource="&Maana;System"/>

</rdf:Property>
<rdf:Property rdf:about="&Maana;value"
rdfs:label="value">
<rdfs:domain rdf:resource="&Maana;ASCII"/>
<rdfs:range rdf:resource="&rdfs;Literal"/>
</rdf:Property>
</rdf:RDF>

Overview of Jena and Protégé
The following sections give a high-level overview of Jena and Protégé. You can get more detailed information on both products from the links in Resources. In this article, I expect you already have a good understanding of Jena and Protégé.

Jena RDF and RDQL
The RDF data model is a collection of statements, with each statement consisting of three parts: a resource, a property, and a value. A resource is anything that can be identified by a URI, and it can have properties. Each property has a value. A property, a value, or a whole statement can itself be a resource with its own properties and values.

Jena persists the RDF model in either a database or a file. RDQL is a query language for querying the RDF model. RDF provides a graph with directed edges where the nodes can be either resources or literals. RDQL offers a way of specifying a graph pattern that is matched against the graph to yield a set of matches. Figure 1 shows the RDF graph representation of the files shown in Listings 1 and 2.

 

Figure 1. RDF graph representation for the sample RDF file.

A resource is represented by an ellipse and a literal by a rectangle. A resource (the subject) linked to another resource or a literal (the object, or value) through an arc (the predicate, or property) forms a triple, which is called a statement.
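As a rough illustration of the triple structure, the sketch below models a few statements from Listing 1 as plain Python tuples. It is not how Jena stores a model; the helper names (is_resource, describe) are hypothetical, and telling resources from literals by the namespace prefix is a simplification that only works for this sample data.

```python
# Each RDF statement is a (subject, predicate, object) triple.
MAANA = "http://www.vvasam.com/Maana#"

# A few statements taken from Listing 1.
statements = [
    (MAANA + "RDFTest_Instance_8", MAANA + "Name", "a"),
    (MAANA + "RDFTest_Instance_8", MAANA + "value", "97"),
    (MAANA + "RDFTest_Instance_8", MAANA + "system", MAANA + "RDFTest_Instance_1"),
    (MAANA + "RDFTest_Instance_1", MAANA + "Name", "lowercase"),
]


def is_resource(node):
    """Simplification: a node that starts with the namespace URI is a resource."""
    return node.startswith(MAANA)


def describe(statement):
    """Render one statement as 'subject --predicate--> object (kind)'."""
    subject, predicate, obj = statement
    kind = "resource" if is_resource(obj) else "literal"
    return f"{subject} --{predicate}--> {obj} ({kind})"


for statement in statements:
    print(describe(statement))
```

In Figure 1 terms, every tuple is one arc: the first element is the ellipse the arc leaves, the second is the arc label, and the third is the ellipse or rectangle the arc points to.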

The query below is an example RDQL query. The triple pattern (?x <http://www.vvasam.com/Maana#value> "97") in the query is a statement. Here ?x is a bind variable that represents a resource; http://www.vvasam.com/Maana#value is a property named value; and 97 is the value of the property.

SELECT ?x WHERE (?x <http://www.vvasam.com/Maana#value> "97")

The Jena toolkit provides a Java class (jena.rdfquery) that can be executed from the command line to run RDQL queries. The following shows the results of executing the query shown above after saving it as test1.rdql.

java jena.rdfquery --data RDFTest1.rdf --query test1.rdql
x
================================================
http://www.vvasam.com/Maana#RDFTest_Instance_8
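The matching step behind such a query can be sketched in a few lines of Python. This is only a conceptual model of matching one triple pattern against a list of statements, not Jena's RDQL engine; the match function and its pattern convention (terms starting with ? are bind variables) are assumptions made for illustration.

```python
MAANA = "http://www.vvasam.com/Maana#"

# Statements for the Maana:value property, taken from Listing 1.
statements = [
    (MAANA + "RDFTest_Instance_0", MAANA + "value", "65"),
    (MAANA + "RDFTest_Instance_8", MAANA + "value", "97"),
    (MAANA + "RDFTest_Instance_10000", MAANA + "value", "98"),
    (MAANA + "RDFTest_Instance_10001", MAANA + "value", "66"),
]


def match(statements, pattern):
    """Return one binding dict per statement that fits the pattern.

    Pattern terms starting with '?' are bind variables; every other term
    must equal the corresponding part of the statement.
    """
    results = []
    for statement in statements:
        bindings = {}
        for term, node in zip(pattern, statement):
            if term.startswith("?"):
                # A repeated variable must bind to the same node each time.
                if bindings.setdefault(term[1:], node) != node:
                    break
            elif term != node:
                break
        else:
            # No term failed, so the whole statement matches.
            results.append(bindings)
    return results


rows = match(statements, ("?x", MAANA + "value", "97"))
for row in rows:
    print(row["x"])
```

With the sample data, the only statement that has the value property set to 97 belongs to RDFTest_Instance_8, which is the resource reported by jena.rdfquery above.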

Note: Browse through the links in Resources for more information on RDF and RDQL.

RDF using Protégé
Protégé is a GUI tool for creating and editing ontologies and knowledge bases. Protégé can create and store data in RDF format. To create an RDF model in Protégé, the RDF Schema format must be selected when creating a project, as shown in Figure 2.

 

Figure 2. RDF Schema project

The Select Format box appears when New is selected from Protégé’s Project menu. After clicking the OK button, the window in Figure 3 appears.

 

Figure 3. Default Protégé project screen.

As you can see from Figure 3, Protégé has several tabs. In this article, I briefly discuss the Classes, Instances, and Algernon tabs.

Figure 4 shows Protégé’s Save dialog box. It includes entries for typing the names of the project, the classes file, instances file, and namespace. As shown in Figure 4, Classes File Name contains RDF Schema information, and Instances File Name contains the RDF data. Namespace identifies the RDF model with a unique URI.

 

Figure 4. Protégé’s Save dialog

Figures 5 and 6 show the Protégé Classes and Instances tabs, respectively, for the .rdf and .rdfs files shown in Listings 1 and 2. The files were created using the Protégé RDF Schema format.

 

Figure 5. Protégé class tab.


Figure 6. Protégé instances tab.

Algernon queries in Protégé
Protégé’s Algernon query tab provides a UI for running Algernon queries and viewing the results (see Figure 7). Algernon is a triple-based query language that returns resources based on the traversal path, as shown in Figure 7. By default, the Algernon query tab doesn’t appear in the view. To see this tab, it must be selected from the Configure submenu in the Project menu.

 

Figure 7. Algernon tab.

Terminology mapping between Jena and Protégé
Jena and Protégé are two separate open source technologies, so their RDF terminology differs. The following table maps the terms to make it easier to use both tools for creating and manipulating RDF documents.

Table 1. Jena and Protégé terminology comparison

Jena | Protégé | Comments
Resource | Class | The values of a resource’s properties can be seen in the Instances tab of Protégé.
Property | Slot |
RDF Model | RDF Project | Jena can manipulate .rdf files without an .rdfs file, but Protégé requires both the .rdf and .rdfs files.

Jena Semantic RDF utilities
This section explains utility scripts that are useful for maintaining Jena database and file models. The script files are in the SemanticRDFUtils-scripts-files.zip file. The list below describes the tasks you can perform using the scripts. When you execute the SemanticRDFUtils batch file without a task ID as a command-line parameter, the following appears on your console:

C:\RDF\SemanticRDFUtils
Usage: SemanticRDFUtils taskid
Where taskid should be any one of the following:
1 --> To create and initialize the Jena system tables with a system model name as JenaRDFSystem
2 --> To create a database model
3 --> To remove a database model.
4 --> To list the contents of a given model.
5 --> To import RDF/XML file to a database.
6 --> To list existing database model names
7 --> To export a database model to a RDF/XML file
8 --> To delete all the contents of a database model
9 --> To create a union(RDF/XML file) of RDF/XML file models
10 --> To create an intersection(RDF/XML file) of RDF/XML file models
11 --> To create a difference(RDF/XML file) of RDF/XML file models
12 --> To get the size of the given model
13 --> Export the RDF query results as RDF/XML file.
14 --> Delete the resource(s) from a model based on RDF query.

The SemanticRDFUtils script uses the SemanticRDFUtils.properties file to get configuration information. The following table shows all the properties you can configure in that file.

Table 2. SemanticRDFUtils.properties configuration properties

Property names | Possible values | Comments
rdf_sytem_model_name | |
rdf_namespace | URL followed by # | Example: http://www.vvasam.net/vvasam#
isRDFInDB | True or false | If true, then the db_user, db_password, and url properties should be populated, and url should hold a JDBC URL. If false, then url should hold a file URL.
db_user and db_password | | User name and password of the database.
url | JDBC URL or file URL | If isRDFInDB is true, the URL format should be jdbc:oracle:thin:@<<hostname>>:<<port>>:<<sid>>. Example: url=jdbc:oracle:thin:@rdfhost:1521:alpc. If isRDFInDB is false, the URL format should be file:///c:/…. Example: url=file:///c:/temp/refDataTest.rdf.
modelName | Name of the database model | Used only if isRDFInDB is true. This property can be empty when executing some SemanticRDFUtils tasks: (1) list existing models, (2) create a new model, (3) create and initialize Jena system tables, and (4) remove model.
rdf_query and bind_var_name | RDF query and name of the bind variable in the RDF query | If the rdf_query property is empty, then the utility prompts the user to enter a value.
export_rdffile_abs_name | | If this property is empty, then the utility prompts the user to enter a value.
import_rdffile_abs_name | | If this property is empty, then the utility prompts the user to enter a value.
file_rdfmodel1_abs_name, file_rdfmodel2_abs_name, result_rdfmodel_abs_name | | Used by the model operations union, intersection, and difference. If any of these properties is empty, then the utility prompts the user to enter a value.

The first task creates Jena system tables in a database and requires appropriate values for the following properties:

  1. rdf_sytem_model_name
  2. db_user
  3. db_password
  4. isRDFInDB
  5. url
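The check for required properties can be pictured with the following sketch. It is not the code from SemanticRDFUtils; it assumes simple key=value lines (no backslash escapes or line continuations), and the function names and the sample value rdfuser are hypothetical. The system model name and the JDBC URL are the examples used earlier in this article.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Properties that the article lists for task 1.
TASK1_REQUIRED = ["rdf_sytem_model_name", "db_user", "db_password", "isRDFInDB", "url"]


def missing_properties(props, required):
    """Return the required keys that are absent or have an empty value."""
    return [key for key in required if not props.get(key)]


# Placeholder configuration for illustration; db_password is left empty on purpose.
sample = """
rdf_sytem_model_name=JenaRDFSystem
db_user=rdfuser
db_password=
isRDFInDB=true
url=jdbc:oracle:thin:@rdfhost:1521:alpc
"""

props = parse_properties(sample)
print(missing_properties(props, TASK1_REQUIRED))
```

Running a check like this before starting a task makes it easy to see which values still have to be supplied in SemanticRDFUtils.properties.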

The second task creates a new RDF model in a database and requires appropriate values for the following properties:

  • rdf_sytem_model_name
  • db_user
  • db_password
  • isRDFInDB
  • url
  • modelName (if this property is empty, then the script prompts for a value to be entered from the keyboard)

The third task deletes an RDF model from the database and requires appropriate values for the following properties:

  • rdf_sytem_model_name
  • db_user
  • db_password
  • isRDFInDB
  • url
  • modelName (if this property is empty, then the script prompts for a value to be entered from the keyboard)

The fourth task lists the contents of an RDF database model and requires appropriate values for the following properties:

  • rdf_sytem_model_name
  • db_user
  • db_password
  • isRDFInDB
  • url
  • modelName (if this property is empty, then the script prompts for a value to be entered from the keyboard)

The fifth task imports an RDF file to a database model and requires appropriate values for the following properties:

  • rdf_sytem_model_name
  • db_user
  • db_password
  • isRDFInDB
  • url
  • modelName (if this property is empty, then the script prompts for a value to be entered from the keyboard)
  • import_rdffile_abs_name

 

The value of the property import_rdffile_abs_name is populated with the absolute path of the .rdf file as shown below. If the value is empty, then the script prompts for an input value.

import_rdffile_abs_name=C:/temp/RDFTest1.rdf

The sixth task lists all the models in a database and requires appropriate values for the following properties:

  • rdf_sytem_model_name
  • db_user
  • db_password
  • isRDFInDB
  • url

The seventh task exports the contents of the given database model to an RDF file and requires appropriate values for the following properties:

  • rdf_sytem_model_name
  • db_user
  • db_password
  • isRDFInDB
  • url
  • modelName (if this property is empty, then the script prompts for a value to be entered from the keyboard)
  • export_rdffile_abs_name

 

The value of the property export_rdffile_abs_name is populated with the absolute path of the .rdf file as shown below. If the value is empty, then the script prompts for an input value.

export_rdffile_abs_name=C:/temp/export.rdf

The eighth task deletes a database model’s contents and requires appropriate values for the following properties:

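  • rdf_sytem_model_name
  • db_user
  • db_password
  • isRDFInDB
  • url
  • modelName (if this property is empty, then the script prompts for a value to be entered from the keyboard)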

 

The ninth task performs a union operation on two file models and requires appropriate values for the following properties:

  • isRDFInDB
  • url
  • file_rdfmodel1_abs_name
  • file_rdfmodel2_abs_name
  • result_rdfmodel_abs_name

The following shows the sample property values:

file_rdfmodel1_abs_name=C:/temp/RDFTest1.rdf

file_rdfmodel2_abs_name=C:/temp/RDFTest2.rdf

result_rdfmodel_abs_name=C:/temp/RDFTestUnion.rdf

When SemanticRDFUtils is executed with the task ID 9, the two .rdf files are merged. To create a Protégé project on the merged .rdf file, the .rdfs files must be merged manually.

The tenth task performs an intersection operation on two file models and requires appropriate values for the following properties:

  • isRDFInDB
  • url
  • file_rdfmodel1_abs_name
  • file_rdfmodel2_abs_name
  • result_rdfmodel_abs_name

The following shows the sample property values:

file_rdfmodel1_abs_name=C:/temp/RDFTest1.rdf

file_rdfmodel2_abs_name=C:/temp/RDFTest2.rdf

result_rdfmodel_abs_name=C:/temp/RDFTestInterSection.rdf

The eleventh task performs a difference operation on two file models and requires appropriate values for the following properties:

  • isRDFInDB
  • url
  • file_rdfmodel1_abs_name
  • file_rdfmodel2_abs_name
  • result_rdfmodel_abs_name

The following shows the sample property values:

file_rdfmodel1_abs_name=C:/temp/RDFTest1.rdf

file_rdfmodel2_abs_name=C:/temp/RDFTest2.rdf

result_rdfmodel_abs_name=C:/temp/RDFTestDifference.rdf
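Conceptually, the union, intersection, and difference tasks (9 to 11) treat each file model as a set of statements. The sketch below shows the three operations on two tiny hand-written models using Python sets. It does not read .rdf files or call Jena, and taking the difference as "statements in the first model that are not in the second" is an assumption about the operand order.

```python
MAANA = "http://www.vvasam.com/Maana#"

# Two small models, each a set of (subject, predicate, object) statements.
model1 = {
    (MAANA + "RDFTest_Instance_0", MAANA + "value", "65"),
    (MAANA + "RDFTest_Instance_8", MAANA + "value", "97"),
}
model2 = {
    (MAANA + "RDFTest_Instance_8", MAANA + "value", "97"),
    (MAANA + "RDFTest_Instance_10000", MAANA + "value", "98"),
}

union = model1 | model2          # statements found in either model
intersection = model1 & model2   # statements found in both models
difference = model1 - model2     # statements only in the first model

print(len(union), len(intersection), len(difference))
```

Because a model is a set, a statement that appears in both input files ends up only once in the union.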

The twelfth task reports the size of a given database model and requires appropriate values for the following properties:

  • rdf_sytem_model_name
  • db_user
  • db_password
  • isRDFInDB
  • url
  • modelName (if this property is empty, then the script prompts for a value to be entered from the keyboard)

 

The thirteenth task exports the results of an RDF query on a database model to an RDF file and requires appropriate values for the following properties:

  • rdf_sytem_model_name
  • db_user
  • db_password
  • isRDFInDB
  • url
  • modelName (if this property is empty, then the script prompts for a value to be entered from the keyboard)
  • rdf_query
  • bind_var_name
  • export_rdffile_abs_name

 

The following shows the sample property values:

rdf_query=SELECT ?x WHERE (?x <http://www.vvasam.com/Maana#value> "65")

bind_var_name=x

export_rdffile_abs_name=C:/temp/exportquery.rdf

The fourteenth task deletes the results of an RDF query from a database model and requires appropriate values for the following properties:

  • rdf_sytem_model_name
  • db_user
  • db_password
  • isRDFInDB
  • url
  • modelName (if this property is empty, then the script prompts for a value to be entered from the keyboard)
  • rdf_query
  • bind_var_name

 

The following shows the sample property values:

rdf_query=SELECT ?x WHERE (?x <http://www.vvasam.com/Maana#value> "65")

bind_var_name=x
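The idea of deleting the resources returned by a query can be sketched as a two-step process: find the resources bound to the query variable, then drop their statements. The code below is a conceptual model only. The article does not show how the script decides which statements to remove, so removing just the statements that have the matched resource as their subject is an assumption, and both function names are hypothetical.

```python
MAANA = "http://www.vvasam.com/Maana#"

# A few statements taken from Listing 1.
statements = [
    (MAANA + "RDFTest_Instance_0", MAANA + "Name", "A"),
    (MAANA + "RDFTest_Instance_0", MAANA + "value", "65"),
    (MAANA + "RDFTest_Instance_8", MAANA + "Name", "a"),
    (MAANA + "RDFTest_Instance_8", MAANA + "value", "97"),
]


def select_subjects(statements, predicate, literal):
    """Return the subjects of all statements with the given predicate and value."""
    return {s for (s, p, o) in statements if p == predicate and o == literal}


def delete_resources(statements, resources):
    """Return the statements whose subject is not one of the given resources."""
    return [st for st in statements if st[0] not in resources]


# Step 1: resources whose value property is "65" (the sample query's pattern).
matched = select_subjects(statements, MAANA + "value", "65")

# Step 2: drop every statement about those resources.
remaining = delete_resources(statements, matched)
for st in remaining:
    print(st)
```

With the sample data, the query matches RDFTest_Instance_0, so only the two statements about RDFTest_Instance_8 remain.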

Conclusion
This article has given you insight into how to create RDF metadata models using Jena and Protégé. It described how to use the SemanticRDFUtils command-line script for maintaining the RDF models. The SemanticRDFUtils-source-files.zip included in this article’s accompanying source files can be used to create a Web-based interface or Protégé GUI plug-in for manipulating the RDF models. The load scripts for each RDF model would differ and should be handled case by case.
