DAML Validator
--------------
$Id: Readme.txt,v 1.25 2003/08/20 18:59:22 drager Exp $

This application version of the DAML Validator is made available for
personal use and for comments.

Latest version and web based validator available at
http://www.daml.org/validator/index.html

Along with Java, you will also need the xerces XML parser (xerces.jar)
available from http://xml.apache.org/
The xerces.jar file can be found as part of the xerces-j tools bundle in
the download section of the site.

What's Included:
----------------
preferences.xml     Preferences file. Defines outdated URIs and the
                    cache files.
validator.dtd       DTD file for the preferences file and the XML output.
html.xsl            XSLT file for generation of html from validator XML
                    output.
cache/              Cache of commonly referenced files.
org/                Contains the source code for the DAML Validator.
doc/                Contains javadoc documentation.
lib/                Contains required java libraries.
lib/validator.jar   The JAR archive of DAML Validator classes.
validate.bat        Sample batch file for running the DAML Validator.
validate-html.bat   Sample batch file for running the DAML validator and
                    generating output as html.

Standalone Command Line Usage
-----------------------------
Setup:
  All necessary java library files (.jar) are included in the lib
  directory. Add those files to your CLASSPATH.
  Note: under Java 1.4 the order of the jar files can cause problems.
  Please use the ordering that is used in validate.bat.

Run:
  On web pages or local files:
    java org.daml.validator.Validator http://www.daml.org/2001/01/gedcom/gedcom.daml
    java org.daml.validator.Validator file:///pathname/test.daml

  The validate.bat file is included for the convenience of Windows users.

API Usage
---------
The ValidatorAPI class provides an API to the DAML validator. You can add
DAML files to a backing model that will be used to validate against. After
creating the backing model you can validate multiple DAML files/models
against it.
DAML ontologies/content can be provided as RDF Models or as URLs to DAML
files. Results are returned as Indication objects that contain pointers
into the RDF Statements that triggered the indication. See the
documentation provided in doc/ for more information.

Cache maintenance
-----------------
The validator provides a cache mechanism that allows you to create a cache
of files that are read locally whenever the validator needs to load their
URL. The cache relationships (http URL -> file URL) are defined in the
preferences file. The validator now includes code to help maintain the
cache. Run 'java org.daml.validator.Cache' to see usage information and
examples.

Notes
-----
The Validator looks for the preferences.xml file in the directory where it
is being run. You can override this using the '-pref' command line
argument.
Note for Windows users: The java XML parser can't find the file if you use
the drive letter in the pathname.

The DTD for the preference file and for xml output is located at
http://www.daml.org/validator/validator.dtd

The validator attempts to access and load all namespaces referenced in the
file being validated, so a web connection is recommended. It does this
under the false assumption that a URI refers to an actual file. Resources
whose URIs do not refer to an actual file will produce Undefined Resource
Warnings.

Using the preferences.xml file you can tell the validator to load a local
file when it encounters a specific URI. See the way the cache files are
referenced in the included preferences.xml file.

The current preferences.xml uses relative pathnames for the cache files.
If you plan to run the validator from a directory other than the
validator directory, you will need to make the pathnames absolute.

Another note about drive letters on Windows: Using a drive letter in a
file URL (eg. file://d:/file.daml) does not work as expected.
For some reason Java completely ignores the specified drive letter in file
URLs and uses the current drive letter instead. (So the above URL would be
treated the same as file://c:/file.daml and file:///file.daml.)

Note: I believe there is a problem with the DAML+OIL language file.
sameClassAs is defined with a domain and range of DAML#Class, but is used
to equate DAML and RDFS classes. Since DAML#Class is a subClassOf
RDFS#Class, the RDFS classes in the sameClassAs statements do not meet the
domain or range restrictions on sameClassAs. That's why you get
indications when you run the validator on the DAML+OIL language file.

Feedback
--------
David Rager (BBN Technologies)
drager@bbn.com

Change Log [Most recent at bottom]
----------------------------------
16 Aug 2001
- "-report" command line option now stops all processing after finding
  the specified number of indications. Default is 100.
- "java org.daml.validator.Cache" cache maintenance functions added.
- Fixed bug related to undefined resource errors in referenced files.

17 Aug 2001
- Added indication for when URI being validated contains no RDF.
- Fixed output bug when a URIIndication has a call stack of length 1.
- Remembered to update version number.

21 Aug 2001
- Changed code so that the original RDF-API can be used without producing
  an error. But source URI and line numbers will not be printed in error
  messages.
- Created a Validator API that can be called from other java code.
- Updated javadoc for API and indications.

06 Sep 2001
- Fixed bug in hasClass restriction test that caused false indications.
- Fixed bug dealing with domain and range constraints.
- Fixed message associated with various restriction indications where it
  would display null as the object type.
- Added Datatype Verification using code written by Mark Neighbors using
  the Oracle XML Schema processor.
- Added initial datatype verification in the following cases:
  - range constraints on predicates
  - Restrictions with range constraints

01 Oct 2001
- Fixed bug where Domain and Range Indications printed incorrect type for
  the found class.
- Fixed a bug in the underlying model that probably caused some
  indications to not be reported.
- Improved Datatype validation
  - for domain, range, and hasClass restrictions for both typed and
    untyped literals.
- Added validation of typed literals.
- Datatype definitions added as subClasses of daml:Datatype in the
  underlying model.
- Added special case testing for rdfs:Restriction and rdfs:Literal for
  when they are used in restrictions.
- Checks samePropertyAs to ensure the domain and range are properties.
- Checks sameClassAs to ensure the domain and range are classes.
- Checks subClassOf to ensure the domain and range are classes.

17 Oct 2001
- Created a set of example/test files for the validator.
- IndicationList.xls lists the indications generated by the validator
  sorted by validation 'phase'. Includes description of the phases,
  indication causes, and pointers to example files.
- Expanded TODO/bug list.
- Fixed bugs in the DAMLModel that prevented some problems from being
  discovered during validation.
- Added property node testing.
- Additional bugs in the validator.

29 Oct 2001
- Created IndicationList.xls that lists the Indications generated by the
  validator, including descriptions.
- Allows undefined Properties to be used. An implicit property can be
  found as a statement predicate, or as the object of an onProperty,
  subPropertyOf, or samePropertyAs statement. When an undefined property
  is encountered, the property is explicitly defined in the model, and an
  informational indication is generated.
- Implemented regression testing using the files in the examples
  directory (download version only). test-setup.bat will generate result
  files that are considered the correct results.
  Running test.bat at a later time will compare the current results with
  the files generated by test-setup.bat and print out the differences.
- Updated API functions so they are in sync with the command line
  validator. To add XML datatype definition (xsd) files use the
  addURL(url) function.
- Added validation on the use of ObjectProperty and DatatypeProperty.
  Includes restrictions on subPropertyOf and samePropertyAs. See
  IndicationList.xls for more information.
- Added default definitions for undefined properties to the model (so
  they don't trigger invalid resource indications). If a property is used
  that is not explicitly defined, an explicit definition is generated in
  the model and an Informational Indication is generated. The validator
  detects implicit properties used as the property of a statement, and as
  the object of an onProperty, subPropertyOf, or samePropertyAs statement.
- SAXParseExceptions now print line and column information.

25 Feb 2002
- Domain and range restrictions were not inherited by subproperties
  because inheritance using subPropertyOf was not used. Fixed.
- Added vocabulary for www.w3.org/2001/10/daml+oil

13 March 2002:
- No longer reports incorrect domain indications regarding Lists because
  the validator does not properly handle Lists.
- Improved the reporting of expected type in domain and range indications.
- Added testing that removes the falsely reported undefined resource
  indication when a referenced file makes reference to a resource defined
  in the file being validated.
- Determines which version of DAML is being used, and uses that to
  generate additional statements (eg: for implied predicates).
- Prints warning if you try to use the w3 strawman specification for
  DAML+OIL (requires updated preference.xml file).
- Reworded the OutdatedURIIndication to use the phrasing "preferred URI."

9 April 2002:
- Updated Oracle XDT libraries (used in XML Schema datatype validation).
- Migrated to Java v1.4 - should still work under v1.3.1.

16 April 2002:
- Slightly modified XML output and preference file format. (old files
  should still work)
- Fixed bug in validate-html.bat that produced errors when doing xslt.
- Moved all jar files to a lib directory in the downloadable validator.

5 August 2002:
- Migrated to Jena-1.5.0. The Validator no longer uses RDF-API as its
  RDF Parser. It now uses the Jena toolkit (v 1.5.0).
  Gained functionality: Able to parse more files correctly, including
  daml:collection parseTypes.
  Lost functionality (for now?): No longer reports file position
  corresponding to a statement.
  Also, the Jena toolkit returns many more warnings about the parsed file
  (in the command line version).

8 August 2002:
- Fixed some bugs brought about by the Jena changeover.
- Updated the example files, added regression-test.bat.

12 March 2003:
- Updated to Jena 1.6.1 (modified to store locations)
- Implemented "recurse" parameters used to control loading of referenced
  and imported URIs.
- Reports the URIs that were loaded to support validation.
- Suppress warnings from Jena 1.6.1 dealing with rdf:datatype being
  unsupported in Jena1.
- Fixed bug dealing with checking the type of special classes.
- Downgraded Domain and Range indications to WARNINGS and changed the
  wording to indicate the use of a property implies the object of the
  statement is in the range class.
- Cleaned up output during processing (standalone).
- Improved display of Indication messages in the html output.
- Added recurseNS, maxRecurseNS, recurseImports, and maxRecurseImports as
  new preferences to control what gets loaded to support validation. If
  recurseNS is true then referenced namespaces get automatically loaded.
  If recurseImports is true then imported files get loaded. This applies
  to the file being loaded as well as the supporting files. The 'max'
  properties control how far this process recurses by limiting the depth.
  A value of 1 means only load files referenced in the file being
  validated.
  The default values are true, 1, true, 5. A value of -1 means infinite.
- Added command line arguments to set properties. The values override
  what is defined in the preference file. The format is
  -<property>:<value>. To set maxRecurseNS to 2 use "-maxRecurseNS:2".
- Now loads DAML+OIL, RDF, RDFS and RDF Schema automatically for
  supporting validation.

13 March 2003
- Fixed bug where some incorrect Undefined Resources were reported.
- Reports an indication if a namespace is used both with and without a
  file extension.

18 March 2003
- Added undocumented command line arguments -VStmts and -OStmts that
  print out all the triples in the validation model and the supporting
  (Overall) model.
- Added -file command line argument that better handles how to validate
  local files as if they were at another URI. To validate a local file
  "file.daml" as if it were "http://test.org/file.daml" use:
    Validate -file file.daml http://test.org/file.daml
- Correctly reports that any Resource is an instance of daml:Thing
- Fixed bug: Errors that occur while loading supporting URIs are not
  reported.

20 August 2003
- Incorporated some code changes from the OWL Validator into the base
  model that supports validation. This includes handling of oneOf, union,
  and intersection.
- When testing restrictions, it now takes into account implied type
  statements from parent and equivalent classes and implied property
  statements from parent and equivalent hasValue restrictions.
  Eg. If (X type C1) and (C1 sameAs C2) then this implies (X type C2).
  Also if (C1 sameAs R1), (R1 type Restriction), (R1 onProperty P),
  (R1 hasValue Y) then this implies (X P Y).
- When testing cardinality, explicit equivalence statements are taken
  into account. Previously equivalent items were each counted separately.

TODO:
- UniqueProperty
- differentIndividualFrom
- UnambiguousProperty
- Resolve duplication of domain/range indications caused by inheritance.
  Necessary?
- When running locally, force full pathnames for file: URIs
- Suppress Range Type Mismatch message generated by
- Suppress Undefined Resource Indications caused by namespace load
  errors(?)
- Suppress statement indications caused by undefined resources(?)
- Combine the internal and external resource checking code.
- Allow RDF-API to access https URLs.
- Streaming parser.
- Update list of outdated URLs in default preference file.
- Handle java.net.ConnectException errors during load phase.
- Print the found type when creating domain and range errors (this used
  to work).
- Datatype validator is not rejecting some datatypes that are siblings in
  the datatype hierarchy (eg. int and nonNegativeInteger). see st10.daml
- Check to see if a non-property class is being used as a predicate.
  st11.daml
- Provide mechanism for parsing/converting daml:collections
- instead of using rdf:resource. What does it mean? Produces an empty
  anonymous class!
- Should "Thing" and "Nothing" be explicitly added to the model?
- Allow complementOf multiple classes as long as they are equivalent.
- Allow proper handling of boolean class expressions in validation (for
  example in a range constraint).
- Fix hasValue restrictions so they work with typed literals. (see
  in3.daml and in4.daml)
- Fix toClass restrictions so they work with typed literals. (see
  in1.daml)
- What type of validation is needed for restriction nodes?
- Investigate why some daml:Class and rdfs:Class are not considered
  equivalent. (daml+oil-ex.daml)
- Investigate using Jena instead of RDF-API.
- Add example files to web page.
- Test for duplicate type statements caused by instead of using .
- Import URIs are causing undefined resource indications.
- Datatypes are not being recognized as classes in toClass properties of
  restrictions.
- Defining a property w/ domain and range constraints where the domain
  class does not exist produces a residual indication for the range.
- Unexpected indication generated when namespace prefix contains two #s:
    [7] ERROR - Range Type Mismatch: Expected object to be of type
        http://www.daml.org/experiment/ontology/elements-ont#ElementOfNationalPower
        In file:/daml/experiment/instance/example1.daml#
        At triple("file:/daml/experiment/instance/example1.daml#option1",
                  "http://www.daml.org/experiment/ontology/options-ont##element",
                  "file:/daml/experiment/instance/example1.daml#bridge1")
        near line 89, column 6
- Detect when Restriction nodes are not used within a subClassOf element.
- Suppress or mark Indications about resources that could not be found
  (domain/range errors because no type to the object).
- Add additional Vocabulary for older versions of RDF and RDFS.
- Correctly handle things when the user includes the file extension.
  (eg. http://www.ksl.stanford.edu/projects/DAML/Proof/query-answer.daml)
- TODO for DAMLModelImpl (make standalone)
  - Convert to Jena model
  - Remove preferences from DAMLModelImpl
  - Integrate DAMLDatatypeValidator with DAMLModelImpl
  - Incorporate parseHelper to allow loading of URLs
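
Example: drive letters in file URLs
-----------------------------------
The Notes section reports that a drive letter in a file URL such as
file://d:/file.daml is ignored. One likely explanation, offered here as a
sketch rather than a confirmed diagnosis of the validator, is standard URL
syntax: the text between "//" and the next "/" is the host part, so "d:"
is read as a host named "d" with an empty port, and only /file.daml is
left as the path. The class and method names below (FileUrlDemo,
hostAndPath) are illustrative and not part of the validator. The example
uses only java.net.URI, which is available from Java 1.4 onward.

```java
import java.net.URI;

class FileUrlDemo {
    /** Returns "host|path" for a URI string so the two parts can be compared. */
    static String hostAndPath(String spec) {
        URI uri = URI.create(spec);
        String host = uri.getHost();
        return (host == null ? "(none)" : host) + "|" + uri.getPath();
    }

    public static void main(String[] args) {
        // Two slashes: "d:" lands in the host position, so the drive letter
        // never reaches the path.
        System.out.println(hostAndPath("file://d:/file.daml"));   // d|/file.daml

        // Three slashes: the host is empty and the drive letter stays in the path.
        System.out.println(hostAndPath("file:///d:/file.daml"));  // (none)|/d:/file.daml
    }
}
```

Whether the validator accepts the three-slash form on Windows has not
been verified here; treat it as something to try, not as a documented fix.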