Easy and Efficient XML Processing: Upgrade to JAXP 1.3 - Part 2  
 

(Posted 27/06/2006)

Figure 2. Set Compiled Schema on DocumentBuilder/SAXParserFactory

Validate XML Using Compiled Schema

Before we look at this approach, let's look at how we have been doing schema validation using the schema properties that were defined in JAXP 1.2:

http://java.sun.com/xml/jaxp/properties/schemaLanguage
http://java.sun.com/xml/jaxp/properties/schemaSource

 Here is an example showing how these two properties are used in JAXP 1.3:

SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setNamespaceAware(true);
spf.setValidating(true);
SAXParser sp = spf.newSAXParser();
// Select W3C XML Schema as the schema language
sp.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
               "http://www.w3.org/2001/XMLSchema");
// Point the parser at the schema to validate against
sp.setProperty("http://java.sun.com/xml/jaxp/properties/schemaSource",
               "mySchema.xsd");
sp.parse(<XML Document>, <ContentHandler>);

 The user sets the schemaLanguage and/or the schemaSource property on SAXParser and sets the validation to true. Generally, a business application defines a set of schemas containing the business rules against which XML documents must be validated. To accomplish this, an application sets the schema using the schemaSource property or relies on the xsi:schemaLocation attribute in the instance document to specify the schema location(s).

This approach works well, but there is a tremendous performance penalty: The specified schemas are loaded again and again for every XML document that needs to be validated! However, with the new Validation APIs, an application needs to parse a set of schemas only once. See Figure 2.

After the Compile Schema step, do the following.

SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setSchema(schema);
SAXParser saxParser = spf.newSAXParser();
saxParser.parse(new File("instance.xml"), myHandler);

 Just set the Schema instance on the factory and you are done. There is no need to set the validation to true and no need to set the schemaLanguage or schemaSource property. Validation of XML documents is done against the compiled schema set on the factory. You will be amazed by the performance gain using this approach. Try it yourself.
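To make the compile-once idea concrete, here is a minimal, self-contained sketch. The class name CompiledSchemaSaxDemo, the tiny inline schema, and the helper methods are assumptions made for illustration; they are not taken from the downloadable samples. The schema is compiled a single time, set on the factory, and every parser created from that factory validates against it.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class CompiledSchemaSaxDemo {
    // Hypothetical schema: a single <greeting> element holding a string.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='greeting' type='xs:string'/>"
      + "</xs:schema>";

    /** Compiles the schema once and sets it on a namespace-aware factory. */
    static SAXParserFactory newFactory() {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            spf.setSchema(schema); // no setValidating(true), no schema properties
            return spf;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses one document and returns how many validation errors were reported. */
    static int countErrors(SAXParserFactory spf, String xml) {
        final int[] errors = {0};
        try {
            SAXParser parser = spf.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors[0]++; // validation errors arrive here
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors[0];
    }

    public static void main(String[] args) {
        SAXParserFactory spf = newFactory(); // schema is compiled here, once
        System.out.println(countErrors(spf, "<greeting>hello</greeting>"));
        System.out.println(countErrors(spf, "<farewell>bye</farewell>"));
    }
}
```

Both documents in main are validated without the schema being parsed again; only the factory setup pays the compilation cost.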

Run the sample ComparePerformance.java, which can be downloaded from here. Performance gain largely depends on the ratio of the size of the XML schema to the size of the XML document. Larger ratios lead to a larger performance gain. Look at the Reusing a Parser Instance section to further improve the performance.

Note that it is an error to use either of the following properties:

http://java.sun.com/xml/jaxp/properties/schemaLanguage
http://java.sun.com/xml/jaxp/properties/schemaSource

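 in conjunction with a non-null Schema object. On a SAXParser, setting either property in this configuration causes a SAXException. For DocumentBuilderFactory, the Javadoc specifies that a ParserConfigurationException is thrown when newDocumentBuilder() is invoked.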
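The conflict can be observed with a short sketch like the one below. It assumes the JDK's bundled parser implementation; the class name SchemaPropertyConflictDemo and the inline schema are hypothetical. A Schema is set on the factory, and the code then tries to set the JAXP 1.2 schemaLanguage property on the resulting SAXParser.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class SchemaPropertyConflictDemo {
    static final String SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";

    // Hypothetical schema used only to obtain a non-null Schema object.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='greeting' type='xs:string'/>"
      + "</xs:schema>";

    /** Returns true if the JAXP 1.2 property is rejected once a Schema is set. */
    static boolean legacyPropertyRejected() {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            spf.setSchema(schema);
            SAXParser parser = spf.newSAXParser();
            try {
                parser.setProperty(SCHEMA_LANGUAGE, XMLConstants.W3C_XML_SCHEMA_NS_URI);
                return false; // the property was accepted
            } catch (SAXException expected) {
                return true;  // mixing both mechanisms was refused
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + legacyPropertyRejected());
    }
}
```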

Validate a SAXSource or DOMSource

As we mentioned earlier, there has been a fundamental shift in XML parsing and validation. XML validation is now considered a process independent from XML parsing. Once you have the Schema instance loaded into memory, you can do many things. You can create a ValidatorHandler that can validate a SAX stream or create a stand-alone Validator (see Figure 3). A stand-alone Validator can validate a SAXSource, a DOMSource, or an XML document against any schema. In fact, a Validator can still work if the SAX stream or DOM object comes from a different implementation.

Figure 3. Validate a SAXSource or DOMSource Using a Validator

To receive any errors during the validation, an ErrorHandler should be registered with the Validator. Let's look at some working code. (Note: For clarity, only a section of code is shown here. For the complete source, look at the sample Validate.java, which can be downloaded here.)

Validator validator = schema.newValidator();
validator.setErrorHandler( new ErrorHandlerImpl());
validator.validate(new StreamSource(<XML Document>));
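For readers without the downloadable sample at hand, the following is a minimal sketch of the same pattern. The class name ValidatorErrorHandlerDemo, the inline schema, and the list-collecting handler are assumptions for illustration and stand in for the article's ErrorHandlerImpl, whose source is not shown here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class ValidatorErrorHandlerDemo {
    // Hypothetical schema: <count> must contain an integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='count' type='xs:int'/>"
      + "</xs:schema>";

    /** Validates the document and returns every problem the Validator reported. */
    static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    problems.add("warning: " + e.getMessage());
                }
                public void error(SAXParseException e) {
                    problems.add("error: " + e.getMessage()); // keep validating
                }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // not well-formed: give up
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(validate("<count>42</count>"));
        System.out.println(validate("<count>forty-two</count>"));
    }
}
```

Because the handler records errors instead of throwing, validation continues past the first problem and the caller decides what to do with the full list.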

Validator can also be used to validate the instance document or DOM object in memory, with the augmented result sent to DOMResult.

Document document = //DOM object
validator.validate(new DOMSource(document), new DOMResult());
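A fuller version of the DOM case might look like the sketch below. The class name DomValidationDemo and the inline schema are hypothetical, and the DOMResult is given an explicit empty target Document so that it is clear where the validated copy ends up. No ErrorHandler is registered, so any validation error surfaces as a SAXException.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class DomValidationDemo {
    // Hypothetical schema: a single <greeting> element holding a string.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='greeting' type='xs:string'/>"
      + "</xs:schema>";

    /**
     * Parses the XML into a DOM, validates it in memory, and returns the
     * root element name found in the result document. Invalid input is
     * reported as an IllegalStateException wrapping the SAXException.
     */
    static String validatedRootName(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // schema validation needs namespaces
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document document = db.parse(new InputSource(new StringReader(xml)));

            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();

            Document target = db.newDocument(); // receives the validated output
            validator.validate(new DOMSource(document), new DOMResult(target));
            return target.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(validatedRootName("<greeting>hello</greeting>"));
    }
}
```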

The Validation APIs can validate a SAX stream and work in conjunction with Transformation APIs to achieve pipeline processing, as we will see in the next section.

 Validate XML After Transformation

Transformation APIs are used to transform one XML document into another by applying a style sheet. There are times when we need to validate the transformed XML document against a schema. Should we feed that XML document to a parser and then use the schema feature to do the schema validation? No. The new Validation APIs give you the power to validate the transformed XML document against a different schema by allowing the application to create a pipeline and pass the output of a transformer to the Validation APIs to validate against the desired schema. It doesn't matter if the output of the transformation is a SAX stream or a DOM in memory.

Validate a SAX Stream

The following code snippet shows you how to use the specially designed javax.xml.validation.ValidatorHandler to validate a SAX stream. In the downloadable source, look at the sample ValidateSAXStream.java for more detail. Also look at the sample TransformerValidationHandler.java, which shows how to chain the output of Transformer to ValidatorHandler. Here is a section of the code:

// Compile the schema and create a handler that validates SAX events
String language = XMLConstants.W3C_XML_SCHEMA_NS_URI;
SchemaFactory sf = SchemaFactory.newInstance(language);
Schema schema = sf.newSchema(new File(<SCHEMA>));
ValidatorHandler vh = schema.newValidatorHandler();
vh.setErrorHandler(new ErrorHandlerImpl());
// Validated events are forwarded to the application's handler
vh.setContentHandler(new ApplicationContentHandler());
TransformerFactory tf = TransformerFactory.newInstance();
StreamSource ss = new StreamSource(<STYLESHEET>);
Transformer t = tf.newTransformer(ss);
StreamSource xml = new StreamSource(<XML DOCUMENT>);
// Send the transformation output straight into the ValidatorHandler
t.transform(xml, new SAXResult(vh));

Figure 4 shows the whole flow, with an XML document and a style sheet given as input to a Transformer and a SAX stream as the output. We take advantage of the modular approach of doing validation independent from parsing. The ValidatorHandler is a special handler that is capable of working directly with a SAX stream. It validates the stream and passes it to the application.
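The flow described above can be tried end to end with a small sketch such as this one. The class name TransformThenValidateDemo, the inline style sheet and schema, and the single DefaultHandler that plays both the error-handler and application-handler roles are all assumptions for illustration; they are not the code of TransformerValidationHandler.java.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.ValidatorHandler;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class TransformThenValidateDemo {
    // Hypothetical schema for the transformed output: <greeting> holds an integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='greeting' type='xs:int'/>"
      + "</xs:schema>";

    // Hypothetical style sheet: rewrites <hello> as <greeting>.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/hello'>"
      + "<greeting><xsl:value-of select='.'/></greeting>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Transforms the input, validates the output stream, and logs what happened. */
    static List<String> run(String xml) {
        final List<String> log = new ArrayList<>();
        // One handler receives both validation errors and validated content.
        DefaultHandler app = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("element: " + qName);
            }
            @Override
            public void error(SAXParseException e) {
                log.add("error: " + e.getMessage());
            }
        };
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            ValidatorHandler vh = schema.newValidatorHandler();
            vh.setErrorHandler(app);
            vh.setContentHandler(app);

            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.transform(new StreamSource(new StringReader(xml)), new SAXResult(vh));
        } catch (SAXException | TransformerException e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run("<hello>42</hello>"));
        System.out.println(run("<hello>World</hello>"));
    }
}
```

The transformed document is never serialized or reparsed: the Transformer emits SAX events, the ValidatorHandler checks them against the schema, and the application handler sees the same events afterwards.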

Figure 4. Validating a SAX Stream

(Continued)

Neeraj Bajaj


 
 

 
     
 