Parsing XML with SAX (Java parser)
christian
- java
0 fjh0 Thomas J.S.
Hi,
I'm currently working on a project in which I'm building a Java server that parses an XML file and sends it either to a web browser or, where needed, to a C++ parser. The model I'm using is SAX.
Since this server is supposed not only to read this XML file but also to add new entries to it, my question is whether that is possible with SAX at all.
With the DOM model it is possible, but there my entire tree structure gets loaded into individual objects, and since quite a lot of records will accumulate later, I don't know what the performance will look like.
If anyone can give me a few tips, I'd be very grateful...
Regards, Christian
Hello Christian,
Since this server is supposed not only to read this XML file but also to add new entries to it, my question is whether that is possible with SAX at all.
SAX is not designed for manipulating XML documents.
With the DOM model it is possible, but there my entire tree structure gets loaded into individual objects, and since quite a lot of records will accumulate later, I don't know what the performance will look like.
Performance is a concern too, but above all memory, because the DOM parser builds a structure in main memory from your document that is several times the size of the document itself.
If anyone can give me a few tips, I'd be very grateful...
Try it out with your parser and see whether it delivers tolerable response times when you use the DOM interface.
Regards
Franz
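A minimal sketch of such a trial, assuming the JAXP DOM classes that ship with the JDK. The element name entry, its id attribute, and the class name DomAppendSketch are made up for illustration; the real file structure is not shown in this thread. It parses a document into a DOM tree, appends one new entry, writes the result back out, and times the round trip:

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomAppendSketch {

    // Parses the XML text into a DOM tree, appends one <entry> element
    // to the root element, and returns the serialized document.
    static String appendEntry(String xml, String id, String text) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            Element entry = doc.createElement("entry"); // hypothetical element name
            entry.setAttribute("id", id);
            entry.setTextContent(text);
            doc.getDocumentElement().appendChild(entry);

            // An identity transformation serializes the tree unchanged.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("DOM round trip failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<entries><entry id=\"1\">first</entry></entries>";
        long start = System.nanoTime();
        String result = appendEntry(xml, "2", "second");
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(result);
        System.out.println("DOM round trip took " + millis + " ms");
    }
}
```

Running the same measurement against a file of realistic size, while watching the heap, should show fairly quickly whether DOM is tolerable for your data volume.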
Hello Christian,
Since this server is supposed not only to read this XML file but also to add new entries to it, my question is whether that is possible with SAX at all.
Maybe this helps:
IBM developerWorks XML Tip
October 22, 2002, Vol. 2, Issue 43
TIP: USE A SAX FILTER TO MANIPULATE DATA
Change the events output by a SAX stream
Nicholas Chase (nicholas@nicholaschase.com) President, Chase and Chase, Inc.
Hello, XML Tip readers,
The streaming nature of the Simple API for XML (SAX) provides not only an opportunity to process large amounts of data in a short time, but also the ability to insert changes into the stream that implement business rules without affecting the underlying application. This tip explains how to create and use a SAX filter to control how data is processed.
For the rest of the tip, read on below.
Until next week, XML Tip team at IBM developerWorks dWnews@us.ibm.com
Note: This tip uses JAXP. The classes are also part of the Java 2 SDK 1.4, so if you have this version installed, you don't need any additional software. It briefly covers the basics of SAX, but you should already understand the basics of both Java and XML.
<?xml version="1.0"?>
<personnel>
    <employee empid="332" deptid="24" shift="night" status="contact">
        JennyBerman
    </employee>
    <employee empid="994" deptid="24" shift="day" status="donotcontact">
        AndrewFule
    </employee>
    <employee empid="948" deptid="3" shift="night" status="contact">
        AnnaBangle
    </employee>
    <employee empid="1032" deptid="3" shift="day" status="contact">
        DavidBaines
    </employee>
</personnel>
THE BASIC APPLICATION
A SAX application consists of two parts. The main application creates an XMLReader that actually parses the document, sending events such as startElement and endDocument to a content handler. You can send errors to a separate error handler object. The handler objects receive these events and act on them.
import java.io.IOException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

public class MainSaxApp {

    public static void main(String[] args) {
        try {
            // Class name of the SAX driver to load (the Crimson parser used by the tip)
            String parserClass = "org.apache.crimson.parser.XMLReaderImpl";
            XMLReader reader = XMLReaderFactory.createXMLReader(parserClass);

            // Register the objects that receive parse events and errors
            reader.setContentHandler(new DataProcessor());
            reader.setErrorHandler(new ErrorProcessor());

            InputSource file = new InputSource("employees.xml");
            reader.parse(file);
        } catch (IOException ioe) {
            System.out.println("IO Exception: " + ioe.getMessage());
        } catch (SAXException se) {
            System.out.println("SAX Exception: " + se.getMessage());
        }
    }
}
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class DataProcessor extends DefaultHandler {

    public void startElement(String namespaceUri, String localName,
                             String qualifiedName, Attributes attributes) {
        if (localName.equals("employee")) {
            // Constant first, so a missing status attribute does not throw
            if ("contact".equals(attributes.getValue("status"))) {
                System.out.println("Contacting employee " + attributes.getValue("empid"));
                // Implement actual contact here
            }
        }
    }
}
The ErrorProcessor class is trivial, and is included in the source code for this tip. (See Resources related to this tip in Links to other good stuff, below, to download the source code.)
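Since the tip's own ErrorProcessor source is not reproduced here, the following is only a guess at what such a trivial class might look like. It implements the standard org.xml.sax.ErrorHandler interface; the class name ErrorProcessorSketch, the describe helper, and the message format are assumptions for illustration:

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical stand-in for the tip's ErrorProcessor class.
class ErrorProcessorSketch implements ErrorHandler {

    // Formats the severity, position, and message of a parse problem.
    static String describe(String level, SAXParseException e) {
        return level + " at line " + e.getLineNumber()
                + ", column " + e.getColumnNumber() + ": " + e.getMessage();
    }

    @Override
    public void warning(SAXParseException e) {
        System.out.println(describe("Warning", e));
    }

    @Override
    public void error(SAXParseException e) {
        // Recoverable errors (for example validation problems) are only reported.
        System.out.println(describe("Error", e));
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        System.out.println(describe("Fatal error", e));
        throw e; // the document is not well-formed, so stop parsing
    }
}
```

Rethrowing in fatalError means the SAXException surfaces in the catch block of the main application.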
When the application runs, the output includes all of the employees with a status attribute of contact, no matter which department they work in:
Contacting employee 332
Contacting employee 948
Contacting employee 1032
FILTERING THE DATA
So far the application contacts all employees that are listed as on duty regardless of their department, and it works well (or at least, we can hope so!). When you receive a new requirement to contact only employees in a particular department, you have two options:
- Change the content handler and risk all sorts of new bugs
- Change the data that comes to the content handler so that only the appropriate employees are seen as on duty.
Because other requirements are also likely to be added later, it makes more sense to implement them separately.
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

public class DataFilter extends XMLFilterImpl {

}
...
XMLReader reader = XMLReaderFactory.createXMLReader(parserClass);

DataFilter filter = new DataFilter();
filter.setParent(reader);

filter.setContentHandler(new DataProcessor());
filter.setErrorHandler(new ErrorProcessor());

filter.parse("employees.xml");

} catch (IOException ioe) {
...
The application creates the XMLReader as usual, but it is actually the filter that initiates the parse of the file, receiving the events from its parent, the XMLReader. (The parent is assigned here with setParent(); XMLFilterImpl also provides a constructor that takes the parent reader.) It passes the events on to its content handler -- the same DataProcessor object used in the original version.
So far, the filter just passes the events on unchanged, so running the application still produces this:
Contacting employee 332
Contacting employee 948
Contacting employee 1032
...
import org.xml.sax.helpers.AttributesImpl;

public class DataFilter extends XMLFilterImpl {

    public void startElement(String namespaceUri, String localName,
                             String qualifiedName, Attributes attributes)
            throws SAXException {

        // Work on a copy so the parser's own Attributes object stays untouched
        AttributesImpl attributesImpl = new AttributesImpl(attributes);
        if (localName.equals("employee")) {
            if (!"24".equals(attributes.getValue("deptid"))) {
                // Look up status by name instead of relying on it being at index 3
                int statusIndex = attributesImpl.getIndex("status");
                if (statusIndex >= 0) {
                    attributesImpl.setValue(statusIndex, "donotcontact");
                }
            }
        }
        super.startElement(namespaceUri, localName, qualifiedName, attributesImpl);
    }

}
In this case, you're overriding the startElement() method defined in XMLFilterImpl. It still passes on the event, but if the employee is not in department 24, the filter passes it on with an altered Attributes object that lists the employee as do not contact.
The DataProcessor object has no idea that the data has been manipulated. It simply knows that some employees should be contacted and others shouldn't. Processing now produces a different result:
Contacting employee 332
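A filter is not limited to changing events; it can also emit events that were never in the input, which is one way to add new entries to a document while streaming it. The sketch below is not part of the original tip. It assumes the JDK's built-in JAXP parser and identity Transformer, and the class name AppendEmployeeFilter and its appendTo helper are hypothetical. Just before the closing personnel tag it injects one extra employee element, and the identity transformation serializes the resulting event stream:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: emits one extra <employee> just before </personnel>.
class AppendEmployeeFilter extends XMLFilterImpl {
    private final String empid;
    private final String name;

    AppendEmployeeFilter(XMLReader parent, String empid, String name) {
        super(parent);
        this.empid = empid;
        this.name = name;
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if ("personnel".equals(localName)) {
            // Synthesize the events for the new element before closing the root.
            AttributesImpl atts = new AttributesImpl();
            atts.addAttribute("", "empid", "empid", "CDATA", empid);
            super.startElement("", "employee", "employee", atts);
            super.characters(name.toCharArray(), 0, name.length());
            super.endElement("", "employee", "employee");
        }
        super.endElement(uri, localName, qName);
    }

    // Streams the XML text through the filter and returns the serialized result.
    static String appendTo(String xml, String empid, String name) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so that localName is populated
            XMLReader reader = factory.newSAXParser().getXMLReader();
            XMLReader filter = new AppendEmployeeFilter(reader, empid, name);

            Transformer identity = TransformerFactory.newInstance().newTransformer();
            StringWriter out = new StringWriter();
            identity.transform(
                    new SAXSource(filter, new InputSource(new StringReader(xml))),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not append employee", e);
        }
    }
}
```

Because the document is never held in memory as a tree, this approach avoids the DOM memory cost, at the price of only being able to insert at points that can be recognized from the event stream.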
NEXT STEPS
This tip has demonstrated a simple way to alter the processing of a SAX application using an XML filter. In this case, the filter has been pre-determined, but you can build an application to accommodate different situations by choosing filter behavior at run-time. You might accomplish this by replacing the DataFilter class, by passing a parameter at run-time, or even by using a factory to create the filter class in the first place.
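As one possible reading of "passing a parameter at run-time", the department id could become a constructor argument instead of the hard-coded "24". This is only a sketch: the class DeptFilter and its contactable helper are invented for illustration, and it uses the JDK's SAXParserFactory rather than the Crimson class named in the tip.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical variant of DataFilter with the department chosen at run-time.
class DeptFilter extends XMLFilterImpl {
    private final String deptId;

    DeptFilter(XMLReader parent, String deptId) {
        super(parent);
        this.deptId = deptId;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        AttributesImpl copy = new AttributesImpl(atts);
        int status = copy.getIndex("status");
        // Employees outside the chosen department are marked "donotcontact".
        if ("employee".equals(localName) && status >= 0
                && !deptId.equals(atts.getValue("deptid"))) {
            copy.setValue(status, "donotcontact");
        }
        super.startElement(uri, localName, qName, copy);
    }

    // Runs the filter over the XML text and collects the empid of every
    // employee whose status is still "contact" after filtering.
    static List<String> contactable(String xml, String deptId) {
        List<String> ids = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so that localName is populated
            XMLReader filter =
                    new DeptFilter(factory.newSAXParser().getXMLReader(), deptId);
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    if ("employee".equals(localName)
                            && "contact".equals(atts.getValue("status"))) {
                        ids.add(atts.getValue("empid"));
                    }
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Filtering failed", e);
        }
        return ids;
    }
}
```

The department could then come from a command-line argument or a request parameter, for example new DeptFilter(reader, args[0]), without touching the content handler.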
LINKS TO OTHER GOOD STUFF
::: IBM developerWorks XML Zone ::: http://www.ibm.com/developerworks/xml/?nx-10222
::: Resources related to this tip ::: http://www.ibm.com/developerworks/library/x-tipsaxfilter/#resources
::: Full text of this tip on the Web ::: http://www.ibm.com/developerworks/library/x-tipsaxfilter/?nx-10222
::: Index of other XML tips ::: http://www.ibm.com/developerworks/library/x-tips.html?nx-10222