Parsing XML with OOoBasic
Tuesday, 17. April 2007, 09:24:31
First, there is the powerful UNO framework, which lives at the core of OpenOffice.org. UNO is made up of interfaces, services, and methods. The cool thing is the wide array of interfaces that OOoBasic can use and manipulate.
OOoBasic can parse XML in different ways, from SAX, a smaller and simpler stream parser, to DOM, a more in-depth approach based on the Document Object Model. Both are in the XML module (com.sun.star.xml), which offers hundreds of tools for configuring and reconfiguring the code.
So here is the code that I was working with. First I needed an XML file; mine was a simple employee document.
<Employees>
<Employee id="101">
<Name>
<First>John</First>
<Last>Smith</Last>
</Name>
<Phone type="Home">785-555-1234</Phone>
</Employee>
</Employees>
Then here is the first stage of the code. We load the XML through the first interface, the one that deals with external file access:
Sub Main
cXmlFile = "/home/user/tmp/test.xml"
cXmlUrl = ConvertToURL( cXmlFile )
ReadXmlFromUrl( cXmlUrl )
End Sub
We first call the built-in ConvertToURL function, which turns the file path into a file URL, and then call ReadXmlFromUrl, shown next:
Sub ReadXmlFromUrl( cUrl )
oSFA = createUnoService( "com.sun.star.ucb.SimpleFileAccess" )
oInputStream = oSFA.openFileRead( cUrl )
ReadXmlFromInputStream( oInputStream )
oInputStream.closeInput()
End Sub
This routine uses createUnoService to instantiate the SimpleFileAccess service from the API. We then call one of its methods, openFileRead, which opens the file and returns an input stream that we store in a variable and hand to ReadXmlFromInputStream. Finally, we close the stream with closeInput.
The next routine is ReadXmlFromInputStream, the one in charge of reading the XML.
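One thing the routine above does not do is check that the file is really there. A minimal sketch of a more defensive variant could look like this, assuming the exists method of SimpleFileAccess is enough for that check. The name ReadXmlFromUrlChecked is my own invention for illustration, and it still relies on the ReadXmlFromInputStream routine from this post.

```basic
' Hypothetical variant of ReadXmlFromUrl that checks the file before opening it.
Sub ReadXmlFromUrlChecked( cUrl )
  oSFA = createUnoService( "com.sun.star.ucb.SimpleFileAccess" )
  ' exists() reports whether the URL points to an existing file or folder.
  If Not oSFA.exists( cUrl ) Then
    MsgBox "File not found: " & cUrl
    Exit Sub
  End If
  oInputStream = oSFA.openFileRead( cUrl )
  ReadXmlFromInputStream( oInputStream )
  oInputStream.closeInput()
End Sub
```

In Sub Main you would then call ReadXmlFromUrlChecked( cXmlUrl ) instead of ReadXmlFromUrl( cXmlUrl ).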
Sub ReadXmlFromInputStream( oInputStream )
oSaxParser = createUnoService( "com.sun.star.xml.sax.Parser" )
oDocEventsHandler = CreateDocumentHandler()
oSaxParser.setDocumentHandler( oDocEventsHandler )
oInputSource = createUnoStruct( "com.sun.star.xml.sax.InputSource" )
With oInputSource
.aInputStream = oInputStream
End With
oSaxParser.parseStream( oInputSource )
End Sub
This second routine runs the parser itself. First we instantiate the Parser service in a variable called oSaxParser. Then CreateDocumentHandler builds the handler object, and we register it with the parser through setDocumentHandler. Finally, we wrap the input stream in an InputSource struct and pass it to parseStream.
Private goLocator As Object
Private glLocatorSet As Boolean
We declare two module-level variables: goLocator, an Object that will hold the document locator, and glLocatorSet, a Boolean flag that we set to False in CreateDocumentHandler. Next we need to create the listener that implements XDocumentHandler.
Function CreateDocumentHandler()
oDocHandler = CreateUnoListener( "DocHandler_", _
"com.sun.star.xml.sax.XDocumentHandler" )
glLocatorSet = False
CreateDocumentHandler() = oDocHandler
End Function
Finally we have a series of routines, one for each DocumentHandler event, where we can print the different elements of the XML. I commented out the Print statement in all of these handlers except characters, which is the one that receives the content. Unfortunately, Print will report not just the visible content, such as John, but also all the invisible characters, such as spaces, line ends, and tabs.
Sub DocHandler_startDocument()
' Print "Start document"
End Sub
Sub DocHandler_endDocument()
' Print "End document"
End Sub
Sub DocHandler_startElement( cName As String, oAttributes As _
com.sun.star.xml.sax.XAttributeList )
' Print cName
'Print oAttributes.Length
End Sub
Sub DocHandler_endElement( cName As String )
' Print "End element", cName
End Sub
Sub DocHandler_characters( cChars As String )
Print "Content:", cChars
End Sub
Sub DocHandler_ignorableWhitespace( cWhitespace As String )
'Print cWhitespace
End Sub
Sub DocHandler_processingInstruction( cTarget As String, cData As String )
End Sub
Sub DocHandler_setDocumentLocator( oLocator As com.sun.star.xml.sax.XLocator )
goLocator = oLocator
glLocatorSet = True
End Sub
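To get around the whitespace noise, and to actually see the attributes, two of the handlers could be swapped for something like the sketch below. It is only a sketch: IsBlank is a helper name I made up, the two Subs are meant to replace the ones with the same names above rather than sit next to them, and I am assuming the parser really calls setDocumentLocator before the first element, which is why the locator is guarded by glLocatorSet.

```basic
' Hypothetical helper: True when cText holds only spaces, tabs, or line breaks.
Function IsBlank( cText As String ) As Boolean
  Dim i As Long
  Dim cCh As String
  IsBlank = True
  For i = 1 To Len( cText )
    cCh = Mid( cText, i, 1 )
    If cCh <> " " And cCh <> Chr(9) And cCh <> Chr(10) And cCh <> Chr(13) Then
      IsBlank = False
      Exit Function
    End If
  Next i
End Function

' Replacement for DocHandler_characters: skips whitespace-only chunks.
Sub DocHandler_characters( cChars As String )
  If Not IsBlank( cChars ) Then
    Print "Content:", cChars
  End If
End Sub

' Replacement for DocHandler_startElement: prints the element name,
' its attributes, and the line number when a locator is available.
Sub DocHandler_startElement( cName As String, oAttributes As _
    com.sun.star.xml.sax.XAttributeList )
  Dim i As Integer
  Dim cLine As String
  cLine = "Element: " & cName
  For i = 0 To oAttributes.getLength() - 1
    cLine = cLine & " " & oAttributes.getNameByIndex( i ) _
      & "=" & oAttributes.getValueByIndex( i )
  Next i
  If glLocatorSet Then
    cLine = cLine & " (line " & goLocator.getLineNumber() & ")"
  End If
  Print cLine
End Sub
```

Keep in mind that a SAX parser may deliver the text of one element in several chunks, so this filter only hides the chunks that are entirely whitespace; it does not trim text that has content in it.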
I must admit that the process is not very clear yet, but reviewing it, we can see that we want 3 things:
- Invoke the XML parsing service
- Send our output to a window inside OOo