Open Life

Opendocuments, Web Office, Office suites

Parsing XML with OOoBasic

OpenOffice.org ships with a full IDE and a language, OOoBasic, that may look like a toy language but is not one. This weekend I have been reviewing a lot of OOoBasic code and found that it is a powerful scripting language. One of the things that shows the power of a very high-level language like OOoBasic is parsing an XML file. Since so much of OOo is built on XML, it seems fitting that OOo could read XML and configure itself from it.

First there is the powerful UNO framework, which lives in the inner workings of OpenOffice.org. UNO is made up of interfaces, services and methods. The cool thing is the wide array of interfaces that OOoBasic can use and manipulate.

OOoBasic can parse XML in different ways, from SAX, which is a smaller and simpler stream parser, to DOM, which is a more in-depth approach based on the Document Object Model. Both live in the XML module (com.sun.star.xml), alongside many other tools for reading and rewriting XML.
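The rest of this post uses SAX, so for contrast here is a minimal sketch of the DOM route. It assumes the com.sun.star.xml.dom.DocumentBuilder service is available in your installation and reuses the employee document and file path from this post. The routine name ReadXmlWithDom is made up for illustration.

```basic
Sub ReadXmlWithDom
   ' Assumed path, same as the SAX example; adjust it to your system
   cXmlUrl = ConvertToURL( "/home/user/tmp/test.xml" )

   ' DocumentBuilder loads the whole document into a tree in memory
   oBuilder = createUnoService( "com.sun.star.xml.dom.DocumentBuilder" )
   oDoc = oBuilder.parseURI( cXmlUrl )

   ' Collect every <Employee> element below the root
   oEmployees = oDoc.getDocumentElement().getElementsByTagName( "Employee" )
   For i = 0 To oEmployees.getLength() - 1
      oEmployee = oEmployees.item( i )
      ' The text of <First> is stored in a child text node
      oFirst = oEmployee.getElementsByTagName( "First" ).item( 0 )
      Print oEmployee.getAttribute( "id" ), oFirst.getFirstChild().getNodeValue()
   Next i
End Sub
```

With DOM you ask the tree for what you want. With SAX the parser calls you back as it reads, which uses less memory but leaves the bookkeeping to you.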

So here is the code that I was working with. First I needed an XML file; mine was a simple employee document.
<Employees>
   <Employee id="101">
       <Name>
          <First>John</First>
          <Last>Smith</Last>
       </Name>
       <Phone type="Home">785-555-1234</Phone>
   </Employee>
</Employees>


Then here is the first stage of the code. We basically load the XML through the first interface, which is the one that deals with external file manipulation:
Sub Main
   cXmlFile = "/home/user/tmp/test.xml"
   
   cXmlUrl = ConvertToURL( cXmlFile )
   
   ReadXmlFromUrl( cXmlUrl )
End Sub


We first call the built-in ConvertToURL function, which turns the path to the file into a file URL, and then call the routine ReadXmlFromUrl, which is shown next:

Sub ReadXmlFromUrl( cUrl )
   oSFA = createUnoService( "com.sun.star.ucb.SimpleFileAccess" )
   oInputStream = oSFA.openFileRead( cUrl )
   ReadXmlFromInputStream( oInputStream )
   oInputStream.closeInput()
End Sub


This routine uses createUnoService to instantiate the SimpleFileAccess service from the API. We then call one of its methods, openFileRead, which opens the file and returns an input stream that we keep in a variable and pass to ReadXmlFromInputStream. Finally we close the file with closeInput.
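As written, the routine assumes the file exists and the XML is well formed. A more defensive variant might look like the sketch below. The name ReadXmlFromUrlChecked and the message texts are my own. It relies on the exists method of SimpleFileAccess and on Basic's On Error handling, and it assumes the parser reports malformed XML as a runtime error that On Error can catch.

```basic
Sub ReadXmlFromUrlChecked( cUrl )
   oSFA = createUnoService( "com.sun.star.ucb.SimpleFileAccess" )

   ' Stop early with a readable message if the file is not there
   If Not oSFA.exists( cUrl ) Then
      MsgBox "File not found: " & ConvertFromURL( cUrl )
      Exit Sub
   End If

   oInputStream = oSFA.openFileRead( cUrl )

   ' Jump to the handler if parsing raises an error (for example malformed XML)
   On Error GoTo ParseFailed
   ReadXmlFromInputStream( oInputStream )
   On Error GoTo 0
   oInputStream.closeInput()
   Exit Sub

ParseFailed:
   MsgBox "Could not parse " & cUrl & ": " & Error$
   oInputStream.closeInput()
End Sub
```

ConvertFromURL is the counterpart of ConvertToURL and turns the file URL back into a system path for the message.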

The next routine is ReadXmlFromInputStream; this is the one in charge of reading the XML.

Sub ReadXmlFromInputStream( oInputStream )
   oSaxParser = createUnoService( "com.sun.star.xml.sax.Parser" )
   oDocEventsHandler = CreateDocumentHandler()
   oSaxParser.setDocumentHandler( oDocEventsHandler )
   oInputSource = createUnoStruct( "com.sun.star.xml.sax.InputSource" )
   With oInputSource
      .aInputStream = oInputStream 
   End With
   oSaxParser.parseStream( oInputSource )
End Sub


This is the second routine, the one that reads the XML and runs the parser itself. First we create the Parser service and keep it in a variable called oSaxParser. Then we build a document handler with CreateDocumentHandler and register it on the parser with setDocumentHandler. Finally we wrap the input stream in an InputSource struct and hand it to parseStream.

Private goLocator As Object
Private glLocatorSet As Boolean


We declare goLocator as an Object and glLocatorSet as a Boolean flag, which we later set to False in CreateDocumentHandler. First we need to create the listener for the XDocumentHandler interface.

Function CreateDocumentHandler()
   oDocHandler = CreateUnoListener( "DocHandler_", _
                                    "com.sun.star.xml.sax.XDocumentHandler" )
   glLocatorSet = False
   CreateDocumentHandler() = oDocHandler
End Function
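The locator is stored but never used in this post. One possible use, sketched here with a hypothetical helper called CurrentPosition, is to report where in the file the parser currently is. It assumes the parser has already called setDocumentLocator and falls back to a placeholder text otherwise.

```basic
Function CurrentPosition() As String
   ' goLocator is only valid once the parser has handed it over
   If glLocatorSet Then
      CurrentPosition = "line " & goLocator.getLineNumber() & _
                        ", column " & goLocator.getColumnNumber()
   Else
      CurrentPosition = "unknown position"
   End If
End Function
```

Any of the handlers could then do something like Print "End element", cName, CurrentPosition() while debugging.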


Finally we have a series of routines, one for each callback of the DocumentHandler, where we print out the different parts of the XML. By default I comment out all of these handlers except for characters, which is the one that receives the content. Unfortunately Print will report not just the visible content, such as John, but also all the invisible characters such as spaces, line ends and tabs.


Sub DocHandler_startDocument()
'   Print "Start document"
End Sub

Sub DocHandler_endDocument()
'   Print "End document"
End Sub

Sub DocHandler_startElement( cName As String, oAttributes As _
                             com.sun.star.xml.sax.XAttributeList )
'   Print cName
'   Print oAttributes.Length
End Sub

Sub DocHandler_endElement( cName As String )
'   Print "End element", cName
End Sub

Sub DocHandler_characters( cChars As String )
   Print "Content:", cChars
End Sub 

Sub DocHandler_ignorableWhitespace( cWhitespace As String )
'	Print cWhitespace
End Sub

Sub DocHandler_processingInstruction( cTarget As String, cData As String )
End Sub


Sub DocHandler_setDocumentLocator( oLocator As com.sun.star.xml.sax.XLocator )
   goLocator = oLocator
   glLocatorSet = True
End Sub
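Two of these handlers can do more. The sketch below shows alternative versions of startElement and characters that would replace the ones above: the first lists each element's attributes, and the second skips the whitespace-only chunks mentioned earlier. IsBlank is a helper I made up for this, not part of the API. Note that SAX may deliver the text of one element in several chunks, so this is only a starting point.

```basic
' Hypothetical helper: True when the chunk holds only spaces, tabs or line breaks
Function IsBlank( cText As String ) As Boolean
   IsBlank = True
   For i = 1 To Len( cText )
      Select Case Asc( Mid( cText, i, 1 ) )
         Case 9, 10, 13, 32
            ' Tab, line feed, carriage return or space: keep looking
         Case Else
            IsBlank = False
            Exit Function
      End Select
   Next i
End Function

Sub DocHandler_startElement( cName As String, oAttributes As _
                             com.sun.star.xml.sax.XAttributeList )
   ' Build one line per element, such as: Employee id=101
   cLine = cName
   For i = 0 To oAttributes.getLength() - 1
      cLine = cLine & " " & oAttributes.getNameByIndex( i ) & _
              "=" & oAttributes.getValueByIndex( i )
   Next i
   Print cLine
End Sub

Sub DocHandler_characters( cChars As String )
   ' Skip the indentation and line breaks between elements
   If Not IsBlank( cChars ) Then
      Print "Content:", cChars
   End If
End Sub
```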

I must admit that the process is not very clear yet, but on review we can see that we want 3 things:
  1. Invoke the XML parsing service
  2. Send our call to a window inside OOo
