XML Basics

From Starsonata Wiki

XML stands for eXtensible Markup Language. It is a language that, on its own, doesn't actually do anything; it is used for storing and transferring data. XML has no preset tags, and the syntax is quite simple to understand.

The following is a simple bit of (made up) XML for, say, two accounts for something.


<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!--Begin listing accounts-->
<ACCOUNTLIST>
   <ACCOUNT>
      <NAME>Mr. Foo</NAME>
      <SETTINGS savePassword="1" autoLogin="1"/>
      <ALIASLIST>
         <ALIAS1>Mr. Foo</ALIAS1>
         <ALIAS2>Foobar</ALIAS2>
      </ALIASLIST>
   </ACCOUNT>
   <ACCOUNT>
      <NAME>CandyMan</NAME>
      <SETTINGS savePassword="1" autoLogin="0"/>
      <ALIASLIST>
         <ALIAS1>CandyMan</ALIAS1>
         <ALIAS2>McCandyAndy</ALIAS2>
      </ALIASLIST>
   </ACCOUNT>
</ACCOUNTLIST>


Now let's break that down. First of all, we can see that XML is made up of named markers called tags, which contain the data, so let's take a closer look at these.

The first tag is the ACCOUNTLIST tag. It is the root tag of this piece of XML: the tag that everything else comes under. XML files must have exactly one root tag.

The second tag is the ACCOUNT tag. This tag holds all the data for one specific account. A bit like how every XML file has a root tag, this is the 'root' for that account.

The third tag, NAME, is different from the first two: it contains data, in this case the text string 'Mr. Foo', and it is also closed straight away on the same line, unlike the first two.
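To see how a program views this nesting, here is a minimal sketch using Python's standard xml.etree.ElementTree module. Python is only an assumption for illustration (it is not what the game uses), and the XML is a trimmed copy of the made-up account list above.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the made-up account list from the example.
ACCOUNTS_XML = """
<ACCOUNTLIST>
   <ACCOUNT>
      <NAME>Mr. Foo</NAME>
   </ACCOUNT>
   <ACCOUNT>
      <NAME>CandyMan</NAME>
   </ACCOUNT>
</ACCOUNTLIST>
"""

# Parsing returns the root tag; everything else hangs off it.
root = ET.fromstring(ACCOUNTS_XML)
root_tag = root.tag

# Iterating over a tag yields its direct children, here the two ACCOUNT tags.
child_tags = [child.tag for child in root]

# The data between <NAME> and </NAME> is available as .text.
names = [account.find("NAME").text for account in root]

print(root_tag)    # ACCOUNTLIST
print(child_tags)  # ['ACCOUNT', 'ACCOUNT']
print(names)       # ['Mr. Foo', 'CandyMan']
```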

The fourth tag has something new in it: 'savePassword="1" autoLogin="1"'. savePassword and autoLogin are what's known as attributes. These set information about a specific tag, or can be used instead of child tags for neatness. Each attribute has its own value, in this case both are 1. Attribute values must always be enclosed in quotes. Double or single quotes can be used, but don't swap halfway through your XML. Double quotes are the norm.
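As a rough illustration of how attributes look to a program, the sketch below reads them with Python's standard xml.etree.ElementTree module (an assumption chosen for the example, not something the game itself uses). Note that attribute values always arrive as text, so the "1" has to be compared or converted by whoever reads it.

```python
import xml.etree.ElementTree as ET

# The SETTINGS tag from the first account in the example.
settings = ET.fromstring('<SETTINGS savePassword="1" autoLogin="1"/>')

# Attributes are exposed as a plain dictionary of strings.
attributes = dict(settings.attrib)

# Values are text, so "1" must be interpreted by the reader of the file.
auto_login = settings.get("autoLogin") == "1"

# Asking for an attribute that isn't there gives None rather than an error.
missing = settings.get("rememberAlias")

print(attributes)  # {'savePassword': '1', 'autoLogin': '1'}
print(auto_login)  # True
print(missing)     # None
```

The rememberAlias name is hypothetical; it is only there to show what happens when an attribute is absent.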

Tags are opened by typing '<' and '>' around the tag name, eg '<ACCOUNT>', and are closed the same way with a forward slash before the tag name, eg '</ACCOUNT>'. If a tag only needs to be present, or only has attributes and no data in it, it can be closed by putting the forward slash before the closing pointy bracket, such as '<SETTINGS savePassword="1" autoLogin="1"/>'.
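The short self-closing form and the long open/close form mean the same thing. A small sketch, again assuming Python's standard xml.etree.ElementTree module purely for demonstration, shows that a parser sees no difference between them:

```python
import xml.etree.ElementTree as ET

# The same tag written in the short form and in the long form.
short_form = ET.fromstring('<SETTINGS savePassword="1" autoLogin="1"/>')
long_form = ET.fromstring('<SETTINGS savePassword="1" autoLogin="1"></SETTINGS>')

same_tag = short_form.tag == long_form.tag
same_attributes = short_form.attrib == long_form.attrib

# Neither form holds any data between an opening and closing tag.
both_empty = not short_form.text and not long_form.text

print(same_tag, same_attributes, both_empty)  # True True True
```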

XML syntax must be correct. If there is something wrong, like a tag you forgot to close, the application (SS2) will not work (or will just save over the file, if possible). Tags are case sensitive, like everything in XML, so '<NAME>' cannot be closed with '</name>'. Tags must also be nested in the correct order: you can't close a tag until all of its child tags are closed. You can easily test your syntax by opening your XML file in a browser; if it's incorrect, it'll tell you why and on what line.
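If you'd rather check syntax without a browser, something like the following sketch works too. It assumes Python and its standard xml.etree.ElementTree module; find_syntax_error is a made-up helper name, and the broken snippets are deliberately wrong versions of the account example.

```python
import xml.etree.ElementTree as ET

def find_syntax_error(xml_text):
    """Return None if xml_text is well-formed, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        # The message includes the line and column where parsing failed.
        return str(error)
    return None

GOOD = "<ACCOUNT>\n   <NAME>Mr. Foo</NAME>\n</ACCOUNT>"
UNCLOSED = "<ACCOUNT>\n   <NAME>Mr. Foo\n</ACCOUNT>"            # NAME is never closed
WRONG_CASE = "<ACCOUNT>\n   <NAME>Mr. Foo</name>\n</ACCOUNT>"   # NAME closed as name
BAD_NESTING = "<ACCOUNT>\n   <NAME>Mr. Foo</ACCOUNT>\n</NAME>"  # closed in the wrong order

for label, text in [
    ("good", GOOD),
    ("unclosed", UNCLOSED),
    ("wrong case", WRONG_CASE),
    ("bad nesting", BAD_NESTING),
]:
    print(label, "->", find_syntax_error(text) or "OK")
```

To check a real file, read its contents and pass them to the same function.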

There are two lines we haven't looked at yet, so let's look at those.

The first one is the '<!--Begin listing accounts-->' line. This is what is known as a comment. Comments are used to help you, or someone else who's looking at the XML, understand what a specific piece of it does. Comments are ignored by the program, so if you are testing something which requires you to remove XML but you don't want to delete it, you can comment it out. Comments are NOT for old data though; if you are done with a bit of XML for good, remove it. Comments must be closed correctly like tags: they are opened using '<!--' and closed using '-->'. Comments should not have double hyphen (--) sequences inside them. Even though comments are ignored by the program, this will throw an error if you try to check your syntax using a browser.
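Here is a small sketch of what "ignored by the program" means in practice. It assumes Python's standard xml.etree.ElementTree module, whose default parser drops comments; the game's own loader is not shown here. The second account is commented out, so only the first one is seen.

```python
import xml.etree.ElementTree as ET

# The second account is commented out rather than deleted.
WITH_COMMENTS = """
<ACCOUNTLIST>
   <!--Begin listing accounts-->
   <ACCOUNT>
      <NAME>Mr. Foo</NAME>
   </ACCOUNT>
   <!--
   <ACCOUNT>
      <NAME>CandyMan</NAME>
   </ACCOUNT>
   -->
</ACCOUNTLIST>
"""

root = ET.fromstring(WITH_COMMENTS)

# Only tags outside comments are visible to the parser.
names = [account.find("NAME").text for account in root.findall("ACCOUNT")]

print(names)  # ['Mr. Foo']
```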

The second one, which only appears in some documents, is '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'. This is the XML declaration. It declares that the XML version is 1.0, the encoding is UTF-8 and that the document is standalone. Note that this line can appear in other forms (eg, with only two of the three attributes). If it is present, it has to be the very first line of the file. It can be ignored for the most part, and doesn't need to be added if it's not already present.
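To illustrate that the declaration doesn't change the data itself, this sketch parses the same made-up XML with and without it. It assumes Python's standard xml.etree.ElementTree module; other programs may treat a missing declaration differently.

```python
import xml.etree.ElementTree as ET

DECLARATION = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
BODY = "<ACCOUNTLIST><ACCOUNT><NAME>Mr. Foo</NAME></ACCOUNT></ACCOUNTLIST>"

# The declaration, when present, must be the very first thing in the text.
with_declaration = ET.fromstring(DECLARATION + "\n" + BODY)
without_declaration = ET.fromstring(BODY)

# Writing both trees back out gives identical results.
same_document = ET.tostring(with_declaration) == ET.tostring(without_declaration)

print(same_document)  # True
```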

In XML, whitespace (new lines and spaces/tabs) between tags is generally ignored; it only matters inside a tag or inside data. As such, both of these are the same:

<TAG>
   <TAG2/>
</TAG>

<TAG>
<TAG2/>
</TAG>

The one place where whitespace is required is between attributes, as they need to be separated by a space. Otherwise, tags are indented purely for neatness and readability; you can use tabs or spaces for this. I prefer tabs (displayed 4 spaces wide), but some people prefer two spaces.
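The indentation point can be checked with a short sketch, assuming Python's standard xml.etree.ElementTree module. That parser keeps the whitespace between tags as text, so the made-up structure helper below trims it before comparing, which is roughly what a program reading the file would do.

```python
import xml.etree.ElementTree as ET

# The same two tags, once indented and once not.
indented = ET.fromstring("<TAG>\n   <TAG2/>\n</TAG>")
flat = ET.fromstring("<TAG>\n<TAG2/>\n</TAG>")

def structure(element):
    """Describe an element by its tag, attributes, trimmed text and children."""
    return (
        element.tag,
        dict(element.attrib),
        (element.text or "").strip(),
        [structure(child) for child in element],
    )

# Once the indentation is trimmed away, both versions are identical.
same_structure = structure(indented) == structure(flat)

print(same_structure)  # True
```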

XML is a plain text language, so just about any text editor can read and edit XML files, including your standard Notepad, although this can quickly get annoying for anything more than a quick edit. One of the most popular editors is Notepad++ (I use it myself), as it has some brilliant features, can be expanded with plugins and is free. You can choose your own editor, but I'd personally suggest Notepad++.

If you'd like to read more, I'd suggest the W3Schools XML tutorial, which can be found on their website.