Introduction to XML
Introduction
XML stands for Extensible Markup Language. An XML document stores data as text. The data itself can be textual or binary; binary data is not stored in its raw form but is first encoded as text (for example, using Base64). Elements and attributes are used in an XML document to encapsulate data in a logical, hierarchical fashion.
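This tutorial is the first in a series of tutorials about XML and ASP.NET:
- Introduction to XML
- Reading XML Files with ASP.NET
- Introduction to XPath for ASP.NET Developers
- Creating new XML Document in ASP.NET Programmatically
- Introduction to DTD (Document Type Definition)
- Validating an XML Document using DTD in ASP.NET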
Sample XML Document
The following is a simple XML document:
<?xml version="1.0" encoding="utf-8"?>
<article>
  <author isadmin="true">Faisal Khan</author>
  <title>Sample XML Document</title>
  <body>The body of the article goes here.</body>
</article>
XML Declaration
XML documents should start with an XML declaration such as <?xml version="1.0" ?>. This tells the XML parser that what follows is an XML document. Strictly speaking, the declaration is optional in XML 1.0, but including it is recommended. An optional encoding attribute is often added as well. The declaration is the first line of our sample document:
<?xml version="1.0" encoding="utf-8"?>
<article>
  <author isadmin="true">Faisal Khan</author>
  <title>Sample XML Document</title>
  <body>The body of the article goes here.</body>
</article>
Storing Data in an XML Document
The XML document contains text data that is held in place by a logical hierarchy of elements. In the sample XML document, the data is the text between the tags: "Faisal Khan", "Sample XML Document", and "The body of the article goes here.":
<?xml version="1.0" encoding="utf-8"?>
<article>
  <author isadmin="true">Faisal Khan</author>
  <title>Sample XML Document</title>
  <body>The body of the article goes here.</body>
</article>
Elements
All XML documents must have exactly one root element. In our sample XML document, the root element is article. All elements must have a start tag and an end tag. You can choose almost any name for an element, as long as it starts with a letter or an underscore and contains no spaces. It is recommended to keep element names short and simple to understand, e.g., article, full_name, first_name. If an element name has to consist of two words, it is recommended to insert an '_' (underscore) character in place of the space, e.g., first_name for "first name". A tag is written between the < and > characters. The end tag has an additional slash '/' just before the name of the element.
The elements in the sample XML document are article, author, title, and body:
<?xml version="1.0" encoding="utf-8"?>
<article>
  <author isadmin="true">Faisal Khan</author>
  <title>Sample XML Document</title>
  <body>The body of the article goes here.</body>
</article>
Attributes
An attribute is a name/value pair within an element's start tag that stores further information about that element. It has to be in the format name="value". In our XML document, isadmin is the name of the attribute within the author element, and its value is "true". The attribute appears in the author start tag below:
<?xml version="1.0" encoding="utf-8"?>
<article>
  <author isadmin="true">Faisal Khan</author>
  <title>Sample XML Document</title>
  <body>The body of the article goes here.</body>
</article>
Entity References
Certain characters cannot be written literally in the data segments of elements, i.e., between element start and end tags. The characters & and < are never allowed as literal data. The characters >, ', and " are usually escaped as well, because they are special inside tags and attribute values. The reason is that all five have special meaning in an XML document: as you read above, XML elements and attributes use them to encapsulate data. What should you do if you want to use these characters as data in an XML document? Make use of entity references.
An "entity" is a name/value pair defined using DTD (Document Type Definition) syntax. We will learn more about DTD files later. For now, you should know that an entity is a name/value pair, much like an attribute, which you can define yourself. While attributes are written in the elements of an XML document, entities (one or more) are declared in a DTD, either in a separate DTD file or inside the XML document itself using DTD syntax. You then use the entity in your data through an entity reference, whose syntax is &entityName;. This is dynamic substitution: the XML parser replaces the entity reference with the value of that entity when it parses the document.
<!ENTITY websiteName "Stardeveloper.com">
The above code defines an entity "websiteName" with the value "Stardeveloper.com". You can access the value of this entity within your XML document using the entity reference &websiteName; as shown below:
<?xml version="1.0" encoding="utf-8"?>
<article>
  <author isadmin="true">Faisal Khan</author>
  <title>Sample XML Document</title>
  <body>This article will be posted at &websiteName;.</body>
</article>
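To see the substitution happen, here is a minimal sketch in C#. It assumes the entity is declared in an internal DTD subset of the document itself, and that you are on .NET Framework 4.0 or later, where XmlReaderSettings.DtdProcessing is available. The names EntityDemo, Sample, and ReadBody are made up for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class EntityDemo
{
    // Assumption: the entity is declared inside the document (internal DTD subset).
    public const string Sample =
        "<?xml version=\"1.0\"?>" +
        "<!DOCTYPE article [<!ENTITY websiteName \"Stardeveloper.com\">]>" +
        "<article>" +
        "<body>This article will be posted at &websiteName;.</body>" +
        "</article>";

    // Hypothetical helper: returns the text of <body> after the parser
    // has replaced entity references with their values.
    public static string ReadBody(string xml)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        // DTDs are prohibited by default; allow them so <!ENTITY> is read.
        settings.DtdProcessing = DtdProcessing.Parse;

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            XmlDocument doc = new XmlDocument();
            doc.Load(reader);
            return doc.DocumentElement.SelectSingleNode("body").InnerText;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadBody(Sample));
    }
}
```

Only enable DTD processing for documents you trust, since entity declarations are processed by the parser.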
Now that you know what entities are and how entity references give access to their values inside the data portions of an XML document, you should know that all XML documents can use five predefined entity references, listed below:
Character | Entity Reference |
& | &amp; |
< | &lt; |
> | &gt; |
' | &apos; |
" | &quot; |
If you want to use any of these five special characters as data in your XML document, you can write them as entity references like this:
<?xml version="1.0" encoding="utf-8"?>
<article>
  <author isadmin="true">Faisal Khan</author>
  <title>Sample XML Document</title>
  <body>Now you can use &amp;, &lt;, &gt;, &apos;, and &quot; tags as often as you want.</body>
</article>
Summary
Some of the things that we learned about XML documents in this tutorial:
- How to declare an XML document.
- How to store data in an XML document.
- What elements are, and how to use them to give the data stored in an XML document a logical hierarchy.
- What attributes are.
- What entities and entity references are.
Reading XML Files with ASP.NET
Introduction
In this tutorial, we will learn how to read the contents of an XML file with ASP.NET. We will make use of the sample XML file which we created in the previous tutorial (Introduction to XML). We will create an ASP.NET page which reads the contents of this XML file and then displays it to the user.
This tutorial is the second in a series of tutorials about XML and ASP.NET:
- Introduction to XML
- Reading XML Files with ASP.NET
- Introduction to XPath for ASP.NET Developers
- Creating new XML Document in ASP.NET Programmatically
- Introduction to DTD (Document Type Definition)
- Validating an XML Document using DTD in ASP.NET
Okay, let us get started now. Copy and paste the contents below in a new text file and then save that file as "sample.xml" in the /App_Data sub-folder of your ASP.NET web application. Placing the XML file in /App_Data sub-folder ensures that no one will be able to directly access this file from the web. It will only be accessed by our ASP.NET page which will read and display its contents.
<?xml version="1.0" encoding="utf-8"?>
<article>
  <author isadmin="true">Faisal Khan</author>
  <title>Sample XML Document</title>
  <body>The body of the article goes here.</body>
</article>
Reading the Contents of XML File using XmlDocument class
The System.Xml namespace has a class named XmlDocument, which we will use to read the XML file's contents and display them to the user. Reading the contents of the sample.xml file is as easy as these two lines:
XmlDocument doc = new XmlDocument();
doc.Load(Server.MapPath("~/App_Data/sample.xml"));
Before we delve deeper into the code, we should create the ASP.NET page first and study its code later.
Reader.aspx
Copy and paste the following code into an ASP.NET page and then save it as "reader.aspx":
<%@ Page Language="C#" AutoEventWireup="true" %>
<%@ Import Namespace="System.Xml" %>
<script runat="server">
    protected void Page_Load(object source, EventArgs e)
    {
        // Load sample.xml from the protected App_Data folder into memory.
        XmlDocument doc = new XmlDocument();
        doc.Load(Server.MapPath("~/App_Data/sample.xml"));

        // The root element is <article>.
        XmlNode root = doc.DocumentElement;

        // Copy the text of each child element into its Literal control.
        AuthorLiteral.Text = root.SelectSingleNode("author").ChildNodes[0].Value;
        TitleLiteral.Text = root.SelectSingleNode("title").ChildNodes[0].Value;
        BodyLiteral.Text = root.SelectSingleNode("body").ChildNodes[0].Value;
    }
</script>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title>Reading an XML File</title>
    <style type="text/css">
        body { font-family: Verdana; font-size: 9pt; }
        .name { background-color: #F7F7F7; }
    </style>
</head>
<body>
    <form id="form1" runat="server">
    <div>
        <table width="50%" cellpadding="5" cellspacing="2">
            <tr>
                <td class="name">Name</td>
                <td><asp:Literal ID="AuthorLiteral" runat="server" /> <asp:Literal ID="AdminLiteral" runat="server" /></td>
            </tr>
            <tr>
                <td class="name">Title</td>
                <td><asp:Literal ID="TitleLiteral" runat="server" /></td>
            </tr>
            <tr>
                <td class="name">Body</td>
                <td><asp:Literal ID="BodyLiteral" runat="server" /></td>
            </tr>
        </table>
    </div>
    </form>
</body>
</html>
Once you have placed the sample.xml file in the /App_Data sub-folder and reader.aspx in your web application, run the ASP.NET page by opening it in your browser. On my computer, the ASP.NET page, when run, looked like this:
ASP.NET page displaying the contents of an XML File
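The page above declares an AdminLiteral control, but the Page_Load code never assigns it. As a hedged sketch of how the isadmin attribute could be read, the console program below uses the Attributes collection of the author node. It loads the XML from an inline string rather than from sample.xml, and the names AuthorDemo and IsAdmin are hypothetical.

```csharp
using System;
using System.Xml;

public static class AuthorDemo
{
    public const string Sample =
        "<article>" +
        "<author isadmin=\"true\">Faisal Khan</author>" +
        "<title>Sample XML Document</title>" +
        "<body>The body of the article goes here.</body>" +
        "</article>";

    // Hypothetical helper: true only when <author isadmin="true"> is present.
    public static bool IsAdmin(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNode author = doc.DocumentElement.SelectSingleNode("author");
        if (author == null || author.Attributes == null)
            return false;

        // The indexer returns null when the attribute is missing.
        XmlAttribute isAdmin = author.Attributes["isadmin"];
        return isAdmin != null && isAdmin.Value == "true";
    }

    public static void Main()
    {
        Console.WriteLine("Is admin: " + IsAdmin(Sample));
    }
}
```

In the page itself, the result could be assigned to AdminLiteral.Text in the same way the other Literal controls are filled.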
Code to read XML File
We first need to create an instance of the XmlDocument class. This class provides methods and properties to access the contents of an XML file. It implements the W3C Document Object Model (DOM) Level 1 Core and Level 2 Core. The DOM is an in-memory tree representation of the XML document, providing easy navigation and editing capabilities.
XmlDocument doc = new XmlDocument();
Next, we need to tell the XmlDocument instance to load our XML file into memory. We do that by calling the Load() method of the XmlDocument class. Its argument is the complete physical path to the sample.xml file. We use the Server.MapPath() method to convert the relative path to sample.xml into a complete physical path.
doc.Load(Server.MapPath("~/App_Data/sample.xml"));
Now that we have the sample.xml file in memory, we will use properties and methods of the XmlDocument class to read the "author", "title", and "body" elements. To do that, we first need to get access to the root element ("article") of the XML file.
XmlNode root = doc.DocumentElement;
Once we have access to the root element, accessing the values contained in the "author", "title", and "body" sub-elements is as simple as calling the SelectSingleNode() method and then accessing the ChildNodes collection of the returned node. The argument to SelectSingleNode() is the name of the element whose value you want to access.
AuthorLiteral.Text = root.SelectSingleNode("author").ChildNodes[0].Value;
TitleLiteral.Text = root.SelectSingleNode("title").ChildNodes[0].Value;
BodyLiteral.Text = root.SelectSingleNode("body").ChildNodes[0].Value;
And that is it. XmlDocument provides even easier navigation through the contents of its XML file using XPath queries, something which we will explore in future tutorials.
Summary
In this tutorial, we learned the code to read an XML file into memory and then access its elements using methods and properties of the XmlDocument class. In the next tutorial, we will learn how to selectively access the elements of an in-memory XML file using XPath queries.
Introduction to XPath for ASP.NET Developers
Introduction
XPath is a language for selecting nodes (parts or segments) within an XML document. As you already know, XML is a markup language which uses elements and attributes to encapsulate data in a logical manner. XPath builds on this and provides navigational abilities within an XML document. For example, you can use an XPath query to select one or more nodes within an XML document that match certain criteria. The criteria can be anything from an element matching a given name to an element whose attribute matches a specific value.
This is the third tutorial in the series about XML and ASP.NET.
Tree-like Representation of an XML Document
In XPath, an XML document is represented as a tree of nodes. There is a parent node with one or more child nodes. Although a node in XPath can be one of seven types, for practical purposes a node corresponds to an element or attribute within an XML document. For example, have a look at the following XML document:
<?xml version="1.0" encoding="utf-8" ?>
<userinfo>
  <username admin="true">someUserName1</username>
  <email>xyz@whatever.com</email>
</userinfo>
The following is XPath's tree representation of the above document:
- userinfo
  - username
    - admin
  - email
userinfo is the root element. userinfo is the parent of the username and email nodes, so username and email are children of userinfo and siblings of each other. Similarly, the admin attribute node belongs to the username node, which is its parent.
Selecting Nodes within an XML Document
Now that we understand how elements and attributes are represented as nodes in XPath, we will focus on how to use XPath expressions to select one or more nodes within an XML document.
XPath Expressions
The following are some of the expressions that you can use to select one or more nodes from the XML document above:
- /userinfo - Selects the root element.
- /userinfo/username - Selects the username node which is the child of the userinfo root node.
- //email - Selects all the nodes in the document which match the name (email), irrespective of where they lie in the document.
- //username[@admin] - Selects all nodes named "username" which have an "admin" attribute.
- /userinfo/username[1] - Selects the first username node that is the child of the userinfo node.
- /userinfo/username[last()] - Assuming userinfo had more than one username child node, selects the last username node that is the child of the userinfo node.
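The expressions listed above can be tried out with a small console program. This is only a sketch: it adds a second username element to the sample (an assumption, so that last() has something to distinguish), and the names XPathDemo, Sample, and Select are invented for the example.

```csharp
using System;
using System.Xml;

public static class XPathDemo
{
    // Assumption: a second <username> is added to the tutorial's sample.
    public const string Sample =
        "<userinfo>" +
        "<username admin=\"true\">someUserName1</username>" +
        "<username>someUserName2</username>" +
        "<email>xyz@whatever.com</email>" +
        "</userinfo>";

    // Hypothetical helper: returns the text of every node the expression selects.
    public static string[] Select(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Sample);

        XmlNodeList nodes = doc.SelectNodes(xpath);
        string[] texts = new string[nodes.Count];
        for (int i = 0; i < nodes.Count; i++)
            texts[i] = nodes[i].InnerText;
        return texts;
    }

    public static void Main()
    {
        string[] expressions =
        {
            "/userinfo",
            "/userinfo/username",
            "//email",
            "//username[@admin]",
            "/userinfo/username[1]",
            "/userinfo/username[last()]"
        };

        foreach (string xpath in expressions)
            Console.WriteLine(xpath + " -> " + string.Join(" | ", Select(xpath)));
    }
}
```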
We will now create a sample XML document and then use XPath expressions to select and display only few nodes from it.
Sample XML Document
Copy and paste the following text into a new text file and save it as "sample.xml" in the /App_Data folder of your ASP.NET web application:
<?xml version="1.0" encoding="utf-8" ?>
<article>
  <author isadmin="true">Faisal Khan</author>
  <title>Sample XML Document</title>
  <body>
    <page>This is page #1.</page>
    <page>This is page #2.</page>
    <page>This is page #3.</page>
  </body>
</article>
XPath.aspx ASP.NET Page
Now we will create the ASP.NET page which reads the above sample.xml file and selectively displays its contents using XPath. Copy and paste the following text into a new text file and save it as "XPath.aspx" in your ASP.NET web application:
<%@ Page Language="C#" AutoEventWireup="true" %>
<%@ Import Namespace="System.Xml" %>
<script runat="server">
    protected void Page_Load(object source, EventArgs e)
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(Server.MapPath("~/App_Data/sample.xml"));

        // Select only the <page> elements under /article/body.
        XmlNodeList nodes = doc.SelectNodes("/article/body/page");

        // Add one table row per <page> element.
        foreach (XmlNode node in nodes)
        {
            TableRow row = new TableRow();
            TableCell cell = new TableCell();
            cell.Text = node.FirstChild.InnerText;
            row.Cells.Add(cell);
            PagesTable.Rows.Add(row);
        }
    }
</script>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title>Using XPath Expressions</title>
    <style type="text/css">
        body { font-family: Verdana; font-size: 9pt; }
        .name { background-color: #F7F7F7; }
    </style>
</head>
<body>
    <form id="form1" runat="server">
    <div>
        <asp:Table id="PagesTable" runat="server" />
    </div>
    </form>
</body>
</html>
We will look into its code a little later. For now, when this ASP.NET page was run on my computer, it produced a table showing only the page elements from the XML file, one row per element.
Looking into the Code
In the previous tutorial, we learned how to read an XML file and display its contents in an ASP.NET page using the XmlDocument class from the System.Xml namespace. In this tutorial, we focus on how to use XPath expressions to return only the list of nodes we want to display to the user.
protected void Page_Load(object source, EventArgs e)
{
    XmlDocument doc = new XmlDocument();
    doc.Load(Server.MapPath("~/App_Data/sample.xml"));
    XmlNodeList nodes = doc.SelectNodes("/article/body/page");
    foreach (XmlNode node in nodes)
    {
        TableRow row = new TableRow();
        TableCell cell = new TableCell();
        cell.Text = node.FirstChild.InnerText;
        row.Cells.Add(cell);
        PagesTable.Rows.Add(row);
    }
}
We create a new instance of the XmlDocument class and make it load our "sample.xml" file. Next, because we want to display only the page elements, we use a simple XPath expression, "/article/body/page", to select only the page nodes.
XmlDocument doc = new XmlDocument();
doc.Load(Server.MapPath("~/App_Data/sample.xml"));
XmlNodeList nodes = doc.SelectNodes("/article/body/page");
Next, we iterate through the returned list of XmlNode objects and insert the contents of each into our ASP.NET table. To get the text from a page element in the XML file, we use the XmlNode.FirstChild.InnerText property.
foreach (XmlNode node in nodes)
{
    TableRow row = new TableRow();
    TableCell cell = new TableCell();
    cell.Text = node.FirstChild.InnerText;
    row.Cells.Add(cell);
    PagesTable.Rows.Add(row);
}
Some More XPath Expressions
Had we wanted to fetch only the first page element from the XML document, which XPath expression should we have used? And what if we wanted to return only the last page element?
- Returning the first page only:
/article/body/page[1]
- Returning the last page only:
/article/body/page[last()]
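When an expression is expected to match a single node, SelectSingleNode() can be used instead of SelectNodes(). The sketch below illustrates this with the two expressions above, using an inline copy of the sample document instead of the file. PageDemo and TextOf are hypothetical names chosen for the example.

```csharp
using System;
using System.Xml;

public static class PageDemo
{
    public const string Sample =
        "<article><body>" +
        "<page>This is page #1.</page>" +
        "<page>This is page #2.</page>" +
        "<page>This is page #3.</page>" +
        "</body></article>";

    // Hypothetical helper: text of the first matching node, or null if none match.
    public static string TextOf(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Sample);

        // SelectSingleNode returns null when the expression selects nothing.
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.InnerText;
    }

    public static void Main()
    {
        Console.WriteLine(TextOf("/article/body/page[1]"));
        Console.WriteLine(TextOf("/article/body/page[last()]"));
    }
}
```

Checking for null before using the result avoids a NullReferenceException when the document has no matching element.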
In this tutorial, we learned how to use XPath expressions to select parts and segments from an XML file. In the next tutorial, we will learn how to generate an XML document programmatically using ASP.NET.
Introduction to DTD (Document Type Definition)
Introduction
In this tutorial, we will learn about "Document Type Definition". DTD is an XML schema language that is used to describe the structure of an XML document. It describes what elements an XML document can contain and in which order. It further specifies how these elements are arranged within one another and what attributes these elements can contain. The description of the structure and these constraints are written in a formal syntax, some of which we will learn in this tutorial. These declarations can be placed within an XML file, or more commonly in a separate file with ".dtd" extension.
This is the fifth tutorial in the series about XML and ASP.NET.
We will now learn how elements and attributes are declared in DTD, and how DTD declarations can be associated with XML document(s).
Declaring Elements in DTD
An element declaration is used in a DTD to describe what content and child element(s) an element can contain. Each element within an XML document gets its own element declaration. We begin with the declaration of the root element in our XML file.
<!ELEMENT article (title, author, body)>
The above DTD element declaration states that there must be an element named "article" and that it must have child elements named "title", "author", and "body", in that order.
Each element that has been referenced by name in the above declaration must now have its own separate declaration.
<!ELEMENT article (title, author, body)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT body (#PCDATA)>
The element declarations that follow the root element declaration ("article") state that these elements can contain character data.
If you want the "author" element to be optional within an XML document, all you have to do is put a '?' (question mark) after the "author" reference, like this:
<!ELEMENT article (title, author?, body)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT body (#PCDATA)>
A '?' (question mark) after an element reference means that the element can be present zero or one time. Now, an XML document without any "author" element (but with article, title, and body elements in the proper order) can still be validated successfully against this DTD.
We will learn about the full list of specifiers that can be used in element declarations in future tutorials.
Declaring Attributes in DTD
An attribute declaration is used in a DTD to describe what attribute(s) a given element can have, what content those attributes should have, and whether the attribute(s) are required, implied (not required), or fixed. To explain the syntax of an attribute declaration, we will add an attribute named "email" to the "author" element that we declared above. Here is how we do it:
<!ATTLIST author email CDATA #REQUIRED>
The above attribute declaration states that the "author" element must have an attribute named "email" and that the attribute's value is character data (CDATA). If we wanted to allow the XML writer to omit this attribute, we would use #IMPLIED instead of #REQUIRED in the attribute declaration.
Our DTD declarations now look like this:
<!ELEMENT article (title, author?, body)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT body (#PCDATA)>
<!ATTLIST author email CDATA #REQUIRED>
Associating DTD with XML Documents
There are two methods of associating DTD declarations with XML document(s). We will look into both of them.
i. Placing DTD Declarations in a separate file
We do that by placing the following DTD declarations in a separate file named "article.dtd".
<!ELEMENT article (title, author?, body)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT body (#PCDATA)>
<!ATTLIST author email CDATA #REQUIRED>
By placing DTD declarations in a separate file, we can associate as many XML documents as we want with these DTD declarations, just by adding one line to each of those XML documents.
Here is a sample XML document that conforms to the DTD declarations in article.dtd:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article SYSTEM "article.dtd">
<article>
  <title>Sample XML Document.</title>
  <author email="hidden@xyz.com">Faisal Khan</author>
  <body>This is a sample XML Document.</body>
</article>
The DOCTYPE declaration on the second line is all that is necessary to associate the DTD document with the XML document. Following the DOCTYPE keyword is the name of the root element of the XML document. After that comes the SYSTEM identifier. The last item in this declaration is the name of the DTD document containing the DTD declarations.
ii. Placing DTD Declarations within an XML Document
We can also place DTD declarations directly in an XML document. This is how it is done:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article [
  <!ELEMENT article (title, author?, body)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!ATTLIST author email CDATA #REQUIRED>
]>
<article>
  <title>Sample XML Document.</title>
  <author email="hidden@xyz.com">Faisal Khan</author>
  <body>This is a sample XML Document.</body>
</article>
Either of these two methods can be used to associate DTD declarations with XML document(s).
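To get a feel for what these declarations allow and reject, the following sketch checks a few small documents against the same declarations, placed in an internal DTD subset. It is an illustration only: it assumes .NET Framework 4.0 or later (for XmlReaderSettings.DtdProcessing), and DtdDemo, Doctype, and IsValid are hypothetical names.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class DtdDemo
{
    // The tutorial's declarations, wrapped in an internal DTD subset.
    public const string Doctype =
        "<!DOCTYPE article [" +
        "<!ELEMENT article (title, author?, body)>" +
        "<!ELEMENT title (#PCDATA)>" +
        "<!ELEMENT author (#PCDATA)>" +
        "<!ELEMENT body (#PCDATA)>" +
        "<!ATTLIST author email CDATA #REQUIRED>" +
        "]>";

    // Hypothetical helper: true when the <article> markup obeys the DTD above.
    public static bool IsValid(string articleXml)
    {
        bool valid = true;

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse;   // allow the DOCTYPE
        settings.ValidationType = ValidationType.DTD;   // validate against it
        settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
        {
            if (e.Severity == XmlSeverityType.Error)
                valid = false;
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(Doctype + articleXml), settings))
        {
            while (reader.Read()) { }   // reading the document triggers validation
        }
        return valid;
    }

    public static void Main()
    {
        // author is optional because of the '?' in the article declaration.
        Console.WriteLine(IsValid("<article><title>T</title><body>B</body></article>"));
        // author without the required email attribute is rejected.
        Console.WriteLine(IsValid("<article><title>T</title><author>A</author><body>B</body></article>"));
    }
}
```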
Summary
In this tutorial, we learnt about Document Type Definitions. We learnt that a DTD is a way to describe and put constraints on the structure of an XML document. Next, we learnt how to declare elements and attributes. We also learnt how to associate DTD declarations with an XML document.
In the next tutorial, we will learn how to validate an XML document programmatically using an ASP.NET page with DTD declarations in a .dtd file.
Validating an XML Document using DTD in ASP.NET
Introduction
In this tutorial, we will learn how to validate an XML document using DTD. We will develop an ASP.NET page to demonstrate the code. After you are finished reading this tutorial, you will be proficient in writing C# code that validates XML documents using DTD.
This is the sixth tutorial in the series about XML and ASP.NET.
Well-formedness vs Validation
Well-formedness means that an XML document obeys the rules that are necessary for an XML document to be, frankly, an XML document. This means that there should be exactly one root element, the name of an element in its opening tag must match the name in its closing tag, elements must be properly nested, and so on. We covered a good part of it in the first article, the introduction to XML.
Validation, on the other hand, is an entirely different thing. A well-formed XML document is valid only if it also conforms to the constraints declared in a DTD. Put simply, if the structure of an XML document is exactly as described in the DTD, then that XML document is valid (with respect to that DTD); otherwise it is not.
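The well-formedness half of this distinction can be checked without any DTD. As a rough sketch, the helper below simply tries to parse the text with XmlDocument and treats an XmlException as "not well-formed". WellFormedDemo and IsWellFormed are names made up for this illustration.

```csharp
using System;
using System.Xml;

public static class WellFormedDemo
{
    // Hypothetical helper: true when the text parses as XML at all.
    // It says nothing about validity against a DTD.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            // Thrown for mismatched tags, multiple root elements, and so on.
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsWellFormed("<article><title>T</title></article>"));
        Console.WriteLine(IsWellFormed("<article><title>T</article>"));
    }
}
```

A document that passes this check may still fail validation, which is what the rest of this tutorial deals with.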
Validating XML Document using DTD
We will use the sample XML document and DTD file from the article on Document Type Definition. We will then write code to validate that XML document using that DTD.
Sample XML Document
The following is the XML document that we will be using for demonstration:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article SYSTEM "article.dtd">
<article>
  <title>Sample XML Document.</title>
  <author email="hidden@xyz.com">Faisal Khan</author>
  <body>This is a sample XML Document.</body>
</article>
Copy the above code into a new text file and save it as "article.xml" in the /App_Data folder of your ASP.NET application.
Sample DTD File
The following is the DTD file that we will be using for demonstration:
<!ELEMENT article (title, author?, body)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT body (#PCDATA)>
<!ATTLIST author email CDATA #REQUIRED>
Copy the above code into a new text file and save it as "article.dtd" in the /App_Data folder of your ASP.NET application.
If you look at the XML document and DTD file closely, you'll see that the XML document conforms to the constraints in the DTD file and is thus a valid XML document with respect to that DTD file. To do this manual verification programmatically, we'll have to write code. Luckily, in .NET, writing code that validates an XML document is quite simple.
ASP.NET Page that Validates an XML Document using DTD
Copy the following code into a new text file and save it as "ValidateXml.aspx" in your ASP.NET application:
<%@ Page Language="C#" %>
<%@ Import Namespace="System.Xml" %>
<%@ Import Namespace="System.Xml.Schema" %>
<script runat="server">
    private bool valid = true;
    private StringBuilder msgs = new StringBuilder();

    protected void Page_Load(object source, EventArgs e)
    {
        string fullPathToXmlFile = Server.MapPath("~/App_Data/article.xml");

        // Configure the reader to process the DTD and validate against it.
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ProhibitDtd = false;
        settings.ValidationType = ValidationType.DTD;
        settings.ValidationEventHandler += new ValidationEventHandler(ValidationHandler);

        // Reading every node triggers validation of the whole document.
        XmlReader reader = XmlReader.Create(fullPathToXmlFile, settings);
        while (reader.Read()) ;
        reader.Close();
        reader = null;

        if (valid == true)
            StatusLiteral.Text = "Is Valid";
        else
            StatusLiteral.Text = "Is NOT Valid";
        MessagesLiteral.Text = msgs.ToString();
    }

    // Called once for each validation error found by the reader.
    protected void ValidationHandler(object sender, ValidationEventArgs args)
    {
        valid = false;
        msgs.Append(args.Message);
        msgs.Append("<br />");
    }
</script>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title></title>
    <style type="text/css">
        body { font-family: Verdana; font-size: 9pt; }
        pre { font-family: Lucida Console; font-size: 8pt; }
    </style>
</head>
<body>
    <form id="form1" runat="server">
    <div><asp:Label ID="StatusLiteral" runat="server" /></div>
    <pre><asp:Literal ID="MessagesLiteral" runat="server" /></pre>
    </form>
</body>
</html>
Explanation
Since our code is written in C#, we declare the language in the first line of our ASP.NET page.
<%@ Page Language="C#" %>
Next, we import the namespaces that we will be using to validate the XML document.
<%@ Import Namespace="System.Xml" %>
<%@ Import Namespace="System.Xml.Schema" %>
We place our code in script tags.
<script runat="server"> ... </script>
Before we look at the code, let us see which server-side controls we have placed in the HTML to display status messages to the user. StatusLiteral will display the success or failure of the validation process in one sentence. MessagesLiteral will display all the failure messages, if any, in detail.
<asp:Label ID="StatusLiteral" runat="server" />
<asp:Literal ID="MessagesLiteral" runat="server" />
Now we will delve into the code that does the validation. We use two instance-level variables to hold the data for this request. The first variable, valid, will be set to false by a segment of code later if the XML document turns out to be invalid. The second variable, msgs, accumulates the error messages one after the other (again, if any) so that they can be displayed to the user at the end.
private bool valid = true;
private StringBuilder msgs = new StringBuilder();
We have placed our code in the handler for the Load event of the Page object. This way, our code runs every time the page loads.
protected void Page_Load(object source, EventArgs e) { .. }
The first thing we do is get the full physical path to the 'article.xml' file, using the Server.MapPath() method.
string fullPathToXmlFile = Server.MapPath("~/App_Data/article.xml");
Note: There is no direct reference to the 'article.dtd' file in the code. It will be loaded automatically by the objects we create later, from the DOCTYPE declaration given in the 'article.xml' file.
Next, we create an instance of the XmlReaderSettings class. This object contains the settings we want to use later when we read the XML document, and it is required to create the XmlReader object. We set its properties so that the reader knows we want to use a DTD to validate the XML document. We also register a ValidationEventHandler method, so that every time a validation error is raised because the XML document contains invalid segments, we can set the valid variable to false and append the error message to the msgs variable.
XmlReaderSettings settings = new XmlReaderSettings();
settings.ProhibitDtd = false;
settings.ValidationType = ValidationType.DTD;
settings.ValidationEventHandler += new ValidationEventHandler(ValidationHandler);
The working of this event handler has already been explained.
protected void ValidationHandler(object sender, ValidationEventArgs args)
{
    valid = false;
    msgs.Append(args.Message);
    msgs.Append("<br />");
}
It is the XmlReader object that does the actual job of validating the XML document against the DTD. We create it by calling the static Create() method of the XmlReader class, passing as arguments the full physical path of the XML document and the XmlReaderSettings object which we created earlier.
XmlReader reader = XmlReader.Create(fullPathToXmlFile, settings);
Once we have created the XmlReader object, we enter a while loop to repeatedly call its Read() method. Each call advances the XmlReader to the next node of the XML document. Any validation errors raised during these calls are handled by the event handler we registered earlier and are saved to be displayed to the user later.
while (reader.Read()) ;
Once all the nodes have been examined, we close the XmlReader object.
reader.Close();
reader = null;
We set the text to be displayed to the user depending on whether the valid variable is true or false. This variable will be false if even a single error was encountered while validating the XML document.
if (valid == true)
    StatusLiteral.Text = "Is Valid";
else
    StatusLiteral.Text = "Is NOT Valid";
Error messages, if any, are assigned to the appropriate control to be displayed to the user.
MessagesLiteral.Text = msgs.ToString();
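The same steps can be packaged outside a page. The sketch below is one possible variation, not the page's own code: it assumes .NET Framework 4.0 or later, where the ProhibitDtd property is marked obsolete in favor of DtdProcessing, it validates an inline string with an internal DTD subset instead of article.xml, and it records the line number reported with each error. ValidationReport and Validate are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ValidationReport
{
    // Hypothetical helper: returns one message per validation error.
    public static List<string> Validate(string xml)
    {
        List<string> messages = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse;   // replaces ProhibitDtd = false
        settings.ValidationType = ValidationType.DTD;
        settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
        {
            // e.Exception carries the position of the problem in the input.
            messages.Add(e.Severity + " at line " + e.Exception.LineNumber + ": " + e.Message);
        };

        // The using block closes the reader even if Read() throws.
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return messages;
    }

    public static void Main()
    {
        string xml =
            "<!DOCTYPE article [\n" +
            "<!ELEMENT article (title, author?, body)>\n" +
            "<!ELEMENT title (#PCDATA)>\n" +
            "<!ELEMENT author (#PCDATA)>\n" +
            "<!ELEMENT body (#PCDATA)>\n" +
            "<!ATTLIST author email CDATA #REQUIRED>\n" +
            "]>\n" +
            "<article>\n" +
            "<title>Sample XML Document.</title>\n" +
            "<author>Faisal Khan</author>\n" +   // email attribute left out on purpose
            "<body>This is a sample XML Document.</body>\n" +
            "</article>";

        foreach (string message in Validate(xml))
            Console.WriteLine(message);
    }
}
```

Note that a document which is not even well-formed does not reach the handler; the reader throws an XmlException instead, which calling code may want to catch.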