
XML

XML is a markup language that allows users to define their own tags to structure documents. It was developed by the W3C to effectively transport and store data. XML tags are not predefined like HTML - users can define their own tags. XML has a tree structure where elements can contain other elements and text. Attributes provide additional information about elements, and should be used for metadata rather than core data.

Uploaded by

Saranya Ravi
Copyright
© Attribution Non-Commercial (BY-NC)

(EXTENSIBLE MARKUP LANGUAGE)

An Eagle’s Eye View of XML


XML stands for eXtensible Markup Language. As the name suggests,
XML is a markup language. The XML specification was created by the
World Wide Web Consortium (W3C), the body that sets standards for the
web. The Extensible Markup Language became a W3C recommendation
on 10th February 1998.
XML is a set of rules for defining semantic tags that break a document
into parts and identify the different parts of the document. It is a meta-
markup language that defines a syntax used to define other domain-
specific, semantic, structured markup languages.

Origin and Goals

XML was developed by an XML Working Group (originally known as the
SGML Editorial Review Board) formed under the auspices of the World
Wide Web Consortium (W3C) in 1996. It was chaired by Jon Bosak of
Sun Microsystems with the active participation of an XML Special
Interest Group (previously known as the SGML Working Group) also
organized by the W3C. The membership of the XML Working Group is
given in an appendix to the XML specification. Dan Connolly served as
the Working Group's contact with the W3C.

The design goals for XML are:

• XML shall be straightforwardly usable over the Internet.
• XML shall support a wide variety of applications.
• XML shall be compatible with SGML.
• It shall be easy to write programs which process XML
documents.
• The number of optional features in XML is to be kept to the
absolute minimum, ideally zero.
• XML documents should be human-legible and reasonably clear.
• The XML design should be prepared quickly.
• The design of XML shall be formal and concise.
• XML documents shall be easy to create.
• Terseness in XML markup is of minimal importance.

WHAT IS XML?

• XML stands for Extensible Markup Language.
• XML is a markup language much like HTML.
• XML was designed to carry data, not to display data.
• XML is designed to be self-descriptive.
• XML tags are not predefined. We must define our own tags.
• XML was created to structure, store and transport information.

XML Is a Meta-Markup Language

The first thing we need to understand about XML is that it isn’t just
another markup language like the Hypertext Markup Language (HTML).
Languages like HTML define a fixed set of tags that describe a fixed
number of elements. If the markup language we use doesn’t contain the
tag we need, we are out of luck. We can wait for the next version of the
markup language, hoping that it includes the tag we need, but then we
are really at the mercy of what the vendor chooses to include. XML,
however, is a meta-markup language. It’s a language in which we make
up the tags we need as we go along. These tags must be organized
according to certain general principles, but they’re quite flexible in their
meaning.
The Difference Between XML and HTML

XML looks similar to HTML. Like XML, HTML is also a markup language. In
fact, HTML stands for Hypertext Markup Language. Markup languages
are used for describing how a document’s contents should be
interpreted.

XML is not a replacement for HTML.
XML and HTML were designed with different goals.
XML is a complement to HTML.

XML was designed to transport and store data, with a focus on what the
data is.
HTML was designed to format and display data.

HTML is about displaying information, while XML is about carrying
information.

HTML includes over 100 pre-defined tags to allow the author to specify
how each piece of content should be presented to the end user.

XML

XML allows us to create our own tags to describe the data between them.
It is not concerned with how the data is presented. The main focus is
ensuring that the data is well organized within descriptive tags. This is
because XML is used primarily to store and transport data, not to
present it. We can still view XML documents with an XML viewer. XML
viewers interpret the document and display it using any styles that
have been applied with CSS. A viewer will also warn us if something
doesn’t look right, or if the document doesn’t validate correctly.

Most modern browsers include XML support, so it’s quite possible that
our own browser is able to display the contents of XML files.
An XML file is opened in the same way as any other file in the browser.
If it is a local file, the full path can be typed into the address bar.
Otherwise, if it’s available over the web, we can type the URL into the
address bar.

NOTEPAD
We can use a text editor such as Notepad to create or view a simple
XML file. Here is what this XML file looks like in Notepad.

INTERNET EXPLORER

Here’s how the XML file appears in Internet Explorer.


DISPLAYING ERRORS

If the XML document contains an error, the XML viewer will display a
message indicating the error. In the file below there is an error: the
closing tag </Ttutorials> does not match the opening tag <tutorials>.
<tutorials>
<tutorial>
<name> XML tutorial</name>
<url>http://www.quackit.com/xml/tutorial</url>
</tutorial>
</Ttutorials>

Below is how the error is reported in Internet Explorer.
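
The same kind of error report can be obtained without a browser. The sketch below assumes Python's standard xml.etree.ElementTree parser stands in for the XML viewer; the helper name parse_error is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The document from the text, with its mismatched closing tag left in place.
broken = """<tutorials>
<tutorial>
<name> XML tutorial</name>
<url>http://www.quackit.com/xml/tutorial</url>
</tutorial>
</Ttutorials>"""

def parse_error(text):
    """Return the parser's error message, or None if the text is well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)
    return None

message = parse_error(broken)
print(message)  # the message names the problem and where it was found

# Correcting the closing tag makes the error go away.
fixed = broken.replace("</Ttutorials>", "</tutorials>")
print(parse_error(fixed))
```

Like the browser, the parser stops at the first problem it meets rather than trying to guess what the author meant.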


With XML We Invent our Own Tags

The tags below (like <to> and <from>) are not defined in any XML
standard. These tags are “invented” by the author of the XML
document. This is because the XML language has no predefined tags.

Example:
<note>
<to>john </to>
<from>ravi </from>
<heading>Reminder</heading>
<body>Don’t forget me this weekend</body>
</note>
The note above is self-descriptive. It has the sender’s and receiver’s
information, and it also has a heading and a message body.
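
Because the tags name their own content, a program can pick the note apart without any extra description of its layout. A minimal sketch, assuming Python's standard xml.etree.ElementTree module as the parser (the text itself does not name one):

```python
import xml.etree.ElementTree as ET

note = ET.fromstring("""<note>
<to>john </to>
<from>ravi </from>
<heading>Reminder</heading>
<body>Don't forget me this weekend</body>
</note>""")

# Each child's tag says what its text means, so a dictionary falls out directly.
fields = {child.tag: child.text.strip() for child in note}

print(fields["from"], "->", fields["to"])
print(fields["heading"])
```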

XML allows the author to define his own tags and his own
document structure. The tags used in HTML are predefined. HTML
documents can only use tags defined in the HTML standard (like <p>,
<b>, etc.).

Creating a Simple XML Document

<?xml version="1.0"?>
<FOO>
Hello XML!
</FOO>

That’s not very complicated, but it is a good XML document. To be more
precise, it’s a well-formed XML document. This document can be
typed in any convenient text editor like Notepad, BBEdit etc.

Saving the XML File

Once we’ve typed the preceding code, save the document in a file
called hello.xml, HelloWorld.xml, MyFirstDocument.xml, or some other
name. The three-letter extension .xml is fairly standard. However, we
must make sure that we save it in plain text format, and not in the
native format of some word processor like WordPerfect or Microsoft Word.
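
Saving in plain text can also be done from a program. The sketch below assumes Python and writes hello.xml into a temporary folder (the folder is an assumption made so the example cleans up after itself), then reads the file back to confirm it is still a well-formed document.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

document = '<?xml version="1.0"?>\n<FOO>\nHello XML!\n</FOO>\n'

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "hello.xml"
    # write_text stores plain text; no word-processor formatting is involved.
    path.write_text(document, encoding="utf-8")
    root = ET.parse(str(path)).getroot()

print(root.tag)           # name of the root element
print(root.text.strip())  # text between <FOO> and </FOO>
```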

XML TREE

XML documents form a tree structure that starts at “the root”
and branches to “the leaves”.

An Example XML Document:

XML documents use a self-describing and simple syntax:


<root>

<child>

<subchild>.....</subchild>

</child>

</root>

The terms parent, child, and sibling are used to describe the
relationships between elements. Parent elements have children.
Children on the same level are called siblings (brothers or sisters). All
elements can have text content and attributes.
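
The parent, child and sibling relationships can be made visible by walking the tree. This is a sketch only: it assumes Python's xml.etree.ElementTree, and it adds a second, empty <child/> to the example (not in the original) so that there is a pair of siblings to show.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<root><child><subchild>.....</subchild></child><child/></root>"
)

def walk(element, depth=0):
    """Yield (depth, tag) pairs from the root down to the leaves."""
    yield depth, element.tag
    for child in element:
        yield from walk(child, depth + 1)

outline = list(walk(root))
for depth, tag in outline:
    print("  " * depth + tag)

# The direct children of <root> are siblings of each other.
siblings = [child.tag for child in root]
```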

XML Elements

An XML document contains XML Elements. An XML element is
everything from the element's start tag to the end tag.

An element can contain other elements and simple text, and can also
have attributes.

<bookstore>

<title>Harry Potter</title>

<author>J. K. Rowling</author>

<year>2005</year>

<price>29.99</price>

</bookstore>
In the example above, <bookstore> has element content because it
contains other elements. <author> has text content because it
contains text.
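
The distinction between element content and text content can be checked mechanically. A rough sketch, assuming Python's xml.etree.ElementTree; the function name content_kind and its three labels are invented for this illustration.

```python
import xml.etree.ElementTree as ET

bookstore = ET.fromstring(
    "<bookstore><title>Harry Potter</title><author>J. K. Rowling</author>"
    "<year>2005</year><price>29.99</price></bookstore>"
)

def content_kind(element):
    """Classify an element by what sits between its start and end tags."""
    if len(element):  # len() counts child elements
        return "element content"
    if element.text and element.text.strip():
        return "text content"
    return "empty"

print(content_kind(bookstore))
print(content_kind(bookstore.find("author")))
```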

XML Attributes

Attributes provide additional information about elements.

Attributes often provide information that is not a part of the data.

An Example:

<book category="CHILDREN">

<title>Harry Potter</title>

<author>J. K. Rowling</author>

<year>2005</year>

</book>

In the example above only <book> has an attribute
(category="CHILDREN").
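
A parser keeps attributes separate from an element's content. The sketch below assumes Python's xml.etree.ElementTree and closes the <book> element so that the sample parses.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    '<book category="CHILDREN"><title>Harry Potter</title>'
    "<author>J. K. Rowling</author><year>2005</year></book>"
)

category = book.get("category")          # value of one attribute
title_attrs = book.find("title").attrib  # <title> carries no attributes

print(category)
print(title_attrs)
```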

XML Attributes Must be Quoted

Attribute values must always be enclosed in quotes, but either single or
double quotes can be used.

<person sex="female">

Or like this:

<person sex='female'>

If the attribute value itself contains double quotes we can use single
quotes, like in this example:

<gangster name='George "Shotgun" Ziegler'>

Or we can use character entities:

<gangster name="George &quot;Shotgun&quot; Ziegler">
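
Whichever quoting style is used, the parser delivers the same attribute value. A small sketch, assuming Python's xml.etree.ElementTree; the elements are written as self-closing tags here (an assumption) so that each sample is a complete document.

```python
import xml.etree.ElementTree as ET

double = ET.fromstring('<person sex="female"/>')
single = ET.fromstring("<person sex='female'/>")

# Double quotes inside the value: wrap in single quotes, or use &quot;.
wrapped = ET.fromstring("""<gangster name='George "Shotgun" Ziegler'/>""")
entity = ET.fromstring('<gangster name="George &quot;Shotgun&quot; Ziegler"/>')

print(double.get("sex") == single.get("sex"))
print(wrapped.get("name"))
```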

XML Elements vs. Attributes


There are no hard and fast rules about when to use child elements and
when to use attributes. Generally, we’ll use whichever suits our
application. With experience, we’ll gain a feel for when attributes are
easier than child elements and vice versa.

Until then, one good rule of thumb is that the data itself should be
stored in elements. Information about the data (meta-data) should be
stored in attributes. And when in doubt, put the information in the
elements.

To differentiate between data and meta-data, we can ask whether
someone reading the document would want to see a particular piece of
information. If the answer is yes, then the information probably belongs
in a child element. If the answer is no, then the information probably
belongs in an attribute.

If all tags were stripped from the document along with all the attributes,
the basic information should still be present. Attributes are good places
to put ID numbers, references and other information not directly or
immediately relevant to the reader.

Take a look at these examples:

<person sex="female">
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>

<person>
<sex>female</sex>
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>

In the first example sex is an attribute. In the last, sex is an element.
Both examples provide the same information.
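
That the two forms carry the same information can be shown by reading the value from either one. This is a sketch assuming Python's xml.etree.ElementTree; the helper sex_of is hypothetical and simply tries the attribute first and the child element second.

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring(
    '<person sex="female"><firstname>Anna</firstname>'
    "<lastname>Smith</lastname></person>"
)
as_element = ET.fromstring(
    "<person><sex>female</sex><firstname>Anna</firstname>"
    "<lastname>Smith</lastname></person>"
)

def sex_of(person):
    """Look for the value as an attribute first, then as a child element."""
    value = person.get("sex")
    return value if value is not None else person.findtext("sex")

print(sex_of(as_attribute), sex_of(as_element))
```

The reading code differs slightly between the two forms, which is one reason to pick a single convention and keep to it within an application.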

There are no rules about when to use attributes and when to use
elements. In XML it is usually simpler to avoid attributes and use
elements instead.

DISADVANTAGES IN USING ATTRIBUTES:

• attributes cannot contain multiple values (elements can)
• attributes cannot contain tree structures (elements can)
• attributes are not easily expandable (for future changes)

Attributes are difficult to read and maintain. Use elements for data. Use
attributes for information that is not relevant to the data.

XML Syntax Rules

1. All XML Elements Must Have a Closing Tag

In HTML, we will often see elements that don't have a closing tag:

<p>This is a paragraph

<p>This is another paragraph

In XML, it is illegal to omit the closing tag. All elements must have a
closing tag:

<p>This is a paragraph</p>

<p>This is another paragraph</p>
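
An XML parser enforces this rule. The sketch below assumes Python's xml.etree.ElementTree and wraps both samples in a <doc> element (an addition for this example) so that each has a single root.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if an XML parser accepts the text."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

unclosed = "<doc><p>This is a paragraph<p>This is another paragraph</doc>"
closed = "<doc><p>This is a paragraph</p><p>This is another paragraph</p></doc>"

print(is_well_formed(unclosed))  # False
print(is_well_formed(closed))    # True
```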

2. Assigning Meaning to XML Tags

Markup tags can have three kinds of meaning:
structure, semantics, and style.

Structure divides documents into a tree of elements. Semantics relates
the individual elements to the real world outside of the document itself.
Style specifies how an element is displayed.

Structure merely expresses the form of the document, without regard
for differences between individual tags and elements. For instance, we
can give the root element a different name. Even though the root name
is different, the content of the XML document is the same.

Semantic meaning exists outside the document, in the mind of the
author or in some computer program that generates or reads these files.

An English-speaking human would be more likely to understand
<GREETING> and </GREETING> or <DOCUMENT> and </DOCUMENT>
than <FOO> and </FOO> or <P> and </P>.

Naturally, it’s better to pick tags that more closely reflect the meaning
of the information they contain. Many disciplines like math and
chemistry are working on creating industry standard tag sets. These
should be used when appropriate.

3. XML Tags are Case Sensitive

XML tags are case sensitive. With XML, the tag <Letter> is different
from the tag <letter>. Opening and closing tags must be written with
the same case:

<Message>This is incorrect</message>
<message>This is correct</message>
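
Case sensitivity is easy to confirm with a parser. A sketch assuming Python's xml.etree.ElementTree; the <box> wrapper in the last sample is invented to hold two elements whose names differ only by case.

```python
import xml.etree.ElementTree as ET

def parses(text):
    """Return True if the text is accepted as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

mixed_case = parses("<Message>This is incorrect</message>")
same_case = parses("<message>This is correct</message>")

# <Letter> and <letter> are two different elements.
box = ET.fromstring("<box><Letter>A</Letter><letter>b</letter></box>")

print(mixed_case, same_case)
print(box.findtext("Letter"), box.findtext("letter"))
```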

4. XML Elements Must be Properly Nested

In XML, all elements must be properly nested within each other:

<b><i>This text is bold and italic</i></b>

In the example above, "Properly nested" simply means that since the
<i> element is opened inside the <b> element, it must be closed
inside the <b> element.
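
The nesting rule can be checked the same way. In this sketch (Python's xml.etree.ElementTree assumed) the overlapped sample closes <b> before <i>, which is exactly what the rule forbids.

```python
import xml.etree.ElementTree as ET

nested = "<b><i>This text is bold and italic</i></b>"
overlapped = "<b><i>This text is bold and italic</b></i>"

def nesting_ok(text):
    """Return True if every element closes inside the element that opened around it."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

tree = ET.fromstring(nested)
print(tree.tag, ">", tree[0].tag)  # <i> sits inside <b>
print(nesting_ok(nested), nesting_ok(overlapped))
```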

5. XML Documents Must Have a Root Element

XML documents must contain one element that is the parent of all
other elements. This element is called the root element.

<root>

<child>

<subchild>.....</subchild>

</child>

</root>
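
A document with two top-level elements has no single root, and a parser rejects it. A minimal sketch, assuming Python's xml.etree.ElementTree; the sample strings are made up for the illustration.

```python
import xml.etree.ElementTree as ET

two_roots = "<child>first</child><child>second</child>"
one_root = "<root><child>first</child><child>second</child></root>"

def has_single_root(text):
    """Return True if the text parses as one element containing all the others."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

print(has_single_root(two_roots))  # False
print(has_single_root(one_root))   # True
```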

6. XML Attribute Values Must be Quoted

In XML the attribute value must always be quoted. Study the two XML
documents below. The first one is incorrect, the second is correct:

<note date=12/11/2007>
<to>Tove</to>

<from>Jani</from>

</note>

<note date="12/11/2007">

<to>Tove</to>

<from>Jani</from>

</note>

The error in the first document is that the date attribute in the note
element is not quoted.
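
Feeding both documents to a parser shows the difference. This sketch assumes Python's xml.etree.ElementTree; the samples are the two documents above written on one line each.

```python
import xml.etree.ElementTree as ET

unquoted = "<note date=12/11/2007><to>Tove</to><from>Jani</from></note>"
quoted = '<note date="12/11/2007"><to>Tove</to><from>Jani</from></note>'

try:
    ET.fromstring(unquoted)
    unquoted_ok = True
except ET.ParseError:
    unquoted_ok = False

note = ET.fromstring(quoted)
print(unquoted_ok)       # False
print(note.get("date"))  # 12/11/2007
```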

7. Entity References

Some characters have a special meaning in XML.

If we place a character like "<" inside an XML element, it will generate
an error because the parser interprets it as the start of a new element.

This will generate an XML error:

<message>if salary < 1000 then</message>

To avoid this error, replace the "<" character with an entity
reference:

<message>if salary &lt; 1000 then</message>

There are 5 predefined entity references in XML:

&lt;      <    less than
&gt;      >    greater than
&amp;     &    ampersand
&apos;    '    apostrophe
&quot;    "    quotation mark
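
Entity references rarely need to be typed by hand when XML is produced by a program. The sketch below assumes Python, whose standard library offers xml.sax.saxutils.escape; by default it replaces &, < and >, and the two quote characters are added here through its optional second argument.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if salary < 1000 then"
escaped = escape(raw)

# The parser turns the entity reference back into the original character.
message = ET.fromstring(f"<message>{escaped}</message>")

# Quotes are only replaced when an extra mapping is supplied.
all_five = escape("<a & 'b' \"c\">", {"'": "&apos;", '"': "&quot;"})

print(escaped)
print(message.text)
print(all_five)
```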

8. Comment Lines

Like most languages, XML allows comment lines that explain the
document. In XML, comment lines are written like this:

<!-- This is a comment -->
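
A comment opens with the four characters <!-- written without spaces between them, and a parser normally skips it. The sketch below assumes Python's xml.etree.ElementTree, which drops comments by default; the <note> wrapper is invented for the example.

```python
import xml.etree.ElementTree as ET

note = ET.fromstring("<note><!-- This is a comment --><to>Tove</to></note>")
children = [child.tag for child in note]
print(children)  # the comment does not appear among the child elements

# A space inside the opening marker is not accepted as a comment.
try:
    ET.fromstring("<note><! -- This is a comment --><to>Tove</to></note>")
    spaced_ok = True
except ET.ParseError:
    spaced_ok = False
print(spaced_ok)
```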

XML Naming Rules

XML elements must follow these naming rules:

• Names can contain letters, numbers, and other characters
• Names must not start with a number or punctuation character
• Names must not start with the letters xml (or XML, or Xml, etc)
• Names cannot contain spaces
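
The four rules above can be approximated with a regular expression. This is a deliberately simplified sketch in Python: it only considers ASCII letters, digits, underscore, hyphen and period, whereas the full XML specification allows many more characters, so it should be read as an illustration of the listed rules and not as a complete name validator. The function name is hypothetical.

```python
import re

# First character: a letter or underscore, and the name must not begin with
# "xml" in any mix of upper and lower case. Later characters may also be
# digits, hyphens and periods. Spaces are never allowed.
NAME = re.compile(r"(?![xX][mM][lL])[A-Za-z_][A-Za-z0-9_.-]*")

def looks_like_valid_name(name):
    """Apply the simplified naming rules to a candidate element name."""
    return NAME.fullmatch(name) is not None

for candidate in ["book", "first-name", "1st", "xmlData", "first name"]:
    print(candidate, looks_like_valid_name(candidate))
```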

USES OF XML

XML is used in many aspects of web development, often to simplify
data storage and sharing.
1. XML separates Data from HTML

If dynamic data is to be displayed in an HTML document, it will take a
lot of work to edit the HTML each time the data changes.

With XML, data can be stored in separate XML files. This way we can
concentrate on using HTML for layout and display, and be sure that
changes in the underlying data will not require any changes to the
HTML.
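
The separation can be sketched as a small program that keeps the layout in one place and reads the data from XML. Everything here is an assumption made for illustration: Python as the language, the render_list function, the <tutorials> data shape (borrowed from the earlier example), and the second tutorial entry.

```python
import xml.etree.ElementTree as ET
from html import escape

def render_list(xml_text):
    """Build an HTML list from the <name> of every <tutorial> in the data."""
    data = ET.fromstring(xml_text)
    items = "".join(
        f"<li>{escape(tutorial.findtext('name', default='').strip())}</li>"
        for tutorial in data.iter("tutorial")
    )
    return f"<ul>{items}</ul>"

data_v1 = "<tutorials><tutorial><name>XML tutorial</name></tutorial></tutorials>"
data_v2 = (
    "<tutorials><tutorial><name>XML tutorial</name></tutorial>"
    "<tutorial><name>HTML tutorial</name></tutorial></tutorials>"
)

# The layout code is untouched; only the XML data differs between the calls.
print(render_list(data_v1))
print(render_list(data_v2))
```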

2. XML Simplifies Data Sharing

In the real world, computer systems and databases contain data in
incompatible formats.

XML data is stored in plain text format. This provides a software- and
hardware-independent way of storing data.

This makes it much easier to create data that different applications can
share.

3. XML Simplifies Data Transport

With XML, data can easily be exchanged between incompatible
systems. One of the most time-consuming challenges for developers is
to exchange data between incompatible systems over the Internet.

Exchanging data as XML greatly reduces this complexity, since the data
can be read by different incompatible applications.

4. XML Simplifies Platform Changes

Upgrading to a new system is time consuming. Large amounts of data
must be converted, and incompatible data is often lost. XML data is
stored in text format. This makes it easier to expand or upgrade to new
operating systems, new applications, or new browsers, without losing
data.

5. XML Makes our Data More Available

Since XML is independent of hardware, software and application, XML
can make our data more available and useful.

Different applications can access our data, not only in HTML pages, but
also from XML data sources.

With XML, our data can be available to all kinds of “reading machines”
(handheld computers, voice machines, news feeds, etc.), and it can be
made more available to blind people or people with other disabilities.

Related Technologies

XML doesn’t operate in a vacuum. Using XML as more than a data
format requires interaction with a number of related technologies.
These technologies include HTML for backward compatibility with
legacy browsers, the CSS and XSL style sheet languages, URLs and
URIs, the XLL linking language, and the Unicode character set.

1. Hypertext Markup Language

Mozilla 5.0 and Internet Explorer 5.0 are the first Web browsers to
provide some support for XML, but it takes about two years before most
users have upgraded to a particular release of the software. So we’re
going to need to convert our XML content into classic HTML for
some time to come. Therefore, before we jump into XML, we should be
completely comfortable with HTML.

XML separates the content of a document from the appearance of the
document. The content is developed first; then a format is attached to
that content with a style sheet. Separating content from style is an
extremely effective technique that improves both the content and the
appearance of the document. Among other things, it allows authors and
designers to work more independently of each other.

2. Cascading Style Sheets

Since XML allows arbitrary tags to be included in a document, there
isn’t any way for the browser to know in advance how each element
should be displayed. When we send a document to a user we also need
to send along a style sheet that tells the browser how to format
individual elements. One kind of style sheet we can use is a Cascading
Style Sheet (CSS).

CSS, initially designed for HTML, defines formatting properties like font
size, font family, font weight, paragraph indentation, paragraph
alignment, and other styles that can be applied to particular elements.
For example, CSS allows HTML documents to specify that all H1
elements should be formatted in 32 point centered Helvetica bold.

It’s easy to apply CSS rules to XML documents. We simply change the
names of the tags we’re applying the rules to. Mozilla 5.0 directly
supports CSS style sheets combined with XML documents, though at
present, it crashes rather too frequently.

3. Extensible Style Language

The Extensible Style Language (XSL) is a more advanced style-sheet
language specifically designed for use with XML documents. XSL
documents are themselves well-formed XML documents.

XSL documents contain a series of rules that apply to particular
patterns of XML elements. An XSL processor reads an XML document and
compares what it sees to the patterns in a style sheet. When a pattern
from the XSL style sheet is recognized in the XML document, the rule
outputs some combination of text.

CSS can only change the format of a particular element, and it can only
do so on an element-wide basis. XSL style sheets, on the other hand,
can rearrange and reorder elements. They can hide some elements and
display others.

Furthermore, they can choose the style to use not just based on the
tag, but also on the contents and attributes of the tag, on the position
of the tag in the document relative to other elements, and on a variety
of other criteria.

CSS has the advantage of broader browser support. However, XSL is far
more flexible and powerful, and better suited to XML documents.

BENEFITS

• Simplicity
Information coded in XML is easy to read and understand, plus
it can be processed easily by computers.

• Extensibility
There is no fixed set of tags. New tags can be created as they
are needed.

• Self-description
In traditional databases, data records require schemas set up
by the database administrator. XML documents can be stored
without such definitions, because they contain metadata in
the form of tags and attributes.

XML provides a basis for author identification and versioning at
the element level. Any XML tag can possess an unlimited
number of attributes such as author or version.

• Separates content from presentation
XML tags describe meaning, not presentation. The motto of
HTML is: "I know how it looks", whereas the motto of XML is: "I
know what it means, and you tell me how it should look." The
look and feel of an XML document can be controlled by XSL
style sheets, allowing the look of a document (or of a complete
Web site) to be changed without touching the content of the
document. Multiple views or presentations of the same content
are easily rendered.

• Can embed existing data
Mapping existing data structures like file systems or relational
databases to XML is simple. XML supports multiple data
formats and can cover all existing data structures.

• Rapid adoption by industry
Software AG, IBM, Sun, Microsoft, Netscape, Data Channel, SAP
and many others have already announced support for XML.
Microsoft will use XML as the exchange format for its Office
product line, while both Microsoft's and Netscape's Web
browsers support XML.
(EXTENSIBLE MARKUP
LANGUAGE)

By
R.Saranya

S.Sahira
