Document 6517775
Web Technologies III B.Tech II Sem (R09) CSE UNIT-II

Q: What is XML? Explain the various features of XML. What is Document Type Definition (DTD)? Explain how a DTD is created.

Ans: What is XML?
The essence of XML is in its name: Extensible Markup Language.

Extensible: XML is extensible. It lets you define your own tags, the order in which they occur, and how they should be processed or displayed.

Markup: The most recognizable feature of XML is its tags, or elements. In fact, the elements you'll create in XML will be very similar to the elements you've already been creating in your HTML documents. However, XML allows you to define your own set of tags.

Language: XML is a language that's very similar to HTML. It's much more flexible than HTML because it allows you to create your own custom tags. However, it's important to realize that XML is not just a language. XML is a meta-language: a language that allows us to create or define other languages. For example, with XML we can create other languages, such as RSS, MathML (a mathematical markup language), and even tools like XSLT.

Consider the following HTML document:

    <html>
    <head>
    <title>ABC Products</title>
    </head>
    <body>
    <h1>ABC Products</h1>
    <h2>Product One</h2>
    <p>Product One is an exciting new widget that will simplify your life.</p>
    <p><b>Cost: $19.95</b></p>
    <p><b>Shipping: $2.95</b></p>
    <h3>Product Two</h3>
    <p><i>Cost: $29.95</i></p>
    <p>Product Two is an exciting new widget that will make you jump up and down</p>
    <p><b>Shipping: $5.95</b></p>
    </body>
    </html>

A human can probably deduce that the <h2> tag in the above document has been used to tag a product name within a product listing. Furthermore, a human might be able to guess that the first paragraph after an <h2> holds the description, and that the next two paragraphs contain price and shipping information, in bold. However, even a cursory glance at the rest of the document reveals some very human errors. For example, the last product name is encapsulated in <h3> tags, not <h2> tags.
This last product listing also displays a price before the description, and the price is italicized instead of appearing in bold. A computer program (and even some humans) that tried to decipher this document wouldn't be able to make the kinds of semantic leaps required to make sense of it. The computer would be able only to render the document to a browser with the styles associated with each tag. HTML is chiefly a set of instructions for rendering documents inside a Web browser; it's not a method of structuring documents to bring out their meaning.

If the above document were created in XML, it might look a little like this.

First Example:

    <?xml version="1.0"?>
    <productListing title="ABC Products">
      <product>
        <name>Product One</name>
        <description>Product One is an exciting new widget that will simplify your life.</description>
        <cost>$19.95</cost>
        <shipping>$2.95</shipping>
      </product>
      <product>
        <name>Product Two</name>
        <description>Product Two is an exciting new widget that will make you jump up and down</description>
        <cost>$29.95</cost>
        <shipping>$5.95</shipping>
      </product>
    </productListing>

Prepared by A. Sharath Kumar (M.Tech), Asst.Prof

When we concentrate on a document's structure, as we've done here, we are better able to ensure that our information is correct. In theory, we should be able to look at any XML document and understand instantly what's going on. In the example above, we know that a product listing contains products, and that each product has a name, a description, a price, and a shipping cost. You could say, rightly, that each XML document is self-describing, and is readable by both humans and software.

Now, everyone makes mistakes, and XML programmers are no exception. Imagine that you start to share your XML documents with another developer or company, and, somewhere along the line, someone places a product's description after its price.
Normally, this wouldn't be a big deal, but perhaps your Web application requires that the description appears after the product name every time. To ensure that everyone plays by the rules, you need a DTD (a document type definition), or schema. Basically, a DTD provides instructions about the structure of your particular XML document. It's a lot like a rule book that states which tags are legal, and where. Once you have a DTD in place, anyone who creates product listings for your application will have to follow the rules. We'll get into DTDs a little later. For now, though, let's continue with the basics.

XML DTD:
A "Valid" XML document is a "Well Formed" XML document which conforms to the rules of a Document Type Definition (DTD). The purpose of a DTD is to define the structure of an XML document. It defines the structure with a list of legal elements. A DTD can be declared inline in your XML document, or as an external reference.

Consider this notice:

    NOTICE
    To: B.Tech III CSE
    From: CR
    Message: Don't forget your seminar presentations will be on 29th March (Thursday)

As an XML document with an inline DTD, it looks like this. The DOCTYPE declaration must come after the XML declaration and before the root element:

    <?xml version="1.0"?>
    <!DOCTYPE note [
    <!ELEMENT note (to,from,heading,Message)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT Message (#PCDATA)>
    ]>
    <note>
    <to> B.Tech III CSE </to>
    <from> CR </from>
    <heading>Reminder</heading>
    <Message> Don't forget your seminar presentations will be on 29th March (Thursday)! </Message>
    </note>

External DTDs
The DTD example above appeared within the DOCTYPE declaration at the top of the XML document. This is okay for experimentation purposes, but with many projects, you'll likely have dozens (or even hundreds) of files that must conform to the same DTD. In these cases, it's much smarter to put the DTD in a separate file, then reference it from your XML documents.
An external DTD is usually a file with a file extension of .dtd, for example letter.dtd. This external DTD contains the same notational rules set forth for an internal DTD. To reference this external DTD, you add two things to your XML document. First, edit the XML declaration to include the attribute standalone="no". This value is already assumed when a document refers to external declarations, so stating it is optional, but it makes the dependency explicit:

    <?xml version="1.0" standalone="no"?>

Second, add a DOCTYPE declaration that points to the external DTD, like this:

    <!DOCTYPE letter SYSTEM "letter.dtd">

This will search for the letter.dtd file in the same directory as the XML file. If the DTD lives on a different server, give its full URL as the system identifier:

    <!DOCTYPE letter SYSTEM "http://www.example.com/xml/dtd/letter.dtd">

The PUBLIC keyword can be used instead of SYSTEM, but it requires a public identifier followed by the system identifier.

This is the same XML document with an external DTD:

    <?xml version="1.0"?>
    <!DOCTYPE note SYSTEM "note.dtd">
    <note>
    <to> B.Tech III CSE </to>
    <from> CR </from>
    <heading>Reminder</heading>
    <Message> Don't forget your seminar presentations will be on 29th March (Thursday)! </Message>
    </note>

This is a copy of the file "note.dtd" containing the Document Type Definition:

    <!ELEMENT note (to,from,heading,Message)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT Message (#PCDATA)>

An element may contain other elements, parsed character data (#PCDATA), a mix of both, or nothing at all (EMPTY).

DTD Elements
Creating a DTD is quite straightforward. It's really just a matter of defining your elements, attributes, and/or entities. The following sections explain how to define elements and attributes.

To define an element in your DTD, you use the <!ELEMENT> declaration. The actual contents of your <!ELEMENT> declaration will depend on the syntax rules you need to apply to your element.

Basic Syntax
The <!ELEMENT> declaration has the following syntax:

    <!ELEMENT element_name content_model>

Here, element_name is the name of the element you're defining.
The content model could indicate a specific rule, data or another element.
•If it specifies a rule, it will be set to either ANY or EMPTY.
•If it specifies data or another element, the data type/element name needs to be surrounded by parentheses (i.e. (tutorial) or (#PCDATA)).

Plain Text: If an element should contain plain text, you define the element using #PCDATA. PCDATA stands for Parsed Character Data and is the way you specify non-markup text in your DTDs. In the example <name>XML Tutorial</name>, the "XML Tutorial" part is the PCDATA. The other part consists of markup.

    Syntax:  <!ELEMENT element_name (#PCDATA)>
    Example: <!ELEMENT name (#PCDATA)>

The above line in your DTD allows the "name" element to contain non-markup data in your XML document:

    <name>XML Tutorial</name>

Unrestricted Elements: If it doesn't matter what your element contains, you can create an element using the content_model of ANY. Note that doing this removes all content checking for that element, so you should avoid using it if possible. You're better off defining a specific content model.

    Syntax:  <!ELEMENT element_name ANY>
    Example: <!ELEMENT tutorials ANY>

Empty Elements: An empty element is one that has no content and can be written as a single self-closing tag. For example, in XHTML, the <br /> and <img /> tags are empty elements. Here's how you define an empty element:

    Syntax:  <!ELEMENT element_name EMPTY>
    Example: <!ELEMENT header EMPTY>

The above line in your DTD defines the following empty element for your XML document:

    <header />

Child Elements: You can specify that an element must contain another element, by providing the name of the element it must contain.
Here's how you do that:

    Syntax:  <!ELEMENT element_name (child_element_name)>
    Example: <!ELEMENT tutorials (tutorial)>

The above line in your DTD requires the "tutorials" element to contain exactly one instance of the "tutorial" element in your XML document:

    <tutorials>
      <tutorial></tutorial>
    </tutorials>

DTD Element Operators
One of the examples in the previous section demonstrated how to specify that an element ("tutorials") must contain one instance of another element ("tutorial"). This is fine if only one instance of "tutorial" is needed, but what if we didn't want a limit? What if the "tutorials" element should be able to contain any number of "tutorial" instances? Fortunately we can do that using DTD operators. Here's a list of operators/syntax rules we can use when defining child elements:

    Operator  Syntax        Description
    +         a+            One or more occurrences of a
    *         a*            Zero or more occurrences of a
    ?         a?            Either a or nothing
    ,         a, b          a followed by b
    |         a | b         Either a or b
    ()        (expression)  An expression surrounded by parentheses is treated as a unit and could have any one of the following suffixes: ?, *, or +.

Examples of usage follow.

Zero or More: To allow zero or more of the same child element, use an asterisk (*):

    Syntax:  <!ELEMENT element_name (child_element_name*)>
    Example: <!ELEMENT tutorials (tutorial*)>

One or More: To allow one or more of the same child element, use a plus sign (+):

    Syntax:  <!ELEMENT element_name (child_element_name+)>
    Example: <!ELEMENT tutorials (tutorial+)>

Zero or One: To allow either zero or one of the same child element, use a question mark (?):

    Syntax:  <!ELEMENT element_name (child_element_name?)>
    Example: <!ELEMENT tutorials (tutorial?)>

Choices: You can define a choice between two or more elements by using the pipe (|) operator.
For example, if the "tutorial" element requires a child called either "name", "title", or "subject" (but only one of these), you can do the following:

    Syntax:  <!ELEMENT element_name (choice_1 | choice_2 | choice_3)>
    Example: <!ELEMENT tutorial (name | title | subject)>

Mixed Content: You can use the pipe (|) operator to specify that an element can contain both PCDATA and other elements. #PCDATA must come first, and the group must end with an asterisk (*):

    Syntax:  <!ELEMENT element_name (#PCDATA | child_element_name)*>
    Example: <!ELEMENT tutorial (#PCDATA | name | title | subject)*>

DTD Attributes: Just as you need to define all elements in your DTD, you also need to define any attributes they use. You use the <!ATTLIST> declaration to define attributes in your DTD.

Syntax: You can use a single <!ATTLIST> declaration to declare all attributes for a given element. In other words, for each element (that contains attributes), you only need one <!ATTLIST> declaration. The <!ATTLIST> declaration has the following syntax:

    <!ATTLIST element_name attribute_name TYPE DEFAULT_VALUE ...>

Here, element_name refers to the element that you're defining attributes for, attribute_name is the name of the attribute that you're declaring, TYPE is the attribute type, and DEFAULT_VALUE is its default value.

    Example: <!ATTLIST tutorial published CDATA "No">

Here, we are defining an attribute called "published" for the "tutorial" element. The attribute's type is CDATA and its default value is "No".

Default Values
The DEFAULT_VALUE field can be set to one of the following values:

    Value         Description
    value         A simple text value, enclosed in quotes.
    #IMPLIED      There is no default value for this attribute, and the attribute is optional.
    #REQUIRED     There is no default value for this attribute, but a value must be assigned.
    #FIXED value  The #FIXED part specifies that the value must be the value provided. The value part represents the actual value.

Examples of these default values follow.

Value: You can provide an actual value to be the default value by placing it in quotes.

    Syntax:  <!ATTLIST element_name attribute_name CDATA "default_value">
    Example: <!ATTLIST tutorial published CDATA "No">

#REQUIRED: The #REQUIRED keyword specifies that you won't be providing a default value, but that you require that anyone using this DTD does provide one.

    Syntax:  <!ATTLIST element_name attribute_name CDATA #REQUIRED>
    Example: <!ATTLIST tutorial published CDATA #REQUIRED>

#IMPLIED: The #IMPLIED keyword specifies that you won't be providing a default value, and that the attribute is optional for users of this DTD.

    Syntax:  <!ATTLIST element_name attribute_name CDATA #IMPLIED>
    Example: <!ATTLIST tutorial rating CDATA #IMPLIED>

#FIXED: The #FIXED keyword specifies that you will provide a value, and that it is the only value that can be used by users of this DTD.

    Syntax:  <!ATTLIST element_name attribute_name CDATA #FIXED "value">
    Example: <!ATTLIST tutorial language CDATA #FIXED "EN">

Q: Explain about XML Namespace.

Ans: XML Namespace
In XML, a namespace is used to prevent any conflicts with element names. Because XML allows you to create your own element names, there's always the possibility of naming an element exactly the same as one in another XML document. This might be OK if you never use both documents together. But what if you need to combine the content of both documents? You would have a name conflict. You would have two different elements, with different purposes, both with the same name.

Example Name Conflict: Imagine we have an XML document containing a list of books. Something like this:

    <books>
      <book>
        <title>The Dream Saga</title>
        <author>Matthew Mason</author>
      </book>
      ...
    </books>

And imagine we want to combine it with the following HTML page:

    <html>
    <head>
    <title>Cool Books</title>
    </head>
    <body>
    <p>Here's a list of cool books...</p>
    (XML content goes here)
    </body>
    </html>

We will encounter a problem if we try to combine the above documents. This is because they both have an element called title. One is the title of the book, the other is the title of the HTML page. We have a name conflict. To prevent this name conflict, we can create a namespace for the XML document.

Example Namespace: Using the above example, we could change the XML document to look something like this:

    <bk:books xmlns:bk="http://somebooksite.com/book_spec">
      <bk:book>
        <bk:title>The Dream Saga</bk:title>
        <bk:author>Matthew Mason</bk:author>
      </bk:book>
      ...
    </bk:books>

We have added the xmlns:{prefix} attribute to the root element. We have assigned this attribute a unique value. This unique value is usually in the form of a Uniform Resource Identifier (URI). This defines the namespace. And, now that the namespace has been defined, we have added a bk prefix to our element names. Now, when we combine the two documents, the XML processor will see two different element names: bk:title (from the XML document) and title (from the HTML document).

In the example above, we created a namespace to avoid a name conflict between the elements of two documents we wanted to combine. When we defined the namespace, we defined it against the root element. This meant that the namespace was to be used for the whole document, and we prefixed all child elements with the same namespace. You can also define namespaces against a child node. This way, you could use multiple namespaces within the same document if required.
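The effect of a prefix can be checked with a namespace-aware parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the wrapper element `page` and the variable names are assumptions made for this illustration and do not come from the notes. It shows that the parser expands each prefix to its URI, so the two title elements no longer collide.

```python
import xml.etree.ElementTree as ET

# Hypothetical combined document: one unprefixed title, one bk:title.
DOC = """
<page xmlns:bk="http://somebooksite.com/book_spec">
  <title>Cool Books</title>
  <bk:book>
    <bk:title>The Dream Saga</bk:title>
    <bk:author>Matthew Mason</bk:author>
  </bk:book>
</page>
"""

root = ET.fromstring(DOC)

# ElementTree stores a qualified name as "{namespace-uri}localname".
BK = "{http://somebooksite.com/book_spec}"
page_title = root.find("title")
book_title = root.find(f"{BK}book/{BK}title")

print(page_title.tag)   # title
print(book_title.tag)   # {http://somebooksite.com/book_spec}title
print(book_title.text)  # The Dream Saga
```

Note that the parser compares the namespace URI, not the prefix; a document using a different prefix bound to the same URI would match the same lookup.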
Example Local Namespace: Here, we apply the namespace against the title element only:

    <books>
      <book>
        <bk:title xmlns:bk="http://somebooksite.com/book_spec"> The Dream Saga </bk:title>
        <author>Matthew Mason</author>
      </book>
      ...
    </books>

XML Default Namespace: The namespaces we created in the previous two examples involved applying a prefix. We applied the prefix when we defined the namespace, and we applied a prefix to each element that referred to the namespace.

You can also use what is known as a default namespace within your XML documents. The only difference between a default namespace and the namespaces we covered in the previous two examples is that a default namespace is one where you don't apply a prefix.

Example Default Namespace: Here, we define the namespace without a prefix:

    <books xmlns="http://somebooksite.com/book_spec">
      <book>
        <title>The Dream Saga</title>
        <author>Matthew Mason</author>
      </book>
      ...
    </books>

When you define the namespace without a prefix, all descendant elements are assumed to belong to that namespace, unless specified otherwise (i.e. with a local namespace).

Q: Explain about XML Schema in detail.

Ans: XML Schema
XML Schema is an XML-based alternative to DTD. An XML schema describes the structure of an XML document. The XML Schema language is also referred to as XML Schema Definition (XSD). The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD.
An XML Schema:
•Defines elements that can appear in a document
•Defines attributes that can appear in a document
•Defines which elements are child elements
•Defines the order of child elements
•Defines the number of child elements
•Defines whether an element is empty or can include text
•Defines data types for elements and attributes
•Defines default and fixed values for elements and attributes

One of the greatest strengths of XML Schemas is the support for data types. With support for data types:
•It is easier to describe allowable document content
•It is easier to validate the correctness of data
•It is easier to work with data from a database
•It is easier to define data facets (restrictions on data)
•It is easier to define data patterns (data formats)
•It is easier to convert data between different data types

XML Schemas Secure Data Communication
When sending data from a sender to a receiver, it is essential that both parties have the same "expectations" about the content. With XML Schemas, the sender can describe the data in a way that the receiver will understand. A date like "03-11-2004" will, in some countries, be interpreted as 3 November and in other countries as 11 March. However, an XML element with a data type like this:

    <date type="date">2004-03-11</date>

ensures a mutual understanding of the content, because the XML data type "date" requires the format "YYYY-MM-DD".
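The ambiguity described above is easy to reproduce. The following is a small sketch in Python (standard library only; the variable names are chosen for this illustration) that parses the same string under both conventions and then parses the YYYY-MM-DD form, which has only one reading.

```python
from datetime import date, datetime

ambiguous = "03-11-2004"

# The same string gives two different dates depending on the assumed convention.
day_first = datetime.strptime(ambiguous, "%d-%m-%Y").date()
month_first = datetime.strptime(ambiguous, "%m-%d-%Y").date()
print(day_first)    # 2004-11-03
print(month_first)  # 2004-03-11

# The YYYY-MM-DD form used by the schema "date" type has a single reading.
unambiguous = date.fromisoformat("2004-03-11")
print(unambiguous == month_first)  # True
```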
A Simple XML Document
Look at this simple XML document called "note.xml":

    <?xml version="1.0"?>
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>

A DTD File
The following example is a DTD file called "note.dtd" that defines the elements of the XML document above ("note.xml"):

    <!ELEMENT note (to, from, heading, body)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT body (#PCDATA)>

The first line defines the note element to have four child elements: "to, from, heading, body". Lines 2-5 define the to, from, heading, and body elements to be of type "#PCDATA".

An XML Schema
The following example is an XML Schema file called "note.xsd" that defines the elements of the XML document above ("note.xml"):

    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.w3schools.com"
        xmlns="http://www.w3schools.com"
        elementFormDefault="qualified">
      <xs:element name="note">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="to" type="xs:string"/>
            <xs:element name="from" type="xs:string"/>
            <xs:element name="heading" type="xs:string"/>
            <xs:element name="body" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>

A Reference to an XML Schema
This XML document has a reference to an XML Schema:

    <?xml version="1.0"?>
    <note xmlns="http://www.w3schools.com"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.w3schools.com note.xsd">
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>

The <schema> element is the root element of every XML Schema.

    <?xml version="1.0"?>
    <xs:schema>
    ...
    ...
    </xs:schema>

The <schema> element may contain some attributes.
A schema declaration often looks something like this:

    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.w3schools.com"
        xmlns="http://www.w3schools.com"
        elementFormDefault="qualified">
    ...
    ...
    </xs:schema>

The fragment xmlns:xs="http://www.w3.org/2001/XMLSchema" indicates that the elements and data types used in the schema come from the "http://www.w3.org/2001/XMLSchema" namespace. It also specifies that the elements and data types that come from the "http://www.w3.org/2001/XMLSchema" namespace should be prefixed with xs:.

The fragment elementFormDefault="qualified" indicates that any elements used by the XML instance document which were declared in this schema must be namespace qualified.
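What "namespace qualified" means for the instance document can be seen by inspecting the parsed element names. The sketch below uses Python's standard xml.etree.ElementTree, which does not validate against an XSD; it only shows that every element of the note document shown earlier falls into the target namespace through the default xmlns declaration. The names NOTE, TARGET and tags are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET

# The instance document from the notes, with its default namespace.
NOTE = """<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

root = ET.fromstring(NOTE)
TARGET = "{http://www.w3schools.com}"

# Child elements inherit the default namespace, so every tag is qualified.
tags = [el.tag for el in root.iter()]
print(all(tag.startswith(TARGET) for tag in tags))  # True

# Local names with the namespace part stripped off.
print([tag[len(TARGET):] for tag in tags])
# ['note', 'to', 'from', 'heading', 'body']
```

Actual schema validation needs a validating processor; this check only covers the namespace part of the rule.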