Displaying XML on the World Wide Web. XML Basics

The standard defines two levels of correctness for an XML document:

  • Well-formed. A well-formed document follows all the general rules of XML syntax that apply to any XML document. If, for example, a start tag has no corresponding end tag, the document is not well-formed. A document that is not well-formed cannot be considered an XML document; an XML processor (parser) must not process it normally and must treat the situation as a fatal error.
  • Valid. A valid document additionally conforms to certain semantic rules. This is a stricter, additional check of the document against predetermined external rules, intended to minimize errors in, for example, the structure and composition of a specific document or family of documents. These rules can be developed by the user or by third parties, for example, developers of vocabularies or data exchange standards. Such rules are usually stored in special files called schemas, which describe in detail the structure of the document, all permitted element and attribute names, and much more. If a document contains, for example, an element name that is not defined in the schemas, the XML document is considered invalid; a validating XML processor (validator) that checks compliance with the rules and schemas must (at the user's option) report an error.
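The first level of correctness can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample documents are illustrative assumptions, not part of the original article. ElementTree does not validate against schemas, so the second level (validity) would require a schema-aware library and is not shown.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser treats any syntax violation as a fatal error.
        return False


good = "<note><to>Reader</to></note>"
bad = "<note><to>Reader</note>"  # <to> has no matching end tag

print(is_well_formed(good))
print(is_well_formed(bad))
```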

These two concepts have no well-established, standardized translation into Russian. This is especially true of "valid", which can be rendered as valid, legitimate, reliable, fit, or even checked for compliance with rules, standards, or laws. Some programmers simply use the established loanword "valid" in everyday speech.

XML syntax

This section discusses only the well-formedness of XML documents, that is, their syntax.

XML is a hierarchical structure designed to store any data; visually, the structure can be represented as a tree. The most important mandatory syntactic requirement is that the document has exactly one root element (also called the document element). This means that the text or other data of the entire document must be located between a single root start tag and its corresponding end tag.

The following is the simplest example of a well-formed XML document:

This is a book: "Book"

The first line of an XML document is called the XML declaration. It is an optional line that indicates the version of the XML standard (usually 1.0); the character encoding and external dependencies can also be specified here. The specification requires XML processors to support the Unicode encodings UTF-8 and UTF-16 (UTF-32 is optional). Encodings based on the ISO/IEC 8859 standard are recognized as acceptable, and they are supported and widely used (but not required); other encodings are also acceptable, for example, the Russian Windows-1251 and KOI-8.
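As a rough illustration of the XML declaration and the two mandatory encodings, the sketch below parses the same document encoded as UTF-8 and as UTF-16. The root element name book is an assumption (the original tag name did not survive); only the text content comes from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical root element "book"; only its text is taken from the article.
body = '<book>This is a book: "Book"</book>'

# The declaration names the encoding that the bytes actually use.
utf8_doc = ('<?xml version="1.0" encoding="UTF-8"?>' + body).encode("utf-8")
utf16_doc = ('<?xml version="1.0" encoding="UTF-16"?>' + body).encode("utf-16")

root8 = ET.fromstring(utf8_doc)
root16 = ET.fromstring(utf16_doc)

# Both byte sequences decode to the same element and text.
print(root8.tag, root8.text)
print(root16.tag, root16.text)
```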

A comment can be placed anywhere in the tree. XML comments are placed between the delimiters <!-- and -->. Two hyphens (--) cannot be used anywhere within a comment.
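The comment rule can be tried out with a parser. This is a small sketch; the element names recipe and title are assumptions chosen to match the recipe example, and the helper parses is hypothetical.

```python
import xml.etree.ElementTree as ET

ok = "<recipe><!-- a comment inside the tree --><title>Plain bread</title></recipe>"
bad = "<recipe><!-- two hyphens -- inside --><title>Plain bread</title></recipe>"


def parses(text: str) -> bool:
    """Return True if the parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(parses(ok))   # comment between <!-- and --> is accepted
print(parses(bad))  # "--" inside a comment is rejected
```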

Below is an example of a simple culinary recipe, marked up using XML:

Plain bread Flour Yeast Warm water Salt

Structure

The rest of this XML document consists of nested elements, some of which have attributes and content. An element typically consists of an opening and a closing tag that surround text and other elements. The opening tag consists of the element name in angle brackets, for example, "<step>"; the closing tag consists of the same name in angle brackets with a forward slash before the name, for example, "</step>". The content of an element is everything between the opening and closing tags, including text and other (nested) elements. Below is an example of an XML element that contains an opening tag, a closing tag, and the element's content:

<step>Knead again, place on a baking sheet and put in the oven.</step>

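An element can also have attributes: name-value pairs written inside the opening tag, with each value enclosed in quotation marks:

<ingredient amount="3" unit="glass">Flour</ingredient>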

In the example above, the "ingredient" element has two attributes: "amount", which has the value "3", and "unit", which has the value "glass". As far as XML markup is concerned, these attribute values carry no meaning; they are simply strings of characters.
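The point that attribute values are just character strings can be seen when reading them with a parser. A minimal sketch, assuming the ingredient element is written as described above; converting "3" to a number is the application's decision, not something XML does.

```python
import xml.etree.ElementTree as ET

el = ET.fromstring('<ingredient amount="3" unit="glass">Flour</ingredient>')

amount_raw = el.get("amount")  # attribute values always arrive as strings
unit = el.get("unit")
content = el.text

# Any interpretation (here: an integer quantity) happens in application code.
amount = int(amount_raw)

print(repr(amount_raw), unit, content, amount)
```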

In addition to text, an element can contain other elements:

<instructions>
  <step>Mix all ingredients and knead thoroughly.</step>
  <step>Cover with a cloth and leave for one hour in a warm room.</step>
  <step>Knead again, place on a baking sheet and put in the oven.</step>
</instructions>
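Nested elements become child nodes of the tree. The sketch below assumes the lowercase element names instructions and step (XML names are case-sensitive, and the exact spelling in the original listing is not certain).

```python
import xml.etree.ElementTree as ET

doc = """<instructions>
  <step>Mix all ingredients and knead thoroughly.</step>
  <step>Cover with a cloth and leave for one hour in a warm room.</step>
  <step>Knead again, place on a baking sheet and put in the oven.</step>
</instructions>"""

root = ET.fromstring(doc)
steps = root.findall("step")  # direct children named "step"

for number, step in enumerate(steps, start=1):
    print(number, step.text)
```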

In this case, the "instructions" element contains three "step" elements. XML does not allow overlapping elements. For example, the following snippet is incorrect because the "em" and "strong" elements overlap.

Regular <em>accented <strong>highlighted and accented</em> highlighted</strong>
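A parser reports overlapping elements as an error. In the sketch below, the surrounding p element and the helper parse_error are assumptions added so that each string is a complete document; the properly nested variant is one possible correction, not the only one.

```python
import xml.etree.ElementTree as ET

overlapping = ("<p>Regular <em>accented <strong>highlighted and accented</em>"
               " highlighted</strong></p>")
nested = ("<p>Regular <em>accented <strong>highlighted and accented</strong></em>"
          " <strong>highlighted</strong></p>")


def parse_error(text: str):
    """Return the parser's error message, or None if the text is well-formed."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        return str(err)


overlap_error = parse_error(overlapping)
nested_error = parse_error(nested)

print(overlap_error)  # a message pointing at the mismatched closing tag
print(nested_error)
```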

Each XML document must contain exactly one root element (also called the document element), so the following fragment cannot be considered a well-formed XML document.

Entity #1 Entity #2
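The single-root rule can be demonstrated as follows. The tag names thing and things are hypothetical, since the original tag names were lost; wrapping the fragment in one enclosing element is a common way to make it a document.

```python
import xml.etree.ElementTree as ET

# Two top-level elements: a fragment, not a document.
fragment = "<thing>Entity #1</thing><thing>Entity #2</thing>"

try:
    ET.fromstring(fragment)
    fragment_ok = True
except ET.ParseError:
    fragment_ok = False

# Wrapping the fragment in a single root element makes it well-formed.
wrapped = ET.fromstring(f"<things>{fragment}</things>")
texts = [child.text for child in wrapped]

print(fragment_ok)
print(texts)
```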

To denote an element with no content, called an empty element, a special form of notation is used: a single tag in which a slash follows the element name (that is, a tag of the form <name/>, where "name" stands for the element name). If an element is not declared empty in the DTD but has no content in the document, this form of notation is also allowed for it.
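To a parser, the single-tag form and an opening tag immediately followed by a closing tag mean the same thing. A small sketch, with the element names recipe and note chosen only for illustration:

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring("<recipe><note/></recipe>")
long_form = ET.fromstring("<recipe><note></note></recipe>")

# In both cases the element has no text and no children.
short_note = short_form[0]
long_note = long_form[0]
print(short_note.text, len(short_note))
print(long_note.text, len(long_note))

# Serializing either tree produces identical bytes.
same_output = ET.tostring(short_form) == ET.tostring(long_form)
print(same_output)
```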

XML defines two ways of writing special characters: entity references and numeric character references. An entity in XML is named data, usually text, in particular special characters. An entity reference is placed where the entity should appear and consists of an ampersand (“&”), the entity name, and a semicolon (“;”). XML has several predefined entities, such as “lt” (referenced by writing “&lt;”) for the left angle bracket and “amp” (referenced as “&amp;”) for the ampersand; you can also define your own entities. Besides representing individual characters, entities can be used to represent frequently occurring blocks of text. Below is an example of using a predefined entity to avoid a literal ampersand in a name:

AT&amp;T

The complete list of predefined entities consists of &amp; (“&”), &lt; (“<”), &gt; (“>”), &apos; (“'”), and &quot; (“"”); the last two are useful for writing delimiters within attribute values. You can define your own entities in the DTD.
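The predefined entities can be observed in both directions: escaping text before it is placed in a document, and resolving references when the document is parsed. A minimal sketch using the standard library; the element names company and t are assumptions.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Writing: escape() replaces &, < and > with entity references.
escaped = escape("AT&T")
print(escaped)

# Reading: the parser turns the reference back into the character.
company = ET.fromstring(f"<company>{escaped}</company>")
print(company.text)

# All five predefined entities are resolved without any DTD.
all_five = ET.fromstring("<t>&amp; &lt; &gt; &apos; &quot;</t>").text
print(all_five)
```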

Sometimes it is necessary to write a non-breaking space, which is very often used in HTML, where it is denoted as &nbsp;. XML has no such predefined entity; the character is written as &#160;, and using &nbsp; causes an error. The absence of this very common entity comes as a surprise to many programmers and creates some difficulties when migrating their HTML work to XML.
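The difference between the two spellings of a non-breaking space can be checked directly. A short sketch; the element name p and the sample text are illustrative.

```python
import xml.etree.ElementTree as ET

# The HTML entity is not predefined in XML, so parsing fails.
try:
    ET.fromstring("<p>10&nbsp;km</p>")
    nbsp_entity_ok = True
except ET.ParseError:
    nbsp_entity_ok = False

# The numeric character reference works in any XML document.
text = ET.fromstring("<p>10&#160;km</p>").text

print(nbsp_entity_ok)
print(repr(text))
```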

A numeric character reference looks like an entity reference, but instead of the entity name it contains the # character followed by a number (in decimal or hexadecimal notation) that is the character's number in the Unicode code table. Such references are typically used for characters that cannot be encoded directly, such as an Arabic letter in an ASCII-encoded document. The ampersand can be represented as follows:

AT&#38;T
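Numeric character references can be seen in both directions as well. In this sketch the element names name and letter are assumptions; the Arabic letter beh (U+0628, decimal 1576) stands in for "an Arabic letter in an ASCII-encoded document".

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal references to the same character (&, U+0026).
decimal = ET.fromstring("<name>AT&#38;T</name>").text
hexadecimal = ET.fromstring("<name>AT&#x26;T</name>").text
print(decimal, hexadecimal)

# Serializing to ASCII: a character outside ASCII becomes a numeric reference.
letter = ET.Element("letter")
letter.text = "\u0628"  # ARABIC LETTER BEH
ascii_bytes = ET.tostring(letter, encoding="us-ascii")
print(ascii_bytes)

# Parsing the ASCII bytes restores the original character.
restored = ET.fromstring(ascii_bytes).text
```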

There are many more rules for creating a correct XML document, but the purpose of this brief overview was only to show the basics needed to understand the structure of an XML document.

History

The year of birth of XML can be considered either 1996, at the end of which a draft version of the language specification appeared, or 1998, when the specification was approved. It all started with the appearance of the SGML language in 1986.

SGML (Standard Generalized Markup Language) established itself as a flexible, comprehensive, and universal metalanguage for creating markup languages. Although the concept of hypertext appeared in 1965 (and its underlying principles were formulated in 1945), SGML has no hypertext model. The creation of SGML can fairly be called an attempt to embrace the unembraceable, since it combines capabilities that are very rarely all used together. This is its main drawback: the complexity and, consequently, the high cost of the language restrict its use to large companies that can afford to buy the appropriate software and hire highly paid specialists. Moreover, small companies rarely face tasks complex enough to call for SGML.

SGML is most widely used to create other markup languages; it was used to create HTML, the markup language for hypertext documents, whose specification was approved in 1992. HTML appeared because of the need to organize the rapidly growing body of documents on the Internet. The rapid growth in the number of Internet connections and, accordingly, of Web servers created a demand for encoding electronic documents that SGML could not meet because of its steep learning curve. The advent of HTML, a very simple markup language, quickly solved this problem: its ease of learning and rich set of document design tools made it the most popular language among Internet users. But as the number and quality of documents on the Web grew, so did the requirements placed on them, and the simplicity of HTML became its main drawback. The limited number of tags and complete indifference to document structure prompted developers, represented by the W3C consortium, to create a markup language that would be neither as complex as SGML nor as primitive as HTML. As a result, XML was born, combining the simplicity of HTML with the markup logic of SGML and meeting the demands of the Internet.

Strengths and weaknesses

Advantages

Disadvantages
  • Modeling ambiguity.
  • XML has no data type support built into the language. It has no strong typing, that is, no concepts of “integers”, “strings”, “dates”, “booleans”, and so on.
  • The hierarchical data model offered by XML is limited compared with the relational model, object-oriented graphs, and the network data model.

Displaying XML on the World Wide Web

The three most common ways to convert an XML document into a user-displayable form are:

  • Applying CSS styles;
  • Applying an XSLT transformation;
  • Writing an XML document handler in any programming language.

Without CSS or XSL, an XML document is displayed as plain text in most Web browsers. Some browsers, such as Internet Explorer, Mozilla, and Mozilla Firefox, display the document structure as a tree, allowing you to collapse and expand nodes with mouse clicks.

    Applying CSS styles

    The process is similar to applying CSS to an HTML document for display.

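    To apply CSS when the document is displayed in a browser, the XML document must contain a special reference to the style sheet: an xml-stylesheet processing instruction of the form <?xml-stylesheet type="text/css" href="myStyleSheet.css"?> (the file name here is only illustrative).

    This is different from the HTML approach, which uses the <link> element.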
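A style sheet reference of this kind is an ordinary processing instruction and can be read with a DOM parser. The sketch below uses Python's xml.dom.minidom; the file name style.css and the root element recipe are assumptions.

```python
from xml.dom import minidom

source = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/css" href="style.css"?>'
    "<recipe/>"
)

doc = minidom.parseString(source)

# The XML declaration is not a node; the first child is the instruction.
pi = doc.firstChild
is_pi = pi.nodeType == pi.PROCESSING_INSTRUCTION_NODE

print(is_pi)
print(pi.target)  # name of the processing instruction
print(pi.data)    # everything after the target, as one string
```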

    Applying an XSLT transformation

    XSL is a technology that describes how to format or transform the data of an XML document. The document is transformed into a format suitable for display in a browser. Display in a browser is the most common use of XSL, but keep in mind that XSL can transform XML into any format, for example

    Purpose of the lesson

    Introduction to XML technology. Exploring how XML documents can be presented in HTML. Using JavaScript scripts to navigate an XML table and to search the data by condition. Recommended reading.

    Brief theoretical information

    XML (eXtensible Markup Language) technology was created in the late 1990s. The main advantages of XML text are that it:

    □ has the structure of a database and is readable by both computers and humans;

    □ is conveniently processed by modern programming languages;

    □ is easily translated into HTML.

    Consider the following example of a text database written in XML:

    Three men in the boat

    Jerom-K-Jerom

    12000

    Notre Domme de Paris

    V.Hugo

    15000

    A War and Peace

    L. Tolstoy

    16500

    Angelika - the misstress of ghosts

    A and S. Gallen

    9000

    This is an example of a well-formed XML document whose elements are the tags , , , , ,

    The elements in the text are arranged as a tree with a single top element. Each opening tag has a corresponding closing tag, and the scope of each element is bounded by its opening and closing tags. Element scopes may not cross: scopes are either nested one inside another or do not intersect at all. The element whose scope contains the scopes of all other elements is called the root element. An XML document can be thought of as a text database. The value of an element is the information placed between the tags that define the element. So, the value of the first element is the string

    Three men in the boat.
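The idea of an element's value can be sketched with a parser. Only the tag name title is confirmed by the script later in this lesson; the names books, book, author, and price are hypothetical stand-ins for the tag names lost from the listing.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names except "title"; the data is taken from the listing.
TEXTBD = """<books>
  <book><title>Three men in the boat</title><author>Jerom-K-Jerom</author><price>12000</price></book>
  <book><title>Notre Domme de Paris</title><author>V.Hugo</author><price>15000</price></book>
  <book><title>A War and Peace</title><author>L. Tolstoy</author><price>16500</price></book>
  <book><title>Angelika - the misstress of ghosts</title><author>A and S. Gallen</author><price>9000</price></book>
</books>"""

root = ET.fromstring(TEXTBD)  # the root element contains all the others

# The value of an element is the text between its tags.
first_title = root.find("book/title").text
titles = [book.findtext("title") for book in root]

print(first_title)
print(len(titles))
```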

    Type this text in any editor and save it as a plain text file with an xml extension; for example, name the file textbd.xml. You can view this file in the Internet Explorer browser the same way you viewed HTML files. If there is an error, the XML interpreter will display detailed information about the location and nature of the error.

    Now we will show how to convert this output into tabular form using HTML. Let's create the following HTML file (Listing 2.12).

    Listing 2.12. HTML document to display XML tables

    The Book Title

    The author

    The price

    Let's save this HTML file as textbd.html. Now let's open it with a browser. The result will be like this (Fig. 2.9).

    Fig. 2.9. Displaying an XML document in an HTML document

    To connect the previously created XML file and link it to the table, tags are used:

    To display data in a table, tags for cells are used in the following form:

    The tag is used as a container. The DATAFLD parameter names the XML element whose value is to be displayed.

    When working with databases, one of the main tasks is finding the required information. In this lesson we will carry out such a search using JavaScript. Since the database can be quite large, displaying it in full in an HTML table is very inefficient. Therefore, we will not display the entire table but, say, only two records. In addition, we will add buttons for scrolling through the database. To do this, let's change our HTML document as follows (Listing 2.13).

    Listing 2.13. Modified HTML document to display an XML table

    Our first lesson in xml-technology

    The Book Title

    The author

    The price

    >

    <

    The &gt; entity is used to draw a right arrow, and the &lt; entity is used to draw a left arrow. At the same time, we indicate (with the table's DATAPAGESIZE attribute) that only two records are to be displayed in the table.

    Now let's add some functionality to our site. The idea is that we enter the title of a book, in full or in part, and when the button is pressed, the system displays the other details of the book (the author and the price) or reports that the book was not found. This requires JavaScript. In fact, only a few commands are needed.
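Showing two records at a time and scrolling with buttons amounts to simple paging. The sketch below illustrates the idea only and is not the browser mechanism itself; the function name page and the constant PAGE_SIZE are hypothetical.

```python
PAGE_SIZE = 2  # number of records shown at once, as in the lesson

records = [
    "Three men in the boat",
    "Notre Domme de Paris",
    "A War and Peace",
    "Angelika - the misstress of ghosts",
]


def page(items, number, size=PAGE_SIZE):
    """Return the records on the given zero-based page."""
    start = number * size
    return items[start:start + size]


first = page(records, 0)   # what is shown initially
second = page(records, 1)  # what the "right arrow" button would show next
beyond = page(records, 2)  # past the end: nothing left to show

print(first)
print(second)
```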

    □ getElementsByTagName("title").item(i).text;

    This command returns the value of the i-th "title" element in the XML file, counting the elements in the order in which they are listed.

    □ getElementsByTagName("title").length;

    This command returns the total number of "title" elements in the XML document.

    □ String.indexOf(string1);

    This command returns the position at which string1 first occurs in the string String, or -1 if there are no occurrences.
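The same search can be sketched outside the browser. The example below uses Python's xml.dom.minidom, whose getElementsByTagName, length, and item mirror the DOM calls listed above, with str.find playing the role of indexOf. The tag names other than title, and the function name find_book, are assumptions; this is not the lesson's own script.

```python
from xml.dom import minidom

# Hypothetical tag names except "title"; the data is taken from the listing.
TEXTBD = """<books>
  <book><title>Three men in the boat</title><author>Jerom-K-Jerom</author><price>12000</price></book>
  <book><title>Notre Domme de Paris</title><author>V.Hugo</author><price>15000</price></book>
  <book><title>A War and Peace</title><author>L. Tolstoy</author><price>16500</price></book>
  <book><title>Angelika - the misstress of ghosts</title><author>A and S. Gallen</author><price>9000</price></book>
</books>"""

doc = minidom.parseString(TEXTBD)
titles = doc.getElementsByTagName("title")


def find_book(fragment):
    """Return (title, author, price) of the first matching book, or None."""
    for i in range(titles.length):
        title = titles.item(i).firstChild.data
        if title.find(fragment) != -1:  # -1 means "no occurrence"
            book = titles.item(i).parentNode
            author = book.getElementsByTagName("author").item(0).firstChild.data
            price = book.getElementsByTagName("price").item(0).firstChild.data
            return title, author, price
    return None


print(find_book("War"))
print(find_book("Hamlet"))
```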

    Now let's show the extended HTML code for this task (Listing 2.14).

    Listing 2.14. Enhanced HTML Document to Display XML Table

    function showelement()

    // Connecting an XML document:

    var odoc=new ActiveXObject("Microsoft.XMLDOM");

    odoc.async=false; // Pause the program until loading is completed

    odoc.load("textbd.xml"); // Load an XML document into memory

    var stringl=document.myform.mytext.value;

    z=odoc.getElementsByTagName("title").length; // Getting the number of elements with the tag "title"

    for(i=0;i

    
    2024, leally.ru - Your guide in the world of computers and the Internet