W3C
NOTE-sgml-xml-971215


Comparison of SGML and XML

World Wide Web Consortium Note 15-December-1997


This version:
http://www.w3.org/TR/NOTE-sgml-xml-971215
Author:
James Clark <jjc@jclark.com>

Status of this document

This document is a NOTE made available by the W3 Consortium for discussion only. This indicates no endorsement of its content, nor that the Consortium has, is, or will be allocating any resources to the issues addressed by the NOTE.
Errors or omissions in this document should be reported to the author.


Abstract

This document provides a detailed comparison of SGML (ISO 8879) and XML.


Comparison of SGML and XML

Version 1.0

Table of Contents

1. Differences Between XML and SGML
2. Transforming SGML to XML
3. SGML Declaration for XML

1. Differences Between XML and SGML

XML allows only documents that use the SGML declaration in this note. This declares all the following SGML features as NO:

Note that it differs from the reference concrete syntax in a number of ways:

The following constructs which are permitted in SGML when SHORTTAG is YES are not allowed in XML:

NET delimiters can be used only to close an empty element. In SGML without the Web SGML Adaptations Annex, the NET delimiter is declared as />. With this approach, XML does not allow null end-tags and allows net-enabling start-tags only for elements with no end-tag. In SGML with the Web SGML Adaptations Annex, there is a separate NESTC (net-enabling start-tag close) delimiter. This allows the XML <e/> syntax to be handled as a combination of a net-enabling start-tag <e/ and a null end-tag >. With this approach, XML allows a net-enabling start-tag only when it is immediately followed by a null end-tag.
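The two readings of an empty-element tag can be illustrated with a small sketch. The function name split_empty_tag and its web_sgml flag are hypothetical, and the pattern is a simplification that accepts only an ASCII-style name with no attributes; it is not an SGML parser.

```python
import re

# Simplified empty-element tag: "<", a name, optional whitespace, "/>".
# Attributes are deliberately not handled in this sketch.
_EMPTY_TAG = re.compile(r"<([A-Za-z_][\w.\-]*)\s*/>")


def split_empty_tag(tag: str, web_sgml: bool) -> tuple[str, str]:
    """Split an XML empty-element tag according to one of the two SGML readings.

    web_sgml=False: NET is declared as "/>", so the tag is a start-tag
                    closed by the NET delimiter.
    web_sgml=True:  NESTC is "/" and NET is ">", so the tag is a
                    net-enabling start-tag followed by a null end-tag.
    """
    if _EMPTY_TAG.fullmatch(tag) is None:
        raise ValueError(f"not a simple empty-element tag: {tag!r}")
    if web_sgml:
        return tag[:-1], tag[-1:]   # ("<e/", ">")
    return tag[:-2], tag[-2:]       # ("<e", "/>")


if __name__ == "__main__":
    print(split_empty_tag("<e/>", web_sgml=False))
    print(split_empty_tag("<e/>", web_sgml=True))
```

Either way the same characters are accepted; only the delimiter roles assigned to them differ.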

XML imposes the following restrictions not in SGML:

XML predefines the semantics of the attributes xml:space and xml:lang. It also reserves all attribute, element type, and notation names beginning with xml (in any combination of upper and lower case).
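As a rough illustration, a tool that checks SGML names before conversion might classify them as follows. This is a minimal sketch: the function names are hypothetical, and the rule it applies is the one stated in XML 1.0, where xml:space and xml:lang are predefined and names beginning with "xml" in any mix of upper and lower case are reserved.

```python
# Attributes whose semantics XML 1.0 predefines.
PREDEFINED_ATTRIBUTES = frozenset({"xml:space", "xml:lang"})


def is_reserved_name(name: str) -> bool:
    """True if the name begins with x, m, l in any combination of case."""
    return name[:3].lower() == "xml"


def classify_name(name: str) -> str:
    """Return 'predefined', 'reserved', or 'ordinary' for a given name."""
    if name in PREDEFINED_ATTRIBUTES:
        return "predefined"
    if is_reserved_name(name):
        return "reserved"
    return "ordinary"


if __name__ == "__main__":
    for candidate in ("xml:lang", "XmlFoo", "title"):
        print(candidate, classify_name(candidate))
```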

XML requires that an SGML parser use an entity manager that behaves as follows:

XML imposes requirements on the information that a parser must make available to an application.

XML depends on the following changes to SGML made by Web SGML Adaptations Annex:

The Web SGML Adaptations Annex also enables some XML restrictions to be enforced in SGML:

2. Transforming SGML to XML

For most restrictions in XML that go beyond SGML, it is possible to transform an SGML document automatically into a document that meets the restrictions, and is equivalent in the sense that it has the same ESIS. There are a number of restrictions for which this is not the case:

External SDATA entities, external CDATA entities
These could be transformed into NDATA entities.
Subdocument entities
These could be converted into NDATA entities with a notation that indicates that they are SGML or XML.
References to external data entities in content
These could be transformed into an empty element with an attribute whose declared value is ENTITY.
Data attributes
Since an external data entity can only be used in an ENTITY or ENTITIES attribute on an element, these could be transformed into other attributes on the element.
Internal SDATA entities
References could be transformed into numeric character references to the appropriate Unicode character; if used in an ENTITY or ENTITIES attribute, the entity will have to be made external.
Internal CDATA entities
If used in an ENTITY or ENTITIES attribute, the entity will have to be made external (references to CDATA entities are not part of ESIS).
PI entities
If they contain ?>, they cannot be converted into an XML PI. It could be an application convention that entity references are replaced in PIs. Also, if they do not start with a name, they cannot be converted into a well-formed XML PI.
Names
An SGML document can have a concrete syntax which allows characters in names that XML does not allow in names.
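Two of the cases above lend themselves to a short sketch: replacing references to internal SDATA entities with numeric character references, and testing whether the text of a PI entity could become a well-formed XML PI. Everything named here is an assumption made for illustration. The mapping table SDATA_TO_CODEPOINT is hypothetical (a real conversion would need the mapping for the entity sets actually in use), the name patterns are ASCII-only simplifications of the XML name rules, and the reserved target "xml" is not checked.

```python
import re

# Hypothetical mapping from SDATA entity names to Unicode code points.
SDATA_TO_CODEPOINT = {
    "eacute": 0x00E9,
    "mdash": 0x2014,
    "nbsp": 0x00A0,
}

# Simplified general entity reference: "&name;".
_ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.\-]*);")

# Simplified PI target: an ASCII name followed by whitespace or end of text.
_PI_TARGET = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*(?:\s|\Z)")


def replace_sdata_refs(text: str, table=SDATA_TO_CODEPOINT) -> str:
    """Replace references to known SDATA entities with decimal character references.

    References to names not in the table are left untouched.
    """
    def repl(match: re.Match) -> str:
        codepoint = table.get(match.group(1))
        if codepoint is None:
            return match.group(0)
        return f"&#{codepoint};"

    return _ENTITY_REF.sub(repl, text)


def pi_convertible(pi_text: str) -> bool:
    """True if the PI entity text could be written as a well-formed XML PI.

    It must not contain the PI close delimiter "?>" and must start with a name.
    """
    if "?>" in pi_text:
        return False
    return _PI_TARGET.match(pi_text) is not None


if __name__ == "__main__":
    print(replace_sdata_refs("caf&eacute; &amp; bar"))
    print(pi_convertible("troff .sp 2"), pi_convertible("2 + 2"))
```

Entities that fail the check would need manual attention, since no automatic rewrite preserves the ESIS in those cases.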

3. SGML Declaration for XML

The following SGML declaration takes advantage of the Extended Naming Rules Technical Corrigendum to ISO 8879, but does not make use of the Web SGML Adaptations Annex:

The following SGML declaration takes advantage of the Web SGML Adaptations Annex to ISO 8879: