Tim Home

XML Elements vs. Attributes

发帖者 Tim on 2010年9月17日星期五 | 0 评论

When should I use XML attributes and when use XML elements?

Several frequently pondered questions of DTD design in SGML have followed the legacy to its offshoot, XML. Regardless of what XML schema language you use, you might find yourself asking:

When do I use elements and when do I use attributes for presenting bits of information?
When do I require an order for elements, and when do I just allow arbitrary order?
When do I use wrapper elements around sequences of similar elements?

In my experience working with users of XML, the first question is by far the most common. In some cases the answer is pretty unambiguous:

If the information in question could be itself marked up with elements, put it in an element.
If the information is suitable for attribute form, but could end up as multiple attributes of the same name on the same element, use child elements instead.
If the information is required to be in a standard DTD-like attribute type such as ID, IDREF, or ENTITY, use an attribute.
If the information should not be normalized for white space, use elements. (XML processors normalize attributes in ways that can change the raw text of the attribute value.)

Unfortunately, design decisions don't often fall into such neat categories. The question remains how to make the right choice in the gray areas. The usual answer is No single answer is right, use your best judgment. But this is not very helpful for those trying to find their feet with XML. True, even experts do not always agree on what to do in certain situations, but this is no reason not to offer basic guidelines for choosing between XML and attributes.
First I want to comment on two guidelines that I have heard and do not recommend. I have heard Just make everything an element. The reasons given range from Attributes just complicate things to Attributes can stunt extensibility. But if you do not use attributes, you are leaving out a very important aspect of XML's power, and you're probably better off using some delimited text format.
I have also heard If it is the sort of material you would expect to display in a browser, use element content. The problem with this guideline is that it encourages people to think of XML content design in terms of presentation, two considerations that should not be mixed. I present a very similar guideline in this article, but I express it in terms of the intent of the content, rather than in terms of presentation.
In the rest of this article, I present a set of guidelines that I do recommend when choosing between elements and attributes.
Recommended guidelines
I have divided these guidelines into a set of principles that I think frame the choice between elements and attributes overall. None of the guidelines are meant to be absolute; use them as rules of thumb and feel free to break the rules whenever your particular needs require it.
Principle of core content
If you consider the information in question to be part of the essential material that is being expressed or communicated in the XML, put it in an element. For human-readable documents this generally means the core content that is being communicated to the reader. For machine-oriented records formats this generally means the data that comes directly from the problem domain. If you consider the information to be peripheral or incidental to the main communication, or purely intended to help applications process the main communication, use attributes. This avoids cluttering up the core content with auxiliary material. For machine-oriented records formats, this generally means application-specific notations on the main data from the problem-domain.
As an example, I have seen many XML formats, usually home-grown in businesses, where document titles were placed in an attribute. I think a title is such a fundamental part of the communication of a document that it should always be in element content. On the other hand, I have often seen cases where internal product identifiers were thrown as elements into descriptive records of the product. In some of these cases, attributes were more appropriate because the specific internal product code would not be of primary interest to most readers or processors of the document, especially when the ID was of a very long or inscrutable format.
You might have heard the principle data goes in elements, metadata in attributes. The above two paragraphs really express the same principle, but in more deliberate and less fuzzy language.

Famous exceptions

I've been careful to point out that my design guidelines are flexible and that you will certainly find reason to make exceptions. You'll also find the need to balance the impetus from different principles. In many examples of standard XML vocabularies, conscious design decisions were made that contradict principles I post in this article. Here is a sampling.

SVG vector paths are expressed as a series of space-delimited numbers in a single attribute, which contradicts the Principle of structured information. This decision was made because SVG had an urgent need for terseness, but it has been criticized because of how it can complicate the processing of SVG formats.
Many formats that allow Cascading Stylesheet (CSS) styles in attributes (such as XHTML 1.0 and XSL-FO) allow you to create complex descriptions of style, mixing multiple facets and units of measure. This contradicts the Principle of structured information. Interestingly, this is no longer a concern in XHTML 2.0 because with that version you can only specify a class token that is mapped to a separate CSS. You no longer have in-line style attributes.

Principle of structured information
If the information is expressed in a structured form, especially if the structure may be extensible, use elements. On the other hand: If the information is expressed as an atomic token, use attributes. Elements are the extensible engine for expressing structure in XML. Almost all XML processing tools are designed around this fact, and if you break down structured information properly into elements, you'll find that your processing tools complement your design, and that you thereby gain productivity and maintainability. Attributes are designed for expressing simple properties of the information represented in an element. If you work against the basic architecture of XML by shoehorning structured information into attributes you may gain some specious terseness and convenience, but you will probably pay in maintenance costs.
Dates are a good example: A date has fixed structure and generally acts as a single token, so it makes sense as an attribute (preferably expressed in ISO-8601). Representing personal names, on the other hand, is a case where I've seen this principle surprise designers. I see names in attributes a lot, but I have always argued that personal names should be in element content. A personal name has surprisingly variable structure (in some cultures you can cause confusion or offense by omitting honorifics or assuming an order of parts of names). A personal name is also rarely an atomic token. As an example, sometimes you may want to search or sort by a forename and sometimes by a surname. I should point out that it is just as problematic to shoehorn a full name into the content of a single element as it is to put it in an attribute. Thus:

<customer>
  <name>Gabriel Okara</name>
  <occupation>Poet</occupation>
</customer>

is not much better than:

<customer name="Gabriel Okara">
  <occupation>Poet</occupation>
</customer>

I hope to expand on the treatment of people's names in markup in a future article.
Principle of readability
If the information is intended to be read and understood by a person, use elements. In general this guideline places prose in element content. If the information is most readily understood and digested by a machine, use attributes. In general this guideline means that information tokens that are not natural language go in attributes.
In some cases, people can decipher the information being represented but need a machine to use it properly. URLs are a great example: People have learned to read URLs through exposure in Web browsers and e-mail messages, but a URL is usually not much use without the computer to retrieve the referenced resource. Some database identifiers are also quite readable (although established database management best practice discourages using IDs that could have business meaning), but such IDs are usually props for machine processing. For these reasons I recommend putting URLs and IDs in attributes.
Principle of element/attribute binding
Use an element if you need its value to be modified by another attribute. XML establishes a very strong conceptual bond between an attribute and the element in which it appears. An attribute provides some property or modification of that particular element. Processing tools for XML tend to follow this concept and it is almost always a terrible idea to have one attribute modify another. For example, if you are designing a format for a restaurant menu and you include the portion sizes of items on the menu, you may decide that this is not really important to the typical reader of the menu format so you apply the Principle of core content and make it an attribute. The first attempt is:

<menu>
  <menu-item portion="250 mL">
    <name>Small soft drink</name>
  </menu-item>
  <menu-item portion="500 g">
    <name>Sirloin steak</name>
  </menu-item>
</menu>

Following the Principle of structured information you decide not to shoehorn the portion measurement and units into a single attribute, but instead of using an element, you opt for:

<menu>
  <menu-item portion-size="250" portion-unit="mL">
    <name>Small soft drink</name>
  </menu-item>
  <menu-item portion-size="500" portion-unit="g">
    <name>Sirloin steak</name>
  </menu-item>
</menu>

The attribute portion-unit now modifies portion-size, which as I've mentioned is a bad idea. An attribute on the element menu-item should modify that element, and nothing else. The solution is to give in and use an element:

<menu>
  <menu-item>
    <portion unit="mL">250</portion>
    <name>Small soft drink</name>
  </menu-item>
  <menu-item>
    <portion unit="g">500</portion>
    <name>Sirloin steak</name>
  </menu-item>
</menu>

In this case I applied a mix of the Principle of core content and the Principle of readability to the decision to put the value in content and the units in an attribute. This is one of those cases that are less cut and dried, and other schemes might be as suitable as mine. The solution also involves contradicting the original decision to put the portion size into an attribute based on the Principle of core content. This illustrates that sometimes the principles will lead to conflicting conclusions where you'll have to use your own judgment to decide on each specific matter.

No substitute for study and experience
XML design is a matter for professionals, and if you want to gain value from XML you should be willing to study XML design principles. Many developers accept that programming code benefits from careful design, but in the case of XML they decide it's OK to just do what seems to work . This is a distinction that I have seen lead to real and painful costs down the road. All it takes for you to learn sound XML design is to pay attention to the issues. Examine standard XML vocabularies designed by experts. Take note of your own design decisions and gauge the positive and negative effect each has had on later developments. As you gain experience, your instinct will become the most important tool in making design decisions, and the care you take will pay certain rewards if you find yourself using XML to any significant extent.

Resources

Check out the famous Cover Pages, which offer several compilations of information on the topic aggregated on these pages: SGML/XML: Elements versus attributes, April 1998 and SGML/XML: Using Elements and Attributes.
Read Effective XML by Elliotte Rusty Harold (Addison-Wesley, 2003). In my experience, the author has an excellent command of XML design and best practices. You can read some of the sections online, including "Store metadata in attributes" and "Make structure explicit through markup", both of which are directly relevant to this article.
Don't miss the any of articles in this series on XML design:
- "Use XML namespaces with care" (developerWorks, April 2004).
- "Element structures for names and addresses" (developerWorks, August 2004).

Ref:http://www.ibm.com/developerworks/xml/library/x-eleatt.html

3 different XML design styles: Russian Dolls,Salami Slice,and Venetian Blind

发帖者 Tim on | 0 评论

Russian Doll Design

This design approach has the schema structure mirror the instance document structure, e.g., declare a Book element and within it declare a Title element followed by an Author element:
The instance document has all its components bundled together. Likewise, the schema is designed to bundle together all its element declarations. This design represents one end of the design spectrum.

Salami Slice Design

The Salami Slice design represents the other end of the design spectrum. With this design we disassemble the instance document into its individual components. In the schema we define each component (as an element declaration), and then assemble them together:

<xsd:element name="Title" type="xsd:string"/>

    <xsd:element name="Author" type="xsd:string"/>

    <xsd:element name="Book">
        <xsd:complexType> 
            <xsd:sequence> 
                <xsd:element ref="Title"/> 
                <xsd:element ref="Author"/>
            </xsd:sequence> 
        </xsd:complexType>

</xsd:element>

Note how the schema declared each component individually (Title, and Author) and then assembled them together (by ref'ing them) in the creation of the Book component.
These two designs represent opposite ends of the design spectrum.
To understand these designs it may help to think in terms of boxes, where a box represents an element or type:

The Russian Doll design corresponds to having a single box, and it has nested within it boxes, which in turn have boxes nested within them, and so on. (boxes within boxes, just like a Russian Doll!)
The Salami Slice design corresponds to having many separate boxes which are assembled together (separate boxes combined together, just like Salami slices brought together in a sandwich!)

Venetian Blind Design

With this design approach we disassemble the problem into individual components, as the Salami Slice design does, but instead of creating element declarations, we create type definitions. Here's what our example looks like with this design approach:

<xsd:simpleType name="Title">
        <xsd:restriction base="xsd:string">
            <xsd:enumeration value="Mr."/> 
            <xsd:enumeration value="Mrs."/> 
            <xsd:enumeration value="Dr."/>
        </xsd:restriction>
    </xsd:simpleType>

    <xsd:simpleType name="Name">
        <xsd:restriction base="xsd:string">
            <xsd:minLength value="1"/> 
        </xsd:restriction>
    </xsd:simpleType>

    <xsd:complexType name="Publication">
        <xsd:sequence>
            <xsd:element name="Title" type="Title"/> 
            <xsd:element name="Author" type="Name"/>
        </xsd:sequence>
    </xsd:complexType>

    <xsd:element name="Book" type="Publication"/>

This design has:

maximized reuse (there are four reusable components - the Title type, the Name type, the Publication type, and the Book element)
maximized the potential to hide (localize) namespaces [note how this has been phrased: "maximized the potential ..." Whether, in fact, the namespaces of Title and Author are hidden or exposed, is determined by the elementFormDefault "switch"].

The Venetian Blind design espouses these guidelines ...

Design your schema to maximize the potential for hiding (localizing) namespace complexities.
Use elementFormDefault to act as a switch for controlling namespace exposure - if you want element namespaces exposed in instance documents, simply turn the elementFormDefault switch to "on" (i.e, set elementFormDefault= "qualified"); if you don't want element namespaces exposed in instance documents, simply turn the elementFormDefault switch to "off" (i.e., set elementFormDefault="unqualified").
Design your schema to maximize reuse.
Use type definitions as the main form of component reuse.
Nest element declarations within type definitions.

Ref:http://www.xfront.com/GlobalVersusLocal.html

有趣的计算机图标

发帖者 Tim on 2010年9月12日星期日 | 0 评论

Bluetooth

你可能没有听说过10世纪丹麦国王哈拉尔德的故事吧？这位国王是个有名的蓝莓行家，酷爱吃蓝莓。所以，他的牙齿被染蓝了，便有“蓝牙”这个绰号。在丹麦语中，“Bltand”依次对应的英文是“Bluetooth”。蓝牙Logo是哈拉尔德名字首字母的2个符号的组合体。恰恰有趣的是，首个蓝牙接收器的形状很像牙齿，并且它的颜色就是蓝色。

USB

USB图标本身就是USB1.0规格的部分，它有点类似罗马海神尼普顿的兵器三叉戟。（但你并不能用这个“三叉戟”来叉东西。）USB推广组织把三个叉子头部改成了三角形、正方形和圆形。为什么这么做呢？因为这就表明所有不同的附加的外设都可以用同一种标准。

@符号是唯一的荣登（纽约市）现代艺术博物馆之建筑和设计分类的符号。1971年，雷·汤姆林逊把@ 符号引入到邮件地址中，用来分隔用户名和机器名。在汤姆林逊之前的1885年，美国人安德伍德制造的第一台打字机键盘上就有@符号。更往前，@的历史更加奇特。有资料表明，六世纪的僧侣就已在用@符号。

Power
早在二战期间，工程师用二进制系统来标识电源按钮：1代表“开”，0代表“关”。1973年，国际电工委员会（ICE）把一个“开口处有条线嵌入的圆圈” 粗略地定义为“standbypowerstate（待机电源状态）”，并且一直沿用至今。但是，IEEE认为这个符号非常含糊，故把它定义为 “Power”。

My New Blogger,Cheers!!!

发帖者 Tim on | 0 评论

Tim Home

Pages

Lorem Ipsum/DOL Differet Solmena