Several frequently pondered questions of DTD design in SGML have followed the legacy to its offshoot, XML. Regardless of what XML schema language you use, you might find yourself asking: When should I use XML attributes, and when should I use XML elements?
First I want to comment on two guidelines that I have heard and do not recommend.
I have heard "Just make everything an element." The reasons given range from "Attributes just complicate things" to "Attributes can stunt extensibility." But if you do not use attributes, you are leaving out a very important aspect of XML's power, and you're probably better off using some delimited text format.
I have also heard "If it is the sort of material you would expect to display in a browser, use element content." The problem with this guideline is that it encourages people to think of XML content design in terms of presentation, and content design and presentation are two considerations that should not be mixed. I present a very similar guideline in this article, but I express it in terms of the intent of the content, rather than in terms of presentation.
In the rest of this article, I present a set of guidelines that I do recommend when choosing between elements and attributes.
Recommended guidelines
I have divided these guidelines into a set of principles that I think frame the choice between elements and attributes overall. None of the guidelines is meant to be absolute; use them as rules of thumb and feel free to break the rules whenever your particular needs require it.
Principle of core content
If you consider the information in question to be part of the essential material that is being expressed or communicated in the XML, put it in an element. For human-readable documents this generally means the core content that is being communicated to the reader. For machine-oriented records formats this generally means the data that comes directly from the problem domain.
If you consider the information to be peripheral or incidental to the main communication, or purely intended to help applications process the main communication, use attributes. This avoids cluttering up the core content with auxiliary material.
For machine-oriented records formats, this generally means application-specific notations on the main data from the problem domain.
As an example, I have seen many XML formats, usually home-grown in businesses, where document titles were placed in an attribute. I think a title is such a fundamental part of the communication of a document that it should always be in element content. On the other hand, I have often seen cases where internal product identifiers were thrown as elements into descriptive records of the product. In some of these cases, attributes were more appropriate because the specific internal product code would not be of primary interest to most readers or processors of the document, especially when the ID was of a very long or inscrutable format.
You might have heard the principle "data goes in elements, metadata in attributes." The above two paragraphs really express the same principle, but in more deliberate and less fuzzy language.
Principle of structured information
If the information is expressed in a structured form, especially if the structure may be extensible, use elements. On the other hand, if the information is expressed as an atomic token, use attributes.
Elements are the extensible engine for expressing structure in XML. Almost all XML processing tools are designed around this fact, and if you break down structured information properly into elements, you'll find that your processing tools complement your design, and that you thereby gain productivity and maintainability. Attributes are designed for expressing simple properties of the information represented in an element. If you work against the basic architecture of XML by shoehorning structured information into attributes, you may gain some specious terseness and convenience, but you will probably pay in maintenance costs.
Dates are a good example: a date has fixed structure and generally acts as a single token, so it makes sense as an attribute (preferably expressed in ISO 8601).
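As a rough sketch of how the guidelines so far might combine, consider a hypothetical product record. Every element name, attribute name, and value below is invented for illustration: the title sits in element content, while the inscrutable internal product code and an ISO 8601 date ride along as attributes.

```xml
<!-- Hypothetical record; all names and values are illustrative only -->
<product-description code="X7-00412-QZ-88" updated="2010-09-17">
  <!-- Core content that a reader cares about goes in elements -->
  <title>Stainless steel water bottle</title>
  <summary>Keeps drinks cold for a full day.</summary>
</product-description>
```

The code and the date are atomic tokens that mostly serve processing tools, which is why they are attributes in this sketch; other splits could be just as defensible.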
Representing personal names, on the other hand, is a case where I've seen this principle surprise designers. I see names in attributes a lot, but I have always argued that personal names should be in element content. A personal name has surprisingly variable structure (in some cultures you can cause confusion or offense by omitting honorifics or assuming an order of parts of names). A personal name is also rarely an atomic token. As an example, sometimes you may want to search or sort by a forename and sometimes by a surname.
I should point out that it is just as problematic to shoehorn a full name into the content of a single element as it is to put it in an attribute. Packing the whole name into one element is not much better than packing it into an attribute.
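To make the comparison concrete, here is a minimal sketch with an invented customer. The element names (customer, name, forename, surname) and the person are assumptions for illustration, not taken from any standard vocabulary.

```xml
<examples>
  <!-- Full name packed into one element: the parts cannot be addressed -->
  <customer>
    <name>Maria Okafor</name>
  </customer>

  <!-- Same problem, with the name moved into an attribute -->
  <customer name="Maria Okafor"/>

  <!-- Parts broken out, so tools can search or sort by either one -->
  <customer>
    <name>
      <forename>Maria</forename>
      <surname>Okafor</surname>
    </name>
  </customer>
</examples>
```

Even the third form bakes in assumptions about how names are structured, so treat it as one possible starting point.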
I hope to expand on the treatment of people's names in markup in a future article.
Principle of readability
If the information is intended to be read and understood by a person, use elements. In general this guideline places prose in element content. If the information is most readily understood and digested by a machine, use attributes. In general this guideline means that information tokens that are not natural language go in attributes.
In some cases, people can decipher the information being represented but need a machine to use it properly. URLs are a great example: people have learned to read URLs through exposure in Web browsers and e-mail messages, but a URL is usually not much use without the computer to retrieve the referenced resource. Some database identifiers are also quite readable (although established database management best practice discourages using IDs that could have business meaning), but such IDs are usually props for machine processing. For these reasons I recommend putting URLs and IDs in attributes.
Principle of element/attribute binding
Use an element if you need its value to be modified by another attribute. XML establishes a very strong conceptual bond between an attribute and the element in which it appears. An attribute provides some property or modification of that particular element. Processing tools for XML tend to follow this concept, and it is almost always a terrible idea to have one attribute modify another.
For example, if you are designing a format for a restaurant menu and you include the portion sizes of items on the menu, you may decide that this is not really important to the typical reader of the menu format, so you apply the Principle of core content and make it an attribute. The first attempt is:
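A sketch of such a first attempt, assuming a single hypothetical portion attribute and invented menu items:

```xml
<menu>
  <!-- Measurement and unit are squeezed into one attribute value -->
  <menu-item portion="250 mL">
    <name>Small soft drink</name>
  </menu-item>
  <menu-item portion="500 g">
    <name>Sirloin steak</name>
  </menu-item>
</menu>
```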
Following the Principle of structured information, you decide not to shoehorn the portion measurement and units into a single attribute, but instead of using an element, you opt for:
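A sketch of this second attempt, using the portion-size and portion-unit attribute names the text refers to and invented menu items:

```xml
<menu>
  <!-- The unit is now its own attribute, but it describes
       portion-size rather than the menu-item element itself -->
  <menu-item portion-size="250" portion-unit="mL">
    <name>Small soft drink</name>
  </menu-item>
  <menu-item portion-size="500" portion-unit="g">
    <name>Sirloin steak</name>
  </menu-item>
</menu>
```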
The attribute portion-unit now modifies portion-size, which, as I've mentioned, is a bad idea. An attribute on the element menu-item should modify that element, and nothing else. The solution is to give in and use an element:
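A sketch of the element-based version, assuming a hypothetical portion element with a unit attribute and invented menu items:

```xml
<menu>
  <menu-item>
    <!-- The unit attribute now modifies the element it sits on -->
    <portion unit="mL">250</portion>
    <name>Small soft drink</name>
  </menu-item>
  <menu-item>
    <portion unit="g">500</portion>
    <name>Sirloin steak</name>
  </menu-item>
</menu>
```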
In this case I applied a mix of the Principle of core content and the Principle of readability to the decision to put the value in content and the units in an attribute. This is one of those cases that are less cut and dried, and other schemes might be as suitable as mine. The solution also involves contradicting the original decision to put the portion size into an attribute based on the Principle of core content. This illustrates that sometimes the principles will lead to conflicting conclusions, and you'll have to use your own judgment to decide on each specific matter.
XML design is a matter for professionals, and if you want to gain value from XML you should be willing to study XML design principles. Many developers accept that programming code benefits from careful design, but in the case of XML they decide it's OK to "just do what seems to work." This is a distinction that I have seen lead to real and painful costs down the road.
All it takes for you to learn sound XML design is to pay attention to the issues. Examine standard XML vocabularies designed by experts. Take note of your own design decisions and gauge the positive and negative effect each has had on later developments. As you gain experience, your instinct will become the most important tool in making design decisions, and the care you take will pay off if you find yourself using XML to any significant extent.
Posted by Tim on Friday, September 17, 2010 | 0 comments
Russian Doll Design
This design approach has the schema structure mirror the instance document structure, e.g., declare a Book element and within it declare a Title element followed by an Author element.
The instance document has all its components bundled together. Likewise, the schema is designed to bundle together all its element declarations. This design represents one end of the design spectrum.
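A Russian Doll schema for the Book example might be sketched as follows. The target namespace URI and the use of xsd:string for both child elements are assumptions made for illustration.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://example.org/book"
            elementFormDefault="qualified">
  <!-- One global element; everything else is declared inside it -->
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Title" type="xsd:string"/>
        <xsd:element name="Author" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Because Title and Author are local declarations, nothing outside Book can reuse them.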
Salami Slice Design
The Salami Slice design represents the other end of the design spectrum. With this design we disassemble the instance document into its individual components. In the schema we define each component (as an element declaration), and then assemble them together.
This design has:
- maximized reuse (there are four reusable components - the Title type, the Name type, the Publication type, and the Book element)
- maximized the potential to hide (localize) namespaces. Note the phrasing "maximized the potential": whether the namespaces of Title and Author are in fact hidden or exposed is determined by the elementFormDefault "switch".
- Design your schema to maximize the potential for hiding (localizing) namespace complexities.
- Use elementFormDefault as a switch for controlling namespace exposure: if you want element namespaces exposed in instance documents, turn the switch "on" (i.e., set elementFormDefault="qualified"); if you don't want element namespaces exposed in instance documents, turn the switch "off" (i.e., set elementFormDefault="unqualified").
- Design your schema to maximize reuse.
- Use type definitions as the main form of component reuse.
- Nest element declarations within type definitions.
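To make the ideas in this section concrete, here are two hedged sketches for the same Book example. The namespace URI, the b prefix, and the string-based types are assumptions for illustration. The first follows the Salami Slice approach, with every component declared globally and then assembled by reference:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:b="http://example.org/book"
            targetNamespace="http://example.org/book">
  <!-- Each component is a global element declaration -->
  <xsd:element name="Title" type="xsd:string"/>
  <xsd:element name="Author" type="xsd:string"/>
  <!-- Book is assembled from the components by reference -->
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="b:Title"/>
        <xsd:element ref="b:Author"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

The second follows the guidelines listed above: reuse comes from named type definitions, element declarations are nested inside a type, and elementFormDefault acts as the namespace switch.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:b="http://example.org/book"
            targetNamespace="http://example.org/book"
            elementFormDefault="unqualified">
  <!-- Reusable named types -->
  <xsd:simpleType name="Title">
    <xsd:restriction base="xsd:string"/>
  </xsd:simpleType>
  <xsd:simpleType name="Name">
    <xsd:restriction base="xsd:string"/>
  </xsd:simpleType>
  <xsd:complexType name="Publication">
    <xsd:sequence>
      <!-- Local element declarations nested in the type definition -->
      <xsd:element name="Title" type="b:Title"/>
      <xsd:element name="Author" type="b:Name"/>
    </xsd:sequence>
  </xsd:complexType>
  <!-- The only global element -->
  <xsd:element name="Book" type="b:Publication"/>
</xsd:schema>
```

With elementFormDefault="unqualified", the local Title and Author elements appear without a namespace in instance documents; changing the value to "qualified" exposes the namespace without touching the rest of the schema.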
Posted by Tim on Sunday, September 12, 2010 | 0 comments
Bluetooth
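You may not have heard the story of Harald, the 10th-century king of Denmark. The king was a renowned connoisseur of blueberries and loved eating them, so his teeth were stained blue and he earned the nickname "Bluetooth". The Danish "Blåtand" corresponds to the English "Bluetooth". The Bluetooth logo combines the two symbols for the initials of Harald's name. Interestingly enough, the first Bluetooth receiver was shaped much like a tooth, and it was blue.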
USB
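The USB icon is itself part of the USB 1.0 specification. It somewhat resembles the trident, the weapon of the Roman sea god Neptune. (Not that you can spear anything with this "trident".) The USB promoters changed the tips of the three prongs into a triangle, a square, and a circle. Why? To show that all kinds of different peripherals can be attached using a single standard.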
@
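The @ symbol is the only symbol to have been admitted to the architecture and design collection of the Museum of Modern Art in New York City. In 1971, Ray Tomlinson introduced the @ symbol into e-mail addresses to separate the user name from the machine name. Before Tomlinson, as early as 1885, the keyboard of the first typewriter made by the American company Underwood already carried the @ symbol. Going back further, the history of @ is stranger still: some sources indicate that monks were already using the @ symbol in the sixth century.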
Power
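As far back as World War II, engineers used the binary system to label power buttons: 1 for "on" and 0 for "off". In 1973, the International Electrotechnical Commission (IEC) loosely defined "a broken circle with a line set into the gap" as the "standby power state", a definition that remains in use today. The IEEE, however, considered the symbol too vague and defined it simply as "power".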