When should I use XML attributes and when use XML elements?

Several frequently pondered questions of DTD design in SGML have followed the legacy to its offshoot, XML. Regardless of what XML schema language you use, you might find yourself asking when a piece of information belongs in an element and when it belongs in an attribute.
First I want to comment on two guidelines that I have heard and do not recommend.

I have heard "Just make everything an element." The reasons given range from "Attributes just complicate things" to "Attributes can stunt extensibility." But if you do not use attributes, you are leaving out a very important aspect of XML's power, and you're probably better off using some delimited text format.

I have also heard "If it is the sort of material you would expect to display in a browser, use element content." The problem with this guideline is that it encourages people to think of XML content design in terms of presentation, two considerations that should not be mixed. I present a very similar guideline in this article, but I express it in terms of the intent of the content, rather than in terms of presentation.

In the rest of this article, I present a set of guidelines that I do recommend when choosing between elements and attributes.

Recommended guidelines

I have divided these guidelines into a set of principles that I think frame the choice between elements and attributes overall. None of the guidelines are meant to be absolute; use them as rules of thumb and feel free to break the rules whenever your particular needs require it.

Principle of core content

If you consider the information in question to be part of the essential material that is being expressed or communicated in the XML, put it in an element. For human-readable documents this generally means the core content that is being communicated to the reader. For machine-oriented records formats this generally means the data that comes directly from the problem domain.

If you consider the information to be peripheral or incidental to the main communication, or purely intended to help applications process the main communication, use attributes. This avoids cluttering up the core content with auxiliary material.
For machine-oriented records formats, this generally means application-specific notations on the main data from the problem domain.

As an example, I have seen many XML formats, usually home-grown in businesses, where document titles were placed in an attribute. I think a title is such a fundamental part of the communication of a document that it should always be in element content. On the other hand, I have often seen cases where internal product identifiers were thrown as elements into descriptive records of the product. In some of these cases, attributes were more appropriate because the specific internal product code would not be of primary interest to most readers or processors of the document, especially when the ID was of a very long or inscrutable format.

You might have heard the principle "data goes in elements, metadata in attributes." The above two paragraphs really express the same principle, but in more deliberate and less fuzzy language.

Principle of structured information

If the information is expressed in a structured form, especially if the structure may be extensible, use elements. On the other hand, if the information is expressed as an atomic token, use attributes.

Elements are the extensible engine for expressing structure in XML. Almost all XML processing tools are designed around this fact, and if you break down structured information properly into elements, you'll find that your processing tools complement your design, and that you thereby gain productivity and maintainability. Attributes are designed for expressing simple properties of the information represented in an element. If you work against the basic architecture of XML by shoehorning structured information into attributes, you may gain some specious terseness and convenience, but you will probably pay in maintenance costs.

Dates are a good example: A date has fixed structure and generally acts as a single token, so it makes sense as an attribute (preferably expressed in ISO 8601).
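A minimal sketch of the date guideline, assuming a hypothetical invoice record (the invoice and description elements, the issued attribute, and the values are placeholders made up for illustration):

```xml
<!-- Hypothetical record: the date is a single ISO 8601 token, so it fits in an attribute -->
<invoice issued="2024-03-15">
  <description>Consulting services</description>
</invoice>
```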
Representing personal names, on the other hand, is a case where I've seen this principle surprise designers. I see names in attributes a lot, but I have always argued that personal names should be in element content. A personal name has surprisingly variable structure (in some cultures you can cause confusion or offense by omitting honorifics or assuming an order of parts of names). A personal name is also rarely an atomic token. As an example, sometimes you may want to search or sort by a forename and sometimes by a surname. I should point out that it is just as problematic to shoehorn a full name into the content of a single element as it is to put it in an attribute. Thus:
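As an illustrative sketch (the customer, name, and occupation element names and the values are placeholders assumed for this example, not taken from any real vocabulary), a full name packed into the content of a single element:

```xml
<!-- Hypothetical record: the whole name is one undifferentiated string in element content -->
<customer>
  <name>Ada Lovelace</name>
  <occupation>Mathematician</occupation>
</customer>
```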
is not much better than:
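An illustrative sketch of a record with the full name in an attribute (the customer and occupation elements, the name attribute, and the values are assumed placeholders, not taken from any real vocabulary):

```xml
<!-- Hypothetical record: the whole name is one undifferentiated string in an attribute -->
<customer name="Ada Lovelace">
  <occupation>Mathematician</occupation>
</customer>
```

Whether the full name sits in an attribute or in one element, a processor that needs to sort by surname has to pick the string apart itself; separate child elements for the parts of the name would avoid that.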
I hope to expand on the treatment of people's names in markup in a future article.

Principle of readability

If the information is intended to be read and understood by a person, use elements. In general this guideline places prose in element content. If the information is most readily understood and digested by a machine, use attributes. In general this guideline means that information tokens that are not natural language go in attributes.

In some cases, people can decipher the information being represented but need a machine to use it properly. URLs are a great example: People have learned to read URLs through exposure in Web browsers and e-mail messages, but a URL is usually not much use without the computer to retrieve the referenced resource. Some database identifiers are also quite readable (although established database management best practice discourages using IDs that could have business meaning), but such IDs are usually props for machine processing. For these reasons I recommend putting URLs and IDs in attributes.

Principle of element/attribute binding

Use an element if you need its value to be modified by another attribute. XML establishes a very strong conceptual bond between an attribute and the element in which it appears. An attribute provides some property or modification of that particular element. Processing tools for XML tend to follow this concept, and it is almost always a terrible idea to have one attribute modify another.

For example, if you are designing a format for a restaurant menu and you include the portion sizes of items on the menu, you may decide that this is not really important to the typical reader of the menu format, so you apply the Principle of core content and make it an attribute. The first attempt is:
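A sketch of what such a first attempt might look like. Only menu-item is named in the surrounding text; the menu and name elements, the portion attribute, and the values are assumptions made for illustration:

```xml
<!-- Hypothetical first attempt: measurement and unit share a single attribute -->
<menu>
  <menu-item portion="250 mL">
    <name>Small soft drink</name>
  </menu-item>
  <menu-item portion="500 g">
    <name>Sirloin steak</name>
  </menu-item>
</menu>
```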
Following the Principle of structured information you decide not to shoehorn the portion measurement and units into a single attribute, but instead of using an element, you opt for:
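A sketch of this second attempt, using the portion-size and portion-unit attribute names from the text; the menu and name elements and the values are assumed for illustration:

```xml
<!-- Hypothetical second attempt: one attribute (portion-unit) qualifies another (portion-size) -->
<menu>
  <menu-item portion-size="250" portion-unit="mL">
    <name>Small soft drink</name>
  </menu-item>
  <menu-item portion-size="500" portion-unit="g">
    <name>Sirloin steak</name>
  </menu-item>
</menu>
```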
The attribute portion-unit now modifies portion-size, which as I've mentioned is a bad idea. An attribute on the element menu-item should modify that element, and nothing else. The solution is to give in and use an element:
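One way this might look, with the value in element content and the units in an attribute. The portion element, its unit attribute, the menu and name elements, and the values are all assumptions made for illustration:

```xml
<!-- Hypothetical element form: the unit attribute modifies only the portion element that carries it -->
<menu>
  <menu-item>
    <portion unit="mL">250</portion>
    <name>Small soft drink</name>
  </menu-item>
  <menu-item>
    <portion unit="g">500</portion>
    <name>Sirloin steak</name>
  </menu-item>
</menu>
```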
In this case I applied a mix of the Principle of core content and the Principle of readability to the decision to put the value in content and the units in an attribute. This is one of those cases that are less cut and dried, and other schemes might be as suitable as mine. The solution also involves contradicting the original decision to put the portion size into an attribute based on the Principle of core content. This illustrates that sometimes the principles will lead to conflicting conclusions where you'll have to use your own judgment to decide on each specific matter.

XML design is a matter for professionals, and if you want to gain value from XML you should be willing to study XML design principles. Many developers accept that programming code benefits from careful design, but in the case of XML they decide it's OK to "just do what seems to work." This is a distinction that I have seen lead to real and painful costs down the road.

All it takes for you to learn sound XML design is to pay attention to the issues. Examine standard XML vocabularies designed by experts. Take note of your own design decisions and gauge the positive and negative effect each has had on later developments. As you gain experience, your instinct will become the most important tool in making design decisions, and the care you take will pay certain rewards if you find yourself using XML to any significant extent.
Posted by Tim on Friday, September 17, 2010