XML, VoiceXML, XLink, XHTML, XBRL, XForm,
XSLT, RDF and
Semantic Web Watch
Bob Jensen at Trinity University
This is a threaded discussion about meta languages and extensions of SGML. I cannot stress enough how much these newer technologies will affect e-Commerce, education, and the Web in general.
The document below is a disorganized collection of threads on XML and related topics. For a more organized introduction to these topics, go to Jensen's Overview and Timeline of OLAP, GML, SGML, HTML, XML, RDF, and XBRL at http://www.trinity.edu/rjensen/XBRLandOLAP.htm
WEB
TIMELINE
Hypertext ---> PC ---> GUI,Mouse ---> GML,SGML --->Internet
--->Hypermedia --->HTML,HTTP,WWW --->
DYNAMIC WEB TIMELINE
CGI,Java,JavaScript,DHTML,ActiveX,ASP ---> XML --->RDF ---> OLAP
---> XBRL
Accounting Relational Databases Versus XML Databases
Selected Software Alternatives for XML Authoring
OLAP Online Analytical Processing and Pivot Tables
Summary of XBRL and Business Reporting on the Internet
Frequently Asked Questions (FAQs)
What kind of language is XSLT?
Related Documents
Extended Summary of OLAP http://www.trinity.edu/rjensen/XBRLandOLAP.htm
Extended Summary of XBRL http://www.trinity.edu/rjensen/XBRLandOLAP.htm
Extended Summary of HTML http://www.trinity.edu/rjensen/XBRLandOLAP.htm
Extended Summary of XML http://www.trinity.edu/rjensen/XBRLandOLAP.htm
Bob Jensen's Technology Glossary
Working Paper 260: Network Databases: Past, Present, and Future
Overview of XML and RDF --- The Next Big Things on the Web
A March 15, 2001 message from Neal Hannon recommends the following three references:
The best place to start (for learning about XML) is a Scientific American article, http://www.sciam.com/1999/0599issue/0599bosak.html written by Jon Bosak and Tim Bray. Bosak and Bray were on the original XML working group. The article is short, readable and lays out the basic concepts of XML.
Next, try my XML resource page, located at http://web.bryant.edu/~nhannon/xbrl/xml.htm . At that site, I have gathered several articles that focus on the user's side of XML. For books, I recommend XML, A Manager's Guide by Kevin Dick (Addison Wesley).
Neal Hannon [nhannon@HOME.COM]
Avoiding Information Overload: Knowledge Management on the Internet --- http://www.jisc.ac.uk/techwatch/reports/tsw_02-02.html
Keywords: search, knowledge management, XML, metadata, RDF, ontology, agent
It is estimated that there are over two billion Web pages, and thousands of newsgroups and forums, on the Internet - covering virtually every topic imaginable. However, many users find that searching the Internet can be a time consuming and tedious process. Even experienced searchers sometimes run into difficulties. To fully benefit from the potential opportunities of the Internet, both Web site developers and users need to be aware of the tools and techniques for managing and retrieving online knowledge.
This has driven the development of improved search and information retrieval systems. However, we now need sophisticated information extraction (and/or summary) capabilities to present the user only with the information they need, rather than a large set of relevant documents to read.
Search service providers, Web portals, and amalgamations of community Web sites could all help their users to benefit today, just by adopting the current generation of knowledge management systems, particularly those with effective information extraction capabilities.
Metadata has a very useful role to play, but it has limitations with regard to information extraction.
One of the key opportunities of the XML initiative is to allow structure and (indirectly) "meaning" to be embedded into the content of the resource itself. XML provides the much needed data structure for computer-to-computer interaction. The availability of good user-friendly, and "intelligent", tools will be critical in persuading the wider community to adopt XML as an alternative to HTML.
It is probably reasonable to state that the current generation of knowledge management systems is an interim measure, to be superseded by AI systems in the long-term. Such systems will probably be able to process natural language and XML encoded content.
The success of Internet based knowledge management, and the Semantic Web, will require the development and integration of various data standards, ontology definitions, and knowledge management and agent technologies. It will take a concerted and significant effort to get there. The likely longer-term benefits are much more effective Internet searches and smart information extraction services, which present the user with concise relevant extracts.
In the meantime, perhaps we should also think about how authors represent knowledge and present information, and how users apply knowledge, in a more structured and meaningful way.
From Ecommerce Discussion Digest on August 1, 2001
What is XML?? A technical definition of XML or Extensible Markup Language is "a document markup language for defining structured information" --- http://html.about.com/library/weekly/aa091500a.htm#markup
http://www.compuware.com/products/fileaid/cs/ --- COMPUWARE's product supports it?
http://www.xmlspy.com/ --- XML development environment?
http://www.netwind.com/html/xml.html --- XML Development Courses?
http://www.online-learning.com/ --- XML Online (Internet) Courses?
http://www.learnkey.com/lkweb/Products/XML/index.asp --- Courses from "Industry Experts"?
http://www.citrixiforum.com/iForumGuest/cds/host.dll - Forum Topics?
http://www.netwind.com/html/xml.html --- Training Videos and CDs?
Paul Adams works up a lather over the Simple Object Access Protocol, a fast, easy, XML-based way for Web apps to talk to each other --- http://hotwired.lycos.com/webmonkey/02/08/index0a.html
Deconstructing Babel: XML and application integration
XML may not yet be a true "silver bullet," but it can be used to great effect in integration projects if IT managers create a detailed plan that can overpower its weaknesses.
"Deconstructing Babel: XML and application integration," by Henry Balen, Application Development Trends, December 2000 --- http://www.adtmag.com/
XML: A brief description
The eXtensible Markup Language, or XML, came out of the world of the Standard Generalized Markup Language (SGML). Since its introduction a little more than three years ago, XML has spawned a set of technologies that allow users to manipulate XML documents programmatically and use style sheets to perform transformations.
Initially, XML was developed to overcome the shortcomings of HTML, a markup language containing stylistic information. The aim of XML's developers was to create a language that was easy to use over the Internet, supported by a wide variety of applications, compatible with SGML and legible to humans. Like its ancestor, SGML, XML separates content from style. A typical XML document is hierarchical. It is made up of elements defined by tags. The code below is an example of a simple XML document. A document type definition (DTD), or XML Schema, is used to define the structure of a document. It was originally envisioned that the presentation of the information within an XML document could be viewed with an XML browser and associated style sheet. Now it is not uncommon for the style sheet to be applied on the server side and the XML translated into HTML.
An XML document is referred to as well formed if it conforms to the XML standard, and correct (or valid) if it complies with a DTD or Schema. At the core of any XML application is an XML parser. All XML parsers will check that the documents they receive are well formed, and most can also check to see if the document is valid. A Simple API for XML (SAX) parser has become the de facto standard for event-driven parsers. Most XML parsers are either event-driven or produce an in-memory Document Object Model (DOM) instance of the document. The one you use depends on the application and memory requirements. Producing a DOM tree requires more memory, but it can provide greater programmatic flexibility. SAX may be suitable for applications that need smaller memory footprints, and it can process the XML document as a stream of events.
--- Henry Balen
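The sidebar's sample listing did not survive in this thread, so what follows is only a stand-in. It is a minimal Python sketch using the standard library's xml.sax and xml.dom.minidom modules. The order document and its element names are my own invention for illustration, not Balen's example. It shows the same small document checked for well-formedness, read as a stream of SAX events, and loaded as an in-memory DOM tree.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document; element and attribute names are illustrative.
SAMPLE = """<?xml version="1.0"?>
<order id="1001">
  <customer>Acme Corp.</customer>
  <item sku="A-17" quantity="2">Widget</item>
  <item sku="B-03" quantity="1">Gadget</item>
</order>"""


class QuantityCounter(xml.sax.ContentHandler):
    """Event-driven handler: sees one start tag at a time and builds no tree."""

    def __init__(self):
        super().__init__()
        self.total = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.total += int(attrs["quantity"])


def total_with_sax(text):
    handler = QuantityCounter()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.total


def total_with_dom(text):
    # The whole document is held in memory as a tree of nodes.
    doc = minidom.parseString(text)
    return sum(int(node.getAttribute("quantity"))
               for node in doc.getElementsByTagName("item"))


def is_well_formed(text):
    # Well-formedness only; this does not validate against a DTD or Schema.
    try:
        minidom.parseString(text)
        return True
    except Exception:
        return False
```

Both functions arrive at the same total. The SAX version never holds more than one element's details at a time, while the DOM version keeps the full tree available for further navigation.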
XML has become the lingua franca for inter-application communication. Using XML, all messages sent between applications consist of self-describing text. This makes the messages easily understandable by both humans and machines, although it does not supply an efficient packaging of the message. (XML messages can be considerably larger than a binary representation of the same information.)
There are three aspects of inter-application communication:
Transport: how to get information across the wire.
Protocol: how to package the information sent across the wire.
Message: the information itself.
The transport is usually a lower level network standard such as TCP/IP. Inter-process communications standards, such as CORBA, DCE and DCOM, have their own protocols that sit on top of such transports.
The protocol used depends on the communication mechanism. Standards may use different protocols to communicate: CORBA uses IIOP, while electronic mail uses SMTP. Each of these protocols allows you to package a message, specify a destination and get the message to the designated location. In protocols that support remote method invocation (RMI), the destination can consist of an object reference and method.
With each of these protocols, the user defines the message that is sent across the wire. In the case of CORBA, DCE, DCOM and so on, the message is defined using an Interface Definition Language (IDL). In E-mail and message-oriented middleware (MOM) it can be more fluid. No matter what you use, there is an agreement between the sender and receiver about the meaning of the message. The meaning is not transferred with the message.
So why use XML? In XML, documents contain meta-information about the information being transmitted, and can be extended easily. However, XML is less efficient than transmitting the information using a binary protocol. One advantage, though, is that humans and computers can both read the document.
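The size tradeoff can be seen with a toy message. This is a rough Python sketch, assuming a made-up three-field order record and a binary layout I chose arbitrarily. It is not a benchmark of any real protocol.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record: (account number, quantity, unit price).
record = (40217, 12, 19.95)

# Binary packaging: sender and receiver must already agree on field order
# and types (here: unsigned int, unsigned short, double, little-endian).
binary_msg = struct.pack("<IHd", *record)

# XML packaging: the field names travel with the values.
root = ET.Element("order")
ET.SubElement(root, "account").text = str(record[0])
ET.SubElement(root, "quantity").text = str(record[1])
ET.SubElement(root, "unitPrice").text = str(record[2])
xml_msg = ET.tostring(root)

print(len(binary_msg), "bytes as binary")
print(len(xml_msg), "bytes as XML")
```

The XML form is several times larger, but a person (or a program that has never seen the binary layout) can still tell which number is the quantity.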
To overcome the communication problem, the application can be enabled to send and receive information in the form of XML. This can be done independent of protocol, and if the meaning is agreed upon between the applications or organizations, then you just need to get the package to its intended destination. How it gets there is up to you. Of course, in these days of the Internet, the HTTP protocol is a natural choice. There are business domain-specific XML vocabularies under development.
Application integration
From the point of view of an application, there are various points of integration: data store, APIs or components, and protocol. The point of integration used depends on the nature of the application. If integration means the ability to speak XML, then you will need to acquire or build adapters for the point of integration. These adapters are responsible for getting information in and out of the application, and performing any necessary transformations along the way.
If the integration involves the sharing of information, you may want to integrate at the level of the data store. Assuming you have an existing database containing the information you want to share, your integration adapter is responsible for translating from a query's result set to an XML document. Conversely, when the application receives information in the form of XML, the adapter performs a reverse translation and maps the document elements to the appropriate database entities.
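A data store adapter of this kind might look roughly like the following. It is a minimal Python sketch, assuming an in-memory SQLite database and a customer table that I made up for illustration. A production adapter would also need to deal with data types, nulls, and schema differences.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, query, root_tag="customers", row_tag="customer"):
    """Translate a query's result set into an XML document (as text)."""
    cursor = conn.execute(query)
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_tag)
    for row in cursor:
        elem = ET.SubElement(root, row_tag)
        for column, value in zip(columns, row):
            ET.SubElement(elem, column).text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_rows(conn, xml_text):
    """Reverse translation: map document elements back to table columns."""
    for elem in ET.fromstring(xml_text):
        conn.execute(
            "INSERT INTO customer (id, name) VALUES (?, ?)",
            (int(elem.findtext("id")), elem.findtext("name")),
        )


# Hypothetical sending application.
sender = sqlite3.connect(":memory:")
sender.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
sender.execute("INSERT INTO customer VALUES (1, 'Acme Corp.')")
document = rows_to_xml(sender, "SELECT id, name FROM customer")

# Hypothetical receiving application with its own copy of the table.
receiver = sqlite3.connect(":memory:")
receiver.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
xml_to_rows(receiver, document)
```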
Oracle Corp., Redwood Shores, Calif., sells a relational database with a degree of XML support. XML is either mapped as just described, or the database's hybrid capabilities can store XML natively. The SQL syntax has been extended with an XML Query language.
Object or network databases may provide a more natural mapping for XML to the database's representation. A persistent Document Object Model (DOM) mechanism can preserve the structure of the XML document. You should be aware that while an XML document provides a good way in which to represent information, it is not an application domain model.
In addition, some products are being marketed as XML databases. eXcelon Corp. (formerly Object Design Inc.), Burlington, Mass., has re-purposed its object database to handle XML. Conversely, there are some products, such as Tamino from Software AG, Darmstadt, Germany, that were built from the ground up to handle just XML. While each product provides an XML Query language, it has not been standardized. The World Wide Web Consortium (W3C) is currently working on a standard for XML Queries, which I expect most vendors will adopt.
Of course, most existing data is kept in hierarchical or relational databases, and you cannot ignore this if you want to integrate at the database level. If you are in this camp, take a look at tools that help with the translation to and from XML.
Integration at the API level can be achieved through handcrafted adapters. Using an off-the-shelf XML parser, an adapter can be constructed that will translate from the received XML to an object model or function/method invocation. Similarly, you can transform information from the application to an XML document for transmittal. If the application supports one of the component models, you may be able to acquire an adapter that implements a bridge to the world of XML. If you are using one of the industry XML schemas, however, you will most likely have to code a transformation; an XSL Transformations (XSLT) processor is useful. With XSLT, you can use an XML dialect to define the transformation rules.
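A handcrafted API-level adapter can be sketched in a few lines. This Python sketch assumes a hypothetical Inventory class standing in for the existing application, and an addStock message format that I invented. It shows the received XML being translated into a method invocation and the result being translated back into XML.

```python
import xml.etree.ElementTree as ET


class Inventory:
    """Stand-in for an existing application API (hypothetical)."""

    def __init__(self):
        self.stock = {}

    def add_stock(self, sku, quantity):
        self.stock[sku] = self.stock.get(sku, 0) + quantity
        return self.stock[sku]


def handle_message(app, xml_text):
    """Adapter: parse the XML, call the matching method, reply in XML."""
    request = ET.fromstring(xml_text)
    # Only operations listed here can be invoked from a message.
    handlers = {"addStock": app.add_stock}
    method = handlers[request.tag]
    on_hand = method(request.get("sku"), int(request.get("quantity")))
    reply = ET.Element(request.tag + "Response")
    reply.set("onHand", str(on_hand))
    return ET.tostring(reply, encoding="unicode")


app = Inventory()
print(handle_message(app, '<addStock sku="A-17" quantity="5"/>'))
```

The explicit handler table is a deliberate choice: the adapter never calls a method merely because a tag name happens to match one.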
When the application already utilizes middleware, such as MOM or CORBA, then the adapter provides a gateway. This gateway can receive the XML messages, decide which components need to be notified, and perform the necessary translation. Commercial implementations of these gateways, such as CapeConnect from Cape Clear Software, Dublin, Ireland, are starting to ship. These XML brokers use XML for the content of the message and protocol to specify the destination. The W3C is working on an XML RPC mechanism standard. One submission, first promoted by Microsoft and now gaining wide support, is SOAP.
SOAP is a lightweight XML protocol for the exchange of information. It is probably the leading contender for adoption by the W3C. SOAP can provide synchronous and asynchronous mechanisms to send requests between applications using a variety of protocols. Robust security and transactional capabilities still need to be added to the SOAP protocol.
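To give a feel for what a SOAP request looks like, here is a small Python sketch that builds and reads an envelope with the standard library. I am assuming the SOAP 1.1 envelope namespace; the application namespace, the GetLastTradePrice operation, and the symbol parameter are hypothetical. The sketch covers only the message, not the transport, security, or error handling.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
APP_NS = "urn:example:quotes"  # hypothetical application namespace


def build_request(symbol):
    """Wrap a method call and its argument in an Envelope/Body pair."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}GetLastTradePrice")
    ET.SubElement(call, "symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")


def read_request(xml_text):
    """Recover the operation name and argument on the receiving side."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    call = body[0]
    # Tags look like "{namespace}Name"; keep only the local name.
    operation = call.tag.split("}", 1)[1]
    return operation, call.findtext("symbol")
```

The resulting text could then be sent over HTTP or any other transport the two applications agree on.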
CORBA middleware users may find it interesting that the Object Management Group (OMG) has put out a call for proposals for a SOAP–CORBA mapping. Along with work on XML value types for CORBA, this can provide a natural basis for XML enabling the CORBA infrastructure.
XML Standards In Effect or In Process
Much of the material in the following table was derived from the OASIS Cover Pages, maintained by Robin Cover. These pages are a tremendous resource for anyone wanting to keep track of the rapidly changing XML marketplace. Also, you can get connected with XML developers at Cisco’s XML community at www.hotdispatch.com/cisco-ip-telephony .
All links were visited and supplementary material added in December, 2000 --- http://www.stratvantage.com/directories/xmlstandards.htm
Year 2000: Important updates on XML at http://www.w3.org/TR/xhtml1/
"XML's Grand Schema XML Schema Language is a powerful feature that can
be used to validate data in myriad ways, and save you time in the process"
by Yasser Shohoud at http://www.xmlmag.com/upload/free/features/xml/2000/03sum00/ys0300/ys0300.asp
This is a good review article dated Summer 2000.
What's on Microsoft's wish list today? From Newsweek, April 17, 2000, pg. 43. You can read the entire article online at http://newsweek.com/nw-srv/printed/us/bz/a18441-2000apr9.htm
The current rubric for this effort is "Next Generation Windows Services," with an emphasis on that final word. The Microsoft vision is to replace the bulk of its software with a collection of dynamic "services" that makes it easy for customers to access and manipulate information spread out over the Web. In Microsoft's telling, the Web you know and love is severely limited: you can view pages but can't really fool around with the information it offers. By making use of a recent standard for creating Web pages called XML, however, it's possible to use that data as smoothly as you can massage the numbers in your own little spreadsheet at home. A whole new set of possibilities open where minutiae stored in the bowels of Web-connected databases get integrated into your life. Want to travel? Your personal calendar could take into account the weather in destinations you're scheduled to visit, as well as whether seats remain for discount fares on your favorite airline. And if your stockholdings increase, you may automatically upgrade your hotel reservation to a suite. Another benefit of XML is that by unhooking data from a fixed page view, it can effortlessly display the same figures, facts and trivia in devices ranging from mobile phones to e-books.
An April 7, 2000 draft of the W3C's XML Schema Part 0: Primer is available at http://www.w3.org/TR/xmlschema-0/ For later versions, go to http://www.w3.org/TR/xmlschema-0/
One of the best articles on XML with a minimum of techie jargon, is entitled "Is XML the answer? Depends on the Question?" by Michael Goulde in Application Development Trends, October 1999, pp. 21-22. The online version is at http://www.adtmag.com/Pub/oct99/d9910xml.htm
One of the reasons XML has captured so much interest so quickly (Version 1.0 of the XML specification was released in February 1998) is that it represents a parsimonious solution to a wide variety of problems. There are three sets of users who have a very high level of interest in XML. The first group includes Webmasters and other designers of Web-based information systems who use HTML to mark up information for presentation, but have no way to structure the information they send to browsers. By providing structure to the unstructured Web data in a standard way, a Web query can deliver a much more useful set of results, increasing the value of the information.
The second group of users have toiled for years with the Standard Generalized Markup Language (SGML) to create structured documents such as training manuals and technical documentation. Although HTML is derived from SGML, SGML in general is not well suited to the Web environment because it is extremely complex -- something that has also affected its universal adoption. XML is an SGML derivative that is not only easier to use on the Web, it has garnered wider adoption. These users have also become very active in World Wide Web Consortium (W3C) working groups that are hammering out additional specifications and standards to ensure that XML meets many of the same application requirements as SGML.
The third set of users fascinated by XML are a set who were not originally targeted by the W3C's XML efforts. But very early on, application developers building distributed applications -- and faced with difficult challenges around application integration and interoperability -- saw XML as a way to free their applications from the tyranny of over-the-wire binary formats that made it impossible to link applications together in real time. These developers, many from the Java community, but equally as many using Microsoft tools, quickly realized that they could use XML syntax in their messages and, because of the self-describing nature of XML documents, applications could exchange data without having to be explicitly written or compiled to do so. Freedom! This group has now expanded to include developers who want to extend EDI, link networks of suppliers and customers, create dynamic marketplaces, and perform other heretofore impossible tasks over the Web.
The problem with networking of databases at the present time is that systems are proprietary and lack standards for efficient and effective computing around the world. Companies would like their particular products to dominate e-commerce, but that just is not going to happen because of emerging standards for networking of databases. The lead in standard setting is being taken by the World Wide Web Consortium (W3C).
Tim Berners-Lee led a team of physicists who invented HTML scripting and the HTTP protocol. This creator of the WWW says there's a new revolution on the horizon for the Internet and the best way to deal with it is the Resource Description Framework (RDF). RDF will be of monumental importance to the 21st Century Internet and millions of intranets. It will be implemented largely through XML extensible markups to HTML. XML will become a popular way of putting databases on networks. XML is already supported by leading browsers such as Microsoft's Internet Explorer and Netscape's Navigator. In reality, XML is a nested object-oriented component structure in which documents and databases can be broken into object components that can be edited, divided, and re-assembled. The analogy is that the component structure is like adding nested sections and chapters into the (hidden) table of contents of a document or database. For example see the POET Content Management Suite at http://www.poet.com .
A good place to start learning is "XML for the Absolute Beginner" at www.javaworld.com/javaworld/jw-04-1999/jw-04-XML.html. I might add the following online article entitled "XML Gains Ground: Vendors pledge support as XML stands poised to become a universal format for data exchange" at http://www.informationweek.com/725/XML.htm .
Interested in XML? Sign up for a free weekly email full of XML news, features, downloads and reviews. http://www.zdnet.com/enterprise/lists/xml/subscribe.html
Just about every recent technology magazine and journal carries at least one article about the looming XML and RDF. My top recommendation, apart from my own overview mentioned above, is entitled "XML: The Last Silver Bullet" by Jack Vaughan in Application Development Trends, April 1999, 24-30. He contends that "coming as it does on the heels of the Web's great success (HTML), XML is viewed by some as having a far broader impact." This is a nice summary article of the history of XML (it only started in 1996) and XML's tremendous future. Vaughan also discusses RDF. The online version of this article is at http://www.adtmag.com/pub/apr99/f04eaix0499.htm
The first step to understanding RDF is to distinguish between data and metadata. Metadata tags in documents and databases provide "data about data" like unseen genes provide data about body parts. One of the drawbacks of HTML is that HTML tags relate only to symbols rather than to attributes of what the symbols depict. For example, HTML tags tell us how to display the word "eyes" in a web document but there are no tags related to attributes such as eye color, eye size, vision quality, and susceptibility to various eye diseases.
For example, HTML tags on the words red and purple appearing in a document relate only to formatting and linking. HTML tags do not disclose that both words depict colors, because HTML does not associate words with meanings. Metadata, on the other hand, attaches meanings to the data by attaching hidden attribute tags. For example, attached to the word "petal" might be an invisible tag recording that the petal has color coded numbers for color hue and color saturation for rose petals. When any petal's invisible tags are read in a meta search engine, it would be possible to identify types of roses having a range of hue and saturation commonalities. Poppies would be excluded because they do not have rose tags. Red herrings (a term for false leads in a mystery) would be excluded because they do not have a tagged attribute for color.
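The petal example can be mocked up in a few lines of Python. This is only a sketch: the tag names, attribute names, and hue and saturation numbers are all made up by me for illustration, and a real meta search engine would work over many documents rather than one string.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; every tag, attribute, and number is illustrative.
CATALOG = """
<catalog>
  <rose name="Mister Lincoln"><petal hue="355" saturation="90"/></rose>
  <rose name="Peace"><petal hue="50" saturation="40"/></rose>
  <poppy name="Flanders"><petal hue="5" saturation="95"/></poppy>
  <herring name="red herring"/>
</catalog>
"""


def roses_in_hue_range(xml_text, low, high):
    """Return names of roses whose petal hue lies between low and high."""
    root = ET.fromstring(xml_text)
    names = []
    # Only <rose> elements are examined, so poppies never match.
    for rose in root.findall("rose"):
        petal = rose.find("petal")
        if petal is not None and low <= int(petal.get("hue")) <= high:
            names.append(rose.get("name"))
    return names


print(roses_in_hue_range(CATALOG, 340, 360))
```

The poppy is skipped because it is not tagged as a rose, and the red herring is skipped because it carries no color attribute at all, even though the word "red" appears in its name.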
In a sense, metadata is analogous to genetic coding of a living organism. Attributes in hidden tags become analogous to attributes coded into genes that determine the color of a flower's petals, degree of resistance to certain diseases, etc. If we knew the genetic "metadata" code of all flowering plants, we could quickly isolate the subsets of all known flowering plants having red petals or resistance to a particular plant disease. In botany and genetics, the problem lies in discovering the metadata codes that nature has already programmed into the genes. In computer documents and databases, the problem is one of programming in the metadata codes that will conform to a world wide standard. That standard will most likely be the RDF standard that is currently being developed by the World Wide Web Consortium (W3C) having Tim Berners-Lee as its current Director.
My examples above are gross simplifications of the text tagging that will actually take place under RDF. RDF works in a more complicated fashion that will be much more efficient for meta searches. The core of RDF will be its "RDF Schema" briefly described below:
This specification will be followed by other documents that will complete the framework. Most importantly, to facilitate the definition of metadata, RDF will have a class system much like many object-oriented programming and modeling systems. A collection of classes (typically authored for a specific purpose or domain) is called a schema. Classes are organized in a hierarchy, and offer extensibility through subclass refinement. This way, in order to create a schema slightly different from an existing one, it is not necessary to "reinvent the wheel" but one can just provide incremental modifications to the base schema. Through the sharability of schemas RDF will support the reusability of metadata definitions. Due to RDF's incremental extensibility, agents processing metadata will be able to trace the origins of schemata they are unfamiliar with back to known schemata and perform meaningful actions on metadata they weren't originally designed to process. The sharability and extensibility of RDF also allows metadata authors to use multiple inheritance to "mix" definitions, to provide multiple views to their data, leveraging work done by others. In addition, it is possible to create RDF instance data based on multiple schemata from multiple sources (i.e., "interleaving" different types of metadata). Schemas may themselves be written in RDF; a companion document to this specification, [RDF Schema], describes one set of properties and classes for describing RDF schemas. (Emphasis added).
World Wide Web Consortium (W3C)
http://web1.w3.org/TR/REC-rdf-syntax/
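The idea of tracing an unfamiliar class back to a known one can be sketched as a walk up a subclass table. This Python sketch is my own simplification, not RDF syntax: the class names and the notion of a "known" set are hypothetical, and real RDF Schema processing involves considerably more.

```python
# Hypothetical schema: each class maps to its parent classes.
# A class may list more than one parent (multiple inheritance).
SUBCLASS_OF = {
    "ex:AnnualReport": ["ex:FinancialReport"],
    "ex:FinancialReport": ["ex:Document"],
    "ex:AuditedAnnualReport": ["ex:AnnualReport", "ex:AuditedItem"],
}

# Classes this particular agent was designed to process (assumed).
KNOWN = {"ex:Document"}


def trace_to_known(cls, subclass_of=SUBCLASS_OF, known=KNOWN):
    """Walk up the hierarchy and collect the known ancestors of cls."""
    found, stack, seen = set(), [cls], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in known:
            found.add(current)
        else:
            stack.extend(subclass_of.get(current, []))
    return found


print(trace_to_known("ex:AuditedAnnualReport"))
```

An agent that has never heard of an audited annual report can still treat it as a document, because the chain of refinements leads back to a class it understands.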
The term "metadata" is not synonymous with RDF. There were various metadata systems before RDF was on the drawing boards. Microsoft's Channel Definition Format (CDF) used in "Web Push Channels" and Netscape's Meta Content Framework (MCF) preceded RDF. These technologies describe information resources in a manner somewhat similar to RDF and can be used to filter web sites and web documents, for example to block pornography and violence from viewing. Metadata systems can be used to channel inflows of desired or undesired web information. CDF, for example, carries information that is not displayed on computer screens but that performs metadata tasks.
RDF resources are built upon a foundation of Uniform Resource Identifiers (URIs) that are described at http://www.ietf.org/internet-drafts/draft-fielding-uri-syntax-04.txt . The metadata structure in RDF has the following components described on Page 4 of http://web1.w3.org/TR/REC-rdf-syntax/
Resources
All things being described by RDF expressions are called resources. A resource may be an entire
Web page; such as the HTML document "http://www.w3.org/Overview.html" for example. A
resource may be a part of a Web page; e.g., a specific HTML or XML element within the
document source. A resource may also be a whole collection of pages; e.g., an entire Web site. A
resource may also be an object that is not directly accessible via the Web; e.g., a printed book.
Resources are always named by URIs plus optional anchor ids. Anything can have a
URI; the extensibility of URIs allows the introduction of identifiers for any entity imaginable.
Properties
A property is a specific aspect, characteristic, attribute, or relation used to describe a resource.
Each property has a specific meaning, defines its permitted values, the types of resources it can
describe, and its relationship with other properties. This document does not address how the
characteristics of properties are expressed; for such information, refer to the RDF Schema
specification.
Statements
A specific resource together with a named property plus the value of that property for that resource
is an RDF statement. These three individual parts of a statement are called, respectively, the
subject, the predicate, and the object. The object of a statement (i.e., the property value) can be
another resource or it can be a literal; i.e., a resource (specified by a URI) or a simple string or
other primitive datatype defined by XML. In RDF terms, a literal may have content that is XML
markup but is not further evaluated by the RDF processor.
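The resource, property, and value structure quoted above can be pictured as a list of three-part records. Here is a minimal Python sketch. Apart from the W3C page named in the quotation, the URIs and the literal value are hypothetical, and this is a plain data structure rather than RDF's XML syntax.

```python
from collections import namedtuple

# One RDF statement: subject (resource), predicate (property), object (value).
Statement = namedtuple("Statement", ["subject", "predicate", "object"])

statements = [
    # Here the object is another resource, named by a (hypothetical) URI.
    Statement("http://www.w3.org/Overview.html",
              "http://example.org/terms/creator",
              "http://example.org/staff/1"),
    # Here the object is a literal, a simple string.
    Statement("http://example.org/staff/1",
              "http://example.org/terms/name",
              "Jane Doe"),
]


def objects_of(subject, predicate, store=statements):
    """Return every value recorded for one property of one resource."""
    return [s.object for s in store
            if s.subject == subject and s.predicate == predicate]


creator = objects_of("http://www.w3.org/Overview.html",
                     "http://example.org/terms/creator")[0]
print(objects_of(creator, "http://example.org/terms/name"))
```

Because the object of the first statement is itself a resource, it can serve as the subject of the second, which is how statements chain together.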
A good place to begin reading about RDF is at http://web1.w3.org/TR/REC-rdf-syntax/ .
The most likely scripting codes will be XML, although RDF can be used in other scripting systems. The popular HTML and the emerging XML descend from the GML text scripting conceived in 1969 by IBM researchers depicting Generalized Markup Languages (and not-so-coincidentally the lead researchers were named Goldfarb, Mosher, and Lorie). Between 1978 and 1987, Charles F. Goldfarb led the team that developed SGML (Standard Generalized Markup Language), which became International Standard ISO 8879. In 1990, Tim Berners-Lee led a team of particle physicists that invented the World Wide Web rooted in the rule-based text scripting markup innovations of SGML. The World Wide Web is comprised of all web documents marked up in scripts known as Hypertext Markup Language (HTML) scripts. SGML is tremendously powerful but inefficient and complex. HTML is marvelously simple but not very powerful. In 1996, Jon Bosak of Sun Microsystems spearheaded the development of the XML standard to lend power, efficiency, cross-platform standards, and simplicity to the networking of databases on the Internet. At the time of this writing, the world is converging upon an important standard known as RDF (Resource Description Framework) rooted in XML that will be the biggest 21st Century thing to hit the Internet since HTML hit the Internet in 1991.
HTML was extremely limited in its early versions. The rigid and limited document formatting of several early versions had a simplistic appeal in the small number of scripting "tags." Early versions of HTML, however, lacked styles (italics, underlines, indentations, tables, etc.) that authors prefer in documents. In subsequent versions, HTML developers invented cascading style sheets that expanded the formatting and font capabilities at the expense of more complex scripts for HTML tags. But HTML software "editors" such as HoTMetaL PRO, PageMill, FrontPage, and many others took over the scripting chores. It became as easy to produce World Wide Web documents as it is to use a word processor such as Microsoft Word or WordPerfect. In fact, newer versions of word processors added options to automatically embed HTML scripts in documents.
But HTML did not, and still does not, allow authors to "extend" or "customize" tags for application-specific tasks. In later versions of HTML, tags were invented for creating tables. However, HTML tables are not dynamic in the sense of a table in a relational database or object oriented database. For example, it is not possible to perform simple arithmetic operations to fill table cells or do any other types of "computing" apart from formatting, viewing, and linking text and graphics. It was and is still not possible to search and retrieve subsets of tables without downloading entire HTML documents containing the tables. Common database software operations such as writing queries and revision of records within networked tables are not possible in HTML tables.
The curse of HTML was that HTML tags were not "extensible" and could not otherwise be customized for application-specific tasks such as simple database operations. Netscape invented JavaScript to allow developers to embed customized scripts to overcome the limitations of HTML. Dynamic HTML known as "DHTML" was invented for certain types of customizations. Web browsers such as Microsoft's Internet Explorer and Netscape's Navigator will read JavaScripts and DHTML. But these "extensions" of HTML have some very frustrating limitations. The major limitation is that many lines of script must be written to perform rather simple tasks. Writing JavaScripts to perform database operations boggles the mind. The simplicity of HTML, thereby, gives way to coding complexities that virtually require that document authors first become computer programmers. Even computer programmers find JavaScript and DHTML to be inefficient and ineffective extensions of HTML. To make matters worse, standard setters could not agree on proposed standards for DHTML.
In 1996, the World Wide Web Consortium (W3C) gave serious attention to Jon Bosak's proposed "extensible markup language" called XML. Unlike HTML, it lets authors define their own tags, so that meanings (e.g., attributes of a petal) can be tagged on the words "red petal." More importantly, subsets of documents and tables can be edited and transmitted over the Internet as bits of data that do not carry the accompanying excess baggage of HTML formatting information and entire documents containing entire tables. Users can feed those data inflows into style sheets of their own choosing. Appearances can be changed by user modifications to style sheets.
Perhaps the best example is a networked database on the web containing 10 million names, addresses, and phone numbers. It would be extremely inefficient to have to download the entire database merely to look up a particular phone number or to change a phone number in the database. It would be absurd to code the database information into an HTML document that has to be downloaded with all 10 million records before a user can search for one record. In contrast, application-specific database software is highly efficient in allowing users to use queries to retrieve only a desired subset of data in a database. XML will do the same thing for web documents and tables. Web meta searches become more like database queries.
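The phone-number lookup described above can be sketched in a few lines of Python. This is only an illustration: the element names (directory, person, name, phone) and the sample records are invented here, and the tiny document is held in memory. In a real deployment the query would run on the server so that only the matching record travels over the network.

```python
import xml.etree.ElementTree as ET

# Hypothetical directory markup; tag names and records are made up for this sketch.
DIRECTORY_XML = """
<directory>
  <person id="1"><name>Ada Lovelace</name><phone>555-0100</phone></person>
  <person id="2"><name>Grace Hopper</name><phone>555-0101</phone></person>
  <person id="3"><name>Alan Turing</name><phone>555-0102</phone></person>
</directory>
""".strip()


def find_phone(root, name):
    """Return the phone number of the first person with this name, or None."""
    for person in root.iter("person"):
        if person.findtext("name") == name:
            return person.findtext("phone")
    return None


def update_phone(root, name, new_phone):
    """Change one person's phone number in place; return True if a record changed."""
    for person in root.iter("person"):
        if person.findtext("name") == name:
            person.find("phone").text = new_phone
            return True
    return False


root = ET.fromstring(DIRECTORY_XML)
print(find_phone(root, "Grace Hopper"))      # look up a single record
update_phone(root, "Grace Hopper", "555-0199")
print(find_phone(root, "Grace Hopper"))      # the same record after the edit
```

Because every field carries its own tag, the program can address one record by meaning rather than by its position in a formatted page.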
Another important distinction between HTML and XML lies in the ability of using XML to process information without the aid of human beings. HTML documents are intended for human viewing. Computers using XML can "talk" to one another without human intervention. Charles Goldfarb and Paul Prescod describe the database aspects of XML as follows:
XML is also expected to become an important tool for interchange of database information. Databases have typically interchanged information using simple file formats like one-record per line with semi-colons between the fields. This is not sufficient for the new object-oriented information being produced by databases. Objects must have internal structure and links between them. XML can represent this using elements and attributes to provide a common format for transferring database records between databases. You can imagine that one database might produce an XML document representing all of the toys the manufacturer produces and that document could be directly loaded into another database either within the company or at a customer's site. This is a very interesting way of thinking about documents, because in many cases human beings will never see them. They are documents produced by and for computer software. (Emphasis added)
C.E. Goldfarb and P. Prescod
The XML Handbook
(Upper Saddle River, N.J. Prentice Hall PTR, 1998, Page 25)
http://www.phptr.com
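Goldfarb and Prescod's toy-manufacturer scenario can be sketched with Python's standard library. The table layout, the tag names (toys, toy, name, price), and the sample rows are assumptions made for this illustration, and in-memory SQLite databases stand in for whatever systems the two parties actually run.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_toys(conn):
    """Serialize every row of the toys table as one XML document (a string)."""
    root = ET.Element("toys")
    query = "SELECT sku, name, price FROM toys ORDER BY sku"
    for sku, name, price in conn.execute(query):
        toy = ET.SubElement(root, "toy", sku=sku)
        ET.SubElement(toy, "name").text = name
        ET.SubElement(toy, "price").text = str(price)
    return ET.tostring(root, encoding="unicode")


def import_toys(conn, xml_text):
    """Load a document produced by export_toys into another database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS toys (sku TEXT PRIMARY KEY, name TEXT, price REAL)"
    )
    for toy in ET.fromstring(xml_text).iter("toy"):
        conn.execute(
            "INSERT INTO toys VALUES (?, ?, ?)",
            (toy.get("sku"), toy.findtext("name"), float(toy.findtext("price"))),
        )
    conn.commit()


# The manufacturer's database (hypothetical sample data).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE toys (sku TEXT PRIMARY KEY, name TEXT, price REAL)")
source.executemany(
    "INSERT INTO toys VALUES (?, ?, ?)",
    [("T-1", "Wooden train", 19.5), ("T-2", "Kite", 7.25)],
)

# The customer's database receives the records; no person ever reads the document.
target = sqlite3.connect(":memory:")
document = export_toys(source)
import_toys(target, document)
rows = target.execute("SELECT sku, name, price FROM toys ORDER BY sku").fetchall()
print(rows)
```

The XML string is the only thing the two databases share, which is the sense in which the document is "produced by and for computer software."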
"XML: Plugging into 'Standard' Hybrids," eWeek, January 7, 2002by Renee Boucher Ferguson --- http://www.eweek.com/article/0,3658,s%253D1884%2526a%253D20656,00.asp
It was supposed to be so simple. XML would enable companies to move beyond paper-, e-mail- and electronic data interchange-based commerce to the world of Internet transactions. Having such an open platform was supposed to provide a lower-cost way for developing applications that would be universally accessible to all of a company's business partners.
Now, more than three years after XML's introduction, IT shops implementing industry-specific variants find themselves looking at multiyear, multimillion-dollar projects that leave two fundamental obstacles unchallenged: how to shift partners from trading through traditional means to trading with XML and how to interoperate with other industries.
These vertical-industry XML flavors for many companies have created walls around their Internet trading software that require more code to be written and more expense incurred to make sure that some potential buyers or suppliers can take part in business-to-business e-commerce.
What's needed now, in the view of IT managers, software vendors and analysts, is a horizontal XML blueprint of sorts to describe a syntax and vocabulary that vertical industries can use to interoperate with B2B trading software from other verticals. ebXML (electronic business XML) is being touted as one solution—not just another XML variation but an architecture that provides a horizontal messaging framework.
Other cross-industry standards in the works include UBL (Universal Business Language) and XSL (Extensible Stylesheet Language).
However, until a universal standard or set of standards is agreed upon, vertical industries will continue to support individual XML standards that do not interoperate.
Continued at http://www.eweek.com/article/0,3658,s%253D1884%2526a%253D20656,00.asp
From Neal Hannon on September 1, 2000
Formal work in the area of XML glossaries of terms is well documented at the www.w3c.org web site.
I am working with the XBRL.org steering committee. We have developed a taxonomy, or data dictionary, for identifying the elements of financial statements complying with US GAAP. XML Schema is being used to provide more flexibility in the expressiveness of DTDs.
DTDs are part of the XML family of standards but do not use XML document syntax. DTDs also do not provide the mechanism for specifying the fundamental type of an element or attribute. XML Schema, although not yet a formal W3C recommendation, provides this ability. The entire taxonomy and examples are posted at the www.xbrl.org Web site.
Books that cover schemas simply include "Teach yourself XML in 24 hours" by Ashbacher, "XML, a Manager's Guide", by Dick. I hope this information helps.
Neal
A "Must See" site on XBRL --- http://www.xbrlsolutions.com/
Try out the XBRL Instance Document Validator!
Try out the XBRL Custom Taxonomy Builder!
XBRL is a framework that will allow the financial community a standards-based method to prepare, publish in a variety of formats, exchange and analyze financial reports and the information they contain. It will also permit the automatic exchange and reliable extraction of financial information among various software applications.
Neil Hannon pointed out this XBRL Demo --- http://www.reportingtools.com/xbrl/index.cfm
These financial statements have been created to display the use of XBRL taxonomies utilizing the specification dated 2000-07-31. These are NOT the official financial statements of Newtec.
This demo provides a view of the XBRL mapping feature contained within MultiMart™ Web Financials and a brief description of how this is processed. A ‘View Source’ element, allows you to see the underlying data (instance document in accordance with XBRL Taxonomy specification dated 2000-07-31). Notably, the taxonomy mapping needs only to be completed once as Newtec’s MultiMart™ Enterprise Datawarehouse saves the information for use in report output and this solution can accommodate any taxonomy, including future additions.
MultiMart™ Web Financials encompasses applications such as Web G/L and Web A/P with functionality that includes online drill-down capabilities. This can be seen on the XBRL financial statements and accessed from the demo site. To obtain greater details of the many features included, you may request a one-on-one, online demonstration from Newtec.
In a very nice document (for beginners) on XML, Mark Johnson lists the following benefits from extending HTML to XML:
- XML is at least as readable as HTML and probably more so
Anyone who understands, more or less, what HTML is probably understands just about everything in Listing 4. They're also likely to have a good idea what the markup means, since the markup uses fairly intuitive terms (<Ingredient><OBJECT CLASSID="000DDA23432...">).
- The tags don't have anything to do with how the document is displayed
Listing 4 is pure content: It's information. The markup indicates what the information means, not how to display it. The formatting information for an XML file (if there is any need for formatting) is usually written in a style language and stored separately from the XML. (See the sections on CSS and XSL below for more on formatting XML.) Separation of content and presentation is a key concept inherited from SGML.
- A lot of the programming is already done for you
If you write a DTD and use a validating parser, much of the error checking for the validity of your input is done by the parser. There's no need to write the parser yourself, since there are so many high-quality parsers available for free. If you want to change the language, you simply change the DTD; the parser then obeys your new rules. Moreover, if your system needs to interoperate with other systems, you can choose a standard DTD (like XML/EDI, for example), so that other systems will automatically understand your system's vocabulary, and vice versa.
- XML is more versatile than HTML
Let's think about all the ways a document like Listing 4 could be used:
- You could display this recipe in an online recipe database, with a page style easily modifiable across all recipes
- The recipes are automatically scalable, convenient if you're planning a dinner party for 200
- The recipe is already in a standard recipe format for transmission to the database
- Online recipe servers could exchange recipes using this format, or recipe applications could share data
- Such recipes would be much easier to search accurately (for example, "all recipes with lime Jello and Tabasco sauce") than HTML would be
- It would be easy, based on the contents of your "legacy" pantry inventory database, to produce a shopping list
Mark Johnson
"XML for the Absolute Beginner"
http://www.javaworld.com/javaworld/jw-04-1999/jw-04-XML.html
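Two of the uses Johnson lists, accurate searching and automatic scaling, can be sketched as follows. Johnson's Listing 4 is not reproduced on this page, so the recipe markup below is a hypothetical stand-in, and the tag and attribute names (Recipe, Ingredient, qty, unit, serves) are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe markup; NOT Johnson's Listing 4, only a stand-in for it.
RECIPES_XML = """
<Recipes>
  <Recipe name="Lime Jello Surprise" serves="4">
    <Ingredient qty="2" unit="box">lime Jello</Ingredient>
    <Ingredient qty="1" unit="tsp">Tabasco sauce</Ingredient>
  </Recipe>
  <Recipe name="Plain Toast" serves="1">
    <Ingredient qty="2" unit="slice">bread</Ingredient>
  </Recipe>
</Recipes>
""".strip()


def recipes_with(root, *wanted):
    """Names of recipes whose ingredient list contains every wanted ingredient."""
    names = []
    for recipe in root.iter("Recipe"):
        have = {ing.text.strip().lower() for ing in recipe.iter("Ingredient")}
        if all(w.lower() in have for w in wanted):
            names.append(recipe.get("name"))
    return names


def scale(recipe, servings):
    """Return (quantity, unit, ingredient) tuples scaled to the requested servings."""
    factor = servings / float(recipe.get("serves"))
    return [
        (float(ing.get("qty")) * factor, ing.get("unit"), ing.text.strip())
        for ing in recipe.iter("Ingredient")
    ]


root = ET.fromstring(RECIPES_XML)
matches = recipes_with(root, "lime Jello", "Tabasco sauce")
print(matches)

jello = root.find("Recipe[@name='Lime Jello Surprise']")
party = scale(jello, 200)   # the dinner party for 200
print(party)
```

The search compares ingredient elements rather than scanning page text, which is why it is more accurate than a keyword search over HTML.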
XHTML: A Bridge to the Future, Information Week, May 8, 2000, pp. 210-214. The article is not yet posted online, but eventually you will find it at http://www.informationweek.com/maindocs/archive.htm
XHTML: A Bridge To The Future
THE W3C'S RECOMMENDATION BLENDS XML AND HTML TO PRODUCE EXTENSIBLE WEB-PAGE FORMATTING
Hypertext Markup Language, an aging, inflexible formatting standard, has fueled the phenomenal growth of the Web. Now a new technology, a flexible data-markup standard called Extensible Markup Language, promises nearly complete flexibility. In a flash of brilliance, the World Wide Web Consortium (W3C) has combined HTML and XML into the new XHTML recommended standard, which reformulates HTML 4.01 -- the latest version -- with XML document type definitions (DTD).
HTML is the language behind one of the fastest, most widespread technology adoptions ever. Derived from Standard Generalized Markup Language (SGML), HTML is simple to learn and reasonably flexible for formatting text and graphics, but it doesn't have the extensibility to adapt to dynamic Web applications. Most every site with valuable content is more of a Web application than a Web site, requiring code components, multimedia effects, and other features that strain the limits of HTML.
HTML is usually extended by innovations in a single browser, usually Microsoft's Internet Explorer or America Online's Netscape Communicator, and these changes gradually make their way into other browsers. Inevitably, the implementations are different enough that Web authors have a tough time making their sites viewable from different browsers, much less older versions of those browsers. The more popular extensions eventually make their way into the group's HTML standard -- frames and scripting languages, for example.
In the last couple of years, XML has been taking the Web by storm. Whereas HTML formats and presents information, XML marks up data so that the individual pieces of information on a Web page are identified as being of a particular type. In a bank's data, for example, $4,562.03 is marked as the outstanding balance of a customer's loan, and $123.90 as the monthly payment, identifying them as particular kinds of data points. Without XML, these would be just two character strings in a sea of text on a Web page. XML provides metadata -- data about data.
The most important feature of XML is the "X." HTML has a fixed set of tags, but with XML you can create multiple namespaces that define custom tags. Industries can band together and create namespaces that facilitate the exchange of information.
Continuing the bank example, <Balance> and <Payment> can identify the two character strings as being specific types of information. This facilitates exchanging data between applications and computer systems, limiting the need for expensive, complex data-conversion programs.
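The bank example can be sketched in Python. The namespace URI, the "bank" prefix, and the Loan wrapper element are invented for this illustration; only the Balance and Payment tags and the two amounts come from the article.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical namespace; a real industry group would publish its own URI.
NS = {"bank": "http://example.com/ns/bank"}

LOAN_XML = """
<bank:Loan xmlns:bank="http://example.com/ns/bank">
  <bank:Balance currency="USD">4562.03</bank:Balance>
  <bank:Payment currency="USD">123.90</bank:Payment>
</bank:Loan>
""".strip()

loan = ET.fromstring(LOAN_XML)

# Each amount is retrieved by what it means, not by where it sits on a page.
balance = Decimal(loan.findtext("bank:Balance", namespaces=NS))
payment = Decimal(loan.findtext("bank:Payment", namespaces=NS))
print(balance, payment)
```

The receiving program needs no conversion routine to decide which string is the balance; the tag already says so, and the namespace keeps this Balance distinct from a Balance element defined by some other industry.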
Some Messages on Traditional Relational Accounting Databases versus XML Databases
Some Questions Followed by Answers From Experts
Dear Professor Jensen,
I have read roughly your review of XML and some XML references that you recommend in your homepage. Thank you so much for this very useful introduction. Taking this opportunity, I would like to ask you for some additional help. I would be grateful if you could comment on the following issues which arise from my reading of your review and other materials.
My first issue concerns the possible contradiction between standardisation and customisation in the context of corporate financial reporting (CFR). Apparently, any widespread use of XML requires standardisation of tags which gives a common meaning to pieces of financial information. On the other hand, there has been a sustained call for a greater customer focus or customisation in CFR. My question is, to what extent XML's standardisation imperative contradicts the call for customisation? If any, in what ways?
My second issue relates to XML's relative (dis)advantages over traditional databases (such as networked relational databases). It is claimed that XML is a powerful tool for data representation, storage, modeling, interoperation, and so on. Will XML simply replace traditional databases, or just operate on top of them? If comparable at all, how will you compare XML with traditional databases in terms of queries, storage, search, data import, data export, data exchange, data maintenance, data updating, data input etc?
I am not sure that my questions themselves are valid, logical, or significant. They may appear silly to you, but I look forward to hearing from you. If you think these questions may be interesting to others and/or you wish to invite others to discuss them, you may post your comments together with my questions on the AECM. Many thanks in advance. Best wishes. Dr Jason Xiao Cardiff Business School Cardiff University, UK
Dr JASON XIAO [Xiao@Cardiff.ac.uk]
Cardiff Business School University of Wales,
Cardiff Colum Drive Cardiff CF1 3EU Tel: 01222-875374 Fax: 01222-874419 URL: http://www.cf.ac.uk/uwcc/carbs/xiao/xiao.html
Interested in XML? Sign up for a free weekly email full of XML news, features, downloads and reviews. http://www.zdnet.com/enterprise/lists/xml/subscribe.html
A summary of XBRL is given at http://www.nwfusion.com/news/2000/0407xml.html
Leading global accounting, financial and software industry groups and companies have announced the formation of a consortium aimed at promoting a new specification for exchanging financial data over the Internet.
The extensible business reporting language Project Committee aims to develop and by July to launch XBRL for Financial Statements, the first in a planned series of free XBRL products for sending financial statements over the Internet as well as across other software and technologies, the American Institute of Certified Public Accountants (AICPA) said in a statement.
Based on XML, the XBRL specification uses accepted financial reporting standards and practices, and aims to standardize how financial information is sent and viewed on computer screens. XBRL, formerly codenamed XFRML, has been in development for one year, according to the AICPA statement.
AICPA is one of more than 30 backers of the XBRL Project Committee, which among its members also counts some of the biggest names in the software industry, including IBM, Microsoft, Oracle and SAP AG.
Other members include Arthur Andersen LLP, the Canadian Institute of Chartered Accountants, Deloitte & Touche LLP, Ernst & Young LLP, the International Accounting Standards Committee, the Institute of Chartered Accountants in Australia, the Institute of Chartered Accountants in England and Wales, KPMG LLP, PricewaterhouseCoopers LLP and Reuters Group LP.
More information about XBRL can be found at http://www.xbrl.org/ . AICPA is at http://www.aicpa.org/ .
Neil Hannon's XBRL Resource Center at http://www.tiac.net/users/nhannon/xbrl.htm
- Section One: Introduction to XBRL
- Section Two: XML and The Financial Community
- Section Three: What is XML and XML Basics
- Section Four: What is XBRL
- Section Five: Why Financial Professionals Will Use XBRL
- Section Six: History of XBRL
- Section Seven: XBRL Instance Documents
- Section Eight: XBRL and XML Case Studies
- Section Nine: Glossary of Terms
- Section Ten: Questions and Problems
Thanks for visiting the XBRL Education Resource Center. Be sure to visit www.xbrl.org for more information.
Neil also clued me into the following:
Navision Software releases the first XBRL-enabled financial system, Navision Financials 2.50.
[August 02, 2000] "Navision Software Releases XBRL Solution; XML-Based Financial Reporting Language Now Available in Navision Financials 2.50." - "Navision Software, a leading worldwide provider of business management solutions to the middle market, announced today that it has released its XBRL solution, one day after the publication of the official XML-based taxonomy. XBRL (eXtensible Business Reporting Language) is a free specification that first appeared on the financial and accounting scene in October of 1999. It uses a financial reporting specification, agreed upon by key members of the financial information supply chain, that allows an open exchange of financial reporting data across all software and technologies, including the Internet. The XBRL coding contained in Navision Financials 2.50 will enable customers to more easily and efficiently connect and communicate with both competing products in the ERP space and complementary products such as Caseware. For example, a set of subsidiary offices using Navision Financials can now more quickly collaborate with a parent office using a larger ERP system, while realizing significant time and cost savings. XBRL offers several key benefits: technology independence, full interoperability, efficient preparation of financial statements and reliable extraction of financial information. Information is entered only once, allowing that same information to be rendered in any form, such as a printed financial statement, an HTML document for the company's Web site, an EDGAR filing document with the SEC, a raw XML file or other specialized reporting formats, such as credit reports or loan documents. More than 80 percent of major US public companies provide some type of financial disclosure on the Internet. Investors and users of the Internet need accurate and reliable financial information that can be delivered promptly to help them make informed financial decisions." 
See XBRL Taxonomy - "Taxonomy for the creation of XML-based instance documents for business and financial reporting of commercial and industrial companies according to US GAAP."
The main XBRL website is at http://www.xbrl.org/
Extensible Business Reporting Language (XBRL), formerly code-named XFRML, is an open specification which uses XML-based data tags to describe financial statements for both public and private companies. XBRL benefits all members of the financial information supply chain.
XBRL is:
- A standards-based method with which users can prepare, publish (in a variety of formats), exchange and analyze financial statements and the information they contain.
- Freely licensed, permits the automatic exchange and reliable extraction of financial information across all software formats and technologies, including the Internet.
- Ultimately benefits all users of the financial information supply chain: public and private companies, the accounting profession, regulators, analysts, the investment community, capital markets and lenders, as well as key third parties such as software developers and data aggregators.
- Does not require a company to disclose any additional information beyond that which they normally disclose under existing accounting standards. Does not require a change to existing accounting standards.
- Improves access to financial information/speeds up access
- Reduces need to enter financial information more than one time, reducing the risk of data entry error and eliminating the need to manually key information for various formats, (printed financial statement, an HTML document for a company’s Web site, an EDGAR filing document, a raw XML file or other specialized reporting formats such as credit reports and loan documents) thereby lowering a company's cost to prepare and distribute its financial statements while improving investor or analyst access to information.
- Leverages efficiencies of the Internet as today’s primary source of financial information by making Web browser searches more accurate and relevant. (More than 80% of major US public companies provide some type of financial disclosure on the Internet.)
- XBRL meets the needs of today's investors and other users of financial information by providing accurate and reliable information to help them make informed financial decisions.
For a brief history of how XBRL came to be, see the history page.
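The "enter once, render in any form" benefit listed above can be sketched in Python. The markup below is a deliberately simplified stand-in: the statement and item tags, the entity name, and the amounts are invented for this example and are not taken from any XBRL taxonomy or specification.

```python
import xml.etree.ElementTree as ET
from html import escape

# Simplified stand-in for an instance document; NOT real XBRL markup.
INSTANCE_XML = """
<statement entity="Example Co" period="2000-12-31">
  <item label="Cash">1200</item>
  <item label="Accounts receivable">800</item>
</statement>
""".strip()


def facts(xml_text):
    """Read the tagged amounts once, as (label, amount) pairs."""
    root = ET.fromstring(xml_text)
    return [(item.get("label"), int(item.text)) for item in root.iter("item")]


def as_text(rows):
    """Render the facts as a plain-text statement for printing."""
    return "\n".join(f"{label:<22}{amount:>10,}" for label, amount in rows)


def as_html(rows):
    """Render the same facts as an HTML table for a Web site."""
    cells = "".join(
        f"<tr><td>{escape(label)}</td><td>{amount:,}</td></tr>" for label, amount in rows
    )
    return f"<table>{cells}</table>"


rows = facts(INSTANCE_XML)
print(as_text(rows))
print(as_html(rows))
```

Both outputs come from the same tagged source, so a corrected amount needs to be keyed only once.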
Hi Bob,
This message appears on Robin Cover's XML Cover Pages.
[August 22, 2000] FpML Architecture 1.0 Working Draft Advanced to Last Call. A communiqué from Cathy S. Yesenosky announces that the Financial Products Markup Language (FpML) Architecture document is now in last call review. Members and Working Groups of the FpML Consortium and other interested parties released the FpML specifications as working drafts in July 2000. The principal FpML Version 1.0 Specification (together with the FpML XML DTD) is currently in a last call review phase which ends on 25-August-2000. FpML (Financial Products Markup Language) "is a business information exchange standard for electronic dealing and processing of financial derivatives instruments. It establishes a new protocol for sharing information on, and dealing in, financial derivatives over the Internet. It is based on XML (Extensible Markup Language) and initially focuses on interest rate swaps and Forward Rate Agreements (FRAs). FpML has been designed to be modular, easy-to-use and in particular intelligible to practitioners in the financial industry. Ultimately, it will allow for the electronic integration of a range of services, from Internet-based electronic dealing and confirmations to the risk analysis of client portfolios. It is expected to become the standard for the derivatives industry in the rapidly growing field of electronic commerce. The standard, which will be freely licensed, is intended to automate the flow of information across the entire derivatives partner and client network, independent of the underlying software or hardware infrastructure supporting the activities related to these transactions." The announcement says: "the FpML Architecture Version 1.0 Working Draft has been advanced to the Last Call stage. The Last Call period is expected to end September 1, 2000. We encourage interested parties to provide comments on the specification as soon as possible. Please send comments via email to fpml-issues@egroups.com .
Please report each issue in a separate email message. An archive of the comments is available at: http://www.egroups.com/messages/fpml-issues . An issues list is also maintained on the web site. The FpML specifications are available at http://www.fpml.org/spec/ . For description and references, see "Financial Products Markup Language (FpML)."
Neal Hannon Mailto:nhannon@tiac.net nhannon@bryant.edu
Bryant College http://web.bryant.edu/~nhannon 401-232-6227
XBRL... Financially Speaking in XML
On April 7, 2000 Glen Gray suggested going to the story at http://www.computerworld.com/home/print.nsf/CWFlash/000407D2CE
"Big names back new XML-based financial standard"
By Maria Trombly
ComputerWorld, April 7, 2000
Some of the world's top financial institutions have formed a consortium to promote a new, XML-based standard for exchanging financial data over the Internet.
The group, the XBRL Project Committee, expects to launch the standard by July 1, the American Institute of Certified Public Accountants (AICPA) announced yesterday.
The standard, Extensible Business Reporting Language (XBRL), is also backed by big-name financial service companies such as Standard & Poor's, Arthur Andersen LLP, Deloitte & Touche LLP, Morgan Stanley Dean Witter, Ernst & Young LLP and PricewaterhouseCoopers.
In addition, some of the biggest names in the computer industry have lined up behind XBRL, including IBM, SAP AG, Microsoft Corp. and Oracle Corp. Financial reporting companies such as EDGAR Online Inc. and Reuters Group LP, as well as the International Accounting Standards Committee, are also backing the proposed standard.
The standard will be released in stages. The first release, scheduled for July, will cover specifications for publishing companies' financial statements in XBRL, said Mike Willis, chairman of the XBRL steering committee and a partner at PricewaterhouseCoopers. Other specifications, which will cover additional types of business reports — such as regulatory reports including Securities and Exchange Commission EDGAR files, tax filings and business event reports such as press releases — will be issued within the next 18 to 24 months, he said.
Willis said that because these specifications are simply electronic dictionaries for the XML standards that are already used in a great number of software applications, they will be simple to install and use.
"We have vendors such as SAP who are already working to integrate XBRL directly into their software, so when their customers want to run their financial statements, XBRL is an option," said Christy Reichhelm, an enterprise resource planning industry manager at Microsoft and co-chair of the public relations and communications working group for XBRL.
"This will be a new feature in these software packages, so some type of software upgrade will be gone through," she added. "But it would be minor."
XBRL will be a free specification that uses accepted financial reporting standards and practices to exchange financial statements across all software and technologies, including the Internet, the AICPA said.
"XBRL . . . greatly benefits all users of financial information," said Robert Elliot, chairman of the AICPA, in the statement released yesterday. "XBRL solves two significant problems for users and preparers of financial statements by providing efficient preparation and reliable extraction of financial data across all technology formats, including the Internet."
On April 7, 2000, a leading expert replied as follows:
XBRL, XFRML, XFDL, and the rest of the alphabet soup of XML applications are mere grammars for documents of specific sort from narrow domains, so some discipline (virtually non-existent in the HTML world) is imposed to facilitate communication.
From the viewpoint of us accountants, an equally interesting area is to tie-in these documents with backend databases. The document model for XML is hierarchical whereas most accounting databases today are relational. For seamless integration of relational databases and web-based interfaces, the application layer software should be able to map the relational stuff to the hierarchical model (in DOM) of XML. This is where the action today is. One example is the IBM prototype DB2XML, which maps the results of the SQL queries into a W3C DOM object, which can be displayed on a web browser using the Document Type Definition (DTD) and an XSL (eXtensible Stylesheet Language) stylesheet. DB2XML creates on the fly the DTD for the SQL query results.
The students in my web applications development course have been building XML front-ends tied to backend relational (Oracle) databases, but the process is very tedious (and manual). It is imperative to develop technologies to map relational query results into DOM and then tie the whole thing together through something like JSP or ASP. The near future should be interesting. Those interested in the course may like to see
http://www.albany.edu/acc/courses/acc683.spring00.html
J. S. Gangolly [gangolly@CSC.ALBANY.EDU]
Associate Professor, State University of New York at Albany, Albany, NY 12222.
Phone: (518) 442-4949 Fax: (707) 897-0601 URL: http://www.albany.edu/acc/gangolly
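The relational-to-hierarchical mapping Professor Gangolly describes can be sketched with Python's standard library. This is not DB2XML and makes no claim about how that prototype works; SQLite stands in for Oracle or DB2, and the customer and invoice tables, their columns, and the sample rows are hypothetical.

```python
import sqlite3
from xml.dom.minidom import Document

# A small relational database: two flat tables joined by a foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice  (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO invoice  VALUES (10, 1, 250.0), (11, 1, 75.5), (12, 2, 40.0);
""")


def query_to_dom(conn):
    """Map flat join rows onto a DOM tree: customers -> customer -> invoice."""
    doc = Document()
    root = doc.createElement("customers")
    doc.appendChild(root)
    nodes = {}  # customer id -> its <customer> element, created on first sight
    sql = """SELECT c.id, c.name, i.id, i.amount
             FROM customer c JOIN invoice i ON i.customer_id = c.id
             ORDER BY c.id, i.id"""
    for cust_id, name, inv_id, amount in conn.execute(sql):
        if cust_id not in nodes:
            node = doc.createElement("customer")
            node.setAttribute("name", name)
            root.appendChild(node)
            nodes[cust_id] = node
        inv = doc.createElement("invoice")
        inv.setAttribute("id", str(inv_id))
        inv.appendChild(doc.createTextNode(str(amount)))
        nodes[cust_id].appendChild(inv)
    return doc


dom = query_to_dom(conn)
print(dom.toprettyxml(indent="  "))
```

The join returns one flat row per invoice with the customer repeated, while the DOM tree stores each customer once with its invoices nested beneath it. Bridging that difference by hand for every query is the tedious step the message refers to.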
"XBRL - The emerging electronic reporting language," by Mike Willis --- http://www.accountingeducation.com/library/library147.html
XBRL enables the users and preparers of existing financial statements to:
· decrease the cost of accessing the information contained within the existing financial statements,
· decrease the preparation cost,
· increase the distribution of, and access to, existing financial statement information, and
· increase and enhance the statement’s analysis
Scott Bonacker replied as follows on April 8, 2000
The EAI Journal recently (in the last two or three issues) devoted a whole issue to XML and its place in the world. There is also a website for the Enterprise Application Integration magazine - www.eaijournal.com - and the print publication is available for free if you qualify.
EAI is an important function for accountants - whether by digital or analog tools.
Scott Bonacker, CPA [scottbonacker@CLAND.NET]
CPA McCullough, Officer & Company, LLC Springfield, Missouri moccpa.com
Ron Tidd sent the following message on April 10, 2000:
Jason et al,
I am not even close to being the expert that many of the other participants in this list are. However, I think that I can address Jason's questions based on my own naive, rudimentary understanding of XML:
Standardization will/must occur within disciplines that have traditionally shared a common language. Thus, while the Web might allow everyone in the World to use XML to prepare web-based documents, each profession/discipline will/must adapt XML to its specific needs in a manner that facilitates communications between its members (e.g., math, chemistry, financial reporting, etc.). Personally, I am not too concerned that I, as an accountant, am "restricted" to the XFRML standard and that mathematicians can not derive much benefit from it.
Second, I would like to assure Jason that the questions are valid and significant. As I tell my students, you are not the only one with the question but usually, if you ask it, you are the only one with the guts. We all need to be prodded into thinking about the basics, especially with respect to emerging technologies.
An April 19 message on the dark side of XBRL --- J. S. Gangolly [gangolly@CSC.ALBANY.EDU]
XML is a meta-language, and XBRL is a language written in XML syntax (at least that is my understanding), i.e., XBRL is an XML application. XBRL in some sense, therefore, is an XML derivative "customised" language for business reporting, and gives a "customised" tagset and a bunch of "grammatical" rules that fit the business reporting application.
Therefore, XBRL provides facilities for further "customisation" for a specific situation just as HTML did. Looking at XBRL as a language provides insights. The tagset and the grammar rules in XBRL are akin to the lexicon and the grammar in English. It is possible to write all sorts of sentences in English even though the lexicon is limited (probably not more than 150,000 words or so) and the grammar, though fluid, can be taken as given.
There is nothing that prevents a company developing an XML application on its own (and might even be desirable, if one has belief in Darwinian "survival of the fittest"). My own suspicion for the bee-line for standardisation of business reporting is the aspirations of the accounting firms to keep the reporting costs down. Imagine costs of code review alone if each client had its own XML application for business reporting.
On the other hand, XBRL standardisation may lock us down to a mediocre standard in the long run. Moreover, imagine all the billings that would be created by need for, say, XARML (a fictional eXtensible Accounting Reporting Markup Language for each client) code review!
We as a profession are more worried about our liability (especially for code reviews with which traditional accountants are quite uncomfortable) than the "survival of the fittest" language.
Imagine what happened to languages with rigid standards (French being a good example vis-a-vis English) in international discourse. I do not mean to be disparaging (my favourite authors in ANY language are Romain Rolland and Moliere).
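Gangolly's description of an XML application as a tagset plus a set of grammatical rules can be made concrete with a small sketch. The tag names below (report, entity, item) and the GRAMMAR table are invented for illustration and are not XBRL tags; the point is only that a vocabulary fixes which tags exist and where they may appear, while authors remain free to write any document that obeys those rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical "grammar": which child tags each element may contain.
# These names are invented for illustration; they are not XBRL tags.
GRAMMAR = {
    "report": {"entity", "item"},
    "entity": set(),
    "item": set(),
}

def violations(xml_text):
    """Return a list of grammar violations found in xml_text."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "report":
        problems.append(f"root must be <report>, found <{root.tag}>")
    # Walk every element and check its children against the grammar.
    for parent in root.iter():
        allowed = GRAMMAR.get(parent.tag)
        if allowed is None:
            problems.append(f"unknown tag <{parent.tag}>")
            continue
        for child in parent:
            if child.tag not in allowed:
                problems.append(f"<{child.tag}> not allowed inside <{parent.tag}>")
    return problems

GOOD = '<report><entity>Example Co.</entity><item name="Revenue">1000</item></report>'
BAD = '<report><item name="Revenue"><entity>Example Co.</entity></item></report>'
```

Calling violations(GOOD) finds nothing to report, while violations(BAD) flags the misplaced entity element. A real vocabulary would express such rules in a DTD or schema rather than a Python dictionary.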
Thank you Denny Beresford for the tip on http://www.sec.gov/news/press/2000-53.txt
SEC APPROVES ISSUANCE OF INTERPRETIVE RELEASE ON THE USE OF ELECTRONIC MEDIA
Washington, DC, April 26, 2000 - At an open meeting yesterday, the Commission approved the issuance of an interpretive release discussing the application of the federal securities laws to electronic media. The interpretations build on Commission interpretations in 1995 and 1996 and are intended to help promote the efficient dissemination of information to investors, security holders and the securities markets. In addition, the interpretations are intended to ensure that the evolving use of communication technologies to offer and sell securities is consistent with the Commission's goals of protecting investors and promoting fair and orderly markets.
Many publicly-traded companies are incorporating Internet-based technology into their routine business operations, including setting up their own web sites to furnish company and industry information. Some provide information about their securities and the markets in which their securities trade. Investment companies use the Internet to provide investors with fund-related information, as well as security holder services and educational materials. Issuers of municipal securities also are beginning to use the Internet to provide information about themselves and their outstanding bonds, as well as new offerings of their securities.
The increased use of the Internet by issuers as a means of widespread information dissemination has resulted in uncertainty about the application of the federal securities laws to these communications. Through the release, the Commission seeks to reduce this uncertainty and remove interpretively some of the barriers to use of electronic media, while preserving important investor protections.
Highlights of the Interpretations
1. Electronic Delivery
The guidance in the release resolves several issues that have arisen out of the Commission's 1995 and 1996 releases on the use of electronic media to satisfy delivery obligations. In brief, this guidance
· clarifies that, in addition to written consent, investors and security holders may consent to electronic delivery of documents telephonically, as long as the consent is obtained in a manner that assures its validity and a record of the consent is retained;
· permits market intermediaries (such as broker-dealers and banks) to obtain consent to electronic delivery of documents on a "global," multiple-issuer basis, as long as the consent is informed;
· clarifies that issuers and market intermediaries may deliver documents electronically in portable document format, or PDF, as long as investors and security holders are adequately informed of the requirements to download PDF and are provided with any necessary software and assistance;
· clarifies that a hyperlink embedded within a prospectus or any other document required to be filed or delivered under the federal securities laws causes the hyperlinked information to be a part of that document; and
· clarifies that the close proximity of information on a web site to a public offering prospectus does not, by itself, make that information an "offer to sell," "offer for sale" or "offer" within the meaning of the federal securities laws.
The first phase of an International Accounting Standards Committee research project on this topic can be downloaded from http://www.iasc.org.uk/frame/cen3_26.htm. The first phase of the project involved developing and publishing a discussion paper, "Business Reporting on the Internet." The discussion paper was published in November 1999 and was authored by:
- Prof. Andrew Lymer (University of Birmingham, a.lymer@accountingeducation.com)
- Prof. Roger Debreceny (Nanyang Technological University, Singapore, rogerd@netbox.com)
- Prof. Glen Gray (California State University, Northridge, glen.gray@csun.edu)
- Prof. Asheq Rahman (Nanyang Technological University, Singapore, aarrahman@ntu.edu.sg)
Outline of the Discussion Paper
- Chapter 1 reviews some of the impetuses behind the proliferation of Web based business reporting. It also provides background information on the increasing types and number of corporate Web sites, and the increasing number of online traders.
- Chapter 2 explores and summarises the multitude of different electronic reporting technologies that can be used by Web designers. These technologies are not mutually exclusive, which means that a designer can use any mix of these technologies to develop a Web site.
- Chapter 3 summarises the findings of the existing literature on Web-based financial reporting and adds further findings from a survey of 660 corporations in 22 countries conducted by the authors. The chapter also discusses electronic reporting environments within national disclosure and regulatory regimes such as EDGAR and SEDAR in the USA and Canada, respectively.
- Chapter 4 examines the information presented in the prior chapters and proposes that the IASC should seriously consider the development of a "code of conduct" that would cover both the form and content aspects of Web-based business reporting.
- Chapter 5 addresses issues raised by pending and future technologies, which are evolving at a rapid rate. The chapter suggests that to add value to information consumers, it is critical that international standards setters and other organisations respond to these new technologies, which can greatly improve business reporting and subsequent Internet searches. This chapter highlights the significant need for a universal Business Reporting Language (BRL) to facilitate the electronic dissemination and use of business information. The chapter suggests a consortia approach that will help ensure the development of standards that provide both certainty in reporting and flexibility for future innovations.
- Chapter 6 synthesises the information provided in the prior chapters to discuss the opportunities, challenges, and implications for the accounting profession and the IASC, its international standard setter.
XForms --- forwarded by J. S. Gangolly [gangolly@CSC.ALBANY.EDU]
INTERNET WORLD NEWS Tuesday, April 18, 2000 Vol. 2, Issue 75 http://www.internetworldnews.com
Newfangled Forms from the W3C
By Nate Zelnick
It's been seven years since forms were added to the Hypertext Markup Language and, in the interim, a few things have changed.
For instance, in 1993 it was simply astounding to be able to collect user-supplied data from within a Web page itself through generic little widgets like text boxes, drop-down combo boxes, and Boolean radio buttons. The fact that doing anything with that data in the stateless Web meant submitting the form back up to the server and handing it off to some CGI script or other ancillary system -- which meant you could have one form per page that could be processed -- was a small price to pay. Later, client-side scripting helped relieve some of the tedium of this approach, but only by requiring a completely different development paradigm that would work only in the presence of the right version of JavaScript. In other words, a hack.
This week the World Wide Web Consortium ( http://www.w3.org ) published the first public view of where it wants to take the forms of the future. As with nearly everything coming out of the Consortium, the new XForms proposal ( http://www.w3.org/TR/2000/WD-xhtml-forms-req-20000329 ) begins and ends with the core value it's been promulgating since its founding: If the Internet is going to work everywhere, on every kind of device for every type of person, then information needs strict barriers between its structure, its content, and how it looks.
This meant that the HTML Activity Group that built the XForm outline had to think about what a form is and what it does in the most generic sense. Dave Raggett, one of the editors of the XForm Data Modeling Draft and the XForm Requirements document and a participant in the development of HTML from nearly the beginning, stressed that XForms is a much larger concept than merely the Web. It needs to encompass archaic media like paper, as well. A form that requires a human signature needs to exist as more than electrons, but the minute it's printed or faxed, it loses the ability for filled field values to be extracted.
But because XForms defines its data model as separate from its presentation, the position of a named field's answers can be extracted by Optical Character Recognition systems even after the electronic life has been squeezed out of it. More familiar Web-expansion problems -- like how to present a form on a cell phone, television screen, or Web-enabled blender -- are less hairy variations of the same problem.
Tuesday's XForm announcement includes only the broad definition of the problem that needs to be solved -- the Requirements doc -- and a first draft of an XForm Data Model. Possible collisions with XML Schemas -- an evolving spec that deals with defining data types for XML vocabularies -- may create some intraconsortium grumbling, but the XForm group was careful to make distinctions between its model and that ongoing work.
Early backing for the work thus far came from form-centered companies like Xerox, JetForm, and Cardiff Software. The long road to consensus -- required for something to become a W3C recommendation -- means predicting a done date is impossible.
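The separation the article describes, a form's data model kept apart from its presentation, can be illustrated with a short sketch. This is not XForms syntax; FORM_MODEL, the field names, and the three functions are hypothetical and only show one model being rendered two ways and then used on its own to interpret what the user submits.

```python
from html import escape

# Hypothetical form model: field names, labels, and types only; no layout.
FORM_MODEL = [
    {"name": "full_name", "label": "Full name", "type": "string"},
    {"name": "quantity", "label": "Quantity", "type": "integer"},
]

def render_html(model):
    """Render the model as HTML input controls for a desktop browser."""
    lines = []
    for field in model:
        kind = "number" if field["type"] == "integer" else "text"
        lines.append(
            f'<label>{escape(field["label"])} '
            f'<input type="{kind}" name="{escape(field["name"])}"></label>'
        )
    return "\n".join(lines)

def render_text(model):
    """Render the same model as plain prompts, e.g. for a small screen or paper."""
    return "\n".join(f'{field["label"]}: ______' for field in model)

def parse_submission(model, raw):
    """Convert submitted strings to typed values using the model alone."""
    converters = {"string": str, "integer": int}
    return {f["name"]: converters[f["type"]](raw[f["name"]]) for f in model}
```

Because parse_submission consults only the model, the same typing rules apply whether the answers came from the HTML rendering, the text rendering, or some other device.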
Selected News Items
Software vendor Integral Corp. yesterday released programming details for creating documents using FinXML 1.0, a proposed standard for data interchange in the financial-services industry based on the Extensible Markup Language. XML is a promising new meta-language that is increasingly seen as a way to facilitate E-commerce by bringing better structure to data on the Web.
Integral has specified, and made publicly available via the Web, a set of data table definitions that prescribe rules for sharing data using FinXML. It plans next to make FinXML compatible with other flavors of XML being developed in the E-commerce arena, such as Microsoft's BizTalk initiative and Ariba Inc.'s commerce XML effort. Integral says FinXML is already interoperable with the Financial Information Exchange (FIX) protocol and the Open Financial Exchange protocol, which is used for online billing and other retail transactions.
Integral is attempting to form a consortium of technology vendors and users to back FinXML, but it has not yet identified any corporate members. Sun Microsystems and Chase Manhattan Corp. have endorsed the concept of FinXML, but a spokesman for Integral says it is premature to say whether they will actually join the planned group.
If you are interested in the future of networked databases, I highly recommend the article entitled "The i Gets Bigger at Oracle," by Michael Bucken in Application Development Trends, August 1999, pp. 20-33. This article serves two purposes. The first purpose is to inform us about the major transitions of database networking into Internet networking of databases. The second purpose is to provide strategy professors and consultants with an excellent case study on how high-tech companies must "constantly re-invent themselves." The online version of this article is at http://www.adtmag.com/Pub/aug99/fe0803a.htm. Also see "Oracle's Long and Winding Repository Road," at http://www.adtmag.com/Pub/aug99/fe0802.htm.
For metadata and XML watchers, there is an excellent article in that same issue entitled "Meta is the Word" by Rich Seeley and Jack Vaughan, Application Development Trends, August 1999, pp. 43-48. The online version is at http://www.adtmag.com/Pub/aug99/fe0801.htm. Another great XML article is entitled "Biz Talk could Spur XML and E-Business," by Don Kiely in Information Week, August 23, 1999, pp. 94-76. The online version is at http://www.informationweek.com/749/talk.htm. Don Kiely states the following:
Microsoft is strongly behind XML, as demonstrated by the addition of native data support to develop technologies such as ActiveX Data Objects. As a result, XML documents are a native format in the Internet Explorer 5.0 Web browser. Microsoft also serves on the World Wide Web Consortium's XML standard committee.
To further capitalize on the growing interest in XML, Microsoft in May introduced BizTalk, a design framework for XML. By developing an XML schema under the BizTalk guidelines, data can be shared easily between applications using a loosely coupled, message-based system across a network or running on the same machine. Because BizTalk provides a repository for schemas, all applications can have access to the data definitions contained in the schema.
The current version, 0.81, consists of four documents: Biz Talk Framework XML Tags Specification, Biz Talk Framework Document Design Guide, XML Schema Developer's Guide, and a guide to a schema canonical format.
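Kiely's point about loosely coupled, message-based data sharing can be sketched as follows. The PurchaseOrder, Line, Sku, and Quantity tag names are assumptions made up for this example and are not taken from the BizTalk framework documents; the sketch only shows that sender and receiver need to agree on nothing except the message's element names.

```python
import xml.etree.ElementTree as ET

def build_order(order_id, sku, quantity):
    """Sender side: serialize an order into an XML message string."""
    order = ET.Element("PurchaseOrder", {"id": order_id})
    line = ET.SubElement(order, "Line")
    ET.SubElement(line, "Sku").text = sku
    ET.SubElement(line, "Quantity").text = str(quantity)
    return ET.tostring(order, encoding="unicode")

def read_order(message):
    """Receiver side: recover the fields using only the agreed tag names."""
    order = ET.fromstring(message)
    line = order.find("Line")
    return {
        "id": order.get("id"),
        "sku": line.findtext("Sku"),
        "quantity": int(line.findtext("Quantity")),
    }
```

The two functions share no code or objects, so they could just as well live in different applications on different machines, with the shared tag definitions playing the role of the schema held in a repository.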
Along the Biz Talk line, Bill Gates had the following to say in a recent speech, transcribed at http://www.microsoft.com/MSFT/speech/analystmtg99/gatesfam99.htm:
In this paperless revolution, you will have a document that's quite rich. I mean, the document we've got here it's got links to other documents, it's got annotations, it's got things that are highlighted embedded in the document, depending on the mode of viewing you choose, you actually have discussion, people who say I don't agree with this, I think this is an interesting thing. And if there's anything in there that it recognizes, a reference to a book or music, for example, all of the verbs, like I want to buy that, or I want to get more information on that, are immediately available as you right click. So it's not just text. It's actually real world objects that are being referred to there, by having the XML schemas in the document. So we have a thing called Windows Schema that's part of our Biz Talk initiative that defines how that rich information is available in all the text that you work with in all your documents.
An excellent article that compares XML and CORBA was written by Mark Elenko and Mike Reinertsen, "XML & CORBA," Application Development Trends, September 1999, pp. 45-50. For some reason the article is not available online along with the other articles that are online at http://www.adtmag.com/ (Maybe it will be made available by the time you read this edition of New Bookmarks):
It is still important to sometimes distinguish CORBA from XML. CORBA is an enabling technology for creating sophisticated, distributed object systems on heterogeneous platforms. XML is a technology for conveying structured data in a portable way. CORBA allows users to connect disparate systems and form object architectures. XML will allow users to transmit structured information within, between and out of those systems, and to represent information in a universal way in and across architectures. Both technologies are platform-, vendor- and language-independent. The conceptual fit is perfect. To see where and how this fit is best realized, we will examine how to actually combine CORBA and XML from a series of widening perspectives.
"The dark side of XML and privacy," by Jack Vaughan, September 5, 2002 --- AppDevTrends@101communications-news.com
The data-describing power of XML could have a very dark side in the hand of mischievous individuals, says Ron Schmelzer, a senior analyst at industry analyst firm ZapThink, Waltham, Mass. "XML is essentially automating identity theft," said Schmelzer, a speaker at the XML Web Services One Conference in Boston.
By creating what Schmelzer described as a "human-readable, machine-processable, meta data-enhanced, text-based way of reading information that is tagged," XML has given developers a way to tag data fields that may be too efficient. With XML, developers don't really have the ability to tell DBAs to ignore the information. "It's like telling them not to think about polar bears. They're essentially drawing a big red flag" that points to those data fields holding sensitive information.
To resolve this problem, said Schmelzer, some programmers have turned to a strategy of obfuscation -- creating a field called XJ12 as the tag for credit cards, and splitting the credit card number into four fields or even hashing the number.
The Platform for Privacy Preferences is a popular XML-based effort that defines privacy policies in machine-readable formats and generates such policies. According to Schmelzer, attempts at offering customers P3P-based, user-centric services to store and access personal information, such as Microsoft Passport, the Liberty Alliance, CPExchange and Oasis CIQ, at best create as many questions as answers; at worst, they are doomed to failure.
All these plans have one thing in common: They use XML tags to standardize customer information. But, said Schmelzer, "if it's hard to [get agreement on] standardized simple address fields internationally, then think about how hard it will be to tag other, more complex forms of customer information."
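The obfuscation strategy Schmelzer mentions can be sketched by comparing a self-describing record with an opaque one. The Customer and CreditCardNumber tag names and both functions are assumptions for illustration; XJ12 is the tag name quoted in the article, and the number in the usage note is a commonly used test value, not a real card.

```python
import hashlib
import xml.etree.ElementTree as ET

def plain_record(card_number):
    """Self-describing record: the tag name advertises what the field holds."""
    record = ET.Element("Customer")
    ET.SubElement(record, "CreditCardNumber").text = card_number
    return ET.tostring(record, encoding="unicode")

def obfuscated_record(card_number):
    """Opaque tag name (XJ12, as in the article) plus a SHA-256 digest of the number."""
    record = ET.Element("Customer")
    digest = hashlib.sha256(card_number.encode("ascii")).hexdigest()
    ET.SubElement(record, "XJ12").text = digest
    return ET.tostring(record, encoding="unicode")
```

For example, plain_record("4111111111111111") exposes both the field's meaning and its value, while obfuscated_record("4111111111111111") exposes neither at a glance. Note that a bare hash of a short numeric field may still be guessable by exhaustive search, so this should be read as an illustration of the idea and not as a recommended protection.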
Those of you following the tremendous impact that XML is having and will soon have upon all networking may be interested in a special insert in the November 15 Edition of The Wall Street Journal called "Technology: The Providers." I have not been able to find an online version of this insert.
SIDE BY SIDE
Publicly, few Microsoft officials claim that Windows will dominate the Internet, and instead say they envision a world in which Microsoft operating-system and application software coexists peaceably with that of competitors. "Windows 2000 is our intellectual property, and we will continue to drive forward with that," says Bill Anderson, head of Web application services for Microsoft, based in Redmond, Washington. But, he adds, "in a heterogeneous environment down the road, it will become increasingly difficult to interject proprietary standards in a Web-based world."
So, for instance, Microsoft has embraced an industry-wide standard for distributing data known as XML, for Extensible Markup Language. XML seems likely to become the common language of electronic commerce, making it possible for businesses to exchange in a universal format purchase orders, product descriptions and other minutiae important to e-commerce.
Microsoft has driven aggressive efforts to standardize the use of XML across the industry, even establishing a clearinghouse of XML data types called biztalk.org. Some critics have been surprised at the company's embrace of the standard; many expected Microsoft to attempt to subvert it by adding proprietary extensions that would work only on computers that run Windows.
But so far, the company's approach to XML differs substantially from its defensive reaction a few years ago to Sun's Java technology, an earlier attempt to break Microsoft's lock-in by making it possible to transfer software programs across incompatible computers without modifying them. E-mail disclosed as a result of Microsoft's numerous legal tussles has shown that officials from Bill Gates on down feared Java's threat to Windows; as one Microsoft foe put it, company officials set out a strategy to "embrace, extend and extinguish" Java by building in extensions that would tie it closely to Windows. Microsoft eventually had to abandon that strategy when Sun sued and obtained a ruling that forced Microsoft to hew to Sun's Java standards; the matter remains before the courts.
This time, Mr. Anderson insists that Microsoft will adhere to industry-wide XML standards. "It benefits us to be a good player in XML space," he says. Most Microsoft-inspired extensions to the XML standard, he says, will be accepted industry-wide; any exceptions will be "one-off" solutions tailored to solve particular problems. "We're saying we're going to take that framework, build on it and extend it, and make sure it's robust for the Windows 2000 platform," Mr. Anderson says.
Microsoft, however, envisions XML as much more than a simple data-description language; instead, it considers the standard a way of letting programs communicate with each other across networks of otherwise incompatible machines. That job currently requires the use of Java or other similar technologies that create a layer of compatible "middleware" that allows programs to communicate. Microsoft competitors such as Sun and IBM consider the company's passion for XML little more than a thinly disguised attack on Java and other middleware technologies.
Indeed, some critics scoff at the notion that Microsoft intends to cooperate with the rest of the industry indefinitely. Fear of the Internet explains "why Microsoft is rushing so quickly to embrace the Web, to extend it in proprietary ways and get people to use those proprietary extensions," says Dan Kusnetzky, an analyst with International Data Corp. "Once you do, you are tied to Microsoft, which is trying to own the Web in a way no other provider is really trying to do."
XML rift may split HR applications standards. A schism is forming over the way developers use XML schema to integrate human resources software and services --- http://www.pcweek.com/a/pcwt9911152/2393058/
XHTML: A Bridge to the Future, Information Week, May 8, 2000, pp. 210-214. The article is not yet posted online, but eventually you will find it at http://www.informationweek.com/maindocs/archive.htm
Selected online references