[RFC-004]Comment-17
Organization: CEOSR GMU
Review of DAP2 specification
NASA's Earth Science Data Systems Standards Process Group (SPG) is considering the Data Access Protocol, Version 2 (DAP2), for adoption as a community standard. You are invited to review the DAP2 Request For Comment (RFC) (see review questions below).
Please send comments before November 12, 2004 to ese-rfc-004@spg.gsfc.nasa.gov
(Your background) Describe in a sentence or two your overall implementation experience related to the proposed specification. (e.g., server design; database management; systems architecture; data translation; scientific analysis; science users, etc.)
I have set up one plain DAP server and several DAP processing servers (same server, different versions), and have also worked on one DAP client for accessing (testing) data from both plain DAP servers and DAP processing servers. In addition, I have worked on ingesting metadata from DAP servers to support metadata search and DAP URL generation.
(Complete) Does the specification provide all the detail you need to implement it in software? (e.g., to write a client, a server, or a format or protocol translator) If not, describe what is missing in the specification.
I have no experience in client or server development. However, a few additions seem needed:
- If a sequence can be accessed through the hyperslabs in the projections (page 14), the starting index for a sequence should be stated in Section 3.3.4 (page 10), as is done for the dimension indices of an array. If the starting index is zero, the example at the bottom of page 10 should be modified.
- Section 6.2.1 (page 21) says that the server is under no obligation to use the requested encoding. When the server does not use the requested encoding, it should send an error message. Otherwise, the outcome is undetermined for the client.
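The encoding concern can be sketched from the client side. This is a minimal illustration, assuming the encoding is negotiated through the HTTP Accept-Encoding and Content-Encoding headers; the function name `check_encoding` and the header dictionary are hypothetical and are not taken from the DAP2 document.

```python
def check_encoding(requested, response_headers):
    """Report which encoding the server used and whether it was requested.

    requested: list of encoding names the client asked for (assumption).
    response_headers: dict mapping HTTP response header names to values.
    """
    # HTTP header names are case-insensitive, so normalize the keys first.
    headers = {name.lower(): value for name, value in response_headers.items()}
    # With no Content-Encoding header, the body is treated as unencoded.
    used = headers.get("content-encoding", "identity").strip().lower()
    honored = used in [name.lower() for name in requested]
    return used, honored
```

Without an explicit error message, a check like this is the only way for a client to learn that its requested encoding was not applied.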
(Accurate) Do any parts of the specification contain inaccuracies, or internal inconsistencies? If so, please provide details.
- Page 10. Since the indices of an array start from zero, and grid uses arrays, the example should use indices starting from zero.
- Page 14. There is a simple error in the definition of the relationship between the stride length and the starting and ending points. The correct formula should be
Length = Ending index - Starting index + 1.
The same errors appear in both the Array and Sequence paragraphs.
- Page 15. In Table 5, the equal sign operator should be applicable to String and URL types. The example, a few lines below, shows the usage of the operator with a String.
- Page 19. In the description of ext, "three letter string" is not a correct statement, because the extension dods uses four letters.
- Page 20. The content of Section 6.1.1.2 is not consistent with that on page 14 for arrays. On page 14, a hyperslab MUST include a starting index and an ending index. However, on page 20, the figure (in the rectangular box) apparently allows a hyperslab with a starting index only. Moreover, placing the optional stride between the two required numbers (start and end) may not be an optimal design. It would be better to keep start and end together.
- The matrix examples on page 13 and page 27 use a different order of indices for row and column. Apparently, one of them is incorrect.
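The hyperslab comments above (pages 10, 14, and 20) can be illustrated with a short sketch. It assumes zero-based, inclusive indices and the three bracket forms `[start]`, `[start:end]`, and `[start:stride:end]`. The function names are hypothetical, and the element count is this sketch's own assumption rather than a quotation from the specification; with a stride of 1 it reduces to Length = Ending index - Starting index + 1.

```python
def parse_hyperslab(text):
    """Parse '[start]', '[start:end]' or '[start:stride:end]'.

    Returns a (start, stride, end) tuple of zero-based, inclusive indices.
    """
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("hyperslab must be enclosed in brackets: %r" % text)
    parts = [int(p) for p in text[1:-1].split(":")]
    if len(parts) == 1:
        # A single index selects exactly one element.
        start = end = parts[0]
        stride = 1
    elif len(parts) == 2:
        start, end = parts
        stride = 1
    elif len(parts) == 3:
        # The optional stride sits between the two required numbers.
        start, stride, end = parts
    else:
        raise ValueError("too many fields in hyperslab: %r" % text)
    if start < 0 or end < start or stride < 1:
        raise ValueError("invalid hyperslab values: %r" % text)
    return start, stride, end


def hyperslab_length(start, stride, end):
    """Number of elements selected, counting both end points."""
    return (end - start) // stride + 1
```

For example, `hyperslab_length(*parse_hyperslab("[0:2:9]"))` selects indices 0, 2, 4, 6, 8 and returns 5.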
(Clear) Is any part of the specification ambiguous, or poorly explained? If so, please provide details.
- Page 12, line 5 from the bottom. The meaning of the word "shapes" is not clear.
- Based on the description on page 15 and the box content on page 21, it seems that two variables can be used in a relational expression. This should be stated explicitly on page 15.
- Page 25. The phrase _fillvalue is unclear. Should it be fill_value?
- Page 37. Giving the longitude value as west 50 degrees (and 60) is not consistent with the commonly accepted standard of signed values (-50).
- Page 38. The example looks unrealistic. If we only have 2D spatial information for a point, there should be only one pair of values. For multiple value pairs, there must be another dimension such as time or vertical layer with the data.
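The longitude comment for page 37 can be made concrete. This is a minimal sketch, assuming the convention that east longitudes are positive and west longitudes are negative; the function name `signed_longitude` is hypothetical.

```python
def signed_longitude(degrees, hemisphere):
    """Convert a magnitude plus an 'E' or 'W' label to a signed longitude."""
    if not 0 <= degrees <= 180:
        raise ValueError("longitude magnitude must be within 0..180")
    label = hemisphere.upper()
    if label == "E":
        return float(degrees)
    if label == "W":
        # West longitudes are written as negative values.
        return -float(degrees)
    raise ValueError("hemisphere must be 'E' or 'W'")
```

Under this assumption, `signed_longitude(50, "W")` gives -50.0 and `signed_longitude(60, "W")` gives -60.0.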
(Balanced) Does the standard describe the right set of concepts, behavior, data types, and data operations for its intended users? An overly broad set (requiring excessive complexity)? A narrowly simplistic set?
This standard should be extended to include mapping information and data of the NASA EOS swath type. For the former, standard mapping schemes such as those used in NWP models should be included, so that a client can use data that are defined by an array but not sufficiently described by the map vectors in a grid. An extension that includes 2D maps can solve both the problem above (inefficiently) and the swath data problem.
(Useful) How well does this specification meet your information sharing needs? (e.g., Does it work well with the data types and data manipulations in your application? Does it improve on alternative methods, such as file exchange or proprietary software?)
For most climate studies that use NASA level 3 data, this specification will meet the information sharing needs. This data delivery mechanism is also significantly superior to other data transfer mechanisms such as FTP.
(Implementable) What implementation challenges does the proposed standard present? (e.g., Does it require advanced processing power, large amounts of memory, complex configuration, etc.? Does it scale to a production environment?)
The implementation of this standard should not require advanced processing power. However, users of DAP systems may need to allocate enough memory to access a large chunk of data. Since the system is designed as a client-server system, it is highly scalable. Of course, as with any other system, if a site (server) becomes very popular and receives a large number of access requests, the performance of the server may degrade.
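The memory point can be estimated before a request is issued. This is a rough sketch, assuming the client holds the whole response in memory as fixed-size elements; the function name `request_bytes` and the 4-byte element size in the usage note are illustrative assumptions, and protocol overhead is ignored.

```python
def request_bytes(dim_lengths, bytes_per_element):
    """Estimate the memory needed to hold one requested array.

    dim_lengths: number of elements selected along each dimension.
    bytes_per_element: storage size of one value (assumed fixed).
    """
    total = bytes_per_element
    for length in dim_lengths:
        if length < 0:
            raise ValueError("dimension lengths must be non-negative")
        total *= length
    return total
```

For example, `request_bytes([365, 180, 360], 4)` returns 94608000, so a year of daily one-degree global fields with 4-byte values needs roughly 95 MB on the client.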