
Review-10

by Ananth Rao last modified 2007-05-25 06:40

University Research Organization

Review of HDF5 operational readiness:

NASA's Earth Science Data Systems Standards Process Group (SPG) is considering HDF5 for adoption as a community standard. This is the second review of HDF5, and it focuses on readiness for operational use. The questions below are provided to guide feedback from data systems, application providers, instrument teams, and others. You only need to answer the questions applicable to you. Please send comments to spg-rfc-007@lists.nasa.gov.

  1. Describe in a sentence or two your overall experience related to HDF5 (e.g., science data provider, science data systems, software tools developer, science data user, etc.).

    We are developers of software tools that make use of HDF5.

    So far, we are impressed with HDF5's released versions. They seem very well tested and debugged. HDF5 sometimes also seems more complex than necessary, but it's part of our job to hide any unnecessary complexity.

    We're a little worried that the binary format changes from release to release, which will make the format increasingly hard to maintain over time. Although backward compatibility in the format has mostly been achieved so far, continuing it over a long period may prove difficult. Similarly, supporting backward compatibility in future versions of the API may be difficult.

  2. Do you currently use or plan to use HDF5 in a production setting? What types of applications do you use with HDF5? Is HDF5 applicable to your applications (e.g., Does it work well with the data types and data manipulations in your application?)

    Yes, we plan to make use of HDF5 in our released software, which will support data access, analysis, and visualization for research and education in climate, oceanography, atmospheric sciences, and other geosciences.

  3. Why do you choose to use HDF5 over other data formats for your applications?

    We would like to leverage an existing format to get needed features, rather than developing a new implementation from scratch. The HDF5 features we particularly need include chunking, compression, parallel I/O, portable structs, user-defined variable-length types, efficient schema changes, and reader-makes-right conversion.
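
    As an illustration of what we mean by reader-makes-right conversion, here is a minimal Python sketch using only the standard struct module. It is not HDF5's implementation: the one-byte byte-order tag and the names write_native and read_any are hypothetical, chosen only to show that the writer stores data in its own byte order and the reader converts when needed.

```python
import struct
import sys

def write_native(values, byteorder=sys.byteorder):
    """Writer side: store 32-bit ints in the writer's own byte order,
    prefixed by a one-byte tag (a hypothetical layout, not HDF5's)."""
    tag = "<" if byteorder == "little" else ">"
    return tag.encode() + struct.pack(f"{tag}{len(values)}i", *values)

def read_any(blob):
    """Reader side: inspect the tag and convert to native Python ints,
    whatever byte order the writer used."""
    tag = blob[:1].decode()
    count = (len(blob) - 1) // 4  # 4 bytes per 32-bit int
    return list(struct.unpack(f"{tag}{count}i", blob[1:]))

# The same values written on a little-endian and a big-endian machine.
little = write_native([1, 2, 70000], "little")
big = write_native([1, 2, 70000], "big")
```

    Both blobs differ on disk but read back to the same values, so the writer never pays a conversion cost.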

  4. Have you or your users encountered any difficulty when using some of the data access or visualization tools (e.g., IDL, GrADS, etc.) on HDF5 data files? If you have, please provide a brief description of your experience.

    We haven't made much use of other tools to access HDF5. Some users have reported difficulties accessing remote HDF5 data through OPeNDAP.

  5. Does the performance of HDF5 you have experienced meet your requirements? (e.g., Can it handle the data types in your applications? Does it take a long time to read and write HDF5 files?)

    The performance meets our requirements in the limited testing we have done.

  6. What operational challenges or limitations does HDF5 present? (e.g., Does it take a long time to learn how to use it? Does it require advanced processing power, large amounts of memory, complex configuration, etc.?)

    HDF5 is a large library, so it takes some time to learn. In some cases, it seems more general than needed, for example, by permitting loops in group inclusion graphs.
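
    To make the concern about loops concrete, here is a small Python sketch of the check a tool must perform before walking a group hierarchy recursively. It does not use the HDF5 API: the dictionary representation and the name has_cycle are assumptions made for illustration only.

```python
def has_cycle(groups):
    """Return True if the group-inclusion graph contains a loop.

    `groups` maps a group name to the list of group names it links to
    (a hypothetical in-memory stand-in for an HDF5 group hierarchy).
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {name: UNSEEN for name in groups}

    def visit(name):
        state[name] = IN_PROGRESS
        for child in groups.get(name, []):
            child_state = state.get(child, UNSEEN)
            if child_state == IN_PROGRESS:
                return True  # reached a group that is still being walked
            if child_state == UNSEEN and visit(child):
                return True
        state[name] = DONE
        return False

    return any(state[name] == UNSEEN and visit(name) for name in list(groups))

# A plain tree, a group shared by two parents, and a genuine loop.
tree = {"/": ["a", "b"], "a": ["c"], "b": [], "c": []}
shared = {"/": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}
looped = {"/": ["a"], "a": ["b"], "b": ["a"]}
```

    A group shared by two parents is not a loop, but it still means a naive traversal would visit that group twice.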

  7. What benefits does HDF5 present? Do the benefits of HDF5 outweigh the challenges? (e.g., Does it offer the flexibility you want to package the data types in your applications? Does it facilitate interdisciplinary studies?)

    HDF5's benefits include support for a rich set of data types (including user-defined types), variable compression, multidimensional chunking, parallel I/O, efficient schema changes, and reader-makes-right conversion. We think the benefits outweigh the challenges, though this remains to be confirmed, since our software has not yet been released to users.
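
    As a sketch of why multidimensional chunking matters for access patterns, the following Python function computes which chunks a rectangular selection overlaps. It is an arithmetic illustration only, not an HDF5 call; the name chunks_touched and its parameters are hypothetical.

```python
from itertools import product

def chunks_touched(start, count, chunk_shape):
    """Return the indices of the chunks overlapped by a rectangular selection.

    start, count, and chunk_shape are per-dimension tuples; each count
    must be at least 1. Only these chunks would need to be read (and
    decompressed) to satisfy the selection.
    """
    per_dim = []
    for s, n, c in zip(start, count, chunk_shape):
        first = s // c            # chunk holding the first selected element
        last = (s + n - 1) // c   # chunk holding the last selected element
        per_dim.append(range(first, last + 1))
    return list(product(*per_dim))

# A 10 x 10 block starting at (0, 95) in a dataset chunked as 100 x 50.
touched = chunks_touched(start=(0, 95), count=(10, 10), chunk_shape=(100, 50))
```

    In this example the block straddles a chunk boundary in the second dimension, so two chunks are touched rather than one.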

  8. How much data do/will you provide or archive in HDF5? (number of distinct data products or data sets, total data volume, number of files.)

    We provide software rather than data.

  9. How many users do you have or expect to have for data in HDF5, and what is your expected user community?

    We hope a large fraction of our users will move to the version of the software that makes use of HDF5. If that happens, it could mean more than 100,000 users.

 

NASA - National Aeronautics and Space Administration
Curator: Jody Gibson
NASA Official: Richard Ullman