Simple data management for public data.
The Open Data movement has made enormous strides in making more government data available to citizens, researchers, policy analysts and journalists, but data is still too difficult, expensive and time-consuming to use. Too many datasets are unusable because they are undocumented or have poor metadata. Datasets aren’t linkable, so we can’t create aggregated state-level datasets of, for instance, flu cases, restaurant inspections, or crime reports. Building indicator websites, in turn, requires large budgets to pay for expensive data management.
The solution to many of these problems is better metadata, the data that describes a dataset. With better metadata, and a simple way to link metadata to data, data producers can create datasets that are both human- and machine-readable, allowing for low-cost, automatic processing such as visualization, transformation and federation.
Our proposal is a simple method of adding metadata to data. It is as easy as copying two additional worksheets into an Excel file, yet it also extends to large datasets comprising hundreds or thousands of files. Because the method is consistent across all datasets, it can be incorporated into data analysis tools and data repositories, and it is simple enough that anyone can create a compliant data package with nothing more than Excel and 30 minutes of training.