Today's theme in the ongoing Love Your Data Week series is "Finding the Right Data." On the one hand, this ties in closely with our discussion from Tuesday about documentation and fitness for purpose - documenting data with good, thorough metadata not only helps interested researchers discover your data, but can also help them understand the application and limitations of a given dataset across different contexts.
On the flip side, in addition to being data producers, many of us are - possibly to an even larger extent - data consumers. As part of the research process, discovering and assessing existing data is analogous to a literature search and review. It's important to know what else is out there, what other researchers in our field have done and are doing, and how we can extend or apply existing data to our current research. Accordingly, it's important to develop search strategies specific to data discovery - to treat data reference as an activity distinct from a literature search. It's also important to develop data literacy - the ability to evaluate the quality of a given dataset and its applicability to particular research objectives. Finally, once relevant datasets have been discovered and applied to a research project, it's especially important to cite the data just as we would cite other resources.
The Love Your Data Week site has pointers to some very useful resources: https://loveyourdata.wordpress.com/lydw-2017/thursday-2017/
We're also throwing in some RDS favorites for the love of it:
Data Repositories and Registries
Discovery and Citation
As a case in point, we recently worked with a researcher looking - as many are these days - for climate data. Some quick data reference produced the following results, which emphasize legacy and historical data: