The biosciences have become information sciences, in which knowledge is often produced in silico, through the manipulation and analysis of large datasets. Genomics has been at the forefront of the data explosion and is a model for bioscience as a large-scale endeavor. Large genomic research datasets are frequently shared through research repositories. To protect the interests of the people from whom the data were derived (data sources), human data are often shared through a controlled access mechanism, in which data repositories can, in theory, place limitations on who uses the data and for what purpose. Controlled access is an innovative governance mechanism, but it may not protect data sources in the way policy makers intended. Here, I describe one controlled access process in some detail and provide insight into how and why researchers fail to comply with data use restrictions.