– Executive Summary –
Research universities need to share information, whether through teaching or disseminating key innovations to society. However, universities should not share all research with everyone.[1] If universities fail to protect sensitive data, malicious actors who acquire it could harm citizens personally, financially, or even fatally.
The greater the number of researchers collaborating on a complex problem, like cancer, the less time it may take to find a cure.[2] Information sharing is part of a university’s mission and potentially a regulatory or legal obligation.[3] Because research is valuable, it needs to be propagated to reach its potential. As Steven Johnson writes, “The trick to having good ideas is not to sit around in glorious isolation and try to think big thoughts. The trick is to get more parts at the table.”[4] Researchers must share the data “parts” on the global collaboration “table” to realize big ideas.
When sharing university research data, scholars must balance open access with restricted access to sensitive information. Universities must therefore track and secure sensitive data to prevent nefarious actors from stealing or weaponizing it. For instance, if a malicious actor stole data generated from biodefense projects, United States service members and citizens, along with United States allies, could be at great risk. Furthermore, if a malevolent group exploited lethal disease or toxic threat research, it could produce novel bioagents against which our country has no protection. These are just two of the many possible consequences of the theft and use of research data for pernicious purposes. Access management and tracking must take priority among research universities and homeland security experts.
Addressing these concerns will require creating a novel collaborative scientific environment in which researchers and other academically minded individuals openly share and debate ideas and findings, research is verified as unique or properly attributed prior to publication, and every participant is vetted as trustworthy. Perhaps most importantly, this ideal environment will prevent censoring and corruption of ideas, data, and progress by any nation, state, or malicious individual. This ideal collaborative space would immediately benefit scientists and universities; moreover, if successful for academic purposes, this environment could expand to include private industry research and government laboratories. By examining existing technologies and identifying their gaps, this thesis offers a hypothetical solution for the ideal research sharing environment. Using lessons from what exists today combined with ideas for the technology of tomorrow, this thesis outlines new technologies for an open and trusted sharing environment where unique data and ideas can be traceably shared without fear of deletion.
This thesis answers the question: how can research universities openly and with trust share verified unique data that is both immutable and ultimately traceable? This involves four processes:
- Securely storing sensitive information online so that it is accessible only to authorized individuals,
- Authorizing access to trusted parties with minimal risk of exposure,
- Verifying authorship and tracking access to guarantee that sensitive information is not tampered with or plagiarized,
- Preventing research information from being deleted prior to submission.
The author proposes a combination of key management, encryption, and validation to allow sharing of information while simultaneously preventing its distribution to unauthorized parties. The general outline of this solution is as follows.
A standard public-private key infrastructure (PKI) and Document Object Architecture is proposed for sharing documents among authorized parties while maintaining immutability and secure access. Proposed artificial intelligence techniques guarantee uniqueness and immutability of sensitive data and documents. A highly modified blockchain ledger similar to the X-Road is used for tracking and keeping records of who has had access in the past and present.
A PKI process similar to the HTTPS protocol currently employed by Internet browsers may be used to establish a trusted path between document owners and document users. Public keys are used to encrypt requests, and private keys are used to decrypt documents stored as Digital Object Identifier (DOI) objects. In place of a centralized certificate authority, a distributed blockchain ledger and associated algorithms are used to track and manage access. The blockchain mechanism and novel beacon technology guarantee traceability, and symmetric key encryption of documents guarantees security.
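The traceability property of a ledger can be illustrated with a minimal sketch. The Python example below is not the system proposed in this thesis; it assumes a single in-memory, hash-chained log in place of a distributed blockchain, and the names `AccessLedger`, `record`, `verify`, and `history` are hypothetical. It shows only how chaining each access record to the hash of the previous one makes later tampering detectable.

```python
from __future__ import annotations

import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(body: dict) -> str:
    """Hash an entry's fields deterministically (keys sorted before encoding)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class AccessLedger:
    """Hypothetical append-only log; each entry commits to the one before it."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def record(self, user: str, doi: str, action: str) -> dict:
        """Append an access record linked to the hash of the previous record."""
        prev = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        body = {
            "index": len(self.entries),
            "user": user,
            "doi": doi,
            "action": action,
            "prev_hash": prev,
        }
        entry = dict(body, hash=entry_hash(body))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and link; any edited entry breaks the chain."""
        prev = GENESIS_HASH
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev or entry_hash(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True

    def history(self, doi: str) -> list[str]:
        """Return, in order, the users who have touched a given document."""
        return [e["user"] for e in self.entries if e["doi"] == doi]
```

In this sketch, calling `record` for each PUT or GET and later calling `verify` would reveal whether any past record had been altered. A real deployment would additionally need distribution across parties, consensus, and signed entries, none of which are modeled here.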
A system of smart contracts provides a PUT operation for authorized parties to add sensitive information to the system, and a GET operation for document retrieval and authorization. The smart contract blockchain is similar to the X-Road system employed by Estonia, but with significant differences:
- Anti-plagiarism verification is integrated into the PUT operation
- The distributed ledger smart contract manages PKI, rather than a central authority
- Documents are encrypted and assigned a DOI that resides in the ledger(s)
- A beacon is inserted in each GET operation, allowing for files to be tracked, and muted as necessary, after being downloaded
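The PUT and GET operations listed above can be sketched in simplified form. The following Python example is an illustration under stated assumptions, not the smart contract design itself: it runs in memory rather than on a ledger, it omits document encryption, it uses an exact-match content hash as a crude stand-in for anti-plagiarism verification, it issues a made-up DOI-like identifier, and it models the beacon as a random token recorded at download time. The names `DocumentRegistry`, `put`, `get`, and `trace` are hypothetical.

```python
from __future__ import annotations

import hashlib
import secrets


class DocumentRegistry:
    """Hypothetical in-memory stand-in for the PUT/GET smart contracts."""

    def __init__(self, authorized: set[str]) -> None:
        self.authorized = set(authorized)
        self.documents: dict[str, bytes] = {}           # DOI-like id -> content
        self.fingerprints: dict[str, str] = {}          # content hash -> DOI-like id
        self.beacons: dict[str, tuple[str, str]] = {}   # beacon -> (user, id)

    def _require_authorized(self, user: str) -> None:
        if user not in self.authorized:
            raise PermissionError(f"{user} is not an authorized party")

    def put(self, user: str, content: bytes) -> str:
        """Register a document unless identical content already exists."""
        self._require_authorized(user)
        # Assumption: exact-duplicate detection only; real plagiarism
        # checking would need far more than a hash comparison.
        fingerprint = hashlib.sha256(content).hexdigest()
        if fingerprint in self.fingerprints:
            existing = self.fingerprints[fingerprint]
            raise ValueError(f"content already registered as {existing}")
        # Assumption: a made-up identifier, not issued by any DOI registrar.
        doi = f"10.99999/sketch.{len(self.documents) + 1}"
        self.documents[doi] = content
        self.fingerprints[fingerprint] = doi
        return doi

    def get(self, user: str, doi: str) -> tuple[bytes, str]:
        """Return the document with a fresh beacon tied to this download."""
        self._require_authorized(user)
        content = self.documents[doi]
        beacon = secrets.token_hex(16)
        self.beacons[beacon] = (user, doi)
        return content, beacon

    def trace(self, beacon: str) -> tuple[str, str] | None:
        """Look up which user and document a recovered beacon belongs to."""
        return self.beacons.get(beacon)
```

In this sketch, a beacon recovered from a leaked copy could be passed to `trace` to identify the download it came from. How a beacon would be embedded in a file, or used to mute it after download, is left open here.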
Innovation does not occur in a vacuum. As Steven Johnson writes, “Good ideas may not want to be free, but they do want to connect, fuse, recombine. They want to reinvent themselves by crossing conceptual borders. They want to complete each other as much as they want to compete.”[5] Innovators must collaborate. The greatest minds in the world must be able to work together to solve the world’s most daunting problems. Facilitating on-demand global intellectual summits or collaboration colliders will make the world a better place, if done correctly. Achieving this on a daily basis will require a new digital collaboration and sharing environment. This environment will allow research universities to openly and with trust share verified unique data that is both immutable and ultimately trackable. What are the next steps to make this environment a reality? First, examine current technology’s ability to meet the defined needs. Second, evaluate the identified technologies against the ideal environment as defined by the thesis question. Third, propose a solution that meets the ideal environment. Finally, propose future projects to bring the environment from theory to reality.
Though this technology can help many different sectors, including the government and private industry, the ideal test market for this new technology is the academic research setting. Universities have a need to share information. For financial, legal, and prestige reasons, research universities are an ideal market for this new technology to succeed. In addition to being generators of invention and innovation, universities also have highly intelligent workforces and understand the value of open information sharing. As discussed previously in the problem statement, university research, when used as intended, has the potential to improve life through gene therapies and replacement organs and by increasing nutrition and food security globally. Maintaining the safety and security of sensitive and potentially dangerous information while sharing it productively requires better technology than exists today. As examined in this thesis, existing technologies cannot meet the needs of researchers collaborating globally today.
Though existing technologies cannot create an open, trusted sharing environment of verified unique data that is immutable and trackable, they can provide a foundation from which to build new technology. The solution applications proposed above can hypothetically meet all the prescribed needs of the thesis question. Unfortunately, the proposed technologies also have drawbacks. Universities, researchers, and homeland security experts must pursue a solution, perhaps the one described in this thesis, to protect our universities’ sensitive research data, our country’s health from bioengineered diseases, and our nation’s security from threats posed by maliciously misused research data.
[1] “Responsible Conduct of Research: Data Acquisition and Management,” Columbia University, accessed May 11, 2017, http://ccnmtl.columbia.edu/projects/rcr/rcr_data/foundation/.
[2] Robert W. Rycroft, “Does Cooperation Absorb Complexity? Innovation Networks and the Speed and Spread of Complex Technological Innovation,” Technological Forecasting and Social Change 74, no. 5 (June 1, 2007): 565–78, https://doi.org/10.1016/j.techfore.2006.10.005.
[3] To provide evidence for the assertion that information sharing is a part of the university’s mission, I queried university mission statements cited here. The National Institutes of Health reference provides evidence for the regulatory and legal obligation of research universities to share information.
See “Mission,” University of Kansas, accessed June 22, 2017, https://ku.edu; “History & Mission,” Johns Hopkins University, accessed June 22, 2017, https://www.jhu.edu/about/history/; “Stanford’s Mission,” Stanford University, accessed June 22, 2017, http://exploredegrees.stanford.edu/stanfordsmission/; “NIH Data Sharing Policy,” National Institutes of Health, accessed March 2, 2018, https://grants.nih.gov/grants/policy/data_sharing/data_sharing_brochure.pdf; “Responsible Conduct of Research: Data Acquisition and Management,” Columbia University, accessed May 11, 2017, http://ccnmtl.columbia.edu/projects/rcr/rcr_data/foundation/.
[4] Steven Johnson, Where Good Ideas Come from: The Natural History of Innovation (New York: Riverhead Books, 2010).
[5] Johnson, Good Ideas, 22.