For most biotechnology and biopharmaceutical organizations, “business as usual” means a perpetual race to the finish line: Conceive a new invention, reduce it to practice, attain patent protection, repeat ad infinitum. But sometimes, the very technologies scientists use to expedite that chain of events (e.g., electronic laboratory notebooks and cloud-based laboratory data sharing) create security and authenticity holes. In essence, the more agile and sophisticated our workflow systems become, the more difficult it becomes to guarantee the integrity of our underlying data. In today’s aggressively litigious climate, even the slightest, most theoretical potential for data inauthenticity can be enough to invalidate years of arduous research and development.
Numerous documented cases of scientific intellectual property (IP) losses, either to competitors or to the public domain, have led to hundreds of millions of dollars in lost biotechnology and biopharmaceutical revenues, legal fees, and damages. It’s a complex problem that cannot be remedied retroactively or through traditional means. We have no choice but to innovate. Just as our efficiency processes have evolved, so too must our preemptive security protocols.
Burden of Proof
Scientific intellectual property, the lifeblood of most biotechnology and biopharmaceutical entities, is notoriously difficult to protect. Insurance providers, for instance, cannot adequately cover the loss of intangible assets, which typically constitute 70% or more of a biotechnology company’s market value. In fact, the maximum policy coverage available today (~$25 million) would cover only 4% of Merck’s total intellectual property assessment.
Because the risk isn’t transferable, bioscience executives have a fiduciary responsibility to adopt reliable mitigation tactics in anticipation of competitive legal action. Most commonly, those IP lawsuits are instigated by one of two scenarios:
- knowledge diffusion, in which institutional knowledge is sent outside an organization — knowingly or unknowingly — to potential competitors
- concurrent development, in which two or more entities vie for the intellectual property rights of similar discoveries or formulas.
In the first scenario, once an unprotected IP asset is electronically dispersed beyond an organization’s walls, it can be exceedingly difficult to prove ownership. In the second scenario, the US patent system honors the principle of “first to invent.” To benefit from it, however, you must irrefutably prove priority of invention and reduction to practice.
In a third scenario, insider manipulation, trusted employees tamper with internal data to better meet objectives or maximize profits. In fact, because of the inherent risk of such manipulation (particularly in organizations with decentralized workflow management), bioscience companies must prove the legal credibility and authenticity of their intellectual property at every point throughout its “chain of custody.”
In other words, opposing counsel will challenge not only the authenticity and credibility of a company’s laboratory data, but also the trustworthiness of the people, processes, and systems responsible for its safekeeping. Hence, to preserve legal defensibility, an informatics protection solution must satisfy several highly specific criteria: longevity of protection, independent verifiability, portability, and standards-based compliance.
Longevity of protection ensures that data authenticity can be proven at any point in time during a record’s lifetime. Independent verifiability ensures that data integrity can be corroborated independently of a company’s people, processes, and systems.
Portability ensures that integrity protection is not lost when data are migrated or exchanged. Standards-based compliance ties data authenticity to a trusted international standards body such as the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) or to a country-specific standards body such as the American National Standards Institute (ANSI).
Examination of individual mechanisms should take into account those governing criteria. Solutions lacking in any component are at high risk of legal invalidation and may fail to meet the burden of proof, potentially resulting in lost IP ownership, lost market share, and ultimately lost revenue.
Arrows In the Quiver
Many biotechnology organizations use one of four approaches to protect the integrity and authenticity of their scientific IP. Each has a graded level of integrity protection (Table 1).
Table 1: Comparison chart
PKI-Based Digital Signatures: A mechanism for reliably associating verified user identity with an electronic document, public key infrastructure (PKI)-based digital signatures effectively provide the “who” component of document authentication. In essence, a user is assigned a randomly generated pair of corresponding public and private keys. A digital signature is derived by running a user’s data and private key through a signing algorithm. Later, using that signature, the data, and the user’s public key, a signature-verifying algorithm can determine, to a relative level of certainty, that the data are authentic.
Strictly speaking, when you verify a digital signature, you verify that the corresponding document has not changed since it was “signed” by an identifying party. What digital signatures cannot confirm, however, is the “when” component. In other words, a signed document could be altered, then re-signed, and still result in a perfectly valid signature.
Moreover, digital signatures become difficult to verify over the long term, because certification authorities typically will not provide revocation information for expired certificates. Furthermore, the cryptographic primitives used to create signatures and revocation information have been known to weaken over time. That can undermine their credibility in a court of law.
Repudiation may occur when the signer claims that someone else has used his or her private key. Assuming a key was indeed compromised at some point in time, it becomes imperative to determine the key’s validity at the time of signing. The digital signature’s lack of an independent time value, its inability to withstand the test of time, and its potential for misrepresentation are the sources of uncertainty. For this reason, most people consider digital signatures and time stamps separate but complementary technologies. This approach merits a D+ integrity protection grade.
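The sign, verify, and re-sign behavior described above can be sketched in a few lines. This is a toy illustration only, assuming textbook RSA without padding and a key pair built from two well-known Mersenne primes; the record contents and all names (sign, verify, digest) are hypothetical, and none of this is suitable for production use.

```python
import hashlib

# Toy key pair built from two known Mersenne primes. Illustration only:
# textbook RSA without padding is NOT secure and is an assumption of this sketch.
P, Q = 2**89 - 1, 2**107 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)

def digest(data: bytes) -> int:
    """Reduce the SHA-256 digest of the data to an integer below N."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data: bytes) -> int:
    """Create a signature with the private exponent."""
    return pow(digest(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    """Check a signature using only the public values E and N."""
    return pow(signature, E, N) == digest(data)

record = b"Compound X-17 synthesized, yield 42.1 mg"   # hypothetical record
sig = sign(record)
valid = verify(record, sig)                  # the signed record verifies

altered = b"Compound X-17 synthesized, yield 48.1 mg"
altered_fails = not verify(altered, sig)     # old signature rejects the change

# The gap: the key holder can simply re-sign the altered record.
resigned = sign(altered)
resigned_valid = verify(altered, resigned)   # valid, yet says nothing about "when"
```

The last two lines show why a signature alone cannot establish time: the altered record carries a perfectly valid signature, and nothing in the signature reveals when it was produced.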
Secure Hashing: Cryptographic hash algorithms (e.g., MD5 and SHA-2) can be used to detect digital data tampering. By running your data through a mathematical algorithm, you produce a unique hash value associated with that content, or a unique “digital fingerprint.” Should your data be altered and the hash value recomputed, those hash values will not match. Hence, the ability to replicate a previously derived hash value is strong assurance that your data are authentic. If you are confident in the strength of the underlying hashing technology, you should have a high degree of confidence in the authenticity of the content itself.
However, as with digital signatures, establishing time of creation becomes the operative element. Simply providing a court with a data set and hash value proves little. The data could have been altered many times, and the hash value computed the day before a trial.
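A minimal sketch of both points, using SHA-256 from Python’s standard hashlib module. The record contents and the helper name fingerprint are hypothetical and serve only as illustration.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

original = b"Assay 17: yield 42.1 mg at pH 7.4"   # hypothetical record
recorded = fingerprint(original)

# An unchanged record reproduces the recorded fingerprint.
unchanged_ok = fingerprint(original) == recorded

# Changing a single character yields a different fingerprint.
tampered = b"Assay 17: yield 48.1 mg at pH 7.4"
tampered_matches = fingerprint(tampered) == recorded

# The weakness: a hash computed after the alteration is just as
# self-consistent, and carries no evidence of when it was computed.
late_hash = fingerprint(tampered)
fresh_pair_consistent = fingerprint(tampered) == late_hash
```

The comparison against recorded detects tampering only if recorded can be shown to predate the dispute. The pair (tampered, late_hash) passes the same check, which is why a hash without a bound time value proves little.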
To make meaningful statements about data integrity, one must indelibly bind a valid time value to a hash, preferably from a trusted source such as the National Institute of Standards and Technology (NIST) or United States Naval Observatory (USNO). The auditability of a time source, the security of time-value binding, and impregnability of a hash algorithm can all be challenged by opposing counsel. In fact, any mechanisms that can be circumvented by application developers, administrators, or vendors may be rejected in court.
As with digital signatures, hash algorithms can grow weaker over time as attempts are constantly made to crack them, so they must be periodically “renewed” to maintain efficacy. Naturally, it’s crucial that such renewals not interfere with the original time-value bindings. This mechanism merits a C integrity protection grade.
PKI-Based Time-Stamping: As an attempt to solve the “when” component of document authentication, PKI-based time stamps (for example, RFC 3161 time stamps) bind a time value to a data set using a pair of public and private keys. In this case, it’s the private key of a time-stamp authority (TSA) that’s used to create the time-based signature.
One issue with PKI-based time stamps is that you have to trust the TSA to create time stamps that accurately reflect the current time (not backdated). PKI-based time-stamping systems have technical and procedural measures to make backdating difficult, but the trust factor here makes PKI-based time stamps subject to challenge.
As with any PKI-based digital signature, potential for key compromise creates a risk of time-stamp forgery and invalidation. If a time-stamping key is compromised, then anyone with access to that key could create a time stamp that appears legitimate. Furthermore, there’s nothing to prevent a TSA itself (or even an employee or malicious third party) from backdating a time stamp. In addition, a TSA key compromise would immediately render all previously issued time stamps invalid, possibly affecting defensibility of years or decades of research. Key life, too, is a significant (and separate) impediment. TSA keys, because they are PKI-based, have fixed life spans that are generally reflected in the expiration date of each certificate. After expiration, it may no longer be possible to authenticate any associated time stamps. This mechanism merits a B– integrity protection grade.
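The trust problem can be made concrete with a small sketch. It is not an RFC 3161 implementation: as a loudly stated assumption, an HMAC over the hash and time value stands in for the TSA’s private-key signature, and the key, function names, and dates are hypothetical. The point it illustrates carries over to real key-based schemes: whoever holds the key decides what time goes into the token.

```python
import hashlib
import hmac

# Stand-in for the TSA's private key (assumption: a real TSA uses an
# asymmetric key pair and ASN.1-encoded tokens, not an HMAC secret).
TSA_KEY = b"hypothetical-tsa-secret"

def issue_token(doc_hash: str, time_value: str, key: bytes = TSA_KEY) -> dict:
    """Bind a time value to a document hash under the TSA key."""
    message = f"{doc_hash}|{time_value}".encode()
    seal = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {"hash": doc_hash, "time": time_value, "seal": seal}

def check_token(token: dict, key: bytes = TSA_KEY) -> bool:
    """Recompute the seal and compare it with the one in the token."""
    message = f"{token['hash']}|{token['time']}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token["seal"])

doc_hash = hashlib.sha256(b"Notebook page 112").hexdigest()

honest = issue_token(doc_hash, "2010-03-01T14:00:00Z")
honest_ok = check_token(honest)

# Without the key, editing the time value breaks the seal.
forged = dict(honest, time="2008-01-01T00:00:00Z")
forged_ok = check_token(forged)

# With the key (a dishonest TSA, an insider, or a thief), a backdated
# token is indistinguishable from an honest one.
backdated = issue_token(doc_hash, "2008-01-01T00:00:00Z")
backdated_ok = check_token(backdated)
```

Here forged_ok is False while backdated_ok is True: the binding is only as trustworthy as the custody of the key and the honesty of its holder.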
Hash-and-Link, Keyless Digital Time-Stamping (“Widely Witnessed”): One alternative to using owner-based, secret “keys” when binding time values to data sets is a US-patented process known as the keyless, hash-chain linking (or “hash-and-link”) methodology. What makes this approach unique is its non-PKI-based, “widely witnessed” digital time-stamping and publishing process. All time-stamp and content hash values are cryptographically linked into a chain, from which an algorithmically verifiable hash-and-link summary value is generated and periodically published (e.g., in a newspaper).
Just as state lotteries televise their drawings, publishing a hash-and-link summary value in a widely read, global newspaper such as The New York Times demonstrates that the process is independently auditable. It also eliminates the risk of repudiation due to collusion or (in the case of PKI-based approaches) key compromise. In addition, under this “widely witnessed” approach, PKI-key life becomes a nonfactor, because there is no key required. Weakening of the underlying cryptographic technology is less of an issue as well, because the only cryptographic elements involved in this method are the hash functions themselves. If stronger hash algorithms become available in the future, then hash-and-link can be “renewed” with the new algorithms without disrupting original time values, thus preserving the previously “sealed” content.
In terms of legal defensibility, this most recent method represents what many consider a significant leap forward. So long as data authentication solutions depend on the reliability of outside parties, no matter how trustworthy, there will inevitably be credibility and security risks. With “hash-and-link” digital time-stamping, coupled with the “widely witnessed” published summary hash value, however, you need only trust the strength of the algorithms themselves. For these reasons, this mechanism merits an A+ integrity protection grade.
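The linking idea can be sketched as a simple linear hash chain. This is an assumption made for illustration, not the patented process or any vendor’s implementation: each link is the SHA-256 hash of the previous link, the document hash, and the time value, and the final link plays the role of the published summary value. All names, records, and dates are hypothetical.

```python
import hashlib

def h(*parts: str) -> str:
    """Hash several string fields together with SHA-256."""
    return hashlib.sha256("|".join(parts).encode()).hexdigest()

def build_chain(entries, seed="genesis"):
    """Link (doc_hash, time_value) pairs; each link depends on all earlier ones."""
    links = []
    previous = h(seed)
    for doc_hash, time_value in entries:
        previous = h(previous, doc_hash, time_value)
        links.append(previous)
    return links

def summary_matches(entries, published_summary, seed="genesis"):
    """Recompute the chain and compare its last link with the published value."""
    links = build_chain(entries, seed)
    return bool(links) and links[-1] == published_summary

entries = [
    (hashlib.sha256(b"Notebook page 110").hexdigest(), "2010-03-01T09:00:00Z"),
    (hashlib.sha256(b"Notebook page 111").hexdigest(), "2010-03-01T11:30:00Z"),
    (hashlib.sha256(b"Notebook page 112").hexdigest(), "2010-03-01T14:00:00Z"),
]

# The summary value that would be published, e.g. in a newspaper.
published = build_chain(entries)[-1]
intact = summary_matches(entries, published)

# Backdating the first entry changes every later link, so the
# recomputed summary no longer matches what was published.
backdated = list(entries)
backdated[0] = (entries[0][0], "2009-01-01T00:00:00Z")
backdated_ok = summary_matches(backdated, published)
```

No secret key appears anywhere in the sketch. Anyone holding the entries and a copy of the published value can repeat the check, which is the sense in which verification depends only on the hash function.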
The Path Forward
After examining some of the R&D protections available, the question becomes: What is the best way to proceed? Stated simply, most biotechnology and biopharmaceutical organizations would do well to choose solutions that accommodate the core components of legal defensibility: longevity of protection, independent verifiability, portability, and standards-based compliance. It may even be worth asking whether your solutions provider is willing to guarantee the legal defensibility of its protection schemes.
Certainly, there’s little margin for error. As organizations hurry to reduce product development timelines, to automate processes, and to integrate transformative new technologies, security of critical research data isn’t always their first consideration. But remember this: High-tech informatics systems are susceptible to high-tech threat scenarios. At any moment, your organization may be called upon to defend the integrity of the electronic records in its possession, including those containing the lifeblood of your organization.
If and when that happens, anything less than certainty would be a catastrophe.
Author Details
Bob Flinton is vice president of marketing and product management at Surety, LLC, 12020 Sunrise Valley Drive, Suite 250, Reston, VA 20191; 1-571-748-5800; bflinton@surety.com; www.surety.com.