Big Data and IP Business Strategy

By Guest Author for TradeSecretsLaw.com on March 19, 2014

Posted in Computer Fraud, Cybersecurity, Data Theft, Privacy, Social Media

As a special feature of our blog (special guest postings by experts, clients, and other professionals), please enjoy this blog entry about Big Data and IP business strategy by technology lawyer and IP strategist Joren De Wachter. Joren serves as a Co-Chair with me on the ITechLaw Intellectual Property Law Committee and has an excellent blog of his own on current technology issues. Enjoy Joren’s article and, for more on Big Data, please see our webinar on the Big Data Revolution.

-Robert Milligan, Editor of Trading Secrets

By Joren De Wachter

Big Data is an important technological change happening around us.

How should businesses react? What is the right business strategy? And, as part of such business strategy, what is the right Intellectual Property Strategy?

It can be the difference between success and failure.

1. What is Big Data?

“Big Data” is the revolution happening around us in the creation, collection, communication and use of digital data.

In the last couple of years, humanity’s capacity to create, communicate and process data has increased manifold.

According to IBM, we produced 2.5 exabytes (that’s 2,500,000,000,000,000,000 or 2.5×10^18 bytes) of data every day in 2012.

But the total amount of data, already beyond easy intuitive grasp, is not the key characteristic of Big Data. Its key characteristic is the continued exponential growth of those data.

While 90% of all data in existence today was created in the last two years (which means that, in less than 18 months, we create more data than has been created since the beginning of humanity, roughly 150,000 years ago), the most important point is this: that 90% created in the last two years, which we consider to be an enormous amount of data today, will be dwarfed into complete insignificance in a couple of years’ time.
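The arithmetic behind the 90% figure can be checked with a short sketch. It assumes that the total stock of data grows smoothly with a constant doubling time, which is a simplification made for illustration and not something the IBM figure states; the function names are hypothetical.

```python
import math


def share_created_in_window(window_years: float, doubling_years: float) -> float:
    """Fraction of all existing data created in the last `window_years`,
    assuming the total stock doubles every `doubling_years`."""
    return 1.0 - 2.0 ** (-window_years / doubling_years)


def implied_doubling_time(window_years: float, share: float) -> float:
    """Doubling time (in years) implied by `share` of all data
    having been created within the last `window_years`."""
    return window_years / math.log2(1.0 / (1.0 - share))


# "90% of all data was created in the last two years"
doubling = implied_doubling_time(2.0, 0.90)
print(f"Implied doubling time: {doubling * 12:.1f} months")
print(f"Share created in the last 18 months: {share_created_in_window(1.5, doubling):.0%}")
```

Under this assumption the implied doubling time is roughly seven months. By definition, half of all data is created within the most recent doubling period, which is consistent with the statement that more than half of all data is created in less than 18 months.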

Humans are not very good at really understanding exponential mathematics, or at grasping its impact. This blogpost will give an insight into how technology businesses and their investors should plan and prepare for the Big Data Tsunami that is heading their way. And not just today, but for the foreseeable future. For there is no indication that this doubling of computational power (Moore’s law), the doubling of storage capacity or the doubling of communication capability – each occurring in 18 months or less – is about to slow down in the next couple of years.

This presents every business with a serious challenge: how will the emergence of Big Data affect the way the business uses its intangible assets? As we know, most assets these days are intangibles, also known as Intellectual Capital or Intellectual Property.

Arguably, Intellectual Capital is at the heart of any innovative business. Using it better will be the recipe for future business success. And failing to use it better will be a recipe for failure.

2. Framing the discussion around Big Data and Intellectual Capital Strategy.

a) What do we mean by “data”, and what is the difference with “information” and “intelligence”?

Data is a very wide concept. Everything created digitally is covered, from every document on your desktop to any picture posted by any user of social media. But it’s much more than that. It also means that, e.g., all the 150 million sensors of the Large Hadron Collider in Geneva, delivering data 40 million times per second, are included in the concept of data. If all of those data were recorded, they would exceed 500 exabytes per day – 200 times more than the world’s creation of data per day according to IBM as referred to above. However, those data are not actually produced, recorded or processed – before that happens, massive filtering takes place. In reality, the LHC produces a mere 25 petabytes of data per year (a petabyte is 1×10^15 bytes, so 1/1000th of an exabyte).
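The orders of magnitude above are easy to lose track of, so here is a minimal sketch that checks them. The figures are the ones quoted in this post; the variable names and the 365-day year are assumptions made for illustration.

```python
PETABYTE = 10**15  # bytes
EXABYTE = 10**18   # bytes

world_per_day = 2.5 * EXABYTE            # IBM figure for 2012, as quoted above
lhc_unfiltered_per_day = 500 * EXABYTE   # if every sensor reading were recorded
lhc_recorded_per_year = 25 * PETABYTE    # what is actually kept after filtering

# How many times the world's daily data production the unfiltered LHC stream would be.
ratio_to_world = lhc_unfiltered_per_day / world_per_day

# Fraction of the unfiltered stream that survives filtering (assuming a 365-day year).
kept_fraction = lhc_recorded_per_year / (lhc_unfiltered_per_day * 365)

print(f"Unfiltered LHC stream vs. world production: {ratio_to_world:.0f}x")
print(f"Roughly 1 byte kept for every {1 / kept_fraction:,.0f} bytes sensed")
```

On these figures, only about one byte in several million survives the filtering, which gives a sense of how much potential data is currently discarded before it is ever recorded.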

The implication is that there are enormous potential amounts of data that will be created and processed, once our computing and communication capability allow for it.

But “data” means more than that. It also includes everything created by any kind of sensor, but also by any camera, the input of any user, any person operating a computing device (PC, mobile, tablet, etc). Any project, any plan, any invention, any communication is also included.

And all of those data increase exponentially, doubling roughly every year or so.

What is the relation between “data” and “information”? In essence, they are the same. Every bit of data has information. The nature of that information and its potential value are determined by analyzing it. This is where we start talking about the meaning of data, and the intelligence that can be extracted from them.

However, while many definitions and approaches are possible in respect of how information becomes useful, and about the value of analysis and intelligence, a simple observation will be sufficient for the purpose of this blogpost.

That observation is that any subset of data, such as useful data, or intelligent data, or structured data, will grow in a similar, exponential fashion.

The implication of this observation is that it is not only “data” that grows exponentially, but also, by necessary implication, “knowledge” or “useful data”. So, we need to assess Intellectual Property strategies in a world where the amount of knowledge grows exponentially.

b) The importance of algorithms

The analysis of data, or indeed pretty much any meaningful way of using data, is done through using algorithms.

An algorithm describes a process for calculation, processing or reasoning, and allows us to extract meaning and understanding from data. This, in turn, increases the value of the data – algorithms make the data speak to us.

This is where data turn into intelligence; it is through applying algorithms that data start to make sense, and can give us additional information.

One of the great potentials of Big Data is the capability to recombine data from different sources, and compare and analyze them. This allows finding new correlations – something that will help us understand how society works, and how one phenomenon works on another.

For example, analysis of the raw data on drug prescriptions in the National Health Service in England and Wales makes it possible to find correlations with certain hospital visits for conditions that are indirectly caused by certain drugs, and which were not noticed in the clinical trials (or which the drug companies have kept from publication).

The potential value of Big Data, and the use thereof, is enormous. McKinsey, the consultancy, estimates that Open Data would add between $3tn and $5tn to the world economy – that’s an economy with a size somewhere between Germany and Japan.

And algorithms are the key to unlocking this value of Big Data; hence they will be key in any IP Strategy.

c) What is Intellectual Property Strategy?

Intellectual Property Strategy means that a business understands what its Intellectual Capital consists of, and uses it in the most optimal way to support the business strategy. It looks at a lot more than just patents or copyrights, the technical aspects of IP rights, and considers the whole range of Intellectual Capital.

Key aspects of IP Strategy consist of recognizing and understanding Intellectual Capital, assessing how it is used to support the business model, and keeping the right balance between an open approach and using protection techniques (such as patents or copyright), taking into consideration the impact of Open Innovation and Open Source.

In essence, it is the tool to bring innovation to market, and to scale innovation to new markets.

3. Impact of Big Data on IP Strategy.

This blogpost will look at how Big Data will impact IP Strategy from five angles: 1) patent strategy, 2) ownership of data, 3) copyright, 4) secrecy/know-how and 5) IP value and strategy of algorithms. They are all essential parts of an IP Strategy.

a) Patent strategy.

Patent strategies can be quite different from one industry to another.

However, there are some common elements to consider, and they can be summed up in two observations. The observations are that both obtaining and using patents will become much harder. As a consequence, the business value of patents is likely to drop significantly.

Obtaining patents will become much harder

The observation is straightforward, but very important: if the amount of available information doubles every 18 months, the amount of prior art also doubles every 18 months.

Patents are exclusive rights, granted on novel and non-obvious technical inventions. The granting of patents is based on the assumption that the patent offices will know existing technology at the time of the patent application, and refuse the application if the technology described is not “novel”.

However, if the amount of existing information grows exponentially, this means that in principle, the rejection rate of patents must also grow exponentially, to the point where it will reach 100%.

The reason is simple: the number of patents does not double very fast – in the last 50 years, it has doubled only twice in the US. The exponential growth of prior art (remember – this is a phenomenon humans intuitively struggle to understand) means that the amount of information that would disallow the granting of a patent – and that is any patent – also grows exponentially.

So, unless the granting of patents also grows exponentially, the area of technology that is patentable will shrink accordingly. Since patents are granted by human operators (patent examiners), and the number of patent examiners cannot grow exponentially (within a couple of years, most of the workforce would have to consist of patent examiners, which would be absurd), the number of patents will fall behind. On an exponential basis.
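To make the gap between the two growth rates concrete, here is a minimal sketch. It assumes smooth exponential growth, reads "doubled only twice in 50 years" as one doubling every 25 years, and uses hypothetical names throughout; it illustrates the argument and is not a forecast.

```python
def growth_factor(years: float, doubling_years: float) -> float:
    """Multiplier after `years` for a quantity that doubles every `doubling_years`."""
    return 2.0 ** (years / doubling_years)


PRIOR_ART_DOUBLING = 1.5   # years: "doubles every 18 months"
PATENT_DOUBLING = 25.0     # years: assumed from "doubled only twice in 50 years"

for years in (5, 10, 20):
    prior_art = growth_factor(years, PRIOR_ART_DOUBLING)
    patents = growth_factor(years, PATENT_DOUBLING)
    print(
        f"after {years:>2} years: prior art x{prior_art:,.0f}, "
        f"patents x{patents:.2f}, gap x{prior_art / patents:,.0f}"
    )
```

Under these assumptions, prior art grows roughly a hundredfold in ten years while the number of patents grows by about a third, so the share of the technology landscape covered by patents shrinks by about two orders of magnitude per decade.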

More importantly, if the patent offices did their job properly, and only granted patents on technology that is actually new, the rejection rate would soar, and would reach very high levels (up to 100%) within a relatively short time span.

This means that the risk of having a patent rejected two or three years into the application process will rise significantly.

However, this phenomenon is not very visible yet. One of the key reasons why the impact of the Big Data explosion of accessible information is not yet visible in the way patents are granted is that the patent offices don’t actually look at prior art in a way that takes into account the exponential growth of non-patented technology information.

Most patent examination processes review existing patent databases, and will establish novelty against existing patents rather than the actual state of technology. This made sense in a world where the speed of information creation was not an issue, or ran generally parallel to the rate of technology patenting. However, as non-patented technology (and, more particularly, information thereon) doubles every 18 months, the relevance of patent databases to establish whether something is new, takes a significant nosedive.

It is not clear to me whether patent offices realize the exponentially growing insignificance of their traditional data-approach. Once they do, though, they only have two options. The first is to reject most, if not all, patent applications. The second option is to ignore reality, and grant patents on non-novel inventions. However, this will (and arguably, already does) create huge problems in enforcing or using patents, as explained further below.

Either way, from a purely theoretical level, a novelty-based patent system is unsustainable in an environment of exponentially growing prior art or publicly available information.

From an IP Strategy point of view, this means that businesses will have to become much more selective and knowledgeable in their decision process on what to patent, and how to patent it.

This will affect both the scope of patents (which, in order to remain effective, is likely to become much more narrow), and the rate of success/failure of patent applications, both of which will have a significant impact on the return on investment in patent exclusive rights being sought and used by a business and its investors.

Use of patents

A similar problem affects the potential use of patents as part of an IP Strategy. There are a number of ways in which patents can be used, but the core function of a patent is to act as an exclusive right – a monopoly – on the production or distribution of a product or service.

This means that “using” a patent effectively means suing a competitor to block their access to the market, or charging them a license fee for allowing them to sell.

However, depending on the specifics of the legal system involved, when a patent holder wishes to enforce a patent, the defendant often can invoke that the patent should not have been granted, because there was prior art at the time the patent was granted.

And, while patent offices do not seem to have a clear incentive to take into account actual reality, including the exponentially growing information created by Big Data, when reviewing the application, the situation is very different for a defendant in a patent lawsuit.

They will have every incentive to establish that the patent should never have been granted, because there was pre-existing prior art, and the information in the patent was not new at the time of application.

And one important consequence of Big Data will be that the information available to defendants in this respect, will also grow exponentially.

This means that, again, from a theoretical level, the probability of being able to defend against a patent claim on the basis of prior art, will grow significantly. Because of the lag of time between patent applications and their use in court (it takes several years for an application to be granted, and it may take more time before a court decides on it), the effect of the recent explosion of information as a result of Big Data is not very visible in the patent courts yet. But this is a ticking time-bomb, and, if and to the extent procedural rules do not interfere with the possibility of invoking prior art to invalidate a patent, there is a high likelihood we will see the rates of invalidation in courts increase steeply.

From an IP Strategy point of view, this means that an offensive IP Strategy, consisting of suing competitors or others based on your patent portfolio, becomes more risky. While the costs will continue to rise, the potential of a negative outcome will also increase significantly.

There is a second important issue around use of patents that needs to be addressed here as well. It relates to the algorithmic aspect of patents.

A patent is, of itself, an algorithm. It describes the process of a technical invention – how it works (at least, that’s what a patent is theoretically supposed to be doing).

It is therefore quite possible that a lot of algorithms around analysis of Big Data will become patented themselves. It could be argued that this will act as a counterweight against the declining value and potential of patents described above. However, I do not believe that the effect will be anything more than marginal.

The reasons for my opinion are the three challenges affecting the potential value of a patent on algorithms analyzing Big Data.

The first is that many of these algorithms are, in fact, not technical inventions. They are theoretical structures or methods, and could therefore easily fall into the area of non-patentable matter.

The second is that algorithmic patents are particularly vulnerable to the ability by others to “innovate” around them. It is quite unlikely that a data analysis algorithm would be unique, or even necessary from a technical point of view. Most data analysis algorithms are a particular way of doing similar things, such as search, clever search, and pattern recognition. There is, in actual fact, a commoditization process going on in respect of search and analytical algorithms.

As a general rule, in order to become patentable, such algorithms must be quite specific and focused. The more broadly they are described, the higher the likelihood of rejection because of the existence of prior art. However, this reduces their impact from the perspective of using them to block others’ access to the market. A slightly different algorithm, yielding sufficiently similar analytical intelligence, but outside the scope of the first patent, will often (in my experience almost always) be available. This is due to the generic nature of the different aspects of most data analytical algorithms – it’s basically always a combination of checking, calculating, filtering and compressing information (sometimes with visualization or tagging and creation of metadata added); but the potential ways in which these can be combined quickly become unlimited.

In practice, it means that a patent around data analysis can almost always be circumvented with relative ease.
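How quickly the design space opens up can be illustrated with a rough count. The sketch below treats an analytical algorithm as nothing more than an ordered sequence of generic steps, with repetition allowed; that is a deliberate simplification, and the block names and function are hypothetical.

```python
# Generic building blocks named in the text above (labels are illustrative).
BUILDING_BLOCKS = ["check", "calculate", "filter", "compress", "visualize", "tag"]


def count_pipelines(n_blocks: int, max_steps: int) -> int:
    """Number of ordered step sequences of length 1..max_steps,
    where each step is any of `n_blocks` blocks (repetition allowed)."""
    return sum(n_blocks ** length for length in range(1, max_steps + 1))


for max_steps in (3, 5, 10):
    total = count_pipelines(len(BUILDING_BLOCKS), max_steps)
    print(f"up to {max_steps:>2} steps: {total:,} distinct pipelines")
```

Even this crude model, which ignores parameters and data sources entirely, yields tens of millions of distinct pipelines at ten steps. A patent claim that pins down one of them leaves a very large number of near neighbors outside its scope.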

But the third challenge is the most important one.

Patents are “frozen” algorithms. The elements of the algorithm described in a patent are fixed. In order to have a new version of the algorithm also protected, the patent will either have to be written very vaguely (which seriously increases the risk of rejection or invalidity) or will have to be followed up by a new patent every time the algorithm is adapted.

And the key observation around Big Data algorithms is that, in order to have continued business value, they must be adapted continuously. This is because the data, their volume, sources and behavior, change continuously.

Compare it to the core search algorithms of Google. These algorithms are continuously modified and updated. Indeed, in order to stay relevant, Google must continuously change its search algorithms – if it didn’t do so, it would drop behind the competition very quickly, and become irrelevant in a very short time.

The consequence is that, even if a business manages to successfully patent Big Data analytical algorithms, and avoids the pitfalls described above, such patent will lose its value very quickly. The reason is simple: the actual algorithms used in the product or service will quickly evolve away from the ones described in the patent. Again, the only potential answer to this is writing very broad, vague claims – an approach that does not work very well at all.

In other words: the technology development cycles of algorithms applied to Big Data analytics and intelligence are much too short to be accommodated by patents as they exist today.

Therefore, the use of patents will decline significantly; their value for business needs to be continuously re-assessed to address this observation.

From an overall IP Strategy point of view, this means that businesses will have to become much more selective in applying for and using patents. Conversely, investors will have to re-assess their view on the value that patents add to a business.

b) Ownership of data

Data ownership is an interesting and developing area of law. In most countries, it is theoretically possible to “own” data under the law. The legal principle applied will differ, but is typically based on some kind of protection of the effort to create or gather the data, and will allow the owner to block or charge for access or use.

However, there are a number of challenges related to ownership of data.

These challenges are based on the fact that Big Data is typically described by three characteristics: Volume, Velocity and Variety.

Volume stands for the ever-growing amount of data, as explained above. Velocity stands for the speed required to gain access to and use data, and Variety stands for the fact that data sources and formats multiply and change constantly.

From an ownership perspective, these characteristics lead us to two ways in which the traditional concept of data ownership is challenged by Big Data.

The first is the simple observation that data are a non-rivalrous commodity. That means that one person’s use of data does not necessarily prohibit or reduce the value of use of those data by another person, or by another 10,000 persons.

From a technical perspective, re-use of data is the most common, and obvious, way of approaching data.

But the challenge is neither technical nor legal; it is based on business models and interests.

And those business models and interests point us to two very relevant facts: a) most data are generated by someone else, and b) the value of data increases by their use, not the restriction on their use.

The first fact is obvious, but its relevance is underestimated. Most of the business value in Big Data lies in combining data from different sources. Moreover, the actual source of data is often unknown, or derives from different levels of communication. Data from customers will be combined with data from suppliers. Data from government agencies will be combined with data from machines. Internal data need to be compared with external data. Etc. Etc.

Therefore, there is a clear push towards opening up and combining data flows – this is the most efficient and best way to create business value.

And while it is true that from a legal and risk management perspective, many people will indicate the risks related to opening up data flows (and those risks are real), it is my perspective that those risks, and the costs related to tackling them, will drown in the flood of business value creation generated by opening up and combining data flows.

Add in the observation that many governments are currently considering how much of their data will become open. The likely trend is for much, much more public data to be made available either for free, or for a nominal access fee. This, in turn, will increase the potential of re-usability and recombination of these data, pushing in turn businesses to open up, at least partly, their own data flows.

And this leads to the second fact. The value of data is in its flow, not its sources.

Big Data can be compared to new river systems springing up everywhere. And the value of a river is in having access to the flow, not control over the sources. Of course, the sources have some relevance, and control over specific forms or aspects of data can be valuable for certain applications.

But the general rule is, or will be, that gaining and providing access to data will be much more valuable than preventing access to data.

As a result, the question of “ownership” of data is probably not the right question to ask. It does not matter so much who “owns” the data, but who can use them, and for what purpose.

And, as the number of sources and the amount of data grows, it is the potential of recombining those aspects, that will lead to exponential growth in how we use and approach data.
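The growth in recombination potential can be quantified with simple counting. The sketch below counts how many ways a given number of data sources can be combined, first in pairs and then in any group of two or more; the function names are hypothetical, and it assumes every combination is equally feasible, which will not hold in practice.

```python
import math


def pairwise_combinations(n_sources: int) -> int:
    """Number of distinct pairs of sources that could be compared or joined."""
    return math.comb(n_sources, 2)


def multi_source_combinations(n_sources: int) -> int:
    """Number of distinct groups of two or more sources
    (all subsets, minus the empty set and the single-source sets)."""
    return 2 ** n_sources - n_sources - 1


for n in (5, 10, 20, 40):
    print(
        f"{n:>2} sources: {pairwise_combinations(n):>4} pairs, "
        f"{multi_source_combinations(n):,} possible groupings"
    )
```

The number of pairs grows with the square of the number of sources, while the number of possible groupings roughly doubles with each source added. That is the sense in which adding sources leads to exponential growth in what can be done with the data.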

The river analogy comes in handy again: as the number of sources and data grows, the number of river systems also grows – and they will be virtually adjacent to each other. If you can’t use one, you jump to another one; the variety on offer will make control or ownership in practice virtually impossible to operate.

The conclusion on ownership is again best illustrated by our river analogy: we should not focus on who owns the land that is alongside the river; we should focus on being able to use the flow, and extract value from that.

From a practical perspective, it means that “ownership” of data should be looked at from a different angle: businesses should not focus on acquiring ownership of data, but on expanding different ways of using data, regardless of their source.

c) Copyright

Copyright is a remarkably inept system for the Information Society. Its nominal goal is to reward authors and other creators. In real life, it mainly benefits content distributors.

Originally, copyright was typically granted for the expression of creative activity: writing a book or a blog, creating or playing music, making a film, etc.

However, copyright also applies to software code, based on the observation that code is like language, and therefore subject to copyright. As such, copyright covers the code, but not the software functionality expressed through the code.

But does copyright apply to Big Data? And if so, does it have business value?

Data is information. Copyright does not apply to the semantic content or meaning of text written by human authors. In other words, it is not the message that is covered, but the way the message is formulated. If only one formulation is possible, then there is no copyright protection, because there is no creative choice possible. That is, very abridged, what copyright theory states.

Logically, this means that most data will fall outside of copyright. Any data generated by machines or sensors will not be covered by copyright. Any statistical or mathematical data is, as such, not covered by copyright.

That means that a very large subset of Big Data will not be covered by copyright. This legal observation will not stop many businesses from claiming copyright. Claiming copyright is easy: there is no registration system, and there is no sanction attached to wrongfully claiming copyright or claiming copyright on something that cannot be covered by copyright (e.g. machine-generated data).

Another large subset of Big Data is, in theory, covered by copyright, but in practice, the copyright approach does not work. This subset relates to all user generated data. Any picture, video or other creation posted by any social media user online is, in theory, covered by copyright. But that copyright is never actually used.

Users will not be allowed to claim copyright protection against the social media platform (the terms of use will always include extremely liberal licenses, allowing the social media platform to do pretty much what they want with the content).

More importantly, the value of all that user generated content lies in using it in ways that copyright is structurally incapable of handling. User generated content, in order to have value, must be freely available to copy and paste, tag, adapt, create derivatives of, and, fundamentally, share without limitation. It is the opposite of what copyright tries to achieve (a system of limited and controlled distribution and copying).

Again, the analysis points to the inappropriate nature of our Intellectual Property system.

Most business value in using Big Data will be in open breach of copyright, typically by ignoring it or, at best, paying some lip service to it (as e.g. Facebook or other large social media platforms do), or will be dealing with data that are not under copyright, but have not necessarily been recognized yet as such by the court system.

As a result, the copyright aspect of any IP strategy in Big Data will first and foremost have to make the analysis of a) whether copyright applies, and b) whether it adds any business value.

Since the applicability of copyright on machine or user generated content is partly in legal limbo, an appropriate solution for some businesses may be to use the creative commons approach. It helps to ensure that data are shared and re-used, hence increasing their value, and allows, from a practical perspective, to ignore the question whether or not copyright applies. If it applies, the creative commons license solves the problem. If it does not apply, and the data can be freely used, the end result will be, from a business perspective, similar.

A final point on database rights, a subset of copyright for a specific purpose, developed in the European Union.

While database rights may look like a system specifically designed for Big Data, reality shows otherwise.

Database rights don’t protect the actual data, they protect the way in which data are organized or represented.

In a typical Big Data situation, they would apply to the structured result of an algorithmic analysis of a dataset. Or they could apply to a relational database model, the way in which an application will sort data that is delivered to it, before specific functionality is applied to it.

While these have potentially quite a lot of value, the concept of protecting them through a copyright-related system suffers from the same weaknesses as copyright itself.

The value of such databases directly derives from a) access to the underlying data and b) the algorithmic process of selecting and manipulating the data, neither of which is covered by database rights. The end result of the exercise, as a snapshot, is covered by database rights. But the logic of importing, selecting and otherwise operating on the data is not.

In other words, it’s another Intellectual Property Right that does not focus on the business value of Big Data – which is why nobody really talks about it.

d) Secrecy/know-how

Secrecy and know-how protection can be a very valuable asset of businesses. The most classic example is of course the secret formula of Coca-Cola. It’s not actually protected by a formal Intellectual Property Right (anyone is free to copy e.g. cooking recipes), but it has significant business value, and it is protected by other legal instruments. Typically, contract law, with confidentiality agreements, will play a big role in protecting business secrets and know-how, and most legal systems allow businesses to bring legal claims against competitors, business partners or employees who disclose or use secret information in unauthorized ways.

By and large, this approach is used by many businesses. Often, the strategy around protecting Intellectual Capital will consist of understanding what the business secrets are, and building appropriate procedures of protection or disclosure.

Yet, a key consideration for this part of any IP and business strategy is the word “secret”.

Secrecy has a major downside: it means you can’t talk about, use or disclose whatever is secret in a way that allows others to find out about it. The challenge here is that a lot of the value of Big Data depends, as we have seen repeatedly, on the ability to have access, and preferably open or free access, to as much data as possible.

This means that there is a natural market-driven pressure on businesses, in a Big Data environment, to prefer the use of data (and therefore an open approach to it) rather than to limit or restrict use. While it is true that access to certain data can be very valuable, this approach is typically based on the assumption that one knows the data available, and understands at least the most important value considerations in respect of these data.

This is where Big Data presents an important shift: not only does it become much harder to know who owns or generates which data, or what is in those data, it also becomes much riskier not to grant relatively free access to data. This is because there is a lot of relevant, but not necessarily obviously visible, value in the data. A lot of value in Big Data comes from recombining data from different sources, or approaching data in a different way (e.g. compressing data in a visual or topographic way in order to discover new patterns).

As a result, businesses that open up their data are more likely to retrieve value from those data, and those that do, will retrieve more value from the data that is most open and accessible.

These developments will change habits within businesses, which will be pushed by market forces and the need to be more efficient to open up more and more data sets and data sources. Inevitably, this will clash with strategies to keep information secret.

While it is, in theory, uncertain which way this conflict will play out, we need to be reminded again of the exponential growth of Big Data. The logical consequence of this exponential growth is that the pressure to open up is likely to be much stronger, and yield more direct benefit, than the longer term strategy to keep things secret because one day that may yield an additional benefit.

In other words, it will become much harder for businesses to keep things secret, and there will be growing pressure to open up data streams.

From an IP Strategy point of view, this means that understanding and selecting those intangible assets that have more value as a secret than as an open, accessible intangible asset will become more difficult, but, arguably, also more important. On the other hand, businesses that reject the knee-jerk reaction to keep as much as possible hidden or secret may find that they evolve faster and generate more new business opportunities. It is not a coincidence that Open Innovation has become such a tremendous success. Big Data is likely to reinforce that evolution.

e) Intellectual Property and value of algorithms.

A point that has been touched upon repeatedly is the value of algorithms in a world of Big Data.

Algorithms are the essential tools enabling businesses to make sense of, and create value out of, Big Data.

Yet algorithms are not, as such, protectable under formal Intellectual Property Rights.

Is this a problem for an IP Strategy? Not necessarily.

After all, an IP Strategy is not just about protecting or restricting access to Intellectual Capital, it is also about positive use of that Intellectual Capital to serve the strategic and operational needs of the business concerned.

The financial services industry has used complex algorithms for many years, particularly in the mathematical structures known as “quants” – the formulae used to operate in and track the highly complex mathematical environments of derivatives, online trading or futures markets (not to mention the toxic products that were one of the causes of the crash of 2007-2008).

Yet, many of these systems do not enjoy any formal Intellectual Property Protection. No patents are used, no copyright applies. Secrecy does apply, of course, but the market data themselves are very open and visible; indeed, most of the algorithms depend on liquidity, if not of financial assets, then certainly of financial data.

The pattern is similar for algorithms around Big Data. While some secrecy can be tremendously important for specific parts of algorithmic use of Big Data, the liquidity and open nature of the data themselves will often be at least as, if not more, important.

To that must be added the need for continuous change and adaptation of such algorithms: in order to have business value, algorithms need to be “alive”. And in order to be alive, they need to be fed those huge amounts of data for which they have been created.

The analogy with bio- or ecosystems is not a coincidence. Just like biosystems thrive on resources that are freely available as a result of ecological circumstances (the energy of the sun, the oxygen in the air, etc.), Big Data ecosystems are emerging and evolving, based on free(ish) availability of data and data streams.

As a result, an IP Strategy towards algorithms will have to take into account their almost biological behavior. Clever strategies will therefore allow for processes of evolution and selection to occur, and it is likely that those processes that allow free access to data will outperform, through the force of evolutionary pressure, those that do not allow such free access.

It will therefore be key for any IP Strategy to look at the core algorithms that are at the heart of any business dealing with or affected by Big Data. That will be almost everybody, by the way. And such an IP Strategy will have to weigh the benefits to be gained from an open approach against the risks suffered from closing down access to that new lifeblood of the Big Data Information Age: the flow of data itself.

4. Conclusion

A recurring theme throughout this article has been that the traditional view on data and their use is being challenged.

That traditional view is based on making data and information artificially scarce, and trying to charge for it. Intellectual Property Rights are the most obvious ways of making non-rivalrous commodities such as ideas, technology and data artificially scarce.

Yet, as an inescapable consequence of the exponential growth of Big Data, that approach is now at risk of causing businesses more damage than benefit.

Big Data is like a river system. The value of Big Data is not in its many sources, but in gaining access to the flow, and using it for the strategic purposes of your business.

A traditional IP Strategy, focusing on ownership, is in our analogy akin to focusing on claiming land a couple of miles from the river. It is looking in the wrong direction, and misses most of the value of Big Data. While some ownership of a bit of riverbank (the algorithms) may have value, our Big Data River is more complex than a simple estuary – it is like the Delta of the Nile – overflowing regularly, with riverbanks and plots of land suddenly disappearing or getting flooded. And a new Delta comes into existence every 18 months.

Therefore, as a conclusion, IP Strategies around Big Data should focus on the instruments to access and use the flow of data, rather than using outdated models of artificial scarcity that will be overtaken by the exponential growth of Big Data.
