Where are the touted Big Data revenues coming from?

Richard Heil­er Martínez

Has that label become nothing more than a catchy advertising slogan with which companies showcase their supposedly innovative power? How much do companies really earn or save with data? How do those earnings compare with the expenses for data protection (GDPR), information security and the inevitable IT infrastructure?

In her Duet interview with corporate finance specialist Richard Heiler Martínez, Dr Caldarola, author of the recent book Big Data and Law, discusses the bottom line of corporate earnings from Big Data.

The Statista statistics [1] show that Big Data revenue in Germany amounted to €6.4 billion in 2018. Does this revenue level justify the famous statement “data is the new fuel of the 21st century”?

Richard Heiler Martínez: I would not pay too much attention to the 2018 revenue figure of €6.4bn for Germany, nor even to the US$40.8bn that supposedly represents worldwide revenue from Big Data in 2018. Those figures primarily comprise proceeds from data sales or data usage rights collected by search engines (for real estate, jobs, etc.), big distributors and social media firms, which then sell the data they have collected to service providers and manufacturers.

What those numbers do not include is the Big Data value that the data collectors apply directly to their own sales or to streamlining their internal processes. The latter may yield important cost savings, but such figures do not show up in any statistic, nor even under a specific heading in a company’s profit and loss statement (P&L). Hardly any other explanation could account for the imbalance in your introductory chart between revenue (€6.4bn) and loss (€102bn). For this reason, the published revenue figure seems insignificant compared with what the true figure may be and what it may become in the very near future.

Another way of appraising Big Data growth is to look at the volume of data collected. In 2018, the total volume collected worldwide stood at 33 zettabytes (highest in Asia, followed by North America and then Western Europe). Many data analysts cautiously predict that this volume could rise to 175 zettabytes by the end of 2025.

It is not only corporations that collect Big Data; governments, too, gather every kind of data on their subjects. The expression “data, the new fuel of the 21st century” should perhaps not be understood only in terms of revenue figures or data volumes. Big Data is a generic term for a gradual but quick process which may completely change and revolutionize our way of life, our means of production, our communication and our political systems, and which will eventually affect every aspect of our private lives. Though we are still at the very beginning of this process, it will quickly become omnipresent. This outlook is also illustrated by many charts in which future revenue and collected data volumes point exclusively upwards, with annual worldwide growth of approximately US$10bn and 40 to 50 zettabytes respectively.

In the coming years, this growth will follow an exponential rather than a linear pattern. The reason is simple: it is sustained by the growing digitalization of our daily lives and habits. The two main catalysts fuelling the growth of Big Data collection and usage in the digital era are the “updateability” and the “connectivity” with which data is collected, stored, updated and moved around. Those two features, which did not exist in the former analogue era, will be responsible for the exponential growth rates of Big Data in the years to come.
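To make the difference between linear and exponential growth concrete, the following minimal sketch connects the two volumes quoted above (33 zettabytes in 2018, a predicted 175 zettabytes by the end of 2025) once with a constant yearly amount and once with a constant yearly percentage. The seven-year span and all variable and function names are assumptions made for illustration; they do not come from the analysts' forecasts.

```python
# Linear versus exponential growth between the two data volumes quoted above.
START_YEAR, END_YEAR = 2018, 2025    # assumption: seven full yearly steps
START_ZB, END_ZB = 33.0, 175.0       # zettabytes, as quoted in the interview

years = END_YEAR - START_YEAR
linear_step = (END_ZB - START_ZB) / years             # constant ZB added per year
growth_factor = (END_ZB / START_ZB) ** (1 / years)    # constant multiplier per year


def linear_volume(n: int) -> float:
    """Volume after n years if the same absolute amount is added each year."""
    return START_ZB + linear_step * n


def exponential_volume(n: int) -> float:
    """Volume after n years if the same percentage is added each year."""
    return START_ZB * growth_factor ** n


print(f"Implied annual growth rate: {growth_factor - 1:.1%}")
print("year  linear  exponential  exponential yearly increase")
for n in range(years + 1):
    # The yearly increase is undefined for the starting year, so report 0.0 there.
    increase = exponential_volume(n) - exponential_volume(n - 1) if n else 0.0
    print(f"{START_YEAR + n}  {linear_volume(n):6.1f}  "
          f"{exponential_volume(n):11.1f}  {increase:6.1f}")
```

Under these assumptions the implied growth rate is roughly 27% per year. The linear path adds the same amount (about 20 zettabytes) every year, whereas the yearly increase on the exponential path keeps growing, which is the pattern described above.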

Big Data will gradually seep into all our daily lives. It may take another 10 to 20 years for this omnipresence to become perceptible. Nevertheless, this is a self-accelerating and self-engendering process which will produce ever greater leaps in the systems of control over consumers and citizens and, finally, over all of us. Think of what it may become in 100 years. That may seem like a very long time span to the human mind yet, in evolutionary terms, it is just a split second. As soon as Big Data extends into the realm of biotechnology, its power over us can be seen as virtually absolute.

The entities, whether corporations or governments, that know how best to gather, store, analyse and use people’s Big Data will then be in a position to know people’s preferences, their dreams, their political convictions, their predilections and so on. They will, therefore, know which car, which music, which medical treatment or even which political party will best suit them. The dangers inherent in such a process are obvious: people will eventually hand over the “steering wheel” of their lives to a small group within society that controls the gathering, administering and use of the Big Data collected on them. That amounts to handing over power, influence and money to a small part of society.

A fitting quote on Big Data comes from one of my favourite writers, Yuval Noah Harari, in 21 Lessons for the 21st Century:


“Big Data algo­rithms might cre­ate dig­i­tal dic­ta­tor­ships in which all pow­er is con­cen­trat­ed in the hands of a tiny elite while most peo­ple suf­fer not from exploita­tion, but from some­thing far worse – irrelevance.”

Yuval Noah Harari

For this reason, it is fair to state that within the next few years Big Data may well deserve its description as the “fuel of the 21st century”, not only because of the revenues, but because of its all-encompassing control over us. At the moment, however, it may still be somewhat difficult to identify which portion of an enterprise’s total revenue can be ascribed to its use of Big Data.

Aldous Huxley’s Brave New World of the 1930s may eventually come true. Now is the time for society to reflect on what it wants to do with the new technological possibilities and on where the legal, and perhaps also financial and competitive, barriers should be set to protect the freedom of individuals and to avoid dangerous developments within our markets and governments. This discussion cannot be avoided, nor simply placed in the hands of a small group of business lobbyists, nor handed over to some governmental agencies, and it most certainly cannot be left to chance. This must be an open discussion involving all elements of society.

Some sort of compromise between what is possible and what should be allowed must be found. On one side of this balancing act is the aforementioned individual freedom, which requires a certain level of protection; on the other is the willingness of our companies to develop the skills needed for collecting and using Big Data within a reasonable legal framework. If we put too much emphasis on protecting individual freedom, the skills related to Big Data might end up in countries and foreign corporations with laxer regulations, eventually pushing our own corporations out of this market.

As always, the dif­fi­cul­ty lies in find­ing the gold­en mean between the two extremes.

Which projects generate this revenue? Which companies produce these revenues? If Big Data is the route to lucrative earnings, how can governments plan their national budgets when data only partially appears as revenue in financial reporting?

Big Data will affect everything and everyone and will lead to a concentration of industrial and economic power. The big distributors, media companies and manufacturers are already gathering all types of measurable data. Every “click” on the internet and all forms of consumer behaviour are already being amassed and run through algorithms with one overarching aim: to win clients and bind them to a company, product or even political party.

Oracle, Microsoft, SAP, IBM, Amazon, Alibaba, Zalando, Google, Facebook, LinkedIn, TikTok, other search engines and the like are among the leading companies dealing in the collection of data. Most of them are headquartered in the US or China; of those named, only SAP and Zalando are headquartered in Germany.

These companies are relatively young businesses. It therefore seems that the potential for future revenue generation or, in other words, control over the “well” from which the fuel of the 21st century is pumped, is more likely to lie with those younger companies. Older and more traditional companies will have a hard time catching up with these new trends and skills. In fact, it is not unlikely that many of these older businesses will disappear during this era of Modern Times; they may regret that they are not merely watching Charlie Chaplin’s famous and ever-popular film of that name but are instead fully exposed to the inexorable and unforgiving laws of the market.

It is thus fair to assume that US and Chinese companies will play a leading role in the Big Data market. Europe, unfortunately, is likely to be relegated to the second tier. This admission is a painful one to make, especially if one feels European with every last fibre of one’s being.

Europe’s only answer to this over­whelm­ing­ly Amer­i­can and Chi­nese Big Data dom­i­na­tion may well be to mere­ly levy a dig­i­tal tax which will be reg­u­lat­ed by the EU Com­mis­sion to help pay off some of the unsus­tain­able debt accu­mu­lat­ed by cer­tain governments.

According to a study from Bitkom [2], the total loss due to IT sabotage, espionage, theft and similar incidents amounted to €102.9bn in Germany alone. According to that same study, the total amount spent on IT security in Germany was €4.57bn in 2019. Article 83 of the GDPR states that the fine for a breach of data protection law may cost a company up to 4% of its worldwide annual turnover. Applying this maximum fine level to German GDP [3], purely as a thought experiment, gives a theoretical maximum exposure of €137.43bn (4% of Germany’s 2019 GDP of €3,435.74bn), which correlates quite nicely with the actual loss of €102bn [4] mentioned above. On the basis of those figures, do you think that the IT security effort undertaken in Germany has been sufficient, or do you instead think that additional resources should be made available to protect our IT systems in Germany?

Investments in data protection and IT security have always been important and will become even more so, especially since the General Data Protection Regulation (GDPR) came into force and brought higher fines for data protection violations with it. Corporations and government agencies alike are increasingly exposed to hackers trying to illegally acquire their sensitive data.

My impression is that we invest far too little in data security compared with what we should be doing. The numbers for Germany alone, where total IT security investment amounted to roughly €4.6bn in 2019 while the losses caused by IT security breaches exceeded €100bn, show that there is still much room for improvement.
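The proportions behind this comparison can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the question above (losses of €102.9bn, IT security spending of €4.57bn, a 2019 GDP of €3,435.74bn and the 4% ceiling from Article 83 GDPR); the variable names are illustrative assumptions, and applying the fine ceiling to GDP is the interviewer's thought experiment rather than a legal calculation.

```python
# Figures quoted in the interview, in billions of euros.
GDP_2019 = 3435.74
LOSS = 102.9
SECURITY_SPEND = 4.57
MAX_FINE_RATE = 0.04   # ceiling stated in GDPR Article 83 (share of turnover)

# Thought experiment from the question: apply the fine ceiling to GDP.
theoretical_exposure = MAX_FINE_RATE * GDP_2019

# How many euros of loss stand against each euro spent on IT security.
loss_per_euro_spent = LOSS / SECURITY_SPEND

# Reported loss expressed as a share of GDP.
loss_share_of_gdp = LOSS / GDP_2019

print(f"Theoretical maximum exposure: EUR {theoretical_exposure:.1f}bn")
print(f"Loss per euro of security spending: {loss_per_euro_spent:.1f}")
print(f"Loss as share of GDP: {loss_share_of_gdp:.1%}")
```

On these numbers, reported losses are more than twenty times the amount spent on IT security and come to about 3% of GDP.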

With the aid of a statistical method, namely the cumulative distribution function of a Gaussian (normal) distribution, I have tried to illustrate that the current European investment in IT security is far too low.

On the basis of the Ger­man aggre­gate num­bers for 2019, the result – in no way sur­pris­ing – shows that the total invest­ment in IT secu­ri­ty should have been at least three times high­er than the amount actu­al­ly spent in order to obtain a rea­son­able, yet still imper­fect, secu­ri­ty environment.

This is a very rough and purely statistical estimate that does not consider any qualitative elements of how the funds are actually spent. Ideally, such a spending analysis would be a bottom-up analysis of qualitative elements, such as training IT specialists and identifying the best ways to create smart IT devices and software programs to protect the IT infrastructure of the corporations and countries in question. The sum of all those protective measures would give us a valid number.
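The interview does not spell out the inputs or the exact form of this statistical model, so it cannot be reproduced here. Purely to illustrate what a cumulative normal distribution function does in such a setting, the sketch below treats the "required" security spend as a normally distributed quantity and asks how likely a given budget is to cover it. The mean, the standard deviation and all names are hypothetical assumptions chosen for illustration, and the printed values say nothing about the real German figures.

```python
from statistics import NormalDist

# HYPOTHETICAL parameters: the interview does not state the model's inputs.
ASSUMED_MEAN_NEED_BN = 10.0   # hypothetical mean of the required spend, EUR bn
ASSUMED_STD_DEV_BN = 4.0      # hypothetical uncertainty around that mean, EUR bn
ACTUAL_SPEND_BN = 4.57        # 2019 German IT security spend quoted in the interview

required_spend = NormalDist(mu=ASSUMED_MEAN_NEED_BN, sigma=ASSUMED_STD_DEV_BN)


def coverage_probability(spend_bn: float) -> float:
    """Cumulative probability that spend_bn is at least the required spend."""
    return required_spend.cdf(spend_bn)


# Compare the actual budget with hypothetical doubled and tripled budgets.
for multiple in (1, 2, 3):
    spend = ACTUAL_SPEND_BN * multiple
    print(f"{multiple}x spend = EUR {spend:5.2f}bn -> "
          f"coverage {coverage_probability(spend):.0%}")
```

The cumulative function rises from near 0 to near 1 as the budget passes the assumed mean, which is why, in a model of this kind, a budget well below the mean yields low coverage and a multiple of it can move the result substantially.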

Currently, with little data available to us, it is not easy to determine an appropriate level of protection. Most companies do not even record their IT security expenditure under a specific heading. This means they do not know how much they have actually spent on protecting one of their most valuable assets: their data. Senior corporate management should be made aware of the significance of IT security and should make certain that the necessary internal accounting and controlling procedures have been implemented, so that the company can identify what amounts have been spent on what type of IT protection and what gaps and shortcomings still need to be covered.

Besides, in the very near future, our infrastructure, including the supply of basic needs such as water and electricity and even our systems for traffic control, will gradually become dependent on “intelligent” devices aimed at matching supply to constantly fluctuating consumer demand. Imagine what would happen if those systems fell prey to hacker attacks. The attackers would be able to paralyze a whole nation. Thus, as such interconnected “intelligent” devices start supporting our most basic requirements, people will become increasingly aware of the need to establish appropriate IT shields to fend off the inherent dangers.

Thus, upgrad­ing our IT secu­ri­ty will become a con­stant require­ment. Unfor­tu­nate­ly, the pace with which cor­po­ra­tions and gov­ern­ment agen­cies upgrade their IT secu­ri­ty sys­tems gen­er­al­ly lags behind the speed with which smart hack­ers enhance their illic­it methods.

Have the sources of data already been fully exploited in order to dig up the data treasure? Can data already be described as a cash cow? How much investment in data protection, cybersecurity, IT infrastructure etc. needs to be made, and how much data needs to be managed, in order to become profitable? How long will it take to reach the lauded Big Data potential?

By no means are all potential data sources being fully exploited at the moment. There are innumerable different data sources and data combinations. Much of the data needed to gain a complete and clear understanding of what a corporation is looking for may not even have been collected so far, owing to legal and technical restrictions. Furthermore, the skills required to read, complement and understand the data may differ greatly from one company to another. The same data may offer a competitive advantage to one company while being of absolutely no use to another. Thus, we can affirm that Big Data is certainly not a cash cow yet.

However, the ability to gather and use client data will soon change from a “nice to have” into an absolute “must-have” as far as companies are concerned. Nobody will ask how much of total corporate revenue was derived from the use of Big Data, since a business’s entire capacity to generate revenue may soon depend upon its ability to gather, read and employ certain aspects of client data.

How important is Big Data for innovation in the Western world, and especially in Germany? Are disruptive innovations nowadays the central, if not the only, approach to raising GDP?

Yes, they are. Our devel­oped soci­eties are look­ing for a new type of dis­rup­tive inno­va­tion to keep on gen­er­at­ing growth for their respec­tive populations.

Digitalization may produce this disruptive effect. Paradoxically, digitalization may also put more people out of work than it brings into work. Factories, stores and even transportation may be able to operate with fewer staff, turning people into an irrelevant mass with a single use: that of being a mere consumer.

Or will the reverse sce­nario come to pass: Will dig­i­tal­iza­tion require more peo­ple (to be sure, they will be high­ly spe­cial­ized and trained) to cre­ate, run and improve all this exu­ber­ant pletho­ra of IT infra­struc­ture? We will cer­tain­ly see the result with­in the next few years.

Another disruptive growth effect on the economy may come from electrifying various means of transportation or powering them with green hydrogen, which in turn leads us away from the combustion engine. This is certainly another interesting topic to be discussed in a further interview.

Until now, we have instead witnessed the central banks generating money out of “thin air” in their attempt to keep the wheels of the economy turning and to finance the myopic and irresponsible dealings of many governments trapped in unsustainable debt. That debt may become a prohibitive impediment to the future of the next generations while also consuming the pensions of this generation in the process. If those disruptive innovations do not make their appearance sometime soon, we will be facing very difficult and challenging times indeed.

Richard, thank you for shar­ing your insights on Big Data and so amply con­tex­tu­al­iz­ing the now famous expres­sion “data is the fuel of the 21st cen­tu­ry” by expand­ing and clar­i­fy­ing what actu­al rev­enues entail.

Thank you, Cristi­na, and I look for­ward to read­ing your upcom­ing inter­views with rec­og­nized experts delv­ing much deep­er into this fas­ci­nat­ing topic.

[1] Dossier Big Data Statista, Release Statista 2020: https://www.statista.com/study/14634/big-data-statista-dossier/
[2] “Spionage, Sabotage und Datendiebstahl – Wirtschaftsschutz in der vernetzten Welt, Studienbericht 2020”: https://www.bitkom.org/sites/default/files/2020-02/200211_bitkom_studie_wirtschaftsschutz_2020_final.pdf
[3] Gross Domestic Product (GDP) in Germany from 1991 to 2019, Statista 2020: https://de.statista.com/statistik/daten/studie/1251/umfrage/entwicklung-des-bruttoinlandsprodukts-seit-dem-jahr-1991/
[4] Mehr als 100 Milliarden Euro Schaden durch Cyberkriminalität, 11 November 2019: https://industrie.de/it-sicherheit/100-milliarden-euro-schaden-cyberkriminalitaet/

About me and my guest

Dr Maria Cristina Caldarola

Dr Maria Cristina Caldarola, LL.M., MBA is the host of “Duet Interviews”, co-founder and CEO of CU³IC UG, a consultancy specialising in systematic approaches to innovation, such as algorithmic IP data analysis and cross-industry search for innovation solutions.

Cristina is a well-regarded legal expert in licensing, patents, trademarks, domains, software, data protection, cloud, big data, digital eco-systems and industry 4.0.

A TRIUM MBA, Cristina is also a frequent keynote speaker, a lecturer at St. Gallen, and the co-author of the recently published Big Data and Law now available in English, German and Mandarin editions.

Richard Heiler Martínez

Richard Heiler Martínez has been an international banker for some 30 years, specializing primarily in project, export and trade finance as well as in raising funding for corporate loans. After having been with one English and two Spanish banks in various capacities, he has recently decided to move to Switzerland to support a Swiss institution in its corporate finance handlings.
