Will blockchain rev­o­lu­tionise the way in which data is man­aged and stored in a Data Warehouse?

Dr Ioan­nis Revolidis 

In the lat­est of her Duet inter­views, Dr Cal­daro­la, edi­tor of Data Ware­house as well as author of Big Data and Law, and Dr Revo­lidis dis­cuss the poten­tial of blockchain in a data warehouse.

You have writ­ten an arti­cle about blockchain in our book Data Ware­house. Blockchain is known in con­nec­tion with Bit­coin.
Could you explain what a blockchain is in the con­text of a data warehouse?

Dr Revo­lidis: Cer­tain­ly, when most peo­ple hear the term “blockchain,” they imme­di­ate­ly asso­ciate it with Bit­coin and oth­er cryp­tocur­ren­cies. How­ev­er, the util­i­ty and poten­tial appli­ca­tions of blockchain tech­nol­o­gy extend beyond the realm of cryptocurrencies.

At its core, a blockchain is a dis­trib­uted and immutable ledger, where trans­ac­tions are record­ed in a series of blocks that are linked togeth­er and secured using cryp­to­graph­ic prin­ci­ples. Each block con­tains a cryp­to­graph­ic hash of the pre­vi­ous block, form­ing a con­tin­u­ous chain, which makes retroac­tive alter­ations prac­ti­cal­ly impos­si­ble with­out alter­ing sub­se­quent blocks.
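The hash-linking described above can be illustrated with a minimal Python sketch. It is not a real blockchain (there is no network, consensus or signing), and the names `block_hash`, `append_block` and `first_broken_link` are hypothetical helpers introduced only for this illustration. It shows only why a retroactive edit is detectable: the stored hash in the next block no longer matches.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents deterministically with SHA-256."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, data: str) -> None:
    """Append a block that stores the hash of its predecessor."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "data": data, "prev_hash": prev})


def first_broken_link(chain: list):
    """Return the index of the first block whose prev_hash no longer matches, or None."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None


# Hypothetical warehouse load events recorded as blocks.
chain = []
for record in ["load batch A", "load batch B", "load batch C"]:
    append_block(chain, record)

intact = first_broken_link(chain)        # no mismatch while the chain is untouched
chain[0]["data"] = "tampered batch A"    # retroactive edit of the first block
broken = first_broken_link(chain)        # the link from block 1 back to block 0 now fails
```

To hide the edit, every later block would have to be recomputed as well, which is what the network's consensus is meant to prevent in a real deployment.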

One can observe many poten­tial appli­ca­tions in the con­text of a data warehouse:

  1. Immutable Data Stor­age: One of the pri­ma­ry virtues of blockchain is its immutabil­i­ty. Once data is record­ed on a blockchain, it can­not be altered with­out the con­sen­sus of the net­work. This can intro­duce a lev­el of data integri­ty and authen­tic­i­ty to data ware­hous­ing that tra­di­tion­al data­bas­es may strug­gle to match. For instance, if an organ­i­sa­tion’s data ware­house were to employ blockchain prin­ci­ples, his­tor­i­cal data would be pre­served with a high lev­el of con­fi­dence in its accu­ra­cy over time.
  2. Decen­tral­i­sa­tion and Ver­i­fi­ca­tion: Unlike tra­di­tion­al data ware­hous­es that cen­tralise data stor­age, blockchains can oper­ate on a decen­tralised mod­el. Each par­tic­i­pant in a blockchain net­work has access to the entire data­base and its com­plete his­to­ry. This decen­tral­i­sa­tion can pro­vide a mech­a­nism for data val­i­da­tion and ver­i­fi­ca­tion in ware­hous­ing con­texts, thus ensur­ing that data has not been tam­pered with.
  3. Inte­gra­tion and Inter­op­er­abil­i­ty: Blockchain can act as a bridge between dif­fer­ent data silos and sys­tems, offer­ing a uni­fied and trans­par­ent view of data. This can be par­tic­u­lar­ly ben­e­fi­cial in sce­nar­ios where data from dif­fer­ent sources needs to be inte­grat­ed into a sin­gle ware­house, ensur­ing that the data is con­sis­tent and reli­able across the board.
  4. Enhanced Secu­ri­ty: The cryp­to­graph­ic nature of blockchain offers a high­er lev­el of secu­ri­ty for data. This can be of immense val­ue for sen­si­tive data stored in a data ware­house, ensur­ing that the data remains con­fi­den­tial and resis­tant to unau­tho­rised changes.
  5. Trace­abil­i­ty and Audit Trails: Every trans­ac­tion on a blockchain is record­ed with a time­stamp and linked to the pre­vi­ous trans­ac­tion. This pro­vides a trans­par­ent audit trail of data changes, which can be par­tic­u­lar­ly valu­able for indus­tries that require rig­or­ous data track­ing and compliance.

To sum­marise, while blockchain is wide­ly recog­nised for its role in the world of cryp­tocur­ren­cies, its foun­da­tion­al prin­ci­ples of immutabil­i­ty, trans­paren­cy, and decen­tral­i­sa­tion can offer trans­for­ma­tive poten­tial for data ware­hous­ing, while enhanc­ing data integri­ty, secu­ri­ty, and traceability.

My opin­ion is:

“The trust blockchain bestows is not merely coded; it’s a delicate dance between software and traditional legal frameworks, each guiding the other”.

Dr Ioan­nis Revolidis

A data ware­house can con­tain per­son­al and non-per­son­al data. When a blockchain is deployed in a data ware­house, who does the blockchain serve? The oper­a­tor for the “bet­ter” pro­cess­ing of “his/her” data? Or for the data sub­ject and device own­ers who make their data avail­able for “tam­per-proof” pur­pose-spe­cif­ic pro­cess­ing in the data warehouse?

Blockchains may provide multifaceted benefits that can serve multiple stakeholders simultaneously. In fact, especially when personal data is being processed, the adoption of a blockchain solution should necessarily serve the interests of both the operator of the data warehouse and the data subjects. After all, that’s exactly what the GDPR wants to happen (see Art. 1(1) and (3) of the GDPR).

In that sense, I would say that when a Blockchain solu­tion is incor­po­rat­ed with­in a data ware­house, both the oper­a­tor and the user/data sub­ject can derive advan­tages from its deploy­ment, albeit in dif­fer­ent ways.

As regards the oper­a­tor I would gen­er­al­ly assume that the fol­low­ing ben­e­fits could be extracted:

  1. Data Integrity and Trustworthiness: By leveraging blockchain’s immutability, operators can ensure that the stored data has not been tampered with, thus enhancing the reliability of the warehouse.
  2. Auditabil­i­ty: Every data trans­ac­tion is trans­par­ent­ly record­ed, which aids in com­pre­hen­sive audit­ing process­es. This trans­paren­cy can be ben­e­fi­cial for sec­tors that are reg­u­lat­ed and require strin­gent data documentation.
  3. Oper­a­tional Effi­cien­cy: Blockchain can facil­i­tate real-time data shar­ing and inte­gra­tion across dif­fer­ent sys­tems, poten­tial­ly improv­ing data ware­hous­ing operations.

As regards the user/data sub­ject I would envis­age the fol­low­ing advan­tages, which align well with cer­tain reg­u­la­to­ry goals, especially:

  1. Data Sov­er­eign­ty: With some blockchain imple­men­ta­tions, indi­vid­u­als can main­tain a sub­stan­tial lev­el of con­trol over their per­son­al data, deter­min­ing who can access it and for what pur­pose. This aligns well with prin­ci­ples of mod­ern data pro­tec­tion, where user con­sent and con­trol are paramount.
  2. Assur­ance of Data Integri­ty: Indi­vid­u­als can be more con­fi­dent that their data, once entered into a blockchain-backed data ware­house, remains unal­tered and genuine.
  3. Trans­paren­cy and Trace­abil­i­ty: If indi­vid­u­als are grant­ed access, they can view the his­to­ry of their data trans­ac­tions, offer­ing trans­paren­cy into how their data is being used and processed.

I would like to stress, nonetheless, that the aforementioned benefits only hold true if certain design choices are made when the system is being programmed. What I mean is that merely adopting a blockchain-based solution will not automatically guarantee the benefits I mentioned above. Incorporating a blockchain solution into a data warehouse is a decision that should not be taken lightly, since there are several parameters that need careful consideration.

For exam­ple, I would nev­er advise an organ­i­sa­tion that oper­ates a data ware­house to imple­ment a blockchain solu­tion based on a pub­lic blockchain. This would put a huge com­pli­ance bur­den on the shoul­ders of this organ­i­sa­tion and might even jeop­ar­dise the appli­ca­tion of basic data pro­tec­tion prin­ci­ples as regards the per­son­al infor­ma­tion of users/data sub­jects, espe­cial­ly if the stor­age of their per­son­al infor­ma­tion would be “on-chain”. The risks asso­ci­at­ed with such a choice might even negate many of the expect­ed benefits.

Anoth­er exam­ple would be the desired lev­el of data sov­er­eign­ty on the user/data sub­ject end. It would not be enough to set up a blockchain in order to achieve this kind of sov­er­eign­ty. Addi­tion­al soft­ware, most prob­a­bly in the form of smart con­tracts, must com­ple­ment a robust iden­ti­ty and data man­age­ment design, by virtue of which the oper­a­tor of the data ware­house will make the nec­es­sary arrange­ments that will fos­ter a sub­stan­tial degree of user empow­er­ment. In addi­tion, we must also take into account that the deploy­ment of a blockchain solu­tion would be a major cul­tur­al shift for users. While they can expect the ben­e­fits I men­tioned above, they must also take on the admin­is­tra­tion of sev­er­al key process­es relat­ed to infor­ma­tion and data. Not every­one might wish to do so.

With Bit­coin, the blockchain enables decen­tralised struc­tures, self-deter­mi­na­tion, bar­ri­er-free access, no iden­ti­fi­ca­tion, secu­ri­ty against manip­u­la­tion, per­son­al respon­si­bil­i­ty and free­dom. Who ben­e­fits from these advan­tages of a data ware­house with blockchain technology?

When inte­grat­ing blockchain tech­nol­o­gy with­in a data ware­house, many of the advan­tages inher­ent to blockchain ecosys­tems can indeed per­me­ate the entire organ­i­sa­tion, although the con­text and stake­hold­ers involved may dif­fer. Let’s break down some pos­si­ble outcomes. 

Let’s start with the fact that blockchains gen­er­al­ly pro­mote decen­tralised struc­tures. In this con­text, I think that both oper­a­tors and users of the data ware­house can ben­e­fit. For oper­a­tors, decen­tral­i­sa­tion can pro­vide an added lay­er of trust with­in their organ­i­sa­tion. In addi­tion, it might also enhance the robust­ness of their ecosys­tem against data fail­ures and cen­tral points of attack. For users, it can offer a sense of shared own­er­ship and assur­ance that no sin­gle enti­ty has over­ar­ch­ing con­trol, pro­vid­ed nonethe­less that code and gov­er­nance assur­ances are put in place so that gov­er­nance of the data sets is suf­fi­cient­ly decentralised.

When I con­sid­er the sec­ond advan­tage you men­tioned, name­ly self-deter­mi­na­tion, I would say that those ben­e­fit­ing would pri­mar­i­ly be indi­vid­ual data sub­jects or enti­ties con­tribut­ing data to the ware­house. They might have more say in how their data is used, espe­cial­ly if smart con­tracts or oth­er blockchain-based gov­er­nance mech­a­nisms are employed.

When we turn to barrier-free access, I think that this is a rather sensitive point. I believe that the public and open nature of Bitcoin aligns well with what this particular implementation strives to achieve, but I’m afraid that within the context of a data warehouse this degree of openness might be undesirable or even risky.

Manip­u­la­tion secu­ri­ty, on the oth­er hand, is a net gain for both oper­a­tors and data sub­jects. The immutabil­i­ty of blockchain records ensures that once data is record­ed, it is resis­tant to unau­tho­rised alter­ations. This builds trust in the data’s integri­ty for every­one involved.

Final­ly, you made a valid point by refer­ring to indi­vid­ual respon­si­bil­i­ty when think­ing about blockchain appli­ca­tions. I also allud­ed to the same in my answer to your pre­vi­ous ques­tion. While all stake­hold­ers can ben­e­fit from the imple­men­ta­tion of blockchain-based solu­tions, there is indeed a height­ened need for edu­ca­tion and aware­ness. Just as los­ing a pri­vate key in Bit­coin means los­ing access to funds, mis­man­age­ment in a blockchain-based data ware­house can lead to irre­versible con­se­quences. In blockchains, free­dom (in terms of data access, shar­ing and even mon­eti­sa­tion) comes with responsibility.

A data ware­house must be secured with­in the frame­work of infor­ma­tion secu­ri­ty and at the same time legal oblig­a­tions (data pro­tec­tion, for exam­ple) must be adhered to. This is usu­al­ly achieved by turn­ing to tech­ni­cal mea­sures for effi­cient and quick han­dling. Can blockchain tech­nol­o­gy help with this?

Indeed, blockchain tech­nol­o­gy, espe­cial­ly the use of smart con­tracts, can be piv­otal in ensur­ing both infor­ma­tion secu­ri­ty and com­pli­ance with legal obligations.

As regards infor­ma­tion secu­ri­ty, a num­ber of advan­tages come to mind. For exam­ple, smart con­tracts could help achieve encryp­tion and con­fi­den­tial­i­ty since they can be designed to incor­po­rate cryp­to­graph­ic encryp­tion meth­ods. This ensures that sen­si­tive data remains con­fi­den­tial and only acces­si­ble to par­ties hav­ing the cor­rect decryp­tion keys. By stor­ing encrypt­ed data and using smart con­tracts to man­age access, unau­tho­rised enti­ties can be effec­tive­ly barred from read­ing sen­si­tive infor­ma­tion. In addi­tion, smart con­tracts can auto­mate and enforce strin­gent access con­trols based on pre­de­fined con­di­tions, ensur­ing that data is only acces­si­ble to autho­rised par­ties and under the right con­di­tions. For addi­tion­al secu­ri­ty, sen­si­tive data can be tokenised using smart con­tracts. Instead of stor­ing the actu­al data, a token rep­re­sent­ing that data is stored. The real data can be stored off-chain in a secure envi­ron­ment, while the token on the blockchain pro­vides a secure and trans­par­ent way to trans­act with the data ware­house system.
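The tokenisation and access-control idea described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a smart contract: `off_chain_store` and `ledger` are plain in-memory stand-ins for a secure off-chain store and an on-chain record, the functions `tokenise` and `read` are hypothetical, and encryption of the off-chain data is left out for brevity.

```python
import hashlib
import secrets

off_chain_store = {}   # hypothetical secure store: token -> raw data
ledger = []            # hypothetical on-chain records: token, digest, allowed readers


def tokenise(raw: bytes, allowed: set) -> str:
    """Keep the raw data off-chain; record only a random token and a digest."""
    token = secrets.token_hex(16)
    off_chain_store[token] = raw
    ledger.append({
        "token": token,
        "digest": hashlib.sha256(raw).hexdigest(),
        "allowed": frozenset(allowed),
    })
    return token


def read(token: str, requester: str) -> bytes:
    """Release the data only to an authorised requester and only if it is unchanged."""
    entry = next(e for e in ledger if e["token"] == token)
    if requester not in entry["allowed"]:
        raise PermissionError(f"{requester} is not authorised for this token")
    raw = off_chain_store[token]
    if hashlib.sha256(raw).hexdigest() != entry["digest"]:
        raise ValueError("off-chain data does not match the recorded digest")
    return raw


# Hypothetical record and role name, for illustration only.
token = tokenise(b"customer 42, postcode 10115", {"analyst"})
data_for_analyst = read(token, "analyst")
```

A request from any party not listed in `allowed` raises `PermissionError`, and a silent change to the off-chain copy is caught by the digest comparison.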

In terms of com­pli­ance, smart con­tracts may guar­an­tee auto­mat­ed and con­sis­tent enforce­ment. Smart con­tracts are self-exe­cut­ing soft­ware where the terms of the agree­ment or con­di­tions are writ­ten into lines of code. Once con­di­tions are met, actions are auto­mat­i­cal­ly exe­cut­ed. This ensures that pre­de­fined com­pli­ance mea­sures are con­sis­tent­ly enforced with­out human inter­ven­tion, reduc­ing the poten­tial for errors or biases.

Nonethe­less, the deploy­ment of such data secu­ri­ty and com­pli­ance solu­tions does not come with­out challenges.

Even though smart con­tracts are immutable and can­not be tam­pered with once deployed, they are only as robust as their code. If there are bugs or vul­ner­a­bil­i­ties in the con­trac­t’s code, they could be exploit­ed, lead­ing to unin­tend­ed con­se­quences. More­over, once iden­ti­fied, these vul­ner­a­bil­i­ties can­not be rec­ti­fied with­out deploy­ing a new ver­sion of the con­tract, which might not always be straightforward.

The deter­min­is­tic nature of smart con­tracts is anoth­er chal­lenge, espe­cial­ly in terms of com­pli­ance. While the deter­min­is­tic nature of smart con­tracts ensures they per­form exact­ly as writ­ten, this rigid­i­ty can be a draw­back. Laws and reg­u­la­tions are nuanced and sub­ject to inter­pre­ta­tion, and smart con­tracts might strug­gle to cap­ture these nuances. This could lead to over­sim­pli­fi­ca­tions or dan­ger­ous mis­in­ter­pre­ta­tions of legal obligations.

More­over, for many com­pli­ance mea­sures, smart con­tracts may need to inter­act with exter­nal data sources or sys­tems (often referred to as “ora­cles”). Depen­dence on exter­nal data intro­duces poten­tial points of fail­ure or manipulation.

Incor­po­rat­ing smart con­tracts for data secu­ri­ty and com­pli­ance in a data ware­house or any data man­age­ment sys­tem requires a robust under­stand­ing of the under­ly­ing blockchain plat­form, the nature of the data, and the poten­tial threats. While they offer many advan­tages for automat­ing and bol­ster­ing data secu­ri­ty and com­pli­ance mea­sures, they must be imple­ment­ed thought­ful­ly and in con­junc­tion with oth­er secu­ri­ty and com­pli­ance best prac­tices to ensure a tru­ly secure and com­pli­ant environment.

A data ware­house – espe­cial­ly if it con­tains per­son­al data – must com­ply with the GDPR. Since per­son­al data is obtained from dif­fer­ent sources, and because dif­fer­ent data of the dif­fer­ent peo­ple hold­ing the rights to said data may be used for dif­fer­ent pur­pos­es and since the (for­eign) data of (for­eign) right hold­ers may be sub­ject to dif­fer­ent nation­al data pro­tec­tion laws, the data ware­house oper­a­tor has to deal with dif­fer­ent vari­ables. Is blockchain a tech­nol­o­gy which can help data ware­house oper­a­tors man­age and imple­ment pre­cise­ly these data pro­tec­tion variables?

While some com­men­ta­tors view blockchains as a com­pli­ance prob­lem and not a com­pli­ance solu­tion, I think that a more bal­anced approach would be fairer.

Blockchain tech­nol­o­gy can be imple­ment­ed to lever­age its dis­tinct fea­tures, includ­ing user pseu­do­nymi­ty, data prove­nance, access per­mis­sions, and adher­ence to var­i­ous nation­al data pro­tec­tion regulations.

We dis­cussed ear­li­er, for exam­ple, that blockchains oper­ate in a default state of user pseu­do­nymi­ty. This is a well-recog­nised data pro­tec­tion prin­ci­ple that has even made its way with­in the text of the GDPR itself. Art. 32(1)(a) of the GDPR explic­it­ly refers to pseu­do­nymi­sa­tion as a com­pli­ance mech­a­nism and since blockchains offer this by default, it would be unfair not to recog­nise their poten­tial for data pro­tec­tion compliance.

But there are oth­er blockchain qual­i­ties that can sup­port data pro­tec­tion principles.

Blockchain’s inher­ent trans­paren­cy and immutabil­i­ty can con­tribute to the trace­abil­i­ty and val­i­da­tion of data ori­gin. Every trans­ac­tion or data entry on a blockchain is time­stamped and becomes part of an unal­ter­able record, which could aid in ensur­ing that data is sourced and processed cor­rect­ly. This is a qual­i­ty that not only sup­ports the com­pli­ance efforts of oper­a­tors, but might even prove a valu­able resource for super­vis­ing author­i­ties. More­over, blockchain’s abil­i­ty to man­age and con­trol access to data via cryp­to­graph­ic means ensures data integri­ty and secu­ri­ty. Final­ly, smart con­tracts can be deployed to man­age and auto­mate con­sent pro­to­cols. This automa­tion might play a cru­cial role in ensur­ing data is used in accor­dance with stip­u­lat­ed guide­lines and might mit­i­gate man­u­al over­sight errors.

Does­n’t tag­ging pro­vide bet­ter options for man­ag­ing vari­ables? Can’t this tech­nol­o­gy han­dle fur­ther oblig­a­tions such as con­sent and revo­ca­tion man­age­ment, incom­ing and out­go­ing con­trols, doc­u­men­ta­tion in the pro­cess­ing reg­is­ter, dele­tion and reten­tion oblig­a­tions, etc. bet­ter than blockchains?

That is a very inter­est­ing point.

Tag­ging presents notable advan­tages espe­cial­ly with­in the con­text of data ware­hous­es and data pro­tec­tion com­pli­ance. When con­sid­er­ing the adapt­abil­i­ty of data man­age­ment sys­tems, tag­ging often stands out due to its inher­ent flexibility.

I would, nonethe­less, view the two solu­tions not as com­pet­ing with one anoth­er. I would say that they can com­ple­ment one anoth­er. Tag­ging can be com­bined with blockchain-based com­part­ments in order to enhance the lev­el of com­pli­ance with data pro­tec­tion requirements.

For exam­ple, while tag­ging can help man­age the dif­fer­ent vari­ables of com­pli­ance, blockchain com­po­nents can ensure that com­pli­ance is auto­mat­ed and hard­wired into the system.

Let’s look, for exam­ple, at con­sent and revo­ca­tion man­age­ment. The gran­u­lar­i­ty pro­vid­ed by tag­ging offers a stream­lined way to cat­e­gorise data accord­ing to con­sent cri­te­ria. Should a user revoke their con­sent, the rel­e­vant tag can swift­ly iden­ti­fy and facil­i­tate appro­pri­ate actions con­cern­ing the asso­ci­at­ed data. Exe­cu­tion of the appro­pri­ate actions can be auto­mat­ed and guar­an­teed by smart con­tracts in an immutable manner.
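A minimal Python sketch of this combination follows. It assumes a hypothetical tag scheme (`consent:<purpose>` and `blocked:<purpose>`) and uses an append-only list, `audit_log`, as a stand-in for the record a smart contract would write on-chain. None of these names come from a real product.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    record_id: str
    subject: str
    tags: set = field(default_factory=set)


audit_log = []  # append-only stand-in for an on-chain record of consent events


def revoke_consent(records: list, subject: str, purpose: str) -> list:
    """Use tags to find the subject's records for a purpose, block them, and log the event."""
    tag = f"consent:{purpose}"
    affected = []
    for record in records:
        if record.subject == subject and tag in record.tags:
            record.tags.discard(tag)
            record.tags.add(f"blocked:{purpose}")
            affected.append(record.record_id)
    audit_log.append({
        "event": "consent_revoked",
        "subject": subject,
        "purpose": purpose,
        "records": affected,
    })
    return affected


# Hypothetical warehouse records.
records = [
    Record("r1", "alice", {"consent:marketing"}),
    Record("r2", "alice", {"consent:analytics"}),
    Record("r3", "bob", {"consent:marketing"}),
]
affected = revoke_consent(records, "alice", "marketing")
```

Only the record tagged for the revoked purpose and belonging to that subject is blocked. The tag does the locating, while the logged event is the part a blockchain component would make tamper-evident.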

Data Ware­house oper­a­tors could, there­fore, com­bine all avail­able tools and achieve high lev­els of effi­cien­cy and com­pli­ance. Blockchains and/or blockchain-based solu­tions can always be com­bined with oth­er soft­ware in order to max­imise the expect­ed ben­e­fits. As with all tech­no­log­i­cal archi­tec­tures the cru­cial ques­tion will always be which tool can bet­ter serve the goals we want to attain. In that sense, blockchain-based appli­ca­tions can be part of a wider ecosys­tem of tech­no­log­i­cal solu­tions, pro­vid­ed they are deployed effi­cient­ly and for pur­pos­es that align well with their inher­ent qualities.

Blockchain tech­nol­o­gy is char­ac­terised by the fact that those affect­ed can­not be iden­ti­fied. Nev­er­the­less, after their con­tri­bu­tion to blockchain tech­nol­o­gy, pseu­do­nyms are used, i.e., the data ware­house oper­a­tor remains with­in the scope of the GDPR although “no iden­ti­fi­ca­tion” con­notes anonymi­sa­tion. What advan­tages does the use of blockchain bring to the data ware­house oper­a­tor? Or does it only bring advan­tages if the data ware­house has a cer­tain legal form – such as a cooperative?

Thank you very much for this ques­tion, it allows me to make an impor­tant clarification.

While blockchains inher­ent­ly mask the authen­tic iden­ti­ties of par­tic­i­pants by sub­sti­tut­ing names with pub­lic keys, this should not be sim­plis­ti­cal­ly con­strued as pro­vid­ing absolute anonymi­ty. Though indus­try dis­course might frame it as such, the legal per­spec­tive offers a nuanced under­stand­ing. Specif­i­cal­ly, when the data inscribed on a blockchain ledger per­tains to an iden­ti­fied or iden­ti­fi­able indi­vid­ual, it assumes the sta­tus of per­son­al data with­in the con­text of the GDPR. It remains a mat­ter of con­tention as to whom this applies; not all par­ties access­ing a blockchain net­work pos­sess the tech­ni­cal, legal, or prac­ti­cal means to re-iden­ti­fy the ecosys­tem’s par­tic­i­pants. This con­tention is exem­pli­fied in the recent rul­ing of the Gen­er­al Court in T‑557/20 SRB vs EDPS (refer to paras 86 – 101). Intrin­si­cal­ly, what blockchains facil­i­tate aligns with the GDPR’s con­cep­tu­al­i­sa­tion of pseu­do­nymi­sa­tion. This inher­ent attribute of blockchains should not be over­looked. As under­scored in Art. 32(1)(a) GDPR, pseu­do­nymi­sa­tion is recog­nised as a piv­otal mech­a­nism for data pro­tec­tion com­pli­ance and presents numer­ous advan­tages for data ware­house systems.

For exam­ple, by ren­der­ing data less iden­ti­fi­able, it reduces the risk asso­ci­at­ed with poten­tial data breach­es or unau­tho­rised access. If a breach does occur, the pseu­do­nymised data, in the absence of the addi­tion­al decryp­tion keys or ref­er­ence data, is less like­ly to be exploit­ed for mali­cious pur­pos­es, pro­vid­ing an added lay­er of security.

Simul­ta­ne­ous­ly, pseu­do­nymi­sa­tion allows organ­i­sa­tions to con­tin­ue lever­ag­ing their data for ana­lyt­i­cal, research, and oper­a­tional pur­pos­es. While the data is ren­dered less iden­ti­fi­able, its under­ly­ing struc­ture and rel­e­vance are pre­served, ensur­ing that the organ­i­sa­tion can still derive valu­able insights and per­form nec­es­sary data operations.
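Both points can be illustrated with a short Python sketch of keyed pseudonymisation. It uses HMAC-SHA256 from the standard library as one possible technique. This is an assumption made for illustration and not how a blockchain derives public keys. The key, the sample rows and the helper `pseudonymise` are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict


def pseudonymise(identifier: str, key: bytes) -> str:
    """Replace an identifier with a keyed hash; re-linking requires the separately held key."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


key = b"held-separately-from-the-warehouse"  # hypothetical secret, stored elsewhere
rows = [
    ("alice@example.com", 120),
    ("bob@example.com", 80),
    ("alice@example.com", 45),
]

# The warehouse stores pseudonyms instead of e-mail addresses.
pseudonymised = [(pseudonymise(email, key), amount) for email, amount in rows]

# Analysis still works: the same person always maps to the same pseudonym.
totals = defaultdict(int)
for pseudonym, amount in pseudonymised:
    totals[pseudonym] += amount
```

The totals per pseudonym match the totals per person, so the analytical value is preserved, while a leaked copy of `pseudonymised` does not reveal the e-mail addresses without the key.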

Fur­ther­more, by incor­po­rat­ing pseu­do­nymi­sa­tion, data ware­house sys­tems can fos­ter trust among stake­hold­ers, clients, and end-users. Assur­ing indi­vid­u­als that their per­son­al data is treat­ed with the utmost care and dili­gence strength­ens the organ­i­sa­tion’s rep­u­ta­tion and can lead to enhanced cus­tomer loy­al­ty and stake­hold­er confidence.

In essence, the inte­gra­tion of pseu­do­nymi­sa­tion into data ware­house sys­tems not only for­ti­fies data secu­ri­ty and ensures reg­u­la­to­ry align­ment but also pre­serves the func­tion­al util­i­ty of the data, cre­at­ing a bal­ance between data pro­tec­tion and oper­a­tional effi­cien­cy. This is an impor­tant qual­i­ty that blockchain-based appli­ca­tions can pro­mote by default.

Nonethe­less, I would like to stress that the pseu­do­nymi­sa­tion of user data can be a dou­ble-edged sword in a data ware­hous­ing con­text. While pseu­do­nymi­ty can ben­e­fit data sub­jects con­cerned about pri­va­cy and data pro­tec­tion, cer­tain appli­ca­tions of data ware­hous­es, espe­cial­ly those in reg­u­lat­ed indus­tries, may require clear iden­ti­fi­ca­tion pro­to­cols. A data ware­house must find a way to bal­ance pri­va­cy, data pro­tec­tion and trace­abil­i­ty. This will not always be an easy task. 

Blockchains are characterised by being non-manipulable, a feature that is intended to offer security to citizens. Does it also bring benefits to the data warehouse operator? If so, what are they?

The Blockchain’s defin­ing fea­ture is its immutabil­i­ty, which aims to pro­vide a degree of secu­ri­ty to users. From the per­spec­tive of a data ware­house oper­a­tor, this char­ac­ter­is­tic does present cer­tain oper­a­tional impli­ca­tions. The immutable nature of blockchain ensures that once data is entered onto the sys­tem, its integri­ty is main­tained, mak­ing sub­se­quent alter­ations notably chal­leng­ing. This immutabil­i­ty can aid data ware­house oper­a­tors in estab­lish­ing a con­sis­tent and trace­able record, which can be of sig­nif­i­cance when ver­i­fy­ing com­pli­ance with cer­tain legal and reg­u­la­to­ry standards.

The trans­paren­cy inher­ent in blockchain means that every data trans­ac­tion is record­ed and can be traced. For data ware­house oper­a­tors, this offers a struc­tured overview of data trans­ac­tions, which might be valu­able in con­texts where data val­i­da­tion or ver­i­fi­ca­tion is essential.

While blockchain’s pri­ma­ry design might focus on secur­ing user data, its fea­tures, par­tic­u­lar­ly immutabil­i­ty and trans­paren­cy, have poten­tial impli­ca­tions for data ware­house oper­a­tions, par­tic­u­lar­ly in areas relat­ed to data integri­ty and com­pli­ance verification.

Why is it worth­while for a data ware­house oper­a­tor to think about using blockchain technology?

In the realm of data man­age­ment and ware­hous­ing, trust is para­mount. The fun­da­men­tal propo­si­tion of blockchain tech­nol­o­gy is its abil­i­ty to fos­ter and estab­lish trust. At its core, blockchains serve as trust mech­a­nisms. They pro­vide a trans­par­ent and immutable ledger, ensur­ing that data entries, once made, can­not be eas­i­ly tam­pered with or altered with­out leav­ing a trace­able record. This char­ac­ter­is­tic is espe­cial­ly valu­able in sce­nar­ios where the ver­i­fi­a­bil­i­ty and authen­tic­i­ty of data are crucial.

Now, the major question a data warehouse operator must grapple with when pondering the adoption of blockchain technology is that of trust. Is there a trust deficit that the organisation is trying to address? And if so, is it an internal or external trust issue?

Inter­nal­ly, there could be sit­u­a­tions where depart­ments with­in an organ­i­sa­tion do not ful­ly trust each oth­er due to var­i­ous rea­sons, per­haps stem­ming from oper­a­tional dis­crep­an­cies and silos. By imple­ment­ing a blockchain-based solu­tion, the need for inter-depart­men­tal trust can be min­imised as the tech­nol­o­gy itself ensures data accu­ra­cy and integri­ty, mak­ing it a sin­gle source of truth that every­one can rely on.

Exter­nal­ly, trust chal­lenges can arise when an organ­i­sa­tion wish­es to con­vey to its users or stake­hold­ers that it oper­ates with trans­paren­cy and cred­i­bil­i­ty. For busi­ness­es that rely heav­i­ly on user data, it’s increas­ing­ly impor­tant to assure users that their data is man­aged secure­ly, trans­par­ent­ly, and in a man­ner that grants them bet­ter con­trol. Imple­ment­ing a blockchain solu­tion can act as a ges­ture of com­mit­ment to these prin­ci­ples. The immutable nature of blockchain ensures that data-relat­ed oper­a­tions are trans­par­ent, while its decen­tralised struc­ture can poten­tial­ly offer users more direct con­trol and secu­ri­ty over their data.

Dr Revolidis, thank you for sharing your insights on blockchain technology and its use in a data warehouse.

Thank you, Dr Cal­daro­la, and I look for­ward to read­ing your upcom­ing inter­views with recog­nised experts, delv­ing even deep­er into this fas­ci­nat­ing topic.

About me and my guest

Dr Maria Cristina Caldarola

Dr Maria Cristina Caldarola, LL.M., MBA is the host of “Duet Interviews”, co-founder and CEO of CU³IC UG, a consultancy specialising in systematic approaches to innovation, such as algorithmic IP data analysis and cross-industry search for innovation solutions.

Cristina is a well-regarded legal expert in licensing, patents, trademarks, domains, software, data protection, cloud, big data, digital eco-systems and industry 4.0.

A TRIUM MBA, Cristina is also a frequent keynote speaker, a lecturer at St. Gallen, and the co-author of the recently published Big Data and Law now available in English, German and Mandarin editions.

Dr. Ioannis Revolidis

Dr Ioannis Revolidis studied law at Aristotle University, Thessaloniki and passed the state law examination in Greece in 2010. He completed a Master’s degree in Law at Aristotle University in 2011. Between 2013 and 2019 he was a research assistant at the Institute for Legal Informatics at Leibniz University in Hannover. In 2019 Dr Revolidis received his PhD in law from the Aristotle University, Thessaloniki. Since September 2021 Dr Revolidis has been a Lecturer in internet and blockchain law (tenure track) at the Faculty of Law and the Centre for Distributed Ledger Technologies at the University of Malta.
