More data – more knowledge?

Prof. Dr Peter Brödner 

In the latest of her Duet interviews, Dr Caldarola, editor of Data Warehouse as well as author of Big Data and Law, and Prof. Dr Brödner discuss the intentions, the goals and the quality of data analysis.

People always seem to be saying that we need more data. Will Big Data / AI lead to a new understanding of the world? Will they generate progress?

Prof. Dr Brödner: You are opening up a huge discussion, and I am very grateful to you for this question. What matters is how data relates to the real world. First of all, data is nothing more than numerals and other characters. These characters have no meaning per se. People assign meaning to these characters. Consequently, the question arises as to which theoretical or other considerations people can use to assign meaning. Therefore, we must start by clarifying what relationship we humans have to the world or to reality.

There have been a wide variety of approaches throughout history. Without going into detail, I am an advocate of the humanistic tradition, according to which we humans, ourselves the product of natural evolution and enabled by it through socialisation to engage in a collectively active confrontation with nature, have both practical abilities to act and reflective consciousness. Aristotle already made a clear distinction between practical reason – in which he also included technology as the ability to produce something that can be used for a purpose – and theoretical reason as the source of knowledge. According to Aristotle, practical reason is essential and decisive for action, while theoretical knowledge must always be laboriously acquired by means of the intellect.

Later, scientists such as Galileo or Kepler postulated that one must first start from some indisputable facts from which a question is derived. Undeniable facts are nature and, within a social environment, practices, traditions and habits. Of course, we can change nature as well as our social environment through intervention, but we cannot change the big picture according to our own wishes. It is true that we make our own conditions, yet not under circumstances of our own choosing, but under circumstances that have been found and handed down in each case. These early scientists came up with the idea of inventing hypotheses and feasible experiments based on their knowledge at the time, for which nature would provide them with answers. Examples are Galileo’s famous free-fall experiments or the abundance of observational data on cosmic points of light collected with the help of technical devices such as telescopes, timing devices, etc. This made it possible to explain the seemingly chaotic movements in the cosmos. By means of hypothetical explanations, Kepler and Galileo were able to test the behaviour of nature through measurements and to refute earlier ideas with counter-evidence. In this way, researchers were able to determine a tiny aspect of “truth” again and again.

If we now come back to the data: we humans have questions that we are trying to explain and validate today with measured data. The problem nowadays is that we have a lot of data. For example, with the “Internet of Things” many things now have a sensor that provides data. Often, however, this data does not fit our questions, or we measure without having a question (e.g. because a thing with a sensor simply sells better), or we cannot recognise the supposed connection.

Let’s assume that the door has a sensor that records the time of each movement using a time stamp. What do we know from these measured time and movement data of the door? We only know that the door moved at a certain time, but not for what purpose, for what reason, due to which person, possibly not even with which tool or even how it was opened. We simply do not know the reasons or circumstances. And then humans start to think and use other sensors such as cameras, which in turn raise new questions.

This brings us back to the question of what data is, what meaning it has and what it actually tells us. And it is precisely this question that humans can only answer if they follow up on why and how the data was generated. At the beginning there is usually surprise at a certain event; to explain it, scientists devise an experiment, formulate a hypothesis and collect data in a methodologically sound manner in order to be able to validate the hypothesis.

Moreover, the full practical significance of the data only becomes apparent in the respective context of action, for example when answering the question: How much is a lot? Thus, the measured data obtained must first be interpreted, based on the respective theoretical background as well as on the context of action of a particular social practice, in order to determine what they “tell” us. Apart from a careful assessment of the methodological validity and reliability of the data, the question of how much data matters in a pragmatic context of action can only be answered in relation to other, similarly relevant processes. Meaningful and effective action thus depends entirely on this dual interpretation. However, this is precisely where mistakes in the handling of data are frequently rooted.

In short: as pure signals, as numbers and letters taken in themselves, the data do not say anything at all; as such they are initially completely meaningless. They gain their significance exclusively on the basis of the theory behind the measurement method, on the basis of which the method and the measuring apparatus were developed, as well as on the basis of their proven validity and reliability.

What do companies or governments expect from still more data? More knowledge? Better forecasts and decisions? Technological leadership in AI…?

Today, we are often under the illusion that we can explain and validate everything with the available data without knowing for sure how it came about or being able to assess its quality. In doing so, we succumb to the temptation of ascribing meaning to the data without systematic review, because this meaning or interpretation fits current projects, opinions, ideas or even wishes. In addition, with the advent of digitalisation, companies have a great desire for data because they hope that it will ultimately lead to progress. That’s why companies everywhere are siphoning off data. Furthermore, many people have faith in technology; they give more credence to data evaluation than to their own human instinct and judgment.

But has it ever been carefully examined whether these pools of the most varied data could actually be used to explain something, sell something better, win an election or achieve a technological competitive advantage? Have our politicians defined carefully enough what they want to achieve with the Internet of Things, with data and its evaluation, with “big data”, etc.? These would be – and only with a methodically sound approach – at best tools for achieving a goal, and that goal does not seem to be clear. Thus, more data by no means provides more reliable knowledge, but it does create the illusion of knowledge. And so there are no more accurate forecasts or decisions either.

Accordingly, it is also questionable to strive for technological leadership in so-called “artificial intelligence (AI)” without clarifying the background in more detail. Quite apart from the fact that the meaning of “AI” is often not very clear, the performance of data-driven approaches – whether based on decision trees or on artificial neural networks (“connectionist” AI) – also depends decisively on the quality of the data used for “training”, i.e. for calculating the initially undetermined parameters.

Data quality is therefore of great importance, yet it is often unknown. In most cases, it is not known where the data comes from or under what conditions it was collected. The data quality can therefore not be assessed, or only with difficulty.

In some of the previous interviews, we have heard that the results of data evaluation always depend on the quality as well as the quantity of the data. In other words, if the data input is bad, the output is also bad – in short, “garbage in, garbage out”. How do we get good or better data?

In technology, medicine and the social sciences, there is a methodologically sound and proven approach. You have to follow the premises of stochastics, i.e. a random experiment must be carried out and examined using a random sample. Hypotheses developed on the basis of theoretical considerations, and data collected specifically to test them, then ensure that the data evaluation is sound, valid, low in errors, complete and appropriate – i.e. that the data quality is appropriate and correct. This rests on a theoretically derived hypothesis, which of course can still be wrongly or insufficiently determined. Therefore, the random experiment as well as the associated methods of data collection must also be validated. The question to be answered is whether the method really measures exactly what is to be measured – the question of validity. In addition, the question must be answered as to whether the method also performs a reliable measurement free of interference factors – essentially the question of reliability.
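
To make the idea of testing a hypothesis on a random sample concrete, here is a minimal sketch; the scenario, sample size and numbers are my own illustrative assumptions, not taken from the interview. A claimed effect is checked against a null hypothesis by simulating what pure chance would produce:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical random experiment: we claim a process change raised a
# success rate above the historical baseline of 50%.
baseline_rate = 0.50
n_sample = 200                # size of the random sample (assumed)
observed_successes = 117      # assumed measurement result

# Null hypothesis H0: the true rate is still the baseline.
# Simulate the sampling distribution of successes under H0.
simulated = rng.binomial(n_sample, baseline_rate, size=100_000)

# One-sided p-value: how often chance alone produces a result at least
# as extreme as the one observed.
p_value = np.mean(simulated >= observed_successes)

print(f"observed rate: {observed_successes / n_sample:.2f}")
print(f"p-value under H0: {p_value:.4f}")
# A small p-value speaks against H0 -- but only if the sample was truly
# random and the measurement itself is valid and reliable.
```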

There are examples. Google, for instance, thought that, because of the sheer amount of their data, they could make better, more accurate and faster predictions of flu epidemics from their search queries than the health authorities with their stochastic methods. However, this could not be confirmed, because Google did not have sufficient knowledge of the background of their data and of the meaning attributed to it. Social mechanisms came into play here which sociologists call self-fulfilling prophecy: when people perceive many flu cases, more queries are made on the Internet, and this can then be exaggerated, so that there are more such queries than correspond to the actual flu outbreak and its course.

My favourite quote:

“Not everything that counts can be counted, and not everything that can be counted counts.”

Albert Einstein

Another illuminating example: following the terrorist attack on the Christmas market at the Berlin Memorial Church on December 19th 2016, a one-year mass test of automatic facial recognition was later carried out at Berlin’s Südkreuz train station. After the evaluation, the Minister of the Interior proudly announced a hit rate of 80% and a false alarm rate of 0.1% – i.e. out of 10 suspects, 8 were correctly identified, and out of 1,000 innocent passers-by, only one was falsely suspected. This means that with a large number of about 100,000 passers-by daily and assuming 100 actual suspects among them, 80 are correctly identified and about 100 are incorrectly suspected, i.e. 56% (= 100/180) of the flagged passers-by are falsely suspected; therefore, all 180 must be checked individually. However, like MEP Hohlmeier, most people believe that a false alarm rate of 0.1% means 99.9% correct hits – which testifies to the widespread inability to deal with conditional probabilities (Bayes’ theorem), according to which there are still many false positives when the number of people observed is large and the number of actual suspects is small. So mass surveillance becomes part of the problem and not part of the solution.
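
The arithmetic behind this base-rate effect can be reproduced directly. A minimal sketch using the figures quoted above (100,000 passers-by, 100 actual suspects, 80% hit rate, 0.1% false alarm rate):

```python
# Bayes-style base-rate calculation for the Suedkreuz figures quoted above.
daily_passers_by = 100_000
actual_suspects = 100
hit_rate = 0.80           # P(alarm | suspect)
false_alarm_rate = 0.001  # P(alarm | innocent)

innocents = daily_passers_by - actual_suspects

true_positives = hit_rate * actual_suspects          # 80
false_positives = false_alarm_rate * innocents       # ~100
flagged = true_positives + false_positives           # ~180

share_falsely_suspected = false_positives / flagged  # ~0.56
print(f"flagged per day: {flagged:.0f}")
print(f"share of flagged people who are innocent: {share_falsely_suspected:.0%}")
# Despite a 0.1% false alarm rate, more than half of all alarms concern
# innocent people, because actual suspects are so rare among the observed.
```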

Yet another example: a vaccination with zero efficacy can show an effect of 70 per cent in studies such as those conducted during the pandemic. In order to estimate the protective effect of mRNA vaccines, authorities, vaccination commissions and physicians have been largely dependent on observational studies and model calculations since the beginning of 2021. The problem with this is that such studies can lead to results that do not reflect reality. According to John Ioannidis, infectiologist and epidemiologist at Stanford University, most reports on the effectiveness of the Covid vaccine are based on “distorted findings”.[i] Three scientists have conducted a thought experiment: they assumed that the Covid mRNA vaccinations were completely ineffective – and calculated what effect observational studies could nevertheless attest to the vaccinations. A vaccine that has zero effect can still “achieve” a supposed protective effect of 67 per cent. Highly regarded observational studies, some of which were widely cited and served as a guideline for the authorities, had greatly overestimated the protective effect of the mRNA vaccines, according to the conclusion of the three scientists Peter Doshi, Kaiser Fung and Mark Jones.
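
One mechanism that can produce such an artefact is a case-counting window, in which infections occurring in the first weeks after injection are not counted as “vaccinated” cases; as far as I understand it, this is the kind of counting convention the thought experiment targets. The following deliberately simplified simulation uses my own toy numbers, not the authors’ calculation, purely to show how a vaccine with zero effect can appear protective under such a convention:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy numbers chosen for illustration only -- not the study's figures.
n_people = 200_000
followup_days = 120
counting_window = 21            # days after injection during which infections
                                # are not counted as "vaccinated" cases
daily_infection_prob = 0.0005   # identical for everyone: zero-efficacy vaccine

vaccinated = rng.random(n_people) < 0.5
vaccination_day = rng.integers(0, 60, size=n_people)   # staggered rollout

# Each person is infected (or not) with the same probability, on a
# uniformly random day -- the simulated vaccine changes nothing.
infected = rng.random(n_people) < 1 - (1 - daily_infection_prob) ** followup_days
infection_day = rng.integers(0, followup_days, size=n_people)

# Counting convention: a case only counts as "vaccinated" if it occurs
# after the counting window; earlier cases land in the unvaccinated tally.
counted_vax = infected & vaccinated & (infection_day >= vaccination_day + counting_window)
counted_unvax = infected & ~counted_vax

risk_vax = counted_vax.sum() / vaccinated.sum()
risk_unvax = counted_unvax.sum() / (~vaccinated).sum()
print(f"apparent vaccine effectiveness: {1 - risk_vax / risk_unvax:.0%}")
# Prints a sizeable apparent protective effect although the simulated
# vaccine does nothing at all.
```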

So what have we learned from these examples? That we should have precise knowledge about the data, how it comes about and its possible disruptive factors – a condition which will never be 100 per cent attainable and which we cannot know exactly. But we can get closer. That’s why I actually feel that Google’s approach to predicting a peak flu outbreak is like reading tea leaves: the hypothesis and the measurement method are not good enough. That’s why I also speak of dataism as a kind of blind belief in the objectivity of data, and why I would like to mount a defence of stochastics and probability theory.

We also know from measurement theory that every measurement is wrong, for fundamental reasons. The true value of a physical quantity can never be obtained, only the measured or read value. Recording the data produces a diagram, a kind of point cloud, which in turn shows a tendency. With the help of a mathematical function (a straight line or a polynomial of higher order), the deviations of these measuring points from this function can be determined. The sum of the squared deviations can then be minimised, and so we arrive at so-called regression analysis. This generates a curve that corresponds to the measured values as well as possible. This function can then also be used for forecasts. The method can be clearly justified, and the measured values can be reasonably interpreted – namely against the background of my theoretical considerations. Science has been working with this for a long time – at least since Gauss.
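
A minimal sketch of this least-squares procedure; the data here is synthetic, generated purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "measurements": a true linear law plus measurement error.
x = np.linspace(0, 10, 50)
y_true = 2.0 * x + 1.0
y_measured = y_true + rng.normal(scale=1.5, size=x.size)

# Least squares: choose the line that minimises the sum of squared
# deviations between model and measured values (Gauss's method).
slope, intercept = np.polyfit(x, y_measured, deg=1)

# The fitted function can then be used for a forecast ...
x_new = 12.0
forecast = slope * x_new + intercept
print(f"fitted line: y = {slope:.2f} * x + {intercept:.2f}")
print(f"forecast at x = {x_new}: {forecast:.2f}")
# ... but it only describes a tendency with residual uncertainty,
# never the "true" values themselves.
```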

In order to be able to assess the quality of the results of a data evaluation, transparency about the data and the underlying logic, explainability and provability are required. For humans, traceability and verifiability seem difficult. How good or sufficient is the verifiability provided by scientifically recognised mathematical methods such as LRP or LIME in the case of AI?

In the case of AI, the first thing to do is to define what is meant by it. Today, it is usually understood to mean artificial neural networks. These can be modelled mathematically with precision using functions of linear algebra (matrix calculus). Such a network can be “trained” for a specific use case by optimally adapting the large number of its initially undetermined parameters to large amounts of data from the field of application. It can then be used to make predictions, comparable to the procedure for regression.
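
To illustrate the point that such a network is nothing but matrix calculus, here is a minimal sketch of a forward pass through a small network; in a real application the weights would be fixed by training, whereas here they are random stand-ins chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

def relu(z):
    # Fixed, elementwise nonlinearity.
    return np.maximum(z, 0.0)

# A tiny two-layer network: 4 inputs -> 8 hidden units -> 1 output.
# W1, b1, W2, b2 are the parameters that training would determine.
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)

def predict(x):
    # The whole "black box" is two matrix-vector products plus a fixed
    # nonlinearity -- every step is ordinary linear algebra.
    hidden = relu(W1 @ x + b1)
    return W2 @ hidden + b2

x = np.array([0.2, -1.3, 0.7, 0.0])   # one input vector
print(predict(x))
```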

There is some talk that a neural network is a black box that is not comprehensible. Here I have to disagree vehemently, insofar as a neural network and its behaviour can be explained and recalculated by linear algebra. The computational procedure does what the formal description by programs specifies and delivers a determined result. And that is exactly why we can be sure that there are no mistakes in this area, irrespective, however, of the inevitable stochastic uncertainties. Of course, there are further challenges in linear algebra, such as the fact that individual matrices are poorly conditioned and the computational procedure then does not run well. These are numerical problems. But the description by linear algebra is equivalent to the neural network. Neural networks are mathematical spaces with tens of thousands of dimensions, and although this overwhelms human perception, the mind can rely on the calculations. Beyond that, however, an individual result cannot be explained.
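
The remark about poorly conditioned matrices can be made concrete with a short sketch (the matrix here is chosen only as an illustration): the condition number indicates how strongly small input errors are amplified in the result.

```python
import numpy as np

# A nearly singular, ill-conditioned 2x2 matrix, chosen for illustration.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
b = np.array([2.0, 2.0001])

print("condition number:", np.linalg.cond(A))   # roughly 4e4: large

x = np.linalg.solve(A, b)
print("solution:", x)                           # [1, 1]

# A tiny perturbation of the right-hand side ...
b_perturbed = b + np.array([0.0, 0.0001])
x_perturbed = np.linalg.solve(A, b_perturbed)
print("perturbed solution:", x_perturbed)
# ... changes the result drastically: a numerical, not a conceptual, problem.
```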

Let’s assume that the quality of the data is excellent; the result is then still a result with a certain uncertainty or a certain probability. It is like regression, where the curve only describes possible outcomes but not the real ones. Even if the probability of false results from a neural network is relatively small, we only know that a certain percentage of all results are false, but not which of these results are false. So there is always inherent uncertainty.

How we should deal with this fundamental uncertainty cannot be answered in general terms. It depends on the situation. One situation can tolerate a higher error rate, another a lower one. We already know this from many technical systems.

But do we want to tolerate mistakes and simply accept the result from a neural network, or should or must we check each result individually? Individual inspections are time-consuming, and accepted errors can have considerable monetary consequences – possibly also consequences for life and limb. And how do supervisors react when mistakes happen – even if they know that there is an error rate and that only probable results are delivered?

The fact of a probability-based error rate cannot be eliminated, and it is precisely this question that society must deal with. Perhaps the susceptibility to error of a neural network can be compared with the susceptibility to error of humans, and perhaps tolerance limits for high-risk AI must then be defined in laws and contracts so that responsibility is clarified and fully automated processes become possible. Certification processes are also conceivable. Then there is the assessment by the judiciary, which must continue to deal with the known susceptibility to error and the tolerance limit in the event of damage.

Do we need a philosophy of science adapted to new technological changes such as AI / Big Data? Or perhaps a completely new one?

No, we need neither a new nor an adapted theory of science. Rather, we must use the proven and tested methods.

Around 2008, the editor-in-chief of the online magazine “Wired”, Chris Anderson, put forward the bold thesis that, in view of the large amount of data on the Internet, there was no longer any need for hypothesis-based analyses: one could simply read the truth about reality from data correlations.

This is a blatant, ancient form of the fallacy “cum hoc ergo propter hoc” (Latin: “with this, therefore because of this”), i.e. the fallacy of pseudo-causality, in which the joint occurrence of events or the correlation between characteristics is taken as a causal relationship without closer examination. But a correlation does not imply causality, even if the connection may seem to suggest it. Therefore, it is important that we know the difference between causality and correlation, because the mere fact that A and B occur together does not tell us whether A causes B, or B causes A, or whether both are caused by a third factor C.

And it is precisely this statement that is an indicator for me of how far our society and science have regressed, when an editor-in-chief can make such a claim in a globally read magazine. An outrageous scandal!

The French postmodern philosophers after Sartre also incurred a great deal of blame – even if they were not really taken seriously in France and Europe – because they were celebrated as lucky charms in the USA. They were courted at the elite universities from Harvard to Stanford. From there comes the questionable claim that reality is nothing more than a chaos of narratives – the theoretical basis for the fight against “fake news” later on.

There are two logicians, Charles Sanders Peirce and Gottlob Frege, who developed a first-order predicate calculus with existential and universal quantifiers. They thus made it possible, in many sciences, to formalise arguments in practice and theory in important areas and to check their validity. In addition, they developed a logically advanced concept of signs, which is particularly important for computer technology.
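
As a small illustration of what such a formalisation looks like (the example argument is mine, not from the interview), a simple inference can be written in first-order predicate logic with quantifiers and checked for validity purely by its form:

```latex
% Every measurement is subject to error; m is a measurement;
% therefore m is subject to error.
\[
  \forall x\,\bigl( M(x) \rightarrow E(x) \bigr), \quad M(m)
  \;\vdash\; E(m)
\]
% The existential quantifier allows claims such as
% "there is at least one measurement free of interference":
\[
  \exists x\,\bigl( M(x) \land \lnot I(x) \bigr)
\]
```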

This is of the utmost importance, especially for our dialogue, because both say, independently of each other, that a sign is not a thing but a triadic relation. We need three entities that are related to each other: first, a physical sign carrier or body, called the “representamen” (e.g. a figure made of printer’s ink on paper); second, the object named or designated by it, about which the sign says something (e.g. a cup); and third, the interpretant, the concept or meaning that an interpreter assigns to the sign depending on the situation.

In this structural relationship, the peculiarity of the object plays a decisive role. If the object is only an object of thought, as in mathematics, for example – no one has ever seen a point or a line – then strict logical rules apply to these thought objects. Such an object can have a certain predicate, such as the point as a thing without extension (unlike the atom). These are “inventions” of mathematics. David Hilbert put it so beautifully: “Mathematics is a game with few rules and meaningless characters on paper.” So, if we want to talk about imaginary mathematical objects, then what is said must agree with the rules of mathematics and of logic (especially predicate logic). If there is no conformity, then we know that the statement or the assignment of meaning is wrong. If we have a natural object such as a star in front of us, then we can only talk about it meaningfully if we recognise the laws of gravity and bring our statement into line with these laws. That is why the object is so important: it provides the touchstone for right and wrong statements. At the same time, it follows that the calculation methods used in computer technology operate only with data in the form of binary signals, without “knowing” what they stand for or what they mean.

How can science have a better impact on our social and political sphere in order to advance our civilisation?

Data or AI per se will not bring about any progress; at best, data and AI will help us from the point of view of efficiency, but not in the field of creativity.

Computer technology is completely overrated in economic terms. It does not in itself represent an increase in productivity. This has been shown empirically time and again, because increases in productivity can only be achieved through a reorganisation of sign-based cooperation processes and not through mere data processing.

The delusion from the 1980s – “Experts go, expert systems stay” – will not come true this time either. Although the “symbolic AI” approaches of that time have now been largely replaced by “connectionist AI”, the latter is not “intelligent” either; like all artefacts, it merely objectifies insights from the intelligence of its creators. Jean Piaget said at the end of his life: “Intelligence is what you use when you don’t know what to do.” And the algorithmic control of computational procedures is exactly the opposite, because there you have to know exactly what to do and how to control the process. Therefore, AI systems, their computer programs, algorithms, etc. are not intelligent.

I have studied the history of computers and have collected many newspaper headlines from the early days of computer technology in the 1940s. One of the headlines was: “30-ton electron brain at Philadelphia University thinks faster than Einstein”, and a book by E.C. Berkeley from 1949 is entitled “Giant Brains, or Machines That Think”. This shows how long the “thinking machine” and its supposed possibilities have been an issue in our society. They are also today’s leitmotif in the cognitive sciences – in the form of the so-called “computational theory of mind”.

We are dealing here with views from a long rationalist tradition in which, like Pierre-Simon Laplace, people believed that the world could be explained by a system of differential equations and that intellectual insight was nothing more than calculation (man as machine). This is also contained in the word “rational”, because ratio means both calculation and reason.

In the 1950s, there were philosophers such as Gilbert Ryle who wrote against this. He made a strict distinction between “knowing how” and “knowing that”, i.e. between implicit ability and explicit knowledge. Or the chemist Michael Polanyi, who wrote something similar in his book “The Tacit Dimension”. We also find this in Aristotle, who postulated a practical competence to act that also has access to our creativity and intuition.

Today we know that we cannot explicate all implicit ability as knowledge. This is only possible to a limited extent with the scientific methods mentioned above; only individual aspects of reality or ability can be explicated. We find this again, in essence and nuance, in Daniel Kahneman’s “Thinking, Fast and Slow” or in Albert Einstein’s “The sign of true intelligence is not knowledge, but imagination.”

In view of all this, and looking at today’s social situation, I ask myself: how are upcoming existential crises to be overcome in the future if it is no longer even possible to interpret data appropriately against the background of how they came about, and to answer the fundamental question “How much is a lot?” in a relevant comparison? When it is no longer even possible to classify an observed event in its factual and temporal context? When it is no longer even possible to conduct rational discourses based on evidence-based arguments as the core of reason-driven knowledge gain? When the awareness of the fundamental difference between theoretical and practical reason (Aristotle), between intuitive judgment and calculating reason, between experience-based ability and explicit knowledge has disappeared? When it is no longer possible to combine the disciplinary fragments of knowledge of a science into a coherent overall picture of a complex situation that has arisen?

We are currently experiencing all of these phenomena. This is by no means a matter of subtle sophistry or subject-specific methodological subtleties that would only be of interest to the relevant experts, but of the most elementary logical and epistemic knowledge and skills that determine our access to and understanding of the world. In a society so dependent on artifice, these should be a core component of elementary education, but this too has apparently been largely lost. This loss of a realistic and methodologically secured access to and understanding of the world is tantamount to uprooting and means nothing other than a profound decline in civilisation.

Prof. Dr Brödner, thank you for sharing your insights on dataism and on our question of whether data quantity automatically leads to quality if the data set is big enough.

Thank you, Dr Caldarola, and I look forward to reading your upcoming interviews with recognised experts, delving even deeper into this fascinating topic.


[i] “Wie ein unwirksamer Impfstoff wirksam erscheinen kann” (How an ineffective vaccine can appear effective), Infosperber, https://www.infosperber.ch/gesundheit/wie-ein-unwirksamer-impfstoff-wirksam-erscheinen-kann/ (accessed 16 January 2025).

About me and my guest

Dr Maria Cristina Caldarola

Dr Maria Cristina Caldarola, LL.M., MBA is the host of “Duet Interviews”, co-founder and CEO of CU³IC UG, a consultancy specialising in systematic approaches to innovation, such as algorithmic IP data analysis and cross-industry search for innovation solutions.

Cristina is a well-regarded legal expert in licensing, patents, trademarks, domains, software, data protection, cloud, big data, digital eco-systems and industry 4.0.

A TRIUM MBA, Cristina is also a frequent keynote speaker, a lecturer at St. Gallen, and the co-author of the recently published Big Data and Law now available in English, German and Mandarin editions.

Prof. Dr Peter Brödner

From 1976 to 1989 Prof. Dr-Ing. Peter Brödner was responsible for the management of industrial development projects in the fields of NC programming, flexible manufacturing systems, production planning and control, and anthropocentric production systems at the project management agencies Humanisation of Working Life (DLR Bonn) and Manufacturing Technology (Forschungszentrum Karlsruhe). From 1989 to 2005, he was research director for production systems at the Institute for Work and Technology at the North Rhine-Westphalia Science Center, with a focus on the design of computer-aided work and organisational change. Since retiring, he has worked as a research consultant, lecturer for "IT in Organisations" and honorary professor at the University of Siegen. He is a member of the Leibniz Society of Sciences and Humanities in Berlin.

