Big Data, data jour­nal­ism and the future of media

Prof. Christi­na Elmer

The nature of journalism is undergoing a transformation. New digital tools such as AI and Big Data, as well as the distribution channels of social media, are changing what is required and what is possible in terms of careers, and they pose a new set of challenges. What are these exactly? How can journalism retain its credibility in the digital age? How will it ensure its independence in this era of vastly disparate opinions in digital space? How can the population digest the current information flood consisting of real and fake reports? How does journalism create room for thinkers to have discussions on what we want to achieve using all that is possible in the digital world?

In her lat­est Duet inter­view, Dr Cal­daro­la, author of Big Data and Law, and Prof. Christi­na Elmer, the first Pro­fes­sor for data jour­nal­ism and dig­i­tal jour­nal­ism in Ger­many, dis­cuss the many impor­tant changes in journalism.

You have received the first professorship for data journalism and digital journalism in Germany. Can you please provide our readers with a short introduction on what data journalism is, what the future of media might look like and what developments, initiatives and breakthroughs exist in other countries?

Prof. Christi­na Elmer: Data jour­nal­ism is essen­tial­ly about unlock­ing struc­tured data as sources of infor­ma­tion in jour­nal­ism. These can be exten­sive sta­tis­tics in clear for­mats or even larg­er data leaks that we have to eval­u­ate in an inves­tiga­tive way. For this type of analy­sis, we need new skills, meth­ods and process­es in jour­nal­ism – and teams that are organ­ised in a much more inter­dis­ci­pli­nary way than before and are strong­ly con­nect­ed with oth­er dis­ci­plines and depart­ments. My goal is to ground these nec­es­sary skills more com­pre­hen­sive­ly and, at the same time, train spe­cial­ists who can focus on this excit­ing and increas­ing­ly rel­e­vant area. The field of data jour­nal­ism has been devel­op­ing in Ger­many for about 15 years; in the US, for instance, it has a much longer tra­di­tion. We can, how­ev­er, dis­cern a clear trend in this direc­tion glob­al­ly, which is not sur­pris­ing: With dig­i­tal­i­sa­tion and the grow­ing avail­abil­i­ty of open data, jour­nal­ists can increas­ing­ly access rel­e­vant data sources in their inves­ti­ga­tions – and should do so in order to gain new insights and check oth­er sources of infor­ma­tion. What emerges can, at best, be called evi­dence-based jour­nal­ism. And in my view, this is real­ly an impor­tant devel­op­ment. Because, after all, we have to do some­thing to coun­ter­act the flood of uncer­tain, biased or even manip­u­lat­ed infor­ma­tion in the dig­i­tal pub­lic sphere.

Digital media currently offers a huge spectrum for all sorts of publications. Once there were newspapers and magazines, then came the advent of television, while today we use social media and tools such as AI, Big Data and the like. Information is created not only by journalists but also by people who use their mobile devices to record occurrences and post them on different distribution channels. How do journalists handle the number of distribution channels and the quantity of information? How is this information digested, edited and validated, considering that media companies have fewer employees and smaller budgets?

Of course, media hous­es try to allo­cate their resources wise­ly – even if their bud­get is tight. In rela­tion to the chal­lenges you men­tioned, this means, for exam­ple, that the area of devel­op­ing and refin­ing new prod­ucts and offers has grown in many out­lets, espe­cial­ly regard­ing dig­i­tal chan­nels and plat­forms. These teams take a close look at what media users need and what for­mats can be used to address them clev­er­ly. In this con­text, new prod­ucts with a strong tech­no­log­i­cal com­po­nent are often cre­at­ed, which is why new con­stel­la­tions and com­pe­tences are also need­ed in this area – from data analy­sis and pro­gram­ming to dig­i­tal dis­tri­b­u­tion and search engine opti­mi­sa­tion. As far as research in the dig­i­tal space is con­cerned, jour­nal­ists are indeed chal­lenged. Check­ing the authen­tic­i­ty of dig­i­tal sources has become an impor­tant skill – some news­rooms have now set up their own units for this type of work. In research as well as in dis­tri­b­u­tion, we also need new ways of struc­tur­ing infor­ma­tion via meta­da­ta and link­ing it in dif­fer­ent con­texts. This could also be a way to pre­pare infor­ma­tion in a man­age­able way so that we can use it appro­pri­ate­ly in our process­es and for publication.

A pic­ture used to be viewed as evi­dence for an occur­rence. Today every pic­ture can eas­i­ly and rapid­ly be manip­u­lat­ed by every­body by means of dig­i­tal tools. How much effort is required and what tech­niques do jour­nal­ists use to ver­i­fy sources? How do they main­tain the cred­i­bil­i­ty of their report­ing? How effec­tive are they and how do they function?

One central challenge has always been to trace the source back to its origin, if possible, and to cross-check it with a second source: Who took a photo and how can we assess its credibility as confidently as possible? Also, who was on the scene and could they provide us with a second perspective on it? In addition, in many cases one can indeed find traces of the photo-faking process in the picture itself. These strategies are not new and have always been used in editorial offices – and not only for visual sources. Comparatively new, however, are technologies that enable entire video or audio sequences to be fabricated with remarkably little effort. There are also technological tools to identify such fakes, and these are already being used in newsrooms. Nevertheless, from my point of view, it would be fatal to rely on these tools, since the fakes are naturally also improving all the time. Instead, it seems reasonable to me to emphasise the value of credible and verified sources and to make journalistic research more transparent – and, in case of doubt, to always decide against publishing rather than putting faith in journalism at risk.

People are there when something happens and, within a few split seconds, post videos, their opinions, reports and the like. How do journalists handle this type of speed, quantity and sensationalism as well as the lack of research, balance, objectivity and proper use of language? Or expressed differently: How do journalists verify sources and content and how does well-researched content reach the respective media in time?

There isn’t really a single answer that applies to journalism as a whole – every newsroom has its own processes, deadlines and timelines, depending on the publication channel, the strategy and the business model. However, the following applies to all digital media offerings: In order to stand out from the flood of information, they have to convey more strongly than before what distinguishes them from other publishers. This can mean disclosing the sources and stations of the research, addressing possible conflicts of interest and consistently differentiating between neutral information and opinion, between verified knowledge and the grey areas of reporting. The central issues of our time, such as pandemics, climate crisis and societal transformation, are very complex and cannot be solved with simple recipes. Media can gain trust if they create spaces in which such complex issues can be negotiated. However, this also means resisting the temptation to react quickly and thus working against the reward system of our real-time internet, which is no easy task in a news world that has steadily accelerated.

Human beings are social beings who most likely share a majority opinion because it helps them feel secure. Are opinions being created with the aid of clicks, tweets and likes? Is the illusion of a mass of people being created through a supposed grassroots movement of citizens, which is then also subject to manipulation?

If soci­ety were to inform itself sole­ly through media and if media were to align their report­ing exclu­sive­ly to the met­rics you men­tioned, there could of course be such an effect. But from my point of view, nei­ther sit­u­a­tion is true at the moment. In seri­ous edi­to­r­i­al offices, even today, the most impor­tant news items are select­ed on the basis of their rel­e­vance – based not only on the response, but also on find­ings that go beyond that, for exam­ple, from sci­en­tif­ic stud­ies or statistics.

Mathematicians and data experts from the French Institute for Complex Systems have been investigating the battle for control of virtual space and have visualized it, using the example of the combat between climate supporters and climate opponents. They showed how news was being shared, how accounts were interacting and how communities were growing. This visualisation illustrated that climate opponents are considerably fewer in number than climate supporters. However, climate opponents are clearly more active and can compensate for the enormous amount of evidence against them. The researchers could also show that the Heartland Institute is behind the group of climate opponents – a think tank of around 40 employees. This institute, which is located in Chicago, produces tweets, conferences and books. Are they the opinion makers and journalists of today or tomorrow?

You have brought up an exciting study that really makes you think! But I would not call such actors journalists – they are simply not independent and impartial enough to deserve that title.

My favourite quote – almost three decades old, but still true:

“Com­put­ers don’t make a bad reporter into a good reporter. What they do is make a good reporter better”.

Elliot Jaspin

What they do is fol­low their own agen­da, which does not nec­es­sar­i­ly oblige them to com­mu­ni­cate in the inter­est of a well-informed pub­lic. But, of course, espe­cial­ly when it comes to top­ics with a sci­en­tif­ic basis, the media must pay even more atten­tion to ade­quate­ly cov­er­ing such dis­cours­es. Unlike in socio-polit­i­cal dis­cus­sions, it is sim­ply not enough to set the two sides against each oth­er when sci­en­tif­ic debates are involved – that would pro­duce an effect known as false bal­ance. If there is a posi­tion that the over­whelm­ing major­i­ty of sci­en­tif­ic experts in this field can agree on, then it should log­i­cal­ly also be giv­en more space in report­ing. Espe­cial­ly with regard to top­ics which are both sci­en­tif­i­cal­ly researched and social­ly nego­ti­at­ed, jour­nal­ists have to be care­ful. In my view, this is an impor­tant les­son from the pan­dem­ic report­ing. But it also means that we need new com­pe­ten­cies in some parts of the news­rooms, for exam­ple, to eval­u­ate sci­en­tif­ic experts. 

Will Bild TV become the “Heart­land” of tomorrow?

That depends on whether the edi­to­r­i­al team places jour­nal­is­tic prin­ci­ples and rel­e­vance cri­te­ria at the cen­tre of its work – or whether it fol­lows oth­er inter­ests and cam­paigns. That, of course, is some­thing to watch. I find it excit­ing that cur­rent or even live report­ing is to be giv­en such a promi­nent place in the new pro­gramme. Pro­vid­ing the nec­es­sary con­text which makes it pos­si­ble to cor­rect­ly clas­si­fy the top­ics of the day is cer­tain­ly a major challenge.

Where are the var­i­ous dis­sent­ing opin­ions? How are these opin­ions being pro­tect­ed? Or are they get­ting lost in the flood of information?

In a plu­ral­is­tic soci­ety that pro­tects free­dom of the press and free­dom of opin­ion, it is fun­da­men­tal that minori­ties be heard. Accord­ing­ly, there are var­i­ous pre­cau­tions to ensure diver­si­ty of opin­ion in jour­nal­ism – for exam­ple, through con­cen­tra­tion con­trol or the appro­pri­ate require­ments for both pub­lic and pri­vate broad­cast­ing. In this respect, such pro­tec­tion is pro­vid­ed for – in the­o­ry – but we must indeed be care­ful to keep spaces avail­able for minor­i­ty per­spec­tives and posi­tions. This is espe­cial­ly true for media offer­ings that are sup­posed to be prof­itable and are there­fore par­tic­u­lar­ly prone to focussing on large or afflu­ent tar­get groups. 

Amazon has grown in part thanks to the online sale of books. It sells digital books and audiobooks for its Kindle. With the help of Big Data analysis, Amazon examines what its customers purchase and read, which sentences and chapters they underline or quote, and which paragraphs they read again. Amazon has now purchased a publishing house and wants to produce tailor-made content based on its customer analyses to raise its sales. Is that a model for journalism?

Cer­tain­ly not. Jour­nal­ism that com­plete­ly fol­lows what its read­ers want to hear and opti­mis­es its con­tent sole­ly for sales would make itself irrel­e­vant. On the con­trary, we should empha­sise what jour­nal­ists are still need­ed for today – for ques­tions that no one has asked before, for an empa­thet­ic view of the world and a cre­ative approach to forms of expres­sion that can evolve as a result. Of course, edi­to­r­i­al offices should use new tech­nolo­gies to stream­line their process­es in mean­ing­ful places, to enable research involv­ing large amounts of data and to gen­er­ate for­mats for new plat­forms. But we should not leave the con­tent cre­ation and the eval­u­a­tion of top­ics to algo­rithms. If jour­nal­ism is to be the place where social dis­course is mod­er­at­ed, then it should be shaped by soci­ety – and not exclu­sive­ly accord­ing to mea­sur­able cri­te­ria that can be processed by algorithms.

Wikipedia has replaced encyclopaedias. The publishing houses of encyclopaedias used to have trained employees who researched, evaluated and processed facts, knowledge etc. This group of employees has disappeared, and every layman can now add knowledge and pseudoscience to the largest and best-known digital reference book. Wikipedia makes efforts to check entries, erase untruths or duly endorse such contributions. Is this transformation an improvement?

The idea behind Wikipedia is bril­liant in my opin­ion – a par­tic­i­pa­to­ry ency­clopae­dia that is close­ly inter­twined with today’s soci­ety, gives space to niche knowl­edge and thus reflects much more than a stan­dard ency­clopae­dia ever could. Of course, this is asso­ci­at­ed with a dif­fer­ent qual­i­ty stan­dard: A Wikipedia entry can at best serve as a sec­ondary source, that is, as a guide to cor­re­spond­ing pri­ma­ry sources. There­fore, in my view, the two for­mats are not tru­ly com­pa­ra­ble. Nev­er­the­less, it is of course excit­ing to observe how Wikipedia has devel­oped and what chal­lenges it is cur­rent­ly fac­ing – such as the calls for a com­pre­hen­sive reform, includ­ing more trans­paren­cy in fund­ing and in the author­ship of con­tri­bu­tions. That would cer­tain­ly be desir­able, as would efforts for more diver­si­ty and gen­der equal­i­ty with­in Wikipedia.

What is jour­nal­ism doing to com­bat fake news, agno­tol­ogy, manip­u­la­tion, and shrink­ing bud­gets to pre­serve its credibility?

Currently, new models and formats are being developed and tested all over the media system to keep journalism relevant and ensure its credibility. Therefore, it is not at all easy to answer the question in an overarching way. What I would like to address in this context is the user-based development of new products and formats. This focuses on the concrete needs of the audience so that editors have to deal more intensively with the reality of their audience’s lives. I am convinced that this approach can help media to secure trust, stay relevant and build good relationships with their readers. Especially in a digital public sphere, where journalism must always stand out from sometimes aggressive competitors, these relationships matter. And of course, all of this requires comprehensively researched, critically examined and well-prepared content – without it, even the smartest format cannot be successful.

Are people, the readers of news, losing their ability to distinguish between real and fake facts, rumours, opinions etc., given the constant flood of information?

We cer­tain­ly need new skills to fil­ter and clas­si­fy the infor­ma­tion we are con­front­ed with every day. Jour­nal­ists can help by not only pub­lish­ing their sto­ries, but also by mak­ing the process of their inves­ti­ga­tion trans­par­ent and explain­ing how they have come to their con­clu­sions. In my view, how­ev­er, we should also deep­en media edu­ca­tion, start­ing at schools, but even­tu­al­ly address­ing all age groups. This edu­ca­tion is impor­tant because many meth­ods and tools from the jour­nal­is­tic research process also help in every­day life, for exam­ple, when it comes to assess­ing alleged facts that are shared on social media plat­forms. If more peo­ple know how the work is being done in edi­to­r­i­al offices, this not only strength­ens trust in qual­i­ty jour­nal­ism, but also fos­ters their own judgement.

Is there a fight for opinions and truth taking place on the home front of journalism? How can journalism safeguard its independence in the digital age?

I see jour­nal­ism, as I have said before, as an insti­tu­tion that gives space to rel­e­vant social dis­cours­es and mod­er­ates them. In this respect, a com­pe­ti­tion of opin­ions and a crit­i­cal search for truths are both essen­tial parts of this sys­tem. I would­n’t call it a bat­tle – but when I think of the dig­i­tal net­work pub­lic sphere as a whole, there are clear ten­den­cies towards manip­u­la­tion and polar­i­sa­tion, and per­son­al attacks have become stan­dard in many dig­i­tal dis­course spaces, espe­cial­ly against minori­ties. This ter­rain there­fore seems more prob­lem­at­ic to me, espe­cial­ly if it is more dif­fi­cult to influ­ence it in a con­struc­tive direc­tion. But since jour­nal­ism is a rel­e­vant actor in this sphere, we nat­u­ral­ly have a shared respon­si­bil­i­ty and must do our part. How­ev­er, this only works if we main­tain our inde­pen­dence, and this chal­lenge is indeed even more com­plex in the dig­i­tal sphere. It is not enough to sim­ply pub­lish arti­cles and sup­ply them to sub­scribers. Instead, we have to deal with a com­plex world of diverse plat­forms, actors and con­stant­ly chang­ing con­texts. In par­tic­u­lar, if we want to reach new, younger tar­get groups, we have to be present where they are active. It is thus cru­cial to be clear­ly recog­nis­able – and to empha­sise the val­ues of an inde­pen­dent, crit­i­cal, val­ue-ori­ent­ed journalism.

Prof. Elmer, thank you for sharing your reflections on the changes in journalism.

Thank you, Dr Cal­daro­la, and I look for­ward to read­ing your upcom­ing inter­views with rec­og­nized experts, delv­ing even deep­er into this fas­ci­nat­ing topic.

About me and my guest

Dr Maria Cristina Caldarola

Dr Maria Cristina Caldarola, LL.M., MBA is the host of “Duet Interviews”, co-founder and CEO of CU³IC UG, a consultancy specialising in systematic approaches to innovation, such as algorithmic IP data analysis and cross-industry search for innovation solutions.

Cristina is a well-regarded legal expert in licensing, patents, trademarks, domains, software, data protection, cloud, big data, digital eco-systems and industry 4.0.

A TRIUM MBA, Cristina is also a frequent keynote speaker, a lecturer at St. Gallen, and the co-author of the recently published Big Data and Law now available in English, German and Mandarin editions.

Prof. Christina Elmer

Christina Elmer is a Professor for Digital and Data Journalism at the Technical University of Dortmund. Previously, she worked at DER SPIEGEL magazine, where she expanded the data journalism department, headed the online section as a member of the editorial board and was responsible for core components of editorial product development. Elmer studied Journalism and Biology and is a board member of netzwerk recherche, Germany’s largest association supporting investigative reporters.
