3.4. Data and algorithms in journalism

3.4.1. Data and algorithms in journalism

Another field heavily affected by the use of data and algorithms is journalism. Journalists and editors face new opportunities to measure the success and reach of their pieces: they can count clicks on their own sites and draw on interaction metrics provided by the platform companies they use to steer audiences to news pieces. Algorithms also play a role in the way people are steered toward journalistic content. Through their increasing dependence on digital platforms - like Facebook, Instagram, or Google - news media are subject to the algorithms these platforms use to prioritize or de-prioritize journalistic pieces. Figuring out the mechanisms behind these algorithms and adapting one's pieces accordingly may therefore come to matter. Finally, and potentially most disruptively, algorithms are increasingly used to produce journalistic pieces automatically, raising fears about the future of journalism as an occupation. Journalism thus offers a promising window into the field-specific impact and transformations driven by the increasing availability and use of data and algorithms.

Let's now have a look at two of these processes: the introduction of digital metrics in journalism and the potential for the automated production of content.

Journalists have always looked to metrics documenting their reach and impact among audiences. Yet, measures of old, such as number of copies sold or viewership numbers, were imprecise and often offered limited insights. Still, they served as powerful markers, determining the ad prices news organizations could charge or the fate of programs or journalists. The advent of digital metrics has amplified both the availability of and the reliance on audience metrics.

News organizations now have detailed measurements of online traffic on their digital products, such as websites, apps, or multi-media content. Beyond this, they also have access to third-party metrics documenting the usage frequency and reach of their content. These can be metrics provided by companies running digital platforms that news organizations use to publish, advertise, or announce their content on, such as Facebook, Google, Instagram, Spotify, or YouTube. These can also be cross-platform metrics documented by monitoring software, often called dashboards. These dashboards provide journalists and editors with a view of what people on digital media are talking about, sharing, or clicking on at any moment in time. They also provide another view on the relative success of their own content compared to that of the competition. The contemporary information environment - or at least its digital component - is therefore much more visible to participants and contributors than past information environments. This has consequences for the production of news, not all of them beneficial.

As is known for other metrics, once click rates become a factor in rating the success of items or journalists, journalists can be expected to start writing toward what they perceive to be click drivers, and editors to start commissioning items with such features. Clicks might then become a metric that journalists and editors try to game. This in turn might affect the type of content news outlets produce and that we come to see. For example, click-oriented news-making might favor controversial opinion pieces over well-reasoned or balanced reports.

Besides writing up topics in a reader- or algorithm-friendly way, this can also lead to the emergence - or at least reinforcement - of herd journalism. When the majority of editorial desks share the same social media monitoring dashboard, all editors see the same topics trending at the same time. To capture some of the attention trending topics attract, editors assign journalists to write quick online pieces, and pieces on the social media topic of the minute appear across a majority of outlets. This then gives the topic the appearance of importance. After all, why would so many outlets write about this? These dynamics, and the pressure on online news media to quickly cover trending topics, have been exploited by controversial actors. Reportedly, this dynamic was a crucial feature in the successful media exposure Donald Trump could gain during the Republican primaries for the 2016 US presidential election. By posting controversial tweets, Trump managed to routinely trend on Twitter, which in turn guaranteed him media attention and thus allowed him to hijack the political agenda during the primaries.

Another problem emerges from the fact that digital audiences and their engagement are comparatively easy to measure. The opposite is true for traditional non-digital audiences. But, as many studies have shown, digital news users differ from those who choose to engage with news in traditional ways. And topics that play strongly on social media do not necessarily correspond with the interests and concerns of the general public. Accordingly, by relying on what is easy to measure and easy to see, editors and journalists might over-rely on their digital audiences and lose sight of their traditional audience. At the same time, by looking at digital dashboards they might start closely following the interests and concerns of politically vocal online users, while losing sight of those of the general public.

The easy availability of digital information might entice news organizations to imprison themselves in a metric cage. Since information about digital news audiences abounds, it is easy to turn it into metrics. Whether these metrics are useful to the business or normative goals of a news organization is a different question, of course. But, as we have encountered throughout this section, availability of data and metrics often trumps the effortful work of interpretation and validation. Similar dangers lurk with the use of digital metrics by news organizations.

While the use of metrics based on information on digital news audiences is straightforward, there are further points where data and algorithms might transform journalism even more fundamentally. One is the automated production of news items. The romantic notion of plucky journalists being out on their beat covering the goings-on of corrupt politicians still lurks in the back of many minds. Just watch the movie All the President's Men (1976) next time it is on, to get an overdose of this notion. But in fact, like many white-collar jobs, journalism consists in many respects of routine tasks. Tasks an algorithm can be taught to replicate.

At their core, many tasks in journalism mean taking information about events in the world and translating it into the standard template of a report. Yes, I know, this excludes your favorite in-depth coverage of a topic of interest or the meaningful and deeply reasoned opinion piece, but those types of stories are the minority if you look at the output of news organizations. Much of news coverage is the routine coverage of events, think sports or financial news. These types of stories pick up on factual events and translate them into a fixed template. This makes them promising objects for automation.
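
To make the idea of a fixed template concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any system actually used in newsrooms: the data structure MatchResult, the function write_report, the team names, and the wording of the sentences are all hypothetical. The point is merely that structured facts about an event go in and a formulaic sentence comes out.

```python
from dataclasses import dataclass


@dataclass
class MatchResult:
    """Structured facts about one match (hypothetical example data)."""
    home: str
    away: str
    home_goals: int
    away_goals: int


def write_report(result: MatchResult) -> str:
    """Translate the facts of a match into one of three fixed sentence templates."""
    if result.home_goals > result.away_goals:
        return (f"{result.home} beat {result.away} "
                f"{result.home_goals}-{result.away_goals} at home.")
    if result.home_goals < result.away_goals:
        return (f"{result.away} won {result.away_goals}-{result.home_goals} "
                f"away at {result.home}.")
    return (f"{result.home} and {result.away} drew "
            f"{result.home_goals}-{result.away_goals}.")


report = write_report(MatchResult("Team A", "Team B", 2, 1))
print(report)  # Team A beat Team B 2-1 at home.
```

Everything that makes a report worth reading beyond the bare facts, such as context, judgment, or a telling quote, is absent from this sketch. That is precisely where the debate about automation begins.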

It remains to be seen how different experiments in the integration of algorithmically produced content or content snippets in the workflow of journalism evolve. But algorithmically produced news content challenges the ethos and public image of journalists at its core. Any advances in this area are therefore likely to be highly contentious but at the same time also provide promising objects for research.

So much for the big picture. Let's now move to empirical attempts at documenting these changes and making their effects measurable.

3.4.2. The use and perception of metrics in newsrooms

Let's start by having a closer look at how newsrooms deal with metrics and how journalists and editors adjust their behaviors according to metrics available to them. To get a better sense of this, we need to leave our higher plane of theoretical and abstract considerations and examine actual newsroom practices closely. Lucky for us, this is exactly what the sociologist Angèle Christin has done in her article "Counting Clicks: Quantification and Variation in Web Journalism in the United States and France".

Between 2012 and 2013, Christin conducted an ethnographic study in the newsrooms of two online news organizations, one based in the USA, the other based in France. Christin is interested in the use of metrics in digital news organizations for two reasons. First, web-based journalism is on the frontline of the work with new digital metrics. Everything web-based journalism produces in digital communication environments is quantifiable. How many people clicked on a headline? How many people read the piece? How many linked to it or commented on social media? In principle, digital newsrooms can quantify every detail of the interaction around their items and the work of individuals. Their approach to working with digital metrics might therefore prove highly instructive.

Second, Christin sees web-based journalism as a natural outgrowth of traditional print journalism. In the past, print journalism was seen as somewhat more independent of metrics than other forms of journalism, such as television news. This was due to readership numbers being relatively coarse and difficult to break down to individual items or the work of individual journalists. The adjustment to the staggering amount and detail of available metrics in digital communication environments might make for a more interesting conflict in the comparatively metric-averse print newsroom culture than in one more accustomed to metrics, like television.

Christin also has good reasons for comparing newsroom practices in France and the USA. Journalism in the USA is strongly dependent on market forces, news organizations being primarily business ventures. At the same time, US journalism is also characterized by a strong tendency toward professionalization. This means that journalists, editors, and news organizations tried to develop, uphold, and police a shared standard of ethical and quality assurance practices - the principle of objectivity being prominent among them. They did so in order to differentiate themselves from less reputable competitors. In contrast, journalism in France has been found to be less dependent on the market, and French journalists to be less concerned with objectivity but more open to politicizing news. These differences are likely to drive divergent uses of metrics. In the words of Christin:

"The direction that such differences might take, however, is hard to predict. Since French journalism is more protected from market pressures than American journalism, French journalists might be more critical of - or at least indifferent to - traffic numbers compared to their American counterparts. Yet if we were to focus on relative level of professionalization in the two countries, we should instead expect French journalists to be more vulnerable than American journalists to the growing influence of traffic numbers, since they do not have the same professional buffers protecting them against this new form of commercial interference. As the rest of this article shows, both sets of expectations in fact end up shaping newsroom dynamics, although in paradoxical ways."

[Christin, 2018], p. 1392.

So how does Christin go about finding out how metrics affect the work in both countries? She spent time in both newsrooms as an observer. She shadowed journalists during their workdays, observing what they did and asking questions. She also sat in on editorial meetings and took notes during the process. Additionally, she interviewed selected staffers, editors, and journalists in both places. This allowed her to get a sense of working practices herself, while also having access to the reflections of newsroom members on their practices in their work with metrics. If you are interested in conducting ethnographic research yourself, make sure to have a look at the method section of her paper [Christin, 2018], p. 1392-1394, to get a better understanding of her approach. But now, what did she find out?

Both companies were using the same analytics software to provide them with metrics about how their content was doing. But employees in the two companies were using metrics differently. First, Christin focuses on news editors. In the US, she finds editors to place high trust in metrics. They treat them as "fact" or "evidence". They have no qualms about using web-based metrics in both editorial and managerial decisions, cutting content or sections if traffic is lacking, and openly incentivizing staff to improve these metrics. In contrast, Christin found the French editors to be more skeptical about the value of metrics. Some editors were even consciously ignoring the metrics provided by their analytics tool in order not to succumb to what they felt was a corrupting influence. Still, editors relied on the analytics tool in order to decide which articles to display prominently on their site. But, and this was a crucial difference, in contrast to their US counterparts, French editors did not send regular updates to journalists about the latest top-10 article lists or click counts.

Journalists in the two countries also reacted to metrics differently, but the pattern was the reverse of that found among their editors. This time, US journalists were skeptical about metrics - sometimes even treating them as a game - while their French colleagues were treating them very seriously.

To explain these seemingly contradictory findings, Christin places what she found about the use of metrics in both newsrooms in the historical and institutional contexts of journalism in France and the USA. In the US news outlet:

"(...) editors thus felt responsible for the commercial success of the publication. They paid close attention to traffic numbers, repeatedly asking the journalists for help in achieving their economic objectives. The incorporation of web analytics into this bureaucratic form of management provoked a counterreaction among the staffers, who quietly engaged in passive resistance. Staffers drew on their professional ethos of editorial excellence and independence to shield themselves from the editors’ demands. Hence, despite editors’ pressure to attract more traffic, staffers in the newsroom remain relatively buffered from market forces: they could be fired if profits are too low, but they do not have to care about metrics in their daily work. Web journalists (...) are able to adopt a distanced attitude toward metrics, but only because editors bear the responsibility of maximizing traffic numbers."

[Christin, 2018], p. 1409.

In contrast, she found editors in the French news outlet to:

"(...) define their primary role as publishing "important" pieces, even when they know that such pieces are not popular with the readers, which is consistent with the ambition prevalent among French intellectuals to provide political and cultural guidance to a broad public. Yet (...) editors could not help but realize over time that traffic was essential for the survival of the website, given the renewed pressure from its parent company to get more traffic. Contrary to (...) editors [in the US outlet], however, they refused to change their editorial line or fire employees; they did not shield their staffers from the pressures they were experiencing. The absence of specialization between editors and staffers in turn left both groups ill prepared to handle the strain of having to maximize traffic numbers. Over time, metrics became integrated into a broader - disciplinary - form of anxiety about their individual and collective performance in the competitive market for online news."

[Christin, 2018], p. 1409.

With this contextualization of the reaction of editors and journalists in both countries, Christin shows that metrics clearly shape practices in newsrooms in both countries. But the way they shape them depends on the historical and institutional contexts, which in turn influence the way editors and journalists fulfill their roles and accordingly the way economic pressures are channeled or distributed within the organization. Back to Christin:

"Against today's dominant rhetoric of unbound technological change, future research should study how institutional forces as canonical as professional norms, work practices, and organizational dynamics shape the impact of digital technologies on the social world."

[Christin, 2018], p. 1411.

Christin's study is highly instructive on how usage practices of metrics differ between the two countries depending on their newsroom cultures. This shows that the same type of metric does not necessarily create the same sort of reaction but instead that usage practices vary across contexts. At the same time, the study offers an instructive template for ethnographic work interested in identifying individual usage practices and organizational processes in the adaptation to data, metrics, or algorithms.

So, now we know how to study the reactions within organizations to metrics and algorithms. But how about their effects on audiences? For this, we turn to our second study.

3.4.3. Effects of metrics on audiences

In the previous section, we have had a look at a study examining the effects of metrics on newsrooms. Now, we will examine a study that looked at their effects on audiences.

Digital communication environments are littered with public metrics. We see the number of likes a Facebook post received, the number of times a tweet was retweeted, or the number of times a song was streamed on Spotify. This makes metrics accessible not only to professional users of analytics dashboards or platform staffers. It also turns metrics into public signals about the popularity of content in digital communication environments. Like any public signal, these metrics serve as orientation for people and can be expected to influence behavior. For an example of how to go about finding out if and how metrics shape behavior, we can turn to a study by the sociologists Matthew J. Salganik, Peter Sheridan Dodds, and Duncan J. Watts.

In their study "Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market", Salganik and colleagues test in an experiment the influence of metrics on the behavior and preferences of people in an artificial music market. The authors recruited 14,341 participants predominantly from a teen-interest website and assigned them randomly to two conditions.

Some participants were shown a list of unknown songs from unknown bands. They then were asked to rate songs from one star ("I hate it") to five stars ("I love it") and given the opportunity to download the song. This was the independent condition. People were free to interact with content based on its own merits without being influenced by public metrics.

Other participants were assigned to an influence condition. Here, participants had the same options as those in the first group. But unlike the first group, they were shown the download counts for each song. Additionally, participants in the influence condition were split into eight independent groups. The authors did this in order to obtain several separate estimates of the influence of metrics that do not hinge on contingent developments in any one of the eight groups. Thus, the authors can experimentally measure the impact that publicly available metrics have on the behavior of users.

With this set-up, the authors ran two experiments that differed in the presentation of songs. The second experiment differed from the first in that it arranged songs in a ranking based on their download counts. This reinforced the influence signal contained in the download metric through spatial arrangement. Accordingly, the influence of metrics should be most pronounced in the second experiment. So, what did the authors find?

"In both experiments, we found that all eight social influence worlds (...) exhibit greater inequality - meaning popular songs are more popular and unpopular songs are less popular - than the world in which individuals make decisions independently. (...) we also note that inequality increased when the salience of the social information signal was increased from experiment 1 to experiment 2. Thus our results suggest not only that social influence contributes to inequality of outcomes in cultural markets, but that as individuals are subject to stronger forms of social influence, the collective outcomes will become increasingly unequal."

[Salganik, Dodds, and Watts, 2006], p. 855.
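
To get a feeling for the mechanism, we can play with a toy simulation in Python. It is emphatically not the authors' procedure or data: the number of songs and listeners, the randomly drawn "appeal" of each song, and the rule that listeners in influence worlds pick songs in proportion to appeal times (1 + current downloads) are all assumptions made for illustration. The sketch compares one independent world with eight influence worlds, using the Gini coefficient as one common measure of inequality.

```python
import random


def gini(counts):
    """Gini coefficient of non-negative counts (0 means perfectly equal shares)."""
    total = sum(counts)
    n = len(counts)
    if total == 0:
        return 0.0
    ordered = sorted(counts)
    # Rank-weighted sum over the ascending values (ranks start at 1).
    weighted = sum((rank + 1) * x for rank, x in enumerate(ordered))
    return 2 * weighted / (n * total) - (n + 1) / n


def simulate_world(appeal, n_listeners, social_influence, rng):
    """Let each listener download one song; return download counts per song."""
    downloads = [0] * len(appeal)
    for _ in range(n_listeners):
        if social_influence:
            # Assumed rule: visible download counts boost a song's chances.
            weights = [a * (1 + d) for a, d in zip(appeal, downloads)]
        else:
            # Independent condition: only the song's own appeal matters.
            weights = appeal
        choice = rng.choices(range(len(appeal)), weights=weights, k=1)[0]
        downloads[choice] += 1
    return downloads


# All parameters below are illustrative choices, not values from the study.
N_SONGS, N_LISTENERS, N_WORLDS = 48, 2000, 8
appeal = [random.Random(0).uniform(0.5, 1.5) for _ in range(N_SONGS)]

gini_independent = gini(simulate_world(appeal, N_LISTENERS, False, random.Random(1)))
gini_influence = [
    gini(simulate_world(appeal, N_LISTENERS, True, random.Random(100 + world)))
    for world in range(N_WORLDS)
]

print(f"independent world: {gini_independent:.2f}")
print("influence worlds: ", [f"{g:.2f}" for g in gini_influence])
```

Under these assumptions, the influence worlds should come out markedly more unequal than the independent world, and which songs end up on top can differ from world to world even though the underlying appeal is identical. Whether real listeners behave like this rule is, of course, exactly what an experiment is needed to establish.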

For details on the procedure and further results, make sure to check out the study and the subsequent pieces by the authors [Salganik, Dodds, and Watts, 2006; Salganik and Watts, 2008; Salganik and Watts, 2009]. But for our purposes, these original findings already suffice. The study by Salganik and colleagues shows that metrics matter for audience behavior. Publicly available signals about the behavior of others clearly influenced users in their own behavior. Those signals clearly matter for users in digital communication environments.

In their study, Salganik and colleagues creatively translate a potential mechanism for the effects of publicly visible metrics on audience behavior into testable experiments. This makes the study an informative template for future work on the effects of the design of digital communication environments on user behavior, in general.