Journalism involves the search for and critical analysis of information.18 How journalists discover and select sources of this information is important to avoid bias, to be credible and trusted, and to create angles with which to generate new stories of value to readers.
- Journalists identified more with digital tools to support them to discover and generate new angles on stories more quickly than now—tools that recognized and augmented their existing creativity skills.
- Different creative search algorithms applied to news information operationalized the strategies for discovering new angles on stories reported by experienced journalists.
- Evaluations of the INJECT digital tool in three newsrooms revealed it increased the novelty of stories written by journalists, but younger journalists more open to new technologies and working more autonomously were more likely to use the tool.
Journalist creative thinking, to discover and generate new associations during this search and analysis of information, contributes to the generation of new stories. Journalists are known to seek opportunities to develop new creative skills with which to discover information.17 Applying these skills enables journalists to maintain control over their work.25 And emerging forms of investigative journalism demand new creative search and association skills.10
However, discovering and examining information sources about complex stories takes time—time that journalists increasingly lack as news organizations reduce staff numbers.22 The digitalization of news production and consumption has led many news businesses to become uncompetitive. Some work practices are slow to change due to conflicts with journalist professional values for autonomy.8 Therefore, as coping strategies, journalists often use subsets of available and familiar information sources to create stories, which in turn can reduce the diversity of angles used to report stories.
Although journalism is one of the creative industries, explicit support for the creative skills of journalists is rare. For example, it is not one of the five journalist capabilities reported in Cohen et al.,4 and few digital tools support journalist creativity.
INJECT was a new digital tool designed to support journalists to discover new associations with which to generate stories with angles more novel and valuable than stories published previously. It integrated creative search algorithms with which to discover information in published news stories and interactive support to form new associations with this information during creative thinking. It was designed to contribute to journalist engagement in professional-level creative work, that is, work that generated income and provided a living.12 This work included writing stories that were creative, that is, judged to be novel and valuable14 through the application of the creativity skills1 of the journalists.
Existing Creativity Support Technologies for Journalists
Digital tools that enhance the creativity skills of journalists are rare. One exception was the Story Discovery Engine, which used artificial intelligence algorithms for investigative reporting.3 The Tell Me More system mined the Web for similar stories reported by different sources and extracted text that offered new information in the form of quotes, actors and figures.11 Both of these tools had similar objectives to INJECT, but were not framed as creativity support tools for journalists. The SocialSensor news app surfaced fast moving trends from social media content, but revealed biases arising from such content.23 Many different data visualization tools support journalists to make sense of, for example, social media content.6 However, none supported human creative thinking to discover new angles on news stories.
To work around the lack of bespoke tools for information search and analysis tasks, many journalists use generic search tools such as Google,13 but these lack explicit support for human creative thinking about news stories.
Unlike in journalism, digital creativity support has been implemented for professionals in other creative industries, such as the performing arts, music, and film and television. Examples of the digital support include Story-Crate, a collaborative editing tool developed to drive users’ creative workflows within a location-based television production environment2 and Trigger Shift, which appropriated information technologies into performance art in theater.21 Other domains for which digital creativity support has been developed include theatre, scientific discovery, and caring for older people.
Therefore, to fill the gap in journalism, new digital creativity support was developed that aligned with the work practices, tools and values of journalists. The resulting tool was called INJECT.
Designing INJECT with Journalists
INJECT’s design was informed by established cognitive models of creative thinking. Most of these models describe dual processes of developing and evaluating ideas to generate outcomes that are both novel and valuable.12,14 Developing ideas is a divergent and associative process that can be spontaneous and deliberate, and involves retrieving relevant items from memory and generating associations with new information.9 By contrast, evaluating ideas is more analytic, but can be interleaved tightly with developing ideas.7
Therefore, INJECT was designed to support journalists to discover new information, generate associations between this information and items from memory to discover new angles on news stories, and evaluate these angles quickly during story development.
To align INJECT to these work practices, tools, and values, journalists were included in the tool’s design. Interviews were held with experienced and inexperienced journalists to discover problems, requirements, and constraints. Paper-based then digital wire-frames of the INJECT tool were developed and presented to professional journalists. New releases of the working INJECT software were evaluated for their usability and impact with professional and student journalists.
The user-centered design process uncovered three important values that most journalists held about their work—values the INJECT tool was designed to uphold.
The first value was the importance of discovering information already reported in verified newspapers, as opposed to in unverified sources, as the starting point for discovering new angles on stories. Even though it was argued that published news might constrain their creative thinking, most journalists expressed a preference for it to direct the discovery of new associations and angles. Feedback on prototypes revealed three specific types of verified news information were effective for discovering angles: 1) published news stories similar but not the same as the new story being written; 2) entities such as people, places, and organizations that might relate to these new stories; and 3) guidance for directed creative thinking to develop the stories. INJECT was designed to direct journalists to generate new associations between information discovered in similar stories and in entities referenced in these stories.
The second value was to recognize the existing creativity skills of journalists. Many who engaged in INJECT’s design initially rejected the need for digital support for their creative thinking. After all, journalism is one of the creative industries, and many chose it as a profession to be creative. Instead, journalists identified with the need to generate new angles more quickly.
The third value was that creative thinking was not separate from but part of everyday journalistic work. Indeed, journalists sought support for more original journalism that was embedded in daily work tasks and tools, such as text editors.
The INJECT Tool
The INJECT tool was implemented with natural language processing, multi-language creative search, and interactive creativity support capabilities. It indexed content from millions of verified stories published by hundreds of news titles in multiple languages in order to provide journalists with a sufficiently large external information source from which to discover associations.
The INJECT tool’s three-tier architecture is shown in Figure 1. The interaction layer was a sidebar designed to be simple, fit with existing work practices, and encourage journalists to discover new angles on stories quickly without learning new skills.
The application layer was composed of services designed to generate large numbers of possible associations between information that journalists were writing about using indexed news content from millions of already published news stories.
These services retrieved this content from INJECT’s data layer, called the creative news index, which was designed so the discoverer service could undertake divergent creative searches that were more sophisticated than were possible with existing Web search and news site APIs. The index was populated by the presser service, which indexed millions of verified news stories as possible starting points for discovering new angles on stories. The text processor service was invoked by the presser to make sense of and to generate indexed content from published news, and by the discoverer to expand creative search queries.
INJECT’s presser. The presser generated indexes of millions of verified news stories that could be retrieved, on request, as starting points for journalists to discover associations with which to generate new angles on stories. It had a crawler component that fetched news stories to index from open RSS feeds, and an importer component that fetched stories from accessible newspapers’ archives.
The crawler was directed to fetch verified news stories from 1,105 pre-defined RSS feeds published by 380 diverse news titles in six languages. These feeds, titles and languages were selected by INJECT’s editorial team to generate indexes of diverse views and angles on news, and ranged from major daily newspapers in the U.S. to regional newspapers in the Netherlands and tabloid titles in the U.K. On a normal news day, it fetched about 15,000 verified stories. Stories from high-frequency feeds were fetched every 30 minutes, others every 12 hours. During each fetch cycle, the crawler automatically read all news stories accessible via the URLs in each RSS feed, removed navigation links, adverts and embedded media such as links, images, and videos, and sent the remaining text string, along with the story’s author, URL, image URL, and published date, to the text processor service. This text string, author, date, and URLs provided a rich external information source with which journalists could discover and generate new associations and angles on news stories.
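As a rough illustration of one fetch cycle, the sketch below parses an RSS 2.0 feed and reduces each item to a text string plus metadata. It is not INJECT's code. The function names, the list of stripped tags, and the use of each item's description field in place of the full story page (which the real crawler read via the item's URL) are assumptions made to keep the example self-contained and free of network calls.

```python
import re
import xml.etree.ElementTree as ET
from html import unescape

HIGH_FREQ_INTERVAL_MIN = 30        # from the text: high-frequency feeds
LOW_FREQ_INTERVAL_MIN = 12 * 60    # from the text: all other feeds


def fetch_interval_minutes(high_frequency: bool) -> int:
    """Return how often a feed should be polled, in minutes."""
    return HIGH_FREQ_INTERVAL_MIN if high_frequency else LOW_FREQ_INTERVAL_MIN


def strip_markup(html: str) -> str:
    """Crude cleaner: drop navigation/media blocks, then all remaining tags."""
    # The tag list is an assumption, not INJECT's actual rule set.
    html = re.sub(r"(?is)<(script|style|nav|figure|iframe)\b.*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", unescape(text)).strip()


def parse_feed(rss_xml: str) -> list[dict]:
    """Turn each RSS <item> into the fields passed on for text processing."""
    root = ET.fromstring(rss_xml)
    stories = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")  # often carries the image URL
        stories.append({
            "text": strip_markup(item.findtext("description", default="")),
            "author": item.findtext("author", default=""),
            "url": item.findtext("link", default=""),
            "image_url": enclosure.get("url", "") if enclosure is not None else "",
            "published": item.findtext("pubDate", default=""),
        })
    return stories
```

A scheduler would call `parse_feed` on each feed at the interval given by `fetch_interval_minutes` and forward the resulting dictionaries to the text processor.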
The importer component was similar to the crawler but was directed to fetch stories from local JSON files. It was developed to address the need of news organizations to use their own stories as starting points for more original journalism. Like the crawler, it also sent a text string, along with author, URL, image URL, and published date, to the text processor service.
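The importer's job can be pictured with the short sketch below, which reads every JSON file in a folder and emits the same five fields the crawler produced. The folder layout, the field names, and the choice to accept either a single object or a list per file are assumptions; the article does not describe the archive format.

```python
import json
from pathlib import Path

# Field names are hypothetical; they mirror the fields listed in the text.
STORY_FIELDS = ("text", "author", "url", "image_url", "published")


def import_stories(folder: str) -> list[dict]:
    """Load archived stories from local JSON files in a folder."""
    stories = []
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            records = json.load(handle)
        if isinstance(records, dict):   # a file may hold one story or many
            records = [records]
        for record in records:
            # Missing fields become empty strings so downstream code
            # can rely on a fixed shape.
            stories.append({name: record.get(name, "") for name in STORY_FIELDS})
    return stories
```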
INJECT’s text processor. The text processor service generated new entries to add to the creative news index by analyzing the natural language text string of each fetched news story with the following:
- Named entity extraction mechanisms to index stories using real names such as people and places. The mechanisms treated candidate named entities as groups of consecutive words describing a concept such as a person (for example, Tawakkol Karman), location (for example, Sana’a), organization (for example, United Nations) or object (for example, war crime). This enabled the processor to extract entities with which journalists might discover associations not described in the text, for example the entity Sana’a from the text the capital of Yemen. After experimentation with alternatives, the processor invoked the DBpedia Spotlight5 and Polyglot20 services. Spotlight annotated mentions of DBpedia resources using entity detection and disambiguation algorithms with adjustable precision and recall, which were used to refine INJECT’s sensitivity to news content using measures such as entity prominence, topical pertinence, and disambiguation confidence. Polyglot implemented named entity extraction, part-of-speech tagging, sentiment analysis, morphological analysis, and transliteration in all of INJECT’s six target languages: English, German, Dutch, French, Italian, and Norwegian. It could detect, for example, that Forente Nasjoner is Norwegian for the entity United Nations, the international organization founded in 1945;
- Automatic parser mechanisms that detected nouns and verbs to index stories using common objects and actions. The parsers split news text into sentences then applied part-of-speech tagging to mark up words as belonging to lexical, part-of-speech categories. Shallow parsing was applied to generate a machine understanding of the structure of a sentence without parsing it fully into a parse tree. The output was a division of the text’s sentences into a series of words that, together, constituted a grammatical unit. To select candidate objects and actions from these units with which journalists might also discover associations, the mechanism applied lexical extraction heuristics to each sentence once it had been tagged with syntax structure rules. For example, the processor parsed the news headline The Yemen war is the world’s worst humanitarian crisis to extract nouns such as war, world and crisis.
INJECT’s creative news index. For each fetched story, the creative news index generated a new entry composed of all extracted named entities, objects and actions and frequencies of occurrence, the author, URL, image URL, and publication date. A typical entry for a news story of 400 words was composed of between 30 and 50 entities, objects, and actions. Early prototyping of the INJECT tool revealed that indexes with this volume and type of content were sufficient to generate new associations that journalists reported could be effective for discovering new angles.
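One way to picture an index entry is as a small record holding term frequencies alongside story metadata. The sketch below is illustrative only: the class and field names are assumptions, and the real index schema is not given in the article.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    """Hypothetical shape of one creative news index entry."""
    url: str
    author: str
    image_url: str
    published: str
    entities: Counter = field(default_factory=Counter)  # named entities
    nouns: Counter = field(default_factory=Counter)     # objects
    verbs: Counter = field(default_factory=Counter)     # actions

    def term_count(self) -> int:
        """Number of distinct entities, objects, and actions indexed."""
        return len(self.entities) + len(self.nouns) + len(self.verbs)


def build_entry(meta: dict, entities, nouns, verbs) -> IndexEntry:
    """Combine story metadata with the frequency of each extracted term."""
    return IndexEntry(
        url=meta["url"],
        author=meta["author"],
        image_url=meta["image_url"],
        published=meta["published"],
        entities=Counter(entities),
        nouns=Counter(nouns),
        verbs=Counter(verbs),
    )
```

For the 400-word story described above, `term_count()` would typically fall between 30 and 50.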
All index entries were uploaded to an external Elasticsearch cluster to be manipulated by the discoverer’s creative search algorithms. Elasticsearch is a scalable open source search engine with a REST API that provides scalable, near real-time search. This performance was essential to support journalists to discover new angles on stories more quickly. In April 2020, the Elasticsearch cluster held over 17 million entries, with another 350,000 new entries being added each month.
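Uploading entries in batches would typically go through Elasticsearch's bulk endpoint, which accepts newline-delimited JSON: one action line followed by one document line per entry. The sketch below only builds that request body. The index name passed in and the use of the story URL as the document ID are assumptions, and the article does not say whether INJECT used the bulk endpoint at all.

```python
import json


def bulk_payload(index_name: str, entries: list[dict]) -> str:
    """Build an NDJSON body suitable for POSTing to the _bulk endpoint."""
    if not entries:
        return ""
    lines = []
    for entry in entries:
        # Action line: index this document; the URL as _id (an assumption)
        # means re-fetching a story overwrites rather than duplicates it.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": entry["url"]}}))
        # Source line: the entry itself.
        lines.append(json.dumps(entry, ensure_ascii=False))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"
```

The resulting string would be sent with the `Content-Type: application/x-ndjson` header.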
INJECT’s sidebar and discoverer. Journalists interacted with the INJECT sidebar to discover new associations and angles on stories. The sidebar was designed to provide journalists with index information and features and generate new associations with this information without opening another application. To work within the space constraints of the widget, the sidebar was implemented with mouse hover-boxes and information that journalists could use to discover associations quickly. Its design also supported journalists to flip quickly between ideation and evaluation processes during story development.
Figure 2 depicts use of the sidebar in the Google Docs editor to discover associations leading to angles for a new story about the Yemen humanitarian crisis. The sidebar was also implemented for WordPress, Adobe InCopy text editors, Google Chrome Web browser, and content management systems that use the TinyMCE text editor, as well as a separate Web application that a journalist could reshape as the sidebar.
If the journalist highlighted text in the editor, the sidebar invoked the text processor service to extract named entities (for example, Yemen), nouns and verbs (for example, crisis, ecological) as candidate topics to present at the top of the sidebar, see Figure 2. This feature was implemented to increase the sidebar’s usability and enabled journalists to work more quickly.
The journalist could then use the icons beneath these topics to select between six predefined creative strategies that mimicked the strategies of experienced journalists.15 These strategies were implemented in the discoverer service to retrieve creative news index entries with the following:
- (A) Quantified information associated with the topics;
- (B) Information about people associated with the topics;
- (C) Information about events associated with the background of the topics;
- (D) Information about future consequences associated with the topics;
- (E) Datasets and visualizations associated with the topics; and,
- (F) Comical information associated with the topics.
For strategies A-E, the discoverer:
- Disambiguated each noun topic term by discovering its correct sense in the online lexicon at WordNet using context knowledge from other terms in the query (for example, that crisis is an unstable situation of extreme danger or difficulty rather than a crucial stage or turning point in the course of something). It then expanded each term with other terms with similar meanings (for example, the term crisis is synonymous with exigency and flashpoint) and included these terms in the search query. Term sense disambiguation and query expansion were implemented to retrieve index entries that were different lexically but related semantically to the topic terms, so that journalists could generate new associations based on different types of semantic similarity;
- Invoked an Elasticsearch search via the news API with the expanded query terms and logic operators set by the journalist to control search breadth. Elasticsearch returned a set of indexed entries that achieved a threshold match score in response times acceptable to journalists;
- Scored the returned index entries for relevance based on the frequencies of original and expanded query terms in the title of each story, to prioritize entries with headlines related to topic terms. This scoring mechanism was implemented to reflect the structure of most news stories with the most important information at the start of stories;
- Filtered the scored index entries using constraints specified for the selected strategy, so that journalists were presented with information to form associations consistent with that strategy. For example, for quantified information (A), it filtered to retain entries with a minimum threshold of 100s of quantity, measure and value keywords, for example Sterling, population and actual numbers. For information about events associated with the background of the topic terms (C), it filtered to retain entries with more than 500 words of content and a minimum threshold of 100s of keywords indicative of background articles such as cause, impact and studies from sources such as the Economist and the New York Times. And for information about people (B), it generated orders of entries that reference a person entity named in a minimum number of entries.
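The expand, score, and filter steps above can be sketched in a few lines, here for the quantified-information strategy. This is a simplified illustration, not the discoverer's implementation: a small hand-written synonym table stands in for WordNet (so no sense disambiguation takes place), the Elasticsearch retrieval step is replaced by a list passed in by the caller, and the keyword set and threshold are invented for the example.

```python
import re
from collections import Counter

# Hypothetical stand-in for WordNet synonym lookups.
SYNONYMS = {"crisis": ["exigency", "flashpoint"]}

# Illustrative quantity/measure/value keywords; the real list is not published.
QUANTITY_KEYWORDS = {"sterling", "population", "percent"}


def expand_query(terms):
    """Add synonyms to the topic terms, preserving order without duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term, *SYNONYMS.get(term, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def title_score(title, query_terms):
    """Relevance = how often original or expanded terms occur in the title."""
    counts = Counter(re.findall(r"\w+", title.lower()))
    return sum(counts[term.lower()] for term in query_terms)


def is_quantified(entry, min_hits):
    """Keep entries with enough quantity keywords or literal numbers."""
    words = re.findall(r"\w+", entry["text"].lower())
    hits = sum(1 for word in words if word in QUANTITY_KEYWORDS or word.isdigit())
    return hits >= min_hits


def discover(entries, topic_terms, min_hits=2):
    """Expand the query, score titles, filter by strategy, rank by score."""
    query = expand_query(topic_terms)
    scored = [(title_score(entry["title"], query), entry) for entry in entries]
    kept = [(score, entry) for score, entry in scored
            if score > 0 and is_quantified(entry, min_hits)]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in kept]
```

Because the expanded query includes flashpoint, a story headlined with that word can be retrieved for the topic crisis even though the two share no surface term, which is the lexical-but-not-semantic difference the expansion step was meant to exploit.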
The discoverer sent JSON representations of each remaining index entry to the sidebar to display as a news card. By contrast, for strategy F, the discoverer generated simpler keyword queries that searched the caption text of over 60,000 political cartoons accessed by INJECT via an API from an external database.
This automation of information discovery was designed to enable journalists to commit more cognitive resources to generating associations and evaluating ideas. The sidebar presented the retrieved information as a scrollable sequence of news cards. Journalists could select the information to view using sidebar features to sequence the news cards by relevance, date of publication or random, and to present news published within selected periods.
Each news card in the sidebar presented the title, publication, date, first sentence, and 10 randomly selected entities, each shown in its own rectangle. Clicking on the title opened the original news story or cartoon, at source, in a new browser tab. Positioning the cursor over each rectangle presented a pop-up creativity spark generated for that place, thing, person, or organization. This feature was implemented as a mouse hover-over to enable journalists to explore multiple sparks and discover different associations quickly. The sparks themselves were designed to direct the deliberate generation of associations and ideas by journalists. Each was generated by the sidebar from a predefined set of spark types to direct journalists to think about, for example, the history and relevance of places, the motives of people and their opponents, the future and emotional impact of objects, and available data about organizations.
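A news card and its sparks could be assembled roughly as follows. The spark wordings are invented for illustration, loosely following the themes listed above; the article does not publish INJECT's predefined spark set, and the function and field names are likewise assumptions.

```python
import random

# Hypothetical spark templates keyed by entity type.
SPARK_TEMPLATES = {
    "place": ["What is the history of {name}?",
              "Why is {name} relevant to this story?"],
    "person": ["What motivates {name}?",
               "Who opposes {name}, and why?"],
    "thing": ["What is the future of {name}?",
              "What is the emotional impact of {name}?"],
    "organization": ["What data is available about {name}?"],
}


def make_spark(name, entity_type, rng):
    """Pick one template for the entity's type and fill in its name."""
    return rng.choice(SPARK_TEMPLATES[entity_type]).format(name=name)


def make_card(entry, rng, max_entities=10):
    """Build a card with up to 10 randomly selected entities and their sparks."""
    entities = list(entry["entities"])  # (name, entity_type) pairs
    chosen = rng.sample(entities, min(max_entities, len(entities)))
    return {
        "title": entry["title"],
        "publication": entry["publication"],
        "date": entry["date"],
        "first_sentence": entry["first_sentence"],
        "sparks": {name: make_spark(name, etype, rng) for name, etype in chosen},
    }
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible, which is convenient when testing a card layout.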
Figure 3 shows these features in three different INJECT sidebars presented for different angles using the same information about the Yemen humanitarian crisis.
Figure 3. Three INJECT sidebars presented for the new story about the Yemen crisis, showing (from left-to-right) information to support a background angle, a people angle, and a comical angle based on political cartoons.
The sidebar also presented other styles of news card showing only entities, word clouds, and sparks in list form. These other styles, shown in Figure 4, were added to reduce comparisons with Google search that reduced journalists’ expectations for creativity support.
Figure 4. Three INJECT sidebars presented for the same new story about the Yemen crisis in Norwegian, showing (from left to right) use of a background information angle, a people angle, and creativity sparks in list form.
Furthermore, to support journalists to evaluate as well as discover ideas, the sidebar launched Google Web searches in new browser tabs from within INJECT, to retrieve information with which to analyze and critique ideas, see Figure 5.
Figure 5. INJECT features to distinguish it from Google search, including a different form of presentation of creative ideas, additional digital capabilities, and a feature to launch Google search from INJECT with the topic terms, selected angle, and title of the retrieved news item.
The INJECT tool was tested by journalists working in multiple languages. When sufficiently robust, it was evaluated in different newsrooms.
Evaluating the INJECT Tool in Three Newsrooms
The INJECT tool was installed in the newsrooms of three regional newspapers in Norway to investigate the effectiveness of its creativity support. One research question explored was whether journalists produced news stories that were more novel and valuable with INJECT’s support.
INJECT was introduced into the daily work of four journalists in each of the three newspapers for two months in 2018, for use in Norwegian and English. The 12 journalists received INJECT training and helpdesk support and were encouraged by their editors to use INJECT. During the evaluation, the numbers of English-language entries in the creative news index increased from 2.7 million to 3.2 million and Norwegian-language entries from 260,000 to 300,000. The index also included 62,160 Norwegian-language articles from archives of the three newspapers generated by the importer component and INJECT also searched over 50,000 digital cartoons. The journalists used INJECT’s Web application version.
To investigate the research question, news stories produced by the journalists with and without the support of INJECT were rated by seven individuals with journalism expertise and/or knowledge of the regions of the three newspapers, see the sidebar “The Expert Judgment Process Used to Rate News Stories.”
INJECT was used in all three newsrooms. No major technical problems were reported. A total of 72 published stories were written with the support of INJECT by 10 of the journalists. Journalists used already-published news stories as effective starting points for new angles on stories. Based on the expert analysis, a Mann-Whitney test revealed that the novelty ratings were greater for the news stories written with the support of INJECT (Mdn=3) than without the support of INJECT (Mdn=2), U=6997.5, p<0.0001. INJECT use was associated with an increase in the novelty of news stories, albeit from ratings that indicated low novelty of most non-INJECT news stories.
In contrast, a second Mann-Whitney test revealed the value ratings were not greater for the news stories written with the support of INJECT (Mdn=5) than without the support of INJECT (Mdn=5), U=9156, p>0.05. The average value rating of all of the news stories was 4.7 out of 7, and the lowest and highest average ratings of individual articles were 3.71 and 5.86. This was unsurprising, given that all of the news stories had passed through editorial processes.
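For readers unfamiliar with the Mann-Whitney test, the sketch below shows how the U statistic is computed from two sets of ordinal ratings, with tied ratings sharing their average rank. The ratings in the usage note are made up for illustration and are not the study's data. The sketch also stops at U: it does not compute a p-value, for which a statistics library such as SciPy (`scipy.stats.mannwhitneyu`) would normally be used.

```python
def midranks(values):
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b); they always sum to len(a) * len(b)."""
    combined = list(sample_a) + list(sample_b)
    ranks = midranks(combined)
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b
```

With the illustrative ratings `[3, 4, 2, 3]` and `[2, 1, 2, 2]`, `mann_whitney_u` returns `(14.5, 1.5)`: the first group wins or ties in almost every pairwise comparison, which is the pattern a large U for one group reflects.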
Most of the journalists needed time to learn to use INJECT, and many reported comparisons to Google: “You need to adjust slightly, because we are used to search engines that give us the most popular hits.” INJECT use was related to journalist attitudes. Four younger journalists in one newspaper who were open to new technologies and worked more autonomously used INJECT more frequently. By contrast, more experienced journalists were less willing to adopt INJECT after the evaluation: “We seem to have certain stubbornness against using INJECT and other tools like it.”
The evaluation in the three newsrooms revealed the journalists did produce news stories that were more novel if not more valuable with support from INJECT. In fact, all were published, indicating sufficient value for purpose. One interpretation of this result was the stories written without the tool’s support had value but lower novelty, that is, the stories were not creative. Articles written with the tool’s support had increased novelty but not increased value. In a strict sense, these articles were more novel rather than creative, but still had sufficient value to publish.
More results are reported in Maiden et al.15
Demonstrating INJECT to other news organizations reinforced our judgment that digital support for journalist creative thinking is rare.2,11 However, its positive reception revealed the potential of INJECT to support journalist creative thinking.
To uphold the three journalist values uncovered during design, INJECT’s interactive support was separated from indexing published news. The sidebar design enabled journalists to access INJECT’s guidance in as few as two clicks, without leaving the text editor. It demonstrated how to establish digital support for creative thinking as part of journalists’ daily work tools, although more evaluations are needed.
INJECT’s creative news index is now an important asset of more than 500 million pieces of news information for computational manipulation. New computational analyses under development will detect patterns, biases and angles on news shown to be novel, and hence creative. One will analyze differences in topic reporting in different languages to generate angles underreported in a target language. Rolling out new INJECT versions with these features will support news businesses to remain competitive and fulfil their role in liberal democracies.
The research reported in this paper was supported by the EU-funded H2020 723328 INJECT innovation action.
Figure. Watch the authors discuss this work in the exclusive Communications video. https://cacm.acm.org/videos/digital-creativity