How does the automatic generation of stories work?

Robotic journalism - benefit or danger?

Journalists have always had to deal with new technologies. The typewriter replaced pen and paper. With the introduction of computers, these aids also began to gather dust. Now robot journalism is just around the corner. But the subject is not new.

What is robotic journalism?

The term robot journalism describes the creation of automatically generated texts on the basis of structured data.

Interestingly, the term is not entirely accurate, because no robots are actually involved in writing the texts. Rather, a machine with artificial intelligence and a programmed algorithm creates a text from a large number of data records. For the machine to do exactly this, a person must first program the underlying algorithm and, if necessary, train it. Without this human component, a robot would be of no help.

One advantage of automation in journalism is that routine, time-consuming work can be handed to a machine. This does not replace journalists but supports them in their tasks. Editors gain more time for research or creative work.

At the moment, machines in journalism mainly provide editorial assistance. They crawl the web and social media channels. From this volume of data they generate suitable building blocks, suggest thematically appropriate pictures, or create information graphics. News agencies already run live tickers using structured data and technical processes in the editorial office. Strictly speaking, however, this approach is only machine journalism. By definition, robot journalism is the correct term only when learning algorithms create semantically coherent reports without the involvement of authors.

How does automatic content production work?

Data is the prerequisite for readable digital content. The more data there is, the better programs with artificial intelligence become at writing texts. A central algorithm combines this data with predefined phrases. The person at the computer determines the building blocks of the later text by defining the familiar turns of phrase in advance. Besides the linguistic formulation, the correct use of statistical rules must also be defined.
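The principle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the football result, the field names, the phrases, and the three-goal threshold are all hypothetical and not taken from any real product.

```python
# Hypothetical structured data record; all field names are assumptions.
match = {"home": "FC Example", "away": "SV Sample", "home_goals": 4, "away_goals": 1}

# Phrases defined in advance by a person, keyed by the kind of result.
PHRASES = {
    "clear_win": "{winner} beat {loser} comfortably, {score}.",
    "narrow_win": "{winner} edged past {loser}, {score}.",
    "draw": "{home} and {away} drew {score}.",
}


def classify(goal_difference: int) -> str:
    """Statistical rule: map the goal difference to a phrase key."""
    if goal_difference == 0:
        return "draw"
    # Assumed threshold: a margin of three or more goals counts as clear.
    return "clear_win" if abs(goal_difference) >= 3 else "narrow_win"


def generate_sentence(record: dict) -> str:
    """Combine the data record with the matching predefined phrase."""
    diff = record["home_goals"] - record["away_goals"]
    home_won = diff > 0
    winner = record["home"] if home_won else record["away"]
    loser = record["away"] if home_won else record["home"]
    # Show the higher number of goals first, as is usual in match reports.
    high = max(record["home_goals"], record["away_goals"])
    low = min(record["home_goals"], record["away_goals"])
    return PHRASES[classify(diff)].format(
        home=record["home"],
        away=record["away"],
        winner=winner,
        loser=loser,
        score=f"{high}-{low}",
    )


print(generate_sentence(match))
```

The data decides which phrase is used, and the phrase supplies the wording. Adding more phrases per result type would be one simple way to make the output less repetitive.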

In addition to scientific institutes, other tools or social networks can also serve as sources for the multitude of statistical data. A large amount of data helps to keep automated content varied. There is also a close connection to conventional journalism here, as an editor likewise uses data for reporting. However, machines still cannot put the data into context or even comment on it.

Natural language processing as the basis

Any algorithm is only as good as the underlying software's understanding of language. The basis for automatically generated content is therefore always artificial intelligence (AI). One branch of AI is Natural Language Processing (NLP), which brings together all digital technologies that deal with the processing of natural language. Two further subfields are derived from it: Natural Language Understanding (NLU) and Natural Language Generation (NLG).

Natural Language Understanding is the basis for teaching machines or software to understand natural language. Examples of the use of such methods are chatbots and virtual voice assistants.

The next stage is automatic and correct text generation using the principle of Natural Language Generation. With NLG, systems use a mathematical algorithm to create natural language automatically. This turns data into journalistic content. The resulting texts are now so good that readers can no longer distinguish them from articles written by an editor. Examples of this form of text generation are product descriptions in e-commerce and chatbots.

Examples of machine content

In various subject areas, readers are now supplied with news that was not written by humans. Contrary to expectations, however, it is not journalistic media that play the pioneering role in robot journalism but companies from various industries. These include, for example, businesses that produce automated business reports and companies from the health sector, where patient records are often also created automatically.

In e-commerce, product descriptions in online shops are partially automated. Here robot journalism shows all its strengths. Shop owners can draw on an extensive database (prices, colors, dimensions, variants). In this way, tens of thousands of product descriptions can be written by machine within a short time.
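As a rough illustration of how such a database could be turned into descriptions, the following Python sketch loops over product records and only mentions the attributes that are actually present. The records, field names, currency, and sentence wording are assumptions made for this example.

```python
# Hypothetical product records; in practice these would come from the shop database.
products = [
    {
        "name": "Desk lamp",
        "price": 29.9,
        "color": "black",
        "dimensions": "15 x 15 x 40 cm",
        "variants": ["black", "white"],
    },
    {"name": "Notebook", "price": 4.5},  # record with missing optional fields
]


def describe(product: dict) -> str:
    """Build a description from whichever attributes the record provides."""
    parts = [f"The {product['name']} costs {product['price']:.2f} EUR."]
    if product.get("color"):
        parts.append(f"It comes in {product['color']}.")
    if product.get("dimensions"):
        parts.append(f"It measures {product['dimensions']}.")
    variants = product.get("variants") or []
    if len(variants) > 1:
        parts.append(f"Available variants: {', '.join(variants)}.")
    return " ".join(parts)


# The same function scales from two records to tens of thousands.
descriptions = [describe(p) for p in products]
for text in descriptions:
    print(text)
```

Skipping missing attributes instead of printing empty placeholders is a deliberate choice here: incomplete records are common in product data, and a shorter description is preferable to a broken sentence.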

Putting data into words works best where suitable data is available. That is the case in sports, finance, weather, and traffic.

Software providers for automated content

An important topic for robot journalism is the intelligent programs that use an algorithm to write a journalistic report from data-based information. In the past few years, various providers have developed NLG software.

Automated Insights was founded in 2007. Its portal offered automatically generated articles about sports. The company is now very important for the development of NLG software. Robot journalism is a topic in the American market in particular, where the media adopted data-based technology early on to generate reports.

Since 2010, Narrative Science has been another provider in the software development segment. Here, too, the first attempts at automated content were made in the sports sector.

In Germany, too, companies develop intelligent robot text software. Customers include journalistic media in the sports and financial sectors, but above all companies without a journalistic background. Among the best-known developers are Retresco, AX Semantics and Textomatic.

Robotic journalism in online journalism

The first attempts to form simple words into sentences using a programming language date back to the 1960s. The commercial step followed in 1992 with the Forecast Generator, which was able to generate longer weather forecasts in two languages.

With the "Quakebot" of the Los Angeles Times, robot journalism became known to a broader professional community in 2011. The project involved developing an algorithm that used structured data from geological institutes to provide information about an earthquake quickly. The text was generated automatically within a few minutes, and journalists only had to publish it.
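The workflow described above, in which a machine writes the draft and a person publishes it, might look roughly like the following Python sketch. It is not the Los Angeles Times implementation: the `QuakeEvent` fields, the magnitude threshold, and the `awaiting_review` status are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuakeEvent:
    """Hypothetical structured record, as a geological institute might supply it."""
    magnitude: float
    place: str
    depth_km: float
    time_utc: str


def draft_report(event: QuakeEvent, min_magnitude: float = 3.0) -> Optional[dict]:
    """Return a draft for human review, or None if the event is below the threshold."""
    if event.magnitude < min_magnitude:
        return None  # assumed rule: minor events are not reported
    headline = f"Magnitude {event.magnitude:.1f} earthquake reported near {event.place}"
    body = (
        f"An earthquake of magnitude {event.magnitude:.1f} was recorded near "
        f"{event.place} at {event.time_utc} UTC, at a depth of {event.depth_km:.1f} km."
    )
    # The draft is never published automatically; a journalist has to release it.
    return {"headline": headline, "body": body, "status": "awaiting_review"}


event = QuakeEvent(magnitude=4.4, place="Sampletown", depth_km=8.0, time_utc="2020-01-01 06:25")
draft = draft_report(event)
print(draft["headline"])
print(draft["body"])
```

Keeping the publish decision outside the generator reflects the division of labour in the text: speed comes from the machine, responsibility stays with the editorial office.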

In 2016, the Washington Post used software called Heliograf for the Olympic Games. Here, an algorithm based on Natural Language Generation was used to report sports results in ticker form.

Man or machine? Does robot journalism destroy jobs too?

In connection with the use of machines in journalism or other industries, there is often talk of robots replacing humans. The fear of job losses also plays a role in editorial offices. But these doubts are not justified, at least for the foreseeable future. The amount of data around the world is obviously increasing. The digitization of society is unstoppable, and journalism is no exception. And yet the work of editors differs from the automatically generated reporting of computer programs.

Journalists learn their trade and work according to defined principles. They have a significance for society. Their job is not just about posting news. Journalists ask critical questions, are creative in their writing, process complex issues critically and formulate opinions. A machine does not have these qualities.

A machine also has no emotions and is unable to react to extraordinary events. Especially in sports, where it is about winning and losing and where the most incredible stories are written, an empathetic person can clearly describe the dramatic twists and turns on the field better.

Challenges for the future

The use of automation in communication has its limits. How quickly automated processes can lead to embarrassing errors was shown, for example, in sports reporting at the beginning of the coronavirus crisis: a football outlet published online previews of matches that had long since been canceled. As soon as standardized processes deviate from the norm, the programmed algorithms of a piece of software reveal their weaknesses.
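One way to guard against the kind of error described above is to check the state of the data before any text is generated. The Python sketch below assumes a hypothetical `status` field on each fixture and only writes a preview when a match is explicitly marked as scheduled. The field names and status values are illustrative.

```python
from typing import Optional

# Hypothetical fixture records; the "status" field and its values are assumptions.
fixtures = [
    {"home": "FC Example", "away": "SV Sample", "status": "scheduled"},
    {"home": "SC Demo", "away": "TSV Test", "status": "canceled"},
    {"home": "VfB Mock", "away": "FC Example"},  # status missing entirely
]


def preview_or_skip(fixture: dict) -> Optional[str]:
    """Generate a preview only if the fixture is explicitly marked as scheduled."""
    if fixture.get("status") != "scheduled":
        # Canceled, postponed, or unknown: produce nothing rather than a wrong text.
        return None
    return f"Preview: {fixture['home']} host {fixture['away']}."


for fixture in fixtures:
    text = preview_or_skip(fixture)
    if text is None:
        print(f"Skipped: {fixture['home']} vs {fixture['away']}")
    else:
        print(text)
```

Treating a missing status the same way as a cancellation is the cautious default: when the data deviates from the expected pattern, the system stays silent and leaves the case to a person.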

In the context of social changes and ethical norms, the question also arises of how far automated content needs to be labeled as such. After all, what are the consequences of, for example, recommendations that a non-human author wants to convey to the reader? These dangers also apply to personalized content based on data. In the future, the option of thinking outside the box must not be displaced by one's own personalized content bubble.

Robert Pohl

Our selfie king Robert not only has a special feel for social media and current trends in content marketing. The content manager brings his diverse experience to bear in various areas such as editing, SEO and relevance optimization.
