Detecting the Authors of Texts by Neural Network Committee Machines

Alen Savatic

Abstract


This paper proposes using a boosting-by-filtering algorithm in artificial neural networks to identify the author of a text. The approach filters the training examples through different versions of a weak learning algorithm. It assumes the availability of a large source of examples, each of which is either discarded or kept during training. An advantage of this approach is its small memory requirement. Once the network has been trained, its hidden-layer activations are recorded as a representation of the selected lexical descriptors of an author. This stored information can then be used to identify other texts written by the same author. The texts studied are literary works of two Bosnian writers, Ivo Andrić (1892-1975) and M. Meša Selimović (1910-1982). The data were collected by counting syntactic characteristics in 1466 paragraphs each of "Na Drini ćuprija" by Ivo Andrić and "Derviš i smrt" by M. Meša Selimović.
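As an illustration only, the filtering scheme described above can be sketched as a three-member committee in the style of the classic boosting-by-filtering procedure. Everything named here is an assumption made for the sketch and not taken from the paper: the synthetic `draw_example` source stands in for paragraph feature counts, `CentroidLearner` stands in for the neural network weak learner, and the committee size, sample sizes, and majority vote are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_example():
    """Hypothetical example source: 2 synthetic features and an author label (0 or 1)."""
    y = int(rng.integers(2))
    x = rng.normal(loc=(1.0 if y else -1.0), scale=1.5, size=2)
    return x, y

class CentroidLearner:
    """Stand-in weak learner (nearest class centroid), not the paper's network."""
    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        # A class with no examples gets an infinitely distant centroid.
        self.c = [X[y == k].mean(axis=0) if np.any(y == k)
                  else np.full(X.shape[1], np.inf) for k in (0, 1)]
        return self

    def predict(self, x):
        d = [np.sum((x - c) ** 2) for c in self.c]
        return int(d[1] < d[0])

def filtered(n, keep, max_draws=100_000):
    """Draw examples one at a time; keep those accepted by `keep`, discard the rest."""
    X, y = [], []
    for _ in range(max_draws):
        if len(X) == n:
            break
        xi, yi = draw_example()
        if keep(xi, yi):
            X.append(xi)
            y.append(yi)
    return X, y

def half_wrong_filter(h1):
    """Accept examples so that about half are ones the first learner gets wrong."""
    state = {"want_wrong": bool(rng.integers(2))}
    def keep(x, y):
        wrong = h1.predict(x) != y
        if wrong == state["want_wrong"]:
            state["want_wrong"] = bool(rng.integers(2))  # fresh coin for the next slot
            return True
        return False
    return keep

def train_committee(n=200):
    h1 = CentroidLearner().fit(*filtered(n, lambda x, y: True))
    h2 = CentroidLearner().fit(*filtered(n, half_wrong_filter(h1)))
    # The third learner sees only examples on which the first two disagree.
    h3 = CentroidLearner().fit(
        *filtered(n, lambda x, y: h1.predict(x) != h2.predict(x)))
    return h1, h2, h3

def committee_predict(experts, x):
    """Majority vote of the three learners."""
    return int(sum(h.predict(x) for h in experts) >= 2)

experts = train_committee()
print(committee_predict(experts, np.array([1.0, 1.0])))
```

Because each example is drawn, tested against the filter, and then either kept or thrown away, only the accepted examples are ever held in memory, which is the small memory requirement the abstract refers to.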


DOI: http://dx.doi.org/10.21533/scjournal.v1i1.77


Copyright (c) 2015 SouthEast Europe Journal of Soft Computing

ISSN 2233-1859

Digital Object Identifier DOI: 10.21533/scjournal

This work is licensed under a Creative Commons Attribution 4.0 International License