Principal Component Analysis and Neural Networks for Authorship Attribution

Mehmet Can

doi:10.21533/scjournal.v1i1.79

Abstract

A common problem in statistical pattern recognition is that of feature selection or feature extraction. Feature selection refers to a process whereby a data space is transformed into a feature space that, in theory, has exactly the same dimension as the original data space. However, the transformation is designed in such a way that the data set may be represented by a reduced number of "effective" features and yet retain most of the intrinsic information content of the data; in other words, the data set undergoes a dimensionality reduction.

In this paper, data collected by counting selected syntactic characteristics in around a thousand paragraphs from each of the sample books undergo a principal component analysis performed using neural networks. The leading principal components are then used to distinguish the authors of the texts by means of multilayer perceptron artificial neural networks.
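
As a rough illustration of how a neural network can perform principal component analysis, the sketch below trains a single linear neuron with Oja's Hebbian learning rule, which is known to converge to the first principal direction of zero-mean data. The abstract does not say which learning rule the paper uses, so the choice of Oja's rule is an assumption made for illustration only. The function name `oja_first_component`, the learning rate, and the synthetic two-dimensional data are likewise hypothetical and are not taken from the paper.

```python
import math
import random


def oja_first_component(samples, eta=0.01, epochs=50, seed=0):
    """Estimate the first principal direction with Oja's Hebbian rule.

    Illustrative sketch only: the learning rule and the parameters are
    assumptions, not the method described in the paper.
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    # Centre the data, since PCA is defined on zero-mean inputs.
    means = [sum(s[i] for s in samples) / len(samples) for i in range(dim)]
    centred = [[s[i] - means[i] for i in range(dim)] for s in samples]
    # Small random initial weights for the single linear neuron.
    w = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    for _ in range(epochs):
        for x in centred:
            y = sum(wi * xi for wi, xi in zip(w, x))  # neuron output
            # Hebbian term y*x plus Oja's decay term -y*y*w,
            # which keeps the weight vector close to unit length.
            w = [wi + eta * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w


# Hypothetical toy data: points spread mainly along the direction (0.6, 0.8).
rng = random.Random(1)
data = []
for _ in range(500):
    t = rng.gauss(0.0, 2.0)
    data.append([0.6 * t + rng.gauss(0.0, 0.2),
                 0.8 * t + rng.gauss(0.0, 0.2)])

w = oja_first_component(data)
norm = math.sqrt(sum(wi * wi for wi in w))
# Absolute cosine between the learned weights and the true dominant direction.
alignment = abs(0.6 * w[0] + 0.8 * w[1]) / norm
print("weights:", w)
print("norm:", norm, "alignment:", alignment)
```

In a setting like the one the abstract describes, the projection of each paragraph's feature counts onto the learned direction (the neuron output `y`) would serve as one input to the classifier; further components would need an extension such as the generalized Hebbian algorithm, which is not shown here.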

This work is licensed under a Creative Commons Attribution 4.0 International License

Southeast Europe Journal of Soft Computing
