NewsgroupDocument-class {tm}    R Documentation

Newsgroup Text Document

Description

A class representing a newsgroup document together with its metadata. Documents must follow the format of the 20 Newsgroups dataset from the UCI KDD archive.

Objects from the Class

Objects can be created by calls of the form new("NewsgroupDocument", ...).

Slots

.Data:
Object of class character containing the content.
Author:
Object of class character containing the author names.
DateTimeStamp:
Object of class character containing the date and time when the document was written.
Description:
Object of class character containing additional text information.
ID:
Object of class integer containing an identifier.
Origin:
Object of class character containing information on the source and origin of the text.
Heading:
Object of class character containing the title or a short heading.
Language:
Object of class character containing the language of the text.
LocalMetaData:
Object of class list containing the local metadata in the form of tag-value pairs.
Newsgroup:
Object of class character containing the newsgroups to which the document was posted.
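
The slot layout can be illustrated without tm installed. The following is a minimal sketch, assuming only base R and the methods package: DemoNewsgroupDocument is a hypothetical stand-in that mirrors the slots listed here, not the class shipped with tm, and all slot values are invented for illustration.

```r
library(methods)

# Hypothetical stand-in mirroring the documented slots; NOT tm's own class.
setClass("DemoNewsgroupDocument",
         representation(Author        = "character",
                        DateTimeStamp = "character",
                        Description   = "character",
                        ID            = "integer",
                        Origin        = "character",
                        Heading       = "character",
                        Language      = "character",
                        LocalMetaData = "list",
                        Newsgroup     = "character"),
         contains = "character")

# Create an object; .Data carries the text, the other slots carry metadata.
# All values below are made up for this example.
doc <- new("DemoNewsgroupDocument",
           .Data         = c("First line of the posting.", "Second line."),
           Author        = "A. Poster",
           DateTimeStamp = "1993-04-05 12:00:00",
           Description   = "",
           ID            = 1L,
           Origin        = "20 Newsgroups",
           Heading       = "Example subject",
           Language      = "en",
           LocalMetaData = list(Organization = "Example Org"),
           Newsgroup     = c("sci.space", "sci.astro"))

doc@Newsgroup   # metadata slots are read with @
doc@.Data       # the text content itself
```

With tm 0.4 attached, the analogous call would presumably be new("NewsgroupDocument", ...) with the same slot names.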

Extends

Classes character and TextDocument, directly.

Methods

Content
signature(object = "NewsgroupDocument"): Returns the text content of the document, i.e., the character data stored in the .Data slot.
Content<-
signature(object = "NewsgroupDocument"): Replaces the text content of the document, i.e., the character data stored in the .Data slot.
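
How such an accessor pair might behave can be sketched in base R. This is an illustration under stated assumptions, not tm's implementation: DemoDoc is a hypothetical stand-in class, and the Content generics are defined locally here instead of being taken from tm.

```r
library(methods)

# Hypothetical stand-in class (not tm's own): character content plus one metadata slot.
setClass("DemoDoc",
         representation(Newsgroup = "character"),
         contains = "character")

# Generics named after the documented methods, defined locally for illustration.
setGeneric("Content", function(object) standardGeneric("Content"))
setGeneric("Content<-", function(object, value) standardGeneric("Content<-"))

# Getter: return the character data held in the .Data slot.
setMethod("Content", "DemoDoc", function(object) object@.Data)

# Setter: replace the character data and leave the metadata slots untouched.
setReplaceMethod("Content", "DemoDoc", function(object, value) {
  object@.Data <- value
  object
})

doc <- new("DemoDoc", .Data = "Original text.", Newsgroup = "sci.space")
old <- Content(doc)
Content(doc) <- "Replacement text."
```

After the replacement, Content(doc) yields the new text while doc@Newsgroup is unchanged, which is the separation between content and metadata that the class is built around.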

Author(s)

Ingo Feinerer

References

http://kdd.ics.uci.edu/databases/20newsgroups/20newsgroups.html


[Package tm version 0.4 Index]