entropy.based {FSelector} | R Documentation
The algorithms find weights of discrete attributes based on their correlation with a continuous class attribute.
information.gain(formula, data)
gain.ratio(formula, data)
symmetrical.uncertainty(formula, data)
formula | a symbolic description of a model
data | data to process
information.gain is H(Class) + H(Attribute) - H(Class, Attribute).
gain.ratio is (H(Class) + H(Attribute) - H(Class, Attribute)) / H(Attribute).
symmetrical.uncertainty is 2 * (H(Class) + H(Attribute) - H(Class, Attribute)) / (H(Attribute) + H(Class)).
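The three formulas can be checked by hand on a small example. The following is a minimal base-R sketch, not the package's implementation: the helper `entropy()` and the toy vectors are hypothetical, and the natural logarithm is an assumption (the choice of base scales information gain but cancels out in the two ratios).

```r
# Shannon entropy of one or more discrete vectors taken jointly.
# Assumption: natural logarithm; empty cells of the contingency table are skipped.
entropy <- function(...) {
  counts <- table(..., useNA = "no")
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Toy data, made up for illustration: a two-level class and one discrete attribute.
class_attr <- c("yes", "yes", "no", "no", "yes", "no", "yes", "no")
attribute  <- c("a",   "a",   "b",  "b",  "a",   "b",  "b",   "a")

h_class <- entropy(class_attr)             # H(Class)
h_attr  <- entropy(attribute)              # H(Attribute)
h_joint <- entropy(class_attr, attribute)  # H(Class, Attribute)

info_gain  <- h_class + h_attr - h_joint
gain_ratio <- info_gain / h_attr
sym_unc    <- 2 * info_gain / (h_attr + h_class)

print(c(information.gain        = info_gain,
        gain.ratio              = gain_ratio,
        symmetrical.uncertainty = sym_unc))
```

In this toy case H(Class) and H(Attribute) are equal, so gain ratio and symmetrical uncertainty coincide; they differ whenever the two marginal entropies differ.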
A data.frame containing the worth of attributes in the first column and their names as row names.
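To make the shape of the result concrete, the sketch below builds a data.frame of the same layout by hand in base R, without calling the package. The toy data set, the helper `entropy()`, and the column name `attr_importance` are assumptions for illustration; only the layout (weights in the first column, attribute names as row names) is taken from the description above.

```r
# Toy data, made up for illustration: two discrete attributes and a class column.
toy <- data.frame(
  outlook = c("sunny", "sunny", "rain", "rain", "overcast", "overcast"),
  windy   = c("yes", "no", "yes", "no", "yes", "no"),
  play    = c("no", "no", "yes", "yes", "yes", "yes"),
  stringsAsFactors = TRUE
)

# Joint Shannon entropy of discrete vectors (natural log is an assumption).
entropy <- function(...) {
  counts <- table(..., useNA = "no")
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Information gain of each attribute with respect to the class `play`.
attrs <- setdiff(names(toy), "play")
gains <- sapply(attrs, function(a) {
  entropy(toy$play) + entropy(toy[[a]]) - entropy(toy$play, toy[[a]])
})

# One numeric column of weights, attribute names as row names.
weights <- data.frame(attr_importance = gains, row.names = attrs)

# Rank attributes from most to least informative.
print(weights[order(-weights$attr_importance), , drop = FALSE])
```

Here `outlook` fully determines `play` while `windy` is independent of it, so the first weight equals H(Class) and the second is zero up to floating-point error.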
Piotr Romanski
library(FSelector)
data(iris)

# Information gain: keep the two highest-weighted attributes
weights <- information.gain(Species~., iris)
print(weights)
subset <- cutoff.k(weights, 2)
f <- as.simple.formula(subset, "Species")
print(f)

# Gain ratio: keep the two highest-weighted attributes
weights <- gain.ratio(Species~., iris)
print(weights)
subset <- cutoff.k(weights, 2)
f <- as.simple.formula(subset, "Species")
print(f)

# Symmetrical uncertainty: cut at the biggest difference between weights
weights <- symmetrical.uncertainty(Species~., iris)
print(weights)
subset <- cutoff.biggest.diff(weights)
f <- as.simple.formula(subset, "Species")
print(f)