package smote

import "github.com/shuLhan/share/lib/mining/resampling/smote"

Package smote resamples a dataset by applying the Synthetic Minority Oversampling TEchnique (SMOTE). The original dataset must fit entirely in memory. The oversampling percentage and the number of nearest neighbors may be specified. For more information, see:

Nitesh V. Chawla et al. (2002). SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research. 16:321-357.

Types

type Runtime

type Runtime struct {
	// Synthetics contains the output of resampling as synthetic samples.
	Synthetics tabula.Dataset

	// SyntheticFile is a filename where synthetic samples will be written.
	SyntheticFile string `json:"SyntheticFile"`

	// Runtime embeds the k-nearest-neighbors parameters.
	knn.Runtime

	// PercentOver is the oversampling percentage.
	PercentOver int `json:"PercentOver"`

	// NSynthetic is the number of new synthetic samples to generate per sample.
	NSynthetic int
}

Runtime holds the input parameters and the output of SMOTE resampling.
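How `PercentOver` maps to a per-sample count is not spelled out on this page. The sketch below follows the convention in Chawla et al.: at 100 percent or more, each minority sample yields `PercentOver / 100` synthetic samples; below 100 percent, only a fraction of the samples is used and each yields one. The helper `plan` is hypothetical and not part of this package; whether `NSynthetic` is derived exactly this way is an assumption.

```go
package main

import "fmt"

// plan is a hypothetical helper (not part of package smote). Given an
// oversampling percentage and the number of minority samples, it returns
// how many samples are used and how many synthetic samples each yields.
func plan(percentOver, nSamples int) (nUsed, perSample int) {
	if percentOver <= 0 || nSamples <= 0 {
		return 0, 0
	}
	if percentOver < 100 {
		// Below 100%: use only a fraction of the samples, one synthetic each.
		return percentOver * nSamples / 100, 1
	}
	// At 100% or more: use every sample, several synthetics each.
	return nSamples, percentOver / 100
}

func main() {
	for _, p := range []int{50, 100, 300} {
		used, per := plan(p, 40)
		fmt.Printf("PercentOver=%d: %d samples used, %d synthetic each, %d total\n",
			p, used, per, used*per)
	}
}
```

With 40 minority samples, this gives 20, 40, and 120 synthetic samples for 50, 100, and 300 percent respectively.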

func New

func New(percentOver, k, classIndex int) (smoteRun *Runtime)

New creates and returns a new SMOTE runtime.

func (*Runtime) GetSynthetics

func (smote *Runtime) GetSynthetics() tabula.DatasetInterface

GetSynthetics returns the synthetic samples.

func (*Runtime) Init

func (smote *Runtime) Init()

Init rechecks the input parameters and resets any invalid one to its default value.

func (*Runtime) Resampling

func (smote *Runtime) Resampling(dataset tabula.Rows) (e error)

Resampling runs the resampling algorithm using the values defined in `Runtime`. The generated synthetic samples are kept in the runtime and can be retrieved with `GetSynthetics`.

The `dataset` must contain only samples of the minority class, not the whole dataset.

Algorithm:

(0) If the oversampling percentage is less than 100, then
(0.1) replace the input dataset by selecting n random samples from the dataset without replacement, where n is

	(percentage-oversampling / 100) * number-of-sample

(1) For each `sample` in the dataset,
(1.1) find the k-nearest-neighbors of `sample`,
(1.2) generate synthetic samples between `sample` and its neighbors.
(2) Write the synthetic samples to file, only if `SyntheticFile` is not empty.

func (*Runtime) String

func (smote *Runtime) String() (s string)

func (*Runtime) Write

func (smote *Runtime) Write(file string) error

Write writes the synthetic samples to the file named by `file`.

Source Files

smote.go

Version: v0.53.1 (latest)
Published: Mar 2, 2024
Platform: linux/amd64
Imports: 8 packages