mlpack_preprocess_describe

NAME

mlpack_preprocess_describe - descriptive statistics

SYNOPSIS

mlpack_preprocess_describe -i string [-d int] [--info string] [-P] [-p int] [-r] [-w int] [-V] [-h] [-v]

DESCRIPTION

This utility takes a dataset and prints descriptive statistics for it. Descriptive statistics is the discipline of quantitatively describing the main features of a collection of information, or the quantitative description itself. The program does not modify the original file; it prints the statistics to the console as a table.

Optionally, the width and precision of the output can be adjusted with the --width (-w) and --precision (-p) options. If the dataset has too many dimensions, a single dimension can be selected for analysis with the --dimension (-d) option. The --population (-P) flag makes the program treat the dataset as a population; otherwise, the dataset is treated as a sample.

As a simple example, to print statistics about dataset.csv with the default settings, we could run

$ mlpack_preprocess_describe -i dataset.csv -v

To set the width to 10 and the precision to 5, and to treat the dataset as a population, we could run

$ mlpack_preprocess_describe -i dataset.csv -w 10 -p 5 -P -v
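The following sketch combines the options above with dimension selection. The contents of dataset.csv are made up for illustration, and this page does not state whether dimensions are counted from 0 or from 1, so the value passed to -d is an assumption to check against the output of your installation.

```shell
# Hypothetical dataset: 4 points with 3 dimensions each, one point per line.
cat > dataset.csv <<'EOF'
1.0,10.0,100.0
2.0,20.0,200.0
3.0,30.0,300.0
4.0,40.0,400.0
EOF

# Run only if the tool is installed. All flags used here are documented on
# this page; the meaning of "-d 1" (which dimension it selects) is assumed.
if command -v mlpack_preprocess_describe >/dev/null 2>&1; then
  mlpack_preprocess_describe -i dataset.csv -d 1 -w 10 -p 5 -v
fi
```

Dropping -d should describe the dataset with the default dimension setting instead.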

REQUIRED INPUT OPTIONS

--input_file (-i) [string]

File containing data.

OPTIONAL INPUT OPTIONS

--dimension (-d) [int]

Dimension of the data. Use this to select a single dimension to analyze. Default value 0.

--help (-h)

Default help info.

--info [string]

Get help on a specific module or option. Default value ’’.

--population (-P)

If specified, the program will calculate statistics assuming the dataset is the population. By default, the program treats the dataset as a sample.
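This page does not spell out which formulas change with --population. A minimal sketch, assuming the usual convention (divide the sum of squared deviations by n for a population and by n - 1 for a sample), shows how the two standard deviations differ. The data values and the describe_std helper are hypothetical and are not part of mlpack.

```shell
# Hypothetical one-dimensional dataset.
values="2 4 4 4 5 5 7 9"

# describe_std 1 -> population standard deviation (divide by n)
# describe_std 0 -> sample standard deviation (divide by n - 1)
describe_std() {
  printf '%s\n' $values | awk -v population="$1" '
    { x[NR] = $1; sum += $1 }
    END {
      mean = sum / NR
      for (i = 1; i <= NR; i++) ss += (x[i] - mean) ^ 2
      denom = population ? NR : NR - 1
      printf "%.4f\n", sqrt(ss / denom)
    }'
}

pop_std=$(describe_std 1)
sample_std=$(describe_std 0)
echo "population std: $pop_std"   # 2.0000
echo "sample std:     $sample_std"   # 2.1381
```

The sample figure is always the larger of the two, and the gap shrinks as the number of points grows.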

--precision (-p) [int]

Precision of the output statistics. Default value 4.

--row_major (-r)

If specified, the program will calculate statistics across rows, not across columns. (Remember that in mlpack, a column represents a point, so this option is generally not necessary.)

--verbose (-v)

Display informational messages and the full list of parameters and timers at the end of execution.

--version (-V)

Display the version of mlpack.

--width (-w) [int]

Width of the output table. Default value 8.
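To get a feel for how --width and --precision interact, the sketch below formats one number with printf. It assumes that width is the minimum number of characters per table cell and that precision is the number of digits after the decimal point; this page does not confirm either, so treat it as an illustration rather than a description of the program's formatting code.

```shell
# Use the C locale so the decimal separator is always a period.
LC_ALL=C
export LC_ALL

# format_cell WIDTH PRECISION VALUE (hypothetical helper, not part of mlpack)
format_cell() {
  printf "%${1}.${2}f\n" "$3"
}

default_cell=$(format_cell 8 4 3.14159)    # defaults: width 8, precision 4
custom_cell=$(format_cell 10 5 3.14159)    # as in: -w 10 -p 5
echo "[$default_cell]"   # [  3.1416]
echo "[$custom_cell]"    # [   3.14159]
```

If a value needs more characters than the chosen width, printf simply prints the longer text, so columns may no longer line up.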

ADDITIONAL INFORMATION

For further information, including relevant papers, citations, and theory, consult the documentation found at http://www.mlpack.org or included with your distribution of mlpack.