You can create and edit Filter Columns settings here. There are three kinds of settings:
Data Quality: Filters columns according to the percentage of null values, the percentage of unique values, and the percentage of constant values. The default values for Data Quality are specified in preferences. You can change the defaults. You can specify the following Data Quality criteria:
% Nulls less than or equal: Indicates the largest acceptable percentage of Null values in a column of the data source. You may want to ignore columns that have a larger percentage of Null values. The default value is 95 percent.
% Unique less than or equal: Indicates the largest acceptable percentage of values that are unique in a column of the data source. If a column contains many unique values, then it may not contain useful information for model building. The default value is 95 percent.
% Constant less than or equal: Indicates the largest acceptable percentage of constant values in a column of the data source. If most of the values in a column are the same, the column may not be useful for model building.
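The three Data Quality criteria above can be illustrated with a minimal Python sketch. It is not the product's implementation. The function names `column_quality` and `passes_filter` are hypothetical, and the exact definitions are assumptions: % Unique is taken as distinct non-null values over non-null values, and % Constant as the share of the single most frequent non-null value. The 95 percent default for % Constant is also an assumption, since no default is stated for that criterion.

```python
from collections import Counter

def column_quality(values):
    """Return the three Data Quality percentages for one column.

    None stands for a Null value. The definitions are assumptions
    made for illustration, not the product's documented formulas.
    """
    total = len(values)
    if total == 0:
        return {"pct_null": 0.0, "pct_unique": 0.0, "pct_constant": 0.0}
    non_null = [v for v in values if v is not None]
    pct_null = 100.0 * (total - len(non_null)) / total
    if not non_null:
        return {"pct_null": pct_null, "pct_unique": 0.0, "pct_constant": 0.0}
    counts = Counter(non_null)
    # Assumed: distinct non-null values as a share of non-null values.
    pct_unique = 100.0 * len(counts) / len(non_null)
    # Assumed: share of the single most frequent non-null value.
    pct_constant = 100.0 * counts.most_common(1)[0][1] / len(non_null)
    return {"pct_null": pct_null, "pct_unique": pct_unique,
            "pct_constant": pct_constant}

def passes_filter(values, max_null=95.0, max_unique=95.0, max_constant=95.0):
    """Keep a column only if every percentage is less than or equal to its limit.

    max_constant=95.0 is a hypothetical default chosen for this sketch.
    """
    q = column_quality(values)
    return (q["pct_null"] <= max_null
            and q["pct_unique"] <= max_unique
            and q["pct_constant"] <= max_constant)
```

Under these assumptions, an identifier-like column in which every value is distinct fails the % Unique check, and a column holding a single repeated value fails the % Constant check.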
Attribute Importance: Enables you to build an Attribute Importance model to identify important attributes.
By default, this setting is turned off, so Filter Columns does not calculate Attribute Importance.
Sampling: Controls whether Filter Columns calculates statistics on a random sample of the data, and how large that sample is. The default values for sampling are specified in preferences. You can change the default sample size or turn off sampling. The default sample size is 2000 records.
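The sampling behavior can be sketched in Python as follows. This is an illustration only, not how the product samples data. The function name `sample_records`, the `seed` parameter, and the use of `None` to mean that sampling is turned off are all assumptions; only the default size of 2000 records comes from the text above.

```python
import random

def sample_records(records, sample_size=2000, seed=None):
    """Return a random sample of records for calculating statistics.

    sample_size=None is a hypothetical way to express "sampling off".
    If the data has no more rows than the sample size, all rows are used.
    """
    records = list(records)
    if sample_size is None or len(records) <= sample_size:
        return records
    rng = random.Random(seed)  # a seed makes the sample repeatable
    return rng.sample(records, sample_size)
```

Statistics such as the Data Quality percentages would then be computed on the returned sample rather than on the full data source.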