GLM supports these settings for regression:
Generate Row Diagnostics is set to OFF by default. To generate row diagnostics, you must select this option and also specify a Case ID.
If you do not specify a Case ID, then this setting is not available.
You can view Row Diagnostics on the Diagnostics tab when you view the model. To further analyze row diagnostics, use a Model Details node to extract the row diagnostics table.
Confidence Level: A positive number that is less than 1.0. This level indicates the degree of certainty that the true probability lies within the confidence bounds computed by the model. The default confidence is 0.95.
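As a rough illustration of how a confidence level translates into bounds, the sketch below uses a normal approximation. This is an assumption made for illustration only; this page does not state how the model computes its bounds, and the function name `normal_bounds` and its inputs are hypothetical.

```python
from statistics import NormalDist

def normal_bounds(estimate, std_error, confidence_level=0.95):
    """Two-sided bounds around an estimate under a normal approximation."""
    # The setting must be a positive number less than 1.0.
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be between 0 and 1")
    # Quantile that leaves (1 - confidence_level) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2.0)
    return estimate - z * std_error, estimate + z * std_error

# Hypothetical prediction of 10.0 with a standard error of 2.0.
lower_95, upper_95 = normal_bounds(10.0, 2.0)
lower_99, upper_99 = normal_bounds(10.0, 2.0, confidence_level=0.99)
```

A higher confidence level produces wider bounds for the same estimate and standard error.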
Missing Values Treatment: The default is Mean Mode. That is, missing numeric values are replaced with the mean and missing categorical values are replaced with the mode.
You can also select Delete Row to delete any row that contains missing values. If you delete rows with missing values, then the same missing values treatment (delete rows) must be applied to any data that the model is applied to.
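The two treatments can be sketched in plain Python as follows. This is a minimal illustration of the idea, not the database implementation; the function names, the column names, and the use of `None` to mark a missing value are all assumptions.

```python
from statistics import mean, mode

def impute_mean_mode(rows, numeric_cols, categorical_cols):
    """Return copies of rows with None replaced by the column mean or mode."""
    fill = {}
    for col in numeric_cols:
        fill[col] = mean(r[col] for r in rows if r[col] is not None)
    for col in categorical_cols:
        fill[col] = mode(r[col] for r in rows if r[col] is not None)
    return [
        {k: (fill[k] if v is None and k in fill else v) for k, v in r.items()}
        for r in rows
    ]

def delete_rows_with_missing(rows):
    """Return only the rows that contain no missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]

# Hypothetical training data with two missing values.
rows = [
    {"age": 30, "region": "north"},
    {"age": None, "region": "north"},
    {"age": 50, "region": None},
    {"age": 40, "region": "south"},
]
imputed = impute_mean_mode(rows, ["age"], ["region"])
kept = delete_rows_with_missing(rows)
```

With Mean Mode all four rows remain available for training, while Delete Row keeps only the two complete rows.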
Specify Row Weights Column: The Row Weights Column is a column in the training data that contains a weighting factor for the rows. By default, Row Weights Column is not specified. Row weights can be used:
As a compact representation of repeated rows, as in the design of experiments where a specific configuration is repeated several times.
To emphasize certain rows during model construction. For example, to bias the model toward rows that are more recent and away from potentially obsolete data.
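The first use of row weights can be checked with a small sketch: fitting a weighted least-squares line to distinct rows with repeat counts as weights should give the same line as fitting the fully expanded data. A simple one-predictor linear model stands in for GLM here, and the function name and data are hypothetical.

```python
def weighted_linear_fit(xs, ys, ws):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    b = sxy / sxx
    return my - b * mx, b

# Compact form: three distinct configurations, with repeat counts as weights.
xs, ys, ws = [1.0, 2.0, 3.0], [2.0, 3.0, 5.0], [3, 1, 2]
compact = weighted_linear_fit(xs, ys, ws)

# Expanded form: each row physically repeated, every row weighted 1.
ex = [x for x, w in zip(xs, ws) for _ in range(w)]
ey = [y for y, w in zip(ys, ws) for _ in range(w)]
expanded = weighted_linear_fit(ex, ey, [1] * len(ex))
```

The same function covers the second use: giving recent rows a weight greater than 1 pulls the fitted line toward them.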
Ridge Regression: Ridge Regression is a technique that compensates for multicollinearity (multivariate regression with correlated predictors). Oracle Data Mining supports Ridge Regression for both regression and classification mining functions.
By default, Ridge Regression is system determined (not disabled) in both Oracle Database 11g and Oracle Database 12c. If you select Ridge Regression, then Feature Selection is automatically deselected.
To specify options for Ridge Regression, click Option to open the Ridge Regression Option Dialog (GLMR).
When Ridge Regression is enabled, fewer global details are returned. For example, when Ridge Regression is enabled, no prediction bounds are produced.
Note: If you are connected to Oracle Database 11g Release 2 (11.2) and you get the error ORA-40024 when you build a GLM model, enable Ridge Regression and rebuild the model.
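To illustrate why a ridge penalty helps with correlated predictors, the sketch below solves the penalized normal equations (X'X + lambda*I) beta = X'y for two nearly collinear predictors. It is a textbook formulation written for illustration, not Oracle Data Mining's implementation; the function name, the penalty value, and the data are hypothetical, and the model has no intercept for brevity.

```python
def ridge_fit_2d(x1, x2, y, lam):
    """Solve (X'X + lam*I) beta = X'y for two predictors, no intercept."""
    a = sum(u * u for u in x1) + lam          # X'X[0][0] plus penalty
    b = sum(u * v for u, v in zip(x1, x2))    # X'X[0][1], the cross term
    d = sum(v * v for v in x2) + lam          # X'X[1][1] plus penalty
    g1 = sum(u * t for u, t in zip(x1, y))    # X'y[0]
    g2 = sum(v * t for v, t in zip(x2, y))    # X'y[1]
    det = a * d - b * b                       # close to 0 when predictors are collinear
    return (d * g1 - b * g2) / det, (a * g2 - b * g1) / det

# Two nearly identical predictors (hypothetical data).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.2, -1.8, 0.0, 2.1, 3.9]

ols = ridge_fit_2d(x1, x2, y, lam=0.0)    # ordinary least squares
ridge = ridge_fit_2d(x1, x2, y, lam=1.0)  # penalized fit

def norm(beta):
    return (beta[0] ** 2 + beta[1] ** 2) ** 0.5
```

Without the penalty the determinant is small, so the coefficients are sensitive to small changes in the data. Adding the penalty moves the determinant away from zero and shrinks the coefficient vector, so `norm(ridge)` is smaller than `norm(ols)`.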
Feature Selection: This setting requires connection to Oracle Database 12c. By default, Feature Selection is deselected. To specify Feature Selection or view or specify Feature Selection settings, click Option to open the Feature Selection Option Dialog.
If you select Feature Selection, then Ridge Regression is automatically deselected.
Note: The Feature Selection setting is available only in Oracle Database 12c.
Approximate Computation: Specifies whether the algorithm should use approximate computations to improve performance. For GLM, approximation is appropriate for data sets that have many rows and are densely populated (not sparse).
Values for Approximate Computation are:
System Determined (Default)
Enable
Disable