Frequently Asked Questions About MARS®
Q1. What is a regression analysis?
Q2. What are the benefits of regression analysis?
Q3. Why is regression modeling not used more frequently?
Q4. How does MARS help data miners use regression modeling?
Q5. How does MARS differ from conventional regression?
Q6. How does MARS construct its models?
Q7. What control over modeling does MARS provide the user?
Q8. How does MARS handle missing values?
Q9. How does MARS ensure that a model will perform as claimed on future data?
Q10. How can MARS models be implemented for predictive purposes?
Q11. What applications is MARS best suited for?
Q12. Why is MARS better than a decision tree for regression?
Q13. How does MARS compare with neural nets?
Q14. How quickly can MARS generate results?
Q15. How large a problem can MARS handle?
|
Q1.
What is a regression analysis?
A1. Regression analysis is an
effective mathematical modeling technique of increasing popularity in the
corporate world. Regression analyses essentially fit straight lines
to data, creating simplified but frequently accurate summaries of the relationships
between a variable of interest and other variables.
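As a minimal sketch of what fitting a straight line to data means, the following computes an ordinary least-squares line in plain Python. The function name `fit_line` and the spend and sales figures are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: advertising spend versus sales
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]
intercept, slope = fit_line(spend, sales)
print(f"sales = {intercept:.2f} + {slope:.2f} * spend")
```

The fitted slope summarizes how the variable of interest moves, on average, with each unit change in the other variable.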
|
|
Q2.
What are the benefits of regression analysis?
A2. Regression analysis has many
technical and practical benefits. Among the most important to the
business community are:
-
Regression models are often simple and
quite accurate.
-
Regression analyses can handle a large
number of predictive factors simultaneously.
-
It is easy to score a database with a regression
model.
-
Modelers can usually read the importance
of any factor directly from the model.
-
A wide variety of applications support
regression analysis.
While more common analyses based
on tables and charts are limited to a small number of dimensions, a regression
model can take into account a virtually unlimited number of factors.
In forecasting sales, for example, a regression could easily adjust for
season, pricing, promotions, sales force, competitive factors, delivery
delays, and regional, national, and world economic developments, all in a
single model. Regression also provides a precise estimate of the magnitude
of the effect of each factor on the variable being modeled. Thus,
a regression forecasting sales could provide a precise estimate of the
impact of a 2% increase in home sales on furniture sales while simultaneously
adjusting for 30 or 40 other factors.
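The point about reading the effect of one factor while adjusting for the others can be illustrated with a small scoring sketch. The coefficients, field names, and intercept below are hypothetical and are not taken from any real model.

```python
# Hypothetical fitted coefficients for a furniture-sales regression
INTERCEPT = 200.0
COEFFICIENTS = {
    "home_sales_pct_change": 1.5,
    "price_index": -0.5,
    "promo_spend": 0.25,
}


def score(record):
    """Apply the linear model to one record (a dict of factor values)."""
    return INTERCEPT + sum(COEFFICIENTS[name] * record[name] for name in COEFFICIENTS)


base = {"home_sales_pct_change": 0.0, "price_index": 100.0, "promo_spend": 40.0}
# Same record, but with home sales up by 2 percentage points
bumped = dict(base, home_sales_pct_change=2.0)

# With the other factors held fixed, the difference is coefficient * change
effect = score(bumped) - score(base)
print(effect)
```

Because the model is additive, the estimated effect of the 2-point change equals the coefficient times 2, regardless of the values of the other factors.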
|
|
Q3.
Why is regression modeling not used more frequently?
A3. Developing a reliable regression
model requires an expert with both analytical experience and subject matter
mastery. Regression modeling is a painstakingly slow process that
becomes exponentially more complex as the number of database fields found
in the data warehouse increases. In data mining the process can soon
become overwhelming.
|
|
Q4.
How does MARS help data miners use regression modeling?
A4. The major advantage of MARS is
that it automates all those aspects of regression modeling that are difficult
and time consuming to conduct by hand. These include:
-
selecting which database fields to use,
-
handling missing values,
-
transforming variables to account for non-linear
relationships,
-
detecting interactions (i.e., determining
when the effect of one factor materially depends on one or more other factors),
and
-
self-testing to ensure that the model will
perform well on future data.
The end result is a more accurate
and more complete model than could be hand crafted by even the most experienced
expert modeler.
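To make the notion of an interaction concrete, here is a small sketch in which the effect of one factor depends on another. The model form, the factor names, and all coefficients are assumptions made for illustration only.

```python
def predicted_sales(discount, is_holiday):
    """Hypothetical model with an interaction term (discount * is_holiday)."""
    return 100.0 + 2.0 * discount + 30.0 * is_holiday + 3.0 * discount * is_holiday


def discount_effect(is_holiday):
    """Change in predicted sales for one extra unit of discount."""
    return predicted_sales(1.0, is_holiday) - predicted_sales(0.0, is_holiday)


print(discount_effect(0.0))  # effect of a discount outside the holiday season
print(discount_effect(1.0))  # effect of the same discount during the holidays
```

Without the product term, the discount would have the same effect in both cases; the interaction term is what lets the effect differ.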
|
|
Q5.
How does MARS differ from conventional regression?
A5. Conventional regression
models typically fit straight lines to data. Although this usually
oversimplifies the data structure, the approximation is sometimes good
enough for practical purposes. However, in the frequent situations
in which a straight line is inappropriate, an expert modeler must search
tediously for transformations to find the right curve.
MARS approaches model construction more
flexibly, allowing for bends, thresholds, and other departures from straight
lines from the beginning. MARS builds its model by piecing together
a series of straight lines with each allowed its own slope. This
permits MARS to trace out any pattern detected in the data. An example
of a MARS regression is shown below on the left. The actual data with
a conventional regression model superimposed is shown on the right.
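The idea of piecing together straight lines, each with its own slope, is commonly expressed with hinge functions of the form max(0, x - knot) and max(0, knot - x). The sketch below assumes that form; the knot location and coefficients are hypothetical.

```python
def hinge(x, knot, direction=+1):
    """Return max(0, x - knot) for direction=+1, or max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))


def piecewise_model(x):
    """Hypothetical model with one knot at x = 5.

    Below the knot the slope is 0.5; above it the slope is 1.5.
    """
    return 2.0 + 1.5 * hinge(x, 5.0, +1) - 0.5 * hinge(x, 5.0, -1)


for x in (3.0, 5.0, 7.0):
    print(x, piecewise_model(x))
```

Each hinge term is zero on one side of its knot, so adding such terms bends the line at the knot without breaking its continuity.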
|
|
Q6.
How does MARS construct its models?
A6. MARS starts from the premise
that most relevant variables affect the outcome in a complex way.
Therefore, when MARS considers whether to add a variable to a model it
simultaneously searches for appropriate break points (known as "knots").
Models are constructed in a two-phase procedure. Phase I is a fast
search that tests all database fields and potential break points, resulting
in a deliberately overfit model. Phase II refines the model
by eliminating redundant factors and components that do not stand up to
testing. The final model retains only the important twists and turns
and is also optimal for predicting from new data.
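The knot search in Phase I can be sketched for a single variable and a single knot: try each candidate break point, fit a pair of hinge terms by least squares, and keep the knot with the smallest residual error. This is a simplified illustration, not the product's actual procedure. It assumes candidate knots are taken from the interior observed values, uses a small hand-written solver, and omits the Phase II pruning step entirely. All function names and data are hypothetical.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(rows, ys):
    """Fit coefficients by solving the normal equations (A^T A) b = A^T y."""
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(ata, aty)


def rss_for_knot(xs, ys, knot):
    """Residual sum of squares for intercept + hinge pair at the given knot."""
    rows = [[1.0, max(0.0, x - knot), max(0.0, knot - x)] for x in xs]
    coefs = least_squares(rows, ys)
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    return sum((y - f) ** 2 for y, f in zip(ys, fitted))


def best_knot(xs, ys):
    """Try every interior observed value as a knot; keep the best-fitting one."""
    candidates = sorted(set(xs))[1:-1]
    return min(candidates, key=lambda t: rss_for_knot(xs, ys, t))


# Hypothetical data: flat until x = 6, then rising with slope 3
xs = [float(i) for i in range(11)]
ys = [2.0 + 3.0 * max(0.0, x - 6.0) for x in xs]
knot = best_knot(xs, ys)
print(knot)
```

A full forward pass would repeat this search over every variable and keep adding terms, producing the deliberately overfit model that the second phase then trims.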
|
|
Q7.
What control over modeling does MARS provide the user?
A7. MARS offers the user a great
deal of control over the model development process. A number of techniques
(discussed in detail in the comprehensive documentation) are available
to shape and refine this process, including:
-
requiring selected variables to have straight
line effects (no knots)
-
forbidding any interactions
-
permitting interactions between selected
variables only
-
permitting interactions only up to a specified
degree of complexity
-
specifying a minimum distance between knots
-
encouraging MARS to produce simpler final
models
MARS automatically sets all control
parameters to intelligent defaults so that the modeling process can be
easily run by a first-time user. Experienced modelers, however,
may modify the control parameters.
|
|
Q8.
How does MARS handle missing values?
A8. MARS deals with the problem
of missing values in regression in an entirely new way. First, MARS develops
the best model possible using the available data. In addition, for
each variable with missing values, MARS develops a sub-model based on substitute
variables. This sub-model may be based on a single surrogate or on
a complex function of several surrogates. For example, MARS may develop
a model based on income data and simultaneously a sub-model based on education
and age for use when income is missing. This surrogate process is
far more convenient and reliable than other approaches, which attempt to
"fill in" the missing values with imputed values.
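The income example can be pictured as a scoring function that falls back to a sub-model when the primary variable is absent. This is only a conceptual sketch: the way MARS actually builds and selects its sub-models is not described here, and every coefficient and field name below is hypothetical.

```python
def predict_spend(record):
    """Score one record, switching to a surrogate sub-model if income is missing."""
    income_k = record.get("income_thousands")
    if income_k is not None:
        # Main model: based on income (in thousands)
        return 50.0 + 2.0 * income_k
    # Sub-model: based on education and age, used only when income is missing
    return 40.0 + 6.0 * record["education_years"] + 0.5 * record["age"]


complete = {"income_thousands": 60.0, "education_years": 16.0, "age": 40.0}
missing = {"income_thousands": None, "education_years": 16.0, "age": 40.0}
print(predict_spend(complete))
print(predict_spend(missing))
```

No value is imputed for the missing field; the record is simply scored by a different equation.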
|
|
Q9.
How does MARS ensure that a model will perform as claimed on future data?
A9. Almost all modern modeling
technologies can track training data accurately; in fact, some methods
can actually guarantee perfect results. The problem is that such
"overfit" models are useless for predicting outcomes on new data.
The best known of such poor performers are stock market price predictors
that frequently work perfectly on yesterday's data but rarely predict tomorrow's
prices correctly. MARS protects users from such misleading results
through its two-stage modeling process. As described above, MARS
deliberately overfits its model initially, but then prunes away all components
that would not hold up on fresh data.
All decision makers want the most accurate
model possible. At the same time they need honest assessments of
how well any predictive model can be expected to perform. MARS provides
such honest assessments through use of one of the two built-in testing
regimens: cross-validation or reference to independent test data.
Using these tests, MARS determines the degree of accuracy that can be expected
from the best predictive model.
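Cross-validation, one of the two testing regimens named above, can be sketched generically: hold out part of the data, fit on the rest, and measure error only on the held-out records. The example below uses a plain straight-line fit as a stand-in for the model; the fold assignment, function names, and data are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mse(xs, ys, intercept, slope):
    """Mean squared error of the line on the given records."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_validated_mse(xs, ys, k=5):
    """k-fold cross-validation: every record is predicted by a model that never saw it."""
    n = len(xs)
    squared_error = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))  # every k-th record goes to this fold
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        intercept, slope = fit_line(train_x, train_y)
        squared_error += sum((ys[i] - (intercept + slope * xs[i])) ** 2 for i in held_out)
    return squared_error / n


# Hypothetical noisy data
xs = [float(i) for i in range(10)]
ys = [1.2, 2.9, 5.1, 7.0, 8.8, 11.3, 12.9, 15.1, 17.0, 18.8]
training_mse = mse(xs, ys, *fit_line(xs, ys))
cv_mse = cross_validated_mse(xs, ys, k=5)
print(training_mse, cv_mse)
```

The cross-validated figure is the more honest estimate of future accuracy, since the training error is computed on the same records used to fit the model.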
|
|
Q10.
How can MARS models be implemented for predictive purposes?
A10. A MARS predictive model
can be implemented in two ways. First, new databases can be
scored directly by MARS. Simply identifying the MARS model
to be implemented and the new data to be scored is sufficient to apply
the results. MARS will perform all the required data transformations
and calculations automatically and output the predicted scores. Second,
the MARS predictive equation can be exported as ready-to-run C and SAS®
source code that can be used without modification in the user's own application
framework. This built-in flexibility allows users to construct their
own custom applications incorporating a complete standalone rendition of
the model.
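As a rough picture of what a standalone scoring routine built from a predictive equation might look like, here is a sketch in Python rather than the C or SAS output mentioned above. The equation, knots, field names, and input data are all hypothetical and do not represent actual exported code.

```python
import csv
import io


def mars_score(minutes_last_month, tenure_months):
    """Hypothetical predictive equation with two hinge terms (knots at 300 and 12)."""
    bf1 = max(0.0, minutes_last_month - 300.0)
    bf2 = max(0.0, 12.0 - tenure_months)
    return 250.0 + 0.5 * bf1 - 4.0 * bf2


def score_csv(text):
    """Score every row of a CSV document and return (id, score) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (row["id"], mars_score(float(row["minutes_last_month"]), float(row["tenure_months"])))
        for row in reader
    ]


# Hypothetical new data to be scored
NEW_DATA = """id,minutes_last_month,tenure_months
a1,500,24
a2,200,6
"""
scores = score_csv(NEW_DATA)
print(scores)
```

The scoring function depends on nothing but the equation itself, which is what makes it easy to embed in a separate application.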
|
|
Q11.
What applications is MARS best suited for?
A11. MARS is ideal for predictive
modeling of continuous outcomes such as:
-
How much will a customer spend on his next
catalog order?
-
How large a balance will a credit card
holder carry?
-
How many minutes will a person use a cell
phone this month?
-
What is the expected loss on an insurable
risk?
MARS can also model binary (yes/no)
questions by providing a predicted probability of an outcome. Examples
include questions such as:
-
Will a homeowner refinance her mortgage
in the next quarter?
-
Will a household respond to a direct mail
offer?
-
Will a bank customer sign up for a new
credit card?
-
Will a treatment for a medical condition
succeed?
Classification problems with more than two
outcome categories are better handled by decision trees, such as CART®. Examples of problems
that are not appropriate for MARS include:
-
Which long distance service (AT&T,
MCI, Sprint, Other) will a household use?
-
What type of vehicle (car, van, truck)
will a person purchase?
|
|
Q12.
Why is MARS better than a decision tree for regression?
A12. While decision trees are
excellent classification tools, they can be deficient when it comes to
regression. A decision tree with 30 terminal nodes is capable
of making only 30 distinct predictions (one per node); thus, all records
landing in a node receive exactly the same prediction. MARS is capable
of predicting with much higher resolution and accuracy, typically producing
unique scores for every record in a database.
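The difference in resolution can be illustrated by counting distinct predictions. Both models below are hypothetical: a three-leaf regression tree written as a step function, and a piecewise-linear model with one knot.

```python
def tree_predict(x):
    """Hypothetical regression tree with 3 terminal nodes: one value per node."""
    if x < 10.0:
        return 5.0
    if x < 20.0:
        return 12.0
    return 20.0


def mars_predict(x):
    """Hypothetical piecewise-linear model with a knot at x = 15."""
    return 2.0 + 0.6 * x + 0.4 * max(0.0, x - 15.0)


xs = [float(v) for v in range(30)]
tree_distinct = len({tree_predict(x) for x in xs})
mars_distinct = len({mars_predict(x) for x in xs})
print(tree_distinct, mars_distinct)
```

The tree can never return more distinct values than it has terminal nodes, while the piecewise-linear model gives a different score for each distinct input value here.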
|
|
Q13.
How does MARS compare with neural nets?
A13. MARS is always much faster
and more interpretable than a neural net and is often more accurate as
well. See De Veaux et al., 1993 (Computers chem. Engng, Vol. 17,
No. 8, p. 819) for a comparative study in which the authors suggest that
MARS could be used instead of neural nets in a wide variety of applications.
In addition, unlike neural nets, MARS automatically determines which variables
to use, thereby saving considerable analyst time and effort.
|
|
Q14.
How quickly can MARS generate results?
A14. Because MARS' highly
automated, fast analytical engine generates results much faster than other
methods, it can be used to slash the development time of conventional statistical
modeling. Exploratory desktop analyses on a sample of 10,000 records
and 10 input variables can be conducted in less than 20 seconds.
More typical problems involving 100,000 records and 30 predictor variables
run in approximately 10 minutes on a 400 MHz desktop while problems with
500,000 records and 100 variables can be analyzed in less than 2 hours
on industry standard servers.
|
|
Q15.
How large a problem can MARS handle?
A15. MARS can handle up to 8,000
database fields and as many training records as can be loaded into RAM.
Current data mining experience confirms that MARS scales to the largest
enterprise servers.
|