### MARS™ (Multivariate Adaptive Regression Splines)

### WHAT IS "MARS"?

MARS, an acronym for Multivariate Adaptive Regression Splines, is a multivariate non-parametric regression procedure introduced in 1991 by the world-renowned Stanford statistician and physicist Jerome Friedman (Friedman, 1991). Salford Systems' MARS, based on the original code, has been substantially enhanced with new features and capabilities in exclusive collaboration with Friedman.

### Overview of Methodology

The MARS procedure builds flexible regression models by fitting separate splines (or basis functions) to distinct intervals of the predictor variables. Both the variables to use and the end points of the intervals for each variable, referred to as knots, are found via a brute-force, exhaustive search procedure, using very fast update algorithms and efficient program coding. Variables, knots and interactions are optimized simultaneously by evaluating a "lack of fit" (LOF) criterion: at each step, MARS chooses the candidate that most reduces the LOF. In addition to searching variables one by one, MARS also searches for interactions between variables, allowing any degree of interaction to be considered.

The "optimal" MARS model is selected in a two-phase process. In the first phase, a model is grown by adding basis functions (new main effects, knots, or interactions) until an overly large model is found. In the second phase, basis functions are deleted in order of least contribution to the model until an optimal balance of bias and variance is found. By allowing the response function to take an arbitrary shape, by allowing for interactions, and by using the two-phase model selection method, MARS is capable of reliably tracking the very complex data structures that often hide in high-dimensional data.

### Core Capabilities

MARS core capabilities include:
### Applications

This new, flexible regression modeling tool is applicable to a wide variety of data analyses, particularly those in which variables may need transformation and interaction effects are likely to be relevant. The software can help a data analyst rapidly search through many plausible models and quickly identify important interactions, insights that can lead to significant model improvements. Further, because the software can be run with intelligent default settings, analysts at all levels can, for the first time, easily access MARS' innovations.

MARS can also be used in conjunction with CART (co-developed by Friedman). CART can first be used to extract the most important variables from a very large list of potential predictors. MARS can then focus on the top variables from the CART model, resulting in faster MARS analyses and more accurate and robust models.

### Graphical User Interface

Salford Systems' MARS has an easy-to-use, intuitive graphical user interface (GUI). As shown below, the interface allows the user to control the variables and functional forms to be entered into the model and the interactions to be considered or forbidden, while allowing the MARS algorithm to optimize those parts of the model the analyst chooses to leave free. Once the model is selected, the user can easily remove or add terms, instantly see the impact of changes on model fit, review diagnostics that assist in model selection, save the model, and apply the model to new data for prediction.

Other MARS GUI features include an optional batch/command-line mode, spreadsheet-style browsing of the input data set, and summary text reports. The enhanced MARS text report includes extensions to the "classic" output (e.g., addition of residual sums of squares, log-likelihood, and other useful diagnostics), making the results easier to comprehend and assisting the analyst in refining the model in subsequent runs.

In addition, the MARS interface provides all essential data management facilities for:

### Visualization of Results

In addition to summary text reports, MARS results are also displayed in the Results dialog box. The GUI output includes ANOVA decomposition, variable importance, and final model tables as well as graphical plots.

MARS automates both the selection of variables and the non-parametric transformation of variables to achieve the best model fit. Variable transformation is accomplished implicitly through the piecewise regression function used by MARS to trace arbitrary non-linear functions. MARS communicates this non-parametric transformation graphically, displaying the predicted response as a function of either one or two variables. MARS automatically produces 2-D plots for main effects (response variable as a function of each predictor) and 3-D surface plots for interactions, with options to spin and rotate. For higher-order interactions, the user can choose slices of the function for display of 2-D and 3-D subspaces. Examples of main effects and interaction plots are shown below.
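
To make the piecewise regression function described in this section more concrete, the following is a minimal single-variable sketch. It is not Salford Systems' code and does not use the fast update algorithms mentioned in the methodology overview. It assumes the truncated linear ("hinge") basis functions of Friedman (1991), uses the plain residual sum of squares in place of the full LOF criterion, and all function names (`hinge`, `solve`, `fit_least_squares`, `best_knot`) are hypothetical names chosen for illustration.

```python
# Illustrative sketch only: exhaustive knot search for ONE predictor,
# using a pair of hinge basis functions and ordinary least squares.
# Assumption: RSS stands in for the lack-of-fit (LOF) criterion.

def hinge(x, knot, sign=+1):
    """Hinge basis: max(0, x - knot) if sign > 0, else max(0, knot - x)."""
    return max(0.0, sign * (x - knot))


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - s) / m[r][r]
    return coef


def fit_least_squares(rows, y):
    """Least squares via the normal equations; returns (coefficients, rss)."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve(xtx, xty)
    rss = sum(
        (yi - sum(c * v for c, v in zip(coef, r))) ** 2
        for r, yi in zip(rows, y)
    )
    return coef, rss


def best_knot(x, y):
    """Try every interior data value as a knot; keep the one with the lowest RSS."""
    best = None
    for knot in sorted(set(x))[1:-1]:
        # Design row: intercept, right-hand hinge, left-hand hinge.
        rows = [[1.0, hinge(xi, knot, +1), hinge(xi, knot, -1)] for xi in x]
        coef, rss = fit_least_squares(rows, y)
        if best is None or rss < best[2]:
            best = (knot, coef, rss)
    return best


# Made-up demonstration data with a change of slope at x = 4.
demo_x = [float(i) for i in range(11)]
demo_y = [3.0 + 2.0 * hinge(xi, 4.0, +1) + 1.0 * hinge(xi, 4.0, -1) for xi in demo_x]
demo_knot, demo_coef, demo_rss = best_knot(demo_x, demo_y)
print("selected knot:", demo_knot)
```

A full MARS fit would repeat a search like this across all predictors and products of basis functions (interactions), then prune the resulting model in the second phase. This sketch covers only the knot search for one variable.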

### References

Friedman, J. H. (1991), Multivariate Adaptive Regression Splines (with discussion), Annals of Statistics, 19, 1-141 (March).

Steinberg, D. and Colla, P. L. (1995), CART: Tree-Structured Nonparametric Data Analysis, San Diego, CA: Salford Systems.

Steinberg, D., Colla, P. L., and Martin, K. (1999), MARS User Guide, San Diego, CA: Salford Systems.

### The Hybrid CART-Logit Model in Classification and Data Mining

(PowerPoint presentation available for download) - 120k

CART® and logistic regression are among the most widely used classification and response modeling tools, and both have exhibited excellent performance in a variety of applications. The methods are so fundamental that they appear in almost all data mining suites. Since CART and logit have quite different strengths, it is natural to investigate whether some combination might prove superior. We introduce a new method for combining CART and logit, which exhibits performance advantages and admits of a natural set of statistical evaluation tests.

Awarded "Best Presentation" at the Eighth Annual Advanced Research Techniques Forum, American Marketing Association, Keystone, CO, 1998.

### Achieving Results With Next-Generation Data-Mining Techniques

(PowerPoint presentation available for download) - 218k

How do you get all you can from your data mining projects? Recent research reveals that combining the strengths of different data-mining tools can dramatically improve the accuracy of results. In this paper, we illustrate how to hybridize decision trees, neural nets and advanced statistics using case studies drawn from the marketing and financial services sectors. Paper presented at DCI's Database and Client Server World, Boston, MA.

### Shares, Bonds, or Cash? Asset allocation in the new economy, using CART®

How an Australian stock brokerage is using CART to optimize allocation of investments. For a copy of this paper, send an email to info@salford-systems.com

### Critical Features of Decision Trees

PDF file available for download - 158k

Decision trees have justifiably become one of the most popular data mining tools. They are relatively easy to use, the results can usually be displayed in an easy-to-read flow chart, and their predictive accuracy can be good to excellent across a broad range of database types and structures. This document reviews some of the key features of decision trees that any informed analyst should consider when choosing this kind of data mining tool for important data analyses.

### Statistical Process Analysis of Medical Incidents

by Norio Suzuki, Sojiro Kirihara, and Atsushi Ootaki of Meiji University, Japan

PDF file available for download - 127k

Recently, in an effort to construct a system that reduces the risks of medical care, people working in the medical field have pursued continual improvement through team activities. Knowledge from total quality management (TQM), especially the statistical process analysis and control (SPC) developed in industry, seems to be applicable to medical care. This paper describes the application of statistical process analysis and control to continual improvement in medical care.