Comment Re:Light on details... (Score 1) 79

Hi

I should declare my hand up front and let you know I'm a co-founder of Purple Insight, the company that is referred to in another comment about this article:

http://science.slashdot.org/comments.pl?sid=114124&threshold=1&commentsort=0&tid=134&tid=137&tid=198&mode=thread&pid=9668022#9671027

I'll avoid making this a commercial and talk about the techniques you ask about at a generic level. It might not surprise you to know that our product MineSet provides these techniques and more. Please browse www.purpleinsight.com for details. If you would like me to contact you directly I can point you to some material outlining an approach to credit risk analysis that includes visual data mining.

In general, the types of technique applicable to the issue you describe are:

- Visualisation of the data you have, for example plotting customer home location against promptness of payment against size of loan, with other attributes such as income or age represented by the size, colour or shape of each plotted point. This type of plot can be animated over time, age or any other useful attribute. This will help you get an impression of trends and outliers that might warrant further investigation. Ideally your visualisation system will be powerful enough to keep this process interactive for large data sets and will include intuitive filtering, selection and drill-down features.

- Clustering of the data into groups of similar records. If your tool lets you determine the most important attributes driving membership of a given cluster, you can then visualize those attributes against each other with each cluster a different colour. This again allows identification of outliers and anomalies (excellent for fraud detection) and also observations such as 'people of age $XX'

- If there is a target variable of interest, such as default on payment, and a predictive model would be helpful, Decision Tree and Evidence classifiers are extremely powerful for this (and the corresponding visualizations make it easy to identify the causal relationships in such a model).
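
To make the decision tree point a little more concrete, here is a minimal sketch in plain Python of the first step a tree learner takes: finding the single split that best separates defaulters from non-defaulters, scored by Gini impurity. It is not how MineSet or any particular product does it. The data, the attribute names (`age`, `income`) and the helper functions are all invented for illustration, and a real tree would repeat this step recursively on each side of the split.

```python
# Toy data, invented for illustration: (age, income in thousands, defaulted?)
LOANS = [
    (22, 18, True),
    (25, 21, True),
    (31, 24, True),
    (28, 45, False),
    (35, 52, False),
    (41, 38, False),
    (47, 60, False),
    (53, 23, True),
]
# Hypothetical attribute names mapped to their column index in each row.
FEATURES = {"age": 0, "income": 1}


def gini(labels):
    """Gini impurity of a list of boolean labels (0.0 means the group is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows):
    """Return (feature, threshold, weighted_gini) for the best single split."""
    best = None
    for name, col in FEATURES.items():
        values = sorted({row[col] for row in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [row[2] for row in rows if row[col] <= threshold]
            right = [row[2] for row in rows if row[col] > threshold]
            # Weight each side's impurity by the share of rows it holds.
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[2]:
                best = (name, threshold, score)
    return best


feature, threshold, score = best_split(LOANS)
print(f"split on {feature} <= {threshold} (weighted Gini {score:.3f})")
```

On this made-up data the best split is on income at 31.0, which separates the two groups completely. Real credit data will never be that clean, which is where the visualization of the resulting tree earns its keep.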

Regards
Rob
