Data Mining Techniques – Chapter 1

Data mining is the process of finding patterns and rules within a data set in order to extract meaningful information.

What is needed to form a relationship with a customer?

  • Notice what customers are doing
  • Record and recall those interactions
  • Learn from the records
  • Act on lessons learned

What is needed to succeed – Organization

  • Data needs to be produced
  • Data needs to be collected
  • Data needs to be warehoused
  • Sufficient computing power needs to be available
  • A business needs to maintain interest in solving business problems

Service businesses need to anticipate customer needs, and data can help with that. For instance, sales records for concrete can indicate when someone is likely to need a new parking lot.

Some organizations can become information vendors, selling consulting based on insights drawn from their data exhaust. For example, a credit card company could tell a cookie company which stores sell the most of its cookies relative to other cookies.

Commercial products (RapidMiner, Salford Predictive Modeler) are now available to make data mining easier than with open source tools, which have problems with things like uniformity.

Skills for an individual to succeed

  • Numerical knowledge
  • Knowledge of statistics
  • Knowledge of Excel
  • Knowledge of data mining techniques
  • Experience in applying techniques
  • Confidence in ability
  • Presentation skills
  • Ability to collaborate

Data Mining Cycle – Run through iterations of the cycle in order to create gradual improvements to the process.

Case study – Bank of America

Not attracting enough home equity lines of credit

Lowering the cost of loans was suggested as an option, but there were fears of losing money on existing customers, failing to attract high quality customers, and attracting low quality customers.

Proposing possible solutions – Two solutions proposed by experts

  • Customers with college aged children may use a HE loan for tuition because it’s less expensive than a student loan
  • High income individuals such as salespeople may use HE loans to smooth out their income

Actions taken on these two scenarios were expensive and offered a very low ROI.

Data mining approach

  • BOA has over 100 years of data with 250 fields related to demographics and internal data
  • These data were cleaned, transformed and aligned with a data warehouse
  • A customer signature was created
  • Modeling – A decision tree was selected in order to make a model that could flag new accounts based on historical patterns
  • Customers could not be contacted at the optimal time to influence the purchase of a home equity loan
  • Clustering – Automatically segmenting groups on automatically generated criteria
  • Insights
    • 39% of customers had both personal and business accounts
    • One cluster (of 14) had 25% of likely responders
  • New hypothesis: Customers starting businesses with HE loans
  • Acting on these insights produced a 10x growth in the response rate to marketing for HE loans
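
A minimal sketch of how an insight like "one cluster had 25% of likely responders" could be computed once each customer has a cluster id and a model flag. The function name, the record layout, and the toy data are all made up for illustration; this is not the bank's actual method or data.

```python
from collections import Counter

def responder_share_by_cluster(records):
    """Return {cluster_id: share of all likely responders found in that cluster}.

    `records` is an iterable of (cluster_id, is_likely_responder) pairs.
    This layout is an assumption made for the sketch.
    """
    responders = Counter(cluster for cluster, likely in records if likely)
    total = sum(responders.values())
    if total == 0:
        return {}
    return {cluster: count / total for cluster, count in responders.items()}

# Made-up toy data: (cluster id, flagged as a likely responder by the model)
toy_records = [
    (1, True), (1, True), (1, False),
    (2, True), (2, False), (2, False),
    (3, True), (3, False), (3, False), (3, False),
]

shares = responder_share_by_cluster(toy_records)
# List clusters from the largest share of likely responders to the smallest.
for cluster, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"cluster {cluster}: {share:.0%} of likely responders")
```

A cluster holding a much larger share of likely responders than its size would suggest is the kind of result that prompts a new hypothesis.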

Steps of a virtuous cycle

  1. Identify Business Opportunities
  2. Mine data for actionable information
  3. Act on the information presented
  4. Measure the results

Identify business opportunities – Choose a problem that will help the business. Don’t waste time on problems with a low-value return.

Good business problems –

  • Plan for a new product introduction
  • Plan direct marketing campaigns
  • Understand attrition/churn
  • Evaluate results of a modeling test
  • Allocate marketing efforts to target most profitable customers

Questions to ask

  • What types of customers responded to our last campaign?
  • Where do our best customers live?
  • How do wait times affect attrition?
  • Is support used by profitable customers?
  • What should we promote based on a market basket analysis?
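
For the market basket question, a minimal sketch of the usual pair metrics (support, confidence, lift) may help. The function name, the basket layout, and the toy baskets are assumptions made for illustration, not data from the chapter.

```python
def pair_metrics(baskets, a, b):
    """Return (support, confidence, lift) for the rule "a implies b".

    `baskets` is a list of sets of item names (a layout assumed for this sketch).
    Returns None when a metric would be undefined.
    """
    n = len(baskets)
    has_a = sum(1 for basket in baskets if a in basket)
    has_b = sum(1 for basket in baskets if b in basket)
    has_both = sum(1 for basket in baskets if a in basket and b in basket)
    if n == 0 or has_a == 0 or has_b == 0:
        return None
    support = has_both / n          # how often the pair occurs at all
    confidence = has_both / has_a   # how often b appears when a does
    lift = confidence / (has_b / n) # confidence relative to b's base rate
    return support, confidence, lift

# Made-up toy baskets
toy_baskets = [
    {"cookies", "milk"},
    {"cookies", "milk", "bread"},
    {"bread"},
    {"milk"},
]

support, confidence, lift = pair_metrics(toy_baskets, "cookies", "milk")
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the two items are bought together more often than chance alone would predict, which makes the pair a candidate for promotion.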

Where to get started?

  • If possible, interview a business expert
  • Have them focus on the business problems; you act as the technical expert and let them be the business expert

Transforming data into information

  • Address bad data formats
  • Be careful with date-based data (planned vs. actual dates). You usually don’t want both.
  • Make sure data is functional
  • Take legal considerations into account – You can’t say “It’s because a neural network told me to say no to your loan”
  • Organizations resist change without strong incentives
  • Timeliness – Data has to be relevant now. You can’t use airport wait times from 1995, because a lot has changed in policy and technology since then.

Take action on information

  • Incorporate into recommendation systems
  • Communication method
  • Prioritize customer service
  • Adjust inventory levels

Measure results

  • See what they are
  • Take insights from results
  • Take lessons on process

Questions to ask (Continued)

  • Are we bringing in more profitable customers?
  • Do different models bring in better response rates?
  • How is our customer retention?
  • What are the characteristics of loyal customers?
  • Do new customers buy additional products?
  • Do some messages work better than others?
  • What channels get us the best response?
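
Several of these questions come down to comparing response rates across groups (channels, messages, or models). A minimal sketch, assuming contact records arrive as (group, responded) pairs; the function name and the toy data are made up for illustration.

```python
from collections import defaultdict

def response_rates(contacts):
    """Return {group: responders / contacted} for (group, did_respond) pairs."""
    contacted = defaultdict(int)
    responded = defaultdict(int)
    for group, did_respond in contacts:
        contacted[group] += 1
        if did_respond:
            responded[group] += 1
    # Every group in `contacted` has at least one contact, so no division by zero.
    return {group: responded[group] / contacted[group] for group in contacted}

# Made-up toy data: (channel, whether the customer responded)
toy_contacts = [
    ("email", True), ("email", False), ("email", False), ("email", False),
    ("mail", True), ("mail", False), ("mail", False), ("mail", False), ("mail", False),
    ("phone", True), ("phone", False),
]

rates = response_rates(toy_contacts)
for channel, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{channel}: {rate:.0%}")
```

With samples this small the differences mean little; real comparisons need enough contacts per group for the rates to be stable.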

Lifetime customer value – The golden metric. How do we determine the value of a customer over the entire relationship? Or at least over the next year, month, or similar period?
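
One common simplification, not taken from the chapter, estimates value over a fixed horizon by summing the expected margin per period, weighted by the chance the customer is still active and discounted back to today. The function name, parameters, and example numbers below are assumptions made for illustration.

```python
def lifetime_value(margin_per_period, retention_rate, discount_rate, periods):
    """Estimate customer value over `periods` periods.

    Assumes a constant margin, a constant per-period retention rate, and a
    constant per-period discount rate. These are simplifying assumptions.
    """
    total = 0.0
    for t in range(periods):
        survival = retention_rate ** t       # chance the customer is still active
        discount = (1 + discount_rate) ** t  # time value of money
        total += margin_per_period * survival / discount
    return total

# Made-up example: 20 of margin per month, 95% monthly retention,
# 1% monthly discount rate, valued over the next 12 months.
value_next_year = lifetime_value(20.0, 0.95, 0.01, 12)
print(f"estimated 12-month value: {value_next_year:.2f}")
```

Changing `periods` gives the "next year or month" variants of the metric; the full-relationship value corresponds to a long horizon.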