RiskTech Forum

SAS: 5 Machine Learning Mistakes

Posted: 3 July 2017  |  Source: SAS


Machine learning gives organizations the potential to make more accurate data-driven decisions and to solve problems that have stumped traditional analytical approaches. However, machine learning is not magic. It presents many of the same challenges as other analytics methods. In this article, we introduce some of the common machine learning mistakes that organizations must avoid to successfully incorporate this technique into their analytics strategy.

Machine learning mistake 1: Planning a machine learning program without data scientists

The shortage of deep analytics talent continues to be a glaring challenge, and the need for employees who can manage and consume analytical content is even greater. Recruiting and keeping these in-demand technical experts has become a significant focus for many organizations.

Data scientists, the most skilled analytics professionals, need a unique blend of computer science, mathematics and domain expertise. Experienced data scientists command high price tags and demand engaging projects.

How to solve it?

Machine learning mistake 2: Starting without good data

While improving algorithms is often seen as the glamorous side of machine learning, the ugly truth is that the majority of the time is spent preparing data and dealing with quality issues. Data quality is essential to getting accurate results from your models. Some data quality issues include:

Unfortunately, many things can go wrong with data in collection and storage processes, but steps can be taken to mitigate the problems.
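
As one illustration of such a mitigation step, the sketch below audits a batch of records before any modeling starts. It is a minimal example under stated assumptions, not a method from the original article: the function name audit_records, the field names and the three checks it performs (missing values, out-of-range values, exact duplicate rows) are all hypothetical choices made for this example.

```python
from collections import Counter


def audit_records(records, required_fields, valid_ranges):
    """Count simple data-quality problems in a list of dict records.

    All names and checks here are illustrative assumptions.
    """
    report = Counter()
    seen = set()
    for rec in records:
        # Required fields that are absent, None, or an empty string.
        for field in required_fields:
            if rec.get(field) in (None, ""):
                report["missing:" + field] += 1
        # Numeric values that fall outside the expected range.
        for field, (low, high) in valid_ranges.items():
            value = rec.get(field)
            if isinstance(value, (int, float)) and not (low <= value <= high):
                report["out_of_range:" + field] += 1
        # Rows that exactly repeat an earlier row.
        key = tuple(sorted(rec.items()))
        if key in seen:
            report["duplicate_rows"] += 1
        seen.add(key)
    return dict(report)


# Hypothetical sample data for the usage example.
sample = [
    {"id": 1, "age": 34, "income": 52000.0},
    {"id": 2, "age": None, "income": 48000.0},
    {"id": 3, "age": 212, "income": 61000.0},
    {"id": 1, "age": 34, "income": 52000.0},
]
report = audit_records(sample, ["id", "age", "income"], {"age": (0, 120)})
print(report)
```

A report like this only surfaces problems. Deciding whether to drop, correct or impute the affected records remains a judgment call that depends on the data and the model.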

How to solve it?

Machine learning mistake 3: An insufficient infrastructure for machine learning

For most organizations, managing the various aspects of the infrastructure surrounding machine learning activities can become a challenge in and of itself. Trusted and reliable relational database management systems can fail completely under the load and variety of data that organizations seek to collect and analyze today.

How to fix it?

Planning for the following areas can ensure your infrastructure is built to handle machine learning.

  • Hardware acceleration. For I/O-intensive tasks such as data preparation or disk-enabled analytics software, use solid-state drives (SSDs). For computationally intensive tasks that can be run in parallel, such as matrix algebra, use graphics processing units (GPUs).
  • Distributed computing. In distributed computing, data and tasks are split across many connected computers, often reducing execution times. Make sure you are using a distributed environment that’s well suited for machine learning.
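
The split-and-combine idea behind distributed computing can be sketched in a few lines. This is a simplified illustration under stated assumptions: threads on a single machine stand in for the connected computers, and the function names split, partial_stats and distributed_mean_variance are hypothetical. A production environment would normally rely on a dedicated distributed framework rather than hand-written code like this.

```python
from concurrent.futures import ThreadPoolExecutor


def split(values, n_parts):
    """Split a list into n_parts nearly equal contiguous chunks."""
    size, extra = divmod(len(values), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def partial_stats(chunk):
    """Work done by one worker: count, sum and sum of squares of its chunk."""
    return len(chunk), sum(chunk), sum(x * x for x in chunk)


def distributed_mean_variance(values, n_workers=4):
    """Compute the mean and population variance from per-chunk summaries."""
    # Each worker sees only its own chunk (threads stand in for machines).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(partial_stats, split(values, n_workers)))
    # Combine the small per-chunk summaries into the global result.
    n = sum(p[0] for p in parts)
    total = sum(p[1] for p in parts)
    total_sq = sum(p[2] for p in parts)
    mean = total / n
    return mean, total_sq / n - mean * mean


mean, variance = distributed_mean_variance([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
print(mean, variance)
```

The point of the design is that each worker returns only a small summary, so the combining step moves very little data regardless of how large the chunks are.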

Machine learning mistake 4: Implementing machine learning too soon or without a strategy

Many data-driven organizations have spent years developing successful analytics platforms. Choosing when to incorporate newer, more complex modeling methods into an overall analytics strategy is a difficult task. The transition to machine learning techniques may not even be necessary until IT and business needs evolve. In regulated industries, the interpretation, documentation and justification of complex machine learning models add a further burden.

How to fix it?

Position machine learning as an extension to existing analytical processes and other decision-making tools. For example, a bank may use traditional regression in its regulated dealings but use a more accurate machine learning technique to predict when a regression model is growing stale and needs to be refreshed.
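
To make the idea of a stale regression model concrete, here is a deliberately simplified sketch. Note the substitution: where the bank example describes a machine learning technique that predicts staleness, this sketch uses a plain error-threshold rule, which is only a stand-in. The function names, the sample numbers and the tolerance of 2.0 are assumptions chosen for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def mean_abs_error(model, xs, ys):
    """Average absolute gap between observed and predicted values."""
    slope, intercept = model
    return mean(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))


def is_stale(model, baseline_error, recent_xs, recent_ys, tolerance=2.0):
    """Flag the model when recent error exceeds tolerance times the baseline.

    The threshold rule and the default tolerance are illustrative assumptions.
    """
    return mean_abs_error(model, recent_xs, recent_ys) > tolerance * baseline_error


# Hypothetical training data that roughly follows y = 2x + 1.
train_x = [0, 1, 2, 3, 4]
train_y = [1.1, 2.9, 5.2, 6.8, 9.0]
model = fit_line(train_x, train_y)
baseline = mean_abs_error(model, train_x, train_y)

# Recent data that still follows the old relationship.
steady = is_stale(model, baseline, [5, 6, 7], [11.0, 13.1, 14.9])
# Recent data after the underlying relationship has shifted.
shifted = is_stale(model, baseline, [5, 6, 7], [14.0, 16.5, 19.0])
print(steady, shifted)
```

In this arrangement the regulated regression model stays unchanged and fully documented, while the monitoring logic sits beside it and only signals when a refresh is worth considering.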

For organizations with the ambition and business need to try modern machine learning, several innovative techniques have proven effective:

Machine learning mistake 5: Difficulties interpreting or sharing model methodologies

What makes machine learning algorithms difficult to understand is also what makes them excellent predictors: They are complex. A major difficulty is that most machine learning algorithms are seen as black boxes. In some industries, such as banking and insurance, models simply have to be explainable due to regulatory requirements.

How to fix it?

A hybrid strategy of traditional approaches and machine learning techniques can be a viable solution to some interpretability problems. Some example hybrid strategies include:

Effective use of machine learning in business entails developing an understanding of machine learning within the broader analytics environment, becoming familiar with proven applications of machine learning, anticipating the challenges you may face using machine learning in your organization, and learning from leaders in the field.