Tech Topic

AI/ML on OpenShift

Accelerate machine learning (ML) workflows for data scientists with Red Hat OpenShift, a self-service Kubernetes hybrid cloud platform

Join us at the upcoming OpenShift Commons Gathering on AI/ML
October 28, 2019 | San Francisco, California


What are artificial intelligence, machine learning, and deep learning?

Artificial intelligence (AI)

AI is the capability of machines to imitate intelligent human behavior and perform tasks that normally require human intelligence. The term “artificial intelligence” was coined in 1955 by John McCarthy, a mathematics professor at Dartmouth College.1

Machine learning (ML)

ML is a subset of AI that gives computers the ability to learn without being explicitly programmed. Computers use algorithms and statistical models to perform specific tasks without explicit instructions, relying on patterns and inference instead.2

Deep learning (DL)

Deep learning is a subset of machine learning that uses multiple layers to progressively extract higher-level features from raw input. Deep learning architectures have been applied to fields including computer vision, natural language processing, and image analysis.3

What is an ML lifecycle?

The machine learning lifecycle is a multiphase process that harnesses large volumes and varieties of data, abundant compute, and open source machine learning tools to build intelligent applications.

At a high level, there are three steps in the lifecycle:

  1. Data acquisition and preparation, to ensure the input data is complete and of high quality
  2. ML modeling, which includes training, testing, and selecting the model with the highest prediction accuracy
  3. ML model deployment in the application development process, and inferencing
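The three lifecycle steps above can be sketched with a deliberately minimal, purely illustrative example: candidate models are trained on a synthetic dataset, the one with the highest held-out accuracy is selected, and that model is then used for inferencing. The data and "models" (simple threshold classifiers) are hypothetical stand-ins for real tooling such as TensorFlow or PyTorch.

```python
import random

random.seed(0)

# Step 1: data acquisition and preparation (synthetic, for illustration).
# The label is 1 when the feature exceeds 0.5, with a little noise.
xs = [random.random() for _ in range(200)]
data = [(x, 1 if x + random.uniform(-0.1, 0.1) > 0.5 else 0) for x in xs]
train, test = data[:150], data[150:]

# Step 2: ML modeling -- train/evaluate candidate models and select the
# one with the highest prediction accuracy on the training set.
def accuracy(threshold, rows):
    return sum((x > threshold) == bool(y) for x, y in rows) / len(rows)

candidates = [0.3, 0.5, 0.7]  # hypothetical hyperparameter search space
best = max(candidates, key=lambda t: accuracy(t, train))

# Step 3: deployment and inferencing -- apply the selected model to new data.
def predict(x):
    return int(x > best)

print("selected threshold:", best)
print("held-out accuracy:", round(accuracy(best, test), 2))
```

In a real pipeline each step would be a containerized workload, but the shape of the loop (prepare, train and select, then serve) is the same.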

Key challenges facing data scientists

Data scientists are primarily responsible for ML modeling, ensuring that the selected ML model continues to provide the highest prediction accuracy.

The key challenges data scientists face are:

  1. Selecting and deploying the right ML tools (e.g., Apache Spark, TensorFlow, PyTorch)
  2. The complexity and time required to train, test, select, and retrain the ML model that provides the highest prediction accuracy
  3. Slow execution of ML modeling and inferencing tasks due to a lack of hardware acceleration
  4. Repeated dependency on IT operations to provision and manage infrastructure
  5. Collaborating with data engineers and software developers to ensure input data hygiene and successful ML model deployment in application development processes

Why use containers and Kubernetes for your machine learning initiatives?

Containers and Kubernetes are key to accelerating the ML lifecycle: they give data scientists the much-needed agility, flexibility, portability, and scalability to train, test, and deploy ML models.

Red Hat OpenShift is the industry's leading containers and Kubernetes hybrid cloud platform. It provides all of these benefits, and through its integrated DevOps capabilities and integrations with hardware accelerators, OpenShift enables better collaboration between data scientists and software developers and accelerates the rollout of intelligent applications across the hybrid cloud (data center, edge, and public clouds).

Benefits of Red Hat® OpenShift® for ML initiatives

Self-service, consistent, cloud-like experience for data scientists across the hybrid cloud

  • Empowers data scientists with the flexibility and portability to use containerized ML tools of their choice to quickly build, scale, reproduce, and share ML modeling results in a consistent way with peers and software developers. This is enabled by integrations with key ML tools using Kubernetes Operators.
  • Eliminates dependency on IT to provision infrastructure for iterative, compute-intensive ML modeling tasks.
  • Eliminates “lock-in” concerns with any particular cloud provider and its menu of ML tools.
  • Offers both hosted and self-managed options with on-demand scaling.

Accelerates compute-intensive ML modeling and inferencing jobs

Integrations with popular hardware accelerators, such as NVIDIA GPUs and NGC containers, ensure that OpenShift can seamlessly meet the high compute resource requirements of selecting the ML model with the highest prediction accuracy and of running ML inferencing jobs as the model encounters new data in production.

Streamlines development of intelligent applications

Extending OpenShift DevOps automation capabilities to the ML lifecycle enables collaboration between data scientists, software developers, and IT operations so that ML models can be quickly integrated into the development of intelligent applications. This helps boost productivity and simplifies lifecycle management for ML-powered intelligent applications.

Key use cases for machine learning on OpenShift

OpenShift is helping organizations across various industries accelerate business- and mission-critical initiatives by developing intelligent applications in the hybrid cloud. Example use cases include fraud detection, data-driven diagnostics and treatment, connected cars, autonomous driving, oil and gas exploration, automated insurance quotes, and claims processing.

Red Hat® Ceph Storage complements OpenShift with data management in the ML lifecycle

Red Hat Ceph Storage was built to address petabyte-scale storage requirements in the ML lifecycle, from data ingestion and preparation through ML modeling to inferencing. It is an open source, software-defined storage system that provides comprehensive support for S3 object, block, and file storage and delivers massive scalability on industry-standard commodity hardware.

For example, you can present scalable Ceph storage to containerized Jupyter notebooks on OpenShift via S3 or persistent volumes.

Open Data Hub Project to build a complete ML platform

The Open Data Hub project is a functional architecture based on OpenShift, Red Hat Ceph Storage, Red Hat AMQ Streams, and several upstream open source projects that helps build an open ML platform with the necessary ML tooling.

For additional information on the Open Data Hub project, read the blogs and architecture guidelines.