Session 3 - ML Deployment

ML Deployment

Christian Cabrera Jojoa

Senior Research Associate and Affiliated Lecturer

Department of Computer Science and Technology

University of Cambridge

chc79@cam.ac.uk

Session 3 - ML Deployment

Course Material

Session 3 - ML Deployment

Last Time

Session 3 - ML Deployment

The ML Adoption Process

Session 3 - ML Deployment

The ML Adoption Process

AI Puzzle
Session 3 - ML Deployment

The ML Adoption Process

AI Adoption
Session 3 - ML Deployment

The Problem First

Session 3 - ML Deployment

The Problem First

Context People
Session 3 - ML Deployment

The Problem First

AI Adoption

Important questions:

  • What are the people's needs?
  • Why is the problem important?
  • What are the problem constraints?
  • What are the important variables to consider?
  • What are the relevant metrics?
  • What is the data we need?
  • Do we need ML?
  • ...
Session 3 - ML Deployment

The Problem First

ML Project Canvas
Session 3 - ML Deployment

The Problem First

The systems engineering approach is better equipped than the ML community's usual practice to facilitate the adoption of this technology, because it prioritises the problems and their context before any other aspect.

Session 3 - ML Deployment

The Systems Engineering Approach

Systems Thinking
Process Model
  • Systems views: Defining the problem from different perspectives
  • Top-down analysis: A divide-and-conquer approach
  • Agile systems: Flexible architectures and solutions
  • Variant creation: Assessing solution alternatives
  • System dynamics: Models that show how systems evolve
  • Problem-solving cycle: Following a methodology
Session 3 - ML Deployment

The Systems Engineering Approach

This approach contrasts with the way we often work today.

"Move Fast and Break Things" (Zuckerberg, 2014)

  • Move fast and deliver working software
  • Embrace failure as a learning opportunity
  • Prioritise speed and agility
  • ...
MLTR Framework
Session 3 - ML Deployment

The Data Science Process

Session 3 - ML Deployment

The Data Science Process

Data Science Process
Session 3 - ML Deployment

Machine Learning Pipeline

Data Assess Pipeline
Session 3 - ML Deployment

ML Pipeline vs ML-based System

Data Assess Pipeline
ML-based System
Session 3 - ML Deployment

ML Deployment

Session 3 - ML Deployment

ML Deployment

Ewaso Nyiro River
Ewaso Nyiro River - Kenya: Marc Samsom, CC BY 2.0, via Wikimedia Commons
Water Level Monitoring
Water Level Monitoring System at DeKUT (Kabi & Maina, 2021)
Session 3 - ML Deployment

ML Deployment

Water Level Monitoring System Architecture
Water Level Monitoring System Architecture
Session 3 - ML Deployment

ML Deployment

Service Placement Problem

Dynamic Service Placement in Edge Computing


  • Edge servers are located close to end users, allowing for local data processing.
  • Services run on edge servers, which have limited resources.
  • The challenge is to determine the optimal allocation of services to edge servers, minimizing latency while respecting resource constraints.
  • This challenge is referred to as the Service Placement Problem.
Session 3 - ML Deployment

ML Deployment

Service Placement Problem

Dynamic Service Placement in Edge Computing


Objective Functions:


Subject to:
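One common way to write the objective and constraints (an illustrative sketch; the symbols below are assumptions rather than the lecture's exact notation) is as a binary assignment problem that minimises total latency:

$$\min_{x} \sum_{s \in S} \sum_{e \in E} x_{s,e}\,\ell_{s,e}
\quad \text{s.t.} \quad
\sum_{e \in E} x_{s,e} = 1 \;\; \forall s \in S, \qquad
\sum_{s \in S} r_s\, x_{s,e} \le C_e \;\; \forall e \in E, \qquad
x_{s,e} \in \{0,1\},$$

where $x_{s,e}$ indicates that service $s$ is placed on edge server $e$, $\ell_{s,e}$ is the latency users experience under that placement, $r_s$ is the resource demand of service $s$, and $C_e$ is the capacity of server $e$.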

Session 3 - ML Deployment

ML Deployment

Ant-Colony Optimisation

Ant Colony Optimization Algorithm

$$p_{ij}(t) = \frac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}]^{\beta}}{\sum_{l \in \mathcal{N}_i} [\tau_{il}(t)]^{\alpha}\,[\eta_{il}]^{\beta}}$$

$$\tau_{ij}(t+1) = (1 - \rho)\,\tau_{ij}(t) + \Delta\tau_{ij}$$

Where: $p_{ij}(t)$ is the probability of moving from node $i$ to node $j$, $\tau_{ij}(t)$ is the pheromone level on edge $(i,j)$ at time $t$, $\eta_{ij}$ is the heuristic information (e.g., the inverse of the distance), $\alpha$ and $\beta$ are parameters that control the influence of the pheromone and the heuristic information, $\rho$ is the pheromone evaporation rate, and $\Delta\tau_{ij}$ is the change in pheromone level.
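A minimal Python sketch of these two rules (the function names and the NumPy matrices tau and eta are illustrative, not the lecture's code):

import numpy as np

def transition_probabilities(tau, eta, current, candidates, alpha=1.0, beta=2.0):
    # Probability of moving from `current` to each candidate node, weighting
    # pheromone levels (tau) against heuristic information (eta).
    weights = (tau[current, candidates] ** alpha) * (eta[current, candidates] ** beta)
    return weights / weights.sum()

def update_pheromones(tau, delta_tau, rho=0.1):
    # Evaporate a fraction rho of the pheromone, then deposit the new amounts.
    return (1.0 - rho) * tau + delta_tau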

Session 3 - ML Deployment

ML Deployment

Ant-Colony Optimisation
Session 3 - ML Deployment

AI Systems

ACO Smart City
Session 3 - ML Deployment

ML Deployment


Hardware considerations:

  • Data Collection: Ensure sufficient storage capacity for large datasets and high-speed data transfer capabilities.
  • Model Training: Invest in powerful GPUs or TPUs to handle intensive computations and reduce training time.
  • Model Deployment: Consider edge devices for real-time processing and scalability of the deployment infrastructure.
  • Maintenance and Updates: Plan for hardware upgrades and maintenance to accommodate evolving model requirements.
Water Level Monitoring
Water Level Monitoring System at DeKUT (Kabi & Maina, 2021)
Session 3 - ML Deployment

ML Deployment


Software considerations:

  • Data Management: Implement efficient data preprocessing and cleaning pipelines to ensure high-quality input for models.
  • Model Development: Utilize frameworks like TensorFlow or PyTorch for building and experimenting with different model architectures.
  • Version Control: Use tools like Git to manage code versions and collaborate effectively with team members.
  • Continuous Integration/Continuous Deployment (CI/CD): Set up automated testing and deployment pipelines to streamline updates and ensure reliability (see the sketch after this list).
  • Scalability: Design software architecture to support scaling, such as using microservices or serverless computing for flexible resource management.
  • Security: Implement robust security measures to protect data privacy and model integrity.
AI System
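As a purely illustrative example of the CI/CD point above, a pipeline could run a small automated check like this before deploying a new model (the stub model and function names are hypothetical):

def load_candidate_model():
    # Stand-in for pulling the candidate model from a model registry.
    class StubModel:
        def predict(self, text):
            return 0.9 if "good" in text else -0.9
    return StubModel()

def test_predictions_are_bounded():
    # Reject candidate models whose scores fall outside the expected range.
    model = load_candidate_model()
    for text in ["a good day", "a terrible day"]:
        score = model.predict(text)
        assert -1.0 <= score <= 1.0

if __name__ == "__main__":
    test_predictions_are_bounded()
    print("Candidate model passed the deployment check.")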
Session 3 - ML Deployment

AI as a Service

Session 3 - ML Deployment

AI as a Service

AI System
Session 3 - ML Deployment

AI as a Service


SOA is a design pattern in which application components provide services to other components through a communication protocol over a network.


Microservices are an architectural style that structures an application as a collection of small, autonomous services. Each microservice is self-contained and implements a business capability.


The concept of "Everything as a Service" (XaaS) extends the principles of SOA and microservices by offering comprehensive services over the internet. XaaS encompasses a wide range of services, including infrastructure, platforms, and software.

AI System
Session 3 - ML Deployment

AI as a Service


AI as a Service (AIaaS) enables us to access and expose AI capabilities over the internet. We can integrate AI tools such as machine learning models, natural language processing, and computer vision into our applications by leveraging SOA and microservice principles.

AI System
Session 3 - ML Deployment

AI as a Service


from flask import Flask, request, jsonify

app = Flask(__name__)

class SentimentAnalysisService:
    """Wraps a trained sentiment model behind a service interface."""
    def __init__(self, model):
        self.model = model

    def analyze_sentiment(self, text):
        # Map the model's score to a discrete sentiment label.
        sentiment_score = self.model.predict(text)
        if sentiment_score > 0.5:
            return "Positive"
        elif sentiment_score < -0.5:
            return "Negative"
        else:
            return "Neutral"
...
@app.route('/analyze', methods=['POST'])
def analyze():
    # Expose the capability over HTTP: JSON request in, JSON response out.
    data = request.get_json()
    text_to_analyze = data.get('text', '')
    sentiment = service.analyze_sentiment(text_to_analyze)
    return jsonify({'sentiment': sentiment})
...
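A hypothetical client call to this endpoint (assuming the service runs locally on port 5000) could look like:

import requests

response = requests.post(
    "http://localhost:5000/analyze",
    json={"text": "The deployment went smoothly and the users are happy."},
)
print(response.json())  # e.g. {'sentiment': 'Positive'}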
AI System
Session 3 - ML Deployment

MLOps

Session 3 - ML Deployment

MLOps

Data Assess Pipeline
Session 3 - ML Deployment

MLOps

MLOps
MLOps - Cmbreuel, CC BY-SA 4.0, via Wikimedia Commons.

MLOps is a set of practices and tools that support deploying and maintaining ML models in production reliably and efficiently. The goal is to automate and streamline the ML pipeline. These practices and tools cover every pipeline stage, from data collection, model training, and deployment through to monitoring and governance. We aim to ensure that ML models are robust and scalable, and that they continuously deliver value.

Session 3 - ML Deployment

MLOps

MLOps
MLOps - Cmbreuel, CC BY-SA 4.0, via Wikimedia Commons.
  • Automated data collection
  • Automated model training and validation
  • Continuous integration and continuous deployment
  • Monitoring and logging
  • Governance and compliance
  • Scalability and reliability
Session 3 - ML Deployment

MLOps

MLOps
MLOps - Cmbreuel, CC BY-SA 4.0, via Wikimedia Commons.
import mlflow
import logging
from datetime import datetime

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class ModelMonitor:
    def __init__(self, model_name):
        self.model_name = model_name
        # Assumes an MLflow tracking server is running locally.
        mlflow.set_tracking_uri("http://localhost:5000")

    def log_prediction(self, input_data, prediction,
                       actual=None, model_version="1.0"):
        """Log model predictions for monitoring"""
        with mlflow.start_run():
            mlflow.log_params({
                "model_name": self.model_name,
                "input_size": len(input_data),
                "model_version": model_version,
                "timestamp": datetime.now().isoformat()
            })

            mlflow.log_metric("prediction", prediction)

            if actual is not None:
                mlflow.log_metric("actual", actual)
                mlflow.log_metric("error", abs(prediction - actual))

            logger.info(f"Prediction logged: {prediction}")

    def monitor_drift(self, current_stats, baseline_stats):
        """Monitor for data drift"""
        drift_score = self.calculate_drift(current_stats, baseline_stats)
        mlflow.log_metric("drift_score", drift_score)
        if drift_score > 0.1:  # Threshold
            logger.warning(f"Data drift detected: {drift_score}")

    def calculate_drift(self, current_stats, baseline_stats):
        # Simple illustrative drift measure: mean absolute difference between
        # shared summary statistics (a placeholder for a proper test such as
        # a Kolmogorov-Smirnov test or PSI).
        keys = set(current_stats) & set(baseline_stats)
        return sum(abs(current_stats[k] - baseline_stats[k]) for k in keys) / max(len(keys), 1)
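A hypothetical use of this monitor (assuming an MLflow tracking server is reachable at the URI above) might be:

monitor = ModelMonitor("water-level-forecaster")
monitor.log_prediction(input_data=[1.2, 3.4, 2.2], prediction=0.82, actual=0.78)
monitor.monitor_drift(current_stats={"mean": 0.9}, baseline_stats={"mean": 0.7})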
Session 3 - ML Deployment

MLOps

Deep Neural Network
Deep Neural Network with multiple hidden layers - QuantuMechaniX8, CC0, via Wikimedia Commons
Session 3 - ML Deployment

Data Orientation

Session 3 - ML Deployment

Data-Orientation

Complex Systems

Focus on Operations

Session 3 - ML Deployment

Data-Orientation

SOA Architecture

Focus on Operations

  • Separation of concerns
  • High availability
  • Scalability
  • Low latency
Session 3 - ML Deployment

Data-Orientation

SOA Cloud

Focus on Operations

  • Separation of concerns
  • High availability
  • Scalability
  • Low latency
Session 3 - ML Deployment

Data-Orientation

SOA Architecture

The Data Dichotomy: “While data-driven systems are about exposing data, service-oriented architectures are about hiding data.” (Stopford, 2016)

We need to design systems prioritising data!

Session 3 - ML Deployment

Data-Orientation

Data Assess Pipeline
ML-based System
Session 3 - ML Deployment

Data-Orientation

DOA Architecture

Data-Oriented Architectures

Data-First Systems

  • Data is available by design
  • Traceability and monitoring
  • Interpretability
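A minimal sketch of what "data is available by design" can look like in code, assuming components share an append-only log instead of hiding data behind service calls (the class and topic names are illustrative):

from datetime import datetime

class SharedLog:
    # Append-only log: every interaction is recorded and visible to all
    # components, which provides traceability and monitoring by design.
    def __init__(self):
        self.records = []

    def append(self, topic, payload):
        self.records.append({"topic": topic, "payload": payload,
                             "timestamp": datetime.now().isoformat()})

    def read(self, topic):
        return [r for r in self.records if r["topic"] == topic]

log = SharedLog()
log.append("sensor.readings", {"water_level_cm": 42})
log.append("model.predictions", {"flood_risk": 0.12})
print(log.read("model.predictions"))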
Session 3 - ML Deployment

Data-Orientation

Data-Oriented Architectures

Prioritise Decentralisation

  • Super-low latency requirements
  • Privacy by design
Decentralisation
Session 3 - ML Deployment

Data-Orientation

Openness

Data-Oriented Architectures

Openness

  • Sustainable solutions
  • Data ownership
Session 3 - ML Deployment

Data-Orientation

DOA Survey
Session 3 - ML Deployment

Conclusions

Session 3 - ML Deployment

Conclusions

Overview

  • Service Oriented Architectures
  • AI as a Service
  • MLOps
  • Data-Oriented Architectures

Course Overview

  • AI History and ML Context
  • Scientific Approach
  • ML Adoption Process
  • The Systems Engineering Approach
  • Data Science Process and ML Pipeline
  • AI as a Service
Session 3 - ML Deployment

Course Overview

Critical Thinking
Critical Thinking (Designed by freepik.com)

"It seems to me what is called for is an exquisite balance between two conflicting needs: the most skeptical scrutiny of all hypotheses that are served up to us and at the same time a great openness to new ideas. Obviously those two modes of thought are in some tension. But if you are able to exercise only one of these modes, whichever one it is, you’re in deep trouble. (The Burden of Skepticism, Sagan, 1987)

Session 3 - ML Deployment

Course Overview

Context People
Session 3 - ML Deployment

Course Overview

AI Adoption
Session 3 - ML Deployment

Course Overview

AI Puzzle

The systems engineering approach is better equipped than the ML community's usual practice to facilitate the adoption of this technology, because it prioritises the problems and their context before any other aspect.

Session 3 - ML Deployment

Course Overview

Data Assess Pipeline
ML-based System
Session 3 - ML Deployment

Many Thanks!

chc79@cam.ac.uk
