Linear Regression
Training Dataset
Hypothesis Space: All possible linear functions of continuous-valued inputs and outputs.
Hypothesis:
Loss Function:
Cost Function:
Analytical Solution:
Gradient Descent Algorithm:
Initialize w randomly
repeat
    for each w[i] in w
        Compute gradient: g = ∂Loss(w)/∂w[i]
        Update weight: w[i] = w[i] - α * g
until convergence
Hyperparameters: Learning rate, number of epochs, and batch size.
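The loop above can be sketched in NumPy as follows. This is a minimal illustration, assuming a mean-squared-error cost and full-batch updates; the function name, the toy data, and the chosen learning rate and epoch count are assumptions made for the example.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, epochs=500, seed=0):
    """Batch gradient descent for linear regression with an MSE cost (assumed)."""
    n_samples, n_features = X.shape
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=n_features)      # "Initialize w randomly"
    for _ in range(epochs):                          # fixed epochs stand in for "until convergence"
        residual = X @ w - y                         # prediction error per example
        grad = (2.0 / n_samples) * (X.T @ residual)  # gradient of the MSE w.r.t. every w[i]
        w = w - alpha * grad                         # update all weights at once
    return w

# Toy data generated from y = 1 + 2x; the column of ones carries the bias weight.
x = np.linspace(0.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
w = gradient_descent(X, y, alpha=0.5, epochs=2000)
```

Updating the whole vector in one step is equivalent to the per-weight loop, because every partial derivative is evaluated at the same current w.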
In these problems, each example is an n-element vector. The hypothesis space H now includes linear functions of multiple continuous-valued inputs and a single continuous output.
We want to find the
In vector notation:
Where
We want to maximise the likelihood function of the training data given the model parameters.
Closed-form solution for linear regression:
Where
The model becomes:
Where
If we apply a probabilistic interpretation, we need to maximise the likelihood of:
After a similar process (See Chapter 3 in Bishop, 2006), the loss function becomes:
And the normal equation solution:
The design matrix
By using basis functions
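A minimal sketch of the normal-equation solution with basis functions, assuming a polynomial basis and an unregularised sum-of-squares loss. The names `design_matrix` and `fit_normal_equation` and the toy target are hypothetical, chosen only for this example.

```python
import numpy as np

def design_matrix(x, degree):
    """Polynomial basis: column j holds x**j, so column 0 is the constant bias term."""
    return np.vander(x, N=degree + 1, increasing=True)

def fit_normal_equation(Phi, t):
    """Solve (Phi^T Phi) w = Phi^T t without forming an explicit inverse."""
    return np.linalg.solve(Phi.T @ Phi, Phi.T @ t)

# Noiseless quadratic target, so the recovered weights should match the coefficients.
x = np.linspace(-1.0, 1.0, 15)
t = 0.5 - x + 2.0 * x**2
Phi = design_matrix(x, degree=2)
w = fit_normal_equation(Phi, t)
```

The model stays linear in the weights even though it is non-linear in x, which is why the closed-form solution still applies.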
A decision boundary is a line (or a surface in higher dimensions) that separates data into classes.
The hypothesis is the result of passing a linear function through a threshold function:
The perceptron is a linear classifier model (i.e., linear discriminant), with hypothesis space defined by all the functions of the form:
The function
We want to find
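One common way to search this hypothesis space is the perceptron learning rule. The sketch below assumes a hard threshold at zero, a bias folded in as a constant first input, and the logical AND function as toy data; none of these choices come from the slides.

```python
def predict(w, x):
    """Threshold a linear function: 1 if w . x >= 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0

def train_perceptron(examples, alpha=1.0, epochs=20):
    """Perceptron rule: w <- w + alpha * (y - h(x)) * x for each example."""
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for x, y in examples:
            error = y - predict(w, x)          # 0 when the example is already correct
            w = [wi + alpha * error * xi for wi, xi in zip(w, x)]
    return w

# Logical AND; the first component of each input is the constant bias input 1.
examples = [((1, 0, 0), 0), ((1, 0, 1), 0), ((1, 1, 0), 0), ((1, 1, 1), 1)]
w = train_perceptron(examples)
```

Because AND is linearly separable, the rule stops changing the weights once every example falls on the correct side of the decision boundary.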
Feedforward Neural Network:
The overall network function combines these stages. For sigmoidal output unit activation functions, it takes the form:
The bias parameters can be absorbed:
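Sigmoid Function: $\sigma(a) = \frac{1}{1 + e^{-a}}$
Range: $(0, 1)$
Derivative: $\sigma'(a) = \sigma(a)\,(1 - \sigma(a))$
Hyperbolic Tangent: $\tanh(a) = \frac{e^{a} - e^{-a}}{e^{a} + e^{-a}}$
Range: $(-1, 1)$
Derivative: $\tanh'(a) = 1 - \tanh^{2}(a)$
ReLU (Rectified Linear Unit): $\mathrm{ReLU}(a) = \max(0, a)$
Range: $[0, \infty)$
Derivative: $1$ if $a > 0$, $0$ if $a < 0$ (undefined at $a = 0$)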
Training Process:
Given a training set of N example input-output pairs
Each pair was generated by an unknown function
We want to find a hypothesis
Error Backpropagation Algorithm:
1. Forward Pass: Compute all activations and outputs for an input vector.
2. Error Evaluation: Evaluate the error for all the outputs using:
3. Backward Pass: Backpropagate errors for each hidden unit in the network using:
4. Derivatives Evaluation: Evaluate the derivatives for each parameter using:
Gradient Descent Update Rule:
Where
Vanishing and Exploding Gradients: In deep networks, gradients can become very small (vanishing) or very large (exploding) during backpropagation:
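A small numeric sketch of both effects, assuming sigmoid units and a single path through the network on which every layer has the same weight and a pre-activation of zero. The helper `gradient_scale` is hypothetical and only multiplies the per-layer factors that backpropagation would chain together.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def sigmoid_derivative(a):
    s = sigmoid(a)
    return s * (1.0 - s)

def gradient_scale(depth, weight, pre_activation=0.0):
    """Product of the factors |w * sigma'(a)| along one path through `depth` layers."""
    return (abs(weight) * sigmoid_derivative(pre_activation)) ** depth

# sigma'(0) = 0.25, so the per-layer factor is 0.25 for w = 1 and 2.0 for w = 8.
small = gradient_scale(depth=20, weight=1.0)   # shrinks towards zero (vanishing)
large = gradient_scale(depth=20, weight=8.0)   # grows without bound (exploding)
```

The sigmoid derivative never exceeds 0.25, so with moderate weights each extra layer shrinks the gradient, while large weights can push the same product above one.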
The RL framework is composed of:
The environment is stochastic, meaning that the outcomes of actions taken by the agent in each state are not deterministic.
Markov Decision Process (MDP):
A mathematical framework for modeling sequential decision problems in fully observable, stochastic environments. The outcomes are partly random and partly under the control of a decision maker.
An MDP is a 4-tuple:
Where:
Model-Based RL Agent
Model-Free RL Agent
Policy Iteration: value function. Model-based, with guaranteed convergence for finite and discrete problems.
Q-Value: Q-function. Model-free and simple, for small and discrete problems.
DQN: neural network. Model-free and complex, for large and continuous problems.
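To make the model-based case concrete, here is a minimal policy iteration sketch on a toy MDP. The four-state chain, the slip probability, the reward of 1 for reaching the last state, and the discount factor are all assumptions made for the example.

```python
GAMMA = 0.9            # discount factor (assumed)
N_STATES = 4           # states 0..3; state 3 is terminal
ACTIONS = (-1, +1)     # move left, move right
SLIP = 0.2             # probability of moving in the opposite direction

def transitions(s, a):
    """Known model: list of (probability, next_state, reward) for action a in state s."""
    outcomes = []
    for prob, move in ((1.0 - SLIP, a), (SLIP, -a)):
        s2 = min(max(s + move, 0), N_STATES - 1)
        reward = 1.0 if s2 == N_STATES - 1 else 0.0
        outcomes.append((prob, s2, reward))
    return outcomes

def q_value(s, a, V):
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in transitions(s, a))

def policy_iteration(tol=1e-10):
    policy = [-1] * (N_STATES - 1)     # start by always moving left
    V = [0.0] * N_STATES               # the terminal state keeps value 0
    while True:
        while True:                    # policy evaluation (iterative sweeps)
            delta = 0.0
            for s in range(N_STATES - 1):
                v_new = q_value(s, policy[s], V)
                delta = max(delta, abs(v_new - V[s]))
                V[s] = v_new
            if delta < tol:
                break
        # policy improvement: act greedily with respect to the evaluated values
        improved = [max(ACTIONS, key=lambda a: q_value(s, a, V))
                    for s in range(N_STATES - 1)]
        if improved == policy:         # stable policy, so stop
            return policy, V
        policy = improved

policy, V = policy_iteration()
```

The agent needs the transition model (`transitions`) to run this at all, which is what distinguishes it from the model-free Q-function and DQN approaches.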
The main idea is to pay attention to the context of each word in a sentence when modelling language. For example, if the context is "Thanks for all the" and we want to know how likely it is that the next word is "fish":
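As a rough illustration, this conditional probability can be estimated by counting how often the word follows the context in a corpus. The toy corpus and the function name below are made up for the example, and a count-based estimate is only a stand-in for what a trained language model would compute.

```python
from collections import Counter

def next_word_probability(corpus, context, word):
    """Estimate P(word | context) as count(context, word) / count(context, any word)."""
    n = len(context)
    context_count = 0
    followers = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i in range(len(tokens) - n):
            if tokens[i:i + n] == context:
                context_count += 1
                followers[tokens[i + n]] += 1
    return followers[word] / context_count if context_count else 0.0

corpus = [
    "thanks for all the fish",
    "thanks for all the help",
    "thanks for all the fish and chips",
    "so long and thanks for all the fish",
]
p = next_word_probability(corpus, ["thanks", "for", "all", "the"], "fish")
# "fish" follows the context in 3 of its 4 occurrences, so p is 0.75
```

Counting breaks down for contexts that never appear in the corpus, which is one motivation for learning the distribution with a model instead.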
We want to discover the probability distribution over a vocabulary
where
The transformer architecture solves this problem by:
It is similar to training a neural network:
The process combines approaches from symbolic AI and databases:
Typical prompt-engineering workflow:
Dynamic Service Placement in Edge Computing
Objective Functions:
Subject to:
Ant Colony Optimization Algorithm
Where:
This execution time does not suit low-latency requirements, but that is how ACO is designed.
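For reference, the step that each ant repeats many times can be sketched as below. This uses the classic Ant System selection rule, in which an option is chosen with probability proportional to pheromone^alpha times heuristic^beta; it is an assumption that the formulation here follows that rule, and the pheromone and heuristic values are hypothetical.

```python
import random

def selection_probabilities(pheromone, heuristic, alpha=1.0, beta=2.0):
    """Classic Ant System rule: p_j is proportional to tau_j**alpha * eta_j**beta."""
    weights = [(tau ** alpha) * (eta ** beta) for tau, eta in zip(pheromone, heuristic)]
    total = sum(weights)
    return [w / total for w in weights]

def choose_server(pheromone, heuristic, rng, alpha=1.0, beta=2.0):
    """Sample one candidate server index according to the selection probabilities."""
    probs = selection_probabilities(pheromone, heuristic, alpha, beta)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

pheromone = [1.0, 1.0, 1.0]      # equal trails at the start
heuristic = [1.0, 2.0, 1.0]      # hypothetical desirability, e.g. inverse latency
probs = selection_probabilities(pheromone, heuristic)
choice = choose_server(pheromone, heuristic, random.Random(0))
```

Every ant evaluates this rule over all candidate servers at every construction step and in every iteration, so the running time grows with the number of servers considered.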
We should analyse the problem first:
Variables we cannot reduce
We can reduce the number of servers, how?
We can pre-select edge servers by predicting user locations.
We select edge servers close to users' current and predicted future locations. We used two approaches, both of which cluster historical trips and use these clusters to predict the next link in the user's path:
Bayesian Classifier
Hidden Markov Model
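The idea of predicting the next link from historical trips can be illustrated with a much simpler stand-in: a first-order Markov chain over observed links, estimated by counting transitions. This is not the Bayesian classifier or the Hidden Markov Model named above, it skips the clustering step, and the link identifiers and trips are hypothetical.

```python
from collections import Counter, defaultdict

def fit_transitions(trips):
    """Count how often each link is followed by each other link in historical trips."""
    counts = defaultdict(Counter)
    for trip in trips:
        for current, nxt in zip(trip, trip[1:]):
            counts[current][nxt] += 1
    return counts

def predict_next_link(counts, current):
    """Return the most frequent successor of `current` and its estimated probability."""
    if current not in counts:
        return None, 0.0
    nxt, n = counts[current].most_common(1)[0]
    return nxt, n / sum(counts[current].values())

trips = [["A", "B", "C"], ["A", "B", "D"], ["E", "B", "C"], ["A", "F"]]
counts = fit_transitions(trips)
link, prob = predict_next_link(counts, "B")
```

Edge servers near the predicted link could then be pre-selected as candidates before the placement algorithm runs.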
Again, new design decisions are needed to deploy these algorithms in the real world.
Hardware considerations:
Software considerations:
Service-oriented architecture (SOA) is a design pattern in which services are provided between components through a communication protocol over a network.
Microservices are an architectural style that structures an application as a collection of small, autonomous services. Each microservice is self-contained and implements a business capability.
The concept of "Everything as a Service" (XaaS) extends the principles of SOA and microservices by offering comprehensive services over the internet. XaaS encompasses a wide range of services, including infrastructure, platforms, and software.
AI as a Service (AIaaS) enables us to access and expose AI capabilities over the internet. We can integrate AI tools such as machine learning models, natural language processing, and computer vision into our applications leveraging SOA and microservices features.
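from flask import Flask, request, jsonify

app = Flask(__name__)


class SentimentAnalysisService:
    """Wraps a sentiment model and maps its score to a label."""

    def __init__(self, model):
        self.model = model

    def analyze_sentiment(self, text):
        # The model is expected to return a numeric score; +/-0.5 are the label cut-offs.
        sentiment_score = self.model.predict(text)
        if sentiment_score > 0.5:
            return "Positive"
        elif sentiment_score < -0.5:
            return "Negative"
        else:
            return "Neutral"

...

@app.route('/analyze', methods=['POST'])
def analyze():
    # silent=True returns None instead of raising when the body is not valid JSON.
    data = request.get_json(silent=True) or {}
    text_to_analyze = data.get('text', '')
    # `service` is assumed to be a SentimentAnalysisService created in the elided part above.
    sentiment = service.analyze_sentiment(text_to_analyze)
    return jsonify({'sentiment': sentiment})

...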
MLOps is a set of practices and tools that support deploying and maintaining ML models in production reliably and efficiently. The goal is to automate and streamline the ML pipeline. These practices and tools include all the pipeline stages from data collection, model training, and deployment to monitoring and governance. We aim to ensure that ML models are robust, scalable, and continuously delivering value.
import mlflow
import logging
from datetime import datetime

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class ModelMonitor:
    def __init__(self, model_name):
        self.model_name = model_name
        # Address of the MLflow tracking server; adjust to the deployment.
        mlflow.set_tracking_uri("http://localhost:5000")

    def log_prediction(self, input_data, prediction,
                       actual=None, model_version="1.0"):
        """Log model predictions for monitoring"""
        with mlflow.start_run():
            mlflow.log_params({
                "input_size": len(input_data),
                "model_version": model_version,
                "timestamp": datetime.now().isoformat()
            })
            mlflow.log_metric("prediction", prediction)
            if actual is not None:
                # Ground truth is optional; log the absolute error when it is available.
                mlflow.log_metric("actual", actual)
                mlflow.log_metric("error", abs(prediction - actual))
        logger.info(f"Prediction logged: {prediction}")

    def monitor_drift(self, current_stats, baseline_stats):
        """Monitor for data drift"""
        # calculate_drift is not defined in this excerpt and must be provided elsewhere.
        drift_score = self.calculate_drift(current_stats, baseline_stats)
        mlflow.log_metric("drift_score", drift_score)
        if drift_score > 0.1:  # Threshold
            logger.warning(f"Data drift detected: {drift_score}")
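The monitor calls `calculate_drift`, which is not shown. One possible shape for it is sketched below as a standalone function, assuming the statistics are dictionaries of per-feature means and standard deviations and that drift is the largest shift in a mean measured in baseline standard deviations. The data layout, the scoring rule, and the example numbers are hypothetical.

```python
def calculate_drift(current_stats, baseline_stats, eps=1e-12):
    """Hypothetical drift score: largest absolute shift of a feature mean,
    expressed in units of that feature's baseline standard deviation."""
    score = 0.0
    for feature, base in baseline_stats.items():
        cur = current_stats[feature]
        shift = abs(cur["mean"] - base["mean"]) / (base["std"] + eps)
        score = max(score, shift)
    return score

baseline = {"age": {"mean": 40.0, "std": 10.0},
            "income": {"mean": 50.0, "std": 20.0}}
current = {"age": {"mean": 42.0, "std": 11.0},
           "income": {"mean": 51.0, "std": 19.0}}
drift_score = calculate_drift(current, baseline)
# age shifts by 2 / 10 = 0.2 baseline standard deviations, income by 1 / 20 = 0.05
```

With the 0.1 threshold used in `monitor_drift`, this example would trigger the drift warning because of the shift in `age`.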
Our ML projects must have a purpose...