Federated Learning
Federated Learning (FL) is a machine learning technique that trains models across multiple decentralized devices or servers, each holding its own local data samples, without exchanging the raw data. SecureML provides a framework for implementing secure, privacy-preserving federated learning systems.
Core Concepts
Federated Learning Types:
Cross-device FL: Learning across many (thousands to millions) mobile or IoT devices
Cross-silo FL: Learning across a small number of organizations or data silos
Vertical FL: Learning when different organizations have different features for the same entities
Horizontal FL: Learning when different organizations have the same features for different entities
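The horizontal and vertical cases differ only in how one logical table is split between parties. The toy example below is a sketch for illustration and is not part of SecureML; the records and field names are made up.

```python
# Toy dataset: each record is one entity (a customer) with three features.
records = [
    {"id": 1, "age": 34, "income": 52000, "purchases": 12},
    {"id": 2, "age": 45, "income": 61000, "purchases": 7},
    {"id": 3, "age": 29, "income": 48000, "purchases": 21},
    {"id": 4, "age": 52, "income": 75000, "purchases": 3},
]

# Horizontal FL: parties hold the SAME features for DIFFERENT entities (row split).
horizontal_party_a = records[:2]
horizontal_party_b = records[2:]

# Vertical FL: parties hold DIFFERENT features for the SAME entities (column split).
# Both parties keep the shared "id" so that records can be aligned.
vertical_party_a = [{"id": r["id"], "age": r["age"]} for r in records]
vertical_party_b = [
    {"id": r["id"], "income": r["income"], "purchases": r["purchases"]}
    for r in records
]
```

The cross-device and cross-silo categories describe the scale and kind of participants, so they are independent of this row-versus-column distinction.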
Key Components:
Federated Clients: Devices or servers that hold local data
Federated Server: Central server that orchestrates the learning process
Aggregation Algorithms: Methods to combine model updates from multiple clients
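To make the aggregation step concrete, here is a minimal sketch of federated averaging (FedAvg), the rule from the McMahan et al. paper listed under Further Reading. It uses plain Python lists so that it runs without dependencies. The function name federated_average is hypothetical, and this page does not state whether SecureML's server applies exactly this rule.

```python
def federated_average(client_weights, client_sizes):
    """Average per-client weight vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    num_params = len(client_weights[0])
    averaged = [0.0] * num_params
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


# Three clients, each with a 2-parameter model and a different amount of data.
client_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
client_sizes = [10, 30, 60]
global_weights = federated_average(client_weights, client_sizes)
```

Clients with more data pull the global model further toward their own weights, which is why the third client dominates the result here.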
Basic Usage
Training with Federated Learning
The main way to use federated learning in SecureML is with the train_federated function:
from secureml.federated import train_federated, FederatedConfig
import torch.nn as nn
# Define your model (PyTorch example)
class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(10, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
            nn.Sigmoid()
        )
    def forward(self, x):
        return self.layers(x)
# Create a model
model = SimpleNN()
# Define a function that returns client data
def get_client_data():
    # Return a dictionary mapping client IDs to their datasets.
    # client_1_data, client_2_data, and client_3_data are placeholders for
    # datasets you have already loaded (pandas DataFrames or NumPy arrays).
    return {
        "client-001": client_1_data,
        "client-002": client_2_data,
        "client-003": client_3_data
    }
# Configure federated learning
config = FederatedConfig(
    num_rounds=10,
    fraction_fit=1.0,
    min_fit_clients=2,
    min_available_clients=2,
    use_secure_aggregation=True,
    apply_differential_privacy=True,
    epsilon=1.0,
    delta=1e-5,
    weight_update_strategy="ema",
    weight_mixing_rate=0.5
)
# Train the model with federated learning
trained_model = train_federated(
    model=model,
    client_data_fn=get_client_data,
    config=config,
    framework="pytorch",  # or "tensorflow", or "auto"
    model_save_path="federated_model.pt"
)
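The names fraction_fit, min_fit_clients, and min_available_clients mirror the parameters of the FedAvg strategy in the Flower framework. Assuming SecureML gives them the same meaning (an assumption, not confirmed on this page), the number of clients that train in each round could be computed as in this sketch. The helper sample_fit_clients is hypothetical.

```python
import random


def sample_fit_clients(available, fraction_fit, min_fit_clients,
                       min_available_clients, rng=random):
    """Pick the clients that train in one round (Flower-style semantics, assumed)."""
    if len(available) < min_available_clients:
        return []  # too few clients connected; the round would wait
    # Take a fraction of the connected clients, but never fewer than the minimum.
    num_clients = max(int(len(available) * fraction_fit), min_fit_clients)
    num_clients = min(num_clients, len(available))
    return rng.sample(available, num_clients)


clients = [f"client-{i:03d}" for i in range(1, 11)]
selected = sample_fit_clients(clients, fraction_fit=0.8,
                              min_fit_clients=3, min_available_clients=5)
```

With ten connected clients and fraction_fit=0.8, eight clients are sampled; with fraction_fit=0.1 the min_fit_clients floor of three applies instead.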
Setting Up a Federated Server
For real-world deployments, you can set up a federated learning server:
from secureml.federated import start_federated_server, FederatedConfig
import torch.nn as nn
# Define your model (PyTorch example)
class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(10, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
            nn.Sigmoid()
        )
    def forward(self, x):
        return self.layers(x)
# Create a model
model = SimpleNN()
# Configure federated learning
config = FederatedConfig(
    num_rounds=10,
    fraction_fit=0.8,
    min_fit_clients=3,
    min_available_clients=5,
    server_address="0.0.0.0:8080",
    use_secure_aggregation=True
)
# Start the federated server
start_federated_server(
    model=model,
    config=config,
    framework="pytorch"  # or "tensorflow", or "auto"
)
Setting Up a Federated Client
On each client device or server:
from secureml.federated import start_federated_client
import torch.nn as nn
import pandas as pd
# Define your model (must match the server's architecture)
class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(10, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
            nn.Sigmoid()
        )
    def forward(self, x):
        return self.layers(x)
# Create a model
model = SimpleNN()
# Load local data (pandas DataFrame or NumPy array)
local_data = pd.read_csv("client_data.csv")
# Start the federated client
start_federated_client(
    model=model,
    data=local_data,
    server_address="fl-server.example.com:8080",
    framework="pytorch",  # or "tensorflow", or "auto"
    apply_differential_privacy=True,
    epsilon=1.0,
    delta=1e-5,
    test_split=0.2,  # Optional: use 20% of data for local evaluation
    batch_size=64,
    learning_rate=0.001
)
Advanced Techniques
Secure Aggregation
SecureML supports secure aggregation to protect client updates:
from secureml.federated import FederatedConfig, train_federated
# Configure federated learning with secure aggregation
config = FederatedConfig(
    num_rounds=10,
    fraction_fit=1.0,
    min_fit_clients=2,
    min_available_clients=2,
    use_secure_aggregation=True  # Enable secure aggregation
)
# Train with secure aggregation
trained_model = train_federated(
    model=model,
    client_data_fn=get_client_data,
    config=config
)
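This page does not describe how SecureML implements secure aggregation. As general background, a common approach is pairwise masking: every pair of clients agrees on a random mask that one adds and the other subtracts, so individual updates are hidden from the server but the masks cancel in the sum. The following toy sketch illustrates only that cancellation. The function mask_updates is hypothetical, and practical protocols additionally handle client dropouts, derive masks from shared secrets, and work in modular arithmetic.

```python
import random


def mask_updates(updates, seed=0):
    """Add pairwise masks that cancel in the sum (toy secure aggregation)."""
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            # In a real protocol, clients i and j derive this mask from a shared secret.
            rng = random.Random(f"{seed}-{i}-{j}")
            mask = [rng.uniform(-100, 100) for _ in range(dim)]
            for k in range(dim):
                masked[i][k] += mask[k]  # client i adds the mask
                masked[j][k] -= mask[k]  # client j subtracts the same mask
    return masked


updates = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]]
masked = mask_updates(updates)

# The server only sees `masked`, yet the per-parameter sums match the true sums
# up to floating-point rounding.
true_sum = [sum(col) for col in zip(*updates)]
masked_sum = [sum(col) for col in zip(*masked)]
```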
Differential Privacy in Federated Learning
Add differential privacy to client updates:
from secureml.federated import start_federated_client
# Start a client with differential privacy
start_federated_client(
    model=model,
    data=local_data,
    server_address="fl-server.example.com:8080",
    apply_differential_privacy=True,
    epsilon=1.0,
    delta=1e-5,
    max_grad_norm=1.0,  # Clipping parameter
    noise_multiplier=1.1  # Noise level (optional)
)
# Or configure it system-wide
from secureml.federated import FederatedConfig, train_federated
config = FederatedConfig(
    num_rounds=10,
    apply_differential_privacy=True,
    epsilon=1.0,
    delta=1e-5
)
trained_model = train_federated(
    model=model,
    client_data_fn=get_client_data,
    config=config
)
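The max_grad_norm and noise_multiplier parameters follow the usual clip-then-add-noise pattern of the Gaussian mechanism. The sketch below shows that pattern on a single update vector. It is an illustration under assumptions, not SecureML's implementation: the helper privatize_update is hypothetical, and how SecureML relates epsilon and delta to the noise level is not described here.

```python
import math
import random


def privatize_update(update, max_grad_norm, noise_multiplier, rng=random):
    """Clip an update to an L2 norm bound, then add Gaussian noise."""
    norm = math.sqrt(sum(x * x for x in update))
    # Scale down only if the update exceeds the bound; never scale up.
    scale = min(1.0, max_grad_norm / norm) if norm > 0 else 1.0
    clipped = [x * scale for x in update]
    # The noise standard deviation is proportional to the clipping bound.
    sigma = noise_multiplier * max_grad_norm
    return [x + rng.gauss(0.0, sigma) for x in clipped]


update = [3.0, 4.0]  # L2 norm is 5.0
clipped_only = privatize_update(update, max_grad_norm=1.0, noise_multiplier=0.0)
noisy = privatize_update(update, max_grad_norm=1.0, noise_multiplier=1.1,
                         rng=random.Random(0))
```

Clipping bounds how much any single client can influence the aggregate, and the noise then masks whatever influence remains.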
Advanced Weight Update Strategies
SecureML provides sophisticated weight update mechanisms for federated learning to improve convergence and stability:
from secureml.federated import FederatedConfig, train_federated
# Configure federated learning with Exponential Moving Average (EMA) weight updates
ema_config = FederatedConfig(
    num_rounds=10,
    weight_update_strategy="ema",  # Use exponential moving average
    weight_mixing_rate=0.5,  # 50% mix of new weights, 50% of old weights
    warmup_rounds=2  # Gradually increase mixing rate over first 2 rounds
)
# Train with EMA updates
trained_model = train_federated(
    model=model,
    client_data_fn=get_client_data,
    config=ema_config
)
# Use momentum-based weight updates
momentum_config = FederatedConfig(
    num_rounds=10,
    weight_update_strategy="momentum",  # Use momentum-based updates
    weight_mixing_rate=0.1,  # Small update step size
    weight_momentum=0.9,  # High momentum coefficient
    apply_weight_constraints=True,  # Constrain updates to prevent instability
    max_weight_change=0.3  # Maximum 30% change in any weight
)
# Train with momentum updates
trained_model = train_federated(
    model=model,
    client_data_fn=get_client_data,
    config=momentum_config
)
Weight Update Strategy Types
SecureML supports three different strategies for updating model weights in federated learning:
Direct Updates (weight_update_strategy="direct"): The simplest strategy, where client models directly adopt the weights received from the server. This is the classic federated learning approach.
Exponential Moving Average (EMA) (weight_update_strategy="ema"): A weighted average of the old and new weights. This produces smoother updates and can improve training stability:
updated_weight = (1 - mixing_rate) * old_weight + mixing_rate * new_weight
Momentum-Based Updates (weight_update_strategy="momentum"): Adds a momentum term that can accelerate training and help the model move past poor local minima:
momentum_update = momentum * previous_update + mixing_rate * (new_weight - old_weight)
updated_weight = old_weight + momentum_update
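The three update rules above translate directly into code. This sketch applies them element-wise to plain Python lists. The function names are hypothetical and chosen for illustration; the formulas are the ones given in this section.

```python
def direct_update(old, new):
    """Adopt the new weights as-is."""
    return list(new)


def ema_update(old, new, mixing_rate):
    """Blend old and new weights with an exponential moving average."""
    return [(1 - mixing_rate) * o + mixing_rate * n for o, n in zip(old, new)]


def momentum_update(old, new, previous_update, mixing_rate, momentum):
    """Apply a momentum step; returns the new weights and the update to carry forward."""
    update = [momentum * p + mixing_rate * (n - o)
              for p, o, n in zip(previous_update, old, new)]
    updated = [o + u for o, u in zip(old, update)]
    return updated, update


old, new = [1.0, 2.0], [2.0, 4.0]
ema = ema_update(old, new, mixing_rate=0.5)  # halfway between old and new
weights, velocity = momentum_update(old, new, [0.0, 0.0],
                                    mixing_rate=0.1, momentum=0.9)
```

Note that the momentum rule needs state: the returned update must be passed back in as previous_update on the next round.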
Key Configuration Parameters
weight_mixing_rate: Controls how much of the new weights to incorporate (0.0 to 1.0). Lower values make smaller, more conservative updates.
weight_momentum: For momentum strategy, determines how much previous updates influence current ones (typically 0.9 to 0.99).
warmup_rounds: Number of initial rounds with gradually increasing mixing rates. Useful for stabilizing early training.
apply_weight_constraints: When True, prevents any weight from changing too dramatically in a single update.
max_weight_change: Maximum relative change allowed in any weight when constraints are enabled (e.g., 0.2 = 20% maximum change).
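As a rough illustration of warmup_rounds and max_weight_change, the sketch below ramps the mixing rate linearly and clamps each weight's change relative to its previous magnitude. Both details are assumptions: this page does not specify the warmup schedule or how the relative limit is computed, and the helper names are hypothetical.

```python
def warmup_mixing_rate(round_index, mixing_rate, warmup_rounds):
    """Linearly ramp the mixing rate during warmup (assumed schedule)."""
    if warmup_rounds <= 0 or round_index >= warmup_rounds:
        return mixing_rate
    return mixing_rate * (round_index + 1) / (warmup_rounds + 1)


def constrain_weights(old, proposed, max_weight_change):
    """Limit each weight's change to a fraction of its previous magnitude."""
    constrained = []
    for o, p in zip(old, proposed):
        limit = max_weight_change * abs(o)
        delta = max(-limit, min(limit, p - o))  # clamp the change to [-limit, limit]
        constrained.append(o + delta)
    return constrained


# Mixing rate over the first four rounds with a two-round warmup.
rates = [warmup_mixing_rate(r, mixing_rate=0.5, warmup_rounds=2) for r in range(4)]

# The first weight wants to double but is held to a 20% change;
# the second changes by 5%, which is within the limit.
limited = constrain_weights([1.0, -2.0], [2.0, -2.1], max_weight_change=0.2)
```

A purely relative limit would freeze any weight that is exactly zero, so a real implementation would likely need a small absolute floor as well.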
Choosing a Strategy
Use Direct for simpler models and homogeneous data distributions.
Use EMA for improved stability, particularly when client updates are noisy, for example because differential privacy is applied to sensitive data.
Use Momentum for faster convergence on complex problems and when clients have heterogeneous data distributions.
For maximum stability, especially with differential privacy enabled, combine momentum with weight constraints:
from secureml.federated import FederatedConfig, train_federated
# Configuration for stable training with differential privacy
config = FederatedConfig(
    num_rounds=20,
    weight_update_strategy="momentum",
    weight_momentum=0.95,
    apply_weight_constraints=True,
    max_weight_change=0.25,
    apply_differential_privacy=True,
    epsilon=1.0
)
trained_model = train_federated(
    model=model,
    client_data_fn=get_client_data,
    config=config
)
Supported Frameworks
SecureML supports multiple frameworks for federated learning:
PyTorch Models
import torch.nn as nn
from secureml.federated import train_federated
# Define a PyTorch model
class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(10, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
            nn.Sigmoid()
        )
    def forward(self, x):
        return self.layers(x)
# Create and train the model
model = SimpleNN()
trained_model = train_federated(
    model=model,
    client_data_fn=get_client_data,
    framework="pytorch"
)
TensorFlow Models
import tensorflow as tf
from secureml.federated import train_federated
# Define a TensorFlow model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
# Compile the model (optional; SecureML compiles it internally if needed)
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)
# Train the model
trained_model = train_federated(
    model=model,
    client_data_fn=get_client_data,
    framework="tensorflow"
)
Best Practices
Start with simulation: Test your federated learning setup in a simulated environment using train_federated before deploying to real clients with start_federated_server and start_federated_client.
Handle heterogeneous data: Use advanced weight update strategies like momentum or EMA to handle non-IID data distributions.
Consider communication costs: Keep model sizes reasonable and choose appropriate batch sizes to manage communication overhead.
Apply privacy protections: Combine federated learning with differential privacy and secure aggregation for maximum privacy protection.
Monitor convergence: Carefully monitor convergence rates and model performance, as federated learning may converge differently than centralized training.
Framework detection: You can set framework="auto" to let SecureML automatically detect whether you’re using PyTorch or TensorFlow, but it’s best to explicitly specify the framework when possible.
Data preparation: Ensure your data is properly formatted before training. SecureML expects a pandas DataFrame or numpy array, with the target variable either specified via the target_column parameter or assumed to be the last column.
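The data-preparation convention (a named target column, otherwise the last column) can be sketched as follows. Plain Python lists stand in for a DataFrame so that the example runs without dependencies. The helper split_features_target is hypothetical and only illustrates the convention; it is not SecureML's loader.

```python
def split_features_target(columns, rows, target_column=None):
    """Separate feature values from the target, defaulting to the last column."""
    if target_column is not None:
        target_index = columns.index(target_column)
    else:
        target_index = len(columns) - 1
    features = [[v for i, v in enumerate(row) if i != target_index] for row in rows]
    target = [row[target_index] for row in rows]
    return features, target


columns = ["age", "income", "label"]
rows = [[34, 52000, 1], [45, 61000, 0]]

# No target_column given: the last column ("label") is treated as the target.
X, y = split_features_target(columns, rows)

# Explicit target_column: "age" becomes the target instead.
X2, y2 = split_features_target(columns, rows, target_column="age")
```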
Further Reading
Federated Learning API - Complete API reference for federated learning functions
Federated Learning Examples - More examples of federated learning techniques
Communication-Efficient Learning of Deep Networks from Decentralized Data - Original FedAvg paper by McMahan et al.