Compliance Checking Examples

This section shows how to use SecureML’s compliance-checking features to verify that your ML pipelines comply with privacy regulations such as GDPR, CCPA, and HIPAA.

Basic Compliance Check

The simplest way to check compliance is to call the check_compliance function:

import pandas as pd
from secureml.compliance import check_compliance

# Sample record: personal data plus compliance metadata
# (consent date and storage location)
data_list = [
    {
        'id': 1,
        'name': 'Alice Smith',
        'age': 30,
        'zipcode': '12345',
        'diagnosis': 'Flu',
        'email': 'alice.s@example.com',
        'income': 60000,
        'phone': '555-1234',
        'consent_date': '2024-01-01',
        'data_storage_location': 'EU'
    }
]
df = pd.DataFrame(data_list)

# Basic GDPR compliance check
report = check_compliance(
    data=df,
    regulation="GDPR",
    max_samples=100
)
print(report)
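
Before running a check, it can be useful to see which fields in a record look like personal data. The sketch below is not part of SecureML: the find_sensitive_columns helper and its keyword list are hypothetical, chosen only to match the sample record above, and a real compliance tool is likely to detect personal data more thoroughly.

```python
# Hypothetical helper, not a SecureML API: flags column names that
# suggest personal data by simple substring matching.
SENSITIVE_KEYWORDS = ("name", "email", "phone", "zipcode", "diagnosis", "income")


def find_sensitive_columns(columns, keywords=SENSITIVE_KEYWORDS):
    """Return the column names that contain any of the keywords."""
    flagged = []
    for column in columns:
        lowered = column.lower()
        if any(keyword in lowered for keyword in keywords):
            flagged.append(column)
    return flagged


# Column names from the sample record above
sample_columns = [
    "id", "name", "age", "zipcode", "diagnosis", "email",
    "income", "phone", "consent_date", "data_storage_location",
]

flagged_columns = find_sensitive_columns(sample_columns)
print(flagged_columns)
```

With a pandas DataFrame, the same helper can be called as find_sensitive_columns(df.columns).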

Using ComplianceAuditor

The ComplianceAuditor class supports more comprehensive audits, covering datasets, model configurations, and entire ML pipelines:

from secureml.compliance import ComplianceAuditor

# Initialize auditor for GDPR
auditor = ComplianceAuditor(
    regulation="GDPR",
    log_dir="audit_logs"  # Optional: store audit logs
)

Dataset Audit

Audit a dataset for compliance:

# Audit a dataset with metadata
dataset_report = auditor.audit_dataset(
    dataset=df,
    dataset_name="patient_records",
    metadata={
        "description": "Patient medical records",
        "data_owner": "Hospital A",
        "data_retention_period": "5 years",
        "data_encrypted": True
    }
)
print(dataset_report)

Model Audit

Audit a model configuration for compliance:

# Model configuration with compliance features
model_config = {
    "model_type": "RandomForestClassifier",
    "parameters": {
        "n_estimators": 100,
        "max_depth": 5
    },
    "supports_forget_request": True,
    "supports_deletion_request": True,
    "data_processing_purpose": "Medical diagnosis prediction",
    "model_storage_location": "EU"
}

# Audit the model
model_report = auditor.audit_model(
    model_config=model_config,
    model_name="diagnosis_predictor",
    model_documentation={
        "version": "1.0",
        "training_date": "2024-01-01",
        "training_data_description": "Patient records from 2023",
        "model_accuracy": 0.85
    }
)
print(model_report)
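
Before auditing, you may want to confirm that a configuration actually contains the compliance-related fields you intend to declare. The following is a minimal sketch of such a pre-flight check. The missing_fields helper and the EXPECTED_FIELDS tuple are hypothetical: they mirror the keys used in the example above and are not a list that SecureML is known to require.

```python
# Hypothetical pre-flight check, not a SecureML API.
# EXPECTED_FIELDS mirrors the compliance-related keys used in the example
# above; it is an assumption, not a documented requirement of the library.
EXPECTED_FIELDS = (
    "supports_forget_request",
    "supports_deletion_request",
    "data_processing_purpose",
    "model_storage_location",
)


def missing_fields(config, expected=EXPECTED_FIELDS):
    """Return the expected keys that are absent or empty (None or '')."""
    return [
        key for key in expected
        if config.get(key) is None or config.get(key) == ""
    ]


# A deliberately incomplete configuration for illustration
incomplete_config = {
    "model_type": "RandomForestClassifier",
    "supports_forget_request": True,
    "data_processing_purpose": "Medical diagnosis prediction",
}

print(missing_fields(incomplete_config))
```

A value of False counts as present here, since explicitly declaring that a capability is unsupported is different from omitting the field.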

Full Pipeline Audit

Audit an entire ML pipeline including preprocessing steps:

# Define preprocessing steps
preprocessing_steps = [
    {
        "name": "data_cleaning",
        "type": "anonymization",
        "input": "raw_data",
        "output": "anonymized_data",
        "parameters": {
            "method": "k-anonymity",
            "k": 2,
            "sensitive_columns": ["name", "email", "phone"]
        }
    },
    {
        "name": "feature_selection",
        "type": "minimization",
        "input": "anonymized_data",
        "output": "minimized_data",
        "parameters": {
            "selected_features": ["age", "diagnosis", "income"]
        }
    }
]

# Audit the entire pipeline
pipeline_report = auditor.audit_pipeline(
    dataset=df,
    dataset_name="patient_records",
    model=model_config,
    model_name="diagnosis_predictor",
    preprocessing_steps=preprocessing_steps,
    metadata={
        "pipeline_version": "1.0",
        "last_updated": "2024-01-01",
        "data_owner": "Hospital A",
        "data_encrypted": True
    }
)

# Print results for each component
for component, report in pipeline_report.items():
    print(f"\n{component.upper()} Report:")
    print(report)
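
The two preprocessing steps above name k-anonymity and feature selection without showing what they mean for the data. As a rough illustration, the sketch below implements the textbook definitions in plain Python: a dataset is k-anonymous if every combination of quasi-identifier values appears in at least k records, and minimization keeps only the selected features. The function names, the choice of age and zipcode as quasi-identifiers, and the sample records are all assumptions made for this example. SecureML's own implementation may work differently.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return all(count >= k for count in groups.values())


def minimize(records, selected_features):
    """Keep only the selected features in each record."""
    return [
        {column: record[column] for column in selected_features}
        for record in records
    ]


# Made-up records with generalized ages and zip codes
records = [
    {"age": "30-39", "zipcode": "123**", "diagnosis": "Flu", "income": 60000},
    {"age": "30-39", "zipcode": "123**", "diagnosis": "Cold", "income": 52000},
    {"age": "40-49", "zipcode": "678**", "diagnosis": "Flu", "income": 71000},
]

# The third record is alone in its (age, zipcode) group, so k=2 fails
print(is_k_anonymous(records, ["age", "zipcode"], k=2))      # False
# The first two records share one group of size 2
print(is_k_anonymous(records[:2], ["age", "zipcode"], k=2))  # True

# Keep the same features as the feature_selection step above
minimized = minimize(records, ["age", "diagnosis", "income"])
print(minimized[0])
```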

Generating PDF Reports

Generate a detailed PDF report of the compliance audit:

# Generate PDF report
pdf_path = auditor.generate_pdf(
    audit_result=pipeline_report,
    output_file="compliance_report.pdf",
    title="Patient Records Pipeline Compliance Audit",
    logo_path="hospital_logo.png"  # Optional
)
print(f"PDF report generated at: {pdf_path}")

Supported Regulations

SecureML supports compliance checking for multiple privacy regulations:

  • GDPR (General Data Protection Regulation)
  • CCPA (California Consumer Privacy Act)
  • HIPAA (Health Insurance Portability and Accountability Act)

Each regulation has its own requirements, which the audit checks in the following areas:

  • Data minimization
  • Consent management
  • Data storage location
  • Right to be forgotten
  • Data encryption
  • Anonymization requirements
  • Cross-border data transfer rules

The compliance checker automatically applies the checks appropriate to the regulation you specify.