scripts.utils.country_regulations module

Country-Specific Data Privacy Regulations Module

Country-specific configurations for patient data de-identification according to different privacy regulations (HIPAA, GDPR, DPDPA, etc.).

Supports: US, IN, ID, BR, PH, ZA, EU, GB, CA, AU, KE, NG, GH, UG

Warning

This module provides reference data and validation patterns based on publicly available privacy regulation information. It is intended as a development aid and does not guarantee regulatory compliance. Organizations must conduct their own legal review and compliance verification with qualified legal counsel.

Example

Basic usage:

from scripts.utils.country_regulations import CountryRegulationManager

# Load regulations for specific countries
manager = CountryRegulationManager(['US', 'IN'])

# Get all data fields
fields = manager.get_all_data_fields()

# Get detection patterns
patterns = manager.get_detection_patterns()

# Export configuration
manager.export_configuration('regulations.json')

Load all countries:

manager = CountryRegulationManager('ALL')
supported = manager.get_supported_countries()
class scripts.utils.country_regulations.CountryRegulation(country_code, country_name, regulation_name, regulation_acronym, common_fields, specific_fields, description='', requirements=<factory>)[source]

Bases: object

Country data privacy regulation configuration.

common_fields: List[DataField]
country_code: str
country_name: str
description: str = ''
get_all_fields()[source]

Get all data fields (common + specific).

Return type:

List[DataField]

get_high_privacy_fields()[source]

Get fields with HIGH or CRITICAL privacy level.

Return type:

List[DataField]

regulation_acronym: str
regulation_name: str
requirements: List[str]
specific_fields: List[DataField]
to_dict()[source]

Convert to dictionary for serialization.

Return type:

Dict[str, Any]
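As a rough illustration of what serialization to a dictionary can involve, the sketch below converts a small dataclass with an enum member into a JSON-safe dict. The names `Level` and `SerializableField` are hypothetical stand-ins, and storing the enum by its name is an assumption; the module's actual `to_dict()` output format is not shown on this page.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Any, Dict


class Level(Enum):
    """Hypothetical stand-in for an enum-valued attribute."""
    HIGH = 4
    CRITICAL = 5


@dataclass
class SerializableField:
    """Hypothetical stand-in; not the module's DataField."""
    name: str
    privacy_level: Level

    def to_dict(self) -> Dict[str, Any]:
        data = asdict(self)
        # Assumption: enums are stored by name so the result is JSON-safe.
        data["privacy_level"] = self.privacy_level.name
        return data


record = SerializableField("ssn", Level.CRITICAL).to_dict()
encoded = json.dumps(record)  # would raise TypeError if the enum were left as-is
```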

class scripts.utils.country_regulations.CountryRegulationManager(countries=None)[source]

Bases: object

Manages country-specific regulations and data fields.

Supports:

  • Loading regulations for one or more countries

  • Merging fields from multiple countries

  • Generating combined detection patterns

  • Exporting configurations

__init__(countries=None)[source]

Initialize regulation manager.

Parameters:

countries (Union[List[str], str, None]) – List of country codes or ‘ALL’ for all countries. If None, defaults to IN (India).
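The accepted forms of `countries` (a list of codes, a single string, 'ALL', or None) could be normalized along the following lines. This is a minimal sketch, not the module's code: `normalize_countries` and `SUPPORTED` are hypothetical names, and raising `ValueError` for unknown codes is an assumption.

```python
from typing import List, Union

# Codes copied from the "Supports" line at the top of this page.
SUPPORTED = ["US", "IN", "ID", "BR", "PH", "ZA", "EU",
             "GB", "CA", "AU", "KE", "NG", "GH", "UG"]


def normalize_countries(countries: Union[List[str], str, None] = None) -> List[str]:
    """Hypothetical helper: turn the `countries` argument into a list of codes."""
    if countries is None:
        return ["IN"]                    # documented default (India)
    if isinstance(countries, str):
        if countries.upper() == "ALL":
            return list(SUPPORTED)       # every supported code
        countries = [countries]          # a single code such as "US"
    codes = [code.upper() for code in countries]
    unknown = [code for code in codes if code not in SUPPORTED]
    if unknown:
        # Assumption: the real module may handle unknown codes differently.
        raise ValueError(f"Unsupported country codes: {unknown}")
    return codes
```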

export_configuration(output_path)[source]

Export current configuration to JSON file.

Parameters:

output_path (Union[str, Path]) – Path to output file

Raises:

IOError – If file cannot be written

Return type:

None
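A JSON export of this kind can be sketched with the standard library alone. The function name `export_configuration_sketch` and the shape of the `config` dictionary are hypothetical, and creating missing parent directories is an assumption rather than documented behavior. In Python 3, `IOError` is an alias of `OSError`, so a failed write surfaces as `OSError`.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Union


def export_configuration_sketch(config: Dict[str, Any],
                                output_path: Union[str, Path]) -> None:
    """Write `config` as pretty-printed JSON (hypothetical stand-in)."""
    path = Path(output_path)
    # Assumption: create missing parent directories; the real method may not.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2, ensure_ascii=False)


# Round-trip through a temporary directory to show the file is valid JSON.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "config" / "regulations.json"
    export_configuration_sketch({"countries": ["US", "IN"]}, target)
    loaded = json.loads(target.read_text(encoding="utf-8"))
```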

get_all_data_fields(include_common=True)[source]

Get all data fields from all loaded countries.

Parameters:

include_common (bool) – Whether to include common fields

Return type:

List[DataField]

Returns:

Combined list of all unique data fields
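One plausible way to build a combined list of unique fields is to deduplicate by field name while keeping first-seen order. The sketch below assumes uniqueness is decided by `name`; the page does not say which key the module uses. `SimpleField` and `merge_unique_fields` are hypothetical names.

```python
from typing import Dict, Iterable, List, NamedTuple


class SimpleField(NamedTuple):
    """Hypothetical stand-in for a data field."""
    name: str
    source: str


def merge_unique_fields(field_lists: Iterable[List[SimpleField]]) -> List[SimpleField]:
    """Merge several field lists, keeping the first field seen for each name."""
    seen: Dict[str, SimpleField] = {}
    for fields in field_lists:
        for item in fields:
            if item.name not in seen:   # assumption: uniqueness is by name
                seen[item.name] = item
    return list(seen.values())          # dicts preserve insertion order


us_fields = [SimpleField("full_name", "common"), SimpleField("ssn", "US")]
in_fields = [SimpleField("full_name", "common"), SimpleField("aadhaar", "IN")]
merged = merge_unique_fields([us_fields, in_fields])
```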

classmethod get_country_info(country_code)[source]

Get information about a country’s regulation.

Parameters:

country_code (str) – ISO country code

Return type:

Dict[str, str]

Returns:

Dictionary with country information

get_country_specific_fields(country_code=None)[source]

Get country-specific fields.

Parameters:

country_code (Optional[str]) – Specific country code or None for all

Return type:

List[DataField]

Returns:

List of country-specific fields

get_detection_patterns()[source]

Get all regex patterns for detecting country-specific identifiers.

Return type:

Dict[str, Pattern]

Returns:

Dictionary mapping field name to compiled regex pattern
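A dictionary of compiled patterns like the one returned here can be applied to free text as sketched below. The two regexes are illustrative only and are not the module's actual patterns; `find_identifiers` is a hypothetical helper.

```python
import re
from typing import Dict, List, Tuple

# Illustrative patterns only; NOT the regexes shipped with the module.
ILLUSTRATIVE_PATTERNS: Dict[str, re.Pattern] = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "in_aadhaar": re.compile(r"\b\d{4}\s\d{4}\s\d{4}\b"),
}


def find_identifiers(text: str,
                     patterns: Dict[str, re.Pattern]) -> List[Tuple[str, str]]:
    """Return (field name, matched text) pairs for every pattern hit."""
    hits: List[Tuple[str, str]] = []
    for name, pattern in patterns.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group(0)))
    return hits


hits = find_identifiers("SSN 123-45-6789, Aadhaar 1234 5678 9012",
                        ILLUSTRATIVE_PATTERNS)
```

With the real module, the same loop would presumably take the result of `manager.get_detection_patterns()` in place of `ILLUSTRATIVE_PATTERNS`.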

get_high_privacy_fields()[source]

Get all fields with HIGH or CRITICAL privacy level.

Return type:

List[DataField]

get_requirements_summary()[source]

Get summary of all regulatory requirements.

Return type:

Dict[str, List[str]]

Returns:

Dictionary mapping country code to list of requirements

classmethod get_supported_countries()[source]

Get list of all supported country codes.

Return type:

List[str]

class scripts.utils.country_regulations.DataField(name, display_name, field_type, privacy_level, required=False, pattern=None, description='', examples=<factory>, country_specific=False)[source]

Bases: object

Data field definition with privacy characteristics.

compiled_pattern: Optional[Pattern] = None
country_specific: bool = False
description: str = ''
display_name: str
examples: List[str]
field_type: DataFieldType
name: str
pattern: Optional[str] = None
privacy_level: PrivacyLevel
required: bool = False
validate(value)[source]

Validate value against field’s pattern.

Return type:

bool
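Pattern-based validation of this kind might look like the sketch below, which compiles the pattern once and checks the whole value. `FieldSketch` is a hypothetical stand-in, not the module's `DataField`; using `fullmatch` and returning True when no pattern is set are both assumptions.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FieldSketch:
    """Hypothetical stand-in with only the attributes needed for validation."""
    name: str
    pattern: Optional[str] = None
    compiled_pattern: Optional[re.Pattern] = field(default=None, init=False, repr=False)

    def __post_init__(self) -> None:
        # Compile once so repeated validate() calls reuse the same object.
        if self.pattern is not None:
            self.compiled_pattern = re.compile(self.pattern)

    def validate(self, value: str) -> bool:
        if self.compiled_pattern is None:
            return True   # assumption: nothing to check without a pattern
        # Assumption: the entire value must match, not just a substring.
        return self.compiled_pattern.fullmatch(value) is not None


ssn = FieldSketch("ssn", r"\d{3}-\d{2}-\d{4}")
notes = FieldSketch("notes")
```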

class scripts.utils.country_regulations.DataFieldType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

Data field type categorization.

BIOMETRIC = 'biometric'
CONTACT = 'contact'
CUSTOM = 'custom'
DEMOGRAPHIC = 'demographic'
FINANCIAL = 'financial'
IDENTIFIER = 'identifier'
LOCATION = 'location'
MEDICAL = 'medical'
PERSONAL_NAME = 'personal_name'
class scripts.utils.country_regulations.PrivacyLevel(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

Privacy sensitivity levels.

CRITICAL = 5
HIGH = 4
LOW = 2
MEDIUM = 3
PUBLIC = 1
scripts.utils.country_regulations.get_common_fields()[source]

Get common data fields applicable to all countries.

These are universal fields that apply across all privacy regulations.

Return type:

List[DataField]

Returns:

List of common DataField objects

Changed in version 0.3.0: Added explicit public API definition via __all__ (6 exports) and enhanced module docstring with usage examples.

Overview

The country_regulations module provides country-specific reference data on data privacy regulations for patient data de-identification. It covers 14 country and region codes across North America, South America, Europe, Asia-Pacific, and Africa. It is a development aid for aligning with local privacy laws and does not by itself guarantee compliance.

Public API:

__all__ = [
    'DataFieldType',       # Enum for field types
    'PrivacyLevel',        # Enum for privacy levels
    'DataField',           # Dataclass for field definitions
    'CountryRegulation',   # Dataclass for regulations
    'CountryRegulationManager',  # Main manager class
    'get_common_fields',   # Helper function
]

Note

When initializing CountryRegulationManager without specifying a country, it defaults to India (IN) to align with the RePORTaLiN project’s primary focus on tuberculosis research in India.

Key Features

  • Multi-Country Support: HIPAA (US), PIPEDA (CA), GDPR (EU), UK GDPR (GB), DPDPA (IN), Law No. 27/2022 (ID), LGPD (BR), DPA 2012 (PH, GH), POPIA (ZA), Privacy Act (AU), DPA 2019 (KE, UG), and NDPR (NG)

  • Privacy Frameworks: Structured privacy level definitions (PUBLIC to CRITICAL)

  • Data Field Management: Categorized data fields with privacy characteristics

  • Identifier Detection: Country-specific identifier patterns (SSN, Aadhaar, NIK, etc.)

  • Regulatory Requirements: Built-in requirements for data retention, breach notification, and consent

  • Export: JSON-based configuration export via export_configuration()

Supported Countries

Supported Privacy Regulations by Country

Country         Code  Primary Regulation   Key Features
--------------  ----  -------------------  -------------------------------------
United States   US    HIPAA/HITECH         18 identifiers, ages >89 aggregation
Canada          CA    PIPEDA + Provincial  Consent required, breach notification
India           IN    DPDPA 2023           Aadhaar protection, children’s data
Indonesia       ID    Law No. 27/2022      NIK protection, consent requirements
Brazil          BR    LGPD                 CPF protection, data subject rights
Philippines     PH    DPA 2012             SSS/GSIS protection, NPC registration
South Africa    ZA    POPIA                ID protection, children’s data
European Union  EU    GDPR                 Right to erasure, portability
United Kingdom  GB    UK GDPR              ICO oversight, Brexit-adapted GDPR
Australia       AU    Privacy Act          Medicare/TFN protection, OAIC
Kenya           KE    DPA 2019             ID/Passport protection, ODPC
Nigeria         NG    NDPR                 NIN/BVN protection, NITDA
Ghana           GH    DPA 2012             Ghana Card protection, DPC
Uganda          UG    DPA 2019             NIN protection, NITA-U

Core Classes

CountryRegulationManager

Main manager class for country-specific regulations.

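Key Methods:

  • get_all_data_fields(include_common=True): combined list of unique data fields from all loaded countries

  • get_country_specific_fields(country_code=None): country-specific fields for one country, or for all loaded countries if None

  • get_detection_patterns(): dictionary mapping field name to compiled regex pattern

  • get_high_privacy_fields(): fields with HIGH or CRITICAL privacy level

  • get_requirements_summary(): dictionary mapping country code to list of requirements

  • export_configuration(output_path): export the current configuration to a JSON file

  • get_supported_countries() (classmethod): list of all supported country codes

  • get_country_info(country_code) (classmethod): dictionary with information about a country’s regulation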

CountryRegulation

Country-specific regulation configuration.

DataField

Data field definition with privacy characteristics.

Enums

DataFieldType

Data field type categorization.

PrivacyLevel

Privacy sensitivity levels.

Usage Examples

Basic Usage

from scripts.utils.country_regulations import CountryRegulationManager

# Initialize manager (defaults to India if no country specified)
manager = CountryRegulationManager()  # Uses IN (India) by default

# Or explicitly specify a country
manager_us = CountryRegulationManager("US")

# Get all data fields
fields = manager.get_all_data_fields()
for field in fields:
    print(f"{field.name}: {field.privacy_level.name}")

# Get detection patterns
patterns = manager.get_detection_patterns()
for name, pattern in patterns.items():
    print(f"{name}: {pattern.pattern}")

Multi-Country Setup

# Process data from multiple countries
manager = CountryRegulationManager(["US", "IN", "BR"])

# Get supported countries
supported = CountryRegulationManager.get_supported_countries()
print(f"Supported countries: {supported}")

# Get info for each loaded country
for country_code in manager.country_codes:
    info = CountryRegulationManager.get_country_info(country_code)
    print(f"\n{info['name']}")
    print(f"Regulation: {info['regulation']}")
    print(f"Acronym: {info['acronym']}")

Field Validation

# Initialize manager for India
manager = CountryRegulationManager("IN")

# Get a specific field and validate
fields = manager.get_all_data_fields()
aadhaar_field = next((f for f in fields if "AADHAAR" in f.name.upper()), None)

if aadhaar_field:
    # Validate Aadhaar number using field's pattern
    is_valid = aadhaar_field.validate("1234 5678 9012")
    print(f"Valid Aadhaar: {is_valid}")

Field Privacy Analysis

# Get high privacy fields
manager = CountryRegulationManager(["US", "IN"])
high_privacy_fields = manager.get_high_privacy_fields()

for field in high_privacy_fields:
    print(f"{field.display_name}: {field.privacy_level.name}")
    if field.description:
        print(f"  Description: {field.description}")

Export Configuration

# Export configuration for offline use
manager = CountryRegulationManager(["US", "IN"])
manager.export_configuration("config/country_regulations.json")

# Get requirements summary
summary = manager.get_requirements_summary()
for country, requirements in summary.items():
    print(f"\n{country}:")
    for req in requirements:
        print(f"  - {req}")

Integration with De-identification

from scripts.deidentify import DeidentificationEngine
from scripts.utils.country_regulations import CountryRegulationManager

# Set up country-specific de-identification
reg_manager = CountryRegulationManager("IN")

# Inspect the detection patterns for the loaded country
# (in this example they are not passed to the engine below)
patterns = reg_manager.get_detection_patterns()

# Initialize de-identification engine (passes country code to engine)
engine = DeidentificationEngine(country_code="IN")

# De-identify with country-specific patterns
text = "Patient Aadhaar: 1234 5678 9012, PAN: ABCDE1234F"
deidentified = engine.deidentify_text(text)
print(deidentified)

Command-Line Interface

The module can be used as a standalone script:

# List all supported countries
python -m scripts.utils.country_regulations --list

# Show regulations for specific countries
python -m scripts.utils.country_regulations -c US IN

# Show all data fields
python -m scripts.utils.country_regulations -c US --show-fields

# Export configuration for multiple countries
python -m scripts.utils.country_regulations -c US IN BR --export config/regulations.json

# Show all countries at once
python -m scripts.utils.country_regulations -c ALL
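For readers wiring up a similar interface, the flags shown above map naturally onto `argparse`. The parser below is a hypothetical reconstruction based only on those commands: the long option name `--countries`, the help texts, and the default of IN when `-c` is omitted are assumptions, and the module's real parser may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in the commands above."""
    parser = argparse.ArgumentParser(
        prog="country_regulations",
        description="Inspect country-specific privacy regulation data.",
    )
    parser.add_argument("--list", action="store_true",
                        help="list all supported countries")
    # Assumption: '--countries' as a long form of '-c', defaulting to IN.
    parser.add_argument("-c", "--countries", nargs="+", metavar="CODE",
                        default=["IN"], help="country codes, or ALL")
    parser.add_argument("--show-fields", action="store_true",
                        help="show all data fields")
    parser.add_argument("--export", metavar="PATH",
                        help="write the configuration to a JSON file")
    return parser


# Parse the same arguments as the export command shown above.
args = build_parser().parse_args(
    ["-c", "US", "IN", "BR", "--export", "config/regulations.json"]
)
```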

Privacy Levels

The module defines five privacy sensitivity levels:

  1. PUBLIC (Level 1): Non-sensitive, publicly available data

  2. LOW (Level 2): Low-risk identifiers (geographic regions, dates without times)

  3. MEDIUM (Level 3): Moderate-risk identifiers (full dates, ages, zip codes)

  4. HIGH (Level 4): High-risk identifiers (names, phone numbers, emails)

  5. CRITICAL (Level 5): Critical identifiers (SSN, medical records, biometrics)
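Because the levels carry integer values from 1 to 5, a "HIGH or CRITICAL" filter can be expressed as a threshold comparison. The sketch below redefines the enum locally so that it runs on its own; the helper `high_privacy_names` and the sample field names are hypothetical.

```python
from enum import Enum
from typing import Dict, List


class PrivacyLevel(Enum):
    """Local stand-in mirroring the values documented on this page."""
    PUBLIC = 1
    LOW = 2
    MEDIUM = 3
    HIGH = 4
    CRITICAL = 5


def high_privacy_names(levels: Dict[str, PrivacyLevel]) -> List[str]:
    """Return field names whose level is HIGH or CRITICAL."""
    threshold = PrivacyLevel.HIGH.value
    return [name for name, level in levels.items() if level.value >= threshold]


# Hypothetical field names, classified as in the list above.
sample = {
    "region": PrivacyLevel.LOW,
    "zip_code": PrivacyLevel.MEDIUM,
    "phone": PrivacyLevel.HIGH,
    "ssn": PrivacyLevel.CRITICAL,
}
print(high_privacy_names(sample))
```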

Regulatory Requirements

Each country regulation includes:

  • Retention Requirements: Data retention periods (days)

  • Breach Notification: Notification timelines and authorities

  • Consent Requirements: Types of consent needed

  • Data Subject Rights: Right to access, correction, erasure, portability

  • Cross-Border Transfer: Rules for international data transfer

  • Special Categories: Additional protections for sensitive data

See Also

scripts.deidentify module

De-identification engine that uses country regulations

Country-Specific Privacy Rules

Detailed user guide on country-specific regulations

De-identification

General de-identification documentation

Extending RePORTaLiN

Guide for adding new countries