scripts.utils.country_regulations module
Country-Specific Data Privacy Regulations Module
Country-specific configurations for patient data de-identification according to different privacy regulations (HIPAA, GDPR, DPDPA, etc.).
Supports: US, IN, ID, BR, PH, ZA, EU, GB, CA, AU, KE, NG, GH, UG
Warning
This module provides reference data and validation patterns based on publicly available privacy regulation information. It is intended as a development aid and does not guarantee regulatory compliance. Organizations must conduct their own legal review and compliance verification with qualified legal counsel.
Example
Basic usage:
from scripts.utils.country_regulations import CountryRegulationManager
# Load regulations for specific countries
manager = CountryRegulationManager(['US', 'IN'])
# Get all data fields
fields = manager.get_all_data_fields()
# Get detection patterns
patterns = manager.get_detection_patterns()
# Export configuration
manager.export_configuration('regulations.json')
Load all countries:
manager = CountryRegulationManager('ALL')
supported = manager.get_supported_countries()
- class scripts.utils.country_regulations.CountryRegulation(country_code, country_name, regulation_name, regulation_acronym, common_fields, specific_fields, description='', requirements=<factory>)[source]
Bases: object
Country data privacy regulation configuration.
- class scripts.utils.country_regulations.CountryRegulationManager(countries=None)[source]
Bases: object
Manages country-specific regulations and data fields.
Supports:
- Loading regulations for one or more countries
- Merging fields from multiple countries
- Generating combined detection patterns
- Exporting configurations
- get_detection_patterns()[source]
Get all regex patterns for detecting country-specific identifiers.
- class scripts.utils.country_regulations.DataField(name, display_name, field_type, privacy_level, required=False, pattern=None, description='', examples=<factory>, country_specific=False)[source]
Bases: object
Data field definition with privacy characteristics.
- field_type: DataFieldType
- privacy_level: PrivacyLevel
- class scripts.utils.country_regulations.DataFieldType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases: Enum
Data field type categorization.
- BIOMETRIC = 'biometric'
- CONTACT = 'contact'
- CUSTOM = 'custom'
- DEMOGRAPHIC = 'demographic'
- FINANCIAL = 'financial'
- IDENTIFIER = 'identifier'
- LOCATION = 'location'
- MEDICAL = 'medical'
- PERSONAL_NAME = 'personal_name'
- class scripts.utils.country_regulations.PrivacyLevel(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases: Enum
Privacy sensitivity levels.
- CRITICAL = 5
- HIGH = 4
- LOW = 2
- MEDIUM = 3
- PUBLIC = 1
- scripts.utils.country_regulations.get_common_fields()[source]
Get common data fields applicable to all countries.
These are universal fields that apply across all privacy regulations.
Changed in version 0.3.0: Added an explicit public API definition via __all__ (6 exports) and enhanced the module docstring with usage examples.
Overview
The country_regulations module provides country-specific data privacy regulation configurations for patient data de-identification. It covers 14 jurisdictions across North America, South America, Europe, Asia-Pacific, and Africa, and is intended to help align de-identification with local privacy laws; as the warning above states, it does not by itself guarantee compliance.
Public API:
__all__ = [
'DataFieldType', # Enum for field types
'PrivacyLevel', # Enum for privacy levels
'DataField', # Dataclass for field definitions
'CountryRegulation', # Dataclass for regulations
'CountryRegulationManager', # Main manager class
'get_common_fields', # Helper function
]
Note
When initializing CountryRegulationManager without specifying a country, it defaults to India (IN) to align with the RePORTaLiN project’s primary focus on tuberculosis research in India.
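The defaulting behaviour described in this note can be pictured with a small stand-alone sketch. The helper name normalize_countries and the error handling are assumptions made for illustration; only the IN default, the 'ALL' shortcut, and the list of supported codes come from this page.

```python
from typing import Iterable, List, Optional, Union

# Country codes copied from the "Supports" line of this page.
SUPPORTED = ["US", "IN", "ID", "BR", "PH", "ZA", "EU", "GB",
             "CA", "AU", "KE", "NG", "GH", "UG"]
DEFAULT_COUNTRY = "IN"  # documented default when no country is given


def normalize_countries(
    countries: Optional[Union[str, Iterable[str]]] = None,
) -> List[str]:
    """Hypothetical helper: turn the constructor argument into a list of codes."""
    if countries is None:
        return [DEFAULT_COUNTRY]
    if isinstance(countries, str):
        if countries.upper() == "ALL":
            return list(SUPPORTED)
        countries = [countries]  # a single code such as "US"
    codes = [code.upper() for code in countries]
    unknown = [code for code in codes if code not in SUPPORTED]
    if unknown:
        # Assumption: unknown codes are rejected rather than silently skipped.
        raise ValueError(f"Unsupported country codes: {unknown}")
    return codes
```

With this sketch, normalize_countries() yields ["IN"], normalize_countries("ALL") yields all 14 codes, and normalize_countries(["US", "IN"]) passes the list through unchanged. The real constructor may handle unknown codes differently.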
Key Features
Multi-Country Support: HIPAA (US), PIPEDA (CA), GDPR (EU), UK GDPR (GB), DPDPA (IN), Law No. 27/2022 (ID), LGPD (BR), DPA 2012 (PH, GH), POPIA (ZA), Privacy Act (AU), DPA 2019 (KE, UG), and NDPR (NG)
Privacy Frameworks: Structured privacy level definitions (PUBLIC to CRITICAL)
Data Field Management: Categorized data fields with privacy characteristics
Identifier Detection: Country-specific identifier patterns (SSN, Aadhaar, NIK, etc.)
Regulatory Requirements: Built-in requirements for data retention, breach notification, and consent
Export/Import: JSON-based configuration export and import
Supported Countries
| Country | Code | Primary Regulation | Key Features |
|---|---|---|---|
| United States | US | HIPAA/HITECH | 18 identifiers, aggregation of ages >89 |
| Canada | CA | PIPEDA + Provincial | Consent required, breach notification |
| India | IN | DPDPA 2023 | Aadhaar protection, children’s data |
| Indonesia | ID | Law No. 27/2022 | NIK protection, consent requirements |
| Brazil | BR | LGPD | CPF protection, data subject rights |
| Philippines | PH | DPA 2012 | SSS/GSIS protection, NPC registration |
| South Africa | ZA | POPIA | ID protection, children’s data |
| European Union | EU | GDPR | Right to erasure, portability |
| United Kingdom | GB | UK GDPR | ICO oversight, Brexit-adapted GDPR |
| Australia | AU | Privacy Act | Medicare/TFN protection, OAIC |
| Kenya | KE | DPA 2019 | ID/Passport protection, ODPC |
| Nigeria | NG | NDPR | NIN/BVN protection, NITDA |
| Ghana | GH | DPA 2012 | Ghana Card protection, DPC |
| Uganda | UG | DPA 2019 | NIN protection, NITA-U |
Core Classes
CountryRegulationManager
Main manager class for country-specific regulations.
- class scripts.utils.country_regulations.CountryRegulationManager(countries=None)[source]
Manages country-specific regulations and data fields.
Supports:
- Loading regulations for one or more countries
- Merging fields from multiple countries
- Generating combined detection patterns
- Exporting configurations
- get_detection_patterns()[source]
Get all regex patterns for detecting country-specific identifiers.
Key Methods:
- get_supported_countries(): Get list of supported country codes
- get_country_info(): Get information about a country’s regulation
- get_all_data_fields(): Get all data fields from loaded countries
- get_country_specific_fields(): Get country-specific fields
- get_high_privacy_fields(): Get HIGH/CRITICAL privacy fields
- get_detection_patterns(): Get regex patterns for identifiers
- export_configuration(): Export configuration to JSON
- get_requirements_summary(): Get regulatory requirements summary
CountryRegulation
Country-specific regulation configuration.
- class scripts.utils.country_regulations.CountryRegulation(country_code, country_name, regulation_name, regulation_acronym, common_fields, specific_fields, description='', requirements=<factory>)[source]
Country data privacy regulation configuration.
- __init__(country_code, country_name, regulation_name, regulation_acronym, common_fields, specific_fields, description='', requirements=<factory>)
DataField
Data field definition with privacy characteristics.
- class scripts.utils.country_regulations.DataField(name, display_name, field_type, privacy_level, required=False, pattern=None, description='', examples=<factory>, country_specific=False)[source]
- __init__(name, display_name, field_type, privacy_level, required=False, pattern=None, description='', examples=<factory>, country_specific=False)
Enums
DataFieldType
Data field type categorization.
- class scripts.utils.country_regulations.DataFieldType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
- BIOMETRIC = 'biometric'
- CONTACT = 'contact'
- CUSTOM = 'custom'
- DEMOGRAPHIC = 'demographic'
- FINANCIAL = 'financial'
- IDENTIFIER = 'identifier'
- LOCATION = 'location'
- MEDICAL = 'medical'
- PERSONAL_NAME = 'personal_name'
PrivacyLevel
Privacy sensitivity levels.
Usage Examples
Basic Usage
from scripts.utils.country_regulations import CountryRegulationManager
# Initialize manager (defaults to India if no country specified)
manager = CountryRegulationManager() # Uses IN (India) by default
# Or explicitly specify a country
manager_us = CountryRegulationManager("US")
# Get all data fields
fields = manager.get_all_data_fields()
for field in fields:
    print(f"{field.name}: {field.privacy_level.name}")
# Get detection patterns
patterns = manager.get_detection_patterns()
for name, pattern in patterns.items():
    print(f"{name}: {pattern.pattern}")
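To show how a name-to-pattern mapping like the one returned by get_detection_patterns() might be applied to text, here is a minimal self-contained sketch. The three regular expressions are illustrative assumptions, not the patterns shipped with the module, and find_identifiers is a hypothetical helper.

```python
import re
from typing import Dict, List, Tuple

# Illustrative patterns only; the module's real patterns may differ.
patterns: Dict[str, re.Pattern] = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "aadhaar": re.compile(r"\b\d{4}\s?\d{4}\s?\d{4}\b"),
    "pan": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),
}


def find_identifiers(
    text: str, compiled: Dict[str, re.Pattern]
) -> List[Tuple[str, str]]:
    """Return (pattern_name, matched_text) pairs, grouped by pattern order."""
    hits: List[Tuple[str, str]] = []
    for name, pattern in compiled.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group(0)))
    return hits


hits = find_identifiers(
    "Patient Aadhaar: 1234 5678 9012, PAN: ABCDE1234F", patterns
)
# hits == [("aadhaar", "1234 5678 9012"), ("pan", "ABCDE1234F")]
```

Matching on format alone tends to produce false positives (any 12-digit number looks like an Aadhaar number here), so results like these are usually treated as candidates for review rather than confirmed identifiers.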
Multi-Country Setup
# Process data from multiple countries
manager = CountryRegulationManager(["US", "IN", "BR"])
# Get supported countries
supported = CountryRegulationManager.get_supported_countries()
print(f"Supported countries: {supported}")
# Get info for each loaded country
for country_code in manager.country_codes:
    info = CountryRegulationManager.get_country_info(country_code)
    print(f"\n{info['name']}")
    print(f"Regulation: {info['regulation']}")
    print(f"Acronym: {info['acronym']}")
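When several countries are loaded, the manager is described as merging their fields. One plausible way to do that, sketched below with stand-in types, is to de-duplicate by field name and keep the strictest privacy level when two countries disagree. Both the Field class and the keep-the-strictest rule are assumptions; the privacy levels in the sample data are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Field:
    """Stand-in for DataField, reduced to what the merge needs."""
    name: str
    privacy_level: int  # 1 (PUBLIC) to 5 (CRITICAL)


def merge_fields(per_country: Dict[str, List[Field]]) -> List[Field]:
    """Merge fields across countries, keeping one entry per field name."""
    merged: Dict[str, Field] = {}
    for fields in per_country.values():
        for candidate in fields:
            current = merged.get(candidate.name)
            # Assumption: on conflict, the stricter level wins.
            if current is None or candidate.privacy_level > current.privacy_level:
                merged[candidate.name] = candidate
    return list(merged.values())


merged = merge_fields({
    "US": [Field("first_name", 4), Field("date_of_birth", 3), Field("ssn", 5)],
    "IN": [Field("first_name", 4), Field("date_of_birth", 4),
           Field("aadhaar_number", 5)],
})
```

Here merged contains four fields (first_name, date_of_birth, ssn, aadhaar_number), with date_of_birth carried at level 4 because that is the stricter of the two sample values.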
Field Validation
# Initialize manager for India
manager = CountryRegulationManager("IN")
# Get a specific field and validate
fields = manager.get_all_data_fields()
aadhaar_field = next((f for f in fields if "AADHAAR" in f.name.upper()), None)
if aadhaar_field:
    # Validate Aadhaar number using field's pattern
    is_valid = aadhaar_field.validate("1234 5678 9012")
    print(f"Valid Aadhaar: {is_valid}")
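For readers wondering what pattern-based validation could look like internally, the following is a sketch of a reduced field class whose validate method does a full-string regex match. DataFieldSketch, its Aadhaar pattern, and the choice to accept any value when no pattern is set are assumptions, not the module's actual behaviour.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataFieldSketch:
    """Reduced stand-in for DataField with only validation-related attributes."""
    name: str
    display_name: str
    pattern: Optional[str] = None
    examples: List[str] = field(default_factory=list)

    def validate(self, value: str) -> bool:
        """Return True if the whole value matches the field's pattern."""
        if self.pattern is None:
            # Assumption: fields without a pattern accept any value.
            return True
        return re.fullmatch(self.pattern, value.strip()) is not None


aadhaar = DataFieldSketch(
    name="aadhaar_number",
    display_name="Aadhaar Number",
    pattern=r"\d{4}\s?\d{4}\s?\d{4}",  # illustrative format: 12 digits, optional spaces
)
```

Under this sketch aadhaar.validate("1234 5678 9012") returns True and aadhaar.validate("1234-5678") returns False. It is a format check only and does not verify any checksum digit.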
Field Privacy Analysis
# Get high privacy fields
manager = CountryRegulationManager(["US", "IN"])
high_privacy_fields = manager.get_high_privacy_fields()
for field in high_privacy_fields:
    print(f"{field.display_name}: {field.privacy_level.name}")
    if field.description:
        print(f" Description: {field.description}")
Export Configuration
# Export configuration for offline use
manager = CountryRegulationManager(["US", "IN"])
manager.export_configuration("config/country_regulations.json")
# Get requirements summary
summary = manager.get_requirements_summary()
for country, requirements in summary.items():
    print(f"\n{country}:")
    for req in requirements:
        print(f" - {req}")
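The JSON layout written by export_configuration() is not documented on this page. As a rough idea of how such an export could be built, the sketch below serializes dataclass fields with enum members written by name. The FieldSketch and Level types, the top-level "fields" key, and this stand-alone export_configuration function are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum
from pathlib import Path
from typing import Any, Dict, List


class Level(Enum):
    """Two members of a PrivacyLevel-like enum, enough for the example."""
    HIGH = 4
    CRITICAL = 5


@dataclass
class FieldSketch:
    name: str
    privacy_level: Level
    examples: List[str] = field(default_factory=list)


def export_configuration(fields: List[FieldSketch], path: str) -> Dict[str, Any]:
    """Write fields to a JSON file and return the payload that was written."""
    payload = {
        "fields": [
            # Enum members are not JSON serializable, so store their names.
            {**asdict(f), "privacy_level": f.privacy_level.name}
            for f in fields
        ]
    }
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)  # create e.g. config/ if missing
    out.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return payload
```

Storing the enum name rather than its numeric value keeps the exported file readable and lets a loader map it back with Level[name].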
Integration with De-identification
from scripts.deidentify import DeidentificationEngine
from scripts.utils.country_regulations import CountryRegulationManager
# Set up country-specific de-identification
reg_manager = CountryRegulationManager("IN")
# Get detection patterns for use in de-identification
patterns = reg_manager.get_detection_patterns()
# Initialize de-identification engine (passes country code to engine)
engine = DeidentificationEngine(country_code="IN")
# De-identify with country-specific patterns
text = "Patient Aadhaar: 1234 5678 9012, PAN: ABCDE1234F"
deidentified = engine.deidentify_text(text)
print(deidentified)
Command-Line Interface
The module can be used as a standalone script:
# List all supported countries
python -m scripts.utils.country_regulations --list
# Show regulations for specific countries
python -m scripts.utils.country_regulations -c US IN
# Show all data fields
python -m scripts.utils.country_regulations -c US --show-fields
# Export configuration for multiple countries
python -m scripts.utils.country_regulations -c US IN BR --export config/regulations.json
# Show all countries at once
python -m scripts.utils.country_regulations -c ALL
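The flags used in the commands above could be declared with argparse roughly as follows. This is a sketch of one possible definition, not the module's actual parser; in particular the long option name --countries and the help strings are assumptions, since only -c appears on this page.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser accepting the flags shown in the CLI examples."""
    parser = argparse.ArgumentParser(
        prog="country_regulations",
        description="Inspect country-specific privacy regulation settings.",
    )
    parser.add_argument("--list", action="store_true",
                        help="list supported country codes")
    # "--countries" is a hypothetical long form of the documented -c flag.
    parser.add_argument("-c", "--countries", nargs="+", metavar="CODE",
                        help="one or more country codes, or ALL")
    parser.add_argument("--show-fields", action="store_true",
                        help="show data fields for the selected countries")
    parser.add_argument("--export", metavar="PATH",
                        help="write the configuration to a JSON file")
    return parser


args = build_parser().parse_args(
    ["-c", "US", "IN", "BR", "--export", "config/regulations.json"]
)
# args.countries == ["US", "IN", "BR"]; args.export == "config/regulations.json"
```

Using nargs="+" lets -c take several space-separated codes while still stopping at the next option, which matches the "-c US IN BR --export ..." form shown above.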
Privacy Levels
The module defines five privacy sensitivity levels:
PUBLIC (Level 1): Non-sensitive, publicly available data
LOW (Level 2): Low-risk identifiers (geographic regions, dates without times)
MEDIUM (Level 3): Moderate-risk identifiers (full dates, ages, zip codes)
HIGH (Level 4): High-risk identifiers (names, phone numbers, emails)
CRITICAL (Level 5): Critical identifiers (SSN, medical records, biometrics)
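Because the levels carry integer values from 1 to 5, selecting the most sensitive fields comes down to a numeric comparison. The sketch below mirrors the documented enum values in a local stand-in class; the is_high_privacy helper and the sample field names are illustrative assumptions, with levels assigned according to the list above.

```python
from enum import Enum
from typing import Dict


class PrivacyLevel(Enum):
    """Local stand-in mirroring the documented PrivacyLevel values."""
    PUBLIC = 1
    LOW = 2
    MEDIUM = 3
    HIGH = 4
    CRITICAL = 5


def is_high_privacy(level: PrivacyLevel) -> bool:
    """True for HIGH and CRITICAL, compared by numeric value."""
    return level.value >= PrivacyLevel.HIGH.value


# Sample fields, with levels taken from the descriptions in this section.
levels_by_field: Dict[str, PrivacyLevel] = {
    "region": PrivacyLevel.LOW,
    "date_of_birth": PrivacyLevel.MEDIUM,
    "email": PrivacyLevel.HIGH,
    "ssn": PrivacyLevel.CRITICAL,
}

high_privacy = [name for name, level in levels_by_field.items()
                if is_high_privacy(level)]
# high_privacy == ["email", "ssn"]
```

Comparing .value is needed here because a plain Enum does not support ordering operators; an IntEnum would allow level >= PrivacyLevel.HIGH directly, but the module is documented as using Enum.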
Regulatory Requirements
Each country regulation includes:
Retention Requirements: Data retention periods (days)
Breach Notification: Notification timelines and authorities
Consent Requirements: Types of consent needed
Data Subject Rights: Right to access, correction, erasure, portability
Cross-Border Transfer: Rules for international data transfer
Special Categories: Additional protections for sensitive data
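The categories listed above could be held in a small structure like the one sketched here, which also shows how a breach notification window translates into a concrete deadline. The class name, attribute names, and helper function are hypothetical and do not reflect the module's real requirements schema; the 72-hour value is the GDPR window for notifying the supervisory authority, and the retention period is deliberately left unset.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class RequirementsSketch:
    """Hypothetical container for a few of the requirement categories."""
    retention_days: Optional[int] = None
    breach_notification_hours: Optional[int] = None
    consent_types: List[str] = field(default_factory=list)
    data_subject_rights: List[str] = field(default_factory=list)


def notification_deadline(
    discovered_at: datetime, req: RequirementsSketch
) -> Optional[datetime]:
    """Latest notification time, or None if no window is configured."""
    if req.breach_notification_hours is None:
        return None
    return discovered_at + timedelta(hours=req.breach_notification_hours)


eu = RequirementsSketch(
    breach_notification_hours=72,
    data_subject_rights=["access", "correction", "erasure", "portability"],
)
deadline = notification_deadline(datetime(2024, 1, 1, 9, 0), eu)
# deadline == datetime(2024, 1, 4, 9, 0)
```

Keeping unset values as None rather than 0 makes it possible to distinguish "no requirement recorded" from "no retention allowed".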
See Also
- scripts.deidentify module
De-identification engine that uses country regulations
- Country-Specific Privacy Rules
Detailed user guide on country-specific regulations
- De-identification
General de-identification documentation
- Extending RePORTaLiN
Guide for adding new countries