Welcome to RePORTaLiN Documentation

RePORTaLiN is a data extraction pipeline that converts medical research data from Excel files to JSONL format, with built-in PHI/PII de-identification.

Current Version: |version| (October 28, 2025)

Python 3.13+ | Code Optimized 68% | Privacy-Aware

Key Features

🌍 Multi-Country Privacy Compliance
  • 14 countries supported (US, IN, ID, BR, PH, ZA, EU, GB, CA, AU, KE, NG, GH, UG)

  • HIPAA, GDPR, LGPD, DPDPA, POPIA compliance

  • 21 PHI/PII identifier types detected and pseudonymized

🔒 Security & Performance
  • Encryption by default (AES-128)

  • Fast processing with optimized algorithms

  • Date shifting with temporal relationship preservation

  • Audit trails for compliance validation
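The "date shifting with temporal relationship preservation" feature can be illustrated with a minimal sketch. This is not RePORTaLiN's actual implementation: the function names (shift_for, shift_dates), the secret key, and the 365-day window are all assumptions made for the example. The idea is that every date belonging to one subject moves by the same keyed offset, so the intervals between that subject's events stay unchanged.

```python
import hashlib
import hmac
from datetime import date, timedelta


def shift_for(subject_id: str, secret: bytes, max_days: int = 365) -> timedelta:
    """Derive a stable per-subject offset in [-max_days, +max_days] days.

    Hypothetical helper: the keyed hash makes the offset repeatable for a
    given subject and secret, without storing a lookup table.
    """
    digest = hmac.new(secret, subject_id.encode("utf-8"), hashlib.sha256).digest()
    span = 2 * max_days + 1
    return timedelta(days=int.from_bytes(digest[:8], "big") % span - max_days)


def shift_dates(subject_id: str, dates: list[date], secret: bytes) -> list[date]:
    """Apply the same offset to every date of one subject."""
    offset = shift_for(subject_id, secret)
    return [d + offset for d in dates]


# Illustrative data only: two visits 14 days apart.
visits = [date(2024, 1, 10), date(2024, 1, 24)]
shifted = shift_dates("SUBJ-001", visits, b"demo-secret")

# The gap between visits is preserved even though the dates moved.
gap_before = visits[1] - visits[0]
gap_after = shifted[1] - shifted[0]
```

Because the offset depends only on the subject identifier and the secret, re-running the shift on new records for the same subject keeps them aligned with earlier output.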

📊 Data Processing
  • Multi-table detection from complex Excel layouts

  • JSONL output for efficient streaming

  • Progress tracking with real-time feedback

  • Duplicate detection and intelligent column handling
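To show why JSONL suits streaming, here is a small sketch using only the Python standard library. The helper names (write_jsonl, read_jsonl) and the sample records are hypothetical and not part of RePORTaLiN's API; the point is that each record occupies one line, so a consumer can process the file record by record without loading it whole.

```python
import io
import json
from typing import Any, Iterable, Iterator, TextIO


def write_jsonl(records: Iterable[dict[str, Any]], stream: TextIO) -> int:
    """Write one JSON object per line and return the number of records written."""
    count = 0
    for record in records:
        stream.write(json.dumps(record, ensure_ascii=False) + "\n")
        count += 1
    return count


def read_jsonl(stream: TextIO) -> Iterator[dict[str, Any]]:
    """Yield records lazily, one line at a time, skipping blank lines."""
    for line in stream:
        if line.strip():
            yield json.loads(line)


# Illustrative records only; an in-memory buffer stands in for a file.
rows = [
    {"subject_id": "SUBJ-001", "visit": 1},
    {"subject_id": "SUBJ-002", "visit": 1},
]
buffer = io.StringIO()
written = write_jsonl(rows, buffer)

buffer.seek(0)
round_trip = list(read_jsonl(buffer))
```

With a real file, the same functions work on the object returned by open(path, "w", encoding="utf-8") or open(path, encoding="utf-8").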

🔧 Robust Configuration
  • Enhanced error handling

  • Auto-detection of dataset folders

  • Type-safe with full type hints

  • Cross-platform support (macOS, Linux, Windows)

What’s New in 0.8.5

See Changelog for complete version history and detailed release notes.

Documentation Sections

👥 For Users - Learn how to install and use RePORTaLiN

👥 User Guide

🔧 For Developers - Contribute to RePORTaLiN development

🔧 Developer Guide

📚 API Reference - Technical documentation for all modules

📚 API Reference

📋 Additional Information


Note

📖 Documentation Modes

This documentation can be built in two modes:

  • User Mode (make user-mode): Shows only user-facing documentation

  • Developer Mode (make dev-mode): Includes developer guides and API documentation

Alternatively, set the DEVELOPER_MODE environment variable to True or False, or edit conf.py and set developer_mode to True or False.
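As a rough sketch of how a conf.py might turn the DEVELOPER_MODE environment variable into the developer_mode flag: the parse_bool helper and the set of accepted spellings are assumptions for illustration, and the project's real conf.py may handle this differently.

```python
import os
from typing import Optional


def parse_bool(value: Optional[str], default: bool = False) -> bool:
    """Interpret an environment string as a boolean (hypothetical helper).

    Unset variables fall back to the default; matching is case-insensitive.
    """
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


# Assumed behavior: user mode unless DEVELOPER_MODE is explicitly truthy.
developer_mode = parse_bool(os.environ.get("DEVELOPER_MODE"))
```

Under this sketch, running the build with DEVELOPER_MODE=True would enable the developer guides and API documentation, while leaving the variable unset would produce the user-facing build.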

Indices and Tables