Sphinx Auto-Documentation Guide
For Developers: Automated Documentation System
This guide explains how Sphinx automatically generates documentation from your code and how to extend that automation toward a "write code → instant docs" workflow.
Last Updated: October 23, 2025
Current Automation Status
What's Already Automated
API Documentation from Docstrings (FULLY AUTOMATED)
Sphinx autodoc automatically extracts:
Function signatures with type hints
Docstrings (Google/NumPy style)
Class hierarchies and inheritance
Module-level documentation
Return types and parameters
Example:
# In your code: config.py
from typing import Optional

def normalize_dataset_name(folder_name: Optional[str]) -> str:
    """
    Normalize a dataset folder name by removing common suffixes.

    Args:
        folder_name: Dataset folder name to normalize

    Returns:
        Normalized dataset name without common suffixes
    """
    # ... implementation
Result: The documentation appears automatically on the page built from docs/sphinx/api/config.rst when you run make html!
Type Hints Rendering (FULLY AUTOMATED)
The sphinx-autodoc-typehints extension automatically renders:
Function parameters with types
Return type annotations
Variable type hints
Complex types (List, Dict, Optional, etc.)
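To see the raw material the extension works from, the annotations can be inspected with the standard library. This is a minimal sketch, not how sphinx-autodoc-typehints is implemented; the function body is a hypothetical stand-in, since only the annotations matter here.

```python
from typing import Optional, get_type_hints


def normalize_dataset_name(folder_name: Optional[str]) -> str:
    """Hypothetical stand-in body; only the annotations matter here."""
    return (folder_name or "").strip()


# get_type_hints resolves the annotations into real type objects,
# which a documentation tool can then render next to each parameter.
hints = get_type_hints(normalize_dataset_name)
for name, annotation in hints.items():
    print(f"{name}: {annotation}")
```

The "return" key holds the return annotation; every other key is a parameter name.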
Version Tracking (SEMI-AUTOMATED)
Version is automatically pulled from __version__.py:
# docs/sphinx/conf.py
from __version__ import __version__
version: str = __version__
release: str = __version__
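The import above only succeeds if the folder containing __version__.py is on sys.path when Sphinx loads conf.py. A minimal sketch of one way to arrange that, assuming __version__.py lives two levels above docs/sphinx/ (the helper name add_project_root is hypothetical):

```python
import sys
from pathlib import Path


def add_project_root(conf_file: str, levels_up: int = 2) -> Path:
    """Put the directory `levels_up` above conf.py's folder on sys.path."""
    root = Path(conf_file).resolve().parents[levels_up]
    if str(root) not in sys.path:
        sys.path.insert(0, str(root))
    return root
```

In conf.py this would be called as add_project_root(__file__) before the from __version__ import __version__ line.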
Cross-References (AUTOMATED)
Sphinx automatically creates links between:
Function references
Class references
Module references
External library docs (via intersphinx)
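External links depend on an intersphinx_mapping entry in conf.py. The fragment below is a sketch of what such an entry might look like; which projects to list is an assumption, chosen here because the examples in this guide use the standard library and pandas.

```python
# docs/sphinx/conf.py (sketch)
# Maps a short project name to (base URL of its docs, inventory location).
# None tells Sphinx to fetch objects.inv from the base URL.
intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
    "pandas": ("https://pandas.pydata.org/docs", None),
}
```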
What's Still Manual
User Guides - Manual writing required
Tutorials and how-tos
Conceptual explanations
Examples and workflows
Developer Guides - Manual writing required
Architecture decisions
Design patterns
Best practices
Changelog - Manual updates required
Version history
Breaking changes
Migration guides
How It Works
The Autodoc Pipeline
1. You write code with docstrings
↓
2. Sphinx autodoc imports your Python modules
↓
3. Extracts docstrings, signatures, types
↓
4. Inserts that content where the .rst files use autodoc directives
↓
5. Builds HTML automatically
Example Flow:
# Step 1: Write code (config.py)
def ensure_directories() -> None:
    """Create all required output directories.

    This function creates:
        - RESULTS_DIR
        - CLEAN_DATASET_DIR
        - DICTIONARY_JSON_OUTPUT_DIR

    Raises:
        OSError: If directory creation fails
    """
    os.makedirs(RESULTS_DIR, exist_ok=True)
    # ...

# Step 2: Run Sphinx build
cd docs/sphinx && make html

# Step 3: Documentation is automatically generated!
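The extraction step can be approximated with the standard inspect module. This is an illustrative sketch only, not autodoc's actual implementation; describe_module is a hypothetical helper that collects the signature and the first docstring line for each function defined in a module.

```python
import inspect
import types


def describe_module(module: types.ModuleType) -> list[str]:
    """Return one 'name(signature): summary' line per documented function."""
    lines = []
    for name, func in inspect.getmembers(module, inspect.isfunction):
        doc = inspect.getdoc(func)
        if func.__module__ != module.__name__ or not doc:
            continue  # skip imported or undocumented functions
        summary = doc.splitlines()[0]
        lines.append(f"{name}{inspect.signature(func)}: {summary}")
    return lines
```

Like autodoc, the sketch works on an imported module object, which is why the modules being documented must be importable at build time.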
Current Setup
Sphinx Extensions Enabled
# docs/sphinx/conf.py
extensions = [
    'sphinx.ext.autodoc',        # Auto-generate from docstrings
    'sphinx.ext.viewcode',       # Link to source code
    'sphinx.ext.intersphinx',    # Link to external docs
    'sphinx.ext.napoleon',       # Google/NumPy docstrings
    'sphinx_autodoc_typehints',  # Render type hints
]
Auto-Documentation Files
These files use the automodule directive to auto-generate content:
docs/sphinx/api/
├── modules.rst                     # Auto-generated module index
├── config.rst                      # Auto-docs for config.py
├── main.rst                        # Auto-docs for main.py
├── scripts.rst                     # Auto-docs for scripts package
├── scripts.deidentify.rst          # Auto-docs for deidentify.py
├── scripts.extract_data.rst        # Auto-docs for extract_data.py
├── scripts.load_dictionary.rst     # Auto-docs for load_dictionary.py
└── scripts.utils.*.rst             # Auto-docs for utils modules
Each uses:
.. automodule:: config
   :members:
   :undoc-members:
   :show-inheritance:
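When a new module is added, a matching stub file is needed. The sphinx-apidoc tool can generate these; the sketch below only illustrates the shape of such a file, and automodule_stub is a hypothetical helper name.

```python
def automodule_stub(module_name: str) -> str:
    """Render a minimal .rst page that documents one module via autodoc."""
    title = f"{module_name} module"
    return "\n".join([
        title,
        "=" * len(title),  # reST underline must be at least as long as the title
        "",
        f".. automodule:: {module_name}",
        "   :members:",
        "   :undoc-members:",
        "   :show-inheritance:",
        "",
    ])
```

For example, writing automodule_stub("scripts.deidentify") to docs/sphinx/api/scripts.deidentify.rst would produce a page of the form shown above.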
Enhancing Automation
Level 1: Watch Mode (AVAILABLE NOW)
Auto-rebuild documentation when files change:
# Install sphinx-autobuild
pip install sphinx-autobuild
# Run in watch mode
cd docs/sphinx
sphinx-autobuild . _build/html
# Serves the docs locally and auto-refreshes open pages when watched files change ✨
# (pass --open-browser to launch a browser automatically)
Makefile target (add this):
.PHONY: docs-watch
docs-watch:
	@cd docs/sphinx && sphinx-autobuild . _build/html --open-browser
Then just:
make docs-watch
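By default sphinx-autobuild watches only the documentation source directory. To rebuild whenever you save a Python file with docstrings, also pass --watch with the path to your code (for example, --watch ../../scripts).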
Level 2: Git Hook Integration (RECOMMENDED)
Automatically rebuild docs when you commit code changes:
Create `.git/hooks/post-commit`:
#!/bin/bash
# Auto-rebuild documentation after code commits
echo "🔧 Rebuilding documentation..."
cd docs/sphinx
make html
echo "✅ Documentation updated!"
Then make the hook executable:
chmod +x .git/hooks/post-commit
Now docs rebuild every time you commit! ✨
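Rebuilding on every commit can be wasteful when a commit touches neither code nor docs. A possible refinement, sketched here under the assumption that the list of changed paths comes from something like git diff --name-only HEAD~1 HEAD, is to decide first whether a rebuild is needed (needs_docs_rebuild is a hypothetical helper):

```python
from pathlib import PurePosixPath


def needs_docs_rebuild(changed_paths: list[str]) -> bool:
    """True if any changed file is Python source or lives under docs/sphinx/."""
    for raw in changed_paths:
        path = PurePosixPath(raw)
        if path.suffix == ".py" or path.parts[:2] == ("docs", "sphinx"):
            return True
    return False
```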
Level 3: CI/CD Auto-Deploy (PRODUCTION)
Automatically build and deploy docs on every push:
GitHub Actions Example (.github/workflows/docs.yml):
name: Build and Deploy Docs
on:
  push:
    branches: [main]
# Lets the deploy step push to the gh-pages branch with GITHUB_TOKEN
permissions:
  contents: write
jobs:
  build-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.13'
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install sphinx sphinx_rtd_theme sphinx-autodoc-typehints
      - name: Build documentation
        run: |
          cd docs/sphinx
          make html
      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: docs/sphinx/_build/html
Result: Push code → Docs auto-build → Deploy to web!
Level 4: Docstring Quality Checks (AUTOMATION)
Ensure docstrings exist and are properly formatted:
pydocstyle check:
# Install pydocstyle
pip install pydocstyle
# Check docstring quality
pydocstyle scripts/
Add to pre-commit hook:
#!/bin/bash
# .git/hooks/pre-commit
echo "Checking docstrings..."
pydocstyle scripts/ || {
    echo "❌ Docstring issues found!"
    exit 1
}
echo "✅ Docstrings OK"
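The core of such a check, whether a docstring exists at all, can be illustrated with the standard ast module. This sketch is not a replacement for pydocstyle, which also checks formatting conventions; find_missing_docstrings is a hypothetical helper.

```python
import ast


def find_missing_docstrings(source: str) -> list[str]:
    """Return names of functions and classes in `source` lacking a docstring."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if ast.get_docstring(node) is None:
                missing.append(node.name)
    return missing
```

Because it parses the source rather than importing it, the check runs without executing any project code.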
Level 5: Auto-Generate Changelog (ADVANCED)
Auto-generate changelog from commit messages:
Install conventional-changelog:
npm install -g conventional-changelog-cli
# Generate changelog
conventional-changelog -p angular -i CHANGELOG.md -s
Or use Python:
pip install gitchangelog
gitchangelog > docs/sphinx/changelog.rst
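Both tools rely on commit subjects following a convention. As a rough sketch of the idea (not the logic of either tool), the snippet below groups subjects written in the Conventional Commits style, type(scope): summary, by their type. The sample subjects in the usage note are invented for illustration.

```python
import re
from collections import defaultdict

# Matches subjects such as "feat(parser): add JSONL export" or "fix: handle empty rows".
SUBJECT = re.compile(
    r"^(?P<type>\w+)(?:\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<summary>.+)$"
)


def group_commits(subjects: list[str]) -> dict[str, list[str]]:
    """Group commit summaries by their Conventional Commits type."""
    groups: dict[str, list[str]] = defaultdict(list)
    for subject in subjects:
        match = SUBJECT.match(subject)
        if match:  # subjects that do not follow the convention are skipped
            groups[match["type"]].append(match["summary"])
    return dict(groups)
```

Calling group_commits(["feat: add export", "fix(io): handle empty rows", "wip"]) groups the first two subjects under "feat" and "fix" and ignores the third.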
Best Practices for Auto-Documentation
Write Good Docstrings
Use Google or NumPy style consistently:
from typing import Any, Dict

import pandas as pd

def process_data(input_file: str, options: Dict[str, Any]) -> pd.DataFrame:
    """Process input data file with specified options.

    This function reads an Excel file and applies various transformations
    based on the provided options dictionary.

    Args:
        input_file: Path to input Excel file
        options: Dictionary of processing options with keys:
            - 'validate': bool - Enable validation
            - 'clean': bool - Remove empty rows

    Returns:
        Processed DataFrame with cleaned data

    Raises:
        FileNotFoundError: If input file doesn't exist
        ValueError: If options are invalid

    Example:
        >>> df = process_data('data.xlsx', {'validate': True})
        >>> len(df)
        100

    Note:
        This function modifies data in-place. Make a copy if needed.

    See Also:
        validate_data: Validation function used internally
    """
    # ... implementation
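Example sections written in the >>> style can double as tests, which keeps documented examples from drifting out of date. A minimal sketch using the standard doctest module follows; normalize is a hypothetical function chosen so the example runs without any data files.

```python
import doctest


def normalize(name: str) -> str:
    """Lower-case a name and strip surrounding whitespace.

    Example:
        >>> normalize("  Data_Set ")
        'data_set'
    """
    return name.strip().lower()


# Parse the examples out of the docstring and execute them.
test = doctest.DocTestParser().get_doctest(
    normalize.__doc__, {"normalize": normalize}, "normalize", None, 0
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(f"{results.attempted} example(s) run, {results.failed} failed")
```

For a whole module, python -m doctest path/to/module.py runs every such example in one go.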
Use Type Hints Everywhere
from typing import Optional, List, Dict, Any

def get_dataset_folder() -> Optional[str]:
    """Get the first dataset folder."""
    # Type hint automatically appears in docs!
Add Module-Level Documentation
"""
Data Extraction Module
======================

This module provides functions for extracting data from Excel files
and converting to JSONL format.

Key Functions:
    - extract_excel_to_jsonl: Main extraction function
    - process_excel_file: Single file processor
    - clean_record_for_json: Data cleaning

Example:
    >>> from scripts.extract_data import extract_excel_to_jsonl
    >>> extract_excel_to_jsonl(input_dir, output_dir)
"""
Use Explicit __all__ Exports
__all__ = [
    'extract_excel_to_jsonl',
    'process_excel_file',
    'clean_record_for_json',
]
Only these names are exported by from module import *, and autodoc's :members: option documents only the names listed in __all__ unless :ignore-module-all: is set.
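The star-import rule can be expressed in a few lines. This sketch mirrors Python's documented behavior for from module import *; exported_names is a hypothetical helper, not part of any library.

```python
import types


def exported_names(module: types.ModuleType) -> list[str]:
    """Names that `from module import *` would bind."""
    if hasattr(module, "__all__"):
        return list(module.__all__)
    # Without __all__, star-imports take every name not starting with "_".
    return [name for name in vars(module) if not name.startswith("_")]
```

Defining __all__ therefore makes the public surface explicit instead of leaving it to naming conventions.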
Current Workflow
Immediate Auto-Documentation
Right now, you can already do this:
# 1. Write code with docstrings
vim config.py
# 2. Build docs (reads your code automatically)
cd docs/sphinx && make html
# 3. View updated docs
open _build/html/api/config.html  # macOS; use xdg-open on Linux
Your docstrings → Instant API docs!
Recommended Workflow
# Terminal 1: Watch mode
make docs-watch
# Terminal 2: Write code
vim scripts/deidentify.py
# Add/update docstrings
# Save file
# → Browser automatically refreshes with new docs! ✨
Implementation Checklist
Quick Wins (Do Now)
☐ Install sphinx-autobuild
☐ Add docs-watch target to Makefile
☐ Create post-commit git hook
☐ Document the workflow for team
Medium Term
☐ Set up GitHub Actions for auto-deploy
☐ Add pydocstyle to pre-commit hooks
☐ Create docstring templates/snippets
☐ Add coverage reports for documentation
Long Term
☐ Auto-generate changelog from commits
☐ Set up Read the Docs hosting
☐ Add API diff detection for breaking changes
☐ Implement version-specific documentation
Summary
You Already Have:
✅ Auto-documentation from docstrings (autodoc)
✅ Type hints rendering (sphinx-autodoc-typehints)
✅ Cross-references and linking
✅ Multiple output formats (HTML, PDF)
You Can Add:
Watch mode for instant rebuilds
Git hooks for automatic updates
CI/CD for automatic deployment
Quality checks for docstrings
Automated changelog generation
The Goal:
Write code → Save file → Docs update automatically ✨
With sphinx-autobuild in watch mode, you're already 90% there!
TL;DR: Yes! Sphinx already auto-generates API docs from your code. Install sphinx-autobuild for instant updates while you code!