Configuration

Most users only need a few settings: study name, model provider, optional API key, and a small number of PHI-related choices.

Minimum Settings

Set the study name if the folder cannot be auto-detected:

export STUDY_NAME=Indo-VAP

Choose one model provider.

Local Ollama, no API key:

export LLM_PROVIDER=ollama
export LLM_MODEL=qwen3:8b

Hosted providers:

export LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
export LLM_MODEL=claude-opus-4-7

# or
export LLM_PROVIDER=openai
export OPENAI_API_KEY=sk-...
export LLM_MODEL=gpt-5.5

# or
export LLM_PROVIDER=google-genai
export GOOGLE_API_KEY=...
export LLM_MODEL=gemini-3.1-pro-preview
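If you script the setup, a quick sanity check can catch a missing key before the pipeline starts. The sketch below is illustrative only: the pairing of provider names and key variables follows the examples above, but the check itself is not part of the tool.

```shell
# Verify that the chosen provider has the API key it needs (sketch).
check_provider() {
  case "$1" in
    ollama)       : ;;                                  # local, no key needed
    anthropic)    [ -n "$ANTHROPIC_API_KEY" ] || return 1 ;;
    openai)       [ -n "$OPENAI_API_KEY" ]    || return 1 ;;
    google-genai) [ -n "$GOOGLE_API_KEY" ]    || return 1 ;;
    *)            return 2 ;;                           # unknown provider
  esac
}

check_provider "${LLM_PROVIDER:-ollama}" && echo "provider config looks OK"
```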

The web UI also lets you choose the provider and paste the key during setup.

Study Folder

The expected input layout is:

data/raw/{STUDY_NAME}/
├── datasets/
├── data_dictionary/
└── annotated_pdfs/        # optional
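The layout above can be created in one command. This is a sketch; STUDY_NAME defaults to the example value used earlier on this page.

```shell
# Create the expected input folders for a study (sketch).
STUDY_NAME=${STUDY_NAME:-Indo-VAP}
mkdir -p "data/raw/$STUDY_NAME/datasets" \
         "data/raw/$STUDY_NAME/data_dictionary" \
         "data/raw/$STUDY_NAME/annotated_pdfs"   # annotated_pdfs is optional
```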

The main output appears under:

output/{STUDY_NAME}/

PHI Key

The scrubber needs one local PHI key for stable pseudonyms and date shifting. Normal web-UI users do not create this key manually; the Load Study flow creates it when needed.

Developers and deployment operators can provision the key from the command line when running the pipeline outside the web UI. Keep this key outside the repository and back it up according to the study team’s policy. Rotating it changes pseudonyms and requires a full re-run.
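As an illustration of command-line provisioning, one common pattern is a random key written to a file that only the owner can read. The path, filename, and use of openssl below are assumptions for the sketch, not the tool's required format; use whatever location your deployment expects, as long as it is outside the repository.

```shell
# Provision a local PHI key (sketch; path and key format are assumptions).
PHI_KEY_FILE="$HOME/.secrets/phi.key"     # hypothetical location, outside the repo
mkdir -p "$(dirname "$PHI_KEY_FILE")"
umask 077                                 # new files readable by owner only
openssl rand -hex 32 > "$PHI_KEY_FILE"    # 256-bit random key, hex-encoded
chmod 600 "$PHI_KEY_FILE"
```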

Where to Put More Detail

User-facing configuration should stay short. Detailed implementation behavior belongs in the developer-facing documentation rather than here.

Next Step

Run Quick Start after choosing the settings above.