Configuration
Most users only need a few settings: study name, model provider, optional API key, and a small number of PHI-related choices.
Minimum Settings
Set the study name if the folder cannot be auto-detected:
export STUDY_NAME=Indo-VAP
Choose one model provider.
Local Ollama, no API key:
export LLM_PROVIDER=ollama
export LLM_MODEL=qwen3:8b
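If Ollama is installed, you can pull the model once before the first run and confirm it is available (a quick sanity check using the standard ollama CLI; the model tag matches the example above):
ollama pull qwen3:8b
ollama list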
Hosted providers:
export LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
export LLM_MODEL=claude-opus-4-7
# or
export LLM_PROVIDER=openai
export OPENAI_API_KEY=sk-...
export LLM_MODEL=gpt-5.5
# or
export LLM_PROVIDER=google-genai
export GOOGLE_API_KEY=...
export LLM_MODEL=gemini-3.1-pro-preview
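Before a long run, a fail-fast shell check like the one below catches an unset variable early (a sketch; it uses only the variable names shown above):
: "${LLM_PROVIDER:?LLM_PROVIDER is not set}"
: "${LLM_MODEL:?LLM_MODEL is not set}"
The : no-op still evaluates its arguments, so the ${VAR:?message} expansion aborts with the message if the variable is unset or empty.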
The web UI also lets you choose the provider and paste the key during setup.
Recommended Default
For the strongest local privacy posture, start with Ollama:
export LLM_PROVIDER=ollama
export LLM_MODEL=qwen3:8b
Ollama runs entirely on the user’s machine, so prompts and study data are not sent to an external service and no external API key is required.
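To confirm the local server is reachable before a run, query its model list (assuming Ollama’s default port, 11434):
curl http://localhost:11434/api/tags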
Study Folder
The expected input layout is:
data/raw/{STUDY_NAME}/
├── datasets/
├── data_dictionary/
└── annotated_pdfs/ # optional
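To scaffold this layout for a new study from the shell (uses the STUDY_NAME exported earlier; drop annotated_pdfs if the study has no annotated PDFs):
mkdir -p "data/raw/${STUDY_NAME}"/{datasets,data_dictionary,annotated_pdfs}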
The main output appears under:
output/{STUDY_NAME}/
PHI Key
The scrubber needs one local PHI key for stable pseudonyms and date shifting. Normal web-UI users do not create this key manually; the Load Study flow creates it when needed.
Developers and deployment operators can provision the key from the command line when running the pipeline outside the web UI. Keep this key outside the repository and back it up according to the study team’s policy. Rotating it changes pseudonyms and requires a full re-run.
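As a hedged sketch of that command-line provisioning, a 256-bit key can be generated with openssl; the destination path below is a placeholder, not a path the pipeline prescribes, so substitute the location your deployment actually reads:
umask 077   # ensure the key file is created readable only by its owner
openssl rand -hex 32 > "$HOME/.config/study-phi.key"   # placeholder path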
Where to Put More Detail
User-facing configuration should stay short; detailed implementation behavior belongs in the developer-facing documentation rather than here.
Next Step
Run Quick Start after choosing the settings above.