
8.5 Model Settings (Model Providers)

Model Settings let you manage all AI model sources, accounts, runtime parameters, and default rules used across Pop.
Every capability relies on a model provider, whether it is chat, writing, knowledge base retrieval, image analysis, or speech-to-text.

This chapter introduces how to manage model providers, model parameters, default models, and their operation logic.


1. Model Provider Management

Pop supports multiple mainstream AI providers and allows mixed usage.

✅ Built‑in Supported Providers

| Category | Provider | Description |
| --- | --- | --- |
| Text & Multimodal | OpenAI | Supports the GPT series, O-series models, and multimodal input |
| Text & Multimodal | DeepSeek | Strong reasoning and logical capability |
| Text & Multimodal | Moonshot | High-value long-context models |
| Text & Multimodal | Google Gemini | Powerful multimodal understanding |
| Text & Multimodal | Anthropic Claude | Excellent summarization and comprehension |
| Local Models | Ollama | Supports Qwen, LLaMA, Phi, and other open models |
| Local Models | LM Studio | Desktop local inference engine |
| Custom Models | HTTP API | Connect any service compatible with the OpenAI API |

You may add, edit, disable, or delete model providers freely.


2. Text Large Language Models (Text LLM)

Text LLMs are the most frequently used models in Pop, powering:

  • Chat and Q&A
  • Writing and summarization
  • Workflow execution (AI nodes)
  • Knowledge base rewriting / summarization / answering
  • Code generation and debugging

Configurable Options

| Item | Description |
| --- | --- |
| API Key | Securely stored provider secret |
| Model Name | e.g., gpt-4o-mini, deepseek-chat |
| Base URL | Required for self-hosted models |
| Org ID / Project ID | Required by some providers |
| Concurrency & Rate Limit | Controls API call rate to avoid throttling |
| Default Model | Default model used for Chat, KB, or Workflows |
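As an illustrative sketch (the field names below are hypothetical, not Pop's actual configuration schema), a text LLM provider entry could be assembled and validated like this:

```python
def make_provider(name, api_key, model, base_url=None, org_id=None,
                  max_concurrency=4):
    """Assemble a provider entry; field names are illustrative only."""
    if not api_key:
        raise ValueError("API key is required")
    entry = {
        "provider": name,
        "api_key": api_key,           # stored securely by Pop in practice
        "model": model,               # e.g. "gpt-4o-mini" or "deepseek-chat"
        "max_concurrency": max_concurrency,
    }
    if base_url:                      # required for self-hosted models
        entry["base_url"] = base_url
    if org_id:                        # required by some providers
        entry["org_id"] = org_id
    return entry

# "sk-..." stands in for a real key
cfg = make_provider("deepseek", "sk-...", "deepseek-chat",
                    base_url="https://api.deepseek.com/v1")
```

Optional fields such as Base URL and Org ID are only included when set, which mirrors how they are only required for some providers.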

3. Image Models

Used for image generation, editing, enhancement, and OCR.
Examples:

  • Screenshot understanding
  • Vision‑based Q&A
  • Image‑to‑image or text‑to‑image generation
  • Multimodal PDF analysis
  • Workflow image processing nodes

Configurable Options

  • Model type (OCR / Generation / Enhancement)
  • Output resolution (e.g., 1024×1024)
  • Quality modes (standard / high)
  • Safety levels
  • Output format (PNG / JPG / WebP)
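A minimal sketch of validating these image-model options (helper name and option set are assumptions for illustration, not Pop's real API):

```python
ALLOWED_FORMATS = {"png", "jpg", "webp"}

def parse_image_options(resolution: str, fmt: str, quality: str = "standard"):
    """Validate illustrative image-model options and normalize them."""
    # Accept both "1024x1024" and "1024×1024" resolution strings
    w, h = resolution.lower().replace("×", "x").split("x")
    if fmt.lower() not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    if quality not in {"standard", "high"}:
        raise ValueError(f"unknown quality mode: {quality}")
    return {"width": int(w), "height": int(h),
            "format": fmt.lower(), "quality": quality}
```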

4. Speech Models (ASR / TTS)

Used for:

  • Speech‑to‑text (ASR)
  • Text‑to‑speech (TTS)
  • Extracting audio from video (future)

Configurable Options

  • Input language
  • Output voice type (male / female / narrator)
  • Output format (mp3 / wav / pcm)
  • Local or cloud engine choice

Pop includes Whisper and other local ASR engines, as well as TTS/ASR models from the supported cloud providers.


5. Video Models

(If enabled in your Pop version)

Used for:

  • Auto‑extract video subtitles
  • Video summarization
  • Multimodal conversation (frame analysis)
  • Video‑based knowledge bases (future)

Supports model selection, frame interval, output format, etc.


6. Global Model Parameters

These parameters apply to all model types.

Common Parameters

| Parameter | Description |
| --- | --- |
| temperature | Controls randomness (higher = more creative) |
| top_p | Nucleus sampling threshold |
| max_tokens | Maximum output length |
| frequency_penalty | Reduces repeated phrases |
| presence_penalty | Encourages topic diversity |
| stop | Custom stop sequences |
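These common parameters map directly onto an OpenAI-style chat-completion request body. The sketch below builds one (the values shown are examples, not Pop's defaults):

```python
def build_chat_payload(model, messages, temperature=0.7, top_p=1.0,
                       max_tokens=1024, frequency_penalty=0.0,
                       presence_penalty=0.0, stop=None):
    """Assemble an OpenAI-style request body from the global parameters."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0, 2]")
    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,          # higher = more creative
        "top_p": top_p,                      # nucleus sampling
        "max_tokens": max_tokens,            # max output length
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
    }
    if stop:
        payload["stop"] = stop               # custom stop sequences
    return payload

body = build_chat_payload("gpt-4o-mini",
                          [{"role": "user", "content": "Hello"}],
                          temperature=0.2, stop=["\n\n"])
```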

Applies to:

  • Global defaults
  • Chat defaults
  • Workflow AI nodes
  • Notes / document AI settings

7. Custom Models

You can add any service compatible with the OpenAI API protocol.

You need to provide:

  • Base URL
  • API Key
  • Model name
  • Streaming support (yes/no)
  • Optional request templates
  • Timeout & retry strategy
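A timeout and retry strategy for a custom endpoint could look like the following sketch: a generic exponential-backoff wrapper, not Pop's internal implementation.

```python
import time

def call_with_retry(fn, retries=3, base_delay=0.5, sleep=time.sleep):
    """Retry fn() with exponential backoff; the last error is re-raised."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            sleep(base_delay * (2 ** attempt))  # waits 0.5s, 1s, 2s, ...
```

In practice `fn` would wrap the HTTP call to the custom endpoint with its own request timeout; injecting `sleep` keeps the helper easy to test.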

Useful for:

  • Self‑hosted models (vLLM, LMDeploy, Transformers + FastAPI)
  • Third‑party providers (SiliconFlow, TogetherAI, etc.)

8. Model Testing Tool

Model Settings includes a built‑in test tool:

  • Input test prompts
  • View streaming output
  • View token usage
  • Error debugging
  • Latency measurement
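A minimal harness illustrating what the test tool measures might look like this (the token count is a crude whitespace estimate, and the function names are assumptions for illustration):

```python
import time

def run_model_test(model_fn, prompt):
    """Time one model call and report a rough token estimate."""
    start = time.perf_counter()
    output = model_fn(prompt)
    latency = time.perf_counter() - start
    return {
        "output": output,
        "latency_s": round(latency, 3),
        "approx_tokens": len(output.split()),  # crude estimate only
    }
```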

Ideal for validating new model integrations.


9. Default Models & Priority Rules

Pop allows selecting different default models for each feature:

| Feature | Default Model Type |
| --- | --- |
| Chat Window | Default chat model |
| Document Summary | Default long-text model |
| Knowledge Base | Default KB inference model |
| Workflows | AI-node default model |
| Image Analysis | Default OCR model |
| Speech-to-Text | Default ASR model |

All configurable in System Settings → Model Settings.


10. Best Practices

  • Need reliability → Use OpenAI official models
  • Need price‑performance → Use DeepSeek / Moonshot / SiliconFlow
  • Need offline capability → Use local Ollama models
  • Long‑context tasks → Choose large‑context models (e.g., 200k tokens)
  • Image understanding → Pick strong multimodal models (GPT‑4o series, Gemini)
  • Workflow AI nodes → Set parameters individually per node

For further configuration, visit System Settings → Model Settings.