
8.5 Model Settings (Model Providers)

Model Settings let you manage all AI model sources, accounts, runtime parameters, and default rules used across Pop.
Every capability relies on a model provider, whether it is chat, writing, knowledge base retrieval, image analysis, or speech-to-text.

This chapter introduces how to manage model providers, model parameters, default models, and their operation logic.


1. Model Provider Management

Pop supports multiple mainstream AI providers and allows mixed usage.

✅ Built‑in Supported Providers

| Category | Provider | Description |
| --- | --- | --- |
| Text & Multimodal | OpenAI | Supports the GPT series, O-series models, and multimodal input |
| Text & Multimodal | DeepSeek | Strong reasoning and logical capability |
| Text & Multimodal | Moonshot | High-value long-context models |
| Text & Multimodal | Google Gemini | Powerful multimodal understanding |
| Text & Multimodal | Anthropic Claude | Excellent summarization and comprehension |
| Local Models | Ollama | Supports Qwen, LLaMA, Phi, and other open models |
| Local Models | LM Studio | Desktop local inference engine |
| Custom Models | HTTP API | Connect any service compatible with the OpenAI API |

You may add, edit, disable, or delete model providers freely.


2. Text Large Language Models (Text LLM)

Text LLMs are the most frequently used models in Pop, powering:

  • Chat and Q&A
  • Writing and summarization
  • Workflow execution (AI nodes)
  • Knowledge base rewriting / summarization / answering
  • Code generation and debugging

Configurable Options

| Item | Description |
| --- | --- |
| API Key | Securely stored provider secret |
| Model Name | e.g., gpt-4o-mini, deepseek-chat |
| Base URL | Required for self-hosted models |
| Org ID / Project ID | Required by some providers |
| Concurrency & Rate Limit | Controls API call rate to avoid throttling |
| Default Model | Default model used for Chat, KB, or Workflows |
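As an illustrative sketch (the field names below are hypothetical, not Pop's actual configuration schema), a text LLM provider entry could be assembled and validated like this:

```python
def make_provider(name, api_key, model, base_url=None, org_id=None,
                  max_concurrency=4):
    """Assemble a provider entry; field names are illustrative only."""
    if not api_key:
        raise ValueError("API key is required")
    entry = {
        "provider": name,
        "api_key": api_key,           # stored securely by Pop in practice
        "model": model,               # e.g. "gpt-4o-mini" or "deepseek-chat"
        "max_concurrency": max_concurrency,
    }
    if base_url:                      # required for self-hosted models
        entry["base_url"] = base_url
    if org_id:                        # required by some providers
        entry["org_id"] = org_id
    return entry

# "sk-..." stands in for a real key
cfg = make_provider("deepseek", "sk-...", "deepseek-chat",
                    base_url="https://api.deepseek.com/v1")
```

Optional fields such as Base URL and Org ID are only included when set, which mirrors how they are only required for some providers.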

3. Image Models

Used for image generation, editing, enhancement, and OCR.
Examples:

  • Screenshot understanding
  • Vision‑based Q&A
  • Image‑to‑image or text‑to‑image generation
  • Multimodal PDF analysis
  • Workflow image processing nodes

Configurable Options

  • Model type (OCR / Generation / Enhancement)
  • Output resolution (e.g., 1024×1024)
  • Quality modes (standard / high)
  • Safety levels
  • Output format (PNG / JPG / WebP)
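A minimal sketch of validating these image-model options (helper name and option set are assumptions for illustration, not Pop's real API):

```python
ALLOWED_FORMATS = {"png", "jpg", "webp"}

def parse_image_options(resolution: str, fmt: str, quality: str = "standard"):
    """Validate illustrative image-model options and normalize them."""
    # Accept both "1024x1024" and "1024×1024" resolution strings
    w, h = resolution.lower().replace("×", "x").split("x")
    if fmt.lower() not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    if quality not in {"standard", "high"}:
        raise ValueError(f"unknown quality mode: {quality}")
    return {"width": int(w), "height": int(h),
            "format": fmt.lower(), "quality": quality}
```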

4. Speech Models (ASR / TTS)

Used for:

  • Speech‑to‑text (ASR)
  • Text‑to‑speech (TTS)
  • Extracting audio from video (future)

Configurable Options

  • Input language
  • Output voice type (male / female / narrator)
  • Output format (mp3 / wav / pcm)
  • Local or cloud engine choice

Pop includes Whisper and other local ASR engines, as well as TTS/ASR models from the supported cloud providers.


5. Video Models

(If enabled in your Pop version)

Used for:

  • Auto‑extract video subtitles
  • Video summarization
  • Multimodal conversation (frame analysis)
  • Video‑based knowledge bases (future)

Supports model selection, frame interval, output format, etc.


6. Global Model Parameters

These parameters apply to all model types.

Common Parameters

| Parameter | Description |
| --- | --- |
| temperature | Controls randomness (higher = more creative) |
| top_p | Nucleus sampling threshold |
| max_tokens | Maximum output length |
| frequency_penalty | Reduces repeated phrases |
| presence_penalty | Encourages topic diversity |
| stop | Custom stop sequences |
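These common parameters map directly onto an OpenAI-style chat-completion request body. The sketch below builds one (the values shown are examples, not Pop's defaults):

```python
def build_chat_payload(model, messages, temperature=0.7, top_p=1.0,
                       max_tokens=1024, frequency_penalty=0.0,
                       presence_penalty=0.0, stop=None):
    """Assemble an OpenAI-style request body from the global parameters."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0, 2]")
    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,          # higher = more creative
        "top_p": top_p,                      # nucleus sampling
        "max_tokens": max_tokens,            # max output length
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
    }
    if stop:
        payload["stop"] = stop               # custom stop sequences
    return payload

body = build_chat_payload("gpt-4o-mini",
                          [{"role": "user", "content": "Hello"}],
                          temperature=0.2, stop=["\n\n"])
```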

Applies to:

  • Global defaults
  • Chat defaults
  • Workflow AI nodes
  • Notes / document AI settings

7. Custom Models

You can add any service compatible with the OpenAI API protocol.

You need to provide:

  • Base URL
  • API Key
  • Model name
  • Streaming support (yes/no)
  • Optional request templates
  • Timeout & retry strategy
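A timeout and retry strategy for a custom endpoint could look like the following sketch: a generic exponential-backoff wrapper, not Pop's internal implementation.

```python
import time

def call_with_retry(fn, retries=3, base_delay=0.5, sleep=time.sleep):
    """Retry fn() with exponential backoff; the last error is re-raised."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            sleep(base_delay * (2 ** attempt))  # waits 0.5s, 1s, 2s, ...
```

In practice `fn` would wrap the HTTP call to the custom endpoint with its own request timeout; injecting `sleep` keeps the helper easy to test.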

Useful for:

  • Self‑hosted models (vLLM, LMDeploy, Transformers + FastAPI)
  • Third‑party providers (SiliconFlow, TogetherAI, etc.)

8. Model Testing Tool

Model Settings includes a built‑in test tool:

  • Input test prompts
  • View streaming output
  • View token usage
  • Error debugging
  • Latency measurement
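A minimal harness illustrating what the test tool measures might look like this (the token count is a crude whitespace estimate, and the function names are assumptions for illustration):

```python
import time

def run_model_test(model_fn, prompt):
    """Time one model call and report a rough token estimate."""
    start = time.perf_counter()
    output = model_fn(prompt)
    latency = time.perf_counter() - start
    return {
        "output": output,
        "latency_s": round(latency, 3),
        "approx_tokens": len(output.split()),  # crude estimate only
    }
```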

Ideal for validating new model integrations.


9. Default Models & Priority Rules

Pop allows selecting different default models for each feature:

| Feature | Default Model Type |
| --- | --- |
| Chat Window | Default chat model |
| Document Summary | Default long-text model |
| Knowledge Base | Default KB inference model |
| Workflows | AI-node default model |
| Image Analysis | Default OCR model |
| Speech-to-Text | Default ASR model |

All configurable in System Settings → Model Settings.


10. Best Practices

  • Need reliability → Use OpenAI official models
  • Need price‑performance → Use DeepSeek / Moonshot / SiliconFlow
  • Need offline capability → Use local Ollama models
  • Long‑context tasks → Choose large‑context models (e.g., 200k tokens)
  • Image understanding → Pick strong multimodal models (GPT‑4o series, Gemini)
  • Workflow AI nodes → Set parameters individually per node

For further configuration, visit System Settings → Model Settings.