
Model Fine-Tuning

3 Frameworks × 7 Training Stages × 3 Tuning Methods — Enterprise LLM Alignment & Customization

Product Overview

Enterprise-grade LLM fine-tuning platform built on LlamaFactory, Unsloth, and Axolotl. Covers seven training stages — SFT, DPO, KTO, RM, PPO, GRPO, and PT — with LoRA, QLoRA, and Full tuning methods. Provides 20+ visual hyperparameter controls, automatic post-training evaluation, dual-model A/B comparison, and LoRA merge export with quantization, delivering an end-to-end pipeline from data to aligned model.

Core Capabilities

Three Fine-Tuning Frameworks

Built-in support for LlamaFactory, Unsloth, and Axolotl via a unified task submission interface. Frameworks are registered through a plugin mechanism that auto-generates training commands and configuration files — no need to handle framework differences manually.
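A plugin mechanism like the one described above can be sketched as a registry that maps a unified job spec to each framework's command line. The registry shape, function names, and the Axolotl invocation shown here are illustrative assumptions, not the platform's actual API; the LlamaFactory flags mirror that project's public CLI.

```python
# Hypothetical sketch of the framework plugin registry: each plugin turns a
# unified job spec into a framework-specific training command. FRAMEWORKS and
# build_command are illustrative names, not the platform's real interface.

FRAMEWORKS = {}

def register(name):
    """Register a command builder under a framework name."""
    def wrap(fn):
        FRAMEWORKS[name] = fn
        return fn
    return wrap

@register("llamafactory")
def _llamafactory(spec):
    # Flags follow LLaMA-Factory's CLI conventions.
    return ["llamafactory-cli", "train",
            f"--stage={spec['stage']}",
            f"--model_name_or_path={spec['model']}",
            f"--finetuning_type={spec['method']}"]

@register("axolotl")
def _axolotl(spec):
    # Axolotl is driven by a YAML config; this path is illustrative.
    return ["axolotl", "train", spec.get("config", "job.yaml")]

def build_command(framework, spec):
    return FRAMEWORKS[framework](spec)

cmd = build_command("llamafactory",
                    {"stage": "sft", "model": "Qwen2-7B", "method": "lora"})
# cmd starts with ["llamafactory-cli", "train", "--stage=sft", ...]
```

Registering builders behind one `build_command` entry point is what lets callers submit jobs without handling framework differences themselves.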

Seven Training Stages

Covers SFT (supervised fine-tuning), DPO (direct preference optimization), KTO, RM (reward modeling), PPO (proximal policy optimization), GRPO, and PT (continued pre-training) — addressing the full spectrum from instruction following to human preference alignment.

Three Tuning Methods

Supports LoRA (low-rank adaptation), QLoRA (quantized low-rank adaptation), and Full (full-parameter fine-tuning). Default learning rates are auto-set per method — 1e-4 for LoRA, 2e-4 for QLoRA, 5e-5 for Full — reducing the hyperparameter tuning barrier.
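The per-method defaults above can be expressed as a small lookup. This is a minimal sketch using the learning rates stated in the text; the function name and the error behavior for unknown methods are assumptions.

```python
# Per-method learning-rate defaults, matching the values stated above.
# default_learning_rate is an illustrative helper, not the platform's API.

DEFAULT_LR = {"lora": 1e-4, "qlora": 2e-4, "full": 5e-5}

def default_learning_rate(method: str) -> float:
    try:
        return DEFAULT_LR[method.lower()]
    except KeyError:
        raise ValueError(f"unknown tuning method: {method}")

assert default_learning_rate("QLoRA") == 2e-4
```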

Visual Hyperparameter Configuration

Graphical panel for 20+ hyperparameters including epochs, batchSize, gradientAccumulationSteps, cutoffLen, warmupRatio, LoRA rank/alpha/dropout, and more. Parameters are shown or hidden dynamically based on the selected tuning method and training stage.
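The dynamic show/hide behavior can be sketched as a visibility rule: base hyperparameters always appear, LoRA fields only for LoRA/QLoRA, and preference-alignment fields only for stages such as DPO/KTO. The field grouping and the `prefBeta` name are illustrative assumptions; the base field names mirror the panel described above.

```python
# Sketch of method/stage-dependent parameter visibility. The grouping rules
# and the PREF field name are assumptions about how the panel filters fields.

BASE = ["epochs", "batchSize", "gradientAccumulationSteps",
        "cutoffLen", "warmupRatio"]
LORA = ["loraRank", "loraAlpha", "loraDropout"]
PREF = ["prefBeta"]  # e.g. a DPO/KTO preference temperature (hypothetical)

def visible_params(method: str, stage: str) -> list:
    params = list(BASE)
    if method in ("lora", "qlora"):
        params += LORA          # LoRA fields hidden for full fine-tuning
    if stage in ("dpo", "kto"):
        params += PREF          # preference fields hidden for SFT/PT/RM
    return params
```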

Automatic Evaluation Trigger

Enable the autoEval toggle and specify an evaluation dataset at creation time; the platform automatically triggers model evaluation upon training completion with no manual intervention. A front-end warning fires when evaluation and training datasets overlap to prevent data leakage.
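The leakage check implied above can be sketched as a set intersection over sample fingerprints. Hashing on the prompt text is an assumption about how overlap is detected; the function name is illustrative.

```python
# Sketch of the data-leakage check: count evaluation samples that also
# appear in the training set. Keying on a SHA-256 of the prompt text is an
# assumed detection strategy, not the platform's documented one.

import hashlib

def _key(sample: dict) -> str:
    return hashlib.sha256(sample["prompt"].encode("utf-8")).hexdigest()

def count_overlap(train: list, eval_set: list) -> int:
    """Return how many eval samples duplicate a training sample."""
    seen = {_key(s) for s in train}
    return sum(1 for s in eval_set if _key(s) in seen)
```

A front end would raise the warning whenever `count_overlap` returns a nonzero value.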

Model Comparison & Merge Export

Launch temporary inference services post-training to load the base model and fine-tuned model side by side for streaming A/B chat comparison. Export via LoRA merge with None / INT8 / INT4 quantization options. Inference resources are auto-reclaimed on TTL expiry.
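The export options and TTL-based reclamation can be sketched as follows. The class, field names, and default TTL are illustrative assumptions; only the three quantization choices come from the text above.

```python
# Sketch of quantization options and TTL expiry for the temporary A/B
# inference services. InferenceSession and reclaim are hypothetical names;
# the 30-minute default TTL is an assumed value.

import time
from dataclasses import dataclass, field
from typing import Optional

QUANT_OPTIONS = ("none", "int8", "int4")  # LoRA merge export choices

@dataclass
class InferenceSession:
    model_path: str
    ttl_seconds: int = 1800
    started_at: float = field(default_factory=time.time)

    def expired(self, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        return now - self.started_at > self.ttl_seconds

def reclaim(sessions: list) -> list:
    """Keep only sessions whose TTL has not yet elapsed."""
    return [s for s in sessions if not s.expired()]
```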

Tuning Method × Training Stage Support Matrix

Training Stage LoRA QLoRA Full
SFT
DPO
KTO
RM
PPO
GRPO

Fine-Tuning Workflow

1. Select Base Model: Choose the base model from the model repository and specify its storage path.

2. Choose Framework & Stage: Select the fine-tuning framework (LlamaFactory/Unsloth/Axolotl), training stage (SFT/DPO/KTO, etc.), and tuning method (LoRA/QLoRA/Full).

3. Configure Hyperparameters: Visually configure 20+ training hyperparameters; the platform auto-fills recommended defaults based on the selected method.

4. Train & Monitor: Submit the job and track loss/learningRate/gradNorm curves in real time, with automatic detection of startup phases such as dataset download and model loading.

5. Compare & Export: A/B compare pre- and post-tuning results, then perform a one-click LoRA merge export with optional INT8/INT4 quantization.
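The workflow steps above can be condensed into a single job specification. Every key and value here is a hypothetical payload shape for illustration, not the platform's actual API schema.

```python
# Hypothetical end-to-end job spec covering the five workflow steps.
# Keys and values are illustrative, not the platform's real schema.

job = {
    "baseModel": "models/Qwen2-7B",           # step 1: model + storage path
    "framework": "llamafactory",              # step 2: framework
    "stage": "sft",                           # step 2: training stage
    "method": "lora",                         # step 2: tuning method
    "hyperparameters": {                      # step 3: visual panel values
        "epochs": 3,
        "batchSize": 8,
        "learningRate": 1e-4,                 # LoRA default from the text
    },
    "autoEval": True,                         # trigger eval after training
    "export": {                               # step 5: merge + quantization
        "merge": "lora",
        "quantization": "int8",
    },
}
```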
