    How to Fine-Tune a Local Mistral or Llama 3 Model on Your Own Dataset

    By Admin | December 19, 2025


    In this article, you will learn how to fine-tune open-source large language models for customer support using Unsloth and QLoRA, from dataset preparation through training, testing, and comparison.

    Topics we will cover include:

    • Setting up a Colab environment and installing required libraries.
    • Preparing and formatting a customer support dataset for instruction tuning.
    • Training with LoRA adapters, saving, testing, and comparing against a base model.

    Let’s get to it.


    Introduction

    Large language models (LLMs) like Mistral 7B and Llama 3 8B have reshaped the AI field, but their general-purpose nature limits how well they handle specialized domains out of the box. Fine-tuning turns these generalists into domain-specific experts. For customer support, that can mean much faster responses, a consistent brand voice, and 24/7 availability, and it can dramatically improve performance on industry-specific tasks.

    In this tutorial, we’ll fine-tune two powerful open-source models, Mistral 7B and Llama 3 8B, on a customer support question-and-answer dataset. By the end, you will know how to:

    • Set up a cloud-based training environment using Google Colab
    • Prepare and format customer support datasets
    • Fine-tune Mistral 7B and Llama 3 8B using Quantized Low-Rank Adaptation (QLoRA)
    • Evaluate model performance
    • Save and deploy your custom models

    Prerequisites

    Here’s what you will need to make the most of this tutorial.

    • A Google account for accessing Google Colab. Open colab.research.google.com to confirm you can create notebooks.
    • A Hugging Face account for accessing models and datasets. You can sign up at huggingface.co.

    After you have access to Hugging Face, you will need to request access to these two gated models and then log in from your notebook (a minimal login sketch follows the list):

    1. Mistral: Mistral-7B-Instruct-v0.3
    2. Llama 3: Meta-Llama-3-8B-Instruct
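
    Access requests are typically approved quickly. Once approved, log in from your notebook so the gated weights can be downloaded; here is a minimal sketch using the standard Hugging Face login helper (paste an access token from your account settings when prompted):

    from huggingface_hub import notebook_login

    # Opens an interactive prompt in the notebook; a token with read access is enough.
    notebook_login()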

    As for the background knowledge you should have before starting, here’s a concise overview:

    • Basic Python programming
    • Familiarity with Jupyter notebooks
    • Understanding of machine learning concepts (helpful but not required)
    • Basic command-line knowledge

    You should now be ready to get started.

    The Fine-Tuning Process

    Fine-tuning adapts a pre-trained LLM to specific tasks by continuing training on domain-specific data. Unlike prompt engineering, fine-tuning actually modifies model weights.

    Step 1: Getting Started with Google Colab

    • Visit Google Colab
    • Create new notebook: File → New Notebook
    • Give it a preferred name
    • Set GPU: Runtime → Change runtime type → T4 GPU


    Step 2: Install Unsloth (Run This First)

    Here, we will install Unsloth and its dependencies. Unsloth handles CUDA setup automatically.

    !pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
    !pip install --no-deps xformers trl peft accelerate bitsandbytes

    print("Unsloth installed successfully!")

    Step 3: Import Unsloth and Setup

    The next step is to import Unsloth and perform basic checks.

    from unsloth import FastLanguageModel
    import torch
    from trl import SFTTrainer
    from transformers import TrainingArguments
    from datasets import Dataset
    import pandas as pd
    import numpy as np

    print("Unsloth loaded successfully!")
    print(f"PyTorch: {torch.__version__}")
    print(f"CUDA: {torch.cuda.is_available()}")
    print(f"GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'None'}")

    Step 4: Create Customer Support Dataset

    In this section, we create a small but realistic customer support dataset to fine-tune the model on.


    print("Creating customer support Q&A dataset...")

    # Create realistic customer support data
    customer_support_data = [
        {
            "instruction": "You are a helpful customer support agent. Answer clearly and professionally.",
            "input": "How do I reset my password?",
            "output": "To reset your password: 1) Click 'Forgot Password' on the login page 2) Enter your registered email 3) Check your inbox for the reset link 4) Click the link and create a new password. Check your spam folder if not received within 5 minutes."
        },
        {
            "instruction": "You are a helpful customer support agent. Answer clearly and professionally.",
            "input": "My payment was declined, what should I do?",
            "output": "Common solutions: 1) Check that the card has sufficient funds 2) Ensure the card is not expired 3) Contact your bank to unblock the transaction 4) Try an alternative payment method like PayPal 5) Wait 24 hours before retrying."
        },
        {
            "instruction": "You are a helpful customer support agent. Answer clearly and professionally.",
            "input": "How do I cancel my subscription?",
            "output": "To cancel: 1) Log in to your account 2) Go to 'Billing & Subscriptions' 3) Click 'Cancel Subscription' 4) Confirm cancellation. Your access continues until the end of the billing period. No partial refunds."
        },
        {
            "instruction": "You are a helpful customer support agent. Answer clearly and professionally.",
            "input": "The app keeps crashing on startup",
            "output": "Troubleshooting steps: 1) Update the app to the latest version 2) Restart your device 3) Clear the app cache/data 4) Reinstall the app 5) Ensure 500MB free storage 6) Update your device OS. If it persists, contact support with error logs."
        },
        {
            "instruction": "You are a helpful customer support agent. Answer clearly and professionally.",
            "input": "Where is my order tracking information?",
            "output": "You can find tracking: 1) Check the confirmation email 2) Log into your account > Orders 3) Click the tracking number. Standard shipping: 3-5 business days. Contact the carrier with the tracking number for the exact location."
        },
        {
            "instruction": "You are a helpful customer support agent. Answer clearly and professionally.",
            "input": "How do I contact customer support?",
            "output": "Support options: 1) Live chat: Mon-Fri 9AM-6PM EST 2) Email: support@company.com (response within 24h) 3) Phone: 1-800-123-4567 4) Twitter: @CompanySupport. Please have your account email ready."
        },
        {
            "instruction": "You are a helpful customer support agent. Answer clearly and professionally.",
            "input": "Can I get a refund?",
            "output": "Refund policy: 1) Digital products: Refund within 14 days if not used 2) Subscriptions: No refunds for partial months 3) Physical goods: Return within 30 days with receipt. Contact billing@company.com with your order ID."
        },
        {
            "instruction": "You are a helpful customer support agent. Answer clearly and professionally.",
            "input": "How do I change my email address?",
            "output": "To change your email: 1) Log in to your account 2) Go to Account Settings 3) Click 'Change Email' 4) Enter the new email 5) Verify via the confirmation email sent to the new address. The update takes effect immediately."
        },
        {
            "instruction": "You are a helpful customer support agent. Answer clearly and professionally.",
            "input": "Is my data secure?",
            "output": "Yes, we use: 1) AES-256 encryption 2) SSL/TLS for all connections 3) Regular security audits 4) GDPR compliance 5) Data is never sold to third parties. Read our privacy policy for details."
        },
        {
            "instruction": "You are a helpful customer support agent. Answer clearly and professionally.",
            "input": "Do you offer student discounts?",
            "output": "Yes! Student discount: 50% off. Requirements: 1) Valid .edu email OR 2) Student ID photo. Apply: Email student@company.com with proof. The discount is applied after verification."
        },
    ]

    You can also create more samples by duplicating the entries (in a real project, vary the wording rather than copying exact duplicates):

    expanded_data = []
    for item in customer_support_data * 30:  # Creates 300 samples
        expanded_data.append(item.copy())

    Now, we can convert to a dataset:

    # Convert to dataset
    dataset = Dataset.from_pandas(pd.DataFrame(expanded_data))

    print(f"Dataset created: {len(dataset)} samples")
    print(f"Sample:\n{dataset[0]}")
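
    Optionally, you can hold out a small slice of the data for quick manual checks after training. This sketch uses the datasets library's train_test_split helper and is not required for the rest of the walkthrough:

    # Optional: keep ~10% aside for eyeballing model outputs after training.
    split = dataset.train_test_split(test_size=0.1, seed=3407)
    train_ds, eval_ds = split["train"], split["test"]
    print(f"Train: {len(train_ds)}, Eval: {len(eval_ds)}")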

    Step 5: Choose Your Model (Mistral or Llama 3)

    We will be using Mistral 7B for this walkthrough. To use Llama 3 instead, point model_name at the gated Meta-Llama-3-8B-Instruct checkpoint you requested access to earlier.

    model_name = "unsloth/mistral-7b"
    print(f"Selected: {model_name}")
    print("Loading model (takes 2-5 minutes)...")

    Step 6: Load Model with Unsloth (4x Faster!)

    max_seq_length = 1024
    dtype = torch.float16
    load_in_4bit = True

    Load the model with Unsloth’s optimizations, and pass token="hf_…" to from_pretrained if you are using a gated model like Llama 3.

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=model_name,
        max_seq_length=max_seq_length,
        dtype=dtype,
        load_in_4bit=load_in_4bit,
    )

    print("Model loaded successfully!")
    if torch.cuda.is_available():
        print(f"Memory used: {torch.cuda.memory_allocated() / 1e9:.2f} GB")

    The load_in_4bit quantization saves memory. Use float16 for faster training, and you can increase max_seq_length to 2048 for longer responses.
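
    If you prefer not to hard-code these values, you can derive them from the runtime instead. A small sketch, assuming a CUDA GPU is attached as in the T4 setup above:

    # Prefer bfloat16 on GPUs that support it (Ampere and newer); the Colab T4 falls back to float16.
    dtype = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16

    # Raise the context window if your examples or desired responses are long.
    max_seq_length = 2048
    print(f"dtype={dtype}, max_seq_length={max_seq_length}")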


    Step 7: Add LoRA Adapters (Unsloth Optimized)

    LoRA is recommended for most use cases because it’s memory-efficient and fast:

    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"],
        lora_alpha=16,
        lora_dropout=0,
        bias="none",
        use_gradient_checkpointing="unsloth",
        random_state=3407,
        use_rslora=False,
        loftq_config=None,
    )

    print("LoRA adapters added!")
    print("Trainable parameters added: Only ~1% of total parameters!")

    • target_modules: Which layers to adapt (attention + MLP)
    • r=16: Higher = more adaptable, but more parameters
    • lora_alpha=16: Scaling factor for LoRA weights
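
    With r=16 and lora_alpha=16, the effective LoRA scaling factor lora_alpha/r works out to 1.0. You can also verify that only about 1% of parameters are trainable by counting which ones require gradients:

    # Count trainable vs. total parameters on the LoRA-wrapped model.
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"Trainable: {trainable:,} of {total:,} ({100 * trainable / total:.2f}%)")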

    Step 8: Format Dataset for Training


    def formatting_prompts_func(examples):
        """Format dataset for instruction fine-tuning."""
        if "mistral" in model_name.lower():
            texts = []
            for instruction, input_text, output in zip(
                examples["instruction"],
                examples["input"],
                examples["output"]
            ):
                text = f"[INST] {instruction}\n\n{input_text} [/INST] {output}"
                texts.append(text)
            return {"text": texts}
        elif "llama" in model_name.lower():
            texts = []
            for instruction, input_text, output in zip(
                examples["instruction"],
                examples["input"],
                examples["output"]
            ):
                # Llama 3 instruct chat template (system / user / assistant headers)
                text = (
                    f"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{instruction}<|eot_id|>"
                    f"<|start_header_id|>user<|end_header_id|>\n\n{input_text}<|eot_id|>"
                    f"<|start_header_id|>assistant<|end_header_id|>\n\n{output}<|eot_id|>"
                )
                texts.append(text)
            return {"text": texts}
        else:
            texts = []
            for instruction, input_text, output in zip(
                examples["instruction"],
                examples["input"],
                examples["output"]
            ):
                text = f"""### Instruction:
    {instruction}

    ### Input:
    {input_text}

    ### Response:
    {output}"""
                texts.append(text)
            return {"text": texts}


    print("Formatting dataset...")
    dataset = dataset.map(formatting_prompts_func, batched=True)
    print(f"Dataset formatted: {len(dataset)} samples")
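
    Before configuring the trainer, print one formatted example to confirm the template looks the way you expect:

    # Inspect the first formatted training example (truncated for readability).
    print(dataset[0]["text"][:400])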


    Step 9: Configure Training (Optimized by Unsloth)


    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        dataset_text_field="text",
        max_seq_length=max_seq_length,
        args=TrainingArguments(
            per_device_train_batch_size=2,
            gradient_accumulation_steps=4,
            warmup_steps=5,
            max_steps=60,
            learning_rate=2e-4,
            fp16=not torch.cuda.is_bf16_supported(),
            bf16=torch.cuda.is_bf16_supported(),
            logging_steps=1,
            optim="adamw_8bit",
            weight_decay=0.01,
            lr_scheduler_type="linear",
            seed=3407,
            output_dir="outputs",
            report_to="none",
        ),
    )

    print("Trainer configured!")
    print("Training will be 2x faster with Unsloth!")
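
    For reference, per_device_train_batch_size=2 with gradient_accumulation_steps=4 gives an effective batch size of 2 × 4 = 8, so max_steps=60 processes 60 × 8 = 480 examples, roughly 1.6 passes over the 300-sample dataset. For a larger dataset, raise max_steps or switch to num_train_epochs in TrainingArguments.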

    Step 10: Train the Model Faster with Unsloth

    trainer_stats = trainer.train()

    print("Training complete!")
    print(f"Training time: {trainer_stats.metrics['train_runtime']:.2f} seconds")
    print(f"Samples per second: {trainer_stats.metrics['train_samples_per_second']:.2f}")


    Step 11: Save the Fine-Tuned Model

    Save the fine-tuned adapter locally and, optionally, to your Google Drive.
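
    If you want the Drive copy, mount Google Drive first with the standard Colab helper (skip this cell if you only need the local copy):

    # Mount Google Drive at /content/drive so the model can be copied there.
    from google.colab import drive
    drive.mount("/content/drive")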

    print("Saving model...")

    # Save locally and to Drive
    model.save_pretrained("customer_support_model")
    tokenizer.save_pretrained("customer_support_model")

    # If using Google Drive (mounted at /content/drive)
    model.save_pretrained("/content/drive/MyDrive/customer_support_model")
    tokenizer.save_pretrained("/content/drive/MyDrive/customer_support_model")

    print("Model saved!")
    print("Local: ./customer_support_model")
    print("Drive: /content/drive/MyDrive/customer_support_model")
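
    The directories above contain the LoRA adapter and tokenizer files, not a standalone model. If you need a fully merged model or a quantized file for local inference, Unsloth documents export helpers for that; the calls below follow its documented API, so treat them as a sketch and check the current Unsloth docs before relying on them:

    # Merge the LoRA adapter into the base weights (16-bit) for standard Transformers loading.
    model.save_pretrained_merged("customer_support_model_merged", tokenizer, save_method="merged_16bit")

    # Or export a quantized GGUF file for llama.cpp-style local inference.
    model.save_pretrained_gguf("customer_support_model_gguf", tokenizer, quantization_method="q4_k_m")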

    Step 12: Test Your Fine-Tuned Model

    Load the saved model and generate responses.


    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="customer_support_model",
        max_seq_length=max_seq_length,
        dtype=dtype,
        load_in_4bit=load_in_4bit,
    )

    # Enable inference mode
    FastLanguageModel.for_inference(model)

    def ask_question(question):
        """Generate response to a customer question."""
        if "mistral" in model_name.lower():
            prompt = f"[INST] You are a helpful customer support agent. Answer clearly and professionally.\n\n{question} [/INST]"
        elif "llama" in model_name.lower():
            # Llama 3 instruct chat template, ending at the assistant header so the model writes the reply
            prompt = (
                "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
                "You are a helpful customer support agent. Answer clearly and professionally.<|eot_id|>"
                "<|start_header_id|>user<|end_header_id|>\n\n"
                f"{question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
            )
        else:
            prompt = (
                "### Instruction:\nYou are a helpful customer support agent. "
                "Answer clearly and professionally.\n\n### Input:\n"
                f"{question}\n\n### Response:"
            )

        inputs = tokenizer([prompt], return_tensors="pt")
        if torch.cuda.is_available():
            inputs = {k: v.to("cuda") for k, v in inputs.items()}

        outputs = model.generate(
            **inputs,
            max_new_tokens=128,
            temperature=0.7,
            do_sample=True,
        )

        response = tokenizer.decode(outputs[0], skip_special_tokens=True)

        # Extract just the response text
        if "[/INST]" in response:
            response = response.split("[/INST]")[-1].strip()
        elif "assistant" in response:
            response = response.split("assistant")[-1].strip()
        elif "### Response:" in response:
            response = response.split("### Response:")[-1].strip()
        return response

    Now run a few test questions through the fine-tuned model:


    test_questions = [
        "How do I reset my password?",
        "My payment was declined",
        "How do I cancel my subscription?",
        "The app keeps crashing",
        "Where is my order?",
        "Do you offer student discounts?"
    ]

    print("\n" + "=" * 60)
    print("CUSTOMER SUPPORT AGENT TEST")
    print("=" * 60)

    for i, question in enumerate(test_questions, 1):
        print(f"\nCustomer {i}: {question}")
        answer = ask_question(question)
        print(f"Support Agent: {answer}")
        print("-" * 60)

    print("\nTesting complete!")


    Step 13: Compare with Base Model

    Load the base model for comparison:

    base_model, base_tokenizer = FastLanguageModel.from_pretrained(
        model_name=model_name,
        max_seq_length=max_seq_length,
        dtype=dtype,
        load_in_4bit=load_in_4bit,
    )

    FastLanguageModel.for_inference(base_model)

    Test the same question with both models:

    question = "How do I reset my password?"

    Generate the base model’s response:

    if "mistral" in model_name.lower():
        base_prompt = f"[INST] {question} [/INST]"
    else:
        base_prompt = f"### Instruction:\nAnswer the question.\n\n### Input:\n{question}\n\n### Response:"

    base_inputs = base_tokenizer([base_prompt], return_tensors="pt")
    if torch.cuda.is_available():
        base_inputs = {k: v.to("cuda") for k, v in base_inputs.items()}
    base_outputs = base_model.generate(**base_inputs, max_new_tokens=128)
    base_response = base_tokenizer.decode(base_outputs[0], skip_special_tokens=True)

    Generate the fine-tuned model’s response:

    ft_response = ask_question(question)

    print(f"\nQuestion: {question}")
    print(f"\nBASE MODEL:\n{base_response}")
    print(f"\nFINE-TUNED MODEL:\n{ft_response}")
    print("\nSee the improvement in response quality!")


    Conclusion

    In this tutorial, you learned how to fine-tune open-source LLMs, and you saw that adapting a model to your own tasks does not have to be complicated or expensive. Unsloth makes the process easier: training can be up to four times faster while using much less memory, so you can do all of this on a free Colab GPU.

    The Mistral 7B model is often a strong option because it’s efficient and delivers excellent results. Always remember that your dataset teaches the model: five hundred clear, well-written examples are better than thousands of messy ones. You don’t need to rebuild the entire model; you can adjust small parts with LoRA to get your desired results.

    Always test what you’ve created. Check both with numbers and by reading the answers yourself to ensure your assistant is truly helpful and accurate. This process turns a general model into your personal expert, capable of handling customer questions, writing in your company’s voice, and operating around the clock.

