    How to Build a Stable and Efficient QLoRA Fine-Tuning Pipeline Using Unsloth for Large Language Models

    By Naveed Ahmad, March 4, 2026


    In this tutorial, we demonstrate how to efficiently fine-tune a large language model using Unsloth and QLoRA. We focus on building a stable, end-to-end supervised fine-tuning pipeline that handles common Colab issues such as GPU detection failures, runtime crashes, and library incompatibilities. By carefully controlling the environment, model configuration, and training loop, we show how to reliably train an instruction-tuned model with limited resources while maintaining strong performance and rapid iteration speed.

    import os, sys, subprocess, gc, locale
    
    
    # Force a UTF-8 locale so shell commands behave consistently in Colab.
    locale.getpreferredencoding = lambda: "UTF-8"
    
    
    def run(cmd):
       print("\n$ " + cmd, flush=True)
       p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
       for line in p.stdout:
           print(line, end="", flush=True)
       rc = p.wait()
       if rc != 0:
           raise RuntimeError(f"Command failed ({rc}): {cmd}")
    
    
    print("Installing packages (this may take 2–3 minutes)...", flush=True)
    
    
    run("pip install -U pip")
    run("pip uninstall -y torch torchvision torchaudio")
    run(
       "pip install --no-cache-dir "
       "torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 "
       "--index-url https://download.pytorch.org/whl/cu121"
    )
    run(
       "pip install -U "
       "transformers==4.45.2 "
       "accelerate==0.34.2 "
       "datasets==2.21.0 "
       "trl==0.11.4 "
       "sentencepiece safetensors evaluate"
    )
    run("pip install -U unsloth")
    
    
    import torch
    try:
       import unsloth
       restarted = False
    except Exception:
       restarted = True
    
    
    if restarted:
       print("\nRuntime needs restart. After restart, run this SAME cell again.", flush=True)
       os._exit(0)

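    We set up a controlled and compatible environment by reinstalling PyTorch 2.4.1 from the CUDA 12.1 wheel index and pinning transformers, accelerate, datasets, and trl to known versions before installing Unsloth. This keeps Unsloth and its dependencies aligned with the CUDA runtime available in Google Colab. If importing Unsloth fails after installation, the cell exits the process so that the runtime restarts, and we run the same cell again in the clean environment before training begins.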
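
    As a quick sanity check after the install step, we can compare the installed versions against the pins. The following is a minimal sketch using only the standard library; the helper names installed_version and find_mismatches are hypothetical and not part of the original notebook.

```python
from importlib import metadata

# Pins copied from the install commands above.
PINS = {
    "torch": "2.4.1",
    "transformers": "4.45.2",
    "accelerate": "0.34.2",
    "datasets": "2.21.0",
    "trl": "0.11.4",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for package, expected in pins.items():
        found = lookup(package)
        # A local build tag such as "2.4.1+cu121" still counts as a match.
        if found is None or found.split("+")[0] != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in find_mismatches(PINS).items():
        print(f"{package}: expected {expected}, found {found}")
```

    An empty result means every pinned package is present at the expected version; anything printed points at a package worth reinstalling before training.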

    import torch, gc
    
    
    assert torch.cuda.is_available()
    print("Torch:", torch.__version__)
    print("GPU:", torch.cuda.get_device_name(0))
    print("VRAM(GB):", round(torch.cuda.get_device_properties(0).total_memory / 1e9, 2))
    
    
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.allow_tf32 = True
    
    
    def clean():
       gc.collect()
       torch.cuda.empty_cache()
    
    
    import unsloth
    from unsloth import FastLanguageModel
    from datasets import load_dataset
    from transformers import TextStreamer
    from trl import SFTTrainer, SFTConfig

    We verify GPU availability, print the device name and total VRAM, and enable TF32 matrix multiplication for GPUs that support it. We import Unsloth before transformers, datasets, and trl so that its performance patches are applied correctly. We also define a small utility function, clean(), that runs garbage collection and empties the CUDA cache to free GPU memory before training.
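
    Which precision features are available depends on the GPU generation. The sketch below maps a CUDA compute capability to the features relevant here, assuming the usual rule that TF32 and bfloat16 arrive with compute capability 8.0 (Ampere) and newer, while a 7.5 device such as the T4 falls back to float16. The function name gpu_precision_features is hypothetical.

```python
def gpu_precision_features(major, minor):
    """Map a CUDA compute capability to the precision features it offers.

    Assumption: TF32 and bfloat16 are available from compute capability 8.0.
    """
    ampere_or_newer = (major, minor) >= (8, 0)
    return {
        "tf32": ampere_or_newer,
        "bf16": ampere_or_newer,
        "recommended_dtype": "bfloat16" if ampere_or_newer else "float16",
    }


if __name__ == "__main__":
    # On a real machine the tuple would come from
    # torch.cuda.get_device_capability(0); here we use two fixed examples.
    for capability in [(7, 5), (8, 0)]:
        print(capability, gpu_precision_features(*capability))
```

    On older GPUs the TF32 flags set above are simply ignored, so leaving them enabled is harmless.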

    max_seq_length = 768
    model_name = "unsloth/Qwen2.5-1.5B-Instruct-bnb-4bit"
    
    
    model, tokenizer = FastLanguageModel.from_pretrained(
       model_name=model_name,
       max_seq_length=max_seq_length,
       dtype=None,
       load_in_4bit=True,
    )
    
    
    model = FastLanguageModel.get_peft_model(
       model,
       r=8,
       target_modules=["q_proj", "k_proj"],
       lora_alpha=16,
       lora_dropout=0.0,
       bias="none",
       use_gradient_checkpointing="unsloth",
       random_state=42,
       max_seq_length=max_seq_length,
    )
    

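    We load a 4-bit quantized, instruction-tuned Qwen2.5 1.5B model using Unsloth’s fast-loading utilities. We then attach rank-8 LoRA adapters to the attention query and key projections, so only the small adapter matrices are trained while the quantized base weights stay frozen. Unsloth’s gradient checkpointing mode reduces activation memory further, which helps balance memory efficiency and learning capacity.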
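
    To get a feel for how small the trainable footprint is, we can count LoRA parameters by hand: each adapted linear layer of shape (d_in, d_out) adds r * (d_in + d_out) weights, and the update is scaled by lora_alpha / r in the standard LoRA formulation. The sketch below is illustrative only; the layer shapes and layer count are assumptions about the Qwen2.5 1.5B architecture and should be checked against the model config.

```python
def lora_param_count(r, shapes, num_layers):
    """Count trainable LoRA weights.

    shapes maps a module name to (d_in, d_out). Each adapted module adds
    an A matrix (r x d_in) and a B matrix (d_out x r).
    """
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


def lora_scaling(lora_alpha, r):
    """Standard LoRA scaling factor applied to the low-rank update."""
    return lora_alpha / r


# ASSUMED shapes for illustration: hidden size 1536, key projection width 256,
# 28 transformer layers. Verify against model.config before relying on them.
ASSUMED_SHAPES = {"q_proj": (1536, 1536), "k_proj": (1536, 256)}
ASSUMED_LAYERS = 28

if __name__ == "__main__":
    print("trainable LoRA weights:", lora_param_count(8, ASSUMED_SHAPES, ASSUMED_LAYERS))
    print("scaling factor:", lora_scaling(16, 8))
```

    Under these assumptions the adapters amount to roughly a million weights, a tiny fraction of the 1.5B-parameter base model.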

    ds = load_dataset("trl-lib/Capybara", split="train").shuffle(seed=42).select(range(1200))
    
    
    def to_text(example):
       example["text"] = tokenizer.apply_chat_template(
           example["messages"],
           tokenize=False,
           add_generation_prompt=False,
       )
       return example
    
    
    # Keep only the rendered "text" column; all original columns are dropped.
    ds = ds.map(to_text, remove_columns=ds.column_names)
    split = ds.train_test_split(test_size=0.02, seed=42)
    train_ds, eval_ds = split["train"], split["test"]
    
    
    cfg = SFTConfig(
       output_dir="unsloth_sft_out",
       dataset_text_field="text",
       max_seq_length=max_seq_length,
       packing=False,
       per_device_train_batch_size=1,
       gradient_accumulation_steps=8,
       max_steps=150,
       learning_rate=2e-4,
       warmup_ratio=0.03,
       lr_scheduler_type="cosine",
       logging_steps=10,
       eval_strategy="no",
       save_steps=0,
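       fp16=not unsloth.is_bfloat16_supported(),  # float16 on GPUs without bf16 (e.g. T4)
       bf16=unsloth.is_bfloat16_supported(),      # match the dtype Unsloth picked at load time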
       optim="adamw_8bit",
       report_to="none",
       seed=42,
    )
    
    
    trainer = SFTTrainer(
       model=model,
       tokenizer=tokenizer,
       train_dataset=train_ds,
       eval_dataset=eval_ds,
       args=cfg,
    )
    

    We prepare the training dataset by rendering each multi-turn conversation into a single string with the tokenizer’s chat template, which is the format the supervised fine-tuning trainer reads from the text field. We hold out 2% of the 1,200 sampled conversations as an evaluation split, although evaluation stays disabled in this run. We also define the training configuration, which controls the batch size, learning rate, and training duration.
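
    The configuration values interact, so it helps to work out what they imply. The sketch below computes the effective batch size, the number of samples seen, and the warmup length for the settings above. It assumes a single GPU and assumes that the split size and warmup steps are rounded up, which is our understanding of the libraries' behavior rather than something the original states; schedule_summary is a hypothetical helper.

```python
import math


def schedule_summary(num_examples, test_size, per_device_batch,
                     grad_accum, max_steps, warmup_ratio):
    """Derive the quantities implied by the training configuration.

    Assumptions: one GPU; test split and warmup steps are rounded up.
    """
    n_test = math.ceil(num_examples * test_size)
    n_train = num_examples - n_test
    effective_batch = per_device_batch * grad_accum
    samples_seen = effective_batch * max_steps
    return {
        "n_train": n_train,
        "n_test": n_test,
        "effective_batch": effective_batch,
        "samples_seen": samples_seen,
        "epochs": samples_seen / n_train,
        "warmup_steps": math.ceil(max_steps * warmup_ratio),
    }


if __name__ == "__main__":
    summary = schedule_summary(
        num_examples=1200, test_size=0.02, per_device_batch=1,
        grad_accum=8, max_steps=150, warmup_ratio=0.03,
    )
    for key, value in summary.items():
        print(f"{key}: {value}")
```

    With these numbers, 150 optimizer steps at an effective batch of 8 cover about one pass over the training split, which is why the run finishes quickly.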

    clean()
    trainer.train()
    
    
    FastLanguageModel.for_inference(model)
    
    
    def chat(prompt, max_new_tokens=160):
       messages = [{"role":"user","content":prompt}]
       text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
       inputs = tokenizer([text], return_tensors="pt").to("cuda")
       streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
       with torch.inference_mode():
           model.generate(
               **inputs,
               max_new_tokens=max_new_tokens,
               temperature=0.7,
               top_p=0.9,
               do_sample=True,
               streamer=streamer,
           )
    
    
    chat("Give a concise checklist for validating a machine learning model before deployment.")
    
    
    save_dir = "unsloth_lora_adapters"
    model.save_pretrained(save_dir)
    tokenizer.save_pretrained(save_dir)

    We execute the training loop and monitor the fine-tuning process on the GPU. We switch the model to inference mode and validate its behavior using a sample prompt. We finally save the trained LoRA adapters so that we can reuse or deploy the fine-tuned model later.
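
    To reuse the saved adapters in a later session, one option is to point Unsloth’s loader at the adapter directory. The following is a minimal sketch, assuming the adapters were saved to unsloth_lora_adapters as above and that a CUDA GPU with Unsloth installed is available; load_saved_adapters is a hypothetical helper, and the import is deferred so that defining the function does not itself require a GPU.

```python
def load_saved_adapters(adapter_dir="unsloth_lora_adapters", max_seq_length=768):
    """Reload the base model plus saved LoRA adapters for inference.

    Assumes adapter_dir was written by model.save_pretrained(...) and that
    Unsloth and a CUDA GPU are available when this function is called.
    """
    # Deferred import: Unsloth is only needed when the function actually runs.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=adapter_dir,
        max_seq_length=max_seq_length,
        dtype=None,          # let Unsloth pick float16 or bfloat16
        load_in_4bit=True,   # keep the base weights quantized
    )
    FastLanguageModel.for_inference(model)
    return model, tokenizer
```

    After calling model, tokenizer = load_saved_adapters(), the chat helper defined earlier can be reused as long as it refers to the reloaded model and tokenizer.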

    In conclusion, we fine-tuned an instruction-following language model using Unsloth’s optimized training stack and a lightweight QLoRA setup. We demonstrated that by constraining sequence length, dataset size, and training steps, we can achieve stable training on Colab GPUs without runtime interruptions. The resulting LoRA adapters provide a practical, reusable artifact that we can deploy or extend further, making this workflow a robust foundation for future experimentation and advanced alignment techniques.

