    How to Build an Autonomous Machine Learning Research Loop in Google Colab Using Andrej Karpathy’s AutoResearch Framework for Hyperparameter Discovery and Experiment Tracking

By Naveed Ahmad | March 13, 2026


    In this tutorial, we implement a Colab-ready version of the AutoResearch framework originally proposed by Andrej Karpathy. We build an automated experimentation pipeline that clones the AutoResearch repository, prepares a lightweight training environment, and runs a baseline experiment to establish initial performance metrics. We then create an automated research loop that programmatically edits the hyperparameters in train.py, runs new training iterations, evaluates the resulting model using the validation bits-per-byte metric, and logs every experiment in a structured results table. By running this workflow in Google Colab, we demonstrate how we can reproduce the core idea of autonomous machine learning research: iteratively modifying training configurations, evaluating performance, and preserving the best configurations, without requiring specialized hardware or complex infrastructure.
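Before wiring this loop to real training, it helps to see the core idea in isolation: propose a configuration, score it, and keep it only if it improves the metric. The sketch below is a generic stand-in, not part of AutoResearch; the `score` function is a toy objective (assumed here purely for illustration) playing the role of a training run, with lower values better, just like `val_bpb`.

```python
import random

def research_loop(score, search_space, n_trials=20, seed=0):
    """Generic propose-evaluate-keep loop: lower score is better."""
    rng = random.Random(seed)
    best_cfg = {k: vs[0] for k, vs in search_space.items()}  # arbitrary starting point
    best_score = score(best_cfg)
    history = [(dict(best_cfg), best_score, "keep")]
    for _ in range(n_trials):
        cand = dict(best_cfg)
        k = rng.choice(list(search_space))       # mutate one knob at a time
        cand[k] = rng.choice(search_space[k])
        s = score(cand)
        if s < best_score:                       # keep only strict improvements
            best_cfg, best_score = cand, s
            history.append((dict(cand), s, "keep"))
        else:
            history.append((dict(cand), s, "revert"))
    return best_cfg, best_score, history

# Toy objective: pretend lr=0.02, depth=4 is the optimum.
space = {"lr": [0.01, 0.02, 0.04], "depth": [3, 4, 5]}
toy = lambda c: abs(c["lr"] - 0.02) + abs(c["depth"] - 4)
cfg, s, hist = research_loop(toy, space, n_trials=50)
```

The rest of the tutorial replaces `score` with an actual `train.py` run and the history list with a TSV file on disk.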

import os, sys, subprocess, json, re, random, shutil, time
from pathlib import Path


def pip_install(pkg):
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", pkg])


# Install anything that is not already importable.
for pkg in [
    "numpy", "pandas", "pyarrow", "requests",
    "rustbpe", "tiktoken", "openai",
]:
    try:
        __import__(pkg)
    except ImportError:
        pip_install(pkg)


import pandas as pd


# Clone the repository once; re-running the cell is a no-op.
if not Path("autoresearch").exists():
    subprocess.run(["git", "clone", "https://github.com/karpathy/autoresearch.git"], check=True)

os.chdir("autoresearch")


# Prefer the Colab secrets store; fall back to an environment variable.
OPENAI_API_KEY = None
try:
    from google.colab import userdata
    OPENAI_API_KEY = userdata.get("OPENAI_API_KEY")
except Exception:
    OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")

if OPENAI_API_KEY:
    os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

    We begin by importing the core Python libraries required for the automated research workflow. We install all necessary dependencies and clone the autoresearch repository directly from GitHub, ensuring the environment includes the original training framework. We also configure access to the OpenAI API key, if available, allowing the system to optionally support LLM-assisted experimentation later in the pipeline.
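One caveat with the try/except import pattern above: a pip distribution name and its importable module name can differ, and a bare import executes the module just to test for it. An alternative sketch using `importlib.util.find_spec`, which checks availability without importing (the `ensure` helper and its names are our own, not part of AutoResearch):

```python
import importlib.util
import subprocess
import sys

def ensure(module_name, pip_name=None):
    """Install pip_name (default: module_name) only if module_name is missing.

    Returns True if an install was attempted, False if already available.
    """
    if importlib.util.find_spec(module_name) is None:
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", "-q", pip_name or module_name]
        )
        return True
    return False

# Stdlib modules are always present, so no install is triggered:
installed = ensure("json")
```

This also lets you handle cases like `ensure("PIL", pip_name="Pillow")` where the two names diverge.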

prepare_path = Path("prepare.py")
train_path = Path("train.py")
program_path = Path("program.md")

prepare_text = prepare_path.read_text()
train_text = train_path.read_text()

# Shrink data preparation so it fits inside a Colab session.
prepare_text = re.sub(r"MAX_SEQ_LEN = \d+", "MAX_SEQ_LEN = 512", prepare_text)
prepare_text = re.sub(r"TIME_BUDGET = \d+", "TIME_BUDGET = 120", prepare_text)
prepare_text = re.sub(r"EVAL_TOKENS = .*", "EVAL_TOKENS = 4 * 65536", prepare_text)

# Shrink the model and batch sizes for a single Colab GPU.
train_text = re.sub(r"DEPTH = \d+", "DEPTH = 4", train_text)
train_text = re.sub(r"DEVICE_BATCH_SIZE = \d+", "DEVICE_BATCH_SIZE = 16", train_text)
train_text = re.sub(r"TOTAL_BATCH_SIZE = .*", "TOTAL_BATCH_SIZE = 2**17", train_text)
train_text = re.sub(r'WINDOW_PATTERN = "SSSL"', 'WINDOW_PATTERN = "L"', train_text)

prepare_path.write_text(prepare_text)
train_path.write_text(train_text)

program_path.write_text("""
Goal:
Run autonomous research loop on Google Colab.

Rules:
Only modify train.py hyperparameters.

Metric:
Lower val_bpb is better.
""")

subprocess.run(["python", "prepare.py", "--num-shards", "4", "--download-workers", "2"], check=True)

    We modify key configuration parameters inside the repository to make the training workflow compatible with Google Colab hardware. We reduce the context length, training time budget, and evaluation token counts so the experiments run within limited GPU resources. After applying these patches, we prepare the dataset shards required for training so that the model can immediately begin experiments.
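One fragility worth noting in the patching above: if a pattern does not match (say the repository renames a constant), `re.sub` silently returns the text unchanged and the experiment runs with the wrong configuration. A defensive variant using `re.subn` to confirm each substitution actually happened (the constant names below are illustrative):

```python
import re

def patch_constant(text, name, value):
    """Replace a `NAME = ...` assignment, failing loudly if it is absent."""
    new_text, n = re.subn(
        rf"^{name}\s*=.*$", f"{name} = {value}", text, flags=re.MULTILINE
    )
    if n != 1:
        raise ValueError(f"expected exactly one `{name} = ...` line, found {n}")
    return new_text

sample = "DEPTH = 8\nDEVICE_BATCH_SIZE = 32\n"
patched = patch_constant(sample, "DEPTH", "4")
```

Failing at patch time is much cheaper than discovering a no-op substitution after a full training run.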

subprocess.run("python train.py > baseline.log 2>&1", shell=True)


def parse_run_log(log_path):
    """Extract the metrics train.py prints; missing fields come back as None."""
    text = Path(log_path).read_text(errors="ignore")
    def find(p):
        m = re.search(p, text, re.MULTILINE)
        return float(m.group(1)) if m else None
    return {
        "val_bpb": find(r"^val_bpb:\s*([0-9.]+)"),
        "training_seconds": find(r"^training_seconds:\s*([0-9.]+)"),
        "peak_vram_mb": find(r"^peak_vram_mb:\s*([0-9.]+)"),
        "num_steps": find(r"^num_steps:\s*([0-9.]+)"),
    }


baseline = parse_run_log("baseline.log")

results_path = Path("results.tsv")

# A failed parse is logged as 0 so the loop below can skip it with .replace(0, 999).
rows = [{
    "commit": "baseline",
    "val_bpb": baseline["val_bpb"] if baseline["val_bpb"] else 0,
    "memory_gb": round((baseline["peak_vram_mb"] or 0) / 1024, 1),
    "status": "keep",
    "description": "baseline",
}]

pd.DataFrame(rows).to_csv(results_path, sep="\t", index=False)

print("Baseline:", baseline)

    We execute the baseline training run to establish an initial performance reference for the model. We implement a log-parsing function that extracts key training metrics, including validation bits-per-byte, training time, GPU memory usage, and optimization steps. We then store these baseline results in a structured experiment table so that all future experiments can be compared against this starting configuration.
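The parser can be sanity-checked without a real training run by feeding it a synthetic log. The snippet below restates the same `parse_run_log` so it runs standalone; the field names follow the patterns the parser expects, which are themselves an assumption about what `train.py` prints:

```python
import os
import re
import tempfile
from pathlib import Path

def parse_run_log(log_path):
    """Pull numeric metrics out of a training log; missing fields become None."""
    text = Path(log_path).read_text(errors="ignore")
    def find(pattern):
        m = re.search(pattern, text, re.MULTILINE)
        return float(m.group(1)) if m else None
    return {
        "val_bpb": find(r"^val_bpb:\s*([0-9.]+)"),
        "training_seconds": find(r"^training_seconds:\s*([0-9.]+)"),
        "peak_vram_mb": find(r"^peak_vram_mb:\s*([0-9.]+)"),
        "num_steps": find(r"^num_steps:\s*([0-9.]+)"),
    }

# A fake log with one field (peak_vram_mb) deliberately missing.
fake = "step 10 loss 2.1\nval_bpb: 1.234\ntraining_seconds: 98.5\nnum_steps: 40\n"
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as f:
    f.write(fake)
    path = f.name
metrics = parse_run_log(path)
os.unlink(path)
```

Missing fields degrading to `None` rather than raising is what lets the loop treat a crashed run as "no score" instead of aborting the whole search.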

TRAIN_FILE = Path("train.py")
BACKUP_FILE = Path("train.base.py")

# Keep a pristine copy so every candidate is applied to the same base.
if not BACKUP_FILE.exists():
    shutil.copy2(TRAIN_FILE, BACKUP_FILE)

HP_KEYS = [
    "WINDOW_PATTERN",
    "TOTAL_BATCH_SIZE",
    "EMBEDDING_LR",
    "UNEMBEDDING_LR",
    "MATRIX_LR",
    "SCALAR_LR",
    "WEIGHT_DECAY",
    "ADAM_BETAS",
    "WARMUP_RATIO",
    "WARMDOWN_RATIO",
    "FINAL_LR_FRAC",
    "DEPTH",
    "DEVICE_BATCH_SIZE",
]


def read_text(path):
    return Path(path).read_text()


def write_text(path, text):
    Path(path).write_text(text)


def extract_hparams(text):
    """Read the current value of every tracked hyperparameter."""
    vals = {}
    for k in HP_KEYS:
        m = re.search(rf"^{k}\s*=\s*(.+?)$", text, re.MULTILINE)
        if m:
            vals[k] = m.group(1).strip()
    return vals


def set_hparam(text, key, value):
    return re.sub(rf"^{key}\s*=.*$", f"{key} = {value}", text, flags=re.MULTILINE)


base_text = read_text(BACKUP_FILE)
base_hparams = extract_hparams(base_text)

SEARCH_SPACE = {
    "WINDOW_PATTERN": ['"L"', '"SSSL"'],
    "TOTAL_BATCH_SIZE": ["2**16", "2**17", "2**18"],
    "EMBEDDING_LR": ["0.2", "0.4", "0.6"],
    "MATRIX_LR": ["0.01", "0.02", "0.04"],
    "SCALAR_LR": ["0.3", "0.5", "0.7"],
    "WEIGHT_DECAY": ["0.05", "0.1", "0.2"],
    "ADAM_BETAS": ["(0.8,0.95)", "(0.9,0.95)"],
    "WARMUP_RATIO": ["0.0", "0.05", "0.1"],
    "WARMDOWN_RATIO": ["0.3", "0.5", "0.7"],
    "FINAL_LR_FRAC": ["0.0", "0.05"],
    "DEPTH": ["3", "4", "5", "6"],
    "DEVICE_BATCH_SIZE": ["8", "12", "16", "24"],
}


def sample_candidate():
    """Mutate 2-4 randomly chosen hyperparameters of the base configuration."""
    keys = random.sample(list(SEARCH_SPACE.keys()), random.choice([2, 3, 4]))
    cand = dict(base_hparams)
    changes = {}
    for k in keys:
        cand[k] = random.choice(SEARCH_SPACE[k])
        changes[k] = cand[k]
    return cand, changes


def apply_hparams(candidate):
    text = read_text(BACKUP_FILE)
    for k, v in candidate.items():
        text = set_hparam(text, k, v)
    write_text(TRAIN_FILE, text)


def run_experiment(tag):
    log = f"{tag}.log"
    subprocess.run(f"python train.py > {log} 2>&1", shell=True)
    metrics = parse_run_log(log)
    metrics["log"] = log
    return metrics

    We build the core utilities that enable automated hyperparameter experimentation. We extract the hyperparameters from train.py, define the searchable parameter space, and implement functions that can programmatically edit these values. We also create mechanisms to generate candidate configurations, apply them to the training script, and run experiments while recording their outputs.
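A quick round-trip check of the extract/set helpers on a toy `train.py` string confirms that editing one constant leaves the others untouched. The helpers are restated here so the snippet runs standalone, with `HP_KEYS` trimmed to two entries for brevity; no files are touched:

```python
import re

HP_KEYS = ["DEPTH", "MATRIX_LR"]

def extract_hparams(text):
    """Read the current value of every tracked hyperparameter."""
    vals = {}
    for k in HP_KEYS:
        m = re.search(rf"^{k}\s*=\s*(.+?)$", text, re.MULTILINE)
        if m:
            vals[k] = m.group(1).strip()
    return vals

def set_hparam(text, key, value):
    return re.sub(rf"^{key}\s*=.*$", f"{key} = {value}", text, flags=re.MULTILINE)

toy = "import torch\nDEPTH = 4\nMATRIX_LR = 0.02\n"
before = extract_hparams(toy)
after = extract_hparams(set_hparam(toy, "DEPTH", "6"))
```

Because values are kept as source-text strings ("2**17", '"SSSL"', "(0.9,0.95)") rather than parsed Python objects, substitutions preserve the exact literal syntax `train.py` expects.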

N_EXPERIMENTS = 3

df = pd.read_csv(results_path, sep="\t")
best = df["val_bpb"].replace(0, 999).min()

for i in range(N_EXPERIMENTS):
    tag = f"exp_{i+1}"
    candidate, changes = sample_candidate()
    apply_hparams(candidate)
    metrics = run_experiment(tag)

    # Keep a candidate only if it beats the best val_bpb seen so far.
    improved = metrics["val_bpb"] and metrics["val_bpb"] < best
    if improved:
        best = metrics["val_bpb"]
        shutil.copy2(TRAIN_FILE, "train.best.py")

    df.loc[len(df)] = {
        "commit": tag,
        "val_bpb": metrics["val_bpb"] or 0,
        "memory_gb": round((metrics["peak_vram_mb"] or 0) / 1024, 1),
        "status": "keep" if improved else "revert",
        "description": json.dumps(changes),
    }
    df.to_csv(results_path, sep="\t", index=False)
    print(tag, "keep" if improved else "revert", metrics["val_bpb"], changes)

    We run the automated research loop that repeatedly proposes new hyperparameter configurations and evaluates their performance. For each experiment, we modify the training script, run the training process, and compare the resulting validation score with the best configuration discovered so far. We log all experiment results, preserve improved configurations, and export the best training script along with the experiment history for further analysis.
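Once results.tsv has accumulated several rows, recovering the winner is a one-liner. A sketch on an in-memory table with the same columns (the rows below are made-up example values, not real run output):

```python
import pandas as pd

rows = [
    {"commit": "baseline", "val_bpb": 1.40, "memory_gb": 6.1, "status": "keep"},
    {"commit": "exp_1",    "val_bpb": 1.35, "memory_gb": 6.4, "status": "keep"},
    {"commit": "exp_2",    "val_bpb": 0.00, "memory_gb": 0.0, "status": "revert"},
]
df = pd.DataFrame(rows)

# Treat val_bpb == 0 as "run failed", mirroring the loop's .replace(0, 999) trick.
valid = df[df["val_bpb"] > 0]
best_row = valid.loc[valid["val_bpb"].idxmin()]
```

In the actual pipeline you would read the table with `pd.read_csv("results.tsv", sep="\t")` and the winning script is already waiting in train.best.py.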

    In conclusion, we constructed a complete automated research workflow that demonstrates how machines can iteratively explore model configurations and improve training performance with minimal manual intervention. Throughout the tutorial, we prepared the dataset, established a baseline experiment, and implemented a search loop that proposes new hyperparameter configurations, runs experiments, and tracks results across multiple trials. By maintaining experiment logs and automatically preserving improved configurations, we created a reproducible and extensible research process that mirrors the workflow used in modern machine learning experimentation. This approach illustrates how we can combine automation, experimentation tracking, and lightweight infrastructure to accelerate model development and enable scalable research directly from a cloud notebook environment.


