    Microsoft Research Introduces CORPGEN to Manage Multi-Horizon Tasks for Autonomous AI Agents Using Hierarchical Planning and Memory

    By Naveed Ahmad | February 27, 2026 | 4 Mins Read


    Microsoft researchers have introduced CORPGEN, an architecture-agnostic framework designed to manage the complexities of realistic organizational work through autonomous digital employees. While existing benchmarks evaluate AI agents on isolated, single tasks, real-world corporate environments require managing dozens of concurrent, interleaved tasks with complex dependencies. The research team identifies this distinct problem class as Multi-Horizon Task Environments (MHTEs).

    The Performance Gap in MHTEs

    Empirical testing reveals that baseline computer-using agents (CUAs) degrade significantly when moved from single-task scenarios to MHTEs. Across three independent CUA implementations, completion rates dropped from 16.7% at 25% load to 8.7% at 100% load.

    The research team identified four fundamental failure modes causing this decline:

    • Context Saturation: Context requirements grow O(N) with task count rather than O(1), rapidly exceeding the token window capacity.
    • Memory Interference: Information from one task often contaminates reasoning about another when multiple tasks share a single context window.
    • Dependency Graph Complexity: Corporate tasks form Directed Acyclic Graphs (DAGs) rather than linear chains, requiring complex topological reasoning.
    • Reprioritization Overhead: Decision complexity increases to O(N) per cycle because agents must constantly re-evaluate priorities across all active tasks.
    https://arxiv.org/pdf/2602.14229
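The dependency-graph failure mode can be made concrete with a small sketch. The task names below are hypothetical and are not taken from the paper; the example only shows how tasks that form a DAG can be grouped into batches whose prerequisites are already complete, using Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter


def ready_batches(dependencies):
    """Group tasks into batches; every task in a batch has all prerequisites done."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for a stable display order
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical corporate tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "collect_sales_data": set(),
    "collect_costs": set(),
    "draft_report": {"collect_sales_data", "collect_costs"},
    "email_report": {"draft_report"},
}

batches = ready_batches(dependencies)
print(batches)
```

A linear chain would yield one task per batch; here the first batch holds two independent tasks, which is the kind of structure an agent has to reason about when tasks are interleaved.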

    The CORPGEN Architecture

    To address these failures, CORPGEN implements Multi-Objective Multi-Horizon Agent (MOMA) capabilities through four primary architectural mechanisms.

    (a) Hierarchical Planning

    Strategic coherence is maintained through goal decomposition across three temporal scales:

    • Strategic Objectives (Monthly): High-level goals and milestones based on agent identity and role.
    • Tactical Plans (Daily): Actionable tasks for specific applications with priority rankings.
    • Operational Actions (Per-Cycle): Individual tool calls selected based on current state and retrieved memory.
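One way to picture the three temporal scales is as nested data structures. The sketch below is illustrative only: the class names, fields, and the "lower number means more urgent" priority convention are assumptions, not CORPGEN's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class TacticalTask:
    """A daily-scale task targeting one application (hypothetical fields)."""
    description: str
    application: str
    priority: int  # assumption: lower number means more urgent
    done: bool = False


@dataclass
class StrategicObjective:
    """A monthly-scale goal that owns a list of tactical tasks."""
    goal: str
    tasks: list[TacticalTask] = field(default_factory=list)


def next_task(objectives):
    """Per-cycle step: pick the most urgent unfinished task, or None if all are done."""
    pending = [t for o in objectives for t in o.tasks if not t.done]
    return min(pending, key=lambda t: t.priority, default=None)


objectives = [
    StrategicObjective("Close Q1 books", [
        TacticalTask("Reconcile invoices", "Excel", priority=2),
        TacticalTask("Send summary to manager", "Outlook", priority=1, done=True),
    ]),
    StrategicObjective("Onboard new vendor", [
        TacticalTask("Fill vendor form", "Word", priority=3),
    ]),
]

chosen = next_task(objectives)
print(chosen.description)
```

In this sketch the operational layer is reduced to a single selection function; a real agent would then choose a tool call for the selected task based on current state and retrieved memory.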

    (b) Sub-Agent Isolation

    Complex operations, such as GUI automation or research, are isolated into modular sub-agents. Each sub-agent operates in its own context scope and returns only structured results to the host agent, which prevents cross-task memory contamination.
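The isolation idea can be sketched as a function whose working context never leaves its own scope. Everything here is hypothetical (`SubAgentResult`, `run_sub_agent`, and the placeholder steps); the point is only that the host receives a compact structured result rather than the sub-agent's intermediate context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubAgentResult:
    """The only data handed back to the host agent (hypothetical schema)."""
    task_id: str
    success: bool
    summary: str


def run_sub_agent(task_id, instructions):
    # Private context: it lives only inside this call and is discarded afterwards.
    local_context = [f"instruction: {instructions}"]
    local_context.append("step 1: opened application")  # placeholder for real work
    local_context.append("step 2: performed action")    # placeholder for real work
    steps = len(local_context) - 1
    return SubAgentResult(task_id, True, f"completed in {steps} steps")


host_context = []
result = run_sub_agent("T1", "update the spreadsheet")
host_context.append(result)  # the host never sees the sub-agent's intermediate steps
print(host_context)
```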

    (c) Tiered Memory Architecture

    The system utilizes a three-layer memory structure to manage state:

    • Working Memory: Intended for immediate reasoning, this layer resets each cycle.
    • Structured Long-Term Memory (LTM): Stores typed artifacts such as plans, summaries, and reflections.
    • Semantic Memory: Uses Mem0 to support similarity-based retrieval over unstructured past context using embeddings.
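The three layers can be sketched as a single container with different lifetimes per layer. This is a toy stand-in, not the CORPGEN implementation: the semantic layer below uses simple word overlap instead of Mem0 and embeddings, and all method names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class TieredMemory:
    working: list = field(default_factory=list)    # cleared every cycle
    long_term: dict = field(default_factory=dict)  # artifact type -> list of entries
    semantic: list = field(default_factory=list)   # unstructured past context

    def start_cycle(self):
        """Working memory resets at the start of each cycle."""
        self.working.clear()

    def store_artifact(self, kind, content):
        """Structured LTM keeps typed artifacts such as plans or reflections."""
        self.long_term.setdefault(kind, []).append(content)

    def remember(self, text):
        self.semantic.append(text)

    def recall(self, query, k=1):
        """Toy similarity: rank stored texts by shared words with the query."""
        q = set(query.lower().split())
        ranked = sorted(
            self.semantic,
            key=lambda t: len(q & set(t.lower().split())),
            reverse=True,
        )
        return ranked[:k]


mem = TieredMemory()
mem.working.append("thinking about invoice totals")
mem.store_artifact("plan", "Reconcile invoices before Friday")
mem.remember("invoice sent to finance team")
mem.remember("meeting notes for design review")
mem.start_cycle()
print(mem.recall("finance invoice"))
```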

    (d) Adaptive Summarization

    To bound context growth, CORPGEN employs rule-based compression. When context length exceeds 4,000 tokens, ‘critical content’ (such as tool calls and state changes) is preserved verbatim, while ‘routine content’ (intermediate reasoning) is compressed into structured summaries.
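A minimal sketch of this rule follows, under stated assumptions: tokens are approximated by whitespace-separated words, entry kinds are labeled by hand, and the "summary" is a placeholder string rather than a real structured summary. Only the 4,000-token threshold and the critical/routine split come from the description above.

```python
TOKEN_LIMIT = 4000  # threshold quoted in the article
CRITICAL_KINDS = {"tool_call", "state_change"}  # kept verbatim


def count_tokens(text):
    # Assumption: whitespace word count as a crude stand-in for a real tokenizer.
    return len(text.split())


def compress(entries, limit=TOKEN_LIMIT):
    """entries is a list of (kind, text) pairs in chronological order."""
    total = sum(count_tokens(text) for _, text in entries)
    if total <= limit:
        return list(entries)  # under the limit: nothing to compress

    compressed, routine = [], []

    def flush():
        # Replace a run of routine entries with one placeholder summary entry.
        if routine:
            tokens = sum(count_tokens(t) for t in routine)
            compressed.append(
                ("summary", f"[{len(routine)} reasoning entries, {tokens} tokens, compressed]")
            )
            routine.clear()

    for kind, text in entries:
        if kind in CRITICAL_KINDS:
            flush()
            compressed.append((kind, text))  # critical content preserved verbatim
        else:
            routine.append(text)
    flush()
    return compressed


history = [
    ("reasoning", "a b c"),
    ("tool_call", "click(ok)"),
    ("reasoning", "d e"),
    ("reasoning", "f"),
]
small = compress(history, limit=3)  # tiny limit so the example triggers compression
print(small)
```

The order of critical entries is preserved, so a later step can still replay tool calls and state changes in sequence.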

    Experimental Results and Learning

    Across three CUA backends (UFO2, OpenAI CUA, and hierarchical), CORPGEN achieved up to a 3.5x improvement over baselines, reaching a 15.2% completion rate compared to 4.3% for standalone UFO2 at 100% load.
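The "up to 3.5x" figure can be checked directly from the two completion rates quoted above; the variable names are only for illustration.

```python
corpgen_rate = 15.2   # % completion with CORPGEN at 100% load
baseline_rate = 4.3   # % completion for standalone UFO2 at 100% load

improvement = corpgen_rate / baseline_rate
print(f"{improvement:.2f}x")  # roughly 3.5x, matching the reported figure
```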

    Ablation studies indicate that experiential learning provides the largest performance gains. This mechanism distills successful task executions into canonical trajectories, which are then indexed with FAISS. At execution time, similar trajectories are retrieved as few-shot examples to bias action selection toward validated patterns.
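The retrieval step can be illustrated with a small, dependency-free sketch. It deliberately swaps out the real components: bag-of-words counts stand in for learned embeddings, and a brute-force cosine search stands in for FAISS. The `TrajectoryStore` class and the example trajectories are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    # Assumption: bag-of-words counts instead of a learned embedding model.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class TrajectoryStore:
    """Brute-force stand-in for a vector index of successful trajectories."""

    def __init__(self):
        self._items = []

    def add(self, task, steps):
        self._items.append((embed(task), task, steps))

    def retrieve(self, query, k=2):
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [(task, steps) for _, task, steps in ranked[:k]]


store = TrajectoryStore()
store.add("send weekly report by email", ["open_mail", "attach_file", "send"])
store.add("update budget spreadsheet", ["open_sheet", "edit_cells", "save"])

examples = store.retrieve("email the weekly report", k=1)
print(examples)  # the retrieved trajectory would be placed in the prompt as a few-shot example
```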

    The research team observed a significant discrepancy in evaluation methods. Artifact-based judgment (inspecting generated files and outputs) achieved a 90% agreement rate with human labels. In contrast, trace-based LLM judgment (relying on screenshots and execution logs) achieved only 40% agreement. This suggests that current benchmarks may systematically underestimate agent performance by relying on limited visual traces rather than the actual artifacts produced.
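An agreement rate of this kind is simply the fraction of tasks on which a judge's verdict matches the human label. The sketch below uses made-up toy labels (not data from the paper) to show the calculation; the function name is an assumption.

```python
def agreement_rate(judge_labels, human_labels):
    """Fraction of items where the judge's pass/fail verdict equals the human label."""
    if len(judge_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not human_labels:
        raise ValueError("at least one label is required")
    matches = sum(j == h for j, h in zip(judge_labels, human_labels))
    return matches / len(human_labels)


# Toy labels for illustration only: True means "task completed".
human = [True, True, False, True, False]
artifact_judge = [True, True, False, True, True]
trace_judge = [False, True, True, False, False]

print(agreement_rate(artifact_judge, human))
print(agreement_rate(trace_judge, human))
```

In the toy data the trace-based judge marks two completed tasks as failures, which is the kind of disagreement that would make measured performance look lower than it is.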

    Key Takeaways

    • Identification of Multi-Horizon Task Environments (MHTEs): The research team defines a new class of problems called MHTEs, where agents must manage dozens of interleaved, long-horizon tasks (45+ tasks, 500-1500+ steps) within a single persistent context. This differs from traditional benchmarks that evaluate single tasks in isolation.
    • Discovery of Catastrophic Performance Degradation: Standard computer-using agents (CUAs) experience a ‘catastrophic’ drop in performance when task load increases, with completion rates falling from 16.7% at 25% load to 8.7% at 100% load.
    • Four Fundamental Failure Modes: The researchers identified why current agents fail under load: context saturation (O(N) growth), memory interference (task conflation), dependency complexity (managing Directed Acyclic Graphs), and reprioritization overhead (O(N) decision complexity).
    • Architectural Mitigation via CORPGEN: The CORPGEN framework addresses these failures through four core mechanisms: hierarchical planning for goal alignment, sub-agent isolation to prevent memory contamination, tiered memory (working, structured, and semantic), and adaptive summarization to manage token limits.
    • Significant Performance Gains through Experiential Learning: Evaluation across multiple backends showed that CORPGEN can improve performance by up to 3.5x over baselines. Ablation studies revealed that experiential learning—reusing verified successful trajectories—provides the largest performance boost among all architectural components.



    Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.


