    How to Build Portable, In-Database Feature Engineering Pipelines with Ibis Using Lazy Python APIs and DuckDB Execution

    By Naveed Ahmad, January 10, 2026


    In this tutorial, we demonstrate how we use Ibis to build a portable, in-database feature engineering pipeline that looks and feels like Pandas but executes entirely inside the database. We show how we connect to DuckDB, register data safely inside the backend, and define complex transformations using window functions and aggregations without ever pulling raw data into local memory. By keeping all transformations lazy and backend-agnostic, we demonstrate how to write analytics code once in Python and rely on Ibis to translate it into efficient SQL.

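    # Notebook shell command: install Ibis with the DuckDB backend and example datasets.
    !pip -q install "ibis-framework[duckdb,examples]" duckdb pyarrow pandas

    import ibis
    from ibis import _  # deferred expression helper used inside the pipeline

    print("Ibis version:", ibis.__version__)

    # Open an in-memory DuckDB database and show results when expressions are displayed.
    con = ibis.duckdb.connect()
    ibis.options.interactive = True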

    We install the required libraries and initialize the Ibis environment. We open a DuckDB connection and enable interactive mode, which executes an expression only when we display it, so every subsequent operation remains lazy and backend-driven.
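To make the idea of lazy, backend-driven execution concrete, here is a toy sketch of a deferred query object that records operations and only produces SQL when asked. The `LazyTable` class and its methods are hypothetical names invented for this illustration. This is not how Ibis is implemented; it only mirrors the general pattern of building an expression first and compiling it later.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyTable:
    """Toy deferred query (hypothetical, not part of Ibis): records steps, runs nothing."""

    name: str
    filters: tuple = ()
    columns: tuple = ("*",)

    def filter(self, condition: str) -> "LazyTable":
        # Return a new object with one more recorded condition; no data is touched.
        return LazyTable(self.name, self.filters + (condition,), self.columns)

    def select(self, *columns: str) -> "LazyTable":
        # Record the projection; still nothing is executed.
        return LazyTable(self.name, self.filters, columns)

    def compile(self) -> str:
        # Translate the recorded steps into a single SQL string.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.name}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        return sql


expr = (
    LazyTable("penguins")
    .filter("body_mass_g IS NOT NULL")
    .select("species", "body_mass_g")
)
sql = expr.compile()
print(sql)
```

Because each method returns a new immutable object, intermediate expressions can be reused safely, which is the property that lets one pipeline definition be compiled for different backends.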

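    # Load the example dataset, preferring to fetch it directly into our DuckDB connection.
    try:
        base_expr = ibis.examples.penguins.fetch(backend=con)
    except TypeError:
        # Older Ibis versions may not accept the backend argument.
        base_expr = ibis.examples.penguins.fetch()

    # Register the data as a named table in the DuckDB catalog if it is not there yet.
    if "penguins" not in con.list_tables():
        try:
            con.create_table("penguins", base_expr, overwrite=True)
        except Exception:
            # Fallback: materialize locally, then load the result into DuckDB.
            con.create_table("penguins", base_expr.execute(), overwrite=True)

    t = con.table("penguins")
    print(t.schema())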

    We load the Penguins dataset and explicitly register it in the DuckDB catalog so that it is available for SQL execution. We verify the table schema and confirm that the data now lives inside the database rather than in local memory.

    def penguin_feature_pipeline(penguins):
        # Derived row-level features.
        base = penguins.mutate(
            bill_ratio=_.bill_length_mm / _.bill_depth_mm,
            is_male=(_.sex == "male").ifelse(1, 0),
        )

        # Drop rows with missing values in any column the features depend on.
        cleaned = base.filter(
            _.bill_length_mm.notnull()
            & _.bill_depth_mm.notnull()
            & _.body_mass_g.notnull()
            & _.flipper_length_mm.notnull()
            & _.species.notnull()
            & _.island.notnull()
            & _.year.notnull()
        )

        # Window over all rows of a species.
        w_species = ibis.window(group_by=[cleaned.species])
        # Row-based frame: the current row plus the two preceding rows per island,
        # ordered by year (rows, not calendar years).
        w_island_year = ibis.window(
            group_by=[cleaned.island],
            order_by=[cleaned.year],
            preceding=2,
            following=0,
        )

        feat = cleaned.mutate(
            species_avg_mass=cleaned.body_mass_g.mean().over(w_species),
            species_std_mass=cleaned.body_mass_g.std().over(w_species),
            # Z-score of body mass within each species.
            mass_z=(
                cleaned.body_mass_g
                - cleaned.body_mass_g.mean().over(w_species)
            ) / cleaned.body_mass_g.std().over(w_species),
            island_mass_rank=cleaned.body_mass_g.rank().over(
                ibis.window(group_by=[cleaned.island])
            ),
            rolling_3yr_island_avg_mass=cleaned.body_mass_g.mean().over(
                w_island_year
            ),
        )

        # Aggregate to one row per (species, island, year).
        return feat.group_by(["species", "island", "year"]).agg(
            n=feat.count(),
            avg_mass=feat.body_mass_g.mean(),
            avg_flipper=feat.flipper_length_mm.mean(),
            avg_bill_ratio=feat.bill_ratio.mean(),
            avg_mass_z=feat.mass_z.mean(),
            avg_rolling_3yr_mass=feat.rolling_3yr_island_avg_mass.mean(),
            pct_male=feat.is_male.mean(),
        ).order_by(["species", "island", "year"])

    We define a reusable feature engineering pipeline using pure Ibis expressions. We compute derived features, apply data cleaning, and use window functions and grouped aggregations to build advanced, database-native features while keeping the entire pipeline lazy.
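As a sanity check on what the per-species z-score feature computes, the following plain-Python sketch reproduces the same arithmetic on a handful of made-up records. The data values are hypothetical, and we assume the sample standard deviation, which we believe matches the default of `std()` in Ibis; treat this as a reference calculation rather than a replacement for the in-database version.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical toy records (species, body_mass_g); not the real penguins data.
records = [
    ("Adelie", 3500.0),
    ("Adelie", 3700.0),
    ("Adelie", 3900.0),
    ("Gentoo", 5000.0),
    ("Gentoo", 5400.0),
]

# Group masses by species, mirroring a window partitioned by species.
by_species = defaultdict(list)
for species, mass in records:
    by_species[species].append(mass)

# Per-species mean and sample standard deviation (assumed to match the pipeline).
stats = {s: (mean(v), stdev(v)) for s, v in by_species.items()}

# Each row keeps its identity and gains a z-score relative to its own species.
mass_z = [(s, (m - stats[s][0]) / stats[s][1]) for s, m in records]

for species, z in mass_z:
    print(species, round(z, 4))
```

Unlike a grouped aggregation, the window form returns one value per input row, which is why the pipeline can still average `mass_z` per group afterwards.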

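    features = penguin_feature_pipeline(t)
    # Print the DuckDB SQL that Ibis generates for the whole pipeline.
    print(con.compile(features))

    # Execute in DuckDB and bring back only the aggregated result.
    try:
        df = features.to_pandas()
    except Exception:
        df = features.execute()

    # display() is provided by the notebook environment.
    display(df.head())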

    We invoke the feature pipeline and compile it into DuckDB SQL to validate that all transformations are pushed down to the database. We then run the pipeline and return only the final aggregated results for inspection.
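When reading the compiled SQL, it helps to know what a row-based rolling window looks like. The sketch below runs a hand-written window query against SQLite, used here only because it ships with Python (it assumes a SQLite build with window function support, 3.25 or later). The table `toy` and its rows are hypothetical, and the SQL that Ibis generates for DuckDB will differ in its details.

```python
import sqlite3

# Hypothetical toy rows: (island, year, body_mass_g). Not the real penguins data.
rows = [
    ("Biscoe", 2007, 3000.0),
    ("Biscoe", 2008, 4000.0),
    ("Biscoe", 2009, 5000.0),
    ("Dream", 2007, 3500.0),
    ("Dream", 2008, 3700.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE toy (island TEXT, year INTEGER, body_mass_g REAL)")
conn.executemany("INSERT INTO toy VALUES (?, ?, ?)", rows)

# Average over the current row and the two preceding rows within each island.
query = """
SELECT island, year,
       AVG(body_mass_g) OVER (
           PARTITION BY island
           ORDER BY year
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS rolling_avg_mass
FROM toy
ORDER BY island, year
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
conn.close()
```

The frame counts rows rather than years, so when an island has several rows per year the average covers three rows, not three calendar years.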

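    # Materialize the feature expression as a table inside DuckDB.
    con.create_table("penguin_features", features, overwrite=True)

    feat_tbl = con.table("penguin_features")

    # Preview the first rows; only these ten rows leave the database.
    try:
        preview = feat_tbl.limit(10).to_pandas()
    except Exception:
        preview = feat_tbl.limit(10).execute()

    display(preview)

    # Export the table to Parquet with DuckDB's COPY statement (Colab-style path).
    out_path = "/content/penguin_features.parquet"
    con.raw_sql(f"COPY penguin_features TO '{out_path}' (FORMAT PARQUET);")
    print(out_path)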

    We materialize the engineered features as a table directly inside DuckDB and query it lazily for verification. We also export the results to a Parquet file, demonstrating how we can hand off database-computed features to downstream analytics or machine learning workflows.

    In conclusion, we built, compiled, and executed an advanced feature engineering workflow fully inside DuckDB using Ibis. We demonstrated how to inspect the generated SQL, materialized results directly in the database, and exported them for downstream use while preserving portability across analytical backends. This approach reinforces the core idea behind Ibis: we keep computation close to the data, minimize unnecessary data movement, and maintain a single, reusable Python codebase that scales from local experimentation to production databases.




    Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.


