Releases · pola-rs/polars

21 May 11:05

github-actions

rs-0.48.1

5e1b4b7

Rust Polars 0.48.1

🚀 Performance improvements

Switch eligible casts to non-strict in optimizer (#22850)

🐞 Bug fixes

Fix RuntimeError when serializing the same DataFrame from multiple threads (#22844)

📦 Build system

Fix building polars-lazy with certain features (#22846)
Add missing features (#22839)

🛠️ Other improvements

Update Rust Polars versions (#22854)

Thank you to all our contributors for making this release possible!
@JakubValtar, @bschoenmaeckers, @nameexhaustion and @stijnherfst

21 May 13:33

github-actions

py-1.30.0

ee0903b

Python Polars 1.30.0 Latest

🚀 Performance improvements

Switch eligible casts to non-strict in optimizer (#22850)
Allow predicate passing set_sorted (#22797)
Increase default cross-file parallelism limit for new-streaming multiscan (#22700)
Add elementwise execution mode for list.eval (#22715)
Support optimised init from non-dict Mapping objects in from_records and frame/series constructors (#22638)
Add streaming cross-join node (#22581)
Switch off maintain_order in group-by followed by sort (#22492)

✨ Enhancements

Load AWS endpoint_url using boto3 (#22851)
Implemented list.filter (#22749)
Support BinaryOffset in search_sorted (#22786)
Add nulls_equal flag to list/arr.contains (#22773)
Implement LazyFrame.match_to_schema (#22726)
Improved time-string parsing and inference (generally, and via the SQL interface) (#22606)
Allow for .over to be called without partition_by (#22712)
Support AnyValue translation from PyMapping values (#22722)
Support optimised init from non-dict Mapping objects in from_records and frame/series constructors (#22638)
Support inference of Int128 dtype from databases that support it (#22682)
Add options to write Parquet field metadata (#22652)
Add cast_options parameter to control type casting in scan_parquet (#22617)
Allow casting List<UInt8> to Binary (#22611)
Allow setting of regex size limit using POLARS_REGEX_SIZE_LIMIT (#22651)
Support use of literal values as "other" when evaluating Series.zip_with (#22632)
Allow reading and writing custom file-level Parquet metadata (#21806)
Support PEP702 @deprecated decorator behaviour (#22594)
Support grouping by pl.Array (#22575)
Preserve exception type and traceback for errors raised from Python (#22561)
Use fixed-width font in streaming phys plan graph (#22540)

🐞 Bug fixes

Fix RuntimeError when serializing the same DataFrame from multiple threads (#22844)
Fix map_elements predicate pushdown (#22833)
Fix reverse list type (#22832)
Don't require numpy for search_sorted (#22817)
Add type equality checking for relevant methods (#22802)
Invalid output for fill_null after when.then on structs (#22798)
Don't panic for cross join with misaligned chunking (#22799)
Panic on quantile over nulls in rolling window (#22792)
Respect BinaryOffset metadata (#22785)
Correct the output order of PartitionByKey and PartitionParted (#22778)
Fallback to non-strict casting for deprecated casts (#22760)
Clippy on new stable version (#22771)
Handle sliced out remainder for bitmaps (#22759)
Don't merge Enum categories on append (#22765)
Fix unnest() not working on empty struct columns (#22391)
Fix the default value type in Schema init (#22589)
Correct name in unnest error message (#22740)
Provide "schema" to DataFrame, even if empty JSON (#22739)
Properly account for nulls in the is_not_nan check made in drop_nans (#22707)
Incorrect result from SQL count(*) with partition by (#22728)
Fix deadlock joining scanned tables with low thread count (#22672)
Don't allow deserializing incompatible DSL (#22644)
Incorrect null dtype from binary ops in empty group_by (#22721)
Don't mark str.replace_many with Mapping as deprecated (#22697)
Gzip has maximum compression of 9, not 10 (#22685)
Fix predicate pushdown of fallible expressions (#22669)
Fix index out of bounds panic when scanning hugging face (#22661)
Panic on group_by with literal and empty rows (#22621)
Return input instead of panicking if empty subset in drop_nulls() and drop_nans() (#22469)
Bump argminmax to 0.6.3 (#22649)
DSL version deserialization endianness (#22642)
Allow Expr.round() to be called on integer dtypes (#22622)
Fix panic when filtering based on row index column in parquet (#22616)
WASM and Pyodide compile (#22613)
Resolve get() SchemaMismatch panic (#22350)
Panic in group_by_dynamic on single-row df with group_by (#22597)
Add new_streaming feature to polars crate (#22601)
Consistently use Unix epoch as origin for dt.truncate (except weekly buckets which start on Mondays) (#22592)
Fix interpolate on dtype Decimal (#22541)
CSV count rows skipped last line if file did not end with newline (#22577)
Make nested strict casting actually strict (#22497)
Make replace and replace_strict mapping use list literals (#22566)
Allow pivot on Time column (#22550)
Fix error when providing CSV schema with extra columns (#22544)
Panic on bitwise op between Series and Expr (#22527)
Multi-selector regex expansion (#22542)
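
The dt.truncate entry above (#22592) says buckets are counted from the Unix epoch. The following is a standard-library sketch of that idea only, not Polars' implementation; the function name truncate_to_bucket is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def truncate_to_bucket(ts: datetime, every: timedelta) -> datetime:
    """Floor ts to the start of its bucket, with buckets counted from the Unix epoch."""
    elapsed = ts - EPOCH
    # Floor division of two timedeltas gives the number of whole buckets since the epoch.
    n_buckets = elapsed // every
    return EPOCH + n_buckets * every


ts = datetime(1970, 1, 1, 7, 0, tzinfo=timezone.utc)
bucket_start = truncate_to_bucket(ts, timedelta(hours=5))
print(bucket_start)
```

Because 1 January 1970 was a Thursday, a purely epoch-anchored 7-day bucket would start on Thursdays, which is presumably why the entry singles out weekly buckets as starting on Mondays.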

📖 Documentation

Add pre-release policy (#22808)
Fix broken link to service account page in Polars Cloud docs (#22762)
Add match_to_schema to API reference (#22777)
Provide additional explanation and examples for the value_counts "normalize" parameter (#22756)
Rework documentation for drop/fill for nulls/nans (#22657)
Add documentation to new RoundMode parameter in round (#22555)
Add missing repeat_by to API reference, fixup list.get (#22698)
Fix non-rendering bullet points in scan_iceberg (#22694)
Improve insert_column docstring (description and examples) (#22551)
Improve join documentation (#22556)

📦 Build system

Fix building polars-lazy with certain features (#22846)
Add missing features (#22839)
Patch pyo3 to disable recompilation (#22796)

🛠️ Other improvements

Update Rust Polars versions (#22854)
Add basic smoke test for free-threaded python (#22481)
Update Polars Rust versions (#22834)
Fix nix build (#22809)
Fix flake.nix to work on macos (#22803)
Unused variables on release build (#22800)
Update cloud docs (#22624)
Fix unstable list.eval performance test (#22729)
Add proptest implementations for all Array types (#22711)
Dispatch .write_* to .lazy().sink_*(engine='in-memory') (#22582)
Move all optimization flags to QueryOptFlags (#22680)
Add test for str.replace_many (#22615)
Stabilize sink_* (#22643)
Add proptest for row-encode (#22626)
Update rust version in nix flake (#22627)
Add a nix flake with a devShell and package (#22246)
Use a wrapper struct to store time zone (#22523)
Add proptest testing for parquet decoding kernels (#22608)
Include equiprobable as valid quantile method (#22571)
Remove confusing error context calling .collect(_eager=True) (#22602)
Fix test_truncate_path test case (#22598)
Unify function flags into 1 bitset (#22573)
Display the operation behind in-memory-map (#22552)

Thank you to all our contributors for making this release possible!
@IvanIsCoding, @JakubValtar, @Julian-J-S, @LucioFranco, @MarcoGorelli, @WH-2099, @alexander-beedie, @borchero, @bschoenmaeckers, @cmdlineluser, @coastalwhite, @etiennebacher, @florian-klein, @itamarst, @kdn36, @mcrumiller, @nameexhaustion, @nikaltipar, @orlp, @pavelzw, @r-brink, @ritchie46, @stijnherfst, @teotwaki, @timkpaine and @wence-

20 May 11:07

github-actions

rs-0.48.0

bfa5e96

Rust Polars 0.48.0

💥 Breaking changes

Use a wrapper struct to store time zone (#22523)

🚀 Performance improvements

Allow predicate passing set_sorted (#22797)
Increase default cross-file parallelism limit for new-streaming multiscan (#22700)
Add elementwise execution mode for list.eval (#22715)
Support optimised init from non-dict Mapping objects in from_records and frame/series constructors (#22638)
Add streaming cross-join node (#22581)
Switch off maintain_order in group-by followed by sort (#22492)

✨ Enhancements

Format named functions (#22831)
Implemented list.filter (#22749)
Support BinaryOffset in search_sorted (#22786)
Add nulls_equal flag to list/arr.contains (#22773)
Allow named opaque functions for serde (#22734)
Implement LazyFrame.match_to_schema (#22726)
Improved time-string parsing and inference (generally, and via the SQL interface) (#22606)
Allow for .over to be called without partition_by (#22712)
Support AnyValue translation from PyMapping values (#22722)
Support optimised init from non-dict Mapping objects in from_records and frame/series constructors (#22638)
Add options to write Parquet field metadata (#22652)
Allow casting List<UInt8> to Binary (#22611)
Allow setting of regex size limit using POLARS_REGEX_SIZE_LIMIT (#22651)

🐞 Bug fixes

Fix reverse list type (#22832)
Add type equality checking for relevant methods (#22802)
Invalid output for fill_null after when.then on structs (#22798)
Don't panic for cross join with misaligned chunking (#22799)
Panic on quantile over nulls in rolling window (#22792)
Respect BinaryOffset metadata (#22785)
Correct the output order of PartitionByKey and PartitionParted (#22778)
Fallback to non-strict casting for deprecated casts (#22760)
Clippy on new stable version (#22771)
Handle sliced out remainder for bitmaps (#22759)
Don't merge Enum categories on append (#22765)
Fix unnest() not working on empty struct columns (#22391)
Correct name in unnest error message (#22740)
Properly account for nulls in the is_not_nan check made in drop_nans (#22707)
Incorrect result from SQL count(*) with partition by (#22728)
Fix deadlock joining scanned tables with low thread count (#22672)
Don't allow deserializing incompatible DSL (#22644)
Incorrect null dtype from binary ops in empty group_by (#22721)
Don't mark str.replace_many with Mapping as deprecated (#22697)
Gzip has maximum compression of 9, not 10 (#22685)
Fix predicate pushdown of fallible expressions (#22669)
Fix index out of bounds panic when scanning hugging face (#22661)
Fix polars crate not compiling when lazy feature enabled (#22655)
Panic on group_by with literal and empty rows (#22621)
Return input instead of panicking if empty subset in drop_nulls() and drop_nans() (#22469)
Bump argminmax to 0.6.3 (#22649)
DSL version deserialization endianness (#22642)
Fix nested dtype row encoding (#22557)
Allow Expr.round() to be called on integer dtypes (#22622)
Fix panic when filtering based on row index column in parquet (#22616)
WASM and Pyodide compile (#22613)
Resolve get() SchemaMismatch panic (#22350)
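
The gzip entry above (#22685) concerns the valid range of compression levels. As a rough standard-library illustration (not Polars code), the sketch below validates a level against zlib's maximum before compressing; gzip_bytes is a hypothetical helper.

```python
import gzip
import zlib

# zlib exposes its highest compression level as a constant (9).
MAX_GZIP_LEVEL = zlib.Z_BEST_COMPRESSION


def gzip_bytes(data: bytes, level: int = 6) -> bytes:
    """Compress data with gzip, rejecting levels outside 0..9."""
    if not 0 <= level <= MAX_GZIP_LEVEL:
        raise ValueError(f"gzip level must be between 0 and {MAX_GZIP_LEVEL}, got {level}")
    return gzip.compress(data, compresslevel=level)


payload = b"polars " * 1000
compressed = gzip_bytes(payload, level=9)
print(len(payload), len(compressed))
```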

📖 Documentation

Add pre-release policy (#22808)
Fix broken link to service account page in Polars Cloud docs (#22762)
Rework documentation for drop/fill for nulls/nans (#22657)

📦 Build system

Patch pyo3 to disable recompilation (#22796)

🛠️ Other improvements

Update Polars Rust versions (#22834)
Cleanup polars-python lifetimes (#22548)
Fix nix build (#22809)
Fix flake.nix to work on macos (#22803)
Remove unused dependencies in polars-arrow (#22806)
Unused variables on release build (#22800)
Update cloud docs (#22624)
Add proptest implementations for all Array types (#22711)
Dispatch .write_* to .lazy().sink_*(engine='in-memory') (#22582)
Move all optimization flags to QueryOptFlags (#22680)
Add test for str.replace_many (#22615)
Stabilize sink_* (#22643)
Add proptest for row-encode (#22626)
Emphasize PolarsDataType::get_dtype is static-only (#22648)
Use named fields for Logical (#22647)
Update rust version in nix flake (#22627)
Add a nix flake with a devShell and package (#22246)
Use a wrapper struct to store time zone (#22523)
Add proptest testing for parquet decoding kernels (#22608)

Thank you to all our contributors for making this release possible!
@IvanIsCoding, @JakubValtar, @Julian-J-S, @LucioFranco, @MarcoGorelli, @WH-2099, @alexander-beedie, @borchero, @bschoenmaeckers, @cmdlineluser, @coastalwhite, @etiennebacher, @florian-klein, @itamarst, @kdn36, @nameexhaustion, @nikaltipar, @orlp, @pavelzw, @r-brink, @ritchie46, @stijnherfst, @teotwaki, @timkpaine and @wence-

16 May 19:06

github-actions

py-1.30.0-beta.1

103f194

Python Polars 1.30.0-beta.1 Pre-release

🚀 Performance improvements

Increase default cross-file parallelism limit for new-streaming multiscan (#22700)
Add elementwise execution mode for list.eval (#22715)
Support optimised init from non-dict Mapping objects in from_records and frame/series constructors (#22638)
Add streaming cross-join node (#22581)
Switch off maintain_order in group-by followed by sort (#22492)

✨ Enhancements

Support BinaryOffset in search_sorted (#22786)
Add nulls_equal flag to list/arr.contains (#22773)
Implement LazyFrame.match_to_schema (#22726)
Improved time-string parsing and inference (generally, and via the SQL interface) (#22606)
Allow for .over to be called without partition_by (#22712)
Support AnyValue translation from PyMapping values (#22722)
Support optimised init from non-dict Mapping objects in from_records and frame/series constructors (#22638)
Support inference of Int128 dtype from databases that support it (#22682)
Add options to write Parquet field metadata (#22652)
Add cast_options parameter to control type casting in scan_parquet (#22617)
Allow casting List<UInt8> to Binary (#22611)
Allow setting of regex size limit using POLARS_REGEX_SIZE_LIMIT (#22651)
Support use of literal values as "other" when evaluating Series.zip_with (#22632)
Allow reading and writing custom file-level Parquet metadata (#21806)
Support PEP702 @deprecated decorator behaviour (#22594)
Support grouping by pl.Array (#22575)
Preserve exception type and traceback for errors raised from Python (#22561)
Use fixed-width font in streaming phys plan graph (#22540)

🐞 Bug fixes

Respect BinaryOffset metadata (#22785)
Correct the output order of PartitionByKey and PartitionParted (#22778)
Fallback to non-strict casting for deprecated casts (#22760)
Clippy on new stable version (#22771)
Handle sliced out remainder for bitmaps (#22759)
Don't merge Enum categories on append (#22765)
Fix unnest() not working on empty struct columns (#22391)
Fix the default value type in Schema init (#22589)
Correct name in unnest error message (#22740)
Provide "schema" to DataFrame, even if empty JSON (#22739)
Properly account for nulls in the is_not_nan check made in drop_nans (#22707)
Incorrect result from SQL count(*) with partition by (#22728)
Fix deadlock joining scanned tables with low thread count (#22672)
Don't allow deserializing incompatible DSL (#22644)
Incorrect null dtype from binary ops in empty group_by (#22721)
Don't mark str.replace_many with Mapping as deprecated (#22697)
Gzip has maximum compression of 9, not 10 (#22685)
Fix predicate pushdown of fallible expressions (#22669)
Fix index out of bounds panic when scanning hugging face (#22661)
Panic on group_by with literal and empty rows (#22621)
Return input instead of panicking if empty subset in drop_nulls() and drop_nans() (#22469)
Bump argminmax to 0.6.3 (#22649)
DSL version deserialization endianness (#22642)
Allow Expr.round() to be called on integer dtypes (#22622)
Fix panic when filtering based on row index column in parquet (#22616)
WASM and Pyodide compile (#22613)
Resolve get() SchemaMismatch panic (#22350)
Panic in group_by_dynamic on single-row df with group_by (#22597)
Add new_streaming feature to polars crate (#22601)
Consistently use Unix epoch as origin for dt.truncate (except weekly buckets which start on Mondays) (#22592)
Fix interpolate on dtype Decimal (#22541)
CSV count rows skipped last line if file did not end with newline (#22577)
Make nested strict casting actually strict (#22497)
Make replace and replace_strict mapping use list literals (#22566)
Allow pivot on Time column (#22550)
Fix error when providing CSV schema with extra columns (#22544)
Panic on bitwise op between Series and Expr (#22527)
Multi-selector regex expansion (#22542)
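
The CSV row-count entry above (#22577) describes an off-by-one when a file does not end with a newline. The sketch below shows the general pitfall with plain bytes. It is a simplification that ignores headers and quoted fields containing newlines, and count_rows is a hypothetical name, not a Polars function.

```python
def count_rows(data: bytes) -> int:
    """Count lines in a byte buffer, including a final line without a newline."""
    if not data:
        return 0
    n = data.count(b"\n")
    # The last line still counts when the buffer does not end with a newline.
    if not data.endswith(b"\n"):
        n += 1
    return n


with_newline = b"a,b\n1,2\n3,4\n"
without_newline = b"a,b\n1,2\n3,4"
print(count_rows(with_newline), count_rows(without_newline))
```

Counting newline bytes alone would report one line fewer for the second buffer, which matches the symptom the entry describes.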

📖 Documentation

Fix broken link to service account page in Polars Cloud docs (#22762)
Add match_to_schema to API reference (#22777)
Provide additional explanation and examples for the value_counts "normalize" parameter (#22756)
Rework documentation for drop/fill for nulls/nans (#22657)
Add documentation to new RoundMode parameter in round (#22555)
Add missing repeat_by to API reference, fixup list.get (#22698)
Fix non-rendering bullet points in scan_iceberg (#22694)
Improve insert_column docstring (description and examples) (#22551)
Improve join documentation (#22556)

🛠️ Other improvements

Update cloud docs (#22624)
Fix unstable list.eval performance test (#22729)
Add proptest implementations for all Array types (#22711)
Dispatch .write_* to .lazy().sink_*(engine='in-memory') (#22582)
Move all optimization flags to QueryOptFlags (#22680)
Add test for str.replace_many (#22615)
Stabilize sink_* (#22643)
Add proptest for row-encode (#22626)
Update rust version in nix flake (#22627)
Add a nix flake with a devShell and package (#22246)
Use a wrapper struct to store time zone (#22523)
Add proptest testing for parquet decoding kernels (#22608)
Include equiprobable as valid quantile method (#22571)
Remove confusing error context calling .collect(_eager=True) (#22602)
Fix test_truncate_path test case (#22598)
Unify function flags into 1 bitset (#22573)
Display the operation behind in-memory-map (#22552)

Thank you to all our contributors for making this release possible!
@JakubValtar, @Julian-J-S, @MarcoGorelli, @WH-2099, @alexander-beedie, @borchero, @cmdlineluser, @coastalwhite, @etiennebacher, @florian-klein, @itamarst, @kdn36, @mcrumiller, @nameexhaustion, @nikaltipar, @orlp, @pavelzw, @r-brink, @ritchie46, @stijnherfst, @teotwaki, @timkpaine and @wence-

05 May 13:13

github-actions

rs-0.47.0

ba3be4e

Rust Polars 0.47.1

🏆 Highlights

Enable common subplan elimination across plans in collect_all (#21747)
Add lazy sinks (#21733)
Add PartitionByKey for new streaming sinks (#21689)
Enable new streaming memory sinks by default (#21589)

💥 Breaking changes

Make bottom interval closed in hist (#22090)

🚀 Performance improvements

Avoid alloc_zeroed in decompression (#22460)
Lower Expr.(n_)unique to group_by on streaming engine (#22420)
Chunk huge munmap calls (#22414)
Add single-key variants of streaming group_by (#22409)
Improve accumulate_dataframes_vertical performance (#22399)
Optimize rolling_quantile with varying window sizes (#22353)
Dedicated rolling_skew kernel (#22333)
Call large munmap's in background thread (#22329)
New streaming group_by implementation (#22285)
Patch jemalloc to not purge huge allocs eagerly if we have background threads (#22318)
Turn on parallel=prefiltered by default for new streaming (#22190)
Add CSE to streaming groupby (#22196)
Speed-up new streaming predicate filtering (#22179)
Speedup new-streaming file row count (#22169)
Fix quadratic behavior when casting Enums (#22008)
Lower is_in to bitmap-output semi-join in new streaming engine (#21948)
Fast path for empty inner join (#21965)
Add native semi/anti join in new streaming engine (#21937)
Cache regex compilation globally (#21929)
Use views for binary hash tables and add single-key binary variant (#21872)
Avoid rechunking in gather (#21876)
Switch ahash for foldhash (#21852)
Put THP behind feature flag (#21853)
Enable THP by default (#21829)
Improve join performance for expanding joins (#21821)
Use binary_search instead of contains in business-day functions (#21775)
Implement linear-time rolling_min/max (#21770)
Improve InputIndependentSelect by delegating to InMemorySourceNode (#21767)
Enable common subplan elimination across plans in collect_all (#21747)
Allow elementwise functions in recursive lowering (#21653)
Add primitive single-key hashtable to new-streaming join (#21712)
Remove unnecessary black_boxes in Kahan summation (#21679)
Box large enum variants (#21657)
Improve join performance for new-streaming engine (#21620)
Pre-fill caches (#21646)
Optimize only a single cache input (#21644)
Collect parquet statistics in one contiguous buffer (#21632)
Update Cargo.lock (mainly for zstd 1.5.7) (#21612)
Don't maintain order when maintain_order=False in new streaming sinks (#21586)
Pre-sort groups in group-by-dynamic (#21569)
Provide a fallback skip batch predicate for constant batches (#21477)
Parallelize the passing in new streaming multiscan (#21430)
Toggle projection pushdown for eager rolling (#21405)
Fix pathologic rolling + group-by performance and memory explosion (#21403)
Add sampling to new-streaming equi join to decide between build/probe side (#21197)
Reduce sharing in stringview arrays in new-streaming equijoin (#21129)
Implement native Expr.count() on new-streaming (#21126)
Speed up list operations that use amortized_iter() (#20964)
Use Cow as output for rechunk and add rechunk_mut (#21116)
Reduce arrow slice mmap overhead (#21113)
Reduce conversion cost in chunked string gather (#21112)
Enable prefiltered by default for new streaming (#21109)
Enable parquet column expressions for streaming (#21101)
Deduplicate buffers again in stringview concat kernel (#21098)
Add dedicated concatenate kernels (#21080)
Rechunk only once during join probe gather (#21072)
Speed up from_pandas when converting frame with multi-index columns (#21063)
Change default memory prefetch to MADV_WILLNEED (#21056)
Remove cast to boolean after comparison in optimizer (#21022)
Split last rowgroup among all threads in new-streaming parquet reader (#21027)
Recombine into larger morsels in new-streaming join (#21008)
Improve list.min and list.max performance for logical types (#20972)
Ensure count query select minimal columns (#20923)
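
For readers curious how a rolling maximum can run in linear time, as in the rolling_min/max entry above (#21770), one well-known approach is a monotonic deque. The sketch below shows that textbook technique in plain Python. Whether Polars uses this exact approach is an assumption, and rolling_max here is an illustrative helper rather than a Polars API.

```python
from collections import deque


def rolling_max(values, window):
    """Rolling maximum over fixed-size windows in O(n) total time."""
    if window <= 0:
        raise ValueError("window must be positive")
    candidates = deque()  # indices whose values are in strictly decreasing order
    out = []
    for i, v in enumerate(values):
        # Drop candidates that can never be the maximum again.
        while candidates and values[candidates[-1]] <= v:
            candidates.pop()
        candidates.append(i)
        # Drop the front candidate once it slides out of the window.
        if candidates[0] <= i - window:
            candidates.popleft()
        # Emit None until the first full window is available.
        out.append(values[candidates[0]] if i >= window - 1 else None)
    return out


maxima = rolling_max([1, 3, 2, 5, 4, 1], 3)
print(maxima)  # [None, None, 3, 5, 5, 5]
```

Each index is pushed and popped at most once, so the total work is proportional to the input length regardless of the window size.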

✨ Enhancements

Support grouping by pl.Array (#22575)
Preserve exception type and traceback for errors raised from Python (#22561)
Use fixed-width font in streaming phys plan graph (#22540)
Highlight nodes in streaming phys plan graph (#22535)
Support BinaryOffset serde (#22528)
Show physical stage graph (#22491)
Add structure for dispatching iceberg to native scans (#22405)
Add SQL support for checking array values with IN and NOT IN expressions (#22487)
Add more IRBuilder utils (#22482)
Support DataFrame and Series init from torch Tensor objects (#22177)
Add RoundMode for Decimal and Float (#22248)
Inform users that IO error path file name can be expanded with POLARS_VERBOSE=1 (#22427)
Make streaming dispatch public (#22347)
Add rolling_kurtosis (#22335)
Support Cast in IO plugin predicates (#22317)
Add .sort(nulls_last=True) to booleans, categoricals and enums (#22300)
Add rolling min/max for temporals (#22271)
Support literal:list agg (#22249)
Support implode + agg (#22230)
Dispatch scans to new-streaming by default (#22153)
Improved expression autocomplete for IPython, Jupyter, and Marimo (#22221)
Expose FunctionIR::FastCount in the python visitor (#22195)
Add SPLIT_PART string function to the SQL interface (#22158)
Allow scalar expr in Expr.diff (#22142)
Support additional unsigned int aliases in the SQL interface (#22127)
Add STRING_TO_ARRAY function to the SQL interface (#22129)
Add dt.is_business_day (#21776)
Add support for Int128 parsing/recognition to the SQL interface (#22104)
Allow sinking to abstract python io and fs classes (#21987)
Add add_alp_optimize_exprs to IRBuilder (#22061)
Add cat.slice (#21971)
Support growing schema if line length increases during csv schema inference (#21979)
Replace thread unsafe GilOnceCell with Mutex (#21927)
Support modified dsl in file cache (#21907)
Add support for io-plugins in new-streaming (#21870)
Add PartitionParted (#21788)
Add DoubleEndedIterator for CatIter (#21816)
Minor improvements to EXPLAIN plan output (#21822)
Add polars_testing folder with relevant files and add_series_equal!() functionality (#21722)
Allow using repeat_by with (nested) lists and structs (#21206)
Add support for rolling_(sum/min/max) for booleans through casting (#21748)
Support multi-column sort for all nested types and nested search-sorted (#21743)
Add lazy sinks (#21733)
Add PartitionByKey for new streaming sinks (#21689)
Fix replace flags (#21731)
Add mkdir flag to sinks (#21717)
Enable joins on list/array dtypes (#21687)
Add a config option to specify the default engine to attempt to use during lazyframe calls (#20717)
Support all elementwise functions in IO plugin predicates (#21705)
Stabilize Enum datatype (#21686)
Support Polars int128 in from arrow (#21688)
Use FFI to read dataframe instead of transmute (#21673)
Enable new streaming memory sinks by default (#21589)
Cloud support for new-streaming scans and sinks (#21621)
Add len method to arr (#21618)
Closeable files on unix (#21588)
Add new PartitionMaxSize sink (#21573)
Implement unpack_dtypes() functionality with unit tests (#21574)
Support engine callback for LazyFrame.profile (#21534)
Dispatch new-streaming CSV negative slice to separate node (#21579)
Add NDJSON source to new streaming engine (#21562)
Add lossy decoding to read_csv for non-utf8 encodings (#21433)
Add 'nulls_equal' parameter to is_in (#21426)
Improve numeric stability rolling_{std, var, cov, corr} (#21528)
IR Serde cross-filter (#21488)
Support writing Time type in json (#21454)
Activate all optimizations in sinks (#21462)
Add AssertionError variant to PolarsError in polars-error (#21460)
Pass filter to inner readers in multiscan new streaming (#21436)
Implement i128 -> str cast (#21411)
Version DSL (#21383)
Make user facing binary formats mostly self describing (#21380)
Filter hive files using predicates in new streaming (#21372)
Add negative slicing to new streaming multiscan (#21219)
Publicize Expr DSL Function enums (#20421)
Implement sorted flags for struct series (#21290)
Support reading arrow Map type from Delta (#21330)
Add a dedicated remove method for DataFrame and LazyFrame (#21259)
Expose include_file_paths to python visitor (#21279)
Implement merge_sorted for struct (#21205)
Add positive slice for new streaming MultiScan (#21191)
Don't take in rewriting visitor (#21212)
Add SQL support for the DELETE statement (#21190)
Add row index to new streaming multiscan (#21169)
Improve DataFrame fmt in explain (#21158)
Add projection pushdown to new streaming multiscan (#21139)
Implement join on struct dtype (#21093)
Use unique temporary directory path per user and restrict permissions (#21125)
Enable new streaming multiscan for CSV (#21124)
Add POLARS_MAX_CONCURRENT_SCANS environment variable to multiscan for new streaming (#21127)
Multi/Hive scans in new streaming engine (#21011)
Add linear_spaces (#20941)
Implement merge_sorted for binary (#21045)
Hold string cache in new streaming engine and fix row-encoding (#21039)
Support max/min method for Time dtype (#19815)
Implement a streaming merge sorted node (#20960)
Automatically use temporary credentials API for scanning Unity catalog tables (#21020)
Add negative slice support to new-streaming engine (#21001)
Allow for more RG skipping by rewriting expr in planner (#20828)
Rename catalog schema to namespace (#20993)
Add functionality to create and delete catalogs, tables and schemas to Unity catalog client (#20956)
Improved support for KeyboardInterrupts (#20961...

Contributors

orlp, GaelVaroquaux, and 72 other contributors

30 Apr 20:57

github-actions

py-1.29.0

a0e3e38

Python Polars 1.29.0

🚀 Performance improvements

Avoid alloc_zeroed in decompression (#22460)

✨ Enhancements

Highlight nodes in streaming phys plan graph (#22535)
Show physical stage graph (#22491)
Add structure for dispatching iceberg to native scans (#22405)
Add SQL support for checking array values with IN and NOT IN expressions (#22487)
Support DataFrame and Series init from torch Tensor objects (#22177)
Add RoundMode for Decimal and Float (#22248)
Inform users that IO error path file name can be expanded with POLARS_VERBOSE=1 (#22427)
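
The RoundMode entry above (#22248) is about how ties are handled when rounding. The difference between two common tie-breaking rules can be seen with Python's standard decimal module. This sketch illustrates the rounding rules themselves and does not call any Polars API; round_ties is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_ties(value: str, rounding: str) -> Decimal:
    """Round a decimal string to an integer using the given tie-breaking rule."""
    return Decimal(value).quantize(Decimal("1"), rounding=rounding)


ties = ["0.5", "1.5", "2.5", "3.5"]

# Ties go to the nearest even integer ("banker's rounding").
half_even = [int(round_ties(v, ROUND_HALF_EVEN)) for v in ties]

# Ties go away from zero (decimal calls this ROUND_HALF_UP).
half_away = [int(round_ties(v, ROUND_HALF_UP)) for v in ties]

print(half_even)
print(half_away)
```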

🐞 Bug fixes

Streaming outer join coalesce bug (#22530)
Remove redundant print statement in assert_frame_schema_equal() (#22529)
Bug in .unique() followed by .slice() (#22471)
Fix error reading parquet with datetimes written by pandas (#22524)
Fix schema_overrides not taking effect in NDJSON (#22521)
Fold flags and verify scalar correctness in apply (#22519)
Invalid values were triggering panics instead of returning null in dt.to_date / dt.to_datetime (#22500)
Ensure numpy isinstance check is lazy (avoid forcing the dependency) (#22486)
Incorrectly dropped sort after unique for some queries (#22489)
Fix incorrect ternary agg state with mixed columns and scalars (#22496)
Make replace and replace_strict properly elementwise (#22465)
Fix index out of bounds panic on parquet prefiltering (#22458)
Integer underflow when checking parquet UTF-8 (#22472)
Add implementation for array.get with idx overflow (#22449)
Deprecate str. collection functions with flat strings and mark as elementwise (#22461)
Deprecate flat list.gather and mark as elementwise (#22456)
Inform users that IO error path file name can be expanded with POLARS_VERBOSE=1 (#22427)

📖 Documentation

Fix typo in structs page (#22504)

🛠️ Other improvements

Don't store name/dtype in grouper (#22525)
Add structure for dispatching iceberg to native scans (#22405)
Remove unused reduction code (#22462)
Pin to explicit macOS version in code coverage (#22432)

Thank you to all our contributors for making this release possible!
@AH-Merii, @JakubValtar, @Julian-J-S, @Kevin-Patyk, @Liyixin95, @MarcoGorelli, @Matt711, @alexander-beedie, @brianmakesthings, @coastalwhite, @nameexhaustion, @orlp and @ritchie46

27 Apr 15:33

github-actions

py-1.28.1

506319e

Python Polars 1.28.1

🐞 Bug fixes

Reading of reencoded categorical in Parquet (#22436)
Last thread in parquet predicate filter oob (#22429)

📖 Documentation

Fix a few typos in the new "multiplexing" page (#22434)
Add multiplexing page (#22426)

📦 Build system

Update pyo3 and numpy crates to version 0.24 (#22015)

🛠️ Other improvements

Add test for implode + over (#22437)
Fix CI by removing use_legacy_dataset (#22438)
Only use pytorch index-url for pytorch package (#22355)

Thank you to all our contributors for making this release possible!
@bschoenmaeckers, @coastalwhite, @etiennebacher, @mcrumiller and @ritchie46

26 Apr 09:02

github-actions

py-1.28.0

8d30e79

Python Polars 1.28.0

🚀 Performance improvements

Lower Expr.(n_)unique to group_by on streaming engine (#22420)
Chunk huge munmap calls (#22414)
Add single-key variants of streaming group_by (#22409)
Improve accumulate_dataframes_vertical performance (#22399)
Optimize rolling_quantile with varying window sizes (#22353)
Dedicated rolling_skew kernel (#22333)
Call large munmap's in background thread (#22329)
New streaming group_by implementation (#22285)
Patch jemalloc to not purge huge allocs eagerly if we have background threads (#22318)
Turn on parallel=prefiltered by default for new streaming (#22190)

✨ Enhancements

When reporting unexpected types in errors, module-qualify the typename (#22390)
Add Series backward_fill / forward_fill (#22360)
Add GPU support to sink_* APIs (#20940)
Changed mapping type from dict to Mapping (#19400) (#19436)
Make streaming dispatch public (#22347)
Add rolling_kurtosis (#22335)
Support Cast in IO plugin predicates (#22317)
Add .sort(nulls_last=True) to booleans, categoricals and enums (#22300)
Add rolling min/max for temporals (#22271)
Support literal:list agg (#22249)
Support running Polars SQL queries against any objects implementing the PyCapsule interface (#22235)
Support implode + agg (#22230)
Dispatch scans to new-streaming by default (#22153)

🐞 Bug fixes

Ensure write_excel correctly preserves null values in nested dtype data on export (#22379)
Panic when visualizing streaming physical plan with joins (#22404)
Fix incorrect filter after LazyFrame.rename().select() (#22380)
Fix select(len()) performance regression (#22363)
Handle pytz named timezone in lit (#21785)
Don't leak state during prefill CSE cache (#22341)
Maintain float32 type in partitioned group-by (#22340)
Resolve streaming panic on multiple merge_sorted (#22205)
Fix ndjson nested types (#22325)
Fix nested datetypes in ndjson (#22321)
Check matching lengths for pl.corr (#22305)
Move type coercion for pl.duration to planner (#22304)
Check dtype to avoid panic with mixed types in min/max_horizontal (#21857)
Coalesce correct column for new streaming full join (#22301)
Don't collect NaN from Parquet Statistics (#22294)
Set revmap for empty AnyValue to Series (#22293)
Add an __all__ entry to internal type definition module (#22254)
Datetime parser was incorrectly parsing 8-digit fractional seconds when format specified to expect 9 (#22180)
More robust str → date conversion when reading from spreadsheet (#22276)
Deprecate using is_in with 2 equal types and mark as elementwise (#22178)
Duplicate key column name in streaming group_by due to CSE (#22280)
Raise ColumnNotFoundError for missing columns in join_where (#22268)
Parquet filters for logical types and operations (#22253)
Ensure floating-point accuracy in hist (#22245)
Check matching key datatypes for new streaming joins (#22247)
Incorrect length BinaryArray/ListBuilder (#22227)

📖 Documentation

Update docs for schema arg in scan_csv to match read_csv (#22357)
Update pl.when documentation (#22345)
Add missing is_business_day to documentation reference (#22338)
Improve interpolation documentation to clarify behavior of null values (#22274)

🛠️ Other improvements

Install pytorch for 3.13 on Windows (#22356)
Make interpolate fix more robust (#22421)
Fix interpolate test (#22417)
Reduce hot table size in debug mode (#22400)
Replace intrinsic with non-intrinsic (#22401)
Make streaming dispatch public (#22347)
Update rustc to 'nightly-2025-04-19' (#22342)
Update mozilla-actions/sccache-action (#22319)
Purge old parquet and scan code (#22226)
Add an __all__ entry to internal type definition module (#22254)
Add online skew/kurtosis algorithm for future use in rolling kernels (#22261)
Add Polars Cloud 0.0.7 release notes (#22223)
Change format name from list to implode (#22240)
Make other parallel parquet modes filter afterwards (#22228)
Close async reader issues (#22224)
Add BinaryArrayBuilder (#22225)

Thank you to all our contributors for making this release possible!
@DavideCanton, @JakubValtar, @Jesse-Bakker, @MarcoGorelli, @NeejWeej, @Shoeboxam, @adamreeve, @alexander-beedie, @axellpadilla, @cmdlineluser, @coastalwhite, @d-reynol, @dongchao-1, @florian-klein, @kdn36, @math-hiyoko, @mcrumiller, @mroeschke, @nameexhaustion, @orlp, @ritchie46, @stijnherfst and @yiteng-guo

Contributors

orlp, adamreeve, and 21 other contributors

11 Apr 10:26

github-actions

py-1.27.1

319a9a8

Python Polars 1.27.1

✨ Enhancements

Improved expression autocomplete for IPython, Jupyter, and Marimo (#22221)

🐞 Bug fixes

Incorrect condition on empty inner join fast path (#22208)
Fallback predicate filter for min=max with is_in (#22213)
Don't panic for LruCachedFunc for size=0 (#22215)
Writing masked out list values to json (#22210)
Deadlock in streaming distributor (#22207)

Thank you to all our contributors for making this release possible!
@Matt711, @alexander-beedie, @coastalwhite, @dependabot[bot], @orlp and @ritchie46

Contributors

orlp, alexander-beedie, and 4 other contributors

09 Apr 17:27

github-actions

py-1.27.0

075fe61

Python Polars 1.27.0

💥 Breaking changes

Make bottom interval closed in hist (#22090)
Change Partition API to base_path and file_path (#21888)
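
To make the hist change (#22090) concrete, the sketch below shows what a closed bottom interval means for bin counts. It is a pure-Python illustration, not Polars code. It assumes the bins are otherwise left-open and right-closed, so that only the lowest bin gains its lower edge. The helper name `hist_counts_sketch` is hypothetical.

```python
from bisect import bisect_left


def hist_counts_sketch(values, edges):
    """Count values into bins (edges[i], edges[i + 1]].

    The lowest bin is treated as closed on both sides, [edges[0], edges[1]],
    so a value equal to the lowest edge is counted rather than dropped.
    `edges` must be sorted in ascending order.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # outside the overall range
        if v == edges[0]:
            counts[0] += 1  # the closed bottom edge
            continue
        # bisect_left gives the first i with edges[i] >= v,
        # so v falls in (edges[i - 1], edges[i]].
        i = bisect_left(edges, v)
        counts[i - 1] += 1
    return counts


print(hist_counts_sketch([0, 1, 2, 2, 3], [0, 1, 2, 3]))
```

Under the assumed open lower edge, the value 0 in this input would fall outside every bin. With the bottom interval closed, it is counted in the first bin.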

🚀 Performance improvements

Add CSE to streaming groupby (#22196)
Speed-up new streaming predicate filtering (#22179)
Speedup new-streaming file row count (#22169)
Fix quadratic behavior when casting Enums (#22008)
Lower is_in to bitmap-output semi-join in new streaming engine (#21948)
Fast path for empty inner join (#21965)
Add native semi/anti join in new streaming engine (#21937)
Cache regex compilation globally (#21929)

✨ Enhancements

Add SPLIT_PART string function to the SQL interface (#22158)
Allow scalar expr in Expr.diff (#22142)
Support additional unsigned int aliases in the SQL interface (#22127)
Add STRING_TO_ARRAY function to the SQL interface (#22129)
Add dt.is_business_day (#21776)
Add an eager parameter to pl.cov (#22098)
Add support for Int128 parsing/recognition to the SQL interface (#22104)
Add an eager parameter to pl.coalesce (#22092)
Add an eager parameter to pl.corr (#22097)
Allow sinking to abstract python io and fs classes (#21987)
Add add_alp_optimize_exprs to IRBuilder (#22061)
Add cat.slice (#21971)
Support growing schema if line length increases during csv schema inference (#21979)
Replace thread unsafe GilOnceCell with Mutex (#21927)
Support modified dsl in file cache (#21907)

🐞 Bug fixes

Implode in agg (#22197)
Reduce GIL hold time for IO plugins in new-streaming (#22186)
Enhance predicate validation and cast safety in join_where (#22112)
Handle Parquet with compressed empty DataPage v2 (#22172)
Schema error during lowering (#22175)
Rewrite unroll of overlapping groups to mitigate out of range index panic (#22072)
Incorrect rounding for very large/small numbers (#22173)
Allow set input to list.set_* operations (#22163)
Deadlock in join due to rayon nested task-stealing (#22159)
Mark Expr.repeat_by as elementwise (#22068)
Fix csv serializer panic by supporting ScalarColumn in as_single_chunk (#22146)
Raise an error if a number doesn't have associated unit in duration strings (#22035)
Add i128 as supertype to boolean (#22138)
Fix panic when constructing DF from pyarrow due to duplicate field names (#22114)
Add broadcasts and error messages for many elementwise operations (#22130)
Throw error for n=0 on list.gather_every (#22122)
Throw error for unsupported rolling operations (#22121)
Error on unequal length str.to_integer arguments (#22100)
Make bottom interval closed in hist (#22090)
Relative path resolution for plugin libraries (#21911)
Avoid panic with strptime for out-of-bounds dates (#21208)
Join revmaps for categoricals in merge_sorted (#21976)
Fix glob expansion matching extra files (#21991)
Ensure SQL dot-notation for nested column fields resolves correctly (#22109)
Parquet filter performance regression from multiscan dispatch (#22116)
Panic for unequal length ewm_mean_by args (#22093)
Add scalarity checks to pl.repeat (#22088)
Type check n parameter of pl.repeat (#22071)
Mark bitwise_{count,leading,trailing}_{ones,zeros} as elementwise (#22044)
Mark pl.*_ranges functions correctly as element-wise (#22059)
Correctly type check pl.arctan2 (#22060)
Mark pl.business_day_count as elementwise (#22055)
Check input python type for str.extract_groups (#22032)
Check types for fill_char in str.pad_{start,end} (#22036)
Mark str.to_decimal properly as non-elementwise (#22040)
Documented return type for bin.encode and bin.decode (#22022)
Revert #22017 and improve block(_in_place)_on doc comment (#22031)
Remove outdated depth warning (#22030)
Expression pl.concat was incorrectly marked as elementwise (#22019)
Use block_in_place_on to start streaming (#22017)
Panic on empty aggregation in streaming (#22016)
Error instead of panic for invalid durations in dt.offset_by() and dt.round() (#21982)
Raise error instead of silently appending NULL in NDJSON parsing (#21953)
Ensure AV is static before pushing to row buffer (#21967)
Deadlock in new-streaming multiplexer (#21963)
Release GIL in collect_with_callback (#21941)
Panic in new RegexCache (#21935)
Type hint of cs.exclude() is SelectorType instead of Expr (#21892)
Add correct deprecation warning for .str.concat (#21666)
Use absolute paths by default for plugins (#21904)

📖 Documentation

Add user guide section on working with Sheets in Colab (#22161)
Update distributed engine docs (#22128)
Add Polars Cloud release notes (#22021)
Remove trailing space in settings POLARS_CLOUD_CLIENT_ID (#21995)
Fix typo (#21954)
Fix 'pickleable' typo in docs (#21938)
Change ctx to compute=ctx for all remote query examples (#21930)

🛠️ Other improvements

Remove old MultiScanExec for in-memory (#22184)
Separate FunctionOptions from DSL calls (#22133)
Undeprecate backward_fill and forward_fill (#22156)
Handle conversion of Duration specially in pyir (#22101)
Deprecate duplicate backward_fill and forward_fill interface (#22083)
Solve clippy lints for 1.86 (#22102)
Remove rust exclusive MaxBound and MinBound fill strategies (#22063)
Change Partition API to base_path and file_path (#21888)
Fix pydantic model_fields deprecation (#21958)

Thank you to all our contributors for making this release possible!
@DeflateAwning, @EnricoMi, @Jacob640, @JakubValtar, @MarcoGorelli, @MaxJackson, @alexander-beedie, @amotzop, @anath2, @bschoenmaeckers, @cnpryer, @coastalwhite, @dependabot[bot], @eitsupi, @etiennebacher, @hemanth94, @kdn36, @mcrumiller, @nameexhaustion, @orlp, @r-brink, @rgertenbach, @ritchie46, @sebasv, @silannisik, @stijnherfst, @wence- and @zachlefevre

Contributors

orlp, wence-, and 26 other contributors
