feat: support dayname and monthname natively#4544
Merged
mbutrovich merged 5 commits intoJun 2, 2026
Merged
Conversation
c08890c to
f60f5b1
Compare
Add Comet support for the Spark 4.0+ dayname and monthname expressions by routing them through the Arrow-direct codegen dispatcher in the spark-4.0 shim, so they run Spark's own generated code for exact parity including locale handling. Also add tests confirming that several datetime functions already execute in Comet via Spark's rewrite rules, and update the expression support doc to reflect this coverage: - to_date / to_timestamp / to_timestamp_ntz / to_timestamp_ltz reduce to Cast (no format) or GetTimestamp (with format) - make_timestamp_ntz / make_timestamp_ltz (6-argument form) reduce to MakeTimestamp - current_date / current_timestamp / now / curdate are constant-folded to literals before Comet sees the plan
f60f5b1 to
5488ec0
Compare
… ci] The docs-only corrections for to_date/to_timestamp/make_timestamp/current_* move to apache#4543. This PR's doc change is limited to the new dayname and monthname expressions. monthname is now also checked off (it was missed).
Replace the JVM codegen-dispatch path with a native Rust scalar function. Spark's DayName/MonthName map a DateType value through DayOfWeek/Month.getDisplayName(TextStyle.SHORT, Locale.US), which is locale-fixed and timezone-independent, so the native function reproduces it exactly with US-English lookup tables keyed on the weekday/month of the Date32 value. Wired through CometExprShim for Spark 4.0/4.1/4.2 via scalarFunctionExprToProto, registered as dayname/monthname in comet_scalar_funcs. The test now asserts native execution without the codegen dispatcher and covers null and literal inputs.
…to those functions [skip ci] The dayname/monthname conversion was identical across the spark-4.0/4.1/4.2 CometExprShim files. Move it into a single CometExprShim4x trait under the shared spark-4.x source root (compiled for all 4.x profiles) and have each per-version shim mix it in and delegate, removing the triplication. Also drop the to_date/to_timestamp, make_timestamp, and current_date/now tests so this PR is strictly the dayname/monthname feature; those cover rewrite-backed and constant-folded functions whose documentation lives in the docs PR.
Replace the Scala CometExpressionSuite test with one SQL file per function under expressions/datetime, gated to Spark 4.0+ via MinSparkVersion so they skip cleanly on 3.4/3.5 where the functions do not exist. The default query mode asserts native Comet execution and Spark-identical results; coverage includes a null date and a literal argument.
mbutrovich
approved these changes
Jun 2, 2026
Contributor
mbutrovich
left a comment
There was a problem hiding this comment.
LGTM, thanks @andygrove!
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Part of #4418.
Rationale for this change
daynameandmonthname(Spark 4.0+) had no Comet support and forced a fallback to Spark.What changes are included in this PR?
This work was scaffolded with the
implement-comet-expressionskill.daynameandmonthnameare implemented natively. Spark'sDayName/MonthNamemap aDateTypevalue throughDayOfWeek/Month.getDisplayName(TextStyle.SHORT, Locale.US).DateFormatter.defaultLocaleis the constantLocale.US, so the result is a fixed set of abbreviated English names (Mon..Sun,Jan..Dec) with no session-locale or timezone dependence. The native Rust scalar function reproduces this exactly with US-English lookup tables keyed on the weekday / month computed from the Date32 value.dayname/monthnamescalar function innative/spark-expr/src/datetime_funcs/day_month_name.rs, registered incomet_scalar_funcs.CometExprShim(these are Spark 4.0+ expression classes, so they cannot live in shared serde) that emits the scalar function proto.These were Spark 4.0+ only, so they are kept out of the shared serde and gated to 4.0+.
The documentation-only corrections for the rewrite-backed and constant-folded datetime functions (
to_date,to_timestamp,make_timestamp_*,current_*,now,curdate) are split out into #4543.How are these changes tested?
A native Rust unit test covers the weekday / month name mapping, including the epoch boundary and a pre-epoch date. A
CometExpressionSuitetest (gated to Spark 4.0+) usescheckSparkAnswerAndOperatorto assert both functions execute fully in Comet (no fallback) and return Spark-identical results across column references, a null date, and a constant-folding-disabled literal. Verified locally against the Spark 4.0 profile.