-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: adds _Stores(Native|Compliant)
#2130
Conversation
That class was the motivating case in #1301 (comment)
Didn't think of it before, but all the namespace code can be reduced a lot with this
- Only the constructor needs to be implemented for each backend - To avoid #2064 (comment)
TODO
Important I feel like this clearly demonstrates the benefit of protocols π
|
Wasn't planning to implement it yet but https://github.com/narwhals-dev/narwhals/actions/runs/13616393368/job/38059997094?pr=2130
- No changes to runtime - but these really ought to be checking the type actually has the attribute(s)
- Lots of long access paths cleaned up - Making it easier to spot other refactorings
_Stores(Native|Compliant)
_Stores(Native|Compliant)
narwhals/utils.py
Outdated
class _StoresNative(Protocol[NativeT_co]): | ||
@property | ||
def native(self) -> NativeT_co: | ||
"""Return the native object.""" | ||
... | ||
|
||
|
||
class _StoresCompliant(Protocol[CompliantT_co]): | ||
@property | ||
def compliant(self) -> CompliantT_co: | ||
"""Return the narwhals object.""" | ||
... | ||
|
||
|
||
class _SeriesNamespace( # type: ignore[misc] # noqa: PYI046 | ||
_StoresCompliant[CompliantSeriesT_co], | ||
_StoresNative[NativeT_co], | ||
Protocol[CompliantSeriesT_co, NativeT_co], | ||
): | ||
_compliant_series: CompliantSeriesT_co | ||
|
||
@property | ||
def compliant(self) -> CompliantSeriesT_co: | ||
return self._compliant_series | ||
|
||
@property | ||
def native(self) -> NativeT_co: | ||
return self._compliant_series.native | ||
|
||
def from_native(self, series: Any, /) -> CompliantSeriesT_co: | ||
return self.compliant._from_native_series(series) | ||
|
||
|
||
class _ExprNamespace( # type: ignore[misc] # noqa: PYI046 | ||
_StoresCompliant[CompliantExprT_co], Protocol[CompliantExprT_co] | ||
): | ||
_compliant_expr: CompliantExprT_co | ||
|
||
@property | ||
def compliant(self) -> CompliantExprT_co: | ||
return self._compliant_expr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The core idea is providing (read-only) accessors and fast-paths to commonly used methods.
These are what I've felt were needed so far.
We can also but get more specific - like for temporal properties:
ArrowSeriesDateTimeNamespace
narwhals/narwhals/_arrow/series_dt.py
Lines 24 to 31 in 5c642eb
class ArrowSeriesDateTimeNamespace(ArrowSeriesNamespace): | |
@property | |
def unit(self) -> TimeUnit: # NOTE: Unsafe (native). | |
return cast("pa.TimestampType[TimeUnit, Any]", self.native.type).unit | |
@property | |
def time_zone(self) -> str | None: # NOTE: Unsafe (narwhals). | |
return cast("Datetime", self.compliant.dtype).time_zone |
Ideally, I think these should not be used for methods accepting more than a single argument.
That way it keeps things pretty clean and doesn't introduce backend-specific details at this high of a level
def millisecond(self: Self) -> ArrowSeries: | ||
return self._compliant_series._from_native_series( | ||
pc.millisecond(self._compliant_series._native_series) | ||
) | ||
return self.from_native(pc.millisecond(self.native)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the most typical change - where it becomes significantly eaiser to see the unique thing the method is doing
Before
def millisecond(self: Self) -> ArrowSeries:
return self._compliant_series._from_native_series(
pc.millisecond(self._compliant_series._native_series)
)
After
def millisecond(self: Self) -> ArrowSeries:
return self.from_native(pc.millisecond(self.native))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @dangotbanned ! love these simplifications
just one really minor comment (also, a merge conflict)
Couldn't get VSCode to format the doc correctly when using "Examples:" #2130 (comment)
narwhals/utils.py
Outdated
class _StoresNative(Protocol[NativeT_co]): | ||
"""Provides access to a native object. | ||
Native objects are types like: | ||
>>> from pandas import Series | ||
>>> from pyarrow import Table | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Ignore the typo ("are" -> "have")
fixed in (0921f33)
100% expecting this doc to be questioned.
Was the only style that VSCode seemed to parse correctly:
_Stores(Native|Compliant)
_Stores(Native|Compliant)
- Bit confused as what I've done is fairly similar to (#2130) - Getting the same issue as (#2084) https://github.com/narwhals-dev/narwhals/actions/runs/13740460535/job/38429062404?pr=2149
What type of PR is this? (check all applicable)
Related issues
native
property and deprecate.to_native()
methodΒ #1301)CompliantExpr
Β #2119 (comment)Checklist
If you have comments or can explain your changes, please do so below
Originally (#1301 (comment))
Note
The parts that I've touched in
_arrow/*.py
are ready for review.Other modules have only had the bare-minimum changes to meet the new protocol(s)
The goal would be to repeat this kind of pattern throughout
narwhals
(pending feedback)