Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: adds _Stores(Native|Compliant) #2130

Merged
merged 28 commits into from
Mar 8, 2025
Merged

Conversation

dangotbanned
Copy link
Member

@dangotbanned dangotbanned commented Mar 2, 2025

What type of PR is this? (check all applicable)

  • πŸ’Ύ Refactor
  • ✨ Feature
  • πŸ› Bug Fix
  • πŸ”§ Optimization
  • πŸ“ Documentation
  • βœ… Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

Originally (#1301 (comment))

@FBruzzesi I've tested this out for some parts of _arrow over on main...stores-native-compliant

Really think it's a nice improvement to readabilty and DRYs things up well πŸ™‚

With a lot of the repetition out of the way, I was able to spot a few more things that could be written more concisely.

Not sure if this is worth opening an issue for - but if we went ahead with it - then it'd probably make sense to try and do it in one or more big chunks to avoid conflicts

Note

The parts that I've touched in _arrow/*.py are ready for review.
Other modules have only had the bare-minimum changes to meet the new protocol(s)
The goal would be to repeat this kind of pattern throughout narwhals (pending feedback)

@dangotbanned
Copy link
Member Author

dangotbanned commented Mar 2, 2025

TODO

Important

I feel like this clearly demonstrates the benefit of protocols πŸ˜…
(342d629)

@dangotbanned dangotbanned changed the title refactor(DRAFT): adds _Stores(Native|Compliant) refactor(RFC): adds _Stores(Native|Compliant) Mar 2, 2025
Comment on lines 124 to 164
class _StoresNative(Protocol[NativeT_co]):
@property
def native(self) -> NativeT_co:
"""Return the native object."""
...


class _StoresCompliant(Protocol[CompliantT_co]):
@property
def compliant(self) -> CompliantT_co:
"""Return the narwhals object."""
...


class _SeriesNamespace( # type: ignore[misc] # noqa: PYI046
_StoresCompliant[CompliantSeriesT_co],
_StoresNative[NativeT_co],
Protocol[CompliantSeriesT_co, NativeT_co],
):
_compliant_series: CompliantSeriesT_co

@property
def compliant(self) -> CompliantSeriesT_co:
return self._compliant_series

@property
def native(self) -> NativeT_co:
return self._compliant_series.native

def from_native(self, series: Any, /) -> CompliantSeriesT_co:
return self.compliant._from_native_series(series)


class _ExprNamespace( # type: ignore[misc] # noqa: PYI046
_StoresCompliant[CompliantExprT_co], Protocol[CompliantExprT_co]
):
_compliant_expr: CompliantExprT_co

@property
def compliant(self) -> CompliantExprT_co:
return self._compliant_expr
Copy link
Member Author

@dangotbanned dangotbanned Mar 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The core idea is providing (read-only) accessors and fast-paths to commonly used methods.

These are what I've felt were needed so far.

We can also but get more specific - like for temporal properties:

ArrowSeriesDateTimeNamespace

class ArrowSeriesDateTimeNamespace(ArrowSeriesNamespace):
@property
def unit(self) -> TimeUnit: # NOTE: Unsafe (native).
return cast("pa.TimestampType[TimeUnit, Any]", self.native.type).unit
@property
def time_zone(self) -> str | None: # NOTE: Unsafe (narwhals).
return cast("Datetime", self.compliant.dtype).time_zone

Ideally, I think these should not be used for methods accepting more than a single argument.
That way it keeps things pretty clean and doesn't introduce backend-specific details at this high of a level

@dangotbanned dangotbanned marked this pull request as ready for review March 3, 2025 18:32
Comment on lines 122 to +123
def millisecond(self: Self) -> ArrowSeries:
return self._compliant_series._from_native_series(
pc.millisecond(self._compliant_series._native_series)
)
return self.from_native(pc.millisecond(self.native))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the most typical change - where it becomes significantly eaiser to see the unique thing the method is doing

Before

    def millisecond(self: Self) -> ArrowSeries:
        return self._compliant_series._from_native_series(
            pc.millisecond(self._compliant_series._native_series)
        )

After

    def millisecond(self: Self) -> ArrowSeries:
        return self.from_native(pc.millisecond(self.native))

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @dangotbanned ! love these simplifications

just one really minor comment (also, a merge conflict)

Comment on lines 130 to 137
class _StoresNative(Protocol[NativeT_co]):
"""Provides access to a native object.
Native objects are types like:
>>> from pandas import Series
>>> from pyarrow import Table
"""
Copy link
Member Author

@dangotbanned dangotbanned Mar 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Ignore the typo ("are" -> "have")
fixed in (0921f33)

100% expecting this doc to be questioned.

Was the only style that VSCode seemed to parse correctly:

image

Alternatives

image

image

image

@dangotbanned dangotbanned changed the title refactor(RFC): adds _Stores(Native|Compliant) refactor: adds _Stores(Native|Compliant) Mar 8, 2025
@dangotbanned dangotbanned merged commit 90ca761 into main Mar 8, 2025
30 checks passed
@dangotbanned dangotbanned deleted the stores-native-compliant branch March 8, 2025 12:20
dangotbanned added a commit that referenced this pull request Mar 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants