
refactor: adds _compliant sub-package #2149

Open Β· wants to merge 63 commits into main

Conversation

@dangotbanned (Member) commented Mar 5, 2025:

Closes #2044

What type of PR is this? (check all applicable)

  • πŸ’Ύ Refactor
  • ✨ Feature
  • πŸ› Bug Fix
  • πŸ”§ Optimization
  • πŸ“ Documentation
  • βœ… Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

Direct follow-up to (#2119), aiming to make space for similar changes for other Compliant* protocols.

This PR spiralled a bit, but it tackles a few closely related issues.

Lazy-only typing

#2044 has been resolved through the use of the new `NativeExpr` protocol:

```python
class NativeExpr(Protocol):
    """An `Expr`-like object from a package with [Lazy-only support](https://narwhals-dev.github.io/narwhals/extending/#levels-of-support).

    Protocol members are chosen *purely* for matching statically - as they
    are common to all currently supported packages.
    """

    def between(self, *args: Any, **kwds: Any) -> Any: ...
    def isin(self, *args: Any, **kwds: Any) -> Any: ...
```

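To illustrate the structural matching this relies on, here is a minimal sketch (the `FakeLazyColumn` and `takes_native_expr` names are invented for the example, they are not part of the PR):

```python
from typing import Any, Protocol


class NativeExpr(Protocol):
    def between(self, *args: Any, **kwds: Any) -> Any: ...
    def isin(self, *args: Any, **kwds: Any) -> Any: ...


# Hypothetical stand-in for a lazy backend's column/expression object.
class FakeLazyColumn:
    def between(self, lower: Any, upper: Any) -> "FakeLazyColumn":
        return self

    def isin(self, values: Any) -> "FakeLazyColumn":
        return self


def takes_native_expr(expr: NativeExpr) -> NativeExpr:
    # Accepted by the type checker purely via structural matching - `FakeLazyColumn`
    # never subclasses or registers against `NativeExpr`.
    return expr


takes_native_expr(FakeLazyColumn())
```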
(Arrow|PandasLike)* deduplication

All of the new classes prefixed with `Eager` now provide the common functionality shared by the two Eager-only backends.
The big one is `EagerExpr`:

```python
class EagerExpr(
    CompliantExpr[EagerDataFrameT, EagerSeriesT],
    Protocol38[EagerDataFrameT, EagerSeriesT],
):
```

A very unexpected part of that was that they can also share their `*ExprNamespace` implementations.

*Namespace protocols

Working on the Eager namespaces led to new generic protocols for both the `(expr|series)_*` namespaces: https://github.com/narwhals-dev/narwhals/blob/b3bdb3b3ba52dada9cadde3c50d68520f1270a09/narwhals/_compliant/any_namespace.py

These can be re-used for all backends πŸ˜„
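
For illustration, a minimal sketch of the shape these shared namespace protocols can take (a hypothetical `StrNamespace` with made-up methods - not necessarily what `any_namespace.py` actually defines):

```python
from typing import Protocol, TypeVar

# Covariant: a namespace producing a more specific compliant expr type is still
# usable wherever the generic protocol is expected.
CompliantExprT_co = TypeVar("CompliantExprT_co", covariant=True)


class StrNamespace(Protocol[CompliantExprT_co]):
    """`<expr>.str` accessor, generic over a backend's compliant expression type."""

    def starts_with(self, prefix: str) -> CompliantExprT_co: ...
    def ends_with(self, suffix: str) -> CompliantExprT_co: ...
    def to_lowercase(self) -> CompliantExprT_co: ...
```

Each backend then only needs one concrete parametrisation, e.g. `StrNamespace[ArrowExpr]` or `StrNamespace[PandasLikeExpr]`.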

Misc

I followed up on (#2055 (comment)) and converted some methods to simpler `.from_` variants.
The only remaining one (`_create_compliant_series`) is discussed in (#2149 (comment)).

I also opened (#2169) to follow up on some strange pandas behavior.

Note

There might be some other bits in commit descriptions I've missed

@dangotbanned dangotbanned changed the title refactor(DRAFT): adds _compliant sub-package refactor: adds _compliant sub-package Mar 6, 2025
@dangotbanned dangotbanned marked this pull request as ready for review March 6, 2025 16:04
@dangotbanned dangotbanned marked this pull request as draft March 6, 2025 17:22
@dangotbanned dangotbanned linked an issue Mar 6, 2025 that may be closed by this pull request
Comment on lines +27 to +31
```python
CompliantSeriesOrNativeExprT_co = TypeVar(
    "CompliantSeriesOrNativeExprT_co",
    bound="CompliantSeries | NativeExpr",
    covariant=True,
)
```
dangotbanned (Member Author):

Really want something that rolls off the tongue better than this πŸ€”

```log
error: All protocol members must have explicitly declared types  [misc]
```
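
For context, mypy raises that error when a member declared in a `Protocol` body lacks an explicit annotation - a minimal reproduction (illustrative only, not code from the PR):

```python
from typing import Protocol


class HasName(Protocol):
    name: str    # OK: explicitly typed protocol member
    kind = "a"   # error: All protocol members must have explicitly declared types  [misc]
```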
@MarcoGorelli (Member):

thanks Dan! From a quick look I think this looks great, feel free to ping me for review when you're happy with it

@dangotbanned (Member Author):

> thanks Dan! From a quick look I think this looks great, feel free to ping me for review when you're happy with it

Thanks @MarcoGorelli, will do!

Comment on lines +367 to +372
```python
# For PyArrow.Series, we return Python Scalars (like Polars does) instead of PyArrow Scalars.
# However, when working with expressions, we keep everything PyArrow-native.
def _reuse_series_extra_kwargs(
    self, *, returns_scalar: bool = False
) -> dict[str, Any]:
    return {}
```
dangotbanned (Member Author):

I feel like this is fine in terms of a pure refactoring of:

```python
# For PyArrow.Series, we return Python Scalars (like Polars does) instead of PyArrow Scalars.
# However, when working with expressions, we keep everything PyArrow-native.
extra_kwargs = (
    {"_return_py_scalar": False}
    if returns_scalar and expr._implementation is Implementation.PYARROW
    else {}
)
out: list[CompliantSeries] = [
    plx._create_series_from_scalar(
        getattr(series, attr)(**extra_kwargs, **_kwargs),
        reference_series=series,  # type: ignore[arg-type]
    )
```

Into this override:

```python
def _reuse_series_extra_kwargs(
    self, *, returns_scalar: bool = False
) -> dict[str, Any]:
    return {"_return_py_scalar": False} if returns_scalar else {}
```


But I really can't shake wanting to handle this differently somehow.
I haven't spent too long thinking about it - maybe worth another look before review.
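
For reference, a condensed sketch of the shape of that refactor - the shared Eager code asks an overridable hook for backend-specific kwargs instead of branching on `Implementation` inline (class and method names below are simplified placeholders, not the PR's exact API):

```python
from typing import Any


class EagerExprBase:
    def _reuse_series_extra_kwargs(self, *, returns_scalar: bool = False) -> dict[str, Any]:
        # Default: no extra kwargs (the pandas-like path).
        return {}

    def _call_series_method(
        self, series: Any, attr: str, *, returns_scalar: bool, **kwargs: Any
    ) -> Any:
        # The shared implementation no longer needs to know which backend it runs on.
        extra_kwargs = self._reuse_series_extra_kwargs(returns_scalar=returns_scalar)
        return getattr(series, attr)(**extra_kwargs, **kwargs)


class ArrowExprSketch(EagerExprBase):
    def _reuse_series_extra_kwargs(self, *, returns_scalar: bool = False) -> dict[str, Any]:
        # PyArrow-only tweak: return PyArrow-native scalars when reusing series methods.
        return {"_return_py_scalar": False} if returns_scalar else {}
```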

- Eventually want to get rid of `create_compliant_series`
- `pandas` currently works differently in `create_compliant_series` and `native_series_from_iterable`
dangotbanned added a commit that referenced this pull request Mar 8, 2025
- Experimenting with an idea from (#2149)
- Trying to understand the different behavior from `create_compliant_series`
Only `.name` differs between the two `EagerExpr` - so why not share?
Comment on lines 54 to 55
```python
@deprecated("ref'd in untyped code")
def _create_compliant_series(self, value: Any) -> EagerSeriesT: ...
```
@dangotbanned (Member Author) commented Mar 9, 2025:

Took a while to fully unravel.

Tip

See this code search and my earlier note to understand existing usage.

Very rough notes

IGNORE PLS

I'm just sticking this here to get it out of my changes:

    # NOTE: Calling from an instance of `Namespace` (provides `context`)
    # NOTE: All usage within `*Expr.map_batches`
    #   - `PandasLikeExpr` uses that **once**
    #   - `ArrowExpr` uses **twice**
    #   - Expr -> Namespace -> Series
    #   - But handling `numpy`
    # NOTE: `PandasLikeDataFrame.with_row_index` uses the wrapped `utils` function once
    # NOTE External
    # - `_expression_parsing.extract_compliant` (numpy array)
    # - `dataframe.DataFrame._extract_compliant` (numpy array)
    # NOTE General
    # - Most similar to `EagerSeries._from_iterable`

I'm starting to think the broader replacement for `_create_compliant_series` will be:

Collapsed imports

```python
from typing import TYPE_CHECKING
from typing import Any
from typing import Protocol
from typing import overload

from narwhals._compliant.typing import EagerDataFrameT
from narwhals._compliant.typing import EagerSeriesT

if TYPE_CHECKING:
    import numpy as np
    from typing_extensions import TypeAlias

    from narwhals.typing import _1DArray
    from narwhals.typing import _2DArray

_NumpyScalar: TypeAlias = "np.generic[Any]"
```

```python
class EagerNamespace(
    CompliantNamespace[EagerDataFrameT, EagerSeriesT],
    Protocol[EagerDataFrameT, EagerSeriesT],
):
    @overload
    def from_numpy(self, value: _NumpyScalar | _1DArray, /) -> EagerSeriesT: ...

    @overload
    def from_numpy(self, value: _2DArray, /) -> EagerDataFrameT: ...

    def from_numpy(self, value: _NumpyScalar | _1DArray | _2DArray, /) -> Any: ...
```

`_NumpyScalar` is handled by:

```python
def _from_scalar(self, value: Any) -> Self:
    return self._from_iterable([value], name=self.name, context=self)
```

`_1DArray` is handled by:

```python
@classmethod
def _from_iterable(
    cls: type[Self], data: Iterable[Any], name: str, *, context: _FullContext
```

`_2DArray` would be each of the implementations in `nw.functions.from_numpy`:

```python
def from_numpy(
    data: _2DArray,
    schema: Mapping[str, DType] | Schema | Sequence[str] | None = None,
    *,
    native_namespace: ModuleType,
) -> DataFrame[Any]:
    """Construct a DataFrame from a NumPy ndarray.

    Notes:
        Only row orientation is currently supported.

        For pandas-like dataframes, conversion to schema is applied after dataframe
        creation.

    Arguments:
        data: Two-dimensional data represented as a NumPy ndarray.
        schema: The DataFrame schema as Schema, dict of {name: type}, or a sequence of str.
        native_namespace: The native library to use for DataFrame creation.

    Returns:
        A new DataFrame.

    Examples:
        >>> import numpy as np
        >>> import pyarrow as pa
        >>> import narwhals as nw
        >>>
        >>> arr = np.array([[5, 2, 1], [1, 4, 3]])
        >>> schema = {"c": nw.Int16(), "d": nw.Float32(), "e": nw.Int8()}
        >>> nw.from_numpy(arr, schema=schema, native_namespace=pa)
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
        |Narwhals DataFrame|
        |------------------|
        |  pyarrow.Table   |
        |  c: int16        |
        |  d: float        |
        |  e: int8         |
        |  ----            |
        |  c: [[5,1]]      |
        |  d: [[2,4]]      |
        |  e: [[1,3]]      |
        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    """
    return _from_numpy_impl(data, schema, native_namespace=native_namespace)


def _from_numpy_impl(
    data: _2DArray,
    schema: Mapping[str, DType] | Schema | Sequence[str] | None = None,
    *,
    native_namespace: ModuleType,
) -> DataFrame[Any]:
    from narwhals.schema import Schema

    if not is_numpy_array_2d(data):
        msg = "`from_numpy` only accepts 2D numpy arrays"
        raise ValueError(msg)
    implementation = Implementation.from_native_namespace(native_namespace)
    if implementation is Implementation.POLARS:
        if isinstance(schema, (Mapping, Schema)):
            schema_pl: pl.Schema | Sequence[str] | None = Schema(schema).to_polars()
        elif is_sequence_but_not_str(schema) or schema is None:
            schema_pl = schema
        else:
            msg = (
                "`schema` is expected to be one of the following types: "
                "Mapping[str, DType] | Schema | Sequence[str]. "
                f"Got {type(schema)}."
            )
            raise TypeError(msg)
        native_frame = native_namespace.from_numpy(data, schema=schema_pl)
    elif implementation.is_pandas_like():
        if isinstance(schema, (Mapping, Schema)):
            from narwhals._pandas_like.utils import get_dtype_backend

            it: Iterable[DTypeBackend] = (
                get_dtype_backend(native_type, implementation)
                for native_type in schema.values()
            )
            native_frame = native_namespace.DataFrame(data, columns=schema.keys()).astype(
                Schema(schema).to_pandas(it)
            )
        elif is_sequence_but_not_str(schema):
            native_frame = native_namespace.DataFrame(data, columns=list(schema))
        elif schema is None:
            native_frame = native_namespace.DataFrame(
                data, columns=[f"column_{x}" for x in range(data.shape[1])]
            )
        else:
            msg = (
                "`schema` is expected to be one of the following types: "
                "Mapping[str, DType] | Schema | Sequence[str]. "
                f"Got {type(schema)}."
            )
            raise TypeError(msg)
    elif implementation is Implementation.PYARROW:
        pa_arrays = [native_namespace.array(val) for val in data.T]
        if isinstance(schema, (Mapping, Schema)):
            schema_pa = Schema(schema).to_arrow()
            native_frame = native_namespace.Table.from_arrays(pa_arrays, schema=schema_pa)
        elif is_sequence_but_not_str(schema):
            native_frame = native_namespace.Table.from_arrays(
                pa_arrays, names=list(schema)
            )
        elif schema is None:
            native_frame = native_namespace.Table.from_arrays(
                pa_arrays, names=[f"column_{x}" for x in range(data.shape[1])]
            )
        else:
            msg = (
                "`schema` is expected to be one of the following types: "
                "Mapping[str, DType] | Schema | Sequence[str]. "
                f"Got {type(schema)}."
            )
            raise TypeError(msg)
    else:  # pragma: no cover
        try:
            # implementation is UNKNOWN, Narwhals extension using this feature should
            # implement `from_numpy` function in the top-level namespace.
            native_frame = native_namespace.from_numpy(data, schema=schema)
        except AttributeError as e:
            msg = "Unknown namespace is expected to implement `from_numpy` function."
            raise AttributeError(msg) from e
    return from_native(native_frame, eager_only=True)
```

The `_2DArray` case is really interesting to me, since it has already carved out space for us (albeit `DataFrame`-only):

```python
try:
    # implementation is UNKNOWN, Narwhals extension using this feature should
    # implement `from_numpy` function in the top-level namespace.
    native_frame = native_namespace.from_numpy(data, schema=schema)
except AttributeError as e:
    msg = "Unknown namespace is expected to implement `from_numpy` function."
    raise AttributeError(msg) from e
```

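To make the proposed `EagerNamespace.from_numpy` overloads more concrete, one possible runtime dispatch is sketched below - splitting on dimensionality and delegating each case to the pieces described above (`_series_from_scalar`, `_series_from_1d` and `_dataframe_from_2d` are invented placeholder names):

```python
from typing import Any

import numpy as np


def from_numpy(self: Any, value: Any, /) -> Any:
    # Hypothetical dispatch backing the overloads sketched earlier.
    arr = np.asarray(value)
    if arr.ndim == 0:
        # `_NumpyScalar` -> length-1 series (cf. `_from_scalar`).
        return self._series_from_scalar(arr.item())
    if arr.ndim == 1:
        # `_1DArray` -> series (cf. `EagerSeries._from_iterable`).
        return self._series_from_1d(arr)
    if arr.ndim == 2:
        # `_2DArray` -> per-backend DataFrame constructor (cf. `nw.from_numpy`).
        return self._dataframe_from_2d(arr)
    msg = f"Expected at most 2 dimensions, got {arr.ndim}"
    raise ValueError(msg)
```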
dangotbanned (Member Author):

If we look ahead to v2, the particular names will change but the direction is the same:

@dangotbanned (Member Author) commented Mar 9, 2025:

Here's what I think the generics need to be:

```python
from typing import TYPE_CHECKING
from typing import Any
from typing import Protocol

from typing_extensions import TypeVar

from narwhals.typing import _1DArray
from narwhals.typing import _2DArray

if TYPE_CHECKING:
    import numpy as np
    from typing_extensions import Self
    from typing_extensions import TypeAlias

    from narwhals.utils import _FullContext


_NumpyScalar: TypeAlias = "np.generic[Any]"

ToNumpyT_co = TypeVar("ToNumpyT_co", covariant=True)
FromNumpyT_contra = TypeVar("FromNumpyT_contra", contravariant=True, default=ToNumpyT_co)


class NumpyConvertible(Protocol[ToNumpyT_co, FromNumpyT_contra]):
    def to_numpy(self) -> ToNumpyT_co: ...
    @classmethod
    def from_numpy(cls, value: FromNumpyT_contra, *, context: _FullContext) -> Self: ...


class SeriesImpl(NumpyConvertible[_1DArray, _NumpyScalar | _1DArray]): ...


class DataFrameImpl(NumpyConvertible[_2DArray]): ...
```

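As a sanity check on the variance choices (a toy example, not part of the PR): `to_numpy` only ever *produces* `ToNumpyT_co`, so covariance is safe; `from_numpy` only ever *consumes* `FromNumpyT_contra`, so contravariance is safe; and `default=ToNumpyT_co` means `DataFrameImpl` only has to spell the array type once.

```python
import numpy as np
from typing_extensions import Self


class ToySeries:
    """Roughly matches `NumpyConvertible[_1DArray, _NumpyScalar | _1DArray]` structurally."""

    def __init__(self, data: np.ndarray) -> None:
        self._data = np.asarray(data)

    def to_numpy(self) -> np.ndarray:
        # Produces the covariant `ToNumpyT_co` type.
        return self._data

    @classmethod
    def from_numpy(cls, value: "np.generic | np.ndarray", *, context: object = None) -> Self:
        # Consumes the wider, contravariant input: a numpy scalar or a 1D array.
        return cls(np.atleast_1d(np.asarray(value)))
```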

@dangotbanned dangotbanned mentioned this pull request Mar 9, 2025
@dangotbanned (Member Author) commented Mar 9, 2025:

> thanks Dan! From a quick look I think this looks great, feel free to ping me for review when you're happy with it

@MarcoGorelli I'm gonna have to wrap this up soon - just to avoid spending too long on resolving conflicts.

Still somewhat manageable atm - but fully a self-inflicted chore I want to avoid now πŸ˜…

Will do my best to get it in shape today - really happy with the rabbit holes I've gone down though πŸ˜‰

Update

This'll have to do for now - ready as I'll ever be @MarcoGorelli

  β€’ Very open to alternatives on what the module should be called - it was the *least-bad* one I thought of
  β€’ Really hate this as a solution, and keep trying to get `mypy` to let me do something else
  β€’ Was not easy getting that to please both `mypy` & `pyright`
@dangotbanned dangotbanned marked this pull request as ready for review March 9, 2025 17:48
Development

Successfully merging this pull request may close these issues.

fix TypeVar used in (SparkLike|DuckDB)Expr