Try normalizing the dbcore String type #3774

sampsyo · 2020-10-18T11:20:59Z

This is an attempt to fix #3773 by more aggressively normalizing values for dbcore's String type.

Strings were not being normalized, unlike some other types, leading to downstream problems like #3773.

wisp3rwind · 2020-10-19T09:10:15Z

That looks a lot like

Lines 89 to 91 in 52ca0cb

    # TODO This should eventually be replaced by
    # `self.model_type(value)`
    return value

should be addressed, right? Does anyone (in particular @geigerzaehler according to git blame) remember why that comment was added in the first place? Maybe, given that this PR has no CI failures, there's no issue with coercing all values to their model_type anymore?

In fact, maybe String.normalize() should ultimately be more strict, and assert that its values are actually six.text_type, rather than just converting them. Pretty much any value can successfully be converted to a string, so there's no real type checking otherwise.
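To make the difference between converting and checking concrete, here is a minimal sketch. The two classes are hypothetical stand-ins rather than the dbcore types, and `str` is used in place of `six.text_type` on the assumption of Python 3.

```python
class CoercingString:
    """Stand-in: converts any value to text, so almost nothing is rejected."""

    model_type = str

    def normalize(self, value):
        if value is None:
            return self.model_type()
        return self.model_type(value)


class StrictString(CoercingString):
    """Stand-in: refuses non-text values instead of converting them."""

    def normalize(self, value):
        if value is None:
            return self.model_type()
        if not isinstance(value, self.model_type):
            raise TypeError(
                "expected {}, got {}".format(
                    self.model_type.__name__, type(value).__name__
                )
            )
        return value


print(repr(CoercingString().normalize(123)))  # the int is silently converted

try:
    StrictString().normalize(123)
except TypeError as exc:
    print("rejected:", exc)
```

The coercing variant accepts the integer without complaint, which is the "no real type checking" concern; the strict variant surfaces the mistake at assignment time.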

sampsyo · 2020-10-19T12:49:56Z

Absolutely, yeah, that would be the thing to fix. Here's the context as best as I can remember. Before we added the type system, lots of parts of the codebase were very loosey-goosey about how they used various fields. This was a growing problem, of course, which is why we added the types in the first place. But we were nervous about all that existing code that was so loosey-goosey about the types. The idea was to proceed in two phases:

1. Get the type system into place, but don't disrupt existing code; behave the same way as the old type-free system.
2. Flip the switch to turn on automatic conversions.

You could imagine adding a third phase to this, as @wisp3rwind suggested: instead of silently converting things like item.title = 123, start throwing errors. But it seems like we stalled out after the first phase. Getting more aggressive about this would be the right thing to do, however.

wisp3rwind

I just tried to run the tests with the change mentioned in the TODO note above. There's one test failure, namely test_format_flex_field_bytes in test_dbcore. The issue is that types.Default then forces all flexattrs to unicode, but the test expects to be able to write utf8 bytes to the database. I'm pretty sure that forcing unicode here would break flexattr usage by some plugins. And since the database actually stores bytes rather than unicode objects (right?), it seems sensible to be able to write raw bytes.

So it might indeed be best to only coerce Strings to their model_type, not all fields.
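As a plain-Python illustration of why blanket coercion is risky for bytes (this is not beets code, and whether types.Default takes exactly this path is an assumption): on Python 3, calling `str()` on a bytes object does not decode it, it produces the text of its repr.

```python
raw = "café".encode("utf-8")  # UTF-8 bytes, as a plugin might store them

coerced = str(raw)             # blanket coercion: yields the repr text, not the word
decoded = raw.decode("utf-8")  # explicit decoding recovers the original text

print(repr(coerced))
print(repr(decoded))
print(coerced == decoded)      # False: the coerced value is not the decoded text
```

So a field that accepts raw bytes cannot simply be passed through `str()` without changing what ends up stored.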

wisp3rwind · 2020-10-22T15:25:12Z

beets/dbcore/types.py

@@ -207,6 +207,12 @@ class String(Type):
    sql = u'TEXT'
    query = query.SubstringQuery

+    def normalize(self, value):
+        if value is None:
+            return self.model_type()


I think it would be cleaner to return self.null here (even though that is just self.model_type() for String).

Yeah, you're absolutely right about that.
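A rough sketch of the suggested shape, using simplified stand-in classes rather than the real dbcore ones. The diff excerpt above only shows the `None` branch, so the non-`None` branch here is an assumption based on the PR description, and `PlaceholderString` is a hypothetical subclass added only to show why going through `null` is cleaner.

```python
class Type:
    """Simplified stand-in for a dbcore-style field type."""

    model_type = str

    @property
    def null(self):
        # Default "empty" value for this type.
        return self.model_type()

    def normalize(self, value):
        # Pass-through behaviour, as in the TODO snippet quoted earlier.
        return value


class String(Type):
    def normalize(self, value):
        if value is None:
            return self.null  # instead of self.model_type()
        # Assumed: coerce everything else to the model type.
        return self.model_type(value)


class PlaceholderString(String):
    """Hypothetical subclass that customizes its null value."""

    @property
    def null(self):
        return "(unknown)"


print(repr(String().normalize(None)))
print(repr(String().normalize(42)))
print(repr(PlaceholderString().normalize(None)))
```

Because `normalize` goes through `self.null`, a subclass that overrides `null` gets consistent behaviour without having to override `normalize` as well.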

sampsyo · 2020-10-23T14:08:15Z

Intriguing! Thanks for trying that out. Let's keep the dream alive—maybe we can work around that limitation by, for example, allowing fields without a declared type to remain dynamically typed and not coerced. But for now, we can make this more conservative change for strings only.
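One way to picture that workaround, purely as a sketch: normalize only fields that have a declared type and store everything else untouched. `Model`, `_fields`, and `_values` below are hypothetical names for illustration and do not reflect how dbcore is actually structured.

```python
class String:
    """Stand-in string type that coerces values to text."""

    model_type = str

    def normalize(self, value):
        if value is None:
            return self.model_type()
        return self.model_type(value)


class Model:
    # Declared (typed) fields; anything else is treated as a flexible attribute.
    _fields = {"title": String()}

    def __init__(self):
        self._values = {}

    def __setitem__(self, key, value):
        field_type = self._fields.get(key)
        if field_type is not None:
            # Declared field: coerce to the field's model type.
            value = field_type.normalize(value)
        # Undeclared field: stored as given, so raw bytes survive.
        self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]


item = Model()
item["title"] = 123          # declared field, coerced to text
item["mood"] = b"caf\xc3\xa9"  # undeclared field, left as bytes

print(repr(item["title"]))
print(repr(item["mood"]))
```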

Try normalizing the dbcore String type

a63ee0e

Strings were not being normalized, unlike some other types, leading to downstream problems like #3773.

sampsyo mentioned this pull request Oct 18, 2020

FetchArt crashes with TypeError for particular album #3773

Closed

Use Unicode in lyrics tests

e99becb

wisp3rwind reviewed Oct 22, 2020


sampsyo added 4 commits October 23, 2020 10:10

Use null instead of model_type()

29bab24

Merge branch 'master' into normalize-str

c3a54cc

Changelog entry

2cbec2d

Merge branch 'master' into normalize-str

8ac14a3

sampsyo merged commit 627005d into master Oct 28, 2020

snejus deleted the normalize-str branch June 15, 2024 03:43

snejus mentioned this pull request Dec 5, 2024

Release: Fix changelog formatting #5529

Merged
