Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Genius lyrics provider not finding track #3446

Closed
xhocquet opened this issue Nov 30, 2019 · 11 comments
Closed

Genius lyrics provider not finding track #3446

xhocquet opened this issue Nov 30, 2019 · 11 comments
Labels
needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature."

Comments

@xhocquet
Copy link
Contributor

Hey there, first off thanks for the software, it's very useful! I'm trying to get some lyrics for a new album I added to my library. I originally noticed more 'edge' hip-hop lyrics not importing despite Genius being enabled, and Genius having a TON of rap lyrics. It does not seem like the plugin is making a request to Genius like the other providers (which do not have the song)

Specific song I am trying to import below: https://genius.com/Billy-woods-marlow-lyrics

Problem

Running this command in verbose (-vv) mode:

$ beet -vvv lyrics billy woods marlow -f
user configuration: C:\Users\meesles\AppData\Roaming\beets\config.yaml
data directory: C:\Users\meesles\AppData\Roaming\beets
plugin paths:
Sending event: pluginload
lyrics: Disabling google source: no API key configured.
library database: C:\Users\meesles\AppData\Roaming\beets\library.db
library directory: D:\Library\Music
Sending event: library_opened
lyrics: failed to fetch: http://lyrics.wikia.com/Billy_Woods:Marlow (404)
lyrics: failed to fetch: https://www.musixmatch.com/lyrics/Billy-Woods/Marlow (404)
lyrics: lyrics not found: Billy Woods - Terror Management - Marlow
Sending event: cli_exit

File: http://www.mediafire.com/file/0rfq90y1v77mejf/01_Marlow.mp3/file

Setup

  • OS: Windows 10
  • Python version: 3.7.4
  • beets version: 1.4.9
  • Turning off plugins made problem go away (yes/no): no

My configuration (output of beet config) is:

lyrics:                                                                  
    bing_lang_from: []                                                   
    sources: google lyricwiki musixmatch genius                          
    fallback: ''                                                         
    auto: yes                                                            
    bing_client_secret: REDACTED                                         
    bing_lang_to:                                                        
    google_API_key: REDACTED                                             
    google_engine_ID: REDACTED                                           
    genius_api_key: REDACTED                                             
    force: no                                                            
    local: no                                                            
directory: D:\Library\Music                                              
                                                                         
import:                                                                  
    move: yes                                                            
    incremental: yes                                                     
                                                                         
paths:                                                                   
    default: $albumartist/$album%aunique{}/$track $title                 
    singleton: Non-Album/$artist/$title                                  
    comp: Compilations/$album%aunique{}/$track $title                    
                                                                         
ui:                                                                      
    color: yes                                                           
                                                                         
musicbrainz:                                                             
    user: xhocquet                                                       
    pass: REDACTED                                                       
                                                                         
plugins: missing mbcollection mbsync lyrics random mbsubmit              
mbcollection:                                                            
    auto: yes                                                            
    collection: 51e9fbd7-7fdf-4fd5-a315-4cd225562474                     
    remove: yes                                                          
mbsubmit:                                                                
    format: $track. $title - $artist ($length)                           
    threshold: medium                                                    
missing:                                                                 
    count: no                                                            
    total: no                                                            
    album: no                                                            
                                                                         
@sampsyo
Copy link
Member

sampsyo commented Dec 1, 2019

Hi! Looking at the code, it doesn’t look like the Genius backend logs any messages in the success case or in several possible failure cases. So if the plug-in were making requests and not finding anything for some reason, it’s possible that you wouldn’t see anything in the log.

So: is your diagnosis that the plugin isn’t making requests based on the log alone, or some other evidence like network traffic? If the former, something could indeed be broken, but we’ll need to add more logging to find out what.

@sampsyo sampsyo added the needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." label Dec 1, 2019
@xhocquet
Copy link
Contributor Author

xhocquet commented Dec 1, 2019

@sampsyo Spot on, I just expected a log like the other matchers but do not know for sure. How can I generate more logs to further debug this issue? Happy to help! As you saw, the link to the song matches all the fields from my metadata, so not sure why it wouldn't be working.

@xhocquet xhocquet changed the title Genius lyrics provider does not seem to make request Genius lyrics provider not finding track Dec 1, 2019
@sampsyo
Copy link
Member

sampsyo commented Dec 1, 2019

Great! I think we should add a few self._log.debug calls, like this one here:

beets/beetsplug/lyrics.py

Lines 366 to 367 in 60bba37

self._log.debug(u'Genius page request for {0} failed: {1}',
page_url, exc)

…elsewhere throughout the backend's logic.

For example, I think the most likely culprit is that this condition is false:

beets/beetsplug/lyrics.py

Lines 402 to 404 in 60bba37

if song_info:
song_api_path = song_info["result"]["api_path"]
return self.lyrics_from_song_api_path(song_api_path)

So, after that, we could put something like:

else:
    self._log.debug('No matching artist.')
    return None

or similar. Then, maybe it would make sense to log something about the contents of json["response"]["hits"] to diagnose why this is missing.

If you do go about adding this extra logging (thanks!!), could I ask you the favor of opening a pull request so other users can benefit too?

@xhocquet
Copy link
Contributor Author

xhocquet commented Dec 2, 2019

@sampsyo I wasn't expecting to patch this up, but I'd be happy to try! I love this project and my Python is passable 😄 . I'll start looking into dev setup and see what I can do! Thanks for the pointers.

@xhocquet
Copy link
Contributor Author

xhocquet commented Dec 4, 2019

@sampsyo Cool, made a little progress so far. I added debuggers as suggested and I believe I've landed myself smack dab in one of Python (2)'s most infamous headaches: string encodings!

What I'm seeing when I print out the primary artist of the Genius response:

lyrics: Red Marlow
lyrics: Christopher Marlowe
lyrics: Christopher Marlowe
lyrics: Marlowe
lyrics: Christopher Marlowe
lyrics: Marlowe
lyrics: ​billy woods
lyrics: Marlowe
lyrics: Marlowe
lyrics: Christopher Marlowe

In fact, the billy_woods response in my terminal has an extra character and one that causes issues trying to decode:

Screen Shot 2019-12-04 at 1 07 21 PM

I tried a couple things:

  1. Using .strip() which doesn't seem to remove the character
  2. Using .encode() with both utf-8 and ascii, both return a similar but different error:
    • UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)
    • UnicodeEncodeError: 'ascii' codec can't encode character u'\u200b' in position 0: ordinal not in range(128)
  3. Using str() to convert to a string. Fails with error:
    • UnicodeEncodeError: 'ascii' codec can't encode character u'\u200b' in position 0: ordinal not in range(128)

I'm sure by just trying a bunch of different options I could get it to work, but two questions for you:

  1. Is this even the right approach? Is stripping the strings appropriate during comparisons? Or is this simply a data issue for Genius?
  2. If this is something we should do, any guidance on existing patterns to deal with this?

Cheers!

@sampsyo
Copy link
Member

sampsyo commented Dec 5, 2019

Wow!! That's quite a thing! According to your cut n' paste, that's Unicode character U+200B ZERO WIDTH SPACE. I have no clue why Genius would include that on that artist. It definitely seems like a data issue in Genius.

If it affects more than this specific single artist, maybe it's worth adding a fix that strips this specific character, i.e., artist.strip(u'\u200b').

Regardless, however, adding the extra logging to make this problem more visible seems like a good idea!

@sampsyo
Copy link
Member

sampsyo commented Dec 5, 2019

@xhocquet
Copy link
Contributor Author

xhocquet commented Dec 5, 2019

@sampsyo Nice find! Makes sense.. I guess? I can think of a few different ways to maintain capitalization without weird characters though 🤔

So if it's a consistent pattern, can we add the strip() call? If so I'll do that and open a PR with some new loggers as well. I actually ended up using a log on each entry that came back from the list, which may be a bit too much for normal users. What do you think? Just one item saying Genius could not match the artist?

@sampsyo
Copy link
Member

sampsyo commented Dec 5, 2019

Yeah, that sounds great!

Indeed, I don't think we need to log every artist. Just logging that a match wasn't found seems good.

@xhocquet
Copy link
Contributor Author

xhocquet commented Dec 6, 2019

@sampsyo v1 up for your review! #3448

For my use-case, I also had to lowercase the artist comparisons to get it to match. I see a lot of lowercasing going on throughout the codebase, so I assumed that was a safe operation. Let me know if I need to make any changes!

@xhocquet
Copy link
Contributor Author

xhocquet commented Feb 4, 2020

Fixed!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature."
Projects
None yet
Development

No branches or pull requests

2 participants