Ubiquitous vcrpy usage #442

ajakk · 2020-03-29T04:12:00Z

Hi, I'm working on packaging buku 4.3 for Gentoo. In Gentoo, when the package manager runs a package's unit tests, network access is disabled by default. I noticed that the buku 4.3 tests don't all work with network access disabled, so I'm looking to expand vcrpy usage until the tests can all work without access to the Internet.

I see two ways of doing this:

Add @vcr.use_cassette(...) decorators on all of the relevant functions. I've already worked on this a bit in my forked repository, but I believe I've thought of a better way
Add a pytest fixture with a scope of session which handles decorating functions with vcr boilerplate automatically. This would entail the most disruption to how buku currently handles tests.

Do you prefer I continue the first method or rework things in favor of the second?

The text was updated successfully, but these errors were encountered:

jarun · 2020-03-29T04:34:05Z

@rachmadaniHaryono please take a look.

rachmadaniHaryono · 2020-03-29T04:52:11Z

if jarun allow the additional file i'm ok with it.

i'm interested with second method.

first method may be harder with the maintenance because each test need to be paired with individual yaml file

jarun · 2020-03-29T05:12:44Z

@rachmadaniHaryono what's this additional file? can you explain a bit?

rachmadaniHaryono · 2020-03-29T06:08:31Z

see this commit example https://github.com/ajakk/buku/commit/1b5a6eeb063690dd2627e7c14afc9714bca5485a

each test with vcrpy may require yaml file

there is possibility that the yaml will only be created not added from repo, but it mean each tester may have different result, unless the tester share the yaml files outside from repo

jarun · 2020-03-29T07:31:41Z

No can do. ;)

Can't gentoo users install from pip?

ajakk · 2020-03-29T16:52:20Z

They can, but that doesn't mean they should. Such an approach broken for several reasons, the biggest of which is probably that it confuses the package manager if something is installed or uninstalled by pip as root. It can break systems. Surely on just about every distribution the recommended (and supported) way of installing software is via the package manager.

What's the problem with extra files? They're automatically handled by vcrpy, no added maintenance burden on the maintainer.

jarun · 2020-03-29T17:20:09Z

What's the problem with extra files?

It's not 1/2 extra files... it's too many extra files. If you can't come up with a sensible test strategy without the need to use tons of files to a repo, I don't see why I need to allow this.

No added maintenance burden on the maintainer.

I like to keep my project repos clean
Any additional file added to a repo is a burden for the maintainers; maybe not when it works, but definitely when they fail

In Gentoo, when the package manager runs a package's unit tests, network access is disabled by default.

That's the disaster. It's ridiculous because you want to test a utility that accesses the web in the real world in a limited test suite and expect everything would work well when users actually access the web. It defeats the purpose of running the test case, at least for web based software. Why run the tests at all?

Why don't you limit the test scope to tests that do not need internet access, maybe using a parameter? All the tests we run on Travis to sanitize Buku are not necessarily of interest to Gentoo.

jarun · 2020-03-29T17:25:08Z

the biggest of which is probably that it confuses the package manager if something is installed or uninstalled by pip as root

Because the package manager is not capable of tracking the versions, timestamps and checksums for each package installed through it in it's exclusive database? I think it's a trivial problem to solve.

ajakk · 2020-03-29T18:07:11Z

I don't see why I need to allow this.

The benefits of using vcrpy are clear(see the 'Rationale' section).

Any additional file added to a repo is a burden for the maintainers

The burden would be minimal or even less than what it is now if my option #2 is pursued. I think it would be helpful to see what extensive vcrpy usage would look like in the real world. Let's take my package tuir. It has numerous cassettes. Let's say, for whatever reason, the test_content_cache cassette is broken and needs to be regenerated. That cassette is generated from the test_content_cache function in tests/test_content.py. That's easy to find based on the name of the YAML. I rerun that test function with --record-mode=once, something like pytest tests/test_content.py::test_content_cache --record-mode=once. The cassette is regenerated. The cassette only needs to be regenerated in cases where you'd have to modify the test function anyway, so little maintenance burden is added.

That's the disaster.

Gentoo's network sandbox is quite useful. What if the data behind one of the URLs buku's tests fetch changes unexpectedly? Does that mean buku is broken because someone on the Internet changed their website? No, that has nothing to do with buku's functionality at all, and yet the tests are broken due to it. Using a cassette would significantly increase test stability.

It's ridiculous because you want to test a utility that accesses the web in the real world in a limited test suite and expect everything would work well when users actually access the web...Why run the tests at all?

You're testing your code, not whatever Internet pages you're fetching. It's a reasonable assumption that the Internet pages will always be the same, you're assuming that anyway by basing tests on them, so why not follow through on that assumption and record cassettes for the requests to speed up (and improve the stability of) your tests?

Why don't you limit the test scope to tests that do not need internet access, maybe using a parameter?

This is similar to saying "why not just don't run tests that could fail?". Because then something could be broken in the code being tested and it wouldn't be discovered, defeating the point of testing in the first place.

Because the package manager is not capable of tracking the versions, timestamps and checksums for each package installed through it in it's exclusive database? I think it's a trivial problem to solve.

Portage is incapable of tracking changes made to packages by other package managers. Getting multiple package managers (portage and pip) to work properly together on the same root filesystem when they have design incompatbilities is a broken idea, hopefully for obvious reasons, and certainly not trivial. Even so, Gentoo does not support installing things as root with pip. I think some relatively trivial changes to buku's test system are quite worth doing in the name of both increasing test efficiency and enabling buku to not break on Gentoo. Buku is stuck at version 3.7 on Gentoo. I don't think it needs to be stuck there because buku's tests break Gentoo.

rachmadaniHaryono · 2020-03-29T23:02:13Z

Gentoo's network sandbox is quite useful. What if the data behind one of the URLs buku's tests fetch changes unexpectedly? Does that mean buku is broken because someone on the Internet changed their website? No, that has nothing to do with buku's functionality at all, and yet the tests are broken due to it. Using a cassette would significantly increase test stability.

this happenned multiple times and we just change/remove it when needed.

i can't say anything about adding extra file as jarun is the one who determine if we can add more file or not.

but imo we should focus on the problem, which is

In Gentoo, when the package manager runs a package's unit tests, network access is disabled by default.

I noticed that the buku 4.3 tests don't all work with network access disabled

another option other than vcrpy+yaml is to disable any test with network access on gentoo test. this may skip error on gentoo user but this is (hopefully) minimal as we already have linux test on our automated test. if there is really specific gentoo error, user/tester can create issue page on the repo

e:

Such an approach broken for several reasons, the biggest of which is probably that it confuses the package manager if something is installed or uninstalled by pip as root

just as the link you recommended and below SO we really don't recommend using sudo pip to install.

the installation put on readme is only generic one with the assumption that the user know essential pip usage (note we just change pip to pip3 to let user know that we only support python3)

ajakk · 2020-03-29T23:23:00Z

this happenned multiple times and we just change/remove it when needed.

This is an easy example of where using vcrpy for all test network-accesses would reduce the maintainence burden on the maintainer. An application's tests should fail if and only if the application's code is broken somehow, not because the Internet changed.

another option other than vcrpy+yaml is to disable any test with network access on gentoo test. this may skip error on gentoo user but this is (hopefully) minimal as we already have linux test on our automated test. if there is really specific gentoo error, user/tester can create issue page on the repo

Just not using tests that fail defeats the purpose of testing. Why test at all if you just ignore potential test failures?

just as the link you recommended and below SO we really don't recommend using sudo pip to install.

Jarun suggested it. I only mention it because the maintainer suggested it.

rachmadaniHaryono · 2020-03-29T23:59:06Z

Jarun suggested it. I only mention it because the maintainer suggested it.

jarun: Can't gentoo users install from pip?

jarun suggested using pip but not sudo pip

This is an easy example of where using vcrpy for all test network-accesses would reduce the maintainence burden on the maintainer. An application's tests should fail if and only if the application's code is broken somehow, not because the Internet changed.

Just not using tests that fail defeats the purpose of testing. Why test at all if you just ignore potential test failures?

you are correct, but i will let jarun decide the file addition as he is the owner of the repo

i do still interested if we can work out how to fix gentoo test

e: random idea: is it possible for gentoo tester to test from github branch?

if it is possible, gentoo tester can use their branch and that let other maintainer have the file too, without changing the main repo

jarun · 2020-03-30T02:35:30Z

What if the data behind one of the URLs buku's tests fetch changes unexpectedly? Does that mean buku is broken because someone on the Internet changed their website?

Yes, if it happens and Buku can't handle it, it is broken. We need to fix that because tomorrow some user might fetch the same URL.

This is similar to saying "why not just don't run tests that could fail?". Because then something could be broken in the code being tested and it wouldn't be discovered, defeating the point of testing in the first place.

Please get back with the tests in a single file that can be run without using the internet and those reasonably tests Buku's functionality. Run only those in that file.

Otherwise, as @rachmadaniHaryono suggested, see if you can use your branch. Buku is not going to be changed a lot in the future.

ajakk mentioned this issue Apr 1, 2020

www-misc/buku: Add version 4.3 gentoo/gentoo#15096

Closed

jarun closed this as completed Apr 8, 2020

github-actions bot locked and limited conversation to collaborators Jun 15, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ubiquitous vcrpy usage #442

Ubiquitous vcrpy usage #442

ajakk commented Mar 29, 2020 •

edited

Loading

jarun commented Mar 29, 2020

rachmadaniHaryono commented Mar 29, 2020

jarun commented Mar 29, 2020

rachmadaniHaryono commented Mar 29, 2020

jarun commented Mar 29, 2020

ajakk commented Mar 29, 2020

jarun commented Mar 29, 2020

jarun commented Mar 29, 2020 •

edited

Loading

ajakk commented Mar 29, 2020 •

edited

Loading

rachmadaniHaryono commented Mar 29, 2020 •

edited

Loading

ajakk commented Mar 29, 2020

rachmadaniHaryono commented Mar 29, 2020 •

edited

Loading

jarun commented Mar 30, 2020 •

edited

Loading

Ubiquitous vcrpy usage #442

Ubiquitous vcrpy usage #442

Comments

ajakk commented Mar 29, 2020 • edited Loading

jarun commented Mar 29, 2020

rachmadaniHaryono commented Mar 29, 2020

jarun commented Mar 29, 2020

rachmadaniHaryono commented Mar 29, 2020

jarun commented Mar 29, 2020

ajakk commented Mar 29, 2020

jarun commented Mar 29, 2020

jarun commented Mar 29, 2020 • edited Loading

ajakk commented Mar 29, 2020 • edited Loading

rachmadaniHaryono commented Mar 29, 2020 • edited Loading

ajakk commented Mar 29, 2020

rachmadaniHaryono commented Mar 29, 2020 • edited Loading

jarun commented Mar 30, 2020 • edited Loading

ajakk commented Mar 29, 2020 •

edited

Loading

jarun commented Mar 29, 2020 •

edited

Loading

ajakk commented Mar 29, 2020 •

edited

Loading

rachmadaniHaryono commented Mar 29, 2020 •

edited

Loading

rachmadaniHaryono commented Mar 29, 2020 •

edited

Loading

jarun commented Mar 30, 2020 •

edited

Loading