Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't use sparse registry mirror #487

Closed
inflation opened this issue Jul 10, 2023 · 14 comments · Fixed by EmbarkStudios/tame-index#4 or #506
Closed

Can't use sparse registry mirror #487

inflation opened this issue Jul 10, 2023 · 14 comments · Fixed by EmbarkStudios/tame-index#4 or #506
Labels
A-cli Area: engine around the lints C-enhancement Category: raise the bar on expectations E-help-wanted Call for participation: Help is requested to fix this issue.

Comments

@inflation
Copy link

Steps to reproduce the bug with the above code

I'm having this config for the mirror of crates.io:

[source.crates-io]
replace-with = 'tuna'

[source.tuna]
registry = "sparse+https://mirrors.tuna.tsinghua.edu.cn/crates.io-index/"

[registries.tuna]
index = "sparse+https://mirrors.tuna.tsinghua.edu.cn/crates.io-index/"

Actual Behaviour

cargo semver-checks check-release failed with:

Error: The repo at path /Users/inflation/.cargo/registry/index/mirrors.tuna.tsinghua.edu.cn-2eab394af869c8a2 is unusable due to having an invalid HEAD reference: reference 'refs/heads/master' not found; class=Reference (4); code=NotFound (-3)

Expected Behaviour

Checks normally

Generated System Information

Software version

cargo-semver-checks 0.22.1

Operating system

macOS 13.4.1 (Darwin 22.5.0)

Command-line

/Users/inflation/.cargo/bin/cargo-semver-checks semver-checks --bugreport

cargo version

> cargo -V
cargo 1.70.0 (ec8a8a0ca 2023-04-25)

Compile time information

  • Profile: release
  • Target triple: aarch64-apple-darwin
  • Family: unix
  • OS: macos
  • Architecture: aarch64
  • Pointer width: 64
  • Endian: little
  • CPU features: aes,crc,dit,dotprod,dpb,dpb2,fcma,fhm,flagm,fp16,frintts,jsconv,lor,lse,neon,paca,pacg,pan,pmuv3,ras,rcpc,rcpc2,rdm,sb,sha2,sha3,ssbs,v8.1a,v8.2a,v8.3a,v8.4a,vh
  • Host: aarch64-apple-darwin

Build Configuration

No response

Additional Context

No response

@inflation inflation added the C-bug Category: doesn't meet expectations label Jul 10, 2023
@obi1kenobi
Copy link
Owner

Interesting, thanks for the report! I've never used registry mirrors, is there somewhere I can read about what benefits they provide and in what circumstances they get used?

cargo-semver-checks uses the crates_index crate to interact with the crate index and the local cargo cache. So we support whatever that crate supports, and like I mentioned, I haven't looked at mirror functionality so I'm not sure whether this is supported or not.

The error message seems to indicate that, for this configuration, the underlying crates_index crate expected to find a git repo but did not. Is it possible this is due to a configuration error?

The underlying code in question is here:
https://github.com/frewsxcv/rust-crates-index/blob/cf65d090563692aa44777f47a3b3f45973cc890d/src/bare_index.rs#L138-L152

If you've double-checked your configuration and have found it correct, would you mind chatting with the crates_index maintainers in an issue there to see if mirrors are supported and if they require any special functionality to enable? cargo-semver-checks uses the documented crates_index::Index::new_cargo_default() way of accessing the index which loads config information (hence your error) so if this is indeed a bug, it appears to me that it's somewhere inside that call there and not in cargo-semver-checks itself.

@inflation
Copy link
Author

I think it's more about the sparse protocol than the mirrors. By default, if I'm only using the sparse index of crates.io, it'll fetch the git repo automatically. However for custom registries that only contains sparse index, it'll just assume it's a git repo and failed to fetch HEAD.

@obi1kenobi
Copy link
Owner

Ah interesting. I'm not sure if this is a fundamental limitation of crates_index or just a bug there.

Would you mind opening a linked issue in the crates_index repo, sharing this information with them, and asking if registries that only contain sparse indexes can be supported? Like I mentioned, I don't understand the registry protocol details at all, so if I were to open the issue, I'd be pretty useless and would just be relaying information between you and the maintainers of that crate.

@inflation
Copy link
Author

crates_index actually provide a sparse_index so you can use that. The thing is you are assuming everything is a git repo by using git-index feature and crate_index::Index, which is not true. I don't know if it's better to have the detection happening in crates_index or here, but the general idea is to check if [source.crates-io.protocol] is set to sparse or any registry URL is starting with sparse+.

For now it should be using the sparse protocol by default (which cargo already have), and provide a flag to use the git index repo for fallback.

@obi1kenobi
Copy link
Owner

These are good points. I think it'd be better if the crates_index or a wrapper around it provides detection of the appropriate config, or else we'd be reimplementing a good chunk of cargo config management all over again.

I can ask the maintainers of crates_index about this and see what they think.

@obi1kenobi
Copy link
Owner

It seems that the sparse index functionality in crates_index is still a work in progress, here are a few related issues:
frewsxcv/rust-crates-index#44
frewsxcv/rust-crates-index#123
frewsxcv/rust-crates-index#126

@obi1kenobi obi1kenobi added A-cli Area: engine around the lints C-enhancement Category: raise the bar on expectations E-help-wanted Call for participation: Help is requested to fix this issue. and removed C-bug Category: doesn't meet expectations labels Jul 11, 2023
@obi1kenobi
Copy link
Owner

Perhaps related to #160 as well.

@pksunkara
Copy link
Contributor

crates_index is now unmaintained. frewsxcv/rust-crates-index#132

@obi1kenobi
Copy link
Owner

obi1kenobi commented Jul 23, 2023 via email

@Jake-Shadle
Copy link
Contributor

Normally I'm not "that guy", but in light of frewsxcv/rust-crates-index#132, I thought I would mention that I have been working on an alternative to crates-index, https://github.com/EmbarkStudios/tame-index, as several use cases I had were blocked by the constraints of crates-index.

Notably, the use case mentioned here is supported by tame-index since I had a similar use case in cargo-deny for needing to open the crates.io index to get metadata for crates, but not wanting to duplicate functionality already in crates-index, as well as not having a single type that could access a git/sparse/local index which was clunky and unpleasant.

The code would be roughly:

impl RustdocFromRegistry {
    pub fn new(target_root: &std::path::Path, config: &mut GlobalConfig) -> anyhow::Result<Self> {
        let crates_io_url = tame_index::index::IndexUrl::crates_io(
            None, // unless you want to override the config root to something other than the current directory
            None, // unless you want to override cargo home
            None, // unless you want to override the cargo version
        )?;

        Ok(Self {
            target_root: target_root.to_owned(),
            version: None,
            index: tame_index::index::ComboIndexCache::new(IndexLocation::new(crates_io_url))?,
        })
    }

    fn load_rustdoc(
        &self,
        config: &mut GlobalConfig,
        rustdoc_cmd: &RustdocCommand,
        crate_data: CrateDataForRustdoc,
    ) -> anyhow::Result<PathBuf> {
        let crate_ = self.index.cached_krate(crate_data.name.try_into().expect("published cargo crate somehow had an invalid name")).with_context(|| {
            anyhow::format_err!(
                "{} not found in registry (crates.io). \
        For workarounds check \
        https://github.com/obi1kenobi/cargo-semver-checks#does-the-crate-im-checking-have-to-be-published-on-cratesio",
                crate_data.name
            )
        })?;
        ...
    }
}

I guess one wrinkle is that tame-index uses gix for its git implementation while this crates uses git2 so you probably wouldn't want to enable the git feature flag, so updating of the git registry wouldn't be supported.

@obi1kenobi
Copy link
Owner

That looks quite nice, thanks for mentioning it and showing code!

I'm not sure I understood the git caveat at the end. Is the issue that users using a git registry wouldn't benefit from a local cache? Or is the problem something else?

Overall this is quite tempting to switch to. It seems like we'd get less complexity, more functionality, and maintained dependency. I'd love to understand the git caveat better, but assuming it's not a major issue I'd love to make the switch!

@Jake-Shadle
Copy link
Contributor

Well the current code is doing an update of the index, which is only available if the git feature flag is enabled and then gix is pulled in. Unlike crates-index though, it is able to read .cache entries from a git index if they are available, but would again need gix enabled to read the crate's index metadata if the .cache entry did not exist.

My use case for cargo-deny isn't impacted by this because cargo_metadata is always executed before index operations so the git/sparse index is always guaranteed to have .cache entries for all crates returned by the cargo_metadata call.

If you wanted to keep that functionality you could enable the git feature, but it would bloat compile times and binary artifacts since git2 and gix would be both present.

@obi1kenobi
Copy link
Owner

Ah I see. I think we have to be able to update the index, since we need to find historical versions of a crate from the index etc. I'm not too worried about compile times, since many / most users just use the prebuilt binaries that get generated on every release.

How would updating the index work after enabling the git feature?

Any chance you might be open to making a PR? It seemed like you got 99% of the way there in the example you gave, and you know way more about all this than I do so I trust you more than I trust myself on this 😅

@Jake-Shadle
Copy link
Contributor

Sure, I can do that.

Jake-Shadle added a commit to EmbarkStudios/tame-index that referenced this issue Jul 26, 2023
- Add helpers to GitError
- Make gix and reqwest exported via crate::externals

While working on a PR to resolve
obi1kenobi/cargo-semver-checks#487, that crate
was looking at the error for git operations to determine if they should
be retried, which is....unfortunately way more complicated with gix. So
this just adds helper methods on GitError to make that use case nicer.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cli Area: engine around the lints C-enhancement Category: raise the bar on expectations E-help-wanted Call for participation: Help is requested to fix this issue.
Projects
None yet
4 participants