[rust] add crates.io enrichment option for rust audit binary, json schema and spdx license updates. #3554
Hi @jimmystewpot, thanks for this enhancement; overall this looks great, and it's much appreciated that you've followed conventions really well. I left a few specific comments, but the biggest takeaways, I think are:
- "duplicate" the configuration struct (but it won't be completely duplicated -- for the multilevel configuration, you'll use *bool whereas the rust.CatalogerConfig would have a bool, for example)
- we probably don't want to choose between one metadata type or the other, but rather add a way to keep both (though the suggestions I have are only suggestions and I'd like to run these by the team when we start to introduce new patterns for things). maybe the best thing is just to add the fields to the existing structs and not worry about having to support multiple metadata types just yet
- you'll need to Sign-off your commit(s), see contributing.md
	"github.com/anchore/syft/syft/pkg/cataloger/rust"
)

type rustConfig rust.CatalogerConfig
This should not be a typedef of the rust struct but should be a declaration of all fields the CLI allows to be configured -- the idea is that these options define the CLI interface and can evolve separately from the internal configuration structs.
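As a rough sketch of what that separation could look like: the field set, the `toCatalogerConfig` helper, and the stand-in `CatalogerConfig` below are assumptions for illustration, not the actual Syft types. The CLI struct uses `*bool` so that "not set" can be told apart from "set to false" when configuration levels are merged.

```go
package main

import "fmt"

// CatalogerConfig is a stand-in for the internal rust.CatalogerConfig.
// Only UseCratesEnrichment is taken from the PR; this is not the real struct.
type CatalogerConfig struct {
	UseCratesEnrichment bool
}

// rustConfig is a hypothetical CLI-facing options struct. It declares its own
// fields instead of aliasing the internal config, so the two can evolve
// independently. *bool keeps "unset" distinct from "false".
type rustConfig struct {
	UseCratesEnrichment *bool `yaml:"use-crates-enrichment" json:"use-crates-enrichment" mapstructure:"use-crates-enrichment"`
}

// toCatalogerConfig (hypothetical helper) overlays any explicitly set CLI
// options on top of the supplied defaults.
func (c rustConfig) toCatalogerConfig(defaults CatalogerConfig) CatalogerConfig {
	out := defaults
	if c.UseCratesEnrichment != nil {
		out.UseCratesEnrichment = *c.UseCratesEnrichment
	}
	return out
}

func main() {
	disabled := false
	defaults := CatalogerConfig{UseCratesEnrichment: true}

	// Unset option: the default survives.
	fmt.Println(rustConfig{}.toCatalogerConfig(defaults).UseCratesEnrichment)
	// Explicit false: the default is overridden.
	fmt.Println(rustConfig{UseCratesEnrichment: &disabled}.toCatalogerConfig(defaults).UseCratesEnrichment)
}
```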

// NewCargoLockCataloger returns a new Rust Cargo lock file cataloger object.
func NewCargoLockCataloger() pkg.Cataloger {
	return generic.NewCataloger("rust-cargo-lock-cataloger").
func NewCargoLockCataloger(opts CatalogerConfig) pkg.Cataloger {
it looks like this opts is not used?
type CatalogerConfig struct {
	InsecureSkipTLSVerify bool `yaml:"insecure-skip-tls-verify" json:"insecure-skip-tls-verify" mapstructure:"insecure-skip-tls-verify"`
	UseCratesEnrichment bool `json:"use-crates-enrichment" yaml:"use-crates-enrichment" mapstructure:"use-crates-enrichment"`
	Proxy string `yaml:"proxy,omitempty" json:"proxy,omitempty" mapstructure:"proxy"`
Is this equivalent to the Go http_proxy environment variable? I don't think we would need a special config for this, rather just advise users to use the environment variable such that it's used for all http calls instead of needing to configure each individually. If there's really some reason that we need configuration other than the environment variable, we should figure out how to set this globally for all http requests.
	"github.com/anchore/syft/internal/mimetype"
	"github.com/anchore/syft/syft/pkg"
	"github.com/anchore/syft/syft/pkg/cataloger/generic"
)

const cargoAuditBinaryCatalogerName = "cargo-auditable-binary-cataloger"
const (
	toolName = "syft" // used for the user-agent string.
We shouldn't add a string here for the user-agent. Ideally this would come from configuration passed from the API user. The app name is initially passed in here, this just needs to get passed through to the appropriate configuration. Maybe we should make this more convenient somehow. But we really don't want this hardcoded as "syft", since a number of apps use the Syft API and are not, in fact, Syft.
const cargoAuditBinaryCatalogerName = "cargo-auditable-binary-cataloger"
const (
	toolName = "syft" // used for the user-agent string.
	cargoAuditBinaryCatalogerName = "rust-cargo-auditable-binary-cataloger"
This looks like the right change, but I think this would technically be a breaking change if anyone was using the cargo-auditable-binary-cataloger string. This should probably be reverted to the previous value but add an issue to update it, so we can make sure to do so in Syft 2.0 or whenever appropriate.
func newCratesResolver(name string, opts CatalogerConfig) *rustCratesResolver {
	base, err := url.Parse(opts.CratesBaseURL)
	if err != nil {
		panic(err)
We should never panic but instead use error returns
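A minimal sketch of the error-returning variant, assuming the constructor only needs to parse the base URL. The struct fields here are stand-ins, and callers of the real constructor would need to handle the extra return value.

```go
package main

import (
	"fmt"
	"net/url"
)

// CatalogerConfig is a stand-in carrying only the field used in this sketch.
type CatalogerConfig struct {
	CratesBaseURL string
}

// rustCratesResolver is a stand-in; the real type in the PR has more state.
type rustCratesResolver struct {
	name string
	base *url.URL
}

// newCratesResolver returns an error instead of panicking when the configured
// base URL cannot be parsed.
func newCratesResolver(name string, opts CatalogerConfig) (*rustCratesResolver, error) {
	base, err := url.Parse(opts.CratesBaseURL)
	if err != nil {
		return nil, fmt.Errorf("unable to parse crates base URL %q: %w", opts.CratesBaseURL, err)
	}
	return &rustCratesResolver{name: name, base: base}, nil
}

func main() {
	// A URL with no scheme before the colon is rejected by url.Parse.
	if _, err := newCratesResolver("example", CatalogerConfig{CratesBaseURL: "://missing-scheme"}); err != nil {
		fmt.Println("constructor failed:", err)
	}
}
```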
if r := recover(); r != nil {
	fmt.Fprintf(os.Stderr, "recovered from panic while resolving license at: \n%s", string(debug.Stack()))
}
there shouldn't be a need for panic recovery here -- what is panicking?
// cratesRemoteMetadata represents the remote metadata for a crate
// as fetched from crates.io via an API request.
// This is used for deserialization of the response from crates.io
type cratesRemoteMetadata struct {
nit: this type should probably be defined above or below methods on rustCratesResolver rather than in the middle
switch c.opts.UseCratesEnrichment {
case true:
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(c.opts.CratesTimeout))
	defer cancel()
	cratesEnrichment, err := c.cratesResolver.ResolveCrate(ctx, dep.Name, dep.Version)
	if err != nil {
		log.Tracef("rust cataloger: failed to resolve crate %s/%s using crates.io: %v", dep.Name, dep.Version, err)
		// fallback to not using the crates enriched package information.
		p = newPackageFromAudit(&dep, location.WithAnnotation(pkg.EvidenceAnnotationKey, pkg.PrimaryEvidenceAnnotation))
		continue
	}
	p = newPackageWithEnrichment(&dep, cratesEnrichment, location.WithAnnotation(pkg.EvidenceAnnotationKey, pkg.PrimaryEvidenceAnnotation))
case false:
	p = newPackageFromAudit(&dep, location.WithAnnotation(pkg.EvidenceAnnotationKey, pkg.PrimaryEvidenceAnnotation))
}
nit: if/else for boolean
		p = newPackageFromAudit(&dep, location.WithAnnotation(pkg.EvidenceAnnotationKey, pkg.PrimaryEvidenceAnnotation))
		continue
	}
	p = newPackageWithEnrichment(&dep, cratesEnrichment, location.WithAnnotation(pkg.EvidenceAnnotationKey, pkg.PrimaryEvidenceAnnotation))
This is probably the thing that we need to discuss a bit as a team how best to handle: in the case enrichment is enabled, there is no pkg.RustBinaryAuditEntry metadata created, instead populating a richer, but different, metadata struct type. I've long been a proponent of allowing multiple metadata types, but we don't really have a standard way of doing this yet. I don't think we should have less data when enriching, but we would end up in a situation where something checking for the pkg.RustBinaryAuditEntry type would not find it in this case.
I've talked with @wagoodman about this, but I don't think we came to a concrete solution, although since metadata types are arbitrary we could easily add a []any or something similar, and maybe have a helper function to find and return metadata. I don't know if we need this yet, but it definitely looks like some of the fields are being read when outputting different formats from the new enriched data.
If it were me, and the restrictions we have today exist, I might consider adding a helper function in the syft/pkg package, something like:
func GetMetadata[T any](p *Package) *T {
	if t, ok := p.Metadata.(T); ok {
		return &t
	}
	if t, ok := p.Metadata.(*T); ok {
		return t
	}
	if metadatas, ok := p.Metadata.([]any); ok {
		for _, m := range metadatas {
			if t, ok := m.(T); ok {
				return &t
			}
			if t, ok := m.(*T); ok {
				return t
			}
		}
	}
	return nil
}
... or something of the sort, which would let us use it fairly simply where we need it, like:
if m := pkg.GetMetadata[pkg.RustBinaryAuditEntry](p); m != nil {
	// do something with the metadata
}
... and we then could set metadata to []any{ RustBinaryAuditEntry{...}, RustCargoMetadata{...} }. And, though it's not directly applicable here, if we migrated usage of the metadata types to this function instead of the direct type assertions we have, we could then also support merging packages more completely without losing certain metadata, etc.
Sorry for the long-winded comment here, just noting this for discussion along with some background.
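For anyone who wants to try the idea outside the codebase, here is a self-contained sketch. `Package`, `RustBinaryAuditEntry`, and `RustCargoMetadata` are stand-ins with made-up fields, and `asPtr` is a hypothetical helper that factors out the two repeated assertions; none of this is the real syft/pkg API.

```go
package main

import "fmt"

// Stand-in types with made-up fields, for illustration only.
type Package struct {
	Metadata any
}

type RustBinaryAuditEntry struct {
	Name string
}

type RustCargoMetadata struct {
	License string
}

// asPtr (hypothetical helper) returns a pointer to m's value if m holds a T
// or a *T, and nil otherwise.
func asPtr[T any](m any) *T {
	if t, ok := m.(T); ok {
		return &t
	}
	if t, ok := m.(*T); ok {
		return t
	}
	return nil
}

// GetMetadata looks for a T in the package metadata, whether the metadata is
// a single value, a pointer, or a []any holding several metadata values.
func GetMetadata[T any](p *Package) *T {
	if t := asPtr[T](p.Metadata); t != nil {
		return t
	}
	if metadatas, ok := p.Metadata.([]any); ok {
		for _, m := range metadatas {
			if t := asPtr[T](m); t != nil {
				return t
			}
		}
	}
	return nil
}

func main() {
	p := &Package{Metadata: []any{
		RustBinaryAuditEntry{Name: "serde"},
		&RustCargoMetadata{License: "MIT"},
	}}

	if m := GetMetadata[RustBinaryAuditEntry](p); m != nil {
		fmt.Println("audit entry:", m.Name)
	}
	if m := GetMetadata[RustCargoMetadata](p); m != nil {
		fmt.Println("license:", m.License)
	}
}
```

Note that a value stored directly (not as a pointer) is copied by the assertion, so mutations through the returned pointer would not reach the package in that case.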
Description
This pull request supports remotely enriching Rust auditable binaries using crates.io. It adds the license, supplier, originator, description, and other fields (optionally if enabled) to the manifest.
This information is unavailable in the cargo lock and binary; if approved, I will add this capability to the other rust cataloger.
Type of change
I've also updated the SPDX license list, as that was failing the make test, and updated the JSON schema version to support the new crates-enriched metadata. There are still some missing unit tests, specifically the mocks for the crates.io lookup and caching functionality. I wanted to submit a PR early, seek guidance, and ensure this would benefit the community before investing more time in standardising it across the Rust catalogers.
The new feature adds a rust key to the configuration that allows the feature to be turned on/off and some settings tuned for site-specific needs.
Checklist: