Description
In discussions within the ML work group around "model cards", we acknowledge that the in-progress CycloneDX schema for describing ML models has, at best, ad-hoc standards to draw common data from. Specifically, we have found a variety of structured and unstructured data specific to ML service providers (e.g., Google, AWS, IBM) and to ML catalogs such as HuggingFace and projects like TensorFlow. Our approach is therefore to adopt basic data objects and fields that capture the commonality we have found, while allowing providers to reference descriptors (e.g., schemas) for their specific model information (data inputs, outputs, statistical analysis, imaging, etc.) using the "externalReference" objects. For automation, such as validation or visualization, it would be helpful to know whether the referenced data has a published schema (e.g., in XSD or JSON Schema) that can be applied to the information/data pointed to by the "url" field.
As we support new XBOM types (e.g., Crypto CBOM, Machine Learning MLBOM), we will encounter more and more domain-specific data (hopefully with structured schemas). I was encouraged to bring forward my proposal to add a "schema" field to the "externalReference" object to make the reference more meaningful in those cases.
Applicability to existing reference types:
- General types
  - "bom"
    - example: "schema": "http://cyclonedx.org/schema/bom-1.5.schema.json"
  - "build-meta"
    - example: "schema": "https://maven.apache.org/maven-v4_0_0.xsd"
- Specific types
  - "service"
    - example: "schema": "https://github.com/OAI/OpenAPI-Specification/blob/main/schemas/v3.0/schema.json"
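To make the examples above concrete, here is a minimal sketch of how external reference entries carrying the proposed "schema" field might look, and how a tool could separate the references it can validate from those it cannot. The "url" values are hypothetical placeholders, and the helper name `split_by_schema` is an assumption made for illustration only.

```python
# Hypothetical external references. The "url" values are placeholders;
# the "schema" values are the examples listed above.
external_references = [
    {
        "type": "bom",
        "url": "https://example.com/sbom.json",
        "schema": "http://cyclonedx.org/schema/bom-1.5.schema.json",
    },
    {
        "type": "build-meta",
        "url": "https://example.com/pom.xml",
        "schema": "https://maven.apache.org/maven-v4_0_0.xsd",
    },
    {
        # No "schema": the tool has nothing to validate the target against.
        "type": "website",
        "url": "https://example.com",
    },
]


def split_by_schema(references):
    """Return (with_schema, without_schema) lists of references."""
    with_schema = [ref for ref in references if ref.get("schema")]
    without_schema = [ref for ref in references if not ref.get("schema")]
    return with_schema, without_schema


with_schema, without_schema = split_by_schema(external_references)
```

A validator or visualizer could then process only `with_schema` and report the remaining references as unverifiable.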
Hopefully these few examples show the value of adding the "schema" field, with an associated data type that reflects a URI (or its more generalized form, an IRI); suggestions for new "type" values for the enum field are also welcome. This might appear in the "externalReference" definition as follows:
"externalReference": {
  "type": "object",
  "title": "External Reference",
  "description": "Specifies an individual external reference",
  "required": [
    "url",
    "type"
  ],
  "additionalProperties": false,
  "properties": {
    "url": {
      "type": "string",
      "title": "URL",
      "description": "The URL to the external reference",
      "format": "iri-reference"
    },
    "schema": {
      "type": "string",
      "title": "Schema",
      "description": "Reference to a document that defines the schema (e.g., elements, attributes and data types) for the document referenced by the URL.",
      "format": "iri-reference",
      "examples": ["http://csrc.nist.gov/ns/oscal/1.0"]
    },
    "schemaType": {
      "type": "string",
      "title": "Schema Type",
      "description": "Reference to the versioned schema format specification",
      "format": "iri-reference",
      "examples": ["http://json-schema.org/draft-07/schema#"]
    },
    "comment": {
      ...
    },
    "type": {
      "type": "string",
      "title": "Type",
      "enum": [ "vcs", "issue-tracker", "website", ... // etc.
      ]
    },
    "hashes": {
      ...
    }
  }
},
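As a rough illustration of what the definition above would accept, the following sketch checks an external reference against the required properties and the closed property set ("additionalProperties": false). It is a hand-rolled structural check using only the standard library, not a full JSON Schema validation; the function name `check_external_reference` is hypothetical, and the "type" enum is not checked because its values are elided above.

```python
# Property names taken from the proposed definition above.
ALLOWED_PROPERTIES = {"url", "schema", "schemaType", "comment", "type", "hashes"}
REQUIRED_PROPERTIES = {"url", "type"}
STRING_PROPERTIES = ("url", "schema", "schemaType")


def check_external_reference(ref):
    """Return a list of human-readable problems; empty if none were found."""
    errors = []
    present = set(ref)
    for name in sorted(REQUIRED_PROPERTIES - present):
        errors.append(f"missing required property: {name}")
    for name in sorted(present - ALLOWED_PROPERTIES):
        errors.append(f"unexpected property: {name}")
    for name in STRING_PROPERTIES:
        if name in ref and not isinstance(ref[name], str):
            errors.append(f"{name} must be a string")
    return errors


# A reference using the proposed fields; the "url" value is a placeholder.
ok_errors = check_external_reference({
    "type": "bom",
    "url": "https://example.com/sbom.json",
    "schema": "http://cyclonedx.org/schema/bom-1.5.schema.json",
    "schemaType": "http://json-schema.org/draft-07/schema#",
})

# A reference that omits "type" and carries an unknown property.
bad_errors = check_external_reference({
    "url": "https://example.com/sbom.json",
    "schemaVersion": "1.5",
})
```

Because "schema" and "schemaType" are not listed under "required", existing documents that omit them would remain valid under this definition.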
Note: we may also want to add a complementary field to disambiguate which version (draft) of XML Schema or JSON Schema was used; this is shown above as "schemaType".
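One way a consumer might use "schemaType" is to pick a validator before fetching the schema itself. The sketch below assumes a small lookup table keyed by the "schemaType" value; the JSON Schema draft-07 identifier comes from the example above, while the XML Schema namespace entry, the labels, and the function name `schema_language` are assumptions, since the proposal does not yet say which identifiers would be used.

```python
# Lookup from "schemaType" identifiers to a validator family.
SCHEMA_LANGUAGES = {
    # Identifier taken from the "schemaType" example above.
    "http://json-schema.org/draft-07/schema#": "json-schema-draft-07",
    # Assumption: the W3C XML Schema namespace used as the XSD identifier.
    "http://www.w3.org/2001/XMLSchema": "xml-schema",
}


def schema_language(ref):
    """Return a validator family label, "unknown", or None if undeclared."""
    schema_type = ref.get("schemaType")
    if schema_type is None:
        return None
    return SCHEMA_LANGUAGES.get(schema_type, "unknown")


# Placeholder "url"; "schema" and "schemaType" are the examples used above.
declared = schema_language({
    "type": "bom",
    "url": "https://example.com/sbom.json",
    "schema": "http://cyclonedx.org/schema/bom-1.5.schema.json",
    "schemaType": "http://json-schema.org/draft-07/schema#",
})
undeclared = schema_language({"type": "website", "url": "https://example.com"})
```

Returning None for an undeclared "schemaType" keeps that case distinct from a declared but unrecognized one, so a tool can fall back to guessing from the schema's file extension only when nothing was declared.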