# Databases

Multiple sources of {term}`CVE` information can be registered and used by the sbom-cve-check tool.
There are two kinds of databases:

- The {term}`CVE database` provides general information about CVEs.
  However, some of this information may be missing:
  - A CVE identifier.
  - A description.
  - Vulnerable and non-vulnerable versions.
    These version ranges are provided, among other things, via a CPE match block.
  - {term}`CPE` provides vendor and product names, which make it possible to identify the component.
  - {term}`CVSS` (Common Vulnerability Scoring System) information, including the score and the vector string.
- The {term}`annotation database`, which provides a VEX {term}`assessment` for a specific CVE.
  This assessment is tied to one or more specific component versions: the analysis was performed for these versions.
  In addition to this assessment, the {term}`annotation` entry can provide the same kind of information contained in a CVE database.

## Common database options

All database options can be specified from the command line, using the `--add-db` or `--set-db-cfg` flags, or from one or more TOML configuration files as described in the [database configuration](#database-configuration) subsection.

### Database type

The following CVE and annotation databases are supported:

| Name                 | Short description                                            |
|----------------------|--------------------------------------------------------------|
| `openvex-file`       | Path to one OpenVEX annotation file                          |
| `openvex-dir`        | Directory containing OpenVEX annotation files                |
| `spdx3-file`         | Path to one SPDX v3.0 SBOM file                              |
| `yocto-vex-manifest` | Path to Yocto Project VEX manifest, generated by vex.bbclass |
| `simple-annotations` | Directory containing simple YAML annotation files            |
| `cve-db-nvd-fkie`    | Path to FKIE nvd-json-data-feeds git repository              |
| `cve-db-cvelist`     | Path to CVEProject cvelistV5 git repository                  |

The database type can be specified in the following ways:

- In a configuration file or via the `--set-db-cfg` flag, using the `type=` option.
- From the command line using the `--add-db` flag: the first value is the database type name.

### Database priority

Each database has a priority, which is an integer: the higher the value, the higher the priority.

When checking for CVEs for a specific component (identified, for example, by a CPE), the list of applicable CVEs is obtained first.
Then, for each applicable CVE, each database is queried from the highest to the lowest priority, and an assessment is retrieved or computed.
The highest valid assessment that was obtained is used in the export file.

An annotation database must have a unique priority: no other database can use the same priority.
CVE databases, however, can share the same priority, which is the default for the NVD and the CVE List databases.

If the priority is not explicitly set using the `priority=` option, a default value is automatically assigned based on the following rules:

- CVE databases (CVE List and NVD): Priority is set to 50.
- Other databases (primarily annotations):
  - If the information is provided via the command line (e.g., from the SBOM or using `--yocto-vex-manifest`), the priority is set to 100 plus the declaration order of the database.
  - If the information is provided via a configuration file or the command line using the `--add-db` or `--set-db-cfg` flag, the priority is set to 200 plus the declaration order of the database.
    - Priority order for declarations:
      1. Databases declared using the `--add-db` flag.
      2. Databases declared in configuration files.
      3. Databases declared using the `--set-db-cfg` flag.

In summary, with the default rules, the CVE databases have the lowest priority, the databases added from the command line the middle priority, and the databases added from the configuration file the highest priority.
Databases declared at the beginning (of the configuration file, or of the command line) have a lower priority than databases declared at the end.

### Database path

Each database must have a path, which can be specified as follows:

- In a configuration file or via the `--set-db-cfg` flag, using the `path=` option.
  If a relative path is used, it is interpreted by default relative to the directory containing the configuration file (as described in the [configuration](config.md) section).
- From the command line using the `--add-db` flag: the second value is the database path.
  If a relative path is used, it is interpreted relative to the current working directory.

This path can point to a file or a directory depending on the database type.

The `path=` option specified in a configuration file can contain environment variables: `${name}` is replaced by the value of the environment variable "*name*".
Also, if the path starts with an initial component of `~`, it is replaced by that user's home directory.

If the command line flag `--databases-dir` is used, it sets or overrides the `SBOM_CVE_CHECK_DATABASES_DIR` environment variable with the specified flag value.
If the environment variable `SBOM_CVE_CHECK_DATABASES_DIR` is not set, it defaults to the following value: `~/.cache/sbom_cve_check/databases`.
This variable is used in the [default configuration](#default-configuration-file) file, `db_default.toml`, to specify the location of the NVD and CVE List databases.

### Database name

Each database has a name that describes it. The name may be used in various messages.
The name is not required to be unique, but uniqueness is recommended.

It can be specified:

- From a configuration file using the `name=` option.
  If not present, the configuration [section name](#example-of-configuration-file) is used instead.
- From the command line using the `--add-db` flag, by specifying the `name=` option.
  If not present, the filename specified in the [database path](#database-path) is used.

### Obsolete assessment check

Annotation databases have an additional configuration option, named `obsolete_assessment_check=`.
This option takes a [boolean](#boolean-value).
If this option is enabled for a particular annotation database, the tool checks whether the assessments provided by this database are obsolete.
An assessment is considered obsolete if a CVE or annotation database with a lower priority provides the same kind of CVE assessment.
The list of obsolete assessments can be exported depending on the selected [exporter](export.md).

By default, the search for obsolete assessments is disabled for all annotation databases.
It can be enabled or disabled for a specific database using the option mentioned above.
To change the default and search for obsolete assessments on all annotation databases, set the following flag: `--check-obsolete-assessment-by-default`.

The search for obsolete assessments is not enabled by default because it can make the tool significantly slower in some cases.
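
The obsolescence rule above can be pictured with a small Python sketch. It is an illustration only and not the tool's implementation: the `Assessment` class, the `find_obsolete` function, the `status` labels, and the reading of "same kind of assessment" as "equal status" are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    database: str  # name of the database that produced the assessment
    priority: int  # priority of that database
    status: str    # hypothetical status label, e.g. "not_affected"


def find_obsolete(assessments, checked_databases):
    """Return assessments from checked databases that a lower-priority database repeats."""
    obsolete = []
    for candidate in assessments:
        if candidate.database not in checked_databases:
            continue  # the check is not enabled for this database
        # Assumption: "same kind of assessment" means an identical status label.
        if any(
            other.priority < candidate.priority and other.status == candidate.status
            for other in assessments
        ):
            obsolete.append(candidate)
    return obsolete


# Assessments collected for one CVE and one component (illustrative values).
collected = [
    Assessment("my-annotations", 200, "not_affected"),
    Assessment("nvd-fkie", 50, "not_affected"),
]
obsolete = find_obsolete(collected, checked_databases={"my-annotations"})
print([a.database for a in obsolete])
```

In this sketch the annotation from `my-annotations` is reported because the lower-priority CVE database already reaches the same conclusion, so the manual annotation no longer adds information.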

### Disabling a Database

To disable a database, add the following configuration option:

```toml
disabled = true
```

## Git database options

Some database types can be fetched from a git repository.
These databases take the following configuration options:

- `git_url=`: Specifies or overrides the default fetch URL of the git repository.
  The repository is cloned into the directory specified by the `path=` database option.
  This directory itself becomes the git repository; the clone is not placed inside a subdirectory.
- `auto_update_max_age=`: Configures the maximum allowed duration since the last "update".
  For details on what constitutes the last "update", refer to the `max_age_since_last_commit=` option.
  If this duration is exceeded, the tool automatically updates the Git repository to the latest commit of the selected branch.
  If not specified, the default value is 20 hours. Otherwise, it takes either:
  - A duration in seconds (or in other [time units](#duration-units)).
  - The special string `off`. If set to `off`, the Git database will **never** be updated.
  - The value `0`. In this case, a `git pull` will **always** be performed.

  To **globally** disable automatic updates, use the `--disable-auto-updates` flag.
  This is equivalent to setting `auto_update_max_age=off` for all Git databases.
- `max_age_since_last_commit=`: A boolean that configures what is considered the last "update":
  - If set to True, the duration is measured since the last commit.
    The committer date time of the HEAD commit is used.
  - If set to False, the duration is measured since the last command that moved the HEAD in a drastic way.
    The modification date time of `ORIG_HEAD` is checked.
    This should reflect the last git fetch/pull.

  If not specified, it defaults to True for a CVE database and to False for an annotation database.
- `cache_index_path=`: Path to the cache file containing the database index.
  By default, the cache file is stored at the root of the git repository under the following file name: `.sbom-cve-check-cache-index.json`.
  The cache can be disabled by setting this option to a null value or an empty string.
- `git_branch=`: Clone from a specific branch name. If not set, the default remote branch is used.
- `git_ref=`: Check out a specific commit identifier or tag.
  If this option is not set, the latest (HEAD) commit from the remote is checked out.

  If the git branch is not specified when cloning, the git reference is considered to be a tag, and the tag is cloned directly.
  Combined with the `git_fetch_depth=` option, this allows cloning only the tag and not the complete git history.

  After an initial clone, if the git reference is changed in the configuration file, the git branch should be specified.
  The complete git history will then be downloaded.
  If this is not wanted, the git repository can be removed so that a new clone is performed.
- `git_fetch_depth=`: Configures the git clone/fetch depth.
  By default, if this option is not set:
  - When cloning, if no git reference is specified, a shallow clone is performed with a depth of 1.
    Otherwise, a normal clone is performed.
  - When fetching to update the git repository, the number of commits to fetch is not limited.

:::{warning}
Be aware that any commit not pushed to remote, or any local modifications will be discarded.
:::

Before each CVE analysis, the git repository is reset.
If you do not want this and prefer to manage the Git repository on your own, do not set `git_url=` or set it to an empty string.
Otherwise, your local modifications will be overwritten and lost.

## Database types

Each supported [database type](#database-type) may have additional configuration options.

### openvex-file

This annotation database provides a single {term}`OpenVEX` JSON file.
The [database path](#database-path) needs to point to the JSON file.

### openvex-dir

This annotation database is a directory containing one or multiple OpenVEX JSON files.
The [database path](#database-path) needs to point to this directory.

This database type requires an additional configuration option: `globs=`.
This is a list of glob patterns; only matching files are provided.
If it is specified from the command line, the value passed to the `globs=` option needs to be a comma-separated list.

For example, the following patterns could be used:

- `*/CVE-*.json`: matches any directory (only one level), then any file starting with `CVE-` and ending with `.json`.
- `**/*.json`: matches any number of directory segments, then any file ending with `.json`.

For more information about the glob pattern language, see the [pathlib](https://docs.python.org/3/library/pathlib.html#pattern-language) documentation.

This database can optionally be fetched from a git repository.
To enable that, the `git_url` option must be specified.
Additional associated options can be configured as described in the [git database](#git-database-options) section.

### spdx3-file

This annotation database provides a single {term}`SPDX` v3.0 JSON file.

While it is possible to manually add an SPDX v3.0 database, it is expected, and much more common, for this database to be added automatically via the input [SBOM](sbom.md) file.

### yocto-vex-manifest

This annotation database provides a single Yocto Project {term}`VEX` manifest file generated from the Yocto Project `vex.bbclass`.
This file contains the following information for each Yocto Project recipe (non-exhaustive list):

- package version
- list of CPE
- list of associated Yocto Project annotations specified by `CVE_STATUS`

By default, in the Yocto Project, the file is provided in the `deploy` directory with the following name: `${IMAGE_BASENAME}-${MACHINE}.rootfs.json`

For convenience, this database file can also be directly added with the `--yocto-vex-manifest` flag followed by the path to the Yocto Project VEX manifest file.
In that case, however, no additional options can be specified: default options are used.

### simple-annotations

This annotation database is a directory containing one or multiple YAML files.
The [database path](#database-path) needs to point to this directory.

This database type requires an additional configuration option: `globs=`.
This is a list of glob patterns. Each pattern can be:

- A path to a directory.
  In this case, the directory is searched for files named with the following format: `CVE-2025-1234.yaml`.
- A pattern matching a CVE YAML file.

If it is specified from the command line, the value passed to the `globs=` option needs to be a comma-separated list.

The YAML file name must be the CVE identifier.
Any extension can be used, but it is recommended to use either `.yaml` or `.yml`.
If a custom file extension is used, you must specify a pattern that matches the filename.

This database type has an additional optional option: `arch=`.
This option configures the architecture of the image that is going to be analyzed, for example `arm`, `arm64` or `x86-64`.
It is used to filter annotation YAML files that have the `arch-only` key as described below.

The example below is one annotation entry for the CVE id `CVE-2025-1234`, which must be named `CVE-2025-1234.yaml`:

```yaml
vulnerable: 'no'
last-review: '2025-01-01'
cve-product: curl:libcurl
versions: [7.12.3]
comment: 'Not applicable because ...'
```

The YAML file must contain the following keys:

- `vulnerable`: Can be either a boolean or a string.
  In case of a string, if the value is equal to `no` then the CVE is considered not vulnerable; any other value is considered vulnerable.
- `last-review`: A date time encoded in ISO format.
- `cve-product`: Either the vendor and the product name separated by a colon, or just the product name.
- `versions`: The list of package versions this annotation applies to.
  If the package version in the SBOM does not strictly match, this annotation is ignored.
- `comment`: The annotation statement.

The YAML file can contain additional keys, including the following optional keys:

- `arch-only`: A list of architectures, for example `[arm, arm64]`.
  If this key is missing, this is the same as specifying `[all]`.
  If `all` is specified inside the `arch-only` list, no filtering occurs.

Below is an example of a TOML configuration file for this simple annotation database.

```toml
[databases.my-annotations]
type = "simple-annotations"
git_url = "git@example.com:my-project/my-annotations.git"
path = "${SBOM_CVE_CHECK_DATABASES_DIR}/my-annotations"
auto_update_max_age = "10min"
max_age_since_last_commit = true
globs = ["common/yocto", "boards/my-board/yocto"]
```

This database can optionally be fetched from a git repository.
To enable that, the `git_url` option must be specified.
Additional associated options can be configured as described in the [git database](#git-database-options) section.

### cve-db-nvd-fkie

This is a CVE database.
It is the community reconstruction of the JSON {term}`NVD` Data Feeds, managed by {term}`FKIE` and stored in a git repository.

The [database path](#database-path) points to the git directory.
The path will be created if it does not exist.

This database also takes the optional configuration options described in the [git database](#git-database-options) section.

### cve-db-cvelist

This is a CVE database.
This CVE List database contains the catalog of all [CVE Records](https://www.cve.org/ResourcesSupport/Glossary?activeTerm=glossaryRecord) identified by, or reported to, the [CVE Program](https://www.cve.org/).

The [database path](#database-path) points to the git directory.
The path will be created if it does not exist.

This database also takes the optional configuration options described in the [git database](#git-database-options) section.

## Database Configuration

Databases can be added in multiple ways:

- From the command line, using the `--add-db` or `--set-db-cfg` flags (as many times as needed).
- From configuration files, specified via the `--config` flag (multiple configuration files can be loaded by repeating the `--config` flag).

**Database Configuration Merge:**
If multiple configurations are specified (from files or the command line), they are **merged**:

- A database configured in one file (e.g., `[databases.my-annotations]`) can be extended or modified in another file using the same identifier.
- The second configuration can add new options or override existing ones, but it cannot delete previously declared options.

**TOML Configuration File:**
The configuration file uses the {term}`TOML` format.
Each database is defined as a table within the `databases` table.
An example can be seen in the [subsection](#example-of-configuration-file) below.

**Command Line: `--add-db` Flag:**
To add a database directly from the command line, the `--add-db` flag can be used with the following syntax:

```sh
--add-db database-type database-path key1=val1 key2=val2
```

- `database-type`: The [type of the database](#database-type).
- `database-path`: The [path to the database](#database-path).
- `key1=val1`: Additional options (e.g., `name="My Database"`).

See the [subsection](#example-of-command-line) below for an example.

**Command Line: `--set-db-cfg` Flag:**
To add or override a database configuration from the command line, use the `--set-db-cfg` flag with the following syntax:

```sh
--set-db-cfg database-id key1=val1 key2=val2
```

- `database-id`: The unique identifier of the database (which is the "databases" section name in the TOML file).
- `key1=val1`: Additional options (e.g., `name="My Database"`).

**Database Types:**
For a list of supported database types, see the [Database Type](#database-type) subsection.
Detailed documentation for each type is available in the [Database Types](#database-types) section.

### Duration units

A duration value in the configuration can have a suffix to indicate the time unit.
The following suffixes are supported:

| Supported suffixes              | Unit                         |
|---------------------------------|------------------------------|
| `msec`, `ms`                    | milliseconds (0.001 seconds) |
| `seconds`, `second`, `sec`, `s` | seconds (1 second)           |
| `minutes`, `minute`, `min`, `m` | minutes (60 seconds)         |
| `hours`, `hour`, `hr`, `h`      | hours (3600 seconds)         |
| `days`, `day`, `d`              | days (86400 seconds)         |
| `weeks`, `week`, `w`            | weeks (604800 seconds)       |
| `months`, `month`, `M`          | months (2629800 seconds)     |

The duration can be specified as a decimal value followed by a unit, with an optional space between the value and the unit.
Multiple units can also be combined, such as `1h 15min` or `1h15min`.
The space between values is optional.

The duration can also be specified in the TOML file as an integer or a decimal value.
In this case, the unit is seconds.

### Boolean value

For an option expecting a boolean value, the following case-insensitive values can be specified:

| Possible boolean values   | Result |
|---------------------------|--------|
| `1`, `yes`, `true`, `on`  | True   |
| `0`, `no`, `false`, `off` | False  |

In the TOML file, it is recommended to specify the value as a normal boolean.

### Default configuration file

By default, the tool loads the internal `db_default.toml` configuration file, which is stored within this Python package and contains the configuration for the default databases.
To prevent loading this default configuration file, the following flag can be specified: `--ignore-default-config`.

This default configuration file adds the following databases with their respective options:

- NVD, downloaded from the FKIE-CAD git repository:
  - `type` is `cve-db-nvd-fkie`
  - `name` is `nvd-fkie`
  - `git_url` is `https://github.com/fkie-cad/nvd-json-data-feeds.git`
- CVE List, downloaded from the CVEProject git repository:
  - `type` is `cve-db-cvelist`
  - `name` is `cvelist`
  - `git_url` is `https://github.com/CVEProject/cvelistV5.git`

Both CVE databases share the following options (which are default values):

- `priority` equals 50, as explained in [database priority](#database-priority).
- `auto_update_max_age` is set to `20h`, and `max_age_since_last_commit` is set to true, as explained in [git database options](#git-database-options).

### Extending the default configuration

To customize or override the default database configuration, loaded from the internal `db_default.toml` file, a TOML configuration file can be created.

Below is an example of a TOML configuration file that you can create and pass to the tool using the `--config` flag:

```toml
[databases.nvd-fkie]
git_url = "https://github.com/fkie-cad/nvd-json-data-feeds.git"
auto_update_max_age = "1h"
cache_index_path = "${SBOM_CVE_CHECK_DATABASES_DIR}/index-nvd-fkie.json"

[databases.cvelist]
auto_update_max_age = "50min"
max_age_since_last_commit = false
```

The key is to define your new options under either the `[databases.nvd-fkie]` or `[databases.cvelist]` section.
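
The effect of such an override can be modelled as a per-database dictionary update, following the merge rules stated in the Database Configuration section (options are added or overridden, never deleted). The Python sketch below is only a mental model under that assumption: `merge_database_configs` is a hypothetical helper, not a function of the tool, and the dictionaries hold just a subset of the options listed above.

```python
from copy import deepcopy


def merge_database_configs(*configs):
    """Merge per-database option tables; later configurations win on conflicts."""
    merged = {}
    for config in configs:
        for db_id, options in config.items():
            # Options are added or overridden, never removed.
            merged.setdefault(db_id, {}).update(deepcopy(options))
    return merged


# Subset of the built-in defaults described in this section.
default_config = {
    "nvd-fkie": {
        "type": "cve-db-nvd-fkie",
        "priority": 50,
        "auto_update_max_age": "20h",
    },
}
# Options a user file might place under [databases.nvd-fkie].
user_config = {
    "nvd-fkie": {"auto_update_max_age": "1h"},
}

merged = merge_database_configs(default_config, user_config)
print(merged["nvd-fkie"])
```

Only `auto_update_max_age` changes in the merged result; `type` and `priority` keep the values from the first configuration because the second one does not mention them.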

### Example of configuration file

Below is an example of a TOML configuration file that adds a directory containing multiple JSON OpenVEX files:

```toml
[databases.Annotations]
name = "My custom annotations"
type = "openvex-dir"
path = "../databases/annotations"
globs = ["abc/*/*.json", "recursive/**/*.json"]
obsolete_assessment_check = true
```

### Example of command-line

To add the same database as described in the subsection above, but this time from the command line, the following arguments need to be specified:

```sh
--add-db openvex-dir ../databases/annotations \
    "name=My custom annotations" \
    "globs=abc/*/*.json, recursive/**/*.json" \
    "obsolete_assessment_check=True"
```
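
To make the correspondence between the two examples concrete, the Python sketch below turns the `--add-db` values into the same option table as the TOML file. It is a rough illustration and not the tool's actual parser: `parse_add_db` and `parse_bool` are hypothetical names, and stripping whitespace around comma-separated glob patterns is an assumption based on the example above.

```python
TRUE_VALUES = {"1", "yes", "true", "on"}
FALSE_VALUES = {"0", "no", "false", "off"}


def parse_bool(text):
    """Interpret a case-insensitive boolean string as listed in the boolean value table."""
    value = text.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean value: {text!r}")


def parse_add_db(values):
    """Convert the values following --add-db into an option dictionary."""
    db_type, db_path, *pairs = values
    config = {"type": db_type, "path": db_path}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        if key == "globs":
            # Comma-separated list; surrounding whitespace is dropped (assumption).
            config[key] = [p.strip() for p in value.split(",") if p.strip()]
        elif key == "obsolete_assessment_check":
            config[key] = parse_bool(value)
        else:
            config[key] = value
    return config


config = parse_add_db([
    "openvex-dir",
    "../databases/annotations",
    "name=My custom annotations",
    "globs=abc/*/*.json, recursive/**/*.json",
    "obsolete_assessment_check=True",
])
print(config)
```

The resulting dictionary carries the same `type`, `path`, `name`, `globs` and `obsolete_assessment_check` values as the `[databases.Annotations]` table in the configuration file example.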