Skip to main content
The safedep query command runs SQL-like queries against SafeDep Cloud’s analytics surface: the packages, projects, endpoints, events, and security findings collected across your tenant, enriched with global threat intelligence (vulnerabilities, EPSS, CISA KEV, OpenSSF Scorecard, and more).

Prerequisites

Install the CLI and sign in to SafeDep Cloud:
safedep auth login
safedep auth status   # confirm tenant and OAuth token
SafeDep scopes every query to your tenant. You never write a tenant filter yourself.

Your first query

Start by seeing what you can query. List the tables, then inspect one to view its columns, capability flags, and join edges:
safedep query schema list             # all tables + one-line descriptions
safedep query schema show projects    # one table: columns, flags, joins
See Discovering the schema for the full set of schema commands. Now run a query. It is mostly SQL. Select some columns, filter with WHERE, sort with ORDER BY:
safedep query exec --sql "
  SELECT projects.name
  FROM projects
  WHERE projects.origin_source = 'SOURCE_GITHUB'
  ORDER BY projects.name"
projects.name
safedep/vet
safedep/pmg
safedep/control-tower
...
Two things make this different from a generic database:
  1. Write every column as table.column using the real table name. No table aliases, no bare column names. Result aliases via AS work (COUNT(...) AS n, then ORDER BY n).
  2. Every query must filter on an indexed column or a bounded time range. Here projects.origin_source is indexed, so the query is accepted. This rule keeps queries cheap; the schema marks which columns are indexed.
safedep query exec reads the statement from --sql, --sql-file, or stdin:
safedep query exec --sql "SELECT projects.name FROM projects WHERE projects.origin_source = 'SOURCE_GITHUB'"
safedep query exec --sql-file ./query.sql
echo "SELECT packages.name FROM packages WHERE packages.ecosystem = 'ECOSYSTEM_NPM'" | safedep query exec

How the data is organized

Tables come in two kinds:
  • Tenant tables hold your data: projects, project_versions, boms, packages, endpoints, the component_* finding tables, and the event tables (inventory_events, package_guard_events).
  • Join-only enrichment tables hold global reference data shared across all tenants: vulnerabilities, epss, kev, licenses, malware_analysis, open_source_packages, open_source_projects, scorecards, scorecard_checks, terraform_providers.
The most important rule: every query must reference at least one tenant table. You reach an enrichment table only by joining out from a tenant table. A typical query walks from a first-party project, through its bill of materials, to a package, to a finding, and out to global enrichment:
projects -> project_versions -> boms -> packages -> component_vulnerabilities -> vulnerabilities -> epss
 (tenant)      (tenant)       (tenant)  (tenant)        (tenant finding)         (enrichment)   (enrichment)

What SQL we support

FeatureNotes
SELECTColumns must be qualified table.column.
WHERE=, <, <=, >, >=, AND, OR. Enums filtered by name.
LIKEPrefix only: 'name%'. Leading wildcards ('%name') are rejected.
JOINNeeds an ON clause, but its contents are ignored: the catalog applies a predefined join. Only join tables connected by an edge.
GROUP BYOnly columns flagged groupable.
HAVINGFilter on aggregates after grouping.
ORDER BYIncluding by an aggregate alias.
Aggregatescount, sum, avg, min, max.
Not supported: subqueries, CTEs (WITH), UNION and set operations, window functions, casts, arbitrary functions, and multiple statements.

Discovering the schema

You do not need to memorize tables or columns. The schema is self-describing, and this is the entry point for both humans and AI agents.
safedep query schema list             # table names + one-line descriptions
safedep query schema show packages    # one table: columns, flags, joins
safedep query schema get              # everything: all tables, edges, rules
safedep query schema get -o json      # same, machine-readable
Each column carries capability flags that tell you what it can do:
FlagMeaning
selectableMay appear in SELECT.
filterableMay appear in WHERE.
groupableMay appear in GROUP BY.
aggregatableMay be wrapped in sum/avg/min/max.
indexedSatisfies the “must filter on an indexed column” rule.
schema get also prints the join edges (which tables connect, and their cardinality) and the usage rules with example queries, so one call gives you everything needed to write a valid query.

Letting an AI agent write the queries

Point your AI coding agent at the safedep CLI. The schema is self-describing and server errors come back verbatim, so the agent can discover tables, write a query, and self-correct on its own. Prompt:
You can query SafeDep Cloud data with the `safedep` CLI (a constrained SQL
dialect). Run `safedep query schema get -o json` to discover the tables,
columns, rules, and join edges. Write a query, run it with
`safedep query exec -o json --sql "<statement>"`, and self-correct from the
verbatim server errors. Rules and worked examples:
https://docs.safedep.io/reference/sql-query

Examples

Anchor on the indexed ecosystem column (an OR across all severities would not satisfy the index rule):
safedep query exec --sql "
  SELECT packages.ecosystem, vulnerabilities.severity_rating,
         COUNT(DISTINCT vulnerabilities.vuln_id) AS num_vulns
  FROM packages
  JOIN component_vulnerabilities ON component_vulnerabilities.vulnerability_id = packages.id
  JOIN vulnerabilities ON vulnerabilities.vuln_id = component_vulnerabilities.vulnerability_id
  WHERE packages.ecosystem = 'ECOSYSTEM_NPM'
  GROUP BY packages.ecosystem, vulnerabilities.severity_rating
  ORDER BY num_vulns DESC"
safedep query exec --sql "
  SELECT packages.name, COUNT(DISTINCT vulnerabilities.vuln_id) AS critical_vulns
  FROM packages
  JOIN component_vulnerabilities ON component_vulnerabilities.vulnerability_id = packages.id
  JOIN vulnerabilities ON vulnerabilities.vuln_id = component_vulnerabilities.vulnerability_id
  WHERE vulnerabilities.severity_rating = 'CRITICAL'
  GROUP BY packages.name
  ORDER BY critical_vulns DESC" --limit 15
EPSS is join-only enrichment, so anchor in the tenant packages table and join out. Sorting by EPSS surfaces the CVEs most likely to be exploited:
safedep query exec -o json --sql "
  SELECT packages.name, vulnerabilities.cve_id, epss.epss, epss.percentile
  FROM packages
  JOIN component_vulnerabilities ON component_vulnerabilities.vulnerability_id = packages.id
  JOIN vulnerabilities ON vulnerabilities.vuln_id = component_vulnerabilities.vulnerability_id
  JOIN epss ON epss.cve = vulnerabilities.cve_id
  WHERE vulnerabilities.severity_rating = 'CRITICAL'
  ORDER BY epss.epss DESC"
packages.name                         cve_id            epss     percentile
org.springframework:spring-webmvc     CVE-2022-22965    0.944    0.99985
org.apache.logging.log4j:log4j-core   CVE-2021-44228    0.944    0.99964
next                                  CVE-2025-29927    0.921    0.99724
kev is the CISA Known Exploited Vulnerabilities catalog. Use a bounded range on the indexed kev.date_added to anchor the query:
safedep query exec --sql "
  SELECT packages.name, packages.ecosystem, vulnerabilities.cve_id, vulnerabilities.severity_rating
  FROM packages
  JOIN component_vulnerabilities ON component_vulnerabilities.vulnerability_id = packages.id
  JOIN vulnerabilities ON vulnerabilities.vuln_id = component_vulnerabilities.vulnerability_id
  JOIN kev ON kev.cve_id = vulnerabilities.cve_id
  WHERE kev.date_added >= '2018-01-01T00:00:00Z'
  ORDER BY packages.name"
is_malware is not indexed, so anchor on the indexed detected_at timestamp:
safedep query exec --sql "
  SELECT packages.name, packages.ecosystem, packages.version,
         component_malicious_packages.is_malware,
         component_malicious_packages.is_verified
  FROM packages
  JOIN component_malicious_packages ON component_malicious_packages.analysis_id = packages.id
  WHERE component_malicious_packages.detected_at >= '2020-01-01T00:00:00Z'
  ORDER BY packages.name"
Walk from packages through component_licenses to the global licenses catalog for SPDX metadata. Anchor on the indexed packages.ecosystem:
safedep query exec --sql "
  SELECT packages.name, component_licenses.license_code, licenses.spdx_license_id
  FROM packages
  JOIN component_licenses ON component_licenses.license_code = packages.id
  JOIN licenses ON licenses.license_code = component_licenses.license_code
  WHERE packages.ecosystem = 'ECOSYSTEM_NPM'
  ORDER BY packages.name"
endpoints are the machines and CI runners reporting to your tenant. The timestamp columns are indexed, so a bounded range anchors the query:
safedep query exec --sql "
  SELECT endpoints.identifier, endpoints.endpoint_type, endpoints.trust_level, endpoints.last_sync_at
  FROM endpoints
  WHERE endpoints.last_sync_at >= '2025-01-01T00:00:00Z'
  ORDER BY endpoints.last_sync_at DESC"
endpoints.identifier      endpoint_type   trust_level   last_sync_at
Vignesh                   0               1             2026-06-05T07:23:03Z
CICD_GHA_SAFEDEP_CLI      0               1             2026-06-05T06:11:20Z
macbookpro.lan            0               1             2026-06-04T08:54:03Z
The endpoint_type and trust_level enums are not groupable, so use WHERE filters and listings rather than GROUP BY for endpoints. Filter them by name (endpoint_type = 'CI_RUNNER').
For “which endpoints…” questions you want a de-duplicated list, not one row per event. endpoints.identifier is not groupable, so use SELECT DISTINCT:
safedep query exec --sql "
  SELECT DISTINCT endpoints.identifier
  FROM package_guard_events
  JOIN endpoints ON endpoints.id = package_guard_events.invocation_id
  WHERE package_guard_events.package_action = 'PMG_PACKAGE_ACTION_BLOCKED'"
endpoints.identifier
macbookpro.lan
pop-os
Sahils-MacBook-Pro.local
Vignesh
Join Package Guard events to the endpoint that produced them, newest first, for a full audit trail:
safedep query exec --sql "
  SELECT endpoints.identifier,
         package_guard_events.package_name,
         package_guard_events.package_ecosystem,
         package_guard_events.tool_name,
         package_guard_events.timestamp
  FROM package_guard_events
  JOIN endpoints ON endpoints.id = package_guard_events.invocation_id
  WHERE package_guard_events.package_action = 'PMG_PACKAGE_ACTION_BLOCKED'
  ORDER BY package_guard_events.timestamp DESC" --limit 50
endpoints.identifier        package_name                     package_ecosystem   tool_name   timestamp
pop-os                      speed5                           2                   pmg         2026-05-27T10:30:34Z
pop-os                      lab-helper                       2                   pmg         2026-05-27T10:28:14Z
Sahils-MacBook-Pro.local    martinez-polygon-clipping-tony   2                   pmg         2026-05-07T10:20:55Z
Sahils-MacBook-Pro.local    prettier                         2                   pmg         2026-05-07T07:40:20Z
Sahils-MacBook-Pro.local    fast-check                       2                   pmg         2026-05-07T07:39:50Z
The package_ecosystem enum renders as its stored number in table output (2 = ECOSYSTEM_NPM, 3 = ECOSYSTEM_PYPI); use -o json for the enum name.
inventory_events captures items discovered by vet on each endpoint. Filter on the indexed item_kind to find AI-related items:
safedep query exec --sql "
  SELECT inventory_events.item_identity, inventory_events.app, endpoints.identifier
  FROM inventory_events
  JOIN endpoints ON endpoints.id = inventory_events.app
  WHERE inventory_events.item_kind = 'INVENTORY_ITEM_KIND_MCP_SERVER'
  ORDER BY inventory_events.item_identity" --limit 50

Output modes

Select with -o / --output. When omitted, the CLI auto-detects the format: a rendered table for an interactive terminal, plain text when piped.
ModeUse it for
tableHuman reading. Adds a footer: <rows> | ~<cost> cost | <ms>ms, plus a next page hint when more rows exist.
plainPipelines. Tab-separated header and rows, no footer.
jsonScripts and agents. Typed columns, planner stats, and next_page_token.
JSON shape:
{
  "columns": [ { "name": "packages.name", "type": "STRING" } ],
  "rows": [ { "packages.name": "django" } ],
  "count": 1,
  "next_page_token": "",
  "stats": { "estimated_cost": 1243.7, "estimated_rows": 1, "elapsed_ms": 9 }
}

Pagination

Pagination is caller-driven; the CLI does not auto-iterate. Fetch the first page, then re-run the same query with the returned token:
TOK=$(safedep query exec -o json --sql "SELECT packages.name FROM packages WHERE packages.ecosystem = 'ECOSYSTEM_NPM'" | jq -r .next_page_token)
safedep query exec --sql "SELECT packages.name FROM packages WHERE packages.ecosystem = 'ECOSYSTEM_NPM'" --page-token "$TOK"

Troubleshooting

ErrorCauseFix
queries must reference at least one tenant tableQuery starts from a join-only enrichment table.Anchor FROM on a tenant table (for example packages) and join out.
query must filter on an indexed column or a bounded time rangeNo effective indexed filter, or an OR across all enum values.Add an equality filter on an indexed column, or a bounded timestamp range.
column "X" is not groupableGROUP BY on a non-groupable column.Group only by groupable columns; use SELECT DISTINCT for “which” questions.
column "X" is not aggregatablesum/avg/min/max on a column without the flag.Use count(column), or aggregate only aggregatable columns.
LIKE pattern rejectedLeading wildcard ('%name').Use a prefix pattern ('name%').
Duplicate rowsJoin fan-out from many_to_one edges.Use COUNT(DISTINCT ...) or de-duplicate client-side.
Unknown table or columnName not in the schema.Re-run safedep query schema get and use exact names.

SafeDep Cloud Overview

What SafeDep Cloud collects to build the data you query.

Authentication

Sign in with safedep auth login before running queries.

vet Filtering

Filter findings locally with vet’s CEL query language.

API & Automation

Programmatic access to SafeDep’s APIs.