Mapping CMDB and Asset Sources Into One Inventory

By PMAP Security Team

Every security program eventually hits the same wall. The CMDB says one thing, the vulnerability scanner says another, and the network-discovery sweep finds hosts that neither system knew existed. Each source is internally consistent. Together they describe an environment that does not quite exist. When you cannot agree on what an asset is, you cannot agree on whether it is at risk.

This is the problem PMAP solves at the ingest layer. Instead of treating the CMDB, the scanners, and discovery tooling as competing systems of record, PMAP reconciles them into one inventory. It does this with a diff-based sync model rather than a blind dump, a deterministic matching rule for deduplication, and an explicit source-precedence order that decides who wins when two systems disagree about the same field. The result is one record per real asset, with a clear provenance trail behind every value.

This article walks through how that reconciliation works in practice. If you are building the surrounding architecture, the security integration layer explains how connectors are registered and credentialed, and the attack surface and asset inventory pillar covers why a trustworthy inventory is the foundation for everything downstream.

Why Asset Sources Drift Apart

Asset sources drift apart because each one is built for a different job, and none of them is built to be authoritative about the others.

A CMDB exists to record the configuration items an IT organization manages. It is excellent at recording ownership, business service mapping, lifecycle state, and change history. It is only as current as the process that feeds it. New cloud instances spin up between change windows. Decommissioned hosts linger as records long after the hardware is gone. The CMDB describes the environment that was approved, which is not always the environment that exists.

A vulnerability scanner exists to find weaknesses on hosts it can reach. In PMAP, scanner integrations auto-populate assets during ingestion. When Nessus or Qualys reports a finding on an IP, an asset is created or updated to hang that finding on. Scanners are current about what is reachable and live, but they are blind to ownership, business context, and anything outside their scan scope.

Network-discovery tooling exists to enumerate what is actually on the wire. An Nmap or Masscan sweep finds responding hosts regardless of whether anyone documented them. Discovery is the closest thing to ground truth about presence, and it knows almost nothing about meaning.

Each source is right about its own slice and wrong about the rest. The drift is not a data-quality failure you can fix once. It is structural. The only durable answer is a reconciliation layer that ingests all three, matches their records to the same underlying asset, and applies clear rules about which source is trusted for which field. That layer is what the rest of this article describes.

The AssetSource Connector Archetype

PMAP organizes every external system it talks to into a small set of connector archetypes. One of them exists specifically for systems that enumerate inventory and pipeline structure rather than scan for vulnerabilities. PMAP calls it the AssetSourceConnector.

The AssetSourceConnector archetype handles repository and pipeline enumeration plus CI fan-out. In the shipped catalog it backs the source-control and CI/CD platforms: GitHub, GitLab, Azure DevOps, Jenkins, Bamboo, and Bitbucket. These connectors enumerate the repositories and pipelines a platform exposes, which become assets and scan targets inside PMAP. This is distinct from the ScannerConnector archetype, which runs remote scans and fetches results, and from the ITSMConnector archetype, which creates and syncs tickets.

The archetype model matters for one practical reason. Because connectors self-register through a registry and the core service resolves them by type rather than through hardcoded branches, adding a new asset source does not require touching the reconciliation logic. The matching, diffing, and precedence rules described below apply uniformly. A repository enumerated from GitHub and an IP-addressable host imported from a CMDB feed both land in the same inventory, governed by the same rules.
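
Resolving connectors by type through a self-registering registry can be sketched as follows. This is a minimal illustration, not PMAP's implementation: the names CONNECTORS, register, resolve, and enumerate_assets are hypothetical, and the GitHub connector returns fixed data instead of calling a real API.

```python
from typing import Callable, Dict, List

# Hypothetical registry: maps a connector type string to its class.
CONNECTORS: Dict[str, type] = {}


def register(connector_type: str) -> Callable[[type], type]:
    """Class decorator that adds a connector class to the registry."""
    def decorator(cls: type) -> type:
        CONNECTORS[connector_type] = cls
        return cls
    return decorator


class AssetSourceConnector:
    """Illustrative base class for sources that enumerate inventory."""

    def enumerate_assets(self) -> List[dict]:
        raise NotImplementedError


@register("github")
class GitHubConnector(AssetSourceConnector):
    def enumerate_assets(self) -> List[dict]:
        # A real connector would call the platform API; this returns fixed data.
        return [{"external_id": "repo-1", "name": "payments-api", "kind": "repository"}]


def resolve(connector_type: str) -> AssetSourceConnector:
    """Look the connector up by type; there is no if/elif chain per vendor."""
    try:
        return CONNECTORS[connector_type]()
    except KeyError:
        raise ValueError(f"no connector registered for {connector_type!r}") from None
```

Adding another source in this sketch means writing one more decorated class. The resolve function and anything downstream of it stay unchanged.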

It is worth being precise about where each kind of asset comes from, because PMAP uses different ingest paths for different sources. Repositories and pipelines arrive through AssetSource enumeration. Reachable hosts arrive through scanner ingestion and the discovery inbox. CMDB inventory arrives through the asset-sync diff and the bulk-create API. The sections that follow take each of these paths in turn.

Asset-Sync as a Diff, Not a Dump

The most common way to integrate a CMDB with a downstream system is the worst one. You export the full asset list, push it into the target, and hope the target sorts out what changed. This creates duplicate records, resurrects deleted hosts, and overwrites local edits on every run. PMAP does not work this way.

PMAP models CMDB integration as a diff. The asset-sync endpoint, GET /{id}/asset-sync, fetches the vendor asset list and compares it against the assets already in PMAP. What it returns is not the raw vendor list. It is a classified delta: the assets that are new, the assets that changed, and the matches that already exist and need no action. You see exactly what the sync would do before anything is written.

The import side, POST /{id}/asset-sync/import, then bulk-upserts that delta. Upsert is the operative word. New assets are created. Matched assets are updated in place. Nothing is duplicated, because the diff already resolved each vendor record to either a new asset or an existing one. Because the operation is an upsert against a stable match key rather than an insert, running the same sync twice produces the same inventory. The import pipeline is idempotent by design, which is the same property PMAP enforces across scanner imports and file imports.
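
The diff-then-upsert flow can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions: the function names compute_diff and import_delta, the dictionary shapes, and the use of external_id as the only match key are hypothetical and do not describe the actual endpoint payloads.

```python
from typing import Dict, List

Asset = Dict[str, str]


def compute_diff(vendor: List[Asset], existing: Dict[str, Asset]) -> Dict[str, List[Asset]]:
    """Classify each vendor record as new, changed, or already matched.

    `existing` is keyed by external_id, used here as the stable match key.
    This function only reads; it never writes to `existing`.
    """
    diff: Dict[str, List[Asset]] = {"new": [], "changed": [], "matched": []}
    for record in vendor:
        current = existing.get(record["external_id"])
        if current is None:
            diff["new"].append(record)
        elif any(current.get(key) != value for key, value in record.items()):
            diff["changed"].append(record)
        else:
            diff["matched"].append(record)
    return diff


def import_delta(diff: Dict[str, List[Asset]], existing: Dict[str, Asset]) -> None:
    """Upsert only the delta: create new assets, update changed ones in place."""
    for record in diff["new"] + diff["changed"]:
        existing.setdefault(record["external_id"], {}).update(record)


inventory = {"cmdb-1": {"external_id": "cmdb-1", "name": "web-prod-01", "owner": "platform"}}
feed = [
    {"external_id": "cmdb-1", "name": "web-prod-01", "owner": "payments"},
    {"external_id": "cmdb-2", "name": "db-prod-01", "owner": "data"},
]

first = compute_diff(feed, inventory)   # one new asset, one changed asset
import_delta(first, inventory)
second = compute_diff(feed, inventory)  # same feed again: nothing to create or change
```

The second pass classifies both records as matched, which is the idempotence property in miniature: the same input applied twice leaves the inventory unchanged.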

This diff-first model changes the operational character of CMDB integration. A nightly sync is not a risky bulk operation that you watch nervously. It is a reconciliation pass that, on a steady-state environment, finds little to do and changes little. The interesting runs are the ones where the diff is large, and those are exactly the runs you want a human to look at before importing.

Reviewing the Delta Before You Import

Because the diff and the import are two separate steps, there is a natural place to put a human in the loop. PMAP exposes the asset-sync diff as a review surface before any write happens.

The diff view presents the delta as a table with three groups: new assets that the CMDB knows about and PMAP does not, changed assets where the vendor record differs from the PMAP record, and existing matches that need no action. An operator can read this list, confirm it matches their understanding of recent change activity, and only then run the import. If a CMDB misconfiguration suddenly proposes creating ten thousand phantom assets, that shows up in the diff as ten thousand pending creates, not as ten thousand new rows you discover the morning after.

This separation is deliberate. The diff is read-only and safe to run as often as you like. The import is the only step that mutates the inventory, and it acts only on the delta you reviewed. For organizations where the CMDB is authoritative for some attributes but not trusted for presence, this review step is where that judgment gets applied.
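
Teams that schedule the sync may want an automated guard in front of the import step. The sketch below is one possible shape for such a guard, not a PMAP feature: the function name, the diff dictionary layout, and the max_changes threshold are all hypothetical.

```python
from typing import Dict, List


def requires_human_review(diff: Dict[str, List[dict]], max_changes: int = 50) -> bool:
    """Return True when the pending delta is too large to import unattended.

    `max_changes` is an illustrative threshold chosen for this example.
    Existing matches are ignored because they cause no writes.
    """
    pending = len(diff.get("new", [])) + len(diff.get("changed", []))
    return pending > max_changes
```

A steady-state nightly run with a handful of changes would pass the guard, while a diff proposing thousands of creates would be held for an operator.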

Discovered Assets From Network Scanners

CMDB sync answers the question of what IT thinks it manages. Network discovery answers a different and often more uncomfortable question: what is actually responding on the network, documented or not.

PMAP routes network-discovery results through a dedicated path called the discovery inbox. When a network scanner such as Nmap or Masscan finds responding hosts, those assets do not silently merge into the live inventory. They land in the inbox first, surfaced by GET /{id}/discovered-assets. This is a staging area for assets that exist on the wire but have not yet been assigned an owner, a company, or a project inside PMAP.

From the inbox, an operator assigns discovered assets in bulk. POST /{id}/discovered-assets/assign attaches the selected assets to a PMAP company or project in one call. This is the moment an undocumented host becomes a managed asset with a place in the tenancy model.

The reason for the inbox, rather than auto-assignment, is governance. Discovery is the source most likely to surface surprises: shadow IT, a forgotten lab subnet, a contractor’s device, a host that should have been decommissioned. Dropping those straight into a production project would pollute risk scoring and ownership before anyone validated them. The inbox makes discovery a reviewed promotion rather than a silent one, which is consistent with how PMAP treats every source that could introduce uncertainty.
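
The staging behavior can be modeled with two collections and an explicit promotion step. This is a minimal in-memory sketch; the class and method names are hypothetical and do not mirror the payloads of the discovered-assets endpoints.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DiscoveredAsset:
    ip: str
    project: Optional[str] = None  # None means still waiting in the inbox


@dataclass
class Inventory:
    inbox: Dict[str, DiscoveredAsset] = field(default_factory=dict)
    managed: Dict[str, DiscoveredAsset] = field(default_factory=dict)

    def stage(self, ip: str) -> None:
        """Discovery results land in the inbox, never directly in `managed`."""
        if ip not in self.managed:
            self.inbox.setdefault(ip, DiscoveredAsset(ip))

    def assign(self, ips: List[str], project: str) -> int:
        """Promote the selected inbox entries to a project in one call."""
        promoted = 0
        for ip in ips:
            asset = self.inbox.pop(ip, None)
            if asset is None:
                continue  # unknown or already assigned; skip rather than fail
            asset.project = project
            self.managed[ip] = asset
            promoted += 1
        return promoted
```

The design point is that nothing reaches the managed collection except through assign, so every promoted host has passed through an operator's selection.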

How Assets Are Matched and Deduplicated

Reconciliation lives or dies on matching. If PMAP cannot reliably decide that the host the CMDB calls WEB-PROD-01 and the IP the scanner reported are the same asset, every other rule is moot. You either get duplicates or you get collisions.

PMAP matches assets by IP, hostname, and external ID. When an import pipeline processes a vendor record, it looks for an existing asset that matches on one of those keys before deciding whether to create a new record or update an existing one. This is the same matching contract that makes the asset-sync diff and the scanner import idempotent. Re-importing the same source produces no duplicates because the second pass resolves to the same assets the first pass created.

Matching by raw string comparison is where naive systems fail, so PMAP canonicalizes before it compares. IP addresses are normalized to their canonical form before the match runs, which strips IPv4 leading zeros and IPv6 zone identifiers. The comparison is then made on the whole canonical value rather than by a loose pattern, which prevents false positives such as 10.10.235.181 being treated as related to 10.250.235.181. Hostnames are normalized to lowercase punycode, so internationalized domain names and mixed-case entries resolve to one canonical key rather than several near-duplicates.
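
Canonicalization of this kind can be approximated with the Python standard library. The sketch below is an assumption about what such normalization might look like, not PMAP's code: the function names are hypothetical, leading zeros are treated as decimal padding (some tools read them as octal), and Python's built-in idna codec implements the older IDNA 2003 rules, which a production system might replace with a newer profile.

```python
import ipaddress


def canonical_ip(raw: str) -> str:
    """Return the canonical text form of an IPv4 or IPv6 address."""
    text = raw.strip().split("%", 1)[0]  # drop an IPv6 zone identifier, if any
    if ":" not in text:
        # IPv4: strip leading zeros per octet, read as decimal (an assumption).
        text = ".".join(str(int(part, 10)) for part in text.split("."))
    # ip_address validates the result and normalizes IPv6 case and compression.
    return str(ipaddress.ip_address(text))


def canonical_hostname(raw: str) -> str:
    """Return a lowercase punycode (ASCII) form of a hostname."""
    name = raw.strip().rstrip(".")
    # The idna codec leaves pure-ASCII labels as typed, so lowercase afterwards.
    return name.encode("idna").decode("ascii").lower()


def same_host(a: str, b: str) -> bool:
    """Exact equality on canonical keys, never substring or pattern matching."""
    return canonical_ip(a) == canonical_ip(b)
```

With exact comparison on canonical keys, 010.010.235.181 and 10.10.235.181 are the same address, while 10.10.235.181 and 10.250.235.181 are simply different values with no notion of being "close".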

The deduplication story extends to the create path itself. When an asset is created, PMAP matches by name and by each IP in the request within the same company. If both the name and an IP point to the same existing asset, the request is treated as a duplicate of that asset. If the name and an IP point to different assets, PMAP raises an ambiguous-match condition rather than guessing, because silently merging two distinct assets is worse than asking. A post-insert re-resolve handles the race between checking for a match and performing the write, so concurrent imports do not slip a duplicate through the gap.
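
The create-path decision can be sketched as a function that collects every existing asset hit by the name or by any IP and then branches on how many distinct assets were found. The names and data shapes are hypothetical, inputs are assumed to be canonicalized already, and treating a hit on only one key as a duplicate is an assumption of this sketch. The post-insert re-resolve is not modeled because it depends on the storage layer.

```python
from typing import Dict, List, Optional


class AmbiguousMatch(Exception):
    """Raised when the name and the IPs point at different existing assets."""


def resolve_on_create(
    assets: List[dict], company: str, name: str, ips: List[str]
) -> Optional[dict]:
    """Return the existing asset this request duplicates, or None to create."""
    wanted_ips = set(ips)
    candidates: Dict[int, dict] = {}
    for asset in assets:
        if asset["company"] != company:
            continue  # matching is scoped to a single company
        if asset["name"] == name or wanted_ips & set(asset["ips"]):
            candidates[asset["id"]] = asset
    if len(candidates) > 1:
        # Name and IP disagree about which asset this is: ask, do not guess.
        raise AmbiguousMatch(sorted(candidates))
    return next(iter(candidates.values()), None)
```

Returning None signals that the caller should create a new asset, while the exception forces a human decision when the keys disagree about which asset the request refers to.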

The practical effect is that you can feed PMAP the CMDB, three scanners, and a discovery sweep that all describe the same web server, and end up with one asset record carrying the union of what those sources know. Which brings us to the question of what happens when those sources do not agree.

Source Precedence: Who Wins When Sources Disagree

Matching tells PMAP that two records describe the same asset. It does not tell PMAP which record to believe when they carry conflicting values for the same field. That is the job of source precedence, and PMAP makes the ordering explicit rather than leaving it to import order or luck.

The precedence chain, highest to lowest, is manual > rule > runbook > scanner > import. A value set manually by an operator outranks a value set by an automation rule. A rule outranks a runbook action. A runbook outranks a scanner. A scanner outranks an import, which is the lowest-trust source and is where bulk feeds such as a CMDB push typically land.

The most important consequence is at the top of the chain. A scanner enrichment never overwrites a manually set field unless that field is explicitly unlocked. If a security engineer manually corrected an asset’s environment or ownership, the next scanner run does not silently undo that correction. The human judgment wins, and it keeps winning until a human changes it.

PMAP does not resolve conflicts by silently picking a side and moving on, either. When a lower-precedence source tries to overwrite a value owned by a higher-precedence source, the attempt is recorded as a pending conflict for administrative review rather than applied. There is also a field-level lock mechanism, so an operator can pin a specific field against any automated source regardless of precedence. Every write is recorded with its source and its outcome, which gives you an audit trail of why a field holds the value it does.
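
The precedence chain, the pending-conflict outcome, and the field lock can be illustrated with a small per-field state machine. This is a sketch under assumptions: the names FieldState and write, the outcome labels, and the audit tuple layout are hypothetical, and the explicit unlock of manually set fields is not modeled.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple

# Highest trust first, as listed in the precedence chain.
PRECEDENCE = ["manual", "rule", "runbook", "scanner", "import"]
RANK = {source: position for position, source in enumerate(PRECEDENCE)}


@dataclass
class FieldState:
    value: Any = None
    source: Optional[str] = None  # source that owns the current value
    locked: bool = False          # operator pin against automated sources
    audit: List[Tuple[str, Any, str]] = field(default_factory=list)


def write(state: FieldState, source: str, value: Any) -> str:
    """Apply a write, or record why it was not applied."""
    if state.locked and source != "manual":
        outcome = "rejected_locked"
    elif state.source is not None and RANK[source] > RANK[state.source]:
        outcome = "pending_conflict"  # lower-trust source: queue for review
    else:
        state.value, state.source = value, source
        outcome = "applied"
    # Every attempt is logged with its source and outcome.
    state.audit.append((source, value, outcome))
    return outcome
```

In this model an import can set an empty field, a scanner can overwrite the import, and a manual edit can overwrite both, but a later scanner write against the manual value is queued as a conflict and the stored value does not change.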

This is what separates reconciliation from a last-writer-wins free-for-all. The order is fixed and visible. Manual edits are protected by default. Conflicts surface instead of disappearing. When someone asks why the inventory says what it says, the answer is in the provenance log, not in a guess about which integration ran most recently.

Scanner Enrichment That Fills the Gaps

Precedence governs conflicts, but most of the time scanners are not fighting other sources. They are filling in fields no other source provided. This is enrichment, and it is how a thin CMDB record becomes a richly described asset.

PMAP runs scanner enrichment as a silent merge into the asset’s structured data. As scanners report what they observe, PMAP folds that detail into the matched asset: open ports, running services, detected technologies, installed software, operating-system details, hostname, fully qualified domain name, and MAC address. A CMDB record that knew an asset’s name and owner gains a live picture of its actual exposed surface, sourced from the tools that can see it.

The enrichment merger is deliberately constrained. Only eight field paths may be written by enrichment: open ports, services, technologies, software, OS details, hostname, FQDN, and MAC. Any enrichment record that targets a path outside this set is rejected at merge time. Enrichment cannot reach in and rewrite ownership, criticality, or business context. It enriches the technical picture and leaves the human-owned fields alone, which is exactly the boundary you want between an automated source and the attributes a person is responsible for.
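
An allowlist check of this kind is straightforward to express in code. In the sketch below the eight path names are hypothetical spellings of the fields listed above, and the function and exception names are illustrative rather than PMAP's own.

```python
# Hypothetical path names for the eight fields enrichment may write.
ALLOWED_PATHS = frozenset({
    "open_ports", "services", "technologies", "software",
    "os_details", "hostname", "fqdn", "mac",
})


class EnrichmentRejected(Exception):
    """Raised when an enrichment record targets a path outside the allowlist."""


def merge_enrichment(asset: dict, record: dict) -> dict:
    """Return a copy of `asset` with the enrichment record merged in.

    The whole record is rejected if any path falls outside the allowlist,
    so a bad record cannot partially apply.
    """
    disallowed = sorted(set(record) - ALLOWED_PATHS)
    if disallowed:
        raise EnrichmentRejected(f"paths not writable by enrichment: {disallowed}")
    merged = dict(asset)
    merged.update(record)
    return merged
```

A record carrying open_ports and os_details merges cleanly, while a record that also tries to set owner is refused as a whole and leaves the asset untouched.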

This narrow allowlist is what makes it safe to point many scanners at the same asset. Each one contributes to the technical field set within its lane, the precedence rules arbitrate the rare overlaps, and the eight-field boundary guarantees enrichment never quietly mutates the fields that carry organizational meaning. The CMDB supplies meaning, the scanners supply the technical surface, and the merge produces an asset that is more complete than any single source.

One Inventory as the Foundation of Attack Surface

Everything above serves one outcome: a single inventory you can trust as the basis for attack-surface visibility. You cannot reason about your attack surface from three disagreeing lists. You can reason about it from one reconciled inventory where every asset is matched, deduplicated, and carries a known provenance for each field.

Once an asset is reconciled, it becomes the anchor for the rest of PMAP. Every finding references an asset. Every risk score resolves to an asset. Scan coverage, ownership resolution, and reporting all hang off the inventory. If the inventory is duplicated or contradictory, those downstream views inherit the noise. If it is clean, they inherit the clarity.

This is why CMDB integration is not a side feature. It is part of building the inventory that makes attack-surface management coherent. The attack surface and asset inventory pillar develops this further, and the operational side of curating a trustworthy inventory from many sources is covered in the guide on building a trustworthy asset inventory. For the patterns themselves, organizations standardizing on a configuration management database often align with the CMDB practice described in the ITIL service-management framework and the asset-inventory guidance in CIS Controls v8, both of which treat a single authoritative inventory as a control rather than a convenience.

How CMDB Integration Fits the Integration Layer

CMDB sync, discovery, and scanner enrichment all reach the inventory through the same connector layer that handles every other external system. That is the point of the architecture. Asset sources are not a special case bolted onto the side. They are one more category of connector, registered, credentialed, and resolved through the same hub.

This consistency is what keeps the reconciliation rules uniform. The matching keys, the diff-based import, the precedence chain, and the enrichment allowlist do not change based on which connector delivered the data. A CMDB feed, a scanner result, and a discovery sweep enter through different ingest paths, but they are governed by the same logic because they all pass through the same connector layer and the same reconciliation rules. The security integration layer pillar covers how that connector hub is structured, how credentials are protected, and how the registry resolves each connector type.

For teams evaluating PMAP against an existing stack, the question is rarely whether a single source can be imported. It is whether all of the sources can be reconciled into something you can stand behind. The answer here is a diff you can review, a match you can trust, a precedence order you can predict, and a provenance trail you can audit. That is the difference between many lists and one inventory.

Reconcile every asset source into one inventory. Read the asset inventory and risk management datasheet for the full model.

Frequently Asked Questions

Does PMAP overwrite my CMDB, or does the CMDB overwrite PMAP?

Neither overwrites the other wholesale. PMAP imports CMDB inventory as the lowest-trust source in its precedence chain, where bulk imports sit below scanner, runbook, rule, and manual sources. A CMDB feed fills in fields that nothing else has set and proposes new assets, but it does not silently override a value that a higher-precedence source already owns. Manually set fields are protected from any automated source unless explicitly unlocked.

How does PMAP avoid creating duplicate assets when several sources describe the same host?

PMAP matches incoming records against existing assets by IP, hostname, and external ID before deciding whether to create or update. IP addresses are canonicalized and hostnames are normalized to lowercase punycode before the comparison, so formatting differences do not produce false duplicates. On the create path, a match on both name and IP is treated as a duplicate of the existing asset, and a post-insert re-resolve closes the race between concurrent imports. Re-importing the same source produces no new duplicates because the import pipeline is idempotent.

What happens if the CMDB and a scanner disagree about the same field?

The source-precedence chain decides. The order, highest to lowest, is manual, rule, runbook, scanner, import. A scanner value outranks a CMDB import value, and a manual value outranks both. When a lower-precedence source tries to overwrite a value owned by a higher-precedence source, PMAP records the attempt as a pending conflict for administrative review rather than applying it silently. Every write is logged with its source and outcome.

Can I review a CMDB sync before it changes my inventory?

Yes. Asset-sync is a two-step operation. The diff endpoint fetches the vendor asset list and compares it against PMAP, returning the delta (new assets, changed assets, and existing matches that need no action) without writing anything. You review that delta as a table, confirm it matches expected change activity, and only then run the import, which bulk-upserts the reviewed delta. The diff is read-only and safe to run as often as you like.

How are assets found by network scanners handled differently from CMDB assets?

Network-discovery results route through the discovery inbox rather than merging straight into the live inventory. Discovered hosts surface in a staging area where an operator reviews them and then bulk-assigns them to a PMAP company or project. This makes promoting an undocumented host a reviewed action, which keeps shadow IT and stale lab hosts from polluting risk scoring and ownership before anyone validates them.

Which asset fields can scanner enrichment write?

Enrichment is restricted to eight field paths: open ports, services, technologies, software, OS details, hostname, FQDN, and MAC address. Any enrichment record targeting a path outside this set is rejected at merge time. This boundary lets scanners build a rich technical picture of an asset while leaving ownership, criticality, and business context to the human-owned fields and higher-precedence sources.

Do source-control platforms like GitHub count as asset sources?

Yes. Source-control and CI/CD platforms are handled by the AssetSource connector archetype, which enumerates repositories and pipelines and turns them into assets and scan targets inside PMAP. GitHub, GitLab, Azure DevOps, Jenkins, Bamboo, and Bitbucket are supported through this archetype. They are reconciled into the same inventory and governed by the same matching and precedence rules as scanner and CMDB sources.
