KYB · Graph Databases

Graph Databases for Beneficial Ownership: Mapping UBO Networks with Neo4j

December 2024 · 8 min read

Shell companies, nominee directors, circular ownership chains. The corporate structures designed to hide Ultimate Beneficial Owners are inherently graph problems. Relational databases struggle with them. Neo4j makes them tractable.

The Compliance Requirement

Know Your Business (KYB) regulations require financial institutions to identify the natural persons who ultimately own or control a corporate customer. FATF guidance cites 25% ownership as an example threshold, which most jurisdictions have adopted, though some countries use 10% or 15%. The challenge is that ownership is rarely straightforward.

A company might be owned by another company, which is owned by a trust, which is controlled by a foundation, which has a board appointed by someone who turns out to be a politically exposed person. Each layer adds opacity. Each jurisdiction has different disclosure rules. And the people who design these structures are specifically trying to make the ownership trail hard to follow.

Avg. ownership layers in flagged cases: 4.2
Cases with cross-border structures: 37%
Cases with circular ownership: 12%

Why Relational Databases Fail Here

On paper, you can model ownership in a relational database. You create an entities table, an ownership table with foreign keys, and write recursive CTEs to traverse the chain. We tried this first. It works for shallow structures (two or three layers) but breaks down in practice for several reasons.

Recursive CTEs become slow as depth increases. At four or five layers of ownership, query times grow to seconds. At seven or eight (not uncommon in complex structures), they become impractical. The optimizer struggles because it cannot predict the recursion depth in advance.

Worse, real ownership structures are not trees. They are graphs with cycles. Company A owns 30% of Company B, which owns 20% of Company C, which owns 15% of Company A. Recursive CTEs need explicit cycle detection to avoid infinite loops, adding complexity and further degrading performance.
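To make the cycle problem concrete, here is a minimal Python sketch of the guard any recursive traversal needs, using the three-company loop described above. The `OWNS` dictionary and the `upstream_paths` function are illustrative names, not part of our pipeline. The only point is that each path has to remember which entities it has already visited.

```python
# Hypothetical edge list for the A/B/C example: owner -> {owned: percent}.
OWNS = {
    "Company A": {"Company B": 30.0},
    "Company B": {"Company C": 20.0},
    "Company C": {"Company A": 15.0},
}


def owners_of(entity):
    """Yield (owner, pct) pairs for the direct owners of `entity`."""
    for owner, stakes in OWNS.items():
        if entity in stakes:
            yield owner, stakes[entity]


def upstream_paths(target, max_depth=10):
    """Walk ownership upward from `target`, refusing to revisit an entity
    already on the current path. Returns (path, effective_pct) tuples."""
    results = []

    def walk(entity, path, pct, depth):
        if depth >= max_depth:
            return
        for owner, stake in owners_of(entity):
            if owner in path:  # cycle guard: skip instead of looping forever
                continue
            effective = pct * stake / 100.0
            new_path = path + [owner]
            results.append((new_path, effective))
            walk(owner, new_path, effective, depth + 1)

    walk(target, [target], 100.0, 0)
    return results


paths = upstream_paths("Company A")
for path, pct in paths:
    print(" <- ".join(path), f"{pct:.2f}%")
```

Without the `owner in path` check, the walk would only stop at `max_depth`, emitting the same loop over and over with ever smaller percentages.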

Finally, relational databases have no native concept of path analysis. Questions like "what is the shortest ownership path between Person X and Company Y" or "are there any circular structures involving this entity" require complex application-level logic. In a graph database, these are single-line queries.
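For a sense of what that application-level logic looks like, the sketch below implements a shortest-path search by hand with a breadth-first search. The entity names and the `HOLDINGS` structure are made up for illustration, and the search follows ownership edges in one direction only (owner to owned), which is an assumption about how the question is framed.

```python
from collections import deque

# Hypothetical ownership edges: owner -> entities it holds a stake in.
HOLDINGS = {
    "Person X": ["Trust T", "Company P"],
    "Trust T": ["Company Q"],
    "Company P": ["Company Q"],
    "Company Q": ["Company Y"],
}


def shortest_ownership_path(start, goal):
    """Breadth-first search over ownership edges.
    Returns the path with the fewest hops, or None if no path exists."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in HOLDINGS.get(node, []):
            if nxt not in seen:  # never enqueue the same entity twice
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(shortest_ownership_path("Person X", "Company Y"))
```

This is manageable for one relationship type held in memory. It gets harder once the edges live in SQL tables and the path may mix ownership, directorship, and control relationships.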

SQL vs. Cypher: The Same Query, Two Paradigms

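-- Walk up the ownership chain from one company, multiplying stakes per layer
WITH RECURSIVE ownership_chain AS (
  -- Anchor: direct owners of the target company
  -- (cast so the multiplication below is not integer division)
  SELECT parent_id, child_id, ownership_pct::numeric AS ownership_pct, 1 AS depth
  FROM corporate_ownership
  WHERE child_id = 'accra_holdings'

  UNION ALL

  -- Recursive step: owners of the owners, carrying the effective stake
  SELECT co.parent_id, co.child_id,
    co.ownership_pct * oc.ownership_pct / 100,
    oc.depth + 1
  FROM corporate_ownership co
  JOIN ownership_chain oc ON co.child_id = oc.parent_id
  WHERE oc.depth < 10  -- depth cap; also the only thing stopping a cycle
)
SELECT * FROM ownership_chain
JOIN entities ON entities.id = ownership_chain.parent_id
WHERE ownership_chain.ownership_pct > 25;

-- Verbose, hard to extend, slow at depth > 5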
Advantages
✓ Familiar syntax
✓ No new infrastructure
Limitations
✗ Recursive CTEs are slow at depth
✗ Hard to add relationship types
✗ Cannot traverse bidirectionally
✗ No native graph algorithms
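One more detail about the CTE above: each row it returns is a single path, so the `> 25` filter is applied per path. An owner who holds stakes through several parallel chains can fall under the threshold on every individual path while exceeding it in total. The sketch below sums path-level percentages per owner before filtering. The rows and the `owners_over_threshold` helper are hypothetical, and summing across paths is a common convention for acyclic structures rather than a description of our production logic.

```python
from collections import defaultdict

# Hypothetical rows shaped like the CTE output: (ultimate_owner, path_pct).
PATH_ROWS = [
    ("Person X", 15.0),  # via one holding company
    ("Person X", 12.0),  # via a second, parallel chain
    ("Person Z", 30.0),
    ("Person W", 8.0),
]


def owners_over_threshold(rows, threshold=25.0):
    """Sum effective ownership per owner across all paths,
    then keep only owners whose total exceeds the threshold."""
    totals = defaultdict(float)
    for owner, pct in rows:
        totals[owner] += pct
    return {owner: total for owner, total in totals.items() if total > threshold}


flagged = owners_over_threshold(PATH_ROWS)
print(flagged)
```

Here Person X is flagged only after aggregation, since neither of the two individual paths crosses 25% on its own.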

The Neo4j Data Model

We modeled the ownership graph with four node types and five relationship types. The schema is deliberately simple because the power comes from traversal, not from entity attributes.

Property Graph Schema

Node Types
Person: name, nationality, dob, pep_tier
Company: name, jurisdiction, reg_number, status
Trust: name, jurisdiction, trust_type
Foundation: name, jurisdiction, purpose

Relationship Types
OWNS (direct ownership stake): ownership_pct, since, source
DIRECTS (board or management position): role, since, nominee
CONTROLS (voting rights, veto power): mechanism, since
BENEFITS_FROM (trust beneficiary): benefit_type, since
RELATED_TO (family or close association): relation_type
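As a rough illustration of how part of this schema could be mirrored in application code (for example, to validate records before loading them), here is a small Python sketch covering two node types and the OWNS relationship. The property names come from the schema above. The Python types, the date format, the meaning of `pep_tier = None`, and every sample value are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Person:
    name: str
    nationality: str
    dob: str                        # assumed: ISO date string
    pep_tier: Optional[int] = None  # assumed: None means not a PEP


@dataclass(frozen=True)
class Company:
    name: str
    jurisdiction: str
    reg_number: str
    status: str


@dataclass(frozen=True)
class Owns:
    owner: Union[Person, Company]
    owned: Company
    ownership_pct: float
    since: str
    source: str

    def __post_init__(self):
        # Reject stakes outside (0, 100] before they reach the graph.
        if not 0 < self.ownership_pct <= 100:
            raise ValueError(f"ownership_pct out of range: {self.ownership_pct}")


# Placeholder values only; none of these refer to real entities.
holder = Person("Example Person", "GH", "1970-01-01")
firm = Company("Example Co Ltd", "Ghana", "REG-0000", "active")
stake = Owns(holder, firm, 60.0, "2020-01-01", "registry")
print(stake.ownership_pct)
```

Trust, Foundation, and the other four relationship types would follow the same pattern.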

Detecting Hidden UBOs: A Real Example

The interactive diagram below walks through a real (anonymized) case from our KYB screening. At the surface, a Ghanaian company appeared to have a straightforward ownership structure. Each layer of investigation revealed additional complexity, until graph traversal identified the UBO as a Tier 1 PEP.

Interactive: Drill Through Ownership Layers

Registered Owner: Accra Holdings Ltd (Company · Ghana)
Risk: Low (apparent)

A Ghana-registered company. Appears on the corporate registry with a local director. Standard KYB would stop here.

Without graph traversal, this case would have stopped at Layer 1. The BVI company would have been noted, perhaps flagged for manual review, but the connection through a Mauritian trust to a politically exposed person would likely have been missed. The traditional KYB process simply does not traverse deeply enough.

Detecting Circular Ownership

Circular ownership is a red flag in compliance. If Company A owns Company B, which owns Company C, which in turn owns Company A, the structure is likely designed to obscure true control. In Neo4j, detecting cycles is a single query:

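// Find ownership loops of 2 to 8 hops that start and end at the same company
MATCH path = (c:Company)-[:OWNS*2..8]->(c)
RETURN [node IN nodes(path) | node.name] AS cycle,
  length(path) AS depth,
  // Multiply stakes around the loop; 100.0 avoids integer division
  reduce(pct = 100.0, r IN relationships(path) |
    pct * r.ownership_pct / 100) AS round_trip_ownership
ORDER BY depth ASC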

In our dataset, 12% of flagged corporate structures contained at least one ownership cycle. Every single one warranted enhanced investigation. Circular ownership is not always illicit (cross-shareholding structures exist in legitimate conglomerates), but the correlation with suspicious activity in our data was strong enough to make it a high-priority signal.
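For readers without a Neo4j instance at hand, the sketch below reproduces the idea of the cycle query in plain Python: enumerate loops of 2 to 8 ownership hops and multiply the stakes around each one. The `STAKES` data and the `find_cycles` function are illustrative. Unlike the Cypher pattern, which matches the same loop once per company on it, this version reports each loop once and only considers simple cycles (no entity repeated inside the loop).

```python
# Hypothetical stakes: owner -> {owned: percent}.
STAKES = {
    "Company A": {"Company B": 30.0},
    "Company B": {"Company C": 20.0},
    "Company C": {"Company A": 15.0, "Company D": 40.0},
}


def find_cycles(stakes, min_len=2, max_len=8):
    """Return {cycle_nodes: round_trip_pct} for simple ownership cycles
    of min_len..max_len hops, with each cycle reported once."""
    found = {}

    def walk(start, node, path, pct):
        for nxt, stake in stakes.get(node, {}).items():
            share = pct * stake / 100.0
            if nxt == start and len(path) >= min_len:
                # Rotate so the alphabetically smallest name leads, which
                # collapses A->B->C and B->C->A into a single entry.
                i = path.index(min(path))
                found[tuple(path[i:] + path[:i])] = share
            elif nxt not in path and len(path) < max_len:
                walk(start, nxt, path + [nxt], share)

    for company in stakes:
        walk(company, company, [company], 100.0)
    return found


cycles = find_cycles(STAKES)
for nodes, pct in cycles.items():
    print(" -> ".join(nodes), f"round trip: {pct:.2f}%")
```

For the 30% / 20% / 15% loop, the round-trip figure is 0.30 × 0.20 × 0.15 = 0.9%, meaning Company A indirectly holds under 1% of itself through the loop.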

Performance Considerations

Query Performance: PostgreSQL vs. Neo4j

| Query | PostgreSQL | Neo4j |
|---|---|---|
| Find UBO (3 layers) | 45 ms | 8 ms |
| Find UBO (6 layers) | 1,200 ms | 12 ms |
| Find UBO (10 layers) | Timeout | 18 ms |
| Detect circular ownership | 3,400 ms | 25 ms |
| Shortest path between entities | N/A | 6 ms |
| Community detection | N/A | 340 ms |

Lessons Learned

Start with the queries, not the schema

We designed our graph model around the compliance questions we needed to answer: Who is the UBO? Are there circular structures? What is the effective ownership percentage? The schema followed from the queries, not the other way around.

Graph databases complement, not replace, relational ones

Our transactional data still lives in PostgreSQL. The ownership graph in Neo4j is a specialized projection updated nightly. Each database does what it is good at.

Data quality is the real bottleneck

Neo4j makes traversal fast and elegant. But garbage in, garbage out still applies. The hardest part of our UBO detection pipeline is not the graph queries. It is getting accurate, up-to-date ownership data from 54 different African jurisdictions.

Patrick Attankurugu
Senior AI/ML Engineer at Agregar Technologies, building KYB compliance systems with graph databases.