How FGAC Works: The Policy Evaluation Flow

From User to Data and Back:

Fine-grained access control (FGAC) spans the entire request path. Understanding this flow is critical because policies must be enforced consistently across interactive queries, batch jobs, and even vector embeddings for generative AI applications.

The journey starts with authentication. A user logs into a business intelligence tool like Looker. Authenticating against an identity provider produces a token that carries not just a user ID but also attributes such as team membership, regional scope, and clearance level. For example, owns_customers_in_region=EU or clearance=finance_only.

1. Identity Context: The query arrives at the data engine with a verified token containing user attributes and group memberships.
2. Policy Lookup: The engine queries a central policy store for applicable rules. For a 10 billion row fact table, it might retrieve a predicate like region IN permitted_regions that reduces visibility to 100 million rows.
3. Query Rewrite: The optimizer injects row filters and column masks into the logical plan. This happens at compile time before execution begins.
4. Pushdown Execution: The rewritten query with filters pushes down to storage or compute nodes. Every path to data enforces the same policies.
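
The four steps above can be sketched end to end in a few lines. This is a minimal illustration, not how any particular engine implements it: IdentityContext, POLICY_STORE, rewrite_query, run_as, and the sales table are all hypothetical names, an in-memory SQLite database stands in for the data engine, and denying access when no policy exists is an assumed default.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentityContext:
    """Step 1: verified attributes taken from the user's token (hypothetical shape)."""
    user_id: str
    permitted_regions: tuple


# Step 2: stand-in for a central policy store, keyed by table name.
# Each policy names the column to filter and the identity attribute
# that supplies the allowed values.
POLICY_STORE = {"sales": {"column": "region", "attribute": "permitted_regions"}}


def rewrite_query(table, identity):
    """Step 3: inject the row filter into the query before it executes."""
    policy = POLICY_STORE.get(table)
    if policy is None:
        # Assumed default: no policy means no access.
        raise PermissionError(f"no policy registered for table {table!r}")
    allowed = tuple(getattr(identity, policy["attribute"]))
    if not allowed:
        # No permitted values: return a query that matches nothing.
        return f"SELECT region, amount FROM {table} WHERE 0", ()
    placeholders = ", ".join("?" for _ in allowed)
    sql = (
        f"SELECT region, amount FROM {table} "
        f"WHERE {policy['column']} IN ({placeholders})"
    )
    return sql, allowed


def run_as(conn, table, identity):
    """Step 4: execute only the rewritten query, never the raw one."""
    sql, params = rewrite_query(table, identity)
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EU", 100), ("US", 250), ("APAC", 75)],
)

eu_analyst = IdentityContext(user_id="alice", permitted_regions=("EU",))
rows = run_as(conn, "sales", eu_analyst)
print(rows)  # [('EU', 100)]
```

Because the filter is added during the rewrite, the caller never sees an unfiltered plan. A real engine would apply the same idea to its logical plan rather than to SQL text.
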
The Permission Table Pattern:

Many implementations use a permission table to avoid massive OR predicates. Instead of generating WHERE customer_id IN (1, 2, 3...1000000), the engine joins with a small table mapping users to allowed rows. This permission table typically has millions of rows versus billions in fact tables, stays highly cached in memory, and gets updated by identity provisioning systems.
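
A small sketch of the join-based approach, again using in-memory SQLite as a stand-in for the engine. The orders and user_permissions tables, their columns, and the visible_orders helper are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount      INTEGER
    );
    -- Permission table: one row per (user, customer) pair the user may see.
    CREATE TABLE user_permissions (
        user_id     TEXT,
        customer_id INTEGER,
        PRIMARY KEY (user_id, customer_id)
    );
    INSERT INTO orders VALUES (1, 10, 500), (2, 11, 120), (3, 12, 900), (4, 10, 60);
    INSERT INTO user_permissions VALUES ('alice', 10), ('alice', 12), ('bob', 11);
    """
)


def visible_orders(conn, user_id):
    """Return only the orders whose customer is mapped to this user."""
    return conn.execute(
        """
        SELECT o.order_id, o.customer_id, o.amount
        FROM orders AS o
        JOIN user_permissions AS p ON p.customer_id = o.customer_id
        WHERE p.user_id = ?
        ORDER BY o.order_id
        """,
        (user_id,),
    ).fetchall()


alice_rows = visible_orders(conn, "alice")
bob_rows = visible_orders(conn, "bob")
print(alice_rows)  # [(1, 10, 500), (3, 12, 900), (4, 10, 60)]
print(bob_rows)    # [(2, 11, 120)]
```

The query text stays the same size no matter how many customers a user may see; only the permission table grows, and granting or revoking access becomes a row insert or delete rather than a change to the query.
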

Performance Impact:

Policy lookup: < 1 ms
P50 query: 3-5 s

The critical design goal is keeping policy evaluation cost under 1 millisecond so that latency remains dominated by actual data processing, not by security checks. At scale with 200,000 interactive queries daily, you cannot afford 10 milliseconds of policy overhead per query without violating Service Level Objectives (SLOs).
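
To put the budget in aggregate terms, the short calculation below uses only the figures quoted above (200,000 queries per day, 1 ms versus 10 ms per lookup). The function name is hypothetical, and the calculation does not model any specific SLO; it only totals the per-query overhead across a day.

```python
QUERIES_PER_DAY = 200_000  # interactive query volume quoted in the text


def daily_overhead_seconds(per_query_ms):
    """Total time spent on policy evaluation per day, in seconds."""
    return QUERIES_PER_DAY * per_query_ms / 1000


within_budget = daily_overhead_seconds(1)
over_budget = daily_overhead_seconds(10)

print(within_budget)  # 200.0 seconds per day at 1 ms per lookup
print(over_budget)    # 2000.0 seconds per day at 10 ms per lookup
```
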

💡 Key Takeaways

✓ Policies are evaluated at query compile time and rewritten into the logical plan before execution

✓ Permission tables map users to allowed rows, avoiding massive OR predicates with thousands of terms

✓ Policy lookup must complete in under 1 ms to avoid dominating query latency at scale

✓ The same policies must apply across all data access paths: SQL, batch jobs, vector search, and exports

✓ Identity tokens carry attributes like region and clearance that drive policy decisions, not just user IDs

📌 Interview Tips

1. A query that would normally scan 200 GB gets a permission table join that filters it down to 2 GB before the scan starts.

2. A user with 50 group memberships joins to a precomputed permission view instead of generating a WHERE clause with 50 OR conditions.

3. A data pipeline running as a service principal inherits scoped permissions and cannot read data outside its designated scope.
