Securing PII in Snowflake
As companies continue to adopt Snowflake, a common challenge is scaling governance and security alongside platform adoption. New use cases mean more users, finer access controls, and more records to audit. The “central data hub” promise of Snowflake is real, but it adds responsibility for your data team. Even if you slept at a Holiday Inn Express last night and have your YAML and Terraform ready, this will require time and patience.
A particular focus for data teams is managing access to Personally Identifiable Information (PII). Snowflake offers native PII controls, but setting them up, maintaining them, and monitoring them can be complex. Beyond PII, understanding existing roles and permissions, seeing what each user can access, and reconciling these details is often not intuitive.
In this post, we discuss the importance of securing PII in Snowflake, review traditional methods of data protection, and look at how Spyglass revolutionizes the way data teams implement the principle of least privilege.
With that said, let’s dive in!
Data breaches are a constant risk
Picture this: You're sipping your morning coffee, scrolling through headlines, when BAM! Another data breach. Sensitive information exposed, reputations tarnished, trust shattered. Protecting PII isn't just a checkbox on your compliance list; it's the guardian of your customers’ trust and the shield for your company’s reputation.
Here are a few examples of recent data breaches with pretty serious consequences.
T-Mobile Breach (2021) – Over 40 million customers had their personal data exposed, including SSNs, license/ID information, and more. The breach resulted in significant legal and financial repercussions for T-Mobile, including a $350 million settlement to resolve a class-action lawsuit.
Capital One (2019) – A hacker gained access to the personal information of over 100 million individuals, including Social Security numbers and bank account details. Capital One faced an $80 million fine from U.S. regulators and settled customer lawsuits for $190 million.
Ashley Madison (2015) – 32 million customers had their names, emails, and payment information exposed in a hack that would become the subject of a Netflix documentary and resulted in an $11.2 million settlement.
The financial implications of failing to protect PII continue to grow. According to IBM's Cost of a Data Breach Report, the global average cost of a data breach in 2024 was $4.88 million.
This figure encompasses everything from regulatory fines and legal fees to the cost of lost business and reputational damage. For many companies, a breach of this magnitude can be financially crippling.
Less tangible yet potentially more significant is how a data breach erodes trust.
When customers hand over their personal information, they do so with the expectation that it will be safeguarded. A breach shatters this trust, making customers hesitant to share their data in the future. Rebuilding this trust isn’t easy—it requires time, transparency, and a demonstrated commitment to data security.
Data breaches also impact real people. A customer whose identity is stolen because of a data breach might face financial ruin, emotional distress, and years of untangling the mess. Protecting PII is about more than just compliance—it’s about safeguarding the individuals behind the data.
How data teams protect their data today
Many of the breaches above were the result of poor access control: internal users had access to far more data than they needed. This allowed attackers to exfiltrate large volumes of data in a short time.
So how do data teams address this data access problem today?
Many teams rely on manually configuring tools like Snowflake’s role-based access control (RBAC) and dynamic data masking policies. But these tools aren’t without their flaws. While powerful, they are difficult to configure, hard to maintain, and require expertise to implement correctly.
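To give a sense of what that manual configuration involves, here is a minimal sketch in Python that only builds the Snowflake SQL a team would typically run by hand: a dynamic masking policy, its attachment to a column, and an RBAC grant. The policy, role, table, and column names (pii_email_mask, PII_READER, analytics.public.customers, email) are hypothetical, and the helper functions are illustrative rather than part of any Snowflake or Spyglass API. The strings are not escaped, so this should not be fed untrusted input.

```python
# Hypothetical names throughout: the policy, role, table, and column
# are placeholders, not objects from a real account.

def masking_policy_sql(policy, allowed_roles, mask="***MASKED***"):
    """Build a CREATE MASKING POLICY statement for a STRING column.

    Roles listed in allowed_roles see the real value; every other
    role sees the mask literal instead.
    """
    roles = ", ".join(f"'{role}'" for role in allowed_roles)
    return (
        f"CREATE MASKING POLICY IF NOT EXISTS {policy} "
        f"AS (val STRING) RETURNS STRING ->\n"
        f"  CASE WHEN CURRENT_ROLE() IN ({roles}) THEN val "
        f"ELSE '{mask}' END;"
    )


def apply_policy_sql(table, column, policy):
    """Build the statement that attaches a masking policy to a column."""
    return (
        f"ALTER TABLE {table} MODIFY COLUMN {column} "
        f"SET MASKING POLICY {policy};"
    )


def grant_select_sql(table, role):
    """Build the RBAC grant that lets a role query the table."""
    return f"GRANT SELECT ON TABLE {table} TO ROLE {role};"


if __name__ == "__main__":
    statements = [
        masking_policy_sql("pii_email_mask", ["PII_READER"]),
        apply_policy_sql("analytics.public.customers", "email", "pii_email_mask"),
        grant_select_sql("analytics.public.customers", "PII_READER"),
    ]
    for statement in statements:
        print(statement)
```

Even in this small case there are three statements to keep in sync per column, which hints at why the approach gets hard to maintain as tables and roles multiply.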
Monthly or quarterly audits are also common, where teams put together ad-hoc reports to figure out:
- Where is all the PII located?
- Who has access to that PII?
- Who should have access to that PII?
- Who has actually accessed that PII?
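The four audit questions above boil down to set comparisons. The sketch below is a simplified illustration in Python using made-up, in-memory data; the table, role, and user names are hypothetical, and a real audit would pull these inputs from Snowflake's grants and query history rather than from literals.

```python
# All inputs are invented for illustration only.
PII_TABLES = {"customers"}                                  # where the PII lives
GRANTS = {"ANALYST": {"orders", "customers"},               # role -> readable tables
          "SUPPORT": {"customers"}}
USER_ROLES = {"ana": {"ANALYST"}, "sam": {"SUPPORT"}, "lee": {"ANALYST"}}
APPROVED = {"customers": {"sam"}}                           # who should have access
ACCESS_LOG = [("sam", "customers"), ("ana", "orders"), ("lee", "customers")]


def users_with_access(table, grants, user_roles):
    """Return the users holding at least one role that can read the table."""
    return {
        user
        for user, roles in user_roles.items()
        if any(table in grants.get(role, set()) for role in roles)
    }


def audit_pii(pii_tables, grants, user_roles, approved, access_log):
    """Answer the audit questions for each PII table."""
    report = {}
    for table in sorted(pii_tables):
        has_access = users_with_access(table, grants, user_roles)
        accessed = {user for user, logged_table in access_log if logged_table == table}
        report[table] = {
            "has_access": has_access,                               # who can read it
            "excess": has_access - approved.get(table, set()),      # can, but should not
            "accessed": accessed,                                   # who actually did
            "unused": has_access - accessed,                        # access never used
        }
    return report


if __name__ == "__main__":
    for table, findings in audit_pii(
        PII_TABLES, GRANTS, USER_ROLES, APPROVED, ACCESS_LOG
    ).items():
        print(table, findings)
```

The "excess" and "unused" sets are the candidates for revocation under the principle of least privilege; assembling their inputs by hand each month or quarter is the time-consuming part.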
Most of all, data teams are in the critical path for all security and access changes, which is often a time-consuming operational burden. The movement towards data self-service has helped, but the dream hasn’t been realized yet.
How Spyglass makes PII simple
Here’s how Spyglass transforms the daunting task of protecting PII into a walk in the park:
Start scanning in a few clicks
Start by creating a new auto-classifier (e.g. “PII”, “SOC2”, “HIPAA”, etc.) and select the data that is in scope. By default, Spyglass classifies all of your data, but you can opt out of scanning development databases or similar.
On the auto-classifier page, you can start the initial scan in one click. Afterwards, scanning happens continuously without any required work from your team!
View scan progress
During this scan, Spyglass runs a data classification process on all tables in the configured databases and schemas, and reports on its progress. You can see which databases have been fully scanned (labeled as “Active”) and which are still being scanned (labeled as “Scanning”).
For each database, you can see a list of schemas that have been scanned, as well as the functional roles that allow access to data within those schemas. Additionally, you can see a summary of the tables that contain personal information.
You can review the automation activity and see exactly which tables are being classified throughout the initial scan. Later, as new tables are created, their classification activity also appears on this tab.
Audit user behavior
Empower your governance team to find the exact data that they need, when they need it. Not only can you filter by basic query information (user, role, etc.), but you can also filter by the data accessed. For example, you can look for specific PII that was viewed based on the table or column name.
Automated compliance
Spyglass doesn’t just react to compliance issues—it anticipates them. We provide proactive insights about your data, offering guidance on how to fix common issues before they become problems.
With automated compliance, implementing the principle of least privilege becomes second nature. Spyglass continuously monitors your data environment, ensuring that users only have the access they need, nothing more, nothing less.
And every change you make goes through our centralized change management system, so that changes are approved, auditable, and reversible.
Want to learn more?
Spyglass delivers security automation for data teams and self-service for data consumers. We handle security so you can focus on the job you were hired for, not the one you inherited. If you’ve nodded your head while reading this, reach out to us at demo@spyglass.software!