Data Security vs Data Privacy: Key Differences Explained

Hao Wu

Software Engineer

June 18, 2026

Data security and data privacy are used interchangeably often enough that the distinction has blurred, but they answer two different questions. Data security asks whether the wrong people can reach, alter, or destroy your data. Data privacy asks whether your organization should be collecting, using, and sharing the personal data it holds in the first place, and whether the people that data describes have a say in it. The two are related, and the relationship is asymmetric: you can lock data down tightly and still use it in ways the people it concerns never agreed to, but you cannot honor a privacy commitment on data you are unable to keep secure. Security is necessary for privacy, and not sufficient for it.

That asymmetry is why the terms cannot be collapsed into one. A breach is a security failure; selling customer data without consent is a privacy failure, even if the data was encrypted the whole time. This post defines each discipline on its own terms, lays the two side by side in a comparison table, explains how they reinforce each other in practice, and works through when each one is the part of the problem to lead with.

Get Started with PuppyGraph for FREE

What is data security?

Data security is the protection of data itself, across its lifecycle and wherever it lives, against unauthorized access, modification, exfiltration, and loss. The emphasis on the data itself is the important part. The goal is to keep the information protected regardless of which system holds it at a given moment, which means defending it at rest in storage, in transit between systems, and in use while it is being processed, rather than only hardening a perimeter around it.

The discipline is usually framed by the CIA triad, three properties that together describe what it means for data to be secure. Confidentiality means only authorized parties can read the data. Integrity means the data is accurate and has not been tampered with or corrupted. Availability means authorized users can reach the data when they need it, so a ransomware attack that locks up records is a security failure even though nothing was disclosed. A security program is, in effect, the set of controls that maintain these three properties against accident, insiders, and attackers.

One characteristic of data security is worth stating plainly because it is what most cleanly separates it from privacy: security is largely indifferent to what the data is about. The controls that protect a database of customer records are the same controls that protect source code, financial statements, and machine credentials. Data security covers all sensitive data, not only the personal kind, and it is concerned with keeping that data away from parties who are not authorized to access it, not with whether authorized use of it is appropriate.

Key features

The controls that make up a data security program fall into a handful of recurring categories.

Encryption and tokenization render data unreadable to anyone without the key, protecting it at rest and in transit, and increasingly in use. Tokenization goes a step further for specific fields by substituting a sensitive value with a non-sensitive surrogate, common for payment numbers. Both are only as strong as the key management behind them.

Access control and identity management enforce who is allowed to reach which data, built on authentication (proving who a principal is) and authorization (deciding what that principal may do), and governed by least privilege and role-based access. This is where the standing question of who can reach what actually lives.

Data loss prevention (DLP) monitors and blocks inappropriate movement of sensitive data across channels such as email, endpoints, and cloud uploads, stopping, for example, a classified file from being sent to a personal account.

Monitoring, logging, and auditing provide the continuous visibility to detect anomalous access and to reconstruct what happened after an incident, which is also the evidence base that compliance reporting draws on.

Backup and recovery maintain recoverable copies and tested restoration paths so that data survives corruption, deletion, and ransomware. In the ransomware era this availability concern is squarely a security control, not a separate IT housekeeping task.
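
As an illustration of the tokenization idea described in this list, here is a minimal Python sketch. It assumes a hypothetical in-memory `TokenVault` class (not a real library) and uses a well-known test card number; a production vault would be an encrypted, access-controlled service rather than a pair of dictionaries.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault mapping surrogate tokens to real values."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value):
        # Reuse the existing token so one value always maps to one surrogate.
        if value in self._value_to_token:
            return self._value_to_token[value]
        # The token is random, so it reveals nothing about the original value.
        token = "tok_" + secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token):
        # Only code with access to the vault can reverse a token.
        # Unknown tokens raise KeyError.
        return self._token_to_value[token]


vault = TokenVault()
card_number = "4111111111111111"  # standard test number, not real data
token = vault.tokenize(card_number)
```

Downstream systems can store and join on `token` without ever holding the card number, which narrows the set of systems that need the strictest controls.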

What is data privacy?

Data privacy is the governance of how personal data is collected, used, shared, and retained, together with the rights individuals hold over the data that describes them. Where security is about protecting data from unauthorized parties, privacy is about the proper handling of personal data by the authorized parties themselves. It is concerned less with whether an attacker can reach the data and more with whether the organization that legitimately holds it is using it in ways that are lawful, disclosed, and consistent with what the individual agreed to.

This is the can-versus-should distinction. Security is largely about what people are able to do to data: who can read it, change it, move it. Privacy is about what an organization ought to do with personal data it already has legitimate access to. A company can have flawless security and still commit a privacy violation by quietly repurposing data it collected for one reason to serve an entirely different one. The data was never exposed; it was misused.

Privacy also narrows the scope of what is in play. Where security protects all sensitive data, privacy is specifically about personal data: information that identifies or relates to an individual, broadly what US contexts call PII, though regimes like the GDPR define personal data more widely (an IP address or an online identifier can count). And much of what privacy requires comes from law. Regimes such as the GDPR in the EU, CCPA and CPRA in California, and HIPAA for US health data impose specific obligations on how particular categories of personal data are handled, with real penalties for getting it wrong. The GDPR, for instance, requires that personal data be kept secure (it folds the CIA triad in as one obligation), but it goes well beyond security to demand lawful basis, consent, and honoring individual rights, which are requirements security controls alone never address.

Key features

A privacy program is built around a different set of obligations than a security one.

Lawful basis and consent establish that the organization has a legitimate reason to process personal data in the first place, and, where consent is the basis, that it was freely given, specific, and revocable. This has no analog in security, which does not ask why data is held, only that it be protected.

Purpose limitation restricts personal data to the use it was collected for, so data gathered to fulfill an order cannot be silently redirected into ad targeting without a new basis.

Data minimization and retention limits require collecting only the personal data actually needed and keeping it only as long as there is a reason to, rather than retaining everything indefinitely because storage is cheap.

Data subject rights give individuals standing to act on their own data: to access it, correct it, delete it, port it elsewhere, or object to its processing. Servicing these requests is an operational obligation with deadlines, and it requires knowing everywhere a person's data lives. (The requests are often grouped under the label data subject access requests, or DSARs, although strictly a DSAR covers only the right of access.)

Transparency and notice require telling people what is collected and why, in terms they can understand, so that consent and expectations rest on accurate information rather than buried defaults.
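
Purpose limitation and retention limits can be pictured as a check that runs before any processing. The Python sketch below is only an illustration under stated assumptions: the `PersonalRecord` type, the `may_process` function, and the purpose names are hypothetical, and real regimes involve legal bases that a boolean check cannot capture.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class PersonalRecord:
    subject_id: str
    collected_on: date
    allowed_purposes: frozenset  # purposes disclosed at collection time
    retention_days: int


def may_process(record, purpose, today):
    """Return True only if the purpose was disclosed and retention has not lapsed."""
    expires_on = record.collected_on + timedelta(days=record.retention_days)
    within_retention = today <= expires_on
    return within_retention and purpose in record.allowed_purposes


order = PersonalRecord(
    subject_id="cust-001",
    collected_on=date(2025, 1, 1),
    allowed_purposes=frozenset({"order_fulfillment"}),
    retention_days=365,
)

fulfil_ok = may_process(order, "order_fulfillment", date(2025, 6, 1))
ads_ok = may_process(order, "ad_targeting", date(2025, 6, 1))
late_ok = may_process(order, "order_fulfillment", date(2026, 6, 1))
```

Here the order data may be used to fulfill the order within its retention window, but not for ad targeting and not after the window has closed, even though the caller is fully authorized in the security sense.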

Data security vs data privacy: comparison table

The two disciplines overlap in tooling and teams, but they answer different questions, are driven by different forces, and fail in different ways. The table below lays the contrast out along the dimensions that matter when reasoning about either one, including how each one fails, which is where the distinction becomes most concrete.

Dimension	Data Security	Data Privacy
Core Question	Can unauthorized parties reach, alter, or destroy the data?	Is personal data collected and used lawfully and as the individual expects?
What It Protects	The data itself, regardless of subject	Personal data, and individuals' rights over it
Scope of Data	All sensitive data (IP, financials, credentials, personal)	Personal / identifiable data only
Primary Driver	Threat landscape and breach risk	Regulation, individual rights, and consent
Typical Owners	Security and infrastructure teams	Legal, privacy, and compliance teams (often a DPO)
Core Mechanisms	Encryption, access control, DLP, monitoring, backup	Consent, purpose limitation, minimization, rights handling, notice
Reference Frameworks	CIA triad, ISO 27001, NIST CSF, SOC 2	GDPR, CCPA/CPRA, HIPAA, FIPPs
Failure Mode	Breach, leak, ransomware, tampering	Misuse, over-collection, processing without consent, ignored rights
Relationship	Necessary for privacy; not sufficient	Depends on security; adds use and governance constraints on top

The failure-mode row is the one that makes the difference legible. A security failure looks like data reaching someone who should never have had it: a leak, a ransomware lockup, a tampered record. A privacy failure can happen with the data never leaving authorized hands at all: collecting more than was disclosed, keeping it past its purpose, using it for something the individual never agreed to, or failing to honor a deletion request. This is the can-versus-should split made concrete. Security keeps the wrong people out; privacy constrains what the right people are allowed to do. The two interlock precisely because secured data can still be misused, and well-governed data can still be breached, so neither discipline covers for the other's failure mode.

How data security and data privacy work together

In practice the two are layers of one program, not separate concerns. Security is the enforcement substrate that makes privacy commitments real. A privacy policy can promise that personal data is accessible only to staff who need it, retained for a fixed period, and deleted on request, but those promises are kept by security mechanisms: access controls that scope who can reach the data, encryption that protects it, logging that proves how it was handled, and deletion that actually removes it. Without those controls, a privacy policy is a statement of intent with nothing enforcing it.

The dependency runs the other way too. Privacy defines what security has to protect and to what standard. It is privacy obligations that designate which datasets count as personal data, which therefore need the strictest access control, the shortest retention, and the most careful logging. Security supplies the mechanism; privacy supplies much of the policy that the mechanism enforces. Run well, the same controls serve both: role-based access enforces least privilege (a security goal) and data minimization at the access layer (a privacy goal) at the same time.
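
The claim that one control can serve both goals can be sketched as a field-level role filter. This is a simplified illustration, not a real access-control product: the role names, the `ROLE_FIELDS` mapping, and the sample customer record are all assumptions.

```python
# Hypothetical mapping from role to the fields that role is permitted to read.
ROLE_FIELDS = {
    "support_agent": {"name", "email"},
    "billing_analyst": {"name", "billing_address", "card_token"},
}


def visible_fields(role, record):
    """Return only the fields the role may see; unknown roles see nothing."""
    allowed = ROLE_FIELDS.get(role, set())  # deny by default
    return {key: value for key, value in record.items() if key in allowed}


customer = {
    "name": "Ada",
    "email": "ada@example.com",
    "billing_address": "1 Main St",
    "card_token": "tok_123",
    "date_of_birth": "1990-01-01",
}

support_view = visible_fields("support_agent", customer)
billing_view = visible_fields("billing_analyst", customer)
unknown_view = visible_fields("intern", customer)
```

The same filter enforces least privilege, because each role reaches only what it is granted, and minimization at the access layer, because no role in this mapping ever receives `date_of_birth`.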

What both disciplines depend on, and what neither gets for free, is an accurate picture of the data itself: where sensitive and personal data lives, what it is, who can reach it, and how it flows between systems. Security needs this to answer which identities can reach a regulated dataset and what the blast radius is if one of them is compromised. Privacy needs the same underlying map to answer where a given person's data resides across every system, which is what a data subject access or deletion request actually requires. In both cases the answer is not in any single table. It lives in how identities, roles, datasets, classifications, and data flows relate to each other across the warehouse, the lake, and the SaaS tools around them, and assembling that by hand across siloed systems is the part that does not scale.

That cross-cutting question, relating data that lives in separate tables across separate systems, is a multi-hop relationship problem, which is the kind of problem a graph layer is built for. PuppyGraph is a graph query engine that maps existing identity, dataset, access, classification, and lineage tables to a graph and lets a team traverse those relationships directly, without copying the data into a separate database (zero-ETL, querying the tables in place) and with compute separated from storage. The security question and the privacy question become the same kind of traversal. For access and blast radius:

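// For each identity, list the personal-data datasets reachable through any role.
// DISTINCT avoids repeating a dataset that is granted by more than one role.
MATCH (u:Identity)-[:HAS_ROLE]->(:Role)-[:GRANTS]->(d:Dataset)
WHERE d.classification = 'personal_data'
RETURN u.name, collect(DISTINCT d.name) AS reachable_personal_data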

And for the data-mapping question behind a subject request, the same graph traces every dataset and downstream system a given individual's records flow into, following the lineage edges between tables rather than reconstructing them from documentation. PuppyGraph speaks openCypher and Gremlin, so these queries run against the tables a data team already maintains.
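
To make the data-mapping traversal concrete, here is a minimal Python sketch of what such a lineage walk computes. It is not PuppyGraph's API. The `LINEAGE` dictionary and the dataset names are hypothetical stand-ins for lineage edges between tables, and the walk is an ordinary breadth-first search.

```python
from collections import deque

# Hypothetical lineage edges: each dataset maps to the datasets it feeds.
LINEAGE = {
    "crm.customers": ["warehouse.dim_customer"],
    "warehouse.dim_customer": ["marts.churn_features", "exports.email_vendor"],
    "marts.churn_features": ["ml.churn_scores"],
    "exports.email_vendor": [],
    "ml.churn_scores": [],
}


def downstream(start, edges):
    """Return every dataset reachable from start, in breadth-first order."""
    seen = set()
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            # Skip datasets already visited so cycles cannot loop forever.
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


affected = downstream("crm.customers", LINEAGE)
```

For a deletion request that starts at `crm.customers`, `affected` lists every downstream dataset that may still hold the person's records, which is the list a team would otherwise reconstruct from documentation.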

The boundary matters and is worth stating directly. PuppyGraph is not a data security tool, and it is not a privacy tool. It is not DLP, not encryption or key management, not a consent platform, and not a DSAR workflow system. It is a relationship and correlation layer that complements those tools, joining the data they already produce so that who-can-reach-what and where-does-this-data-go become queries rather than manual reconstructions. Security teams including Palo Alto Networks, Datadog, Netskope, and Trend Micro use PuppyGraph for this kind of relationship analysis over data that stays where it already lives.

When to choose data security?

The heading is a useful way to scope priorities, but it should not be read as an either/or. Every organization that holds data needs security, and any organization holding personal data needs privacy on top of it. The real question is which side is the area to lead with given where you are, and there are situations where data security is clearly the part of the problem to invest in first.

You hold sensitive data that is not personal. Trade secrets, source code, financial models, and system credentials carry no privacy obligation because no individual is the subject, but losing them is still costly. Privacy frameworks would not cover this data at all; security is the only discipline that does.

Your immediate risk is a breach. If the pressing exposure is ransomware, exfiltration, or an over-permissioned environment where too many identities have standing access, that is a security gap, and it is urgent regardless of what the data contains.

You have no protection baseline yet. Because security is the substrate privacy is built on, an organization without basic encryption, access control, and monitoring should establish those first. A privacy program layered on top of weak security is making promises the underlying controls cannot keep.

When to choose data privacy?

The same caveat applies in reverse: prioritizing privacy work assumes a security baseline already exists or is being built alongside it, since privacy commitments are only as real as the controls enforcing them. With that established, privacy is the side to lead with in a recognizable set of situations.

You process personal data at scale. If your organization collects and uses large volumes of data about consumers, patients, or employees, the dominant risk is no longer only that the data leaks, but that it is used in ways that are unlawful or undisclosed, and that is the domain of privacy.

You fall under privacy regulation. Operating under the GDPR, CCPA and CPRA, HIPAA, or similar regimes brings specific, enforceable obligations around lawful basis, consent, retention, and individual rights, with penalties that security controls alone do nothing to avoid. Regulatory scope often settles where privacy investment is mandatory rather than optional.

Your security is sound but your governance is not. A common position is that the data is well protected but no one can confidently say what personal data is held, why, for how long, or where a given person's records live across the stack. That is a privacy and governance gap sitting on top of adequate security, and closing it is privacy work, not more security tooling.

Conclusion

Data security and data privacy are distinct disciplines that are easy to conflate and costly to treat as the same thing. Security protects data, any data, from unauthorized access, tampering, and loss, framed by the CIA triad. Privacy governs whether and how personal data is collected, used, and shared, framed by regulation and individual rights. They are interdependent rather than alternative: security is the substrate that enforces privacy, and privacy defines much of what security must protect, so a mature program treats them as two layers of one effort rather than competing line items. Both ultimately rest on the same foundation, an accurate understanding of where sensitive and personal data lives, who can reach it, and how it flows, which is a relationship problem before it is a tooling one.

Try the forever-free PuppyGraph Developer Edition, or book a demo with the team, to see how openCypher and Gremlin queries run over warehouse and lakehouse tables with no graph-specific ETL. The access paths and data flows that both your security and privacy programs depend on become traversals over the tables your team already maintains.

Hao Wu is a Software Engineer with a strong foundation in computer science and algorithms. He earned his Bachelor’s degree in Computer Science from Fudan University and a Master’s degree from George Washington University, where he focused on graph databases.