Sharding vs Partitioning: Key Differences

Sa Wang

Software Engineer

June 30, 2026

Sharding and partitioning both break a large dataset into smaller pieces, and the two words are used interchangeably often enough that the real distinction gets lost. The cleanest way to hold them apart is by where the pieces live. Partitioning divides a table into smaller parts that one database engine manages on a single server, so queries touch less data and maintenance gets easier without changing where the data sits. Sharding takes those pieces and spreads them across separate database servers, so the dataset can grow past what one machine can store or serve.

That difference in placement is why the comparison matters: partitioning is a way to organize data inside one database, while sharding is a way to scale a database beyond one machine, and the two carry very different operational costs. The confusion is understandable, because sharding is itself a form of partitioning, specifically horizontal partitioning applied across machines. This post defines each technique, sets them side by side in a comparison table, untangles the specific overlap between horizontal partitioning and sharding, compares their performance characteristics, and works through how to choose. It closes with a note on querying data that has already been divided this way.

What is database partitioning?

Partitioning is the practice of dividing a single logical table (and its indexes) into smaller physical pieces called partitions, all managed by the same database engine on the same server. To the application, the table still looks like one table: queries are written against the logical table, and the engine decides which partitions to read. Partitioning is, in this sense, transparent. The data is reorganized underneath, but the interface above it does not change.

Partitioning splits along one of two axes. Horizontal partitioning divides a table by rows, so each partition holds a subset of the rows with the full set of columns. A common scheme partitions an orders table by date so that each month of orders lands in its own partition. Vertical partitioning divides a table by columns, so each partition holds a subset of the columns for every row, typically separating frequently accessed columns from large or rarely read ones. Horizontal partitioning is by far the more common form, and it comes in several strategies: range partitioning (by value ranges, such as date or ID ranges), list partitioning (by an explicit set of values, such as region codes), and hash partitioning (by a hash of a key, to spread rows evenly).
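
The three horizontal strategies differ only in how a row's key is mapped to a partition. The following is a minimal Python sketch of that mapping, not any engine's actual implementation; the partition names, date boundaries, and region codes are hypothetical and chosen only for illustration.

```python
import bisect
import hashlib
from datetime import date

# Hypothetical range boundaries: partition i holds dates in
# [RANGE_BOUNDS[i], RANGE_BOUNDS[i + 1]).
RANGE_BOUNDS = [date(2026, 1, 1), date(2026, 2, 1), date(2026, 3, 1), date(2026, 4, 1)]
RANGE_NAMES = ["orders_2026_01", "orders_2026_02", "orders_2026_03"]

# Hypothetical list mapping: each explicit value is assigned to a partition.
LIST_MAP = {"US": "orders_na", "CA": "orders_na", "DE": "orders_eu", "FR": "orders_eu"}

HASH_PARTITIONS = 4  # hypothetical number of hash partitions


def range_partition(order_date: date) -> str:
    """Range strategy: pick the partition whose value range contains the date."""
    idx = bisect.bisect_right(RANGE_BOUNDS, order_date) - 1
    if idx < 0 or idx >= len(RANGE_NAMES):
        raise ValueError(f"no partition covers {order_date}")
    return RANGE_NAMES[idx]


def list_partition(region: str) -> str:
    """List strategy: look the value up in an explicit set of values."""
    if region not in LIST_MAP:
        raise ValueError(f"no partition lists region {region!r}")
    return LIST_MAP[region]


def hash_partition(order_id: int) -> str:
    """Hash strategy: hash the key and take it modulo the partition count."""
    digest = hashlib.sha256(str(order_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % HASH_PARTITIONS
    return f"orders_h{bucket}"


print(range_partition(date(2026, 2, 14)))
print(list_partition("DE"))
print(hash_partition(42))
```

Range and list placement keep related rows together, which is what makes date-based pruning and archiving possible, while hash placement trades that locality for an even spread.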

Most relational databases support this directly. PostgreSQL, for example, offers declarative partitioning with range, list, and hash strategies, where you define a parent table and its partitions and the engine routes inserts and prunes reads automatically. The payoff is concrete. Query performance improves through partition pruning: a query filtered to last month’s orders reads only the relevant partition instead of scanning the whole table. Maintenance gets cheaper: dropping a month of old data is a matter of detaching one partition rather than running a large, slow DELETE. Index efficiency improves too, because per-partition indexes are smaller and stay more cache-resident than one giant index over the entire table.
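
Partition pruning can be pictured as an interval-overlap check between the query's filter and each partition's bounds. The sketch below simulates that check in plain Python; it is an illustration of the idea rather than how PostgreSQL's planner is implemented, and the partition names and date ranges are hypothetical.

```python
from datetime import date

# Hypothetical monthly partitions: name -> (inclusive start, exclusive end).
PARTITIONS = {
    "orders_2026_01": (date(2026, 1, 1), date(2026, 2, 1)),
    "orders_2026_02": (date(2026, 2, 1), date(2026, 3, 1)),
    "orders_2026_03": (date(2026, 3, 1), date(2026, 4, 1)),
}


def prune(query_start: date, query_end: date) -> list[str]:
    """Return only the partitions whose range overlaps [query_start, query_end)."""
    return [
        name
        for name, (lo, hi) in PARTITIONS.items()
        if lo < query_end and query_start < hi
    ]


# A filter on February touches one partition instead of all three.
print(prune(date(2026, 2, 1), date(2026, 3, 1)))

# A filter that straddles a month boundary touches two.
print(prune(date(2026, 1, 20), date(2026, 2, 10)))
```

Pruning only helps when the query filters on the partition key; a filter on any other column leaves every partition as a candidate.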

What partitioning does not do is add capacity. Every partition still lives on the same machine, sharing its CPU, memory, disk, and I/O bandwidth. Partitioning makes a large table on one server more manageable and often faster to query; it does not let that table outgrow the server it sits on. That ceiling is exactly where sharding begins.

What is database sharding?

Sharding is horizontal partitioning carried across multiple database servers. Each partition, now called a shard, is a self-contained database running on its own node with its own CPU, memory, and storage, and together the shards hold the full dataset with no single node holding all of it. This is a shared-nothing architecture: nodes do not share memory or disk, and they coordinate only through the network. The goal is no longer organizing data on one machine but scaling past the limits of any one machine, in storage, in write throughput, and in total query capacity.

Which row goes to which shard is decided by a shard key (also called a partition key), a chosen column or set of columns whose value maps each row to a shard. The mapping is usually by hash of the key (to distribute rows evenly) or by range (to keep related rows together). A users table sharded by a hash of user_id spreads users roughly uniformly across nodes, so that no single node becomes the bottleneck for reads or writes. Because the data now lives on many servers, something has to direct each query to the right shard. That routing lives in a separate layer: a dedicated router or proxy, a coordinator node, or logic in the application itself. MongoDB, for instance, splits sharded data into chunks across nodes, runs a balancer to keep their distribution even as data grows, and routes queries through a mongos process so that clients still see one logical collection. In the relational world, extensions and middleware such as Citus for PostgreSQL and Vitess for MySQL add sharding on top of single-node engines.
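
As a rough illustration of hash-based routing on a shard key, the sketch below maps each user_id to one of four shards. It assumes a simple hash-modulo scheme and hypothetical shard host names; it is not how MongoDB, Citus, or Vitess route queries internally.

```python
import hashlib
from collections import Counter

# Hypothetical shard addresses; a real deployment would load these from config.
SHARDS = [
    "db-shard-0.internal",
    "db-shard-1.internal",
    "db-shard-2.internal",
    "db-shard-3.internal",
]


def stable_hash(key) -> int:
    """Hash that is identical across processes and machines.

    Python's built-in hash() is randomized per process for strings,
    so a router should not rely on it for placement.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for(user_id: int) -> str:
    """Map a shard-key value to the shard that owns it."""
    return SHARDS[stable_hash(user_id) % len(SHARDS)]


# Route 10,000 users and inspect how evenly they spread.
counts = Counter(shard_for(uid) for uid in range(10_000))
for shard, n in sorted(counts.items()):
    print(shard, n)
```

Every component that reads or writes has to agree on this mapping, which is why it usually lives in one routing layer rather than being reimplemented in each client.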

Sharding buys horizontal scale, and it charges for it in coordination. A query that targets a single shard (one that filters on the shard key) is fast and isolated, but a query that spans shards has to fan out to many nodes and combine their results, and operations that cross shards (joins, aggregations, transactions that touch more than one shard) require coordination the database otherwise would not need. Adding or removing nodes means rebalancing data across them, and a poorly chosen shard key can send disproportionate traffic to one node, a hotspot that undoes the point of sharding. Sharding is the technique you reach for when one server is genuinely not enough, accepting distributed-systems complexity as the cost of breaking past the single-machine ceiling.
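
The rebalancing cost is easy to underestimate. The sketch below assumes the naive hash-modulo placement and measures how many keys change owner when a cluster grows from four nodes to five; the key set and node counts are hypothetical.

```python
import hashlib


def stable_hash(key) -> int:
    """Process-independent hash of a key."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def moved_fraction(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of keys whose owning node changes under hash-modulo placement."""
    keys = list(keys)
    moved = sum(
        1
        for k in keys
        if stable_hash(k) % old_nodes != stable_hash(k) % new_nodes
    )
    return moved / len(keys)


fraction = moved_fraction(range(10_000), old_nodes=4, new_nodes=5)
print(f"{fraction:.0%} of keys change shards when going from 4 to 5 nodes")
```

With a uniform hash, a key keeps its owner only when its hash gives the same remainder modulo 4 and modulo 5, which happens for about one key in five, so roughly four in five keys move. That is one reason sharded systems tend to place data in units such as chunks or ranges that can be moved individually instead of recomputing a modulo over the node count.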

Sharding vs partitioning: core differences

Both techniques divide data, but they answer different questions and live at different layers of the stack. The table below sets the differences that tend to drive a decision side by side.

Dimension	Partitioning	Sharding
What it is	One logical table split into partitions within a single database engine	Data split into shards spread across multiple independent database servers
Where the data lives	One server, one engine	Many servers, each its own database (shared-nothing)
Primary goal	Make a large table on one machine manageable and faster to query	Scale storage, throughput, and capacity beyond one machine
Transparency	Engine-managed and transparent to the application	Needs a routing layer (proxy, coordinator, or application logic)
How a piece is chosen	Partition key with range, list, or hash strategy	Shard key, usually by hash or range
Cross-piece queries	Handled inside one engine; cross-partition reads are local	Cross-shard queries fan out over the network and must be combined
Adds capacity	No; all partitions share the one server's resources	Yes; each shard adds CPU, memory, and storage
Operational complexity	Lower; one system to run, back up, and monitor	Higher; many nodes, rebalancing, distributed coordination
Failure domain	The whole table shares the server's fate	A node failure affects only its shards (with replication for availability)

The table makes the trade-off concrete. Partitioning optimizes within a fixed amount of hardware: it reduces how much data a query reads and how much work maintenance takes, at low operational cost, but it cannot lift the ceiling set by the single server it runs on. Sharding lifts that ceiling by adding machines, at the price of a routing layer, cross-shard coordination, and the operational surface of a distributed system. They are not competing answers to one question so much as answers to two different questions: how to organize data on a machine, and how to grow data beyond a machine.

How horizontal partitioning differs from sharding

The sharpest source of confusion is that horizontal partitioning and sharding describe the same operation, dividing a table by rows, and differ only in where the resulting pieces are placed. Sharding is horizontal partitioning where the partitions are distributed across separate servers. Every shard is a horizontal partition; not every horizontal partition is a shard. The distinction is not the split, it is the deployment.

That deployment difference is what changes everything downstream. With horizontal partitioning on a single instance, all the partitions still draw on the same CPU, memory, and disk, so you gain organization (pruning, smaller indexes, cheaper maintenance) but not additional aggregate capacity. The total work the system can do is unchanged; it is just better arranged. With sharding, each partition sits on its own node, so the partitions add real capacity: more total memory to cache hot data, more cores to run queries in parallel, more disks to absorb writes. Horizontal partitioning is about scaling the manageability of data on one machine; sharding is about scaling the capacity of the system across many.

The cost structure flips with it. Within a single engine, a query that spans several partitions is still a local operation: the planner reads multiple partitions and combines them using the server’s own memory and CPU, with no network in the path. Across shards, that same span becomes a distributed operation, fanning out over the network, and a join or transaction that crosses shard boundaries needs network coordination that a single engine avoids entirely. (Vertical partitioning, splitting by columns rather than rows, is a third option on a different axis entirely, and it is not what sharding refers to.) The practical reading: reach for horizontal partitioning to tame a large table you can still host on one server, and for sharding only once the data or load genuinely exceeds what one server can hold.
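
The difference between a single-shard lookup and a cross-shard query can be sketched with in-memory lists standing in for shards. Everything here is a hypothetical stand-in: real shards are separate servers reached over the network, and the thread pool only imitates the fan-out.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # hypothetical cluster size

# In-memory stand-ins for shards, each holding (user_id, order_total) rows.
# Plain modulo on the integer key keeps the example easy to follow.
shards = [[] for _ in range(NUM_SHARDS)]
for user_id in range(1, 21):
    shards[user_id % NUM_SHARDS].append((user_id, user_id * 10))


def orders_for_user(user_id: int):
    """Single-shard query: the shard key is known, so one shard is touched."""
    rows = shards[user_id % NUM_SHARDS]
    return [row for row in rows if row[0] == user_id]


def total_on_shard(rows) -> int:
    """Partial aggregate computed locally on one shard."""
    return sum(total for _, total in rows)


def grand_total() -> int:
    """Cross-shard query: fan out to every shard, then combine the partials."""
    with ThreadPoolExecutor(max_workers=NUM_SHARDS) as pool:
        partials = list(pool.map(total_on_shard, shards))
    return sum(partials)


print(orders_for_user(7))
print(grand_total())
```

The lookup costs one round trip to one node, while the aggregate costs a round trip to every node plus a combine step, and it finishes only when the slowest shard has answered.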

Sharding vs partitioning: performance comparison

Partitioning improves performance by reducing the work a single server does per query. Partition pruning is the main lever: when a query filters on the partition key, the engine skips every partition that cannot contain matching rows, so a scan over one month touches one partition rather than years of history. Smaller per-partition indexes stay warmer in cache and are faster to traverse, and bulk maintenance (loading, archiving, dropping old data) operates one partition at a time instead of churning the whole table. The limit is structural: because every partition shares the one machine’s resources, partitioning cannot raise total throughput beyond what that machine can deliver. It makes a fixed amount of hardware go further; it does not add more of it.

Sharding improves performance by adding hardware and running work in parallel across it. When most queries target a single shard, read and write throughput scale roughly with the number of shards, because each node handles only its slice of the data and each query runs against a smaller dataset on dedicated hardware. The performance picture splits sharply by query shape, though. Single-shard queries, those that filter on the shard key, are fast, and because different queries land on different nodes, the cluster serves them in parallel. Cross-shard queries are the expensive case: a scatter-gather across nodes, a distributed join, or a multi-shard transaction pays network and coordination costs that a single engine never incurs, and an uneven shard key creates hotspots where one node carries a disproportionate share of the load. Sharding’s performance ceiling is high, but its floor depends heavily on whether the access pattern matches the shard key.
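
A hotspot is easy to reproduce on paper. The sketch below assumes a hypothetical workload in which one tenant generates 70% of the requests, then compares sharding by tenant with sharding by user; the tenant and user names and the 70/30 split are invented for illustration.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size


def shard_of(key) -> int:
    """Hash-modulo placement using a process-independent hash."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# Hypothetical workload of 10,000 requests as (tenant, user) pairs:
# 7,000 from one large tenant, 3,000 spread over 50 small tenants.
requests = [("tenant-big", f"user-{i}") for i in range(7_000)]
requests += [(f"tenant-{i % 50}", f"user-{7_000 + i}") for i in range(3_000)]

by_tenant = Counter(shard_of(tenant) for tenant, _ in requests)
by_user = Counter(shard_of(user) for _, user in requests)


def busiest_share(counts: Counter) -> float:
    """Share of all requests handled by the most heavily loaded shard."""
    return max(counts.values()) / sum(counts.values())


print(f"shard key = tenant: busiest shard serves {busiest_share(by_tenant):.0%}")
print(f"shard key = user:   busiest shard serves {busiest_share(by_user):.0%}")
```

Keyed by tenant, whichever shard owns the large tenant serves at least 70% of the traffic no matter how many nodes are added. Keyed by user, the load spreads close to evenly, at the price that a query over one tenant's data now fans out across shards.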

The contrast comes down to ceilings. Partitioning lowers the cost of operating under a fixed ceiling and is nearly free operationally; sharding raises the ceiling itself but introduces a class of distributed-query costs that good schema and shard-key design exist to minimize. Many large systems use both: shard to spread data across nodes, then partition within each shard so that each node’s slice is itself well organized.

Sharding vs partitioning: which one should you choose?

The decision follows from where the pressure is coming from, not from which technique is more advanced.

Start with partitioning when the data still fits one server. If a table has grown large enough that scans are slow, indexes are unwieldy, or archiving old rows is painful, but the dataset and its load still fit comfortably on a single machine, partitioning is the lighter answer. It buys query pruning and cheaper maintenance without adding a single new piece of infrastructure, and it keeps one system to run, back up, and reason about.

Reach for sharding when one server is the bottleneck. When the data no longer fits on one machine, or write throughput has saturated the largest instance you can provision, or you need capacity to grow with demand, sharding is the technique that applies. It is a larger commitment, bringing a routing layer, rebalancing, and distributed-query costs, so it is worth deferring until the single-machine ceiling is the actual constraint rather than a hypothetical one.

Often the answer is both. Sharding and partitioning compose cleanly: shard across nodes to scale out, and partition within each node to keep each shard’s data organized. Choosing one does not rule out the other, and large deployments commonly use them together, sharding for capacity and partitioning for locality and maintenance.
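
Composing the two amounts to a two-level lookup: the shard key picks the node, and the partition key picks the table partition on that node. The sketch below assumes hash sharding on a hypothetical customer_id and monthly range partitioning on order_date; the naming scheme is invented for illustration.

```python
import hashlib
from datetime import date

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for(customer_id: int) -> int:
    """Level 1: hash the shard key to choose a node."""
    digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def partition_for(order_date: date) -> str:
    """Level 2: range-partition by month inside the chosen node."""
    return f"orders_{order_date.year}_{order_date.month:02d}"


def locate(customer_id: int, order_date: date) -> tuple[int, str]:
    """Return (shard index, partition name) for one order row."""
    return shard_for(customer_id), partition_for(order_date)


shard, partition = locate(42, date(2026, 6, 30))
print(f"customer 42, order on 2026-06-30 -> shard {shard}, partition {partition}")
```

A query that filters on both keys touches one partition on one node, and dropping an old month remains a per-partition operation repeated on each shard.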

Scaling graph queries without managing shards

Sharding is what buys scale once one server is no longer enough, and the sections above show what it costs to run: a routing layer, rebalancing as nodes come and go, and careful shard-key design to keep any one node from becoming a hotspot. Distributing data and computation by hand is genuinely complex work, and it is worth knowing when a layer can take that on for you.

PuppyGraph is a graph query engine that does exactly that for graph workloads. It maps a graph schema onto existing tables in sources like PostgreSQL, Snowflake, and Apache Iceberg and runs graph queries (openCypher and Gremlin) against them in place, with no ETL and no separate graph database to maintain. Its distributed computation is auto-sharded: the engine partitions and spreads the work across executor nodes itself, so you scale graph queries by adding nodes rather than by designing and operating a sharding scheme. Because computation and storage are separate, the data stays in the warehouse or lake while the engine scales on its own, and predicate pushdown, min/max statistics, and vectorized columnar execution keep each query reading only the data it needs.

Conclusion

Sharding and partitioning both divide a large dataset into smaller pieces, and the difference is where those pieces live and what problem the division solves. Partitioning splits a table into parts that one engine manages on one server, improving query performance and maintenance without changing where the data sits or adding capacity. Sharding distributes those parts across independent servers, lifting the single-machine ceiling on storage and throughput at the cost of a routing layer and distributed-query coordination. Horizontal partitioning is the bridge between them: sharding is horizontal partitioning deployed across machines, and it is that deployment across machines, rather than the way the rows are divided, that turns a manageability technique into a scaling one. Choose partitioning to organize data that still fits one server, sharding to grow past it, and often both together.

Try the forever-free PuppyGraph Developer Edition and book a demo with the team to see how openCypher and Gremlin queries run over warehouse and lakehouse tables, with no graph-specific ETL, even when those tables are already partitioned or sharded across systems.

Sa Wang is a Software Engineer with exceptional mathematical ability and strong coding skills. He holds a Bachelor's degree in Computer Science and a Master's degree in Philosophy from Fudan University, where he specialized in Mathematical Logic.