
MongoDB manages data across a wide range of workloads, from product catalogs to telemetry streams to user activity logs. Its schema-less structure and distributed architecture make it a natural fit for applications that demand both agility and scale.
But in many real-world scenarios, data points aren’t just valuable on their own—they’re more powerful when understood in context. Connections between entities often reveal the patterns that matter most: how users interact, how systems behave, and how events unfold over time. While MongoDB provides expressive tools for working with nested documents and join operations across collections, some types of relationship analysis are more naturally expressed as graph queries.
That’s where PuppyGraph comes in. It adds a real-time graph layer on top of your existing MongoDB deployment—no ETL, no data duplication. You can define a graph model across your collections and run queries using openCypher or Gremlin, all without modifying your source data.
In this tutorial, we’ll walk through how PuppyGraph connects to MongoDB, how it complements document-based architectures with graph capabilities, and how you can get started running graph queries with minimal setup.
MongoDB is a document-oriented NoSQL database designed to store and manage data in a flexible, JSON-like format. Unlike traditional relational databases that use tables and rows, MongoDB employs collections and documents, allowing for dynamic schemas that can evolve with application requirements. This flexibility makes it particularly well-suited for handling semi-structured and unstructured data, accommodating use cases such as content management systems, real-time analytics, AI vector search, and Internet of Things (IoT) applications.
In MongoDB, data is organized into collections of documents, each containing key-value pairs. This structure enables developers to represent complex hierarchical relationships within a single document, reducing the need for expensive join operations. For example, a document representing a blog post can encapsulate not only the post content but also metadata like author information and comments, all within the same document.
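As a rough illustration of the blog post example, the document might look like the object below. The field names and values are hypothetical and chosen only for this sketch; they are not taken from any real schema.

```javascript
// Hypothetical blog post document: author details and comments are embedded
// in the post itself rather than stored in separate collections.
const blogPost = {
  title: "Getting started with document databases",
  body: "Post content goes here.",
  author: { name: "Ada", email: "ada@example.com" },
  tags: ["databases", "nosql"],
  comments: [
    { user: "grace", text: "Nice overview.", postedAt: "2024-01-05" },
    { user: "linus", text: "Thanks for sharing.", postedAt: "2024-01-06" }
  ]
};

// Everything needed to render the post is available from this one object,
// so no join is required.
const commenters = blogPost.comments.map((c) => c.user);
console.log(`${blogPost.title} by ${blogPost.author.name}`);
console.log(`Comments from: ${commenters.join(", ")}`);
```

Embedding works well when the nested data is always read together with its parent. Data that is shared across many documents is usually a better candidate for its own collection.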
The database offers a rich set of features, including a powerful query API that supports field searches, range queries, and regular expressions. Indexing capabilities enhance query performance, allowing developers to create indexes on any field. Additionally, MongoDB’s aggregation framework facilitates data transformation and analysis directly within the database, streamlining the development of analytics applications.
Recognizing the operational challenges associated with managing databases, MongoDB introduced MongoDB Atlas, a fully managed cloud database service. MongoDB Atlas simplifies the deployment, scaling, and management of MongoDB databases, allowing developers to focus on building applications rather than handling database administration tasks.
MongoDB Atlas provides automated deployment across major cloud providers, including AWS, Google Cloud Platform, and Microsoft Azure, offering flexibility and global reach. It features automated backups, ensuring data durability and facilitating disaster recovery. Built-in monitoring tools provide real-time insights into database performance, enabling proactive optimization and maintenance.
Security is a core component of MongoDB Atlas, with features such as end-to-end encryption, network isolation, and fine-grained access controls to protect sensitive data. The platform also supports compliance with various industry standards, making it suitable for applications with stringent security requirements.
By combining the flexibility of MongoDB’s document model with the operational simplicity of a managed service, MongoDB Atlas empowers organizations to build and scale applications with greater agility and confidence.
For teams working with MongoDB, many valuable insights come from understanding how entities relate across documents and collections, whether that means tracing user journeys, mapping operational dependencies, or detecting linked anomalies. Questions like these involve tracing connections, analyzing paths, and detecting patterns that span multiple entities.
PuppyGraph adds a real-time graph layer to MongoDB, allowing teams to query those relationships using graph-specific languages like Gremlin or openCypher. Without migrating or duplicating data, you can define how collections map to nodes and edges, then run graph queries directly against MongoDB Atlas or self-hosted deployments. Under the hood, PuppyGraph connects via the MongoDB Atlas SQL JDBC driver, querying live data and returning results with no ETL or transformation required.
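To make the idea of mapping collections to nodes and edges more concrete, here is a minimal conceptual sketch in plain JavaScript. It does not show how PuppyGraph works internally. The collections, field values, and helper functions (toNodes, toEdges, outNeighbors) are all hypothetical and exist only for illustration.

```javascript
// Hypothetical documents, shaped as they might appear in two collections.
const users = [
  { id: "u1", name: "alice" },
  { id: "u2", name: "bob" },
  { id: "u3", name: "carol" }
];
const follows = [
  { id: "f1", from_id: "u1", to_id: "u2" },
  { id: "f2", from_id: "u2", to_id: "u3" }
];

// Vertex mapping: each document becomes a node keyed by its id field.
function toNodes(label, docs) {
  return new Map(docs.map((doc) => [doc.id, { label, ...doc }]));
}

// Edge mapping: each document becomes a labeled link between two node ids.
function toEdges(label, docs) {
  return docs.map((doc) => ({ label, from: doc.from_id, to: doc.to_id }));
}

// Follow outgoing edges of one label from a starting node.
function outNeighbors(nodes, edges, startId, label) {
  return edges
    .filter((e) => e.label === label && e.from === startId)
    .map((e) => nodes.get(e.to));
}

const nodes = toNodes("user", users);
const edges = toEdges("follows", follows);
const followedByAlice = outNeighbors(nodes, edges, "u1", "follows").map((n) => n.name);
console.log(followedByAlice); // [ 'bob' ]
```

The source documents are never modified in this sketch: the graph is only a view derived from an id field on each vertex document and a pair of reference fields on each edge document.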
This integration offers several key benefits:
Integrating PuppyGraph with MongoDB Atlas involves a series of components working together to enable seamless graph analytics capabilities.
This architecture allows organizations to leverage their existing MongoDB infrastructure to perform sophisticated graph analyses, enhancing their data analysis capabilities without the need for additional data processing steps.

This section walks through a simple demo that shows how MongoDB integrates with PuppyGraph. We also recommend reading the getting-started document; the steps here are essentially the same.
See the documentation to get started with MongoDB Atlas. You can use the MongoDB Atlas CLI or MongoDB Atlas UI to deploy a free cluster easily. Follow the detailed instructions in the document up to step 4, Manage the IP access list.
See the documentation to connect your cluster via MongoDB Shell. You need to get your connection string. After connecting successfully, run the following commands to create collections and insert data. Documents within a collection are flexible; they don’t have to adhere to the same schema. However, to mitigate potential errors, we create collections with schema validators.
First, select the database. It is created automatically once the first collection is created.
use modern
Then create collections with schema validators and insert data.
db.createCollection("person", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: [ "id", "name", "age" ],
      properties: {
        id: { bsonType: "string" },
        name: { bsonType: "string" },
        age: { bsonType: "int" }
      }
    }
  }
})
db.person.insertMany([
  {id: 'v1', name: 'marko', age: 29},
  {id: 'v2', name: 'vadas', age: 27},
  {id: 'v4', name: 'josh', age: 32},
  {id: 'v6', name: 'peter', age: 35}
])
db.createCollection("software", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: [ "id", "name", "lang" ],
      properties: {
        id: { bsonType: "string" },
        name: { bsonType: "string" },
        lang: { bsonType: "string" }
      }
    }
  }
})
db.software.insertMany([
  {id: 'v3', name: 'lop', lang: 'java'},
  {id: 'v5', name: 'ripple', lang: 'java'}
])
db.createCollection("created", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: [ "id", "from_id", "to_id", "weight" ],
      properties: {
        id: { bsonType: "string" },
        from_id: { bsonType: "string" },
        to_id: { bsonType: "string" },
        weight: { bsonType: "double" }
      }
    }
  }
})
db.created.insertMany([
  {id: 'e9', from_id: 'v1', to_id: 'v3', weight: 0.4},
  {id: 'e10', from_id: 'v4', to_id: 'v5', weight: Double(1.1)},
  {id: 'e11', from_id: 'v4', to_id: 'v3', weight: 0.4},
  {id: 'e12', from_id: 'v6', to_id: 'v3', weight: 0.2}
])
db.createCollection("knows", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: [ "id", "from_id", "to_id", "weight" ],
      properties: {
        id: { bsonType: "string" },
        from_id: { bsonType: "string" },
        to_id: { bsonType: "string" },
        weight: { bsonType: "double" }
      }
    }
  }
})
db.knows.insertMany([
  {id: 'e7', from_id: 'v1', to_id: 'v2', weight: 0.5},
  {id: 'e8', from_id: 'v1', to_id: 'v4', weight: Double(1.1)}
])
The data for this demo comes from the "modern" graph defined by Apache TinkerPop.
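Because the edge collections refer to vertices only through from_id and to_id, it can be worth checking that every reference resolves before modeling the graph. The sketch below runs that check in plain JavaScript on a copy of the demo ids. The danglingEdges helper is hypothetical and not part of MongoDB or PuppyGraph, and the sketch assumes that "knows" links person to person and "created" links person to software.

```javascript
// Demo ids copied from the insertMany calls above (other fields omitted).
const person = ["v1", "v2", "v4", "v6"];
const software = ["v3", "v5"];
const knows = [
  { id: "e7", from_id: "v1", to_id: "v2" },
  { id: "e8", from_id: "v1", to_id: "v4" }
];
const created = [
  { id: "e9", from_id: "v1", to_id: "v3" },
  { id: "e10", from_id: "v4", to_id: "v5" },
  { id: "e11", from_id: "v4", to_id: "v3" },
  { id: "e12", from_id: "v6", to_id: "v3" }
];

// Return the ids of edges whose endpoints are missing from the expected collections.
function danglingEdges(edges, fromIds, toIds) {
  const from = new Set(fromIds);
  const to = new Set(toIds);
  return edges
    .filter((e) => !from.has(e.from_id) || !to.has(e.to_id))
    .map((e) => e.id);
}

const danglingKnows = danglingEdges(knows, person, person);
const danglingCreated = danglingEdges(created, person, software);
console.log(danglingKnows, danglingCreated); // [] []
```

Both lists are empty for the demo data, so every edge connects two existing vertices.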

Run the following command to start the PuppyGraph container. The PUPPYGRAPH_PASSWORD environment variable sets the password for the default user puppygraph to puppygraph123. You can change it to your desired password. The --rm flag ensures that the container is removed after it stops.
docker run -p 8081:8081 -p 8182:8182 -p 7687:7687 -e PUPPYGRAPH_PASSWORD=puppygraph123 -d --name puppy --rm --pull=always puppygraph/puppygraph:stable
Log into the PuppyGraph Web UI at http://localhost:8081 with the following credentials:
Username: puppygraph
Password: puppygraph123
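There are two methods to model the graph: use the graph schema builder, or create a schema.json file and upload it with the following command: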
curl -XPOST -H "content-type: application/json" --data-binary @./schema.json --user "puppygraph:puppygraph123" localhost:8081/schema
The schema.json file for this demo:
{
  "catalogs": [
    {
      "name": "mongodb_data",
      "type": "mongodb",
      "jdbc": {
        "username": "[username]",
        "password": "[password]",
        "jdbcUri": "[jdbcUri]",
        "driverClass": "com.mongodb.jdbc.MongoDriver"
      }
    }
  ],
  "graph": {
    "vertices": [
      {
        "label": "person",
        "oneToOne": {
          "tableSource": {
            "catalog": "mongodb_data",
            "schema": "modern",
            "table": "person"
          },
          "id": {
            "fields": [
              {
                "type": "String",
                "field": "id",
                "alias": "id"
              }
            ]
          },
          "attributes": [
            {
              "type": "Long",
              "field": "age",
              "alias": "age"
            },
            {
              "type": "String",
              "field": "name",
              "alias": "name"
            }
          ]
        }
      },
      {
        "label": "software",
        "oneToOne": {
          "tableSource": {
            "catalog": "mongodb_data",
            "schema": "modern",
            "table": "software"
          },
          "id": {
            "fields": [
              {
                "type": "String",
                "field": "id",
                "alias": "id"
              }
            ]
          },
          "attributes": [
            {
              "type": "String",
              "field": "lang",
              "alias": "lang"
            },
            {
              "type": "String",
              "field": "name",
              "alias": "name"
            }
          ]
        }
      }
    ],
    "edges": [
      {
        "label": "knows",
        "fromVertex": "person",
        "toVertex": "person",
        "tableSource": {
          "catalog": "mongodb_data",
          "schema": "modern",
          "table": "knows"
        },
        "id": {
          "fields": [
            {
              "type": "String",
              "field": "id",
              "alias": "id"
            }
          ]
        },
        "fromId": {
          "fields": [
            {
              "type": "String",
              "field": "from_id",
              "alias": "from_id"
            }
          ]
        },
        "toId": {
          "fields": [
            {
              "type": "String",
              "field": "to_id",
              "alias": "to_id"
            }
          ]
        },
        "attributes": [
          {
            "type": "Double",
            "field": "weight",
            "alias": "weight"
          }
        ]
      },
      {
        "label": "created",
        "fromVertex": "person",
        "toVertex": "software",
        "tableSource": {
          "catalog": "mongodb_data",
          "schema": "modern",
          "table": "created"
        },
        "id": {
          "fields": [
            {
              "type": "String",
              "field": "id",
              "alias": "id"
            }
          ]
        },
        "fromId": {
          "fields": [
            {
              "type": "String",
              "field": "from_id",
              "alias": "from_id"
            }
          ]
        },
        "toId": {
          "fields": [
            {
              "type": "String",
              "field": "to_id",
              "alias": "to_id"
            }
          ]
        },
        "attributes": [
          {
            "type": "Double",
            "field": "weight",
            "alias": "weight"
          }
        ]
      }
    ]
  }
}
When using the graph schema builder or the schema.json file, you need to fill in the JDBC Connection String, which appears as jdbcUri in schema.json; they are the same thing. The JDBC Connection String is used to connect to the MongoDB Atlas database. To find it, follow these instructions:

You also need to fill in the username and password fields according to your settings. Once complete, you will see the schema graph.
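The schema file repeats the same field and table-source structure for every vertex and edge. One way to avoid copying it by hand is to generate the "graph" section with a short script. The following is a sketch under that assumption: the helper names (field, tableSource, vertex, edge) are hypothetical and not part of any PuppyGraph tooling, and the output mirrors only the structure of the schema.json shown earlier.

```javascript
// Hypothetical helpers that build the repeated parts of schema.json.
const CATALOG = "mongodb_data";
const SCHEMA = "modern";

// A field entry whose alias matches the source field name.
const field = (type, name) => ({ type, field: name, alias: name });

// Assumes the collection name is the same as the vertex or edge label.
const tableSource = (table) => ({ catalog: CATALOG, schema: SCHEMA, table });

function vertex(label, attributes) {
  return {
    label,
    oneToOne: {
      tableSource: tableSource(label),
      id: { fields: [field("String", "id")] },
      attributes
    }
  };
}

function edge(label, fromVertex, toVertex, attributes) {
  return {
    label,
    fromVertex,
    toVertex,
    tableSource: tableSource(label),
    id: { fields: [field("String", "id")] },
    fromId: { fields: [field("String", "from_id")] },
    toId: { fields: [field("String", "to_id")] },
    attributes
  };
}

const graph = {
  vertices: [
    vertex("person", [field("Long", "age"), field("String", "name")]),
    vertex("software", [field("String", "lang"), field("String", "name")])
  ],
  edges: [
    edge("knows", "person", "person", [field("Double", "weight")]),
    edge("created", "person", "software", [field("Double", "weight")])
  ]
};

// Print the "graph" section; the "catalogs" section with credentials is left out here.
console.log(JSON.stringify({ graph }, null, 2));
```

The "catalogs" section is deliberately omitted so that credentials stay out of the script; it would still need to be added before uploading the file.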

Go to Dashboard in the Web UI to see a dashboard like the one pictured below. Each tab represents a query, and you can click a tab to view its details. To add a new tab, click the plus (+) symbol in the bottom right corner.

Navigate to Query in the Web UI, where you can use Graph Query to run Gremlin or openCypher queries with visualization.

Here are some example queries:
Gremlin:
g.V().has("name", "marko").valueMap()
openCypher:
MATCH (v {name: 'marko'}) RETURN v
Gremlin:
g.V().has("name", "marko")
  .out("knows").out("created").path()
openCypher:
MATCH p=(v {name: 'marko'})-[:knows]->()-[:created]->()
RETURN p
MongoDB’s flexible document model and robust query engine make it a strong foundation for modern applications, whether you’re powering transactional systems or real-time analytics. For use cases where understanding relationships between entities is key, adding graph capabilities can unlock a new class of insights.
With PuppyGraph, teams can introduce real-time graph analytics into their MongoDB Atlas environment without modifying schemas, exporting data, or managing additional infrastructure. By connecting through the MongoDB Atlas SQL Interface, PuppyGraph lets you define graph models directly over your collections and query them using Gremlin or openCypher—while the data stays exactly where it is.
If you're working with connected data and want to explore graph queries on MongoDB Atlas, try PuppyGraph’s free Developer Edition and experience what’s possible—no ETL required.