Internal Tools, 10gen
J. Randall Hunt
#MongoDBDays
Introduction to Replication and Replica Sets
Agenda
• Replica Set Lifecycle
• Developing with Replica Sets
• Operational Considerations
Why Replication?
• How many have faced node failures?
• How many have been woken up to perform a failover?
• How many have experienced issues due to network latency?
• Different uses for data
– Normal processing
– Simple analytics
Replica Set Lifecycle
Replica Set – Creation
Replica Set – Initialize
Replica Set – Failure
Replica Set – Failover
Replica Set – Recovery
Replica Set – Recovered
Replica Set Roles & Configuration
Replica Set Roles
> conf = {
    _id : "mySet",
    members : [
      {_id : 0, host : "A", priority : 3},
      {_id : 1, host : "B", priority : 2},
      {_id : 2, host : "C"},
      // hidden and delayed members must have priority 0
      {_id : 3, host : "D", hidden : true, priority : 0},
      {_id : 4, host : "E", hidden : true, priority : 0, slaveDelay : 3600}
    ]
  }
> rs.initiate(conf)
Configuration Options
• A, B: Primary DC (priority 3 and 2)
• C: Secondary DC (default priority = 1)
• D: Analytics node (hidden)
• E: Backup node (hidden, slaveDelay : 3600)
Developing with Replica Sets
Strong Consistency
Delayed Consistency
Write Concern
• Network acknowledgement
• Wait for error
• Wait for journal sync
• Wait for replication
Unacknowledged
MongoDB Acknowledged (wait for error)
Wait for Journal Sync
Wait for Replication
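As a rough sketch of how the four levels above could map onto getLastError command documents: the helper name `writeConcernCommand` and the level names are assumptions made for illustration, and the `w : 2` and `wtimeout : 5000` values are arbitrary examples rather than recommendations.

```javascript
// Hypothetical helper: maps the four write concern levels from the slides
// to getLastError command documents. Level names are illustrative only.
function writeConcernCommand(level) {
  switch (level) {
    case "unacknowledged":
      // Network acknowledgement only: no getLastError is sent.
      return null;
    case "acknowledged":
      // Wait for error: the server confirms it applied the write.
      return { getLastError: 1 };
    case "journaled":
      // Wait for journal sync.
      return { getLastError: 1, j: true };
    case "replicated":
      // Wait for replication to 2 members, giving up after 5 seconds.
      return { getLastError: 1, w: 2, wtimeout: 5000 };
    default:
      throw new Error("unknown level: " + level);
  }
}

console.log(JSON.stringify(writeConcernCommand("journaled")));
```

In the mongo shell, the returned document would typically be passed to db.runCommand() right after the write it should cover.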
Tagging
• Control where data is written to, and read from
• Each member can have one or more tags
– tags: {dc: "ny"}
– tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"}
• Replica set defines rules for write concerns
• Rules can change without changing app code
Tagging Example
{
  _id : "mySet",
  members : [
    {_id : 0, host : "A", tags : {"dc": "ny"}},
    {_id : 1, host : "B", tags : {"dc": "ny"}},
    {_id : 2, host : "C", tags : {"dc": "sf"}},
    {_id : 3, host : "D", tags : {"dc": "sf"}},
    {_id : 4, host : "E", tags : {"dc": "cloud"}}
  ],
  settings : {
    getLastErrorModes : {
      allDCs : {"dc" : 3},
      someDCs : {"dc" : 2}
    }
  }
}
> db.blogs.insert({...})
> db.runCommand({getLastError : 1, w : "someDCs"})
Wait for Replication (Tagging)
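One way to read a tagged rule such as someDCs : {"dc" : 2} is that the write must be acknowledged by members covering at least 2 distinct values of the "dc" tag. The sketch below models that check in plain JavaScript. The member list and rules are copied from the tagging example, but `satisfiesMode` is a hypothetical helper, not a MongoDB API.

```javascript
// Members and rules taken from the tagging example on the slides.
const taggedMembers = [
  { host: "A", tags: { dc: "ny" } },
  { host: "B", tags: { dc: "ny" } },
  { host: "C", tags: { dc: "sf" } },
  { host: "D", tags: { dc: "sf" } },
  { host: "E", tags: { dc: "cloud" } }
];
const getLastErrorModes = { allDCs: { dc: 3 }, someDCs: { dc: 2 } };

// Hypothetical helper: true when the acknowledging hosts cover enough
// distinct values of every tag named in the rule.
function satisfiesMode(modeName, ackedHosts) {
  const rule = getLastErrorModes[modeName];
  return Object.keys(rule).every(function (tag) {
    const values = new Set();
    taggedMembers.forEach(function (m) {
      if (ackedHosts.includes(m.host) && m.tags[tag] !== undefined) {
        values.add(m.tags[tag]);
      }
    });
    return values.size >= rule[tag];
  });
}

console.log(satisfiesMode("someDCs", ["A", "B"]));      // false: both are in "ny"
console.log(satisfiesMode("someDCs", ["A", "C"]));      // true: "ny" and "sf"
console.log(satisfiesMode("allDCs", ["A", "C", "E"]));  // true: all three DCs
```

Because the rule lives in the replica set configuration, changing {"dc" : 2} to another count changes the check without touching the application, which only names the mode.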
Read Preference Modes
• 5 modes
– primary (only), the default
– primaryPreferred
– secondary
– secondaryPreferred
– nearest
When more than one node is eligible, the closest node is used for reads (all modes but primary)
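The five modes can be pictured with a simplified selection function. This is a sketch under stated assumptions, not how any particular driver is implemented: `pickMember`, the `state` strings, and the `pingMs` field are hypothetical, and real drivers typically choose among several members within a latency window rather than always taking the single closest one.

```javascript
// Hypothetical, simplified model of driver-side member selection.
function pickMember(mode, members) {
  const primaries = members.filter(function (m) { return m.state === "PRIMARY"; });
  const secondaries = members.filter(function (m) { return m.state === "SECONDARY"; });
  // Closest = lowest measured ping; null when the list is empty.
  function closest(list) {
    const sorted = list.slice().sort(function (a, b) { return a.pingMs - b.pingMs; });
    return sorted[0] || null;
  }
  switch (mode) {
    case "primary":            return primaries[0] || null;
    case "primaryPreferred":   return primaries[0] || closest(secondaries);
    case "secondary":          return closest(secondaries);
    case "secondaryPreferred": return closest(secondaries) || primaries[0] || null;
    case "nearest":            return closest(primaries.concat(secondaries));
    default: throw new Error("unknown mode: " + mode);
  }
}

// Illustrative set: A is primary, B is the closest secondary.
const exampleSet = [
  { host: "A", state: "PRIMARY", pingMs: 20 },
  { host: "B", state: "SECONDARY", pingMs: 5 },
  { host: "C", state: "SECONDARY", pingMs: 40 }
];
console.log(pickMember("primary", exampleSet).host);   // A
console.log(pickMember("secondary", exampleSet).host); // B
console.log(pickMember("nearest", exampleSet).host);   // B
```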
Operational Considerations
Maintenance and Upgrade
• No downtime
• Rolling upgrade/maintenance
– Start with Secondary
– Primary last
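The ordering above can be sketched as a small planning function. `rollingOrder` and the `state` field are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: returns hosts in the order they would be upgraded,
// secondaries first and the current primary last.
function rollingOrder(members) {
  const secondaries = members
    .filter(function (m) { return m.state !== "PRIMARY"; })
    .map(function (m) { return m.host; });
  const primary = members
    .filter(function (m) { return m.state === "PRIMARY"; })
    .map(function (m) { return m.host; });
  return secondaries.concat(primary);
}

console.log(rollingOrder([
  { host: "A", state: "PRIMARY" },
  { host: "B", state: "SECONDARY" },
  { host: "C", state: "SECONDARY" }
]).join(", ")); // B, C, A
```

When the primary's turn comes, running rs.stepDown() on it in the mongo shell lets an already upgraded secondary take over before the old primary is restarted.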
Replica Set – 1 Data Center
• Single datacenter
• Single switch & power
• Points of failure:
– Power
– Network
– Data center
– Two node failure
• Automatic recovery of single node crash
Replica Set – 2 Data Centers
• Multi data center
• DR node for safety
• Can’t do multi data center durable write safely since only 1 node in distant DC
Replica Set – 3 Data Centers
• Three data centers
• Can survive full data center loss
• Can do w = { dc : 2 } to guarantee write in 2 data centers (with tags)
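Whether a set survives the loss of a data center comes down to whether the remaining voting members still form a strict majority of the whole set. The sketch below checks that for two layouts. The member counts per data center are assumptions, since the diagrams are not part of the text, and `survivesLossOf` is a hypothetical helper.

```javascript
// votesByDc: number of voting members per data center (assumed layouts).
function survivesLossOf(votesByDc, lostDc) {
  const total = Object.values(votesByDc).reduce(function (a, b) { return a + b; }, 0);
  const remaining = total - (votesByDc[lostDc] || 0);
  // A primary can only be elected by a strict majority of all voting members.
  return remaining > total / 2;
}

const twoDataCenters = { primaryDc: 2, drDc: 1 };
const threeDataCenters = { dc1: 2, dc2: 2, dc3: 1 };

console.log(survivesLossOf(twoDataCenters, "primaryDc")); // false: 1 of 3 left
console.log(survivesLossOf(twoDataCenters, "drDc"));      // true: 2 of 3 left
console.log(survivesLossOf(threeDataCenters, "dc1"));     // true: 3 of 5 left
```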
Recent improvements
• Read preference support with sharding
– Drivers too
• Improved replication over WAN/high-latency networks
• rs.syncFrom command
• buildIndexes setting
• replIndexPrefetch setting
Just Use It
• Use replica sets
• Easy to set up
– Try on a single machine
• Check doc page for RS tutorials
– http://docs.mongodb.org/manual/replication/#tutorials
Thank You
Next Sessions at 12:40
5th Floor:
West Side Ballroom 3&4: Indexing and Query Optimization
West Side Ballroom 1&2: Text Search (Beta)
Juilliard Complex: Business Track: Building a Personalized Mobile App Experience Using MongoDB at ADP
Lyceum Complex: Ask the Experts
7th Floor:
Empire Complex: Scaling MongoDB; Sharding Into and Beyond the Multi-Terabyte Range
SoHo Complex: Automated Slow Query Analysis: MongoDB & Hadoop, Sittin' in a Tree

Basic Replication in MongoDB
