M5 Networking For Secure Database Connectivity
Networking for
Secure Database
Connectivity
Hello, my name is Julianne Cuneo and I’m a Data Analytics Specialist Customer
Engineer at Google. Welcome back to Enterprise Database Migration.
Security should always be a top concern, so let’s have a look at how to build a secure
network for database connectivity.
Objectives
There are various levels of network security we need to focus on: first building a
secure network for the database servers, and then the client apps that will connect to
them.
Next, in cases where you need to communicate across networks, you can use VPNs
and VPC peering.
As always, firewall rules are a useful way to control access to the databases.
Throughout this module, you will automate building the network infrastructure using
Terraform.
Building Secure
Networks
Building a secure network to host the database server is the first place to start.
A database migration project requires a secure
network configuration in Google Cloud
● All resources need to be in a project.
● Projects contain one or more VPCs.
Projects contain one or more VPCs, which provide virtual networking. In Google
Cloud, VPCs are a global resource.
Each VPC contains one or more subnets, which are regional resources. You need to
create a subnet for each region you want VMs located in.
Use multiple networks to easily isolate
machines from each other
By default:
● Machines in the same VPC can communicate without external IP addresses.
● A VM with no external IP address is only reachable from inside its VPC.
By default, machines in the same VPC can communicate with one another via their
internal IP address, regardless of the region they are in.
You can use multiple networks to control which machines are reachable from other
machines and to isolate machines from the outside world.
Firewall rules are used to control which machines
can communicate using which ports
By default:
● All ports are closed to ingress.
● All ports are open to egress.
When creating firewall rules:
● Targets are used to specify which machines
in the VPC the rule applies to.
● Sources specify which machines outside the
VPC the rule applies to.
● Use Allow rules to permit ingress.
● Use Deny rules to prevent egress.
Firewall rules can control which machines can communicate with one another through
designated ports.
Firewall rules consist of targets, which specify which machines in the VPC the rules
apply to, and sources, which specify which machines outside the VPC the rules apply
to.
Because ingress is closed by default, you use Allow rules to permit specific ingress,
and because egress is open by default, you use Deny rules to prevent specific
egress.
A default network is created when you enable the
Compute Engine service
● A subnet is created for every region in the world.
● Default firewall rules are created for:
○ SSH
○ RDP
○ HTTP
○ HTTPS
○ ICMP
● All internal traffic is allowed.
● The default network makes it easy to get started, but is probably not appropriate for production environments:
○ You probably don’t want every subnet to be used.
○ The firewall rules are too permissive (SSH and RDP are allowed from all sources, for example).
When you enable the Compute Engine service in a project, the system automatically
creates a default network. This default network contains a subnet for each region in
the world.
Also, default firewall rules are created for SSH, RDP, HTTP, HTTPS, and ICMP, and
all internal traffic is allowed.
The default network makes it easy to get started, but it should be just that—a starting
point. Most production environments will need a modified set of these rules. For
example, you probably don’t want every subnet to be used, and certainly don’t want
the overly permissive rules allowing SSH and RDP from all sources.
Creating a custom VPC network
The best option is to create your own custom VPC network. You start by giving it a
name and then add a subnet for each region you want to use.
Subnets require an internal IP address range. Make sure the address ranges in
different subnets don’t overlap.
Creating firewall rules illustrated
Here’s an example of using the console to fill in the relevant details for creating a
firewall rule. The parameters for configuring a firewall rule include name, network,
priority, ingress or egress, allow or deny, targets, sources, protocols, and ports.
Firewall rule parameters
● Each firewall rule has a unique name, and you should use a consistent
naming convention.
● Each rule is scoped to a network.
● If multiple rules conflict, the priority determines which rule wins. Lower
numbers have higher priority.
● Direction determines whether this is an ingress or egress rule. Recall that
ingress is blocked by default, and egress is allowed by default.
● The action is either allow or deny. Because ingress is blocked by default, an ingress rule is probably an "allow"; because egress is open by default, an egress rule is probably a "deny."
● Targets determine which machines in your network the rules apply to. This can
be set to all the machines, or you can use tags or service accounts to specify
only certain machines.
● Sources are used to determine which machines outside your network the rule
applies to. Sources are usually determined using IP address ranges.
● Finally, you specify protocols and ports. Protocols include TCP, UDP, and ICMP (used by ping). You can specify individual ports or port ranges.
It’s not so different from defining firewall rules on your own router. The interesting
thing to note is priority, which handles conflicts when multiple rules overlap. In that
case, the rule with the lower priority number wins over the rule with the higher number.
Firewall targets and sources
For targets, you can specify all machines on the network. Or you can specify VMs
with a particular network tag. A network tag is simply a string that you can assign to
VMs. Lastly, you can specify the service account that VMs are assigned when they
are created.
Sources are usually specified using IP addresses or ranges using CIDR notation. With
CIDR notation, an IP address is specified followed by a slash and a number. The
number determines the range and can be from 0 to 32. Lower numbers indicate
larger ranges. Thus, a /32 means only that address, a /24 means all IP addresses that
begin with the first three numbers, a /16 means all addresses that begin with the first
two numbers, and so on. The range 0.0.0.0/0 means all the machines in the world are
considered sources for this rule. As with targets, you can also use tags or service
accounts to determine sources.
Use Terraform to automate the
creation of resources
● Automation is essential so you can effectively test and iterate on a solution.
● Terraform is included in Cloud Shell by default:
○ Installing it or running a separate Terraform server isn’t necessary.
Terraform is a tool from HashiCorp that allows you to automate the creation of
resources. Each cloud provider has its own version of a similar tool: AWS has
CloudFormation, and Microsoft Azure has Resource Manager. But Terraform is commonly
used to automate cloud resources because it is supported on all the major cloud
platforms. Thus, many organizations that want to use multiple or hybrid cloud
environments prefer it, so they only need to learn one tool.
Terraform uses files with instructions, and you put all the files for a specific
deployment in one folder.
You create templates that describe what you want to create, and there are templates
available for creating all types of Google Cloud resources.
You add one or more of these .tf files to the folder and combine them for deployment.
If there are variables, you can set them in the .tfvars file.
And the .tfstate file keeps track of existing resources and is used when resources are
updated or destroyed.
Use the Google Cloud Terraform provider
provider "google" {
project = var.project_id
region = var.gcp_region_1
}
This code snippet shows how to use the Google Cloud Terraform provider. Google
collaborates with HashiCorp to keep this provider up to date.
In the example, the Google Cloud project ID and region are being set. Note that the
values come from variables that are declared in another .tf file. Variables are often a
better choice than hard-coded values because they make the configuration file easier
to reuse.
Create variables for your Terraform deployment
to more easily change settings
variable "project_id" {
type = string
description = "GCP Project ID"
}
variable "gcp_region_1" {
type = string
description = "GCP Region"
}
variable "subnet_cidr_public" {
type = string
description = "Subnet CIDR for Public Network"
}
In this code block, three variables are being declared: "project_id," "gcp_region_1,"
and "subnet_cidr_public." This is done in the file variables.tf.
Note that the values are not being set here; the variables are simply being declared.
You could, however, set default values for each variable here if you wanted to, as in
the sketch below.
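For example, the gcp_region_1 declaration above could include a default. This is only an illustration: the "us-east4" value shown here is an assumption, and anything set in .tfvars or on the command line overrides it.

variable "gcp_region_1" {
  type        = string
  description = "GCP Region"
  default     = "us-east4"   # example default; .tfvars or CLI values take precedence
}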
Variables can be set in a template or
at runtime
project_id = "database-migration-tf"
gcp_region_1 = "us-east4"
subnet_cidr_public = "10.1.1.0/24"
subnet_cidr_private = "10.2.2.0/24"
You can set the values of your variables in the .tfvars file, or alternatively, you can set
them at runtime.
If you do not set the value of a variable in this file, you will be prompted for it when
running the template. You could also pass the variable values as parameters when
running the template.
Terraform template for creating a VPC and subnet
In the first code block, a network is created called “public-vpc.” It is a resource of the
type "google_compute_network."
The network is then referenced in the second code block when creating the subnet.
Notice the network property of the subnet. It refers to the network created above it
using the resource type and name. A network also has a property called "name" which
is being used to set the value of the subnet's network property.
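The slide code itself isn't reproduced here, so the following is a sketch of what such a template might look like. It reuses the public-vpc network name and the subnet_cidr_public variable from this module; the subnet name public-subnet-1 is an assumption.

resource "google_compute_network" "public-vpc" {
  name                    = "public-vpc"
  auto_create_subnetworks = false   # custom mode: you add subnets yourself
}

resource "google_compute_subnetwork" "public-subnet-1" {
  name          = "public-subnet-1"   # assumed name
  region        = var.gcp_region_1
  network       = google_compute_network.public-vpc.name
  ip_cidr_range = var.subnet_cidr_public
}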
Terraform commands
Using Terraform involves adapting the templates into your own configuration files and
then running several Terraform commands.
● The command terraform init should be run from the folder where the
Terraform files are; it initializes the working directory. You only need to run it
again when you switch to a different configuration directory.
● terraform plan generates and displays an execution plan. Terraform
decides on the order in which to create resources independently of how you
ordered them in your configuration. This is part of the Terraform logic, and
dependencies are known by the provider you are using.
● The apply command builds or modifies the infrastructure according to the
plan that was developed from your templates. If you don’t want to be prompted
to approve the plan, you can include the -auto-approve option when you
apply.
● Finally, destroy would get rid of the resources defined in a template. This is
all tracked in the .tfstate file. There’s also an -auto-approve option here.
A secure environment allows only
known clients to access the database
(Diagram: a client with public and private IPs connects to a database that has only private IPs.)
A good practice in securing your environment is to only allow known clients to have
access to your database server through firewall rules.
Creating VMs with Terraform
resource "google_compute_instance" "sql-client" {
name = "sql-client-${random_id.instance_id.hex}"
machine_type = "f1-micro"
zone = var.gcp_zone_1
tags = ["allow-ssh"]
boot_disk {
initialize_params {
image = "ubuntu-os-cloud/ubuntu-1604-xenial-v20200429"
}
}
Here is an example template for creating a virtual machine based on the Ubuntu
image. The network tag "allow-ssh" is used as the target tag for the SSH firewall rule.
Note the various properties that are being set for the machines. These are the same
properties you would set in the web console; you are just doing it with some code.
Creating firewall rules with Terraform
Here’s an example of how to create a firewall rule using Terraform. In this case, all
sources can connect via tcp through port 22, but only to targets with the tag
"allow-ssh."
Avoid common pitfalls when
configuring networks
● Use care when configuring firewall rules:
○ Only use source 0.0.0.0/0 when appropriate.
○ Use tags or service accounts to specify targets.
● Don't use external IP addresses on VMs that don't need them.
○ External IPs also cost money and increase egress cost.
● Use CIDR ranges that are small when configuring subnets.
○ Reduces chance of IP conflict when connecting networks.
● Always use SSL when communicating from outside Google Cloud.
○ Traffic within Google Cloud is encrypted by default.
● Use a VPN or Cloud Interconnect to connect on-premises networks to Google.
Perhaps that last example was a bit too permissive. In general, you should avoid
allowing all sources, so watch out for using 0.0.0.0/0, unless you have to. For
example, allowing HTTPS access to web servers from all sources is probably what
you want, but use more restrictive rules for dangerous protocols like SSH, RDP, or
database access. Instead, use tags or service accounts.
Also avoid using external IP addresses on VMs if you don’t need them. In addition to
weakening security, they cost money and also increase egress cost.
When configuring subnets, try to use CIDR ranges that are small to reduce the
chance of IP overlap conflicts.
Always use SSL when communicating between Google Cloud and the outside world.
Within Google Cloud, all traffic is encrypted by default, but outside of the Google
network, you need to encrypt it.
And when you need to connect on-premises networks to Google resources, the best
option is to use VPN or Cloud Interconnect, not external IPs.
Lab Intro
Using Terraform to Create
Networks and Firewalls
In this lab, you use Terraform to automate the creation of network infrastructure. You
will create two VPCs, one for public servers and one for the database servers, and
create firewall rules for each network.
Lab Review
Using Terraform to Create
Networks and Firewalls
In this lab:
You used Terraform to automate infrastructure creation.
Created two VPCs: one for public servers and one for databases.
And created firewalls for each network.
After you have networks, you want to use them to connect resources.
Communication services must be set up so
the right computers can talk to each other
● End users can reach web servers.
● Web servers can reach database servers.
● Admins have secure access to the database.
The whole point of networks is to allow resources to connect to one another. But you
want to make sure you set this up right, because different types of resources need to
talk to one another through different network protocols to make the system secure
and to minimize costs.
End users will use the public internet to connect to a web server that hosts your app.
But that web server will need to talk to a database, which should only allow
connections from within the same project through the VPC.
You don’t want the general public to be able to reach your database server. However,
you will have some on-premises users, such as your admins or data analysts, who
will need to connect to the database. For them, a VPN or Interconnect is a more
appropriate way to connect than a public IP.
Cloud VPN securely connects your on-premises
network to your Google Cloud VPC network
● Useful for low-volume data connections
● Classic VPN: 99.9% SLA
● High-availability (HA) VPN: 99.99% SLA
● Supports:
○ Site-to-site VPN
○ Static routes (Classic VPN)
○ Dynamic routes (Cloud Router)
○ IKEv1 and IKEv2 ciphers
Cloud VPN is the way to connect your on-premises network to your Google Cloud
VPC network. It’s highly available and secure.
Cloud VPN can support connections up to about 3 Gbps per tunnel, and it can be
configured in two ways. The Classic configuration consists of a single VPN tunnel and
offers a 99.9% availability SLA. A high-availability (HA) configuration uses a second
tunnel to achieve a 99.99% availability SLA.
Cloud VPN can also be configured for either static or dynamic routing, with dynamic
routing provided by Cloud Router.
VPN Gateway
To connect your local network to your VPC network, you need a gateway on each
side. You set up a VPN gateway in your on-premises network and a Cloud VPN
gateway in your Google Cloud VPC, and connect the two. This provides an encrypted
connection for secure communication between the networks.
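As a rough illustration only, a minimal HA VPN setup in Terraform could look like the sketch below. It assumes the private-vpc network and gcp_region_1 variable from this module, an on-premises gateway at 203.0.113.10, a vpn_shared_secret variable, and an existing Cloud Router named vpn-router; the Cloud Router interfaces and BGP peers needed to exchange routes are not shown.

resource "google_compute_ha_vpn_gateway" "ha-gateway" {
  name    = "ha-vpn-gateway"
  region  = var.gcp_region_1
  network = google_compute_network.private-vpc.id
}

resource "google_compute_external_vpn_gateway" "on-prem-gateway" {
  name            = "on-prem-gateway"
  redundancy_type = "SINGLE_IP_INTERNALLY_REDUNDANT"
  interface {
    id         = 0
    ip_address = "203.0.113.10"   # assumed on-premises gateway IP
  }
}

resource "google_compute_vpn_tunnel" "tunnel1" {
  name                            = "ha-vpn-tunnel-1"
  region                          = var.gcp_region_1
  vpn_gateway                     = google_compute_ha_vpn_gateway.ha-gateway.id
  vpn_gateway_interface           = 0
  peer_external_gateway           = google_compute_external_vpn_gateway.on-prem-gateway.id
  peer_external_gateway_interface = 0
  shared_secret                   = var.vpn_shared_secret                  # assumed variable
  router                          = google_compute_router.vpn-router.name # assumed existing router
  ike_version                     = 2
}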
Cloud Router enables dynamic routes using
Border Gateway Protocol (BGP)
Cloud Router is not a physical router but is instead a service that works over Cloud
VPN or Cloud Interconnect connections to provide dynamic routing by using the
Border Gateway Protocol. Dynamic routing just means that if the network on either
side changes, the connection between the two is automatically updated.
Dedicated Interconnect provides
direct physical connections
1. Order your Dedicated Interconnect.
2. Send LOA-CFAs to your vendor.
3. Test the interconnection.
4. Create VLAN attachments and establish BGP sessions.
These service providers have existing physical connections to Google's network that
they make available for their customers to use. After you establish connectivity with a
service provider, you can request a Partner Interconnect connection from your service
provider. Then, you establish a BGP session between your Cloud Router and
on-premises router to start passing traffic between your networks via the service
provider's network.
● IPsec VPN tunnel (Cloud VPN): encrypted tunnel to VPC networks through the public internet; 1.5–3 Gbps per tunnel; requires an on-premises VPN gateway; provides internal IP address access.
● Dedicated Interconnect: dedicated, direct connection to VPC networks; 10 Gbps or 100 Gbps per link; requires a connection in a colocation facility; provides internal IP address access.
● Partner Interconnect: dedicated bandwidth, connection to VPC network through a service provider; 50 Mbps–10 Gbps per connection; requires a service provider connection; provides internal IP address access.
The IPsec VPN tunnels that Cloud VPN offers have a capacity of 1.5 to 3 Gbps per
tunnel and require a VPN device on your on-premises network. The 1.5-Gbps
capacity applies to traffic that traverses the public internet, and the 3-Gbps capacity
applies to traffic that is traversing a direct peering link. You can configure multiple
tunnels if you want to scale this capacity.
Dedicated Interconnect has a capacity of 10 Gbps or 100 Gbps per link and requires
you to have a connection in a Google-supported colocation facility. You can have up
to 8 links to achieve multiples of 10 Gbps, or up to 2 links to achieve multiples of 100
Gbps, but 10 Gbps is the minimum capacity.
All of these options provide internal IP address access between resources in your
on-premises network and in your VPC network. The main differences are the
connection capacity and the requirements for using a service.
VPC Network Peering allows private RFC 1918 connectivity across two VPC
networks, regardless of whether they belong to the same project or the same
organization. Now remember that each VPC network will have firewall rules that
define what traffic is allowed or denied between the networks.
For example, in this diagram there are two organizations that represent a consumer
and a producer. Each organization has its own organization node, VPC network, VM
instances, Network Admin, and Instance Admin. In order for VPC Network Peering to
be established successfully, the Producer Network Admin needs to peer the Producer
Network with the Consumer Network, and the Consumer Network Admin needs to
peer the Consumer Network with the Producer Network. When both peering
connections are created, the VPC Network Peering session becomes Active and
routes are exchanged. This allows the VM instances to communicate privately using
their internal IP addresses.
VPC Network Peering works with Compute Engine, Kubernetes Engine, and App
Engine flexible environments. Remember a few things when using VPC Network
Peering:
As pointed out on the last slide, each side of the peering association needs to be set
up separately. So note here how the public and private networks each reciprocally
connect to the other. If you only do one, it is not going to automatically allow the other.
Internal IP address ranges in peered
networks cannot overlap
As always, make sure that you don’t have overlapping IP address ranges between the
peered networks.
Routes are automatically generated that allow
traffic to flow between peered networks
Creating the peering requests automatically generates the necessary routes to allow
the traffic to flow between the peered networks.
Use Terraform to automate the creation
of peered networks
Terraform templates make automating this easier. In the code shown here, note that
there is a public-private request and then the reciprocal private-public request. It's
simple: you just make two peering requests that are the inverse of each other.
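Since the slide code isn't shown here, the following is a sketch of those two reciprocal peering requests, assuming the public-vpc and private-vpc networks created earlier in the module.

resource "google_compute_network_peering" "public-private" {
  name         = "peering-public-private"
  network      = google_compute_network.public-vpc.self_link
  peer_network = google_compute_network.private-vpc.self_link
}

resource "google_compute_network_peering" "private-public" {
  name         = "peering-private-public"
  network      = google_compute_network.private-vpc.self_link
  peer_network = google_compute_network.public-vpc.self_link
}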
Lab Intro
Using Terraform to Create
a Network Peering
Multiple networks allow you to easily isolate machines and control which machines
have access to other ones. In the course example, you want databases isolated in
their own network with no external IP addresses. Those servers, by default, will only
be accessible from machines in the same network. You can then put database clients
like web servers, in another network, and then peer the two networks, allowing
communication only via internal IP addresses.
In this lab, you learned how to use Terraform to set up network peering. The peering
allows machines in two different networks to communicate using internal IP
addresses. This allows you to protect your databases by controlling exactly which
machines have access to them and from which networks.
Enabling
Communication
across Networks
Now that the networks are in place and connected to each other, you need to set up
the appropriate communications between them.
Configure firewall rules to allow communication to
the database server from the peered network
(Diagram: Client → SQL Server on Windows, allowing RDP and port 1433.)
First, allow clients from the peered network to talk to the database server. For SQL
Server, if you are using a Windows server, you would use the default port of 1433 and
RDP. For MySQL, assuming you are running a Linux server, you would open port
3306 and SSH.
Example firewall rule to allow traffic to SQL Server
from the peered network
resource "google_compute_firewall" "private-allow-sql" {
name = "${google_compute_network.private-vpc.name}-allow-sql"
network = "${google_compute_network.private-vpc.name}"
allow {
protocol = "tcp"
ports = ["1433"]
}
source_ranges = [
"${var.subnet_cidr_public}"
]
target_tags = ["allow-sql"]
}
Here’s an example Terraform script to allow clients to connect to the SQL Server
instance through port 1433. Note that this code is using the variable that sets the
internal IP address range of the client network to set the source_ranges property of
the firewall rule. Thus, the SQL Server instance is in a private network that is only
accessible from the client network. Users can connect to the web servers in the public
network, but only the servers in the public network can connect to the SQL Server
instance.
Also, note the target_tags property. This rule only applies to servers tagged with the
string "allow-sql". A MySQL variant of the same pattern is sketched below.
A NAT proxy is required to provide internet access
to machines with no external IP access
Can create a NAT using a Compute Engine VM or the Cloud NAT service
When a machine has no public IP address, which is usually the preferred
configuration, it needs a NAT proxy in order to reach the internet. You can create a
NAT proxy using either a custom Compute Engine VM or the Cloud NAT service.
Create a Cloud NAT gateway in the
console or Terraform
resource "google_compute_router_nat" "private-nat" {
name = "private-nat"
router = google_compute_router.nat-router.name
region = google_compute_router.nat-router.region
nat_ip_allocate_option = "AUTO_ONLY"
source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
log_config {
enable = true
filter = "ERRORS_ONLY"
}
}
Here’s a Terraform example using the Cloud NAT gateway service. Again, the NAT
gateway allows machines with no external IP address to access the internet. Note
the region and router parameters: they refer to a Cloud Router, which also must be
configured.
Cloud Router allows for dynamic routing
using BGP (Border Gateway Protocol)
resource "google_compute_router" "nat-router" {
  name    = "nat-router"
  network = google_compute_network.private-vpc.name
  region  = var.gcp_region_1
  bgp {
    asn = 64514
  }
}
Here the Cloud Router configuration is shown. Note, Cloud Router is a regional
service. By default, Cloud Router advertises subnets in its region for regional dynamic
routing, or all subnets in a VPC network for global dynamic routing. New subnets are
automatically advertised by Cloud Router. Also note, the network that the router
serves must be specified.
Private Google access allows machines without
external IP addresses to access Google services
● Set when creating subnets.
● For example, a database server
needs access to Cloud Storage
for backups.
So, if a machine with no public IP address needs internet access, you now know how
to fix that using Cloud NAT.
But what about letting that VM access Google Cloud services like Cloud Storage or
BigQuery? In order to permit that, you need to turn on Private Google access. This is
done when configuring subnets. Private Google Access permits access to Cloud and
Developer APIs and most Google Cloud services, with a few exceptions.
A real-world database use case might be allowing the database server access to
Cloud Storage for storing backups.
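In Terraform, Private Google Access is a flag on the subnet resource. A minimal sketch follows, reusing the private-vpc network and the subnet_cidr_private variable from this module; the subnet name is an assumption.

resource "google_compute_subnetwork" "private-subnet-1" {
  name                     = "private-subnet-1"   # assumed name
  region                   = var.gcp_region_1
  network                  = google_compute_network.private-vpc.name
  ip_cidr_range            = var.subnet_cidr_private
  private_ip_google_access = true   # lets VMs without external IPs reach Google APIs
}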
Lab Intro
Using Terraform to Create Clients
and Servers
In this lab, you use Terraform to create client and server VMs in both public and
private networks and set up the communication between them.
Lab Review
Using Terraform to Create Clients
and Servers
In this lab, you used Terraform to create client and server VMs in both public and
private networks and set up the communication between them. At this point, you
have learned how to automate the
setup of secure networks. This is very important when migrating databases to the
cloud.
Network
Considerations for
Managed Databases
Cloud SQL allows you to choose either private or public IP addresses, or both. Also
make note that once a private IP address is enabled, it cannot be disabled.
If a private IP is chosen, a network peering is created
from Google’s network to the network specified
● Cloud SQL database is managed by
Google in a network that they own.
● You choose the network to associate
with Google’s network.
● Cloud SQL database is available
from the peered network using the
private IP address.
When you enable a Private IP, Google automatically creates a network peering
between your network (which you specify) and their network where your Cloud SQL
database is running. This is just like what you set up earlier in the course. The
difference is that Google manages the Cloud SQL network.
After the private IP is enabled, machines in the peered network can communicate
without needing a public IP address. You can clear the Public IP option, and the
server will not be given one.
A Cloud SQL instance will be in Google’s own managed network. You may need to
allow resources in your project to connect to that Cloud SQL instance, so you would
choose a private IP to do that. This will associate your chosen network with Google’s
network and peer them.
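While this module configures Cloud SQL through the console, a Terraform sketch of the same idea could look like the following. Everything here is an illustrative assumption: a reserved range and service networking connection for the peering, and a MySQL instance attached to the private-vpc network with its public IP disabled.

resource "google_compute_global_address" "sql-private-range" {
  name          = "sql-private-range"
  purpose       = "VPC_PEERING"
  address_type  = "INTERNAL"
  prefix_length = 16
  network       = google_compute_network.private-vpc.id
}

resource "google_service_networking_connection" "sql-peering" {
  network                 = google_compute_network.private-vpc.id
  service                 = "servicenetworking.googleapis.com"
  reserved_peering_ranges = [google_compute_global_address.sql-private-range.name]
}

resource "google_sql_database_instance" "private-sql" {
  name             = "private-sql"
  database_version = "MYSQL_5_7"
  region           = var.gcp_region_1
  depends_on       = [google_service_networking_connection.sql-peering]

  settings {
    tier = "db-n1-standard-1"
    ip_configuration {
      ipv4_enabled    = false                                   # no public IP
      private_network = google_compute_network.private-vpc.id   # peered network
    }
  }
}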
If a public IP is chosen, the database
is protected by a firewall
● By default, only applications in the
current project can access the
database.
● You authorize one or more additional
networks using IP addresses or
ranges.
● Create a VPN or Cloud Interconnect
to connect networks outside of
Google Cloud.
If you choose a public IP, your database server is protected by firewall rules instead.
By default, only apps in the current project can access the database, but you can
authorize additional networks to have access by using IP addresses or ranges, just
like you do when specifying source IPs in a firewall rule.
You can also create a VPN or Cloud Interconnect to connect networks outside Google
Cloud. This would be the preferred way to connect your on-premises network to the
managed database instance.
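For the public IP case, authorized networks are part of the same ip_configuration block. This fragment is a sketch only (the network label and range are assumptions) and would replace the ip_configuration block in the previous Cloud SQL example:

ip_configuration {
  ipv4_enabled = true
  authorized_networks {
    name  = "on-prem-office"   # assumed label
    value = "203.0.113.0/24"   # assumed on-premises range
  }
}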
Module Review