Deploying ProxySQL on GKE with CloudSQL

Alfred Tommy
8 min read · Sep 19, 2019


A common use case in production MySQL databases with read replicas is balancing traffic between the read endpoints and handling query distribution by directing particular queries to the relevant nodes. When deploying ProxySQL as a service on GKE, the challenge is to keep a single static endpoint for your application(s). This blog was influenced by a Severalnines post (https://severalnines.com/database-blog/mysql-docker-running-proxysql-kubernetes-service) but has been tailored for deployment on GCP.

First, let us try to visualise the scenario. Our objective is to have a Kubernetes cluster whose pods direct read-only queries to the CloudSQL read replica and read/write queries to the master. All interactions with the DB happen via the load balancer sitting on top of ProxySQL. Makes sense?

No? Fine, here is a visual representation for the unimaginative ones among us.

So, shall we get started?

PART 1: Setting up CloudSQL

CREATE A CLOUDSQL INSTANCE

Let’s start off by creating the CloudSQL instance. You can configure it however you like, but do remember to enable private IP connectivity. This essentially sets up VPC peering between the VPC you select in the associated networking drop-down menu and the VPC where Google hosts your managed CloudSQL instance, allowing you to access it over its internal IP.
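If you prefer the CLI, a rough gcloud equivalent would look something like this. The instance name, tier, and region are just placeholders, it assumes private services access is already configured on your VPC, and depending on your gcloud version the private networking flags may only be available in the beta track:

gcloud sql instances create mysql-master \
    --database-version=MYSQL_5_7 \
    --tier=db-n1-standard-1 \
    --region=us-central1 \
    --network=default \
    --no-assign-ip \
    --root-password=<root_password>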

CREATE THE READ REPLICA

Now that the master is set up, it’s time to spin up the read replica. This can be done with a single click of a button (three clicks if you’re pedantic): just click the three vertical dots on the right side of your master instance and select ‘Create read replica’ from the drop-down menu. (Do remember to enable private IP connectivity while configuring the read replica.)
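The gcloud route is similarly short; something like the following should work (names are placeholders, and you may still need to adjust the replica’s connectivity settings separately):

gcloud sql instances create mysql-replica \
    --master-instance-name=mysql-master \
    --tier=db-n1-standard-1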

CREATING DB USERS

The final step is to register users with the CloudSQL instance. There are two main users you need to add, excluding the root user of course: the first is the user your application will connect with, and the second is the user ProxySQL will use, a.k.a. the ProxySQL monitoring user.

You can create these users via the CloudSQL GUI under Users > Create User Account. For the ProxySQL monitoring user, keep the username as ‘proxysql’ and the password as ‘proxysqlpassw0rd’, and set the host to allow any host.

Alternatively, you can log in to the CloudSQL instance and add the users manually. First, connect from Cloud Shell:

gcloud sql connect <instance_connection_name> --user=<user_name> --quiet

Once you’re logged in to MySQL:

CREATE USER 'appuser'@'%' IDENTIFIED BY 'appuserpassw0rd';
CREATE USER 'proxysql'@'%' IDENTIFIED BY 'proxysqlpassw0rd';
FLUSH PRIVILEGES; -- reload the grant tables
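Note that CREATE USER alone grants no privileges. What the app user needs depends entirely on your schema, but as a rough sketch (yourdb is a placeholder; the REPLICATION CLIENT grant is what lets the ProxySQL monitor user check replication lag on the replica):

GRANT SELECT, INSERT, UPDATE, DELETE ON yourdb.* TO 'appuser'@'%';
GRANT REPLICATION CLIENT ON *.* TO 'proxysql'@'%';
FLUSH PRIVILEGES;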

PART 2: Setting up the GKE cluster

So now that we’ve got the DB part taken care of (for now), let’s go ahead and create a GKE cluster to deploy our application & ProxySQL.

You can go ahead and configure your cluster the way you want to, but there are just 3 things to keep in mind.

1. VPC-native must be enabled: this gives pods VPC-routable (alias) IPs, which is what lets them securely reach the privately peered CloudSQL instance from within the Kubernetes cluster.

2. When defining your node pool, you will want to set access scopes based on your access requirements. In this case, we need to ensure that access to the Cloud SQL API is enabled for our Kubernetes nodes.

3. Lastly, deploy your cluster in the same network that you selected while setting up your CloudSQL instance in Part 1 of this article. In my case, it is default.
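For reference, here is roughly what those three points translate to on the command line (the cluster name and zone are placeholders; adjust the scopes to your own access requirements):

gcloud container clusters create proxysql-cluster \
    --zone=us-central1-a \
    --network=default \
    --enable-ip-alias \
    --scopes=gke-default,sql-admin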

PART 3: Yaml yaml yaml

CREATING proxysql.cnf

Looking good so far; it’s finally time to create our YAML files!
Time for a pun that is just as humorous as it is necessary.

It’s an L-shaped YAM guys -.-

I know it’s a potato… sigh, the things we do for puns. As soon as you’re done cringing, let’s move on to generating these files. The first file we’ll work with is our ProxySQL configuration file, which we will label proxysql.cnf (we’ll get to the YAMLs, chill).

datadir="/var/lib/proxysql"
admin_variables=
{
    admin_credentials="proxysql-admin:adminpassw0rd"
    mysql_ifaces="0.0.0.0:6032"
    refresh_interval=2000
}
mysql_variables=
{
    threads=4
    max_connections=2048
    default_query_delay=0
    default_query_timeout=36000000
    have_compress=true
    poll_timeout=2000
    interfaces="0.0.0.0:6033"
    default_schema="information_schema"
    stacksize=1048576
    server_version="5.1.30"
    connect_timeout_server=10000
    monitor_history=60000
    monitor_connect_interval=200000
    monitor_ping_interval=200000
    ping_interval_server_msec=10000
    ping_timeout_server=200
    commands_stats=true
    sessions_sort=true
    monitor_username="proxysql"
    monitor_password="proxysqlpassw0rd"
}
mysql_replication_hostgroups =
(
    { writer_hostgroup=10, reader_hostgroup=20, comment="MySQL Replication 5.7" }
)
mysql_servers =
(
    { address="<master-private-IP>", port=3306, hostgroup=10, max_connections=100 },
    { address="<master-private-IP>", port=3306, hostgroup=20, max_connections=100, weight=10 },
    { address="<replica-private-IP>", port=3306, hostgroup=20, max_connections=100, max_replication_lag=5 }
)
mysql_users =
(
    { username="appuser", password="appuserpassw0rd", default_hostgroup=10, active=1 },
    { username="<root>", password="<root_password>", default_hostgroup=10, active=1 }
)
mysql_query_rules =
(
    {
        rule_id=100
        active=1
        match_pattern="^SELECT .* FOR UPDATE"
        destination_hostgroup=10
        apply=1
    },
    {
        rule_id=200
        active=1
        match_pattern="^SELECT .*"
        destination_hostgroup=20
        apply=1
    }
)

Things to note from proxysql.cnf:

  1. Remember the admin_credentials segment because we will need those credentials to access ProxySQL in admin mode.
  2. In mysql_variables, the monitor_username and monitor_password should match the proxysql user credentials you created on the CloudSQL instance.
  3. Take a look at the mysql_servers section:
mysql_servers =
(
    { address="<master-private-IP>", port=3306, hostgroup=10, max_connections=100 },
    { address="<master-private-IP>", port=3306, hostgroup=20, max_connections=100, weight=10 },
    { address="<replica-private-IP>", port=3306, hostgroup=20, max_connections=100, max_replication_lag=5 }
)

Here we add the master to both the writer and reader hostgroups, while the read replica is only placed in the reader hostgroup.

The weight value of 10 on the master’s entry in the reader hostgroup tells ProxySQL to prefer the master when handling read queries; the higher the weight, the stronger the preference. max_replication_lag has been set to 5 seconds so that if the replica falls further behind than that, ProxySQL temporarily shuns it and routes all read queries to the master, preventing stale or missing data from being returned to the application.

The query rules section is fairly intuitive if you understand regex, and is what enables the read-write split: a plain SELECT matches rule 200 and goes to the reader hostgroup (20), SELECT … FOR UPDATE matches rule 100 and stays on the writer hostgroup (10), and anything that matches neither rule (INSERT, UPDATE, DELETE, etc.) falls back to the user’s default_hostgroup, which is also 10.

CREATING proxysql-deployment.yaml

This is what our deployment file looks like:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: proxysql
  labels:
    app: proxysql
spec:
  replicas: 2
  selector:
    matchLabels:
      app: proxysql
      tier: frontend
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: proxysql
        tier: frontend
    spec:
      restartPolicy: Always
      containers:
      - image: severalnines/proxysql:1.4.12
        name: proxysql
        volumeMounts:
        - name: proxysql-config
          mountPath: /etc/proxysql.cnf
          subPath: proxysql.cnf
        ports:
        - containerPort: 6033
          name: proxysql-mysql
        - containerPort: 6032
          name: proxysql-admin
      volumes:
      - name: proxysql-config
        configMap:
          name: proxysql-configmap

Here we reference the ProxySQL ConfigMap as proxysql-configmap; we’ll make sure to create the ConfigMap under that exact name later to avoid any discrepancies.

Run this command to enable HPA (Horizontal Pod Autoscaling) for the deployment (in our case the deployment name is proxysql) so the pods can scale up when needed.

kubectl autoscale deployment <deployment-name> --cpu-percent=80 --min=2 --max=6

You can adjust these values based on your requirements.
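You can confirm the autoscaler has been registered, and see its current utilisation against the target, with the usual kubectl command. Keep in mind that CPU-based autoscaling only kicks in if the proxysql container has a CPU request defined, which the deployment above does not set, so add one if you want the HPA to actually trigger.

kubectl get hpa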

CREATING proxysql-service.yaml

OK, hold up now. When creating the service, the main thing to keep in mind is the service type. NodePort would expose the service on externally accessible node IPs; ClusterIP would give us a private endpoint, which is closer to what we need, but it is only reachable from inside the cluster and the IP changes every time the service is deleted, recreated, or redeployed. A public load balancer doesn’t fit our criteria either. What we need is an internal load balancer, and this is how we get it:

1. First, we have to reserve a static internal IP address: go to ‘VPC Networks’ > click on the VPC name (in our case ‘default’) > select the ‘Static Internal IPs’ tab > click ‘Reserve Static Address’. (A gcloud equivalent is sketched after the service manifest below.)

2. Note down the IP address you just reserved, because this will be the endpoint your applications use. Make sure you reserve it in the same subnet that your GKE nodes (and therefore your ProxySQL pods) are deployed in.

3. Now we can create our service file, proxysql-service.yaml. This is what it looks like:

apiVersion: v1
kind: Service
metadata:
  name: proxysql
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
  labels:
    app: proxysql
    tier: frontend
spec:
  type: LoadBalancer
  loadBalancerIP: <reserved_static_internal_IP>
  ports:
  - port: 30033
    protocol: TCP
    targetPort: 6033
    name: proxysql-mysql
  - port: 30032
    protocol: TCP
    targetPort: 6032
    name: proxysql-admin
  selector:
    app: proxysql
    tier: frontend

This configuration ensures that the service only uses the static IP we have provisioned for it, giving us a single static endpoint for our application(s).
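As an aside, step 1 can also be done from the CLI; something along these lines reserves an internal address and prints it back (the address name, region, and subnet are placeholders):

gcloud compute addresses create proxysql-endpoint \
    --region=us-central1 \
    --subnet=default

gcloud compute addresses describe proxysql-endpoint \
    --region=us-central1 --format='value(address)'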

PART 4: Deploy EVERYTHING!

If you’ve been following along, at this stage you should have 3 files: proxysql-deployment.yaml, proxysql-service.yaml, and proxysql.cnf.

Let’s start by creating a namespace for ProxySQL

kubectl create namespace <namespace_name>
OR
kubectl create namespace proxysql-ns

Next, deploy the ConfigMap

kubectl create configmap proxysql-configmap --from-file=proxysql.cnf --namespace=proxysql-ns

Then the deployment file

kubectl create -f proxysql-deployment.yaml --namespace=proxysql-ns

And finally, the service

kubectl create -f proxysql-service.yaml --namespace=proxysql-ns

Check the status of your deployments by running these commands (or kubectl describe)

kubectl get pods --namespace=proxysql-ns
kubectl get svc --namespace=proxysql-ns
kubectl get deployments --namespace=proxysql-ns

Aaaaaand Ta-dah! You now have a single ProxySQL endpoint that your applications can access, and a distributed database structure that utilizes your master instance and read replicas wisely. Kubernetes will take care of your pods in terms of health checks, autoscaling, etc., so you can sit back, grab a beer, and pat yourself on the back.
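If you want a quick sanity check, you can hit the new endpoint from a VM or pod inside the VPC and then peek at ProxySQL’s query stats over the admin port to see which hostgroup served what. Credentials, IP, and ports are the ones configured earlier; the digest query below is just one way to inspect it:

mysql -h <reserved_static_internal_IP> -P 30033 -u appuser -p'appuserpassw0rd' -e "SELECT @@hostname;"

mysql -h <reserved_static_internal_IP> -P 30032 -u proxysql-admin -p'adminpassw0rd' \
    -e "SELECT hostgroup, digest_text, count_star FROM stats_mysql_query_digest ORDER BY count_star DESC LIMIT 10;"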

Note: If you want to dig deeper into testing the read-write split, there is plenty of documentation publicly available to help you do just that. Until next time :)
