In the previous part of the guide we showed how to create a CoreOS Xen guest; let's now leverage that and create a cluster of 3 CoreOS guests running etcd.
First of all we should create Xen configuration files for the other two nodes; this is simply a matter of changing a few values, as sketched below.
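As a rough illustration (assuming, as in the previous part, one configuration file per node such as node-1.cfg, and that the node name and IP appear in it; adjust the substitutions to whatever your file actually contains):

# Hypothetical example: derive node-2.cfg and node-3.cfg from node-1.cfg,
# bumping the node name and the last octet of the IP; remember to also
# give each guest a unique MAC address.
for n in 2 3; do
  sed -e "s/node-1/node-$n/g" -e "s/192\.168\.100\.11/192.168.100.1$n/g" node-1.cfg > node-$n.cfg
done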
Afterwards it will be time to take a look at the Ignition configuration for the nodes, but before doing that let's first take care of the connection between the etcd daemons.
We want our etcd daemons to communicate with each other over an authenticated TLS channel. This means each node needs two sets of certificates for etcd (one for the server, and one to be used as a peer certificate when talking to the other nodes), plus a client certificate to make it easy to test interacting with etcd via etcdctl.
It is not strictly necessary to have a separate client certificate for each server, but I find it good practice not to share certificates whenever it can be avoided. Self-signed certificate generation is discussed on this page of the CoreOS website.
Let's first download and install CFSSL. Although pre-built binaries are available, it is quite easy to set up Go and compile it from source. For Go I prefer to install the standard distribution rather than the Debian packages, as it is more up to date.
First check the Go download page for the current stable version, 1.9.2 at the time of this writing, and download it (or copy the link and use it as below). I usually install Go in /usr/local/go-x.y.z and symlink /usr/local/go to it:
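A minimal sketch of that installation (take the exact tarball name and URL from the download page):

cd /usr/local
wget https://dl.google.com/go/go1.9.2.linux-amd64.tar.gz
tar xzf go1.9.2.linux-amd64.tar.gz      # unpacks into /usr/local/go
mv go go-1.9.2                          # keep the version in the directory name
ln -s go-1.9.2 go                       # /usr/local/go -> /usr/local/go-1.9.2
export PATH=$PATH:/usr/local/go/bin
go version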
go version go1.9.2 linux/amd64
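With Go in place we can build CFSSL from source; a sketch, assuming the default GOPATH of ~/go and that we want the binaries on the PATH:

go get -u github.com/cloudflare/cfssl/cmd/cfssl      # build the cfssl binary
go get -u github.com/cloudflare/cfssl/cmd/cfssljson  # and its JSON output helper
cp ~/go/bin/cfssl ~/go/bin/cfssljson /usr/local/bin/ # put them on the PATH
cfssl version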
Version: 1.2.0
Revision: dev
Runtime: go1.9.2
As discussed in the CoreOS certificates page, we will need two spec files for our certificate generation; we should also create a directory to store all our certificates.
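For example (the directory name is arbitrary; the two JSON files shown next go inside it as ca-config.json and ca-csr.json):

mkdir -p certs    # working directory for the CA and all node certificates
cd certs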
{ "signing": { "default": { "expiry": "43800h" }, "profiles": { "server": { "expiry": "43800h", "usages": [ "signing", "key encipherment", "server auth" ] }, "client": { "expiry": "43800h", "usages": [ "signing", "key encipherment", "client auth" ] }, "peer": { "expiry": "43800h", "usages": [ "signing", "key encipherment", "server auth", "client auth" ] } } } }
{ "CN": "ETCD CA", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "O": "Fun clusters", "OU": "The etcd cluster" } ] }
With this set-up we are now ready to create the certificates we need; let's first do it command by command, and then create a bash function to make it easier to repeat for multiple nodes.
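The commands look roughly as follows (a sketch based on the standard cfssl/cfssljson pipeline; the JSON passed on stdin just sets the CN of each certificate, and -hostname must list the name and IP the certificate should be valid for):

# Generate the CA key and certificate from the spec above.
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -

# Server certificate for etcd-node-1, valid for its hostname and IP.
echo '{"CN":"etcd-node-1-server","key":{"algo":"rsa","size":2048}}' | \
  cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json \
    -profile=server -hostname="etcd-node-1,192.168.100.11" - | \
  cfssljson -bare etcd-node-1-server

# Peer certificate, used both as client and server between cluster members.
echo '{"CN":"etcd-node-1-peer","key":{"algo":"rsa","size":2048}}' | \
  cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json \
    -profile=peer -hostname="etcd-node-1,192.168.100.11" - | \
  cfssljson -bare etcd-node-1-peer

# Client certificate for use with etcdctl.
echo '{"CN":"etcd-node-1-client","key":{"algo":"rsa","size":2048}}' | \
  cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json \
    -profile=client - | \
  cfssljson -bare etcd-node-1-client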
2017/12/04 13:07:27 [INFO] generating a new CA key and certificate from CSR
2017/12/04 13:07:27 [INFO] generate received request
2017/12/04 13:07:27 [INFO] received CSR
2017/12/04 13:07:27 [INFO] generating key: rsa-2048
2017/12/04 13:07:27 [INFO] encoded CSR
2017/12/04 13:07:27 [INFO] signed certificate with serial number 371395664172797524463428193829566444461989582320
2017/12/04 13:10:43 [INFO] generate received request
2017/12/04 13:10:43 [INFO] received CSR
2017/12/04 13:10:43 [INFO] generating key: rsa-2048
2017/12/04 13:10:44 [INFO] encoded CSR
2017/12/04 13:10:44 [INFO] signed certificate with serial number 711114156779509809672244743349661684421006137401
2017/12/04 13:11:04 [INFO] generate received request
2017/12/04 13:11:04 [INFO] received CSR
2017/12/04 13:11:04 [INFO] generating key: rsa-2048
2017/12/04 13:11:04 [INFO] encoded CSR
2017/12/04 13:11:04 [INFO] signed certificate with serial number 28611347093971073034596156201548317440092463213
2017/12/04 13:11:14 [INFO] generate received request
2017/12/04 13:11:14 [INFO] received CSR
2017/12/04 13:11:14 [INFO] generating key: rsa-2048
2017/12/04 13:11:15 [INFO] encoded CSR
2017/12/04 13:11:15 [INFO] signed certificate with serial number 428167357505177330066797952811634513293217918758
total 64
drwxr-xr-x 2 root root 4096 Dec 4 13:11 .
drwxr-xr-x 3 root root 4096 Dec 4 12:59 ..
-rw-r--r-- 1 root root  832 Dec 4 12:59 ca-config.json
-rw-r--r-- 1 root root  223 Dec 4 13:03 ca-csr.json
-rw------- 1 root root 1679 Dec 4 13:07 ca-key.pem
-rw-r--r-- 1 root root 1257 Dec 4 13:07 ca.pem
-rw-r--r-- 1 root root  944 Dec 4 13:11 etcd-node-1-client.csr
-rw------- 1 root root 1675 Dec 4 13:11 etcd-node-1-client-key.pem
-rw-r--r-- 1 root root 1273 Dec 4 13:11 etcd-node-1-client.pem
-rw------- 1 root root 1675 Dec 4 13:11 etcd-node-1-peer-key.pem
-rw-r--r-- 1 root root 1310 Dec 4 13:11 etcd-node-1-peer.pem
-rw------- 1 root root 1675 Dec 4 13:10 etcd-node-1-server-key.pem
-rw-r--r-- 1 root root 1298 Dec 4 13:10 etcd-node-1-server.pem
Certificate:
    ...
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: O = Fun clusters, OU = The etcd cluster, CN = ETCD CA
        Validity
            Not Before: Dec 4 21:06:00 2017 GMT
            Not After : Dec 3 21:06:00 2022 GMT
        Subject: CN = etcd-node-1-server
        ...
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            ...
            X509v3 Subject Alternative Name:
                DNS:etcd-node-1, IP Address:192.168.100.11
As you may have noticed, the commands to create the certificates are very similar, and therefore easy to automate; you can see the resulting script here, complete with bash completion. Note that if you don't provide the IP, the function will figure it out from the dnsmasq hosts file we created earlier.
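A function along these lines would do the job (a simplified sketch; the actual script linked above may differ, and also adds the bash completion and the dnsmasq hosts lookup; the CA files are assumed to be named etcd-ca.pem / etcd-ca-key.pem, as in the listing further down):

# gencerts NODE IP: create server, peer and client certificates for NODE,
# valid for NODE and IP, signed by the CA in the current directory.
gencerts() {
  local node="$1" ip="$2"
  for kind in server peer client; do
    echo '{"CN":"'"$node-$kind"'","key":{"algo":"rsa","size":2048}}' | \
      cfssl gencert -ca=etcd-ca.pem -ca-key=etcd-ca-key.pem \
        -config=ca-config.json -profile="$kind" -hostname="$node,$ip" - | \
      cfssljson -bare "$node-$kind"
  done
}

# e.g. gencerts etcd-node-1 192.168.100.11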
Let’s now remove the certificates we created and regenerate them using the script.
2017/12/12 17:03:07 [INFO] generating a new CA key and certificate from CSR
2017/12/12 17:03:07 [INFO] generate received request
2017/12/12 17:03:07 [INFO] received CSR
2017/12/12 17:03:07 [INFO] generating key: rsa-2048
2017/12/12 17:03:07 [INFO] encoded CSR
2017/12/12 17:03:07 [INFO] signed certificate with serial number 583203797657129789344945978835970118719717980298
...
...
...
total 96
drwxr-xr-x 2 root root 4096 Dec 12 17:04 .
drwxr-xr-x 3 root root 4096 Dec 12 17:03 ..
-rw-r--r-- 1 root root  832 Dec 12 17:03 ca-config.json
-rw-r--r-- 1 root root  212 Dec 12 17:03 ca-csr.json
-rw------- 1 root root 1679 Dec 12 17:03 etcd-ca-key.pem
-rw-r--r-- 1 root root 1253 Dec 12 17:03 etcd-ca.pem
-rw------- 1 root root 1679 Dec 12 17:03 etcd-node-1-client-key.pem
-rw-r--r-- 1 root root 1273 Dec 12 17:03 etcd-node-1-client.pem
-rw------- 1 root root 1679 Dec 12 17:03 etcd-node-1-peer-key.pem
-rw-r--r-- 1 root root 1306 Dec 12 17:03 etcd-node-1-peer.pem
-rw------- 1 root root 1675 Dec 12 17:03 etcd-node-1-server-key.pem
-rw-r--r-- 1 root root 1298 Dec 12 17:03 etcd-node-1-server.pem
-rw------- 1 root root 1675 Dec 12 17:04 etcd-node-2-client-key.pem
-rw-r--r-- 1 root root 1273 Dec 12 17:04 etcd-node-2-client.pem
-rw------- 1 root root 1679 Dec 12 17:04 etcd-node-2-peer-key.pem
-rw-r--r-- 1 root root 1306 Dec 12 17:04 etcd-node-2-peer.pem
-rw------- 1 root root 1675 Dec 12 17:04 etcd-node-2-server-key.pem
-rw-r--r-- 1 root root 1298 Dec 12 17:04 etcd-node-2-server.pem
-rw------- 1 root root 1679 Dec 12 17:04 etcd-node-3-client-key.pem
-rw-r--r-- 1 root root 1273 Dec 12 17:04 etcd-node-3-client.pem
-rw------- 1 root root 1679 Dec 12 17:04 etcd-node-3-peer-key.pem
-rw-r--r-- 1 root root 1306 Dec 12 17:04 etcd-node-3-peer.pem
-rw------- 1 root root 1679 Dec 12 17:04 etcd-node-3-server-key.pem
-rw-r--r-- 1 root root 1298 Dec 12 17:04 etcd-node-3-server.pem
With all the certificates generated, we are ready to create Ignition configuration files containing them. In general you would not want certificates and keys to be present in deployment files for security reasons, but given our setup it is easiest to include them directly in the .ct files.
Unfortunately you can't simply cut and paste the files into the configuration, or it won't transpile correctly: they have to be indented by the correct amount. We will therefore create a small Python script to do this for us automatically. Download it and put it in your PATH, say $XENDIR/bin; then, rather than writing a .ct file directly, we will write a .ct.tmpl and use the kgenct script to preprocess it.
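For instance, a template could be preprocessed along these lines (a hypothetical invocation; check the script itself for the exact arguments it expects):

kgenct node-1.ct.tmpl > node-1.ct   # expand the -### and |### directives into a real .ct file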
For example a .ct.tmpl equivalent to the .ct file we were using before would be the following:
storage:
files:
- filesystem: "root"
path: "/etc/hostname"
mode: 0644
contents:
inline: etcd-node-1
- path: /etc/ntp.conf
filesystem: root
mode: 0644
contents:
inline: |
server 192.168.100.1
restrict default nomodify nopeer noquery limited kod
restrict 127.0.0.1
restrict [::1]
systemd:
units:
- name: systemd-timesyncd.service
mask: true
- name: ntpd.service
enable: true
passwd:
users:
- name: core
ssh_authorized_keys:
- ssh-rsa -###/root/.ssh/id_rsa.pub
The highlighted line is the only change: anything starting with -### will be inserted at that position in the file (joined into a single line, if it was originally multiple lines), while, as we'll see later, anything starting with |### will instead be inserted keeping its lines separate, but indented correctly.
For example if we had this sample testnode.ct.tmpl file
storage:
files:
- filesystem: "root"
path: "/etc/ssl/certs/server.pem"
contents:
inline: |
|###certs/etcd-node-1-server.pem
passwd:
users:
- name: core
ssh_authorized_keys:
- -###/root/.ssh/id_rsa.pub
and executed the script we would see the following
-rwxr-xr-x 1 root root 1183 Dec 12 17:06 /storage/xen/bin/kgenct
storage:
  files:
    - filesystem: "root"
      path: "/etc/ssl/certs/server.pem"
      contents:
        inline: |
          -----BEGIN CERTIFICATE-----
          MIIDnDCCAoSgAwIBAgIUdMuMi7eKOEBYsKXKoLGUoN6oCgswDQYJKoZIhvcNAQEL
          .......
          FLDgVOAyGJw2rJ31mTbmAA==
          -----END CERTIFICATE-----
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-rsa AAAAB3N........
Note the kgen refresh target will automatically run kgenct if you have a .ct.tmpl file for the current node and it is newer than the .ct file (or if the .ct file does not exist).
We are now ready to actually configure etcd; this is the block we need to add to our configuration file:
etcd:
name: etcd-node-1
listen_client_urls: https://192.168.100.11:2379
advertise_client_urls: https://192.168.100.11:2379
listen_peer_urls: https://192.168.100.11:2380
initial_advertise_peer_urls: https://192.168.100.11:2380
initial_cluster: etcd-node-1=https://192.168.100.11:2380,etcd-node-2=https://192.168.100.12:2380,etcd-node-3=https://192.168.100.13:2380
initial_cluster_token: etcd-token
initial_cluster_state: new
As you can see, we declare that we are running etcd on our node, etcd-node-1, and we set up the various endpoints it will listen on and advertise to its peers. We also declare that the initial cluster is our 3-node cluster, together with the endpoint addresses to use.
The actual certificates that etcd will use are defined in a systemd drop-in, via environment variables that etcd itself reads when starting up:
systemd:
units:
- name: etcd-member.service
enabled: true
dropins:
- name: 30-certs.conf
contents: |
[Service]
Environment="ETCD_CERT_FILE=/etc/ssl/certs/etcd/server.pem"
Environment="ETCD_KEY_FILE=/etc/ssl/certs/etcd/server-key.pem"
Environment="ETCD_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/ca.pem"
Environment="ETCD_CLIENT_CERT_AUTH=true"
Environment="ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.pem"
Environment="ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer-key.pem"
Environment="ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/ca.pem"
Environment="ETCD_PEER_CLIENT_CERT_AUTH=true"
This is attached to the etcd service (which is called etcd-member.service) as a systemd drop-in, as discussed here.
Note that we could also have added the etcd certificate settings to the configuration itself, as this page mentions; however I thought it would be good to show systemd drop-ins in use, as they are quite useful for modifying your system by customizing systemd services. Whether those options are recognized in the etcd block does depend on your particular transpiler version.
These environment variables tell etcd where the certificates it should use are located. The certificate files themselves need to be put into the image, which can be done simply via Ignition directives, just as we did earlier for /etc/hostname.
Note that the certificates have to be stored under /etc/ssl/certs, because other directories are not made available to etcd by the etcd wrapper; you can see this by looking at the source available at the time of writing. If you put the certificates in a directory of your own, you would be surprised when etcd started up complaining that it could not find them at that path.
Given all this, here is the .ct.tmpl file we will be using for our nodes. The highlighted lines are the node-specific ones; all other lines are the same for node-1.ct.tmpl, node-2.ct.tmpl and node-3.ct.tmpl, so when copying the file to create the other nodes make sure you change node-1 to node-2 / node-3 and 192.168.100.11 to .12 / .13 in the highlighted lines.
storage:
directories:
- filesystem: "root"
path: "/etc/ssl/certs/etcd"
mode: 0750
user:
name: "etcd"
group:
name: "root"
files:
- filesystem: "root"
path: "/etc/hostname"
mode: 0644
contents:
inline: etcd-node-1
- filesystem: "root"
path: "/etc/ssl/certs/etcd/ca.pem"
mode: 0640
user:
name: "etcd"
group:
name: "root"
contents:
inline: |
|###certs/etcd-ca.pem
- filesystem: "root"
path: "/etc/ssl/certs/etcd/server.pem"
mode: 0640
user:
name: "etcd"
group:
name: "root"
contents:
inline: |
|###certs/etcd-node-1-server.pem
- filesystem: "root"
path: "/etc/ssl/certs/etcd/server-key.pem"
mode: 0600
user:
name: "etcd"
group:
name: "root"
contents:
inline: |
|###certs/etcd-node-1-server-key.pem
- filesystem: "root"
path: "/etc/ssl/certs/etcd/peer.pem"
mode: 0640
user:
name: "etcd"
group:
name: "root"
contents:
inline: |
|###certs/etcd-node-1-peer.pem
- filesystem: "root"
path: "/etc/ssl/certs/etcd/peer-key.pem"
mode: 0600
user:
name: "etcd"
group:
name: "root"
contents:
inline: |
|###certs/etcd-node-1-peer-key.pem
- filesystem: "root"
path: "/etc/ssl/certs/etcd/client.pem"
mode: 0640
user:
name: "etcd"
group:
name: "root"
contents:
inline: |
|###certs/etcd-node-1-client.pem
- filesystem: "root"
path: "/etc/ssl/certs/etcd/client-key.pem"
mode: 0600
user:
name: "etcd"
group:
name: "root"
contents:
inline: |
|###certs/etcd-node-1-client-key.pem
- path: /etc/ntp.conf
filesystem: root
mode: 0644
contents:
inline: |
server 192.168.100.1
restrict default nomodify nopeer noquery limited kod
restrict 127.0.0.1
restrict [::1]
passwd:
users:
- name: core
ssh_authorized_keys:
- -###/root/.ssh/id_rsa.pub
etcd:
name: etcd-node-1
listen_client_urls: https://192.168.100.11:2379
advertise_client_urls: https://192.168.100.11:2379
listen_peer_urls: https://192.168.100.11:2380
initial_advertise_peer_urls: https://192.168.100.11:2380
initial_cluster: etcd-node-1=https://192.168.100.11:2380,etcd-node-2=https://192.168.100.12:2380,etcd-node-3=https://192.168.100.13:2380
initial_cluster_token: etcd-token
initial_cluster_state: new
systemd:
units:
- name: systemd-timesyncd.service
mask: true
- name: ntpd.service
enable: true
- name: etcd-member.service
enabled: true
dropins:
- name: 30-certs.conf
contents: |
[Service]
Environment="ETCD_CERT_FILE=/etc/ssl/certs/etcd/server.pem"
Environment="ETCD_KEY_FILE=/etc/ssl/certs/etcd/server-key.pem"
Environment="ETCD_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/ca.pem"
Environment="ETCD_CLIENT_CERT_AUTH=true"
Environment="ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.pem"
Environment="ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer-key.pem"
Environment="ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/ca.pem"
Environment="ETCD_PEER_CLIENT_CERT_AUTH=true"
Note that if you would like to use a specific version of etcd, you can add to the drop-in above a line like Environment="ETCD_IMAGE_TAG=v3.2.11"; you can see the available tags here.
After creating the three .ct.tmpl files with the respective node values, we can simply refresh the nodes and start them up. We will also recreate all the certificates, just to show a completely-from-scratch start.
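The exact commands depend on how you have been driving kgen and xl in the previous parts; roughly something like the following (the kgen invocation below is hypothetical, use whatever form of its refresh target you set up earlier):

for n in 1 2 3; do kgen node-$n refresh; done    # regenerate grub.cfg, transpile the .ct.tmpl, publish the .json
for n in 1 2 3; do xl create node-$n.cfg & done  # boot the three guests in parallel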
grub.cfg set to:set linux_append="coreos.config.url=http://192.168.100.1/etcd/node-1.json"
Removed /etc/machine-id for systemd units refresh
Creating the transpile file from the template node-1.ct.tmpl
Transpiling node-1.ct and adding it to nginx
Creating coreos/first_boot
grub.cfg set to:set linux_append="coreos.config.url=http://192.168.100.1/etcd/node-2.json"
Removed /etc/machine-id for systemd units refresh
Creating the transpile file from the template node-2.ct.tmpl
Transpiling node-2.ct and adding it to nginx
Creating coreos/first_boot
grub.cfg set to:set linux_append="coreos.config.url=http://192.168.100.1/etcd/node-3.json"
Removed /etc/machine-id for systemd units refresh
Creating the transpile file from the template node-3.ct.tmpl
Transpiling node-3.ct and adding it to nginx
Creating coreos/first_boot
[1] 18724
[2] 18725
[3] 18726
Parsing config from node-1.cfg
Parsing config from node-3.cfg
Parsing config from node-2.cfg
● etcd-member.service - etcd (System Application Container)
   Loaded: loaded (/usr/lib/systemd/system/etcd-member.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/etcd-member.service.d
           └─20-clct-etcd-member.conf, 30-certs.conf
   Active: activating (start) since Wed 2017-12-13 01:48:21 UTC; 1min 10s ago
     Docs: https://github.com/coreos/etcd
  Process: 678 ExecStartPre=/usr/bin/rkt rm --uuid-file=/var/lib/coreos/etcd-member-wrapper.uuid (code=exited, status=254)
  Process: 654 ExecStartPre=/usr/bin/mkdir --parents /var/lib/coreos (code=exited, status=0/SUCCESS)
 Main PID: 730 (rkt)
    Tasks: 8 (limit: 32768)
   Memory: 142.7M
      CPU: 2.151s
   CGroup: /system.slice/etcd-member.service
           └─730 /usr/bin/rkt run --uuid-file-save=/var/lib/coreos/etcd-member-wrapper.uuid --trust-keys-from-https --mount volume=coreos-systemd-dir,target=/run/systemd/system --volume coreos-systemd-dir,ki

Dec 13 01:49:11 etcd-node-1 etcd-wrapper[730]: Downloading signature: 0 B/473 B
Dec 13 01:49:11 etcd-node-1 etcd-wrapper[730]: Downloading signature: 473 B/473 B
Dec 13 01:49:11 etcd-node-1 etcd-wrapper[730]: Downloading signature: 473 B/473 B
Dec 13 01:49:11 etcd-node-1 etcd-wrapper[730]: Downloading ACI: 0 B/12.9 MB
Dec 13 01:49:11 etcd-node-1 etcd-wrapper[730]: Downloading ACI: 8.19 KB/12.9 MB
Dec 13 01:49:12 etcd-node-1 etcd-wrapper[730]: Downloading ACI: 1.9 MB/12.9 MB
Dec 13 01:49:13 etcd-node-1 etcd-wrapper[730]: Downloading ACI: 8.63 MB/12.9 MB
Dec 13 01:49:14 etcd-node-1 etcd-wrapper[730]: Downloading ACI: 12.9 MB/12.9 MB
Dec 13 01:49:16 etcd-node-1 etcd-wrapper[730]: image: signature verified:
Dec 13 01:49:16 etcd-node-1 etcd-wrapper[730]: Quay.io ACI Converter (ACI conversion signing key) <support@quay.io>
● etcd-member.service - etcd (System Application Container)
   Loaded: loaded (/usr/lib/systemd/system/etcd-member.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/etcd-member.service.d
           └─20-clct-etcd-member.conf, 30-certs.conf
   Active: active (running) since Wed 2017-12-13 01:49:39 UTC; 2min 27s ago
     Docs: https://github.com/coreos/etcd
  Process: 678 ExecStartPre=/usr/bin/rkt rm --uuid-file=/var/lib/coreos/etcd-member-wrapper.uuid (code=exited, status=254)
  Process: 654 ExecStartPre=/usr/bin/mkdir --parents /var/lib/coreos (code=exited, status=0/SUCCESS)
 Main PID: 730 (etcd)
    Tasks: 8 (limit: 32768)
   Memory: 131.0M
      CPU: 3.059s
   CGroup: /system.slice/etcd-member.service
           └─730 /usr/local/bin/etcd --name=etcd-node-1 --listen-peer-urls=https://192.168.100.11:2380 --listen-client-urls=https://192.168.100.11:2379 --initial-advertise-peer-urls=https://192.168.100.11:23

Dec 13 01:49:39 etcd-node-1 systemd[1]: Started etcd (System Application Container).
Dec 13 01:49:39 etcd-node-1 etcd-wrapper[730]: 2017-12-13 01:49:39.514001 I | embed: ready to serve client requests
Dec 13 01:49:39 etcd-node-1 etcd-wrapper[730]: 2017-12-13 01:49:39.514544 I | embed: serving client requests on 192.168.100.11:2379
Dec 13 01:49:40 etcd-node-1 etcd-wrapper[730]: 2017-12-13 01:49:40.030318 I | rafthttp: peer 521d9cc310c2aecb became active
Dec 13 01:49:40 etcd-node-1 etcd-wrapper[730]: 2017-12-13 01:49:40.030722 I | rafthttp: established a TCP streaming connection with peer 521d9cc310c2aecb (stream Message reader)
Dec 13 01:49:40 etcd-node-1 etcd-wrapper[730]: 2017-12-13 01:49:40.041760 I | rafthttp: established a TCP streaming connection with peer 521d9cc310c2aecb (stream MsgApp v2 writer)
Dec 13 01:49:40 etcd-node-1 etcd-wrapper[730]: 2017-12-13 01:49:40.048895 I | rafthttp: established a TCP streaming connection with peer 521d9cc310c2aecb (stream MsgApp v2 reader)
Dec 13 01:49:40 etcd-node-1 etcd-wrapper[730]: 2017-12-13 01:49:40.073238 I | rafthttp: established a TCP streaming connection with peer 521d9cc310c2aecb (stream Message writer)
Dec 13 01:49:40 etcd-node-1 etcd-wrapper[730]: 2017-12-13 01:49:40.179703 N | etcdserver/membership: set the initial cluster version to 3.1
Dec 13 01:49:40 etcd-node-1 etcd-wrapper[730]: 2017-12-13 01:49:40.180121 I | etcdserver/api: enabled capabilities for version 3.1
Dec 04 23:53:46 etcd-node-1 etcd-wrapper[754]: 2017-12-04 23:53:46.171283 I | raft: 7c35e6112f639de0 received MsgVoteResp from 7c35e6112f639de0 at term 9
Dec 04 23:53:46 etcd-node-1 etcd-wrapper[754]: 2017-12-04 23:53:46.171556 I | raft: 7c35e6112f639de0 [logterm: 1, index: 3] sent MsgVote request to 521d9cc310c2aecb at term 9
Dec 04 23:53:46 etcd-node-1 etcd-wrapper[754]: 2017-12-04 23:53:46.171826 I | raft: 7c35e6112f639de0 [logterm: 1, index: 3] sent MsgVote request to aef3e78ed8950e34 at term 9
Dec 04 23:53:46 etcd-node-1 etcd-wrapper[754]: 2017-12-04 23:53:46.219826 W | rafthttp: health check for peer 521d9cc310c2aecb could not connect: x509: cannot validate certificate for 192.168.100.13 because
Dec 04 23:53:46 etcd-node-1 etcd-wrapper[754]: 2017-12-04 23:53:46.227869 W | rafthttp: health check for peer aef3e78ed8950e34 could not connect: x509: cannot validate certificate for 192.168.100.12 because
Dec 04 23:53:47 etcd-node-1 etcd-wrapper[754]: 2017-12-04 23:53:47.970704 I | raft: 7c35e6112f639de0 is starting a new election at term 9
The excerpt above shows what a certificate problem looks like in the etcd logs (it is taken from an earlier, misconfigured attempt). In order to debug such certificate issues, you can use openssl to try to establish a connection to the other node's etcd directly.
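For instance, something along these lines (here using node 1's peer certificate to talk to node 2's peer port; adjust the paths and addresses to whatever you are checking):

# Attempt a TLS handshake against node 2's peer port, presenting node 1's
# peer certificate and verifying the remote side against our CA.
openssl s_client -connect 192.168.100.12:2380 \
  -cert certs/etcd-node-1-peer.pem \
  -key certs/etcd-node-1-peer-key.pem \
  -CAfile certs/etcd-ca.pem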
If there are any certificate issues, openssl should let you know.
Let's now make sure our etcd cluster is up and running correctly, and that we can run some commands against it. Note that with the configuration above etcd is not listening on 127.0.0.1, and we have to use certificates to connect, so it makes things easier to create a couple of aliases that set these parameters for us. First, here is what happens when etcdctl is run without any options:
cluster may be unhealthy: failed to list members
Error:  client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
; error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused

error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused
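A couple of aliases along these lines fix that (a sketch; run them as root on one of the nodes, where the certificates live under /etc/ssl/certs/etcd and are not readable by the core user, or from the host using the files in the certs directory; the names e2 and e3 are just examples, one for the v2 API and one for the v3 API):

# v2 API (cluster-health, set/get): point etcdctl at a node's client port
# and pass our client certificate, key and CA.
alias e2='etcdctl --endpoints=https://192.168.100.11:2379 --ca-file=/etc/ssl/certs/etcd/ca.pem --cert-file=/etc/ssl/certs/etcd/client.pem --key-file=/etc/ssl/certs/etcd/client-key.pem'

# v3 API (member list, endpoint status): same idea, different flag names.
alias e3='ETCDCTL_API=3 etcdctl --endpoints=https://192.168.100.11:2379,https://192.168.100.12:2379,https://192.168.100.13:2379 --cacert=/etc/ssl/certs/etcd/ca.pem --cert=/etc/ssl/certs/etcd/client.pem --key=/etc/ssl/certs/etcd/client-key.pem'

With these defined, e2 cluster-health reports a healthy cluster: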
member 521d9cc310c2aecb is healthy: got healthy result from https://192.168.100.13:2379
member 7c35e6112f639de0 is healthy: got healthy result from https://192.168.100.11:2379
member aef3e78ed8950e34 is healthy: got healthy result from https://192.168.100.12:2379
cluster is healthy
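The next few outputs come from exercising the v2 key/value API; a session along these lines would produce them (an illustration using the e2 alias above and curl with the same certificates):

e2 set /message Hello     # prints the value just written
e2 get /message           # prints it back

# The same read done over HTTP against two different nodes returns the
# JSON representation of the key.
curl --cacert /etc/ssl/certs/etcd/ca.pem --cert /etc/ssl/certs/etcd/client.pem \
     --key /etc/ssl/certs/etcd/client-key.pem https://192.168.100.11:2379/v2/keys/message
curl --cacert /etc/ssl/certs/etcd/ca.pem --cert /etc/ssl/certs/etcd/client.pem \
     --key /etc/ssl/certs/etcd/client-key.pem https://192.168.100.12:2379/v2/keys/message

e2 rm /message            # prints the previous value as PrevNode.Value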
Hello
Hello
{"action":"get","node":{"key":"/message","value":"Hello","modifiedIndex":12,"createdIndex":12}}
{"action":"get","node":{"key":"/message","value":"Hello","modifiedIndex":12,"createdIndex":12}}
PrevNode.Value: Hello
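The two benchmark runs below look like the output of etcdctl's built-in performance check (a v3 API command available in recent etcdctl releases), presumably along the lines of:

e3 check perf   # runs a 1-minute write benchmark and reports throughput, latency and stddev

The first run passed; the second shows what a failing check looks like.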
60 / 60 Boooo...oooooooooooooooo! 100.00% 1m0s
PASS: Throughput is 150 writes/s
PASS: Slowest request took 0.145293s
PASS: Stddev is 0.021090s
PASS
60 / 60 Boooo...oooooooooooooooo! 100.00% 1m0s
FAIL: Throughput too low: 37 writes/s
Slowest request took too long: 2.187643s
Stddev too high: 0.304261s
FAIL
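Membership and per-endpoint status can be checked with the v3 API, roughly:

e3 member list                 # one line per member: ID, status, name, peer URL, client URL
e3 endpoint status -w table    # leader flag, raft term/index and DB size per endpoint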
521d9cc310c2aecb, started, etcd-node-3, https://192.168.100.13:2380, https://192.168.100.13:2379
7c35e6112f639de0, started, etcd-node-1, https://192.168.100.11:2380, https://192.168.100.11:2379
aef3e78ed8950e34, started, etcd-node-2, https://192.168.100.12:2380, https://192.168.100.12:2379
+---------------------+------------------+---------+---------+-----------+-----------+------------+
|      ENDPOINT       |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+---------------------+------------------+---------+---------+-----------+-----------+------------+
| 192.168.100.11:2379 | 7c35e6112f639de0 |  3.1.10 |   22 MB |      true |       105 |       9034 |
| 192.168.100.12:2379 | aef3e78ed8950e34 |  3.1.10 |   22 MB |     false |       105 |       9034 |
| 192.168.100.13:2379 | 521d9cc310c2aecb |  3.1.10 |   22 MB |     false |       105 |       9034 |
+---------------------+------------------+---------+---------+-----------+-----------+------------+
[1] 18062
[2] 18063
[3] 18064
Shutting down domain 8
Shutting down domain 7
Shutting down domain 9
[1]   Done    xl shutdown etcd-node-1
[2]-  Done    xl shutdown etcd-node-2
[3]+  Done    xl shutdown etcd-node-3
First of all, note how the configuration we gave to Ignition above ended up in the node's etcd systemd unit; you can see this by printing out the unit file from a console inside the node.
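For example, systemctl cat prints the unit file together with all of its drop-ins:

systemctl cat etcd-member.service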
# /usr/lib/systemd/system/etcd-member.service
[Unit]
Description=etcd (System Application Container)
Documentation=https://github.com/coreos/etcd
Wants=network.target
Conflicts=etcd.service
Conflicts=etcd2.service

[Service]
Type=notify
Restart=on-failure
RestartSec=10s
TimeoutStartSec=0
LimitNOFILE=40000
Environment="ETCD_IMAGE_TAG=v3.1.10"
Environment="ETCD_NAME=%m"
Environment="ETCD_USER=etcd"
Environment="ETCD_DATA_DIR=/var/lib/etcd"
Environment="RKT_RUN_ARGS=--uuid-file-save=/var/lib/coreos/etcd-member-wrapper.uuid"
ExecStartPre=/usr/bin/mkdir --parents /var/lib/coreos
ExecStartPre=-/usr/bin/rkt rm --uuid-file=/var/lib/coreos/etcd-member-wrapper.uuid
ExecStart=/usr/lib/coreos/etcd-wrapper $ETCD_OPTS
ExecStop=-/usr/bin/rkt stop --uuid-file=/var/lib/coreos/etcd-member-wrapper.uuid

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/etcd-member.service.d/20-clct-etcd-member.conf
[Service]
ExecStart=
ExecStart=/usr/lib/coreos/etcd-wrapper $ETCD_OPTS \
  --name="etcd-node-1" \
  --listen-peer-urls="https://192.168.100.11:2380" \
  --listen-client-urls="https://192.168.100.11:2379" \
  --initial-advertise-peer-urls="https://192.168.100.11:2380" \
  --initial-cluster="etcd-node-1=https://192.168.100.11:2380,etcd-node-2=https://192.168.100.12:2380,etcd-node-3=https://192.168.100.13:2380" \
  --initial-cluster-state="new" \
  --initial-cluster-token="etcd-token" \
  --advertise-client-urls="https://192.168.100.11:2379"

# /etc/systemd/system/etcd-member.service.d/30-certs.conf
[Service]
Environment="ETCD_CERT_FILE=/etc/ssl/certs/etcd/server.pem"
Environment="ETCD_KEY_FILE=/etc/ssl/certs/etcd/server-key.pem"
Environment="ETCD_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/ca.pem"
Environment="ETCD_CLIENT_CERT_AUTH=true"
Environment="ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.pem"
Environment="ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer-key.pem"
Environment="ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/certs/etcd/ca.pem"
Environment="ETCD_PEER_CLIENT_CERT_AUTH=true"
As you can see, the built-in etcd systemd unit is extended by two additional .conf files created by Ignition: one with the startup command line, and one, from the drop-in we created, with the certificates to be used.
Let’s now continue to the next part of the guide