URL: https://dev.to/irfanmiral/the-ansible-playbook-i-run-on-every-new-server-5fkp

The Ansible Playbook I Run on Every New Server


I've written before about the checklist I run through on every new server: a non-root user, key-only SSH, a default-deny firewall, Fail2ban, unattended upgrades. Doing that by hand takes under an hour and isn't hard. But "under an hour, by hand, for every server" adds up fast once you're managing more than two or three of them, and manual steps are exactly where small inconsistencies creep in: one server gets MaxAuthTries 3 and another doesn't, because whoever set it up that day was in a hurry.

The fix is turning the checklist into a playbook. Same steps, same order, every time, and idempotent enough that running it again on a server that's already configured changes nothing.

The playbook

---
- name: Baseline hardening for a new server
  hosts: new_servers
  become: true

  vars:
    admin_user: deploy
    # Public key is read from the control machine, not the target server
    ssh_public_key: "{{ lookup('file', '~/.ssh/id_ed25519.pub') }}"

  tasks:
    - name: Create admin user
      user:
        name: "{{ admin_user }}"
        groups: sudo
        append: true  # add to sudo without dropping other group memberships
        shell: /bin/bash
        create_home: true

    - name: Add SSH key for admin user
      authorized_key:
        user: "{{ admin_user }}"
        key: "{{ ssh_public_key }}"

    # sshd keeps the first value it reads for each keyword, so check that no
    # earlier-sorting drop-in in sshd_config.d sets these options differently.
    - name: Harden SSH config (overrides cloud-init defaults)
      copy:
        dest: /etc/ssh/sshd_config.d/99-hardening.conf
        content: |
          PermitRootLogin no
          PasswordAuthentication no
          MaxAuthTries 3
        mode: '0644'
      notify: restart sshd

    - name: Install UFW and Fail2ban
      apt:
        name: [ufw, fail2ban]
        state: present
        update_cache: true

    - name: Configure UFW defaults
      ufw:
        direction: "{{ item.direction }}"
        policy: "{{ item.policy }}"
      loop:
        - { direction: incoming, policy: deny }
        - { direction: outgoing, policy: allow }

    # Port 22 is allowed before UFW is enabled so the SSH session survives
    - name: Allow required ports
      ufw:
        rule: allow
        port: "{{ item }}"
        proto: tcp
      loop: ['22', '80', '443']

    - name: Enable UFW
      ufw:
        state: enabled

    - name: Enable Fail2ban
      systemd:
        name: fail2ban
        enabled: true
        state: started

    - name: Install unattended-upgrades
      apt:
        name: unattended-upgrades
        state: present

  handlers:
    # Runs once at the end of the play, and only if the SSH config task changed
    - name: restart sshd
      systemd:
        name: ssh
        state: restarted
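The last task installs the unattended-upgrades package, and whether that alone turns on the periodic run depends on the distribution's defaults. A minimal sketch of an extra task that makes it explicit is shown here; it is not part of the original playbook, and it assumes a Debian or Ubuntu host where APT reads /etc/apt/apt.conf.d/20auto-upgrades:

```yaml
# Assumed addition to the tasks list in baseline.yml, not the author's original task.
- name: Enable periodic unattended upgrades
  copy:
    dest: /etc/apt/apt.conf.d/20auto-upgrades
    content: |
      APT::Periodic::Update-Package-Lists "1";
      APT::Periodic::Unattended-Upgrade "1";
    owner: root
    group: root
    mode: '0644'
```

Because the file content is fixed, the task reports a change only when the file is missing or differs, which keeps later runs quiet.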

Running it

A fresh server gets added to the inventory and the playbook runs against just that host:

ansible-playbook -i inventory.ini baseline.yml -l new-server-01 -u root

The first run must connect as root, since it's the only user available on a fresh server. Once the playbook completes, root login is disabled and deploy is your entry point for every run after:

ansible-playbook -i inventory.ini baseline.yml -l new-server-01 -u deploy

The first run does all the work: it creates the user, locks down SSH, and sets up the firewall. The handler only restarts sshd if the config actually changed, so the very first run is the only one where that happens. Every run after is a no-op confirmation that nothing has drifted.
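Later runs as deploy still use become: true, so deploy has to be able to escalate with sudo non-interactively. The playbook creates the user without a password, so one way to cover that is a sudoers drop-in. The task below is a sketch under that assumption, not something from the original playbook; the file name 90-deploy and the NOPASSWD policy are illustrative choices:

```yaml
# Assumed addition to the tasks list in baseline.yml.
# The drop-in name (90-<admin_user>) and the NOPASSWD policy are assumptions.
- name: Allow admin user to use sudo without a password
  copy:
    dest: "/etc/sudoers.d/90-{{ admin_user }}"
    content: "{{ admin_user }} ALL=(ALL) NOPASSWD:ALL\n"
    owner: root
    group: root
    mode: '0440'
    # visudo checks the syntax before the file is moved into place
    validate: 'visudo -cf %s'
```

If passwordless sudo is too permissive for your environment, the alternative is to give deploy a password and pass --ask-become-pass on later runs.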

Why idempotency is the actual point

The real value here isn't the time saved on day one; typing the commands manually isn't slow. It's that six months later, when I'm not sure whether a particular server got the full treatment or was set up in a rush during an incident, I can run the playbook again and find out. If everything's already in place, Ansible reports zero changes and I move on. If something's missing, it gets fixed on the spot, with no need to remember which of the four or five manual steps was skipped.
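Zero changes tells you the files match, but not necessarily that sshd is using those values: sshd keeps the first value it reads for each keyword, so a drop-in that sorts before 99-hardening.conf could still take precedence. A read-only check of the effective configuration might look like the sketch below. The play and the variable name sshd_effective are hypothetical, and it assumes sshd -T can run without a connection spec on the target:

```yaml
# Hypothetical verification play; it changes nothing on the server.
- name: Verify effective SSH hardening
  hosts: new_servers
  become: true

  tasks:
    - name: Read the effective sshd configuration
      command: sshd -T
      register: sshd_effective
      changed_when: false  # read-only command, never counts as a change

    - name: Assert the hardening options are in effect
      assert:
        that:
          # sshd -T prints keywords in lowercase, one setting per line
          - "'permitrootlogin no' in sshd_effective.stdout_lines"
          - "'passwordauthentication no' in sshd_effective.stdout_lines"
          - "'maxauthtries 3' in sshd_effective.stdout_lines"
        fail_msg: "Effective sshd settings differ from the baseline"
```

Matching against stdout_lines compares whole lines, so a value such as 30 would not satisfy the MaxAuthTries check by accident.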

This playbook is intentionally small. It doesn't install application stacks or configure anything project-specific. It's the floor every server stands on before anything else gets layered on top. Keeping it separate from application playbooks means it stays stable, and a baseline that doesn't change often is one you can trust without re-reading it every time.


Originally published at irfanmiral.com
