- In-depth experience supporting Linux servers in large environments with over 500 servers, preferably in highly secure and highly available 24/7 environments. Deep troubleshooting expertise to quickly identify server, network, and application issues, and methodical approach to implementing, testing, and measuring changes to remediate issues. Experience troubleshooting complex application flows which span multiple clusters of systems. Knowledge of common OS and application level settings to optimize system performance
- Experience with common system administration scripting languages such as bash and Python
- In-depth experience using Ansible (or similar tools) to environments with over 500 servers. Experience designing complex static and dynamic inventories, variable structures, and playbooks. Writing custom Ansible roles from scratch with advanced features of Ansible such as Jinja2 templating, complex logic flows, dynamic role/task includes, and handlers. Strong focus on ensuring idempotency & reusability and following best practices where they make sense
- In-depth experience creating well organized modules with Terraform to ensure cloud provisioning automation can be easily reused. Experience creating image build pipelines with Packer
- Deep understanding of networking and security best practices
- Experience with Hashicorp Vault or other secure storage tools
- Experience working in environments with high security requirements
Nice to have, but not required
- Ansible role testing with Molecule (or similar tools) and Ansible module development experience
- Knowledge of, or experience with, Proof-of-Work and Proof-of-Stake decentralized consensus mechanisms used in blockchains
- Experience running applications on Kubernetes
- Experience creating CI/CD pipelines from scratch to automate infrastructure provisioning and deploy applications (GitHub Actions or similar tools)