Tuesday, November 27, 2018

TIL: Ansible engine raw module "needs" gather_facts: no

Okay, the title says it all, but let's unpack that.

Ansible playbooks are instructions for running ansible modules against target hosts. They can be very, very simple. Here's one of the simplest:
---
- name: A very simple playbook
  hosts: all
  tasks:
    - name: pingo
      ping:

This merely runs the ansible "ping" module on "all" hosts (i.e., every host in the inventory supplied when this playbook is called).

A note about the ping module: it is not the normal networking definition of "ping". Network folk will be accustomed to using "ping" to send an ICMP echo request to a node (at which point the node would typically send back an ICMP echo reply). Rather, the ansible "ping" module is a check that ansible can log in to the node and that the basic needs of ansible are supported there, i.e., python is installed.

So... the inquiring mind asks: what do you do if python is NOT installed? Can you still take advantage of some of Ansible? But of course.

The ansible "raw" module skips the usual module machinery and simply runs a command over ssh. A raw task is basically the equivalent of the following:
# raw: uptime
# ssh targetnode and_execute_this_command
ssh target.example.net uptime

So here we'd get the uptime of the target node (assuming it was running ssh, we had login authority, and uptime was installed and in the default path of the effective user).

So, it seems like it would be straightforward to create an ansible playbook that takes advantage of the raw module.

---
- name: raw uptime
  hosts: all
  tasks:
  - name: Run a raw command
    raw: uptime

And here we run into issues. This playbook won't work on a node that doesn't have python installed. (It will work on one that does.) Why is that? Because of the "secret sauce" called fact gathering. By default, every play runs the ansible "setup" module to gather facts on the node before running any of the explicit tasks. The setup module is an implicit task, as noted in the module reference: "[t]his module is automatically called by playbooks". And setup, like most modules, needs python on the node.

NOTE: I've scattered some handy links within this document so that you can learn more about these. I'd recommend following them and then coming back here after you have familiarized yourself with ansible, modules, ping, raw, setup, and gather_facts.

So, how do we make this work then? If you read the gather_facts link, you probably know that you can bypass it very simply: you set "gather_facts" to no in your playbook. Consequently, you end up with this as the right playbook for a node without python where you want to know the uptime.

---
- name: raw uptime
  hosts: all
  gather_facts: no
  tasks:
  - name: Run a raw command
    raw: uptime

So, a simple one-line addition.
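
One wrinkle when running this: ansible-playbook doesn't print a task's output by default, so you won't actually see the uptime unless you run with -v. Here's a minimal sketch of one way around that: register the result of the raw task and print it with the debug module. The variable name uptime_result is just one I made up. As far as I know, debug runs on the control machine rather than on the target, so it shouldn't need python on the node either.

```yaml
---
- name: raw uptime, with the output shown
  hosts: all
  gather_facts: no
  tasks:
  - name: Run a raw command and keep the result
    raw: uptime
    register: uptime_result   # hypothetical variable name

  - name: Print what the node sent back
    debug:
      var: uptime_result.stdout_lines
```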



And how did I get in this situation? One of the most common cloud operating systems (aka cloud images) is one called "cirros". Cirros is a very minimal linux and, as such, it does not include python. Moreover, there really isn't an effective way to "add" python to it (though it could possibly be done with a statically built binary; I'll leave that as an exercise for the reader).

Cirros is frequently used in a cloud environment (e.g., OpenStack) to validate that the cloud itself is working well. From within cirros you can log in (as it provides transparent credentials) and check on the networking, etc. Basically, it's a quick and dirty way to make sure your cloud is operating as intended.

I regularly spin up one or more cirros instances as soon as I build an openstack--whether that be an all-in-one devstack or an entire production cloud. In both cases, cirros is my "go to" tool to validate the cloud. (Thanks Scott.)



... and one more thing: you would normally just run uptime using the command module. But doing so requires the python infrastructure ansible relies on. Here's that "normal" or typical way.

---
- name: A very simple command
  hosts: all
  tasks:
    - name: uptime
      command: uptime

And even if you add "gather_facts: no" to it, the command module itself still requires python, so you really, really need both the raw module and the "gather_facts: no" setting.
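
If you aren't sure which kind of node you're dealing with, raw can be used to ask. The following is only a sketch under a few assumptions: that the node's shell understands "command -v" (POSIX shells do, and I'd expect a busybox shell to as well), that the interpreter would be named python3 or python, and that py_check is simply a name I picked.

```yaml
---
- name: does this node have python?
  hosts: all
  gather_facts: no
  tasks:
  - name: Look for a python interpreter using only raw
    raw: command -v python3 || command -v python
    register: py_check        # hypothetical variable name
    failed_when: false        # a missing python is an answer, not an error
    changed_when: false       # we are only looking, not changing anything

  - name: Report the result
    debug:
      msg: "{{ ('python found: ' ~ (py_check.stdout | trim)) if py_check.rc == 0 else 'no python here, stick to raw' }}"
```

If the check comes back empty-handed, the raw playbook above is the one to use; otherwise the command module version will do.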