A beta is coming soon! Aside from async tasks, the master branch
is looking great. Since the last update there have been many features and
fixes, but there are important forks in the road ahead, particularly around
efficient support for many-host runs. Read on..
File transfer previously worked by constructing one RPC representing the
complete file, which for large files resulted in an explosion in memory usage
on each machine as the message was enqueued and transferred, with communication
at each hop blocked until the message was delivered. A rewrite has been needed
since the original code was written, but a simple solution proved elusive.
Today file transfer is all but solved: files are streamed in 128KiB-sized
messages, using a dedicated service that aggregates pending transfers by their
most directly connected stream, serving one file at a time before progressing
to the next transfer. An initial burst of 128KiB chunks is generated to fill a
link with a 1MiB BDP, with
further chunks sent as acknowledgements begin to arrive from the receiver. As
an optimization, files 32KiB or smaller are still delivered in a single RPC,
avoiding one roundtrip in a common scenario.
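For illustration, here is the scheme in miniature as a Python sketch: the
128KiB, 1MiB and 32KiB figures come from the description above, while the
send/acknowledgement plumbing is an invented stand-in rather than Mitogen's
real API.

    CHUNK_SIZE = 128 * 1024       # one message per 128KiB chunk
    WINDOW = 1024 * 1024          # initial burst sized for a 1MiB BDP
    SINGLE_RPC_LIMIT = 32 * 1024  # files this small go in a single RPC

    def stream_file(data, send, wait_ack):
        # `send` and `wait_ack` stand in for the real message bus.
        if len(data) <= SINGLE_RPC_LIMIT:
            send(data)            # common case: one round-trip saved
            return
        chunks = [data[i:i + CHUNK_SIZE]
                  for i in range(0, len(data), CHUNK_SIZE)]
        in_flight = 0
        while chunks or in_flight:
            # Burst until the window is full...
            while chunks and in_flight * CHUNK_SIZE < WINDOW:
                send(chunks.pop(0))
                in_flight += 1
            # ...then let each acknowledgement clock out the next chunk.
            wait_ack()
            in_flight -= 1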
Compared to sftp(1) or scp(1), the new service has vastly
lower setup overhead (1 RTT vs. 5) and far better safety properties, ensuring
concurrent use of the API by unrelated ansible-playbook runs
cannot create a situation where an inconsistent file may be observed by users,
or a corrupt
file is deployed with no indication a problem exists.
Since file transfer is implemented in terms of Mitogen's message bus, it is
agnostic to Connection Delegation, allowing streaming file transfers between
proxied targets regardless of how the connection is set up.
Some minor problems remain: the scheduler cannot detect a timed-out transfer,
risking a cascading hang when Connection Delegation is in use. This is not a
regression, since stock Ansible does not support this mode of operation. In
both cases during normal operation, the timeout will eventually be
noticed when the underlying SSH connection times out.
Connection Delegation enables Ansible to use one or more intermediary machines to
reach a target machine or container, with connections and code uploads
deduplicated at each hop in the path. For an Ansible run against many
containers on one target host, only one SSH connection to the target need
exist, and module code need only be uploaded once on that connection.
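At the Mitogen library level, delegation looks roughly like the following
sketch (hostnames invented; consult Mitogen's documentation for the definitive
API):

    import mitogen.master

    broker = mitogen.master.Broker()
    router = mitogen.master.Router(broker)
    try:
        # One SSH connection to the intermediary...
        bastion = router.ssh(hostname='bastion.example.com')
        # ...through which further connections are proxied, with module
        # code uploaded to the bastion only once.
        web1 = router.ssh(hostname='web1', via=bastion)
        web2 = router.ssh(hostname='web2', via=bastion)
    finally:
        broker.shutdown()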
This feature exists today and works well, however some
important functionality is still missing. Presently intermediary connection
setup is single-threaded, non-Python (i.e. Ansible) module uploads are
duplicated, and the code to infer intermediary connection configurations using
the APIs available in Ansible is.. hairy at best.
Fixing deduplication and single-threaded connection setup entails starting a
service thread pool within each interpreter that will act as an intermediary.
This requires some reworking of the nascent
service framework, also making it easier to use for non-Ansible programs,
and lays the groundwork for Topology-aware File Synchronization.
From the department of surprises, this one is a true classic. Ansible supports
an undocumented (upstream docs patch) but
nonetheless commonly used mechanism for bundling third party modules and
overriding built-in support modules as part of the ZIP file deployed to the
target. It implements this by virtualizing a core Ansible package namespace:
ansible.module_utils, causing what Python finds there to vary on a
per-task basis, and crucially, to have its implementation diverge entirely from
the equivalent import in the Ansible controller process.
Suffice to say I nearly lost my mind on discovering this "feature", not
due to the functionality it provides, but the manner in which it opts to
provide it. Rather than loading a core package namespace as a regular Python
package using Mitogen's built-in mechanism, every Ansible module must undergo
additional dependency scanning using its unique search path, and any
dependencies found must correctly override existing loaded modules appearing in
the target interpreter's namespace at runtime.
Given Mitogen's intended single-reusable-interpreter design, there is no way to
support this without tempting strange behaviours appearing across tasks whose
ansible.module_utils search path varies. While it is easy
to arrange for ansible.module_utils.third_party_module to be
installed, it is impossible to uninstall it while ensuring every reference to
the previous implementation, including instances of every type defined by it,
are extricated from the reusable interpreter post-execution, which is necessary
if the next module to use the interpreter imports an entirely distinct
implementation of ansible.module_utils.third_party_module.
Today, instead, the interpreter forks when an extended or overridden module is
found, and a custom importer is used to implement the overrides. This
introduces an unavoidable inefficiency when the feature is in use, but it is
still far better than always forking, or running the risk of varying
module_utils search paths causing unfixable crashes.
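To make the mechanism concrete, here is a toy rendition of the idea in Python
3 syntax (not Mitogen's actual importer): a meta-path finder installed in the
forked child serves overridden ansible.module_utils submodules from an
in-memory mapping, shadowing whatever would otherwise be imported.

    import importlib.abc
    import importlib.util
    import sys

    # Module name -> source text, as discovered by dependency scanning.
    OVERRIDES = {
        'ansible.module_utils.third_party_module': 'VERSION = "bundled"\n',
    }

    class OverrideFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
        def find_spec(self, fullname, path=None, target=None):
            if fullname in OVERRIDES:
                return importlib.util.spec_from_loader(fullname, self)
            return None

        def exec_module(self, module):
            source = OVERRIDES[module.__name__]
            exec(compile(source, module.__name__, 'exec'), module.__dict__)

    # Installed only in the forked child, so overrides never leak back
    # into the reusable parent interpreter.
    sys.meta_path.insert(0, OverrideFinder())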
Presently the container must have Python installed, matching Ansible's
existing behaviour, but it occurred to me that when the host machine has
Python installed, there is no reason Python needs to exist within the
container. This would be a powerful feature made easy by Mitogen's design,
and in a common use case, would support running auditing/compliance playbooks
against app containers that were otherwise never customized for use with
Ansible.
Su Become Method Support
Low-hanging fruit from the original crowdfunding plan. Now su(1) may
be used for privilege escalation as easily as sudo(1).
Sudo/Su Connection Types
To support testing and somewhat uncommon use cases where a large number of user
accounts may be targeted for parallel deployment on a small number of
machines, there now exist explicit mitogen_sudo and
mitogen_su connection types that, in combination with Connection
Delegation, allow a single SSH connection to exist to a remote machine while
exposing user accounts as individual (and therefore parallelizable) targets in
the inventory.
This sits somewhere between "hack" and "gorgeous" (I really have no idea which),
however it does make it simple to exploit Ansible's parallelism in certain
setups, such as traditional web hosting where each customer exists as a UNIX
account on a small number of machines.
Unidirectional routing now exists and is always enabled for Ansible. This
prohibits what was previously a new communication style available to targets:
one that, although ideally benign and potentially very powerful, fundamentally
altered Ansible's security model and risked solution acceptance. It was
possible for targets to send each other messages, and although permission
checks occur on reception and thus the messages should be harmless, this
represented the ability for otherwise air-gapped networks to be temporarily
bridged for the duration of a run.
Mitogen supports new Blob() and Secret() string wrappers
whose repr() contains a substitute for the actual value. These are
employed in the Ansible extension, ensuring passwords and bulk file transfer
data are no longer logged when verbose output is enabled. The types are
preserved on deserialization, ensuring log messages generated by targets
receive identical treatment.
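Usage is as simple as wrapping the sensitive value; the wrappers live in
mitogen.core, and the repr text suggested below is indicative rather than
exact:

    import mitogen.core

    password = mitogen.core.Secret(u'hunter2')
    print(repr(password))          # substitute text, never 'hunter2'

    payload = mitogen.core.Blob(b'x' * 1048576)
    print(repr(payload))           # a short summary, not 1MiB of noise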
Ongoing work on the asynchronous task implementation has caused it to evolve
once again, this time to make use of a new subtree
detachment feature in the core library. The new approach is about 70% of
what is needed for the final design, with one major hitch remaining.
Since an asynchronous task must outlive its parent, it must have a copy of
every dependency needed by the module it will execute prior to disconnecting
from the parent. This is exorbitantly fiddly work, interacting with many
aspects including not least custom module_utils, and represents
the last major obstacle in producing a functionally complete extension release.
Industrial grade multiplexing
Mitogen now supports swapping its IO multiplexer implementation
depending on the host operating system, blasting through the maximum file
descriptor limit of select(2), and ensuring this is no longer a
hindrance for many-target runs. Children initially use the select(2)
multiplexer (tiny and guaranteed available) until they become parents, when the
implementation is transparently swapped for the real deal.
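The selection logic amounts to a few feature tests. The sketch below is
illustrative only; the real poller classes live inside Mitogen:

    import select

    class SelectPoller: pass   # wraps select.select(); always available
    class EpollPoller: pass    # wraps select.epoll(); Linux, no FD limit
    class KqueuePoller: pass   # wraps select.kqueue(); BSD and macOS

    def preferred_poller():
        # Children keep SelectPoller; a parent upgrades to the best
        # facility the OS offers, escaping select(2)'s FD_SETSIZE limit.
        if hasattr(select, 'epoll'):
            return EpollPoller
        if hasattr(select, 'kqueue'):
            return KqueuePoller
        return SelectPoller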
In future some interface tweaks are desirable to make full use of the new
multiplexers: at least epoll(7) supports options that significantly
reduce the system calls necessary to configure it. Although I have not measured
a performance regression due to these calls, their presence is bothersome.
As expected, growing pains appeared when real multiplexing was implemented. For
testing I adopted a network of VMs running DebOps' common.yml, with a
quota for up to 500 targets, but so far, it is not possible to approach that
without drowning in the kinks that start to appear. While some of these almost
certainly lie on the Mitogen side, when profiling with only 40 targets enabled,
inefficiencies in Mitogen are buried in the report by extreme inefficiencies
present in Ansible itself.
And with that we reach a nexus: we have almost exhausted what can be
accomplished working from the bottom-up, profiling on a micro scale is no
longer sufficient to meet project goals, while fixing problems identified
through profiling on a macro scale exceeds the project scope. Therefore,
(lightning bolts, wild cackles), a new plan emerges..
Branching for a beta
With the exception of async tasks I consider the master branch to be in
excellent health - for smaller target counts. For larger runs, wider-reaching
work is necessary, but it does not make sense to disrupt the existing design
due to it. Therefore master will be branched with the new branch
kept open for fixes, not least the final pieces of async, while continuing work
in parallel on a new increment.
Vanilla Ansible forks each time it executes a task, with the corresponding
action plug-in gaining control of the main thread until completion, upon which
all state aside from the task result is lost. When running under the extension,
a connection multiplexer process is forked once at startup, and a separate
broker thread exists in each forked task subprocess that connects back to the
connection multiplexer process over a UNIX socket - necessary in the current
design to have a persistent location to manage connections.
The new design comes in the form of a complete reworking of
linear strategy. Today's extension wraps Ansible's strategies while
preserving their process and execution model. To implement the enhancements
above sensibly, additional persistence is required and it becomes necessary to
tackle a strategy implementation head-on.
The old desire for per-CPU connection multiplexers is incorporated, but moves
those multiplexers back into Ansible, much like the pre-crowdfund extension.
The top-level controller process gains a Mitogen broker thread with per-CPU
forked children acting as connection multiplexers, and hosting service threads
on which action plug-ins can sleep. Unlike vanilla Ansible, these processes
exist for the duration of the run rather than per-task.
From the vantage point of only $ncpus processes, it is easy to fix
template precompilation, plug-in path caching, connection caching,
target<->worker affinity, and ensuring task variable generation is
parallelized. Some sizeable obstacles exist, not least:
Liberal shared data structure mutation in the task executor that
must be fixed to handle threading, mostly contained to
Preserving the existing callback plug-in model. Callbacks must always fire
in the top-level process.
Synchronization or serialization overhead: pick one. Either the strategy
logic runs duplicated in each child (requiring coordination with the
top-level process), or it runs once in the parent, and configuration must
be serialized for every task.
Can't this be done upstream?
It should, but I've
experimented and there simply isn't time. If >1 week is reasonable to add missing
documentation, there is no hope real patches will land before full-time
work must conclude. For upstreaming to happen the onus lies with the 20+ strong
permanent team; it's simply not possible for me to commit unbounded time to
landing even trivial changes, a far cry from contributing occasional patches
to a privately maintained fork.
At least 16k words have been spent since conversations started around September
2017, and while they bore some fruit over time, few actionable outcomes have
resulted, and the detectable levels of team-originated engagement regarding the
work have been minimal. There is no expectation of fireworks, however it may be
helpful to realize that after 3 months no evidence exists of any team member
testing the code and experiencing success or failure, let alone a report of such.
It's sufficient to say that after so long I find this increasingly troublesome,
and while I cannot hope to understand internal priorities, as an outside
contributor funded by end users, soliciting engagement on a well-documented
enhancement that in some scenarios nets an order of magnitude performance
improvement to a commercial product, some rather basic questions come to mind.
There is a final uneasy aspect to upstreaming, and it is that of being left
with the task of cleaning up, with no guarantee the mess won't simply return.
Some of this code is in an abject (253 LOC, 37 locals)
state (279 LOC, 24 locals) of
sin (306 LOC, 38 locals) - poor form for
2018, in a product less than 72 months old that has been funded almost
since inception. While I have begun refactoring the strategy plug-in within the
confines of the Mitogen repository, responsibility for benefitting from that
work in mainline rests with others.
A very rough branch exists
for this, and I’m landing volleys of fixes when I have downtime between bigger
pieces of work. Ideally this should have been ready for the end of April, but
it may take a few weeks more.
I originally hoped to have a clear board before starting this, instead it is
being interwoven as busywork when I need a break from whatever else I'm working
on.
Done: multiplexer throughput
The situation has improved massively.
Hybrid TTY/socketpair mode is a
thing and as promised it significantly helps, just not quite as much as I'd
hoped.
Today on a 2011-era Macbook Pro Mitogen can pump an SSH client/daemon at around
13MB/sec, whereas scp in the same configuration hits closer to 19MB/sec. In
the case of SSH, moving beyond this is not possible without a patched SSH
installation, since SSH hard-wires its buffer sizes around 16KB, with no
ability to override them at runtime.
With multiple SSH connections that 13MB should cleanly multiply up, since every
connection can be served in a single IO loop iteration.
A bunch of related performance fixes were landed, including removal of yet
another special case for handling deferred function calls, only taking locks
when necessary, and reducing the frequency of the stream implementations
modifying the status of their descriptors' readability/writeability.
As we’re in the ballpark of existing tools, I’m no longer considering this
as much of a priority as before. There is definitely more low-hanging fruit,
but out-of-the-box behaviour should no longer raise eyebrows.
Done: task isolation
As before, by default each script is compiled once, however it is now
re-executed in a spotless namespace prior to each invocation, working around
any globals/class variable sharing issues that may be present. The cost of this
is negligible, on the order of 100 usec.
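The trick is nothing more than compile-once, exec-per-invocation into a fresh
namespace. A self-contained illustration:

    module_source = '''
    STATE = []                 # would otherwise leak between tasks
    STATE.append(1)
    print(len(STATE))
    '''
    code = compile(module_source, '<ansible_module>', 'exec')  # once

    for _ in range(3):
        namespace = {'__name__': '__main__'}   # spotless every time
        exec(code, namespace)                  # prints 1, 1, 1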
When this is insufficient, a mitogen_task_isolation=fork per-task variable
exists to allow explicitly forcing a particular module to run in a new process.
Enabling this by default causes something on the order of a 33% slowdown, which
is much better than expected, but still not good enough to enable forking by
default.
Aside from building up a blacklist of modules that should always be forked,
task isolation is pretty much all done, with just a few performance
regressions remaining to fix in the forking case.
Done: exotic module support
Every style of Ansible module is supported aside from the prehistoric
"module replacer" type. That means today all of these work and are covered by automated tests:
Built-in new-style Python scripts
User-supplied new-style Python scripts
Ancient key=value style input scripts
Statically linked Go programs
Python module support was updated to remove the monkey-patching in use before.
Instead, sys.stdin, sys.stdout and sys.stderr are redirected to
StringIO objects, allowing a much larger variety of custom user scripts to be
run in-process even when they don’t use the new-style Ansible module APIs.
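The shape of that redirection, reduced to its essence (the extension's real
code also feeds stdin content and emulates exit behaviour more carefully):

    import sys
    from io import StringIO

    def run_in_process(code, stdin_text=''):
        saved = sys.stdin, sys.stdout, sys.stderr
        sys.stdin = StringIO(stdin_text)
        sys.stdout, sys.stderr = StringIO(), StringIO()
        try:
            exec(code, {'__name__': '__main__'})
        except SystemExit:
            pass   # many modules finish via sys.exit()
        finally:
            out, err = sys.stdout.getvalue(), sys.stderr.getvalue()
            sys.stdin, sys.stdout, sys.stderr = saved
        return out, err

    print(run_in_process(compile('print("hi")', '<module>', 'exec')))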
Done: free strategy support
The "free" strategy can now be used by specifying ANSIBLE_STRATEGY=mitogen_free. The mitogen strategy is now an alias of mitogen_linear.
Done: temporary file handling
This should be identical to Ansible’s handling in all cases.
Done: interpreter recycling
An upper bound exists to prevent a remote machine from being spammed with
thousands of Python interpreters, which was previously possible when e.g. using
a with_items loop that templatized become_user.
Once 20 interpreters exist, the extension shuts down the most recently created
interpreter before starting a new one. This strategy isn't perfect, but it
should suffice to avoid raised eyebrows in most common cases for the time
being.
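The policy is nothing fancier than a capped list; a toy rendition (names
invented, the real bookkeeping lives in the extension):

    MAX_INTERPRETERS = 20
    _live = []   # ordered oldest -> newest

    def spawn_interpreter(connect, key):
        if len(_live) >= MAX_INTERPRETERS:
            _live.pop().shutdown()   # drop the most recently created
        interp = connect(key)        # e.g. a new become_user interpreter
        _live.append(interp)
        return interp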
Done: precise standard IO emulation
Ansible’s complex semantics for when it does/does not merge stdout and
stderr during module runs are respected in every case, including
emulation of extraneous \r characters. This may seem like a tiny and
pointless nit, however it is almost certainly the difference between a tested
real-world playbook succeeding under the extension or breaking horribly.
Done: async tasks
We’re on the third iteration of asynchronous tasks, and I really don’t want to
waste any more time on it. The new implementation works a lot more like
Ansible’s existing implementaion, for as much as that implementation can be
said to “work” at all.
Done: better error messages
Connection errors no longer crash with an inscrutable stack trace, but trigger
Ansible’s internal error handling by raising the right exception types.
Mitogen’s logging integration with the Ansible display framework is much
improved, and errors and warnings correctly show up on the console in red
without having to specify -vvv.
Still more work to do on this when internal RPCs fail, but that’s less likely
to be triggered than a connection error.
New debugging mode
An “emergency” debugging mode has been added, in the form of
MITOGEN_DUMP_THREAD_STACKS=1. When this is present, every interpreter will
dump the stack of every thread into the logging framework every 5 seconds,
allowing hangs to be more easily diagnosed directly from the controller.
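The underlying mechanism is simple enough to sketch in a few lines, assuming
nothing of Mitogen's internals: a daemon thread walks sys._current_frames()
on an interval and hands each stack to the logging framework.

    import logging, sys, threading, time, traceback

    def dump_stacks(interval=5):
        while True:
            time.sleep(interval)
            for thread_id, frame in sys._current_frames().items():
                stack = ''.join(traceback.format_stack(frame))
                logging.getLogger('stacks').debug(
                    'thread %x:\n%s', thread_id, stack)

    logging.basicConfig(level=logging.DEBUG)
    threading.Thread(target=dump_stacks, daemon=True).start()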
While adding this, it struck me that there is a really sweet piece of
functionality missing here that would be easy to add – an interactive
debugger. This might turn up in the form of an in-process web server allowing
viewing the full context hierarchy, and running code snippets against remotely
executing stacks, much like Werkzeug’s interactive debugger.
In addition to simply not being my focus recently, a lot of the new
functionality has introduced import statements that impact code running in
the target, and so performance has likely slipped a little from the original
posted benchmarks, most likely during run startup in the presence of a
high-latency link.
I will be back to investigate these problems (and fix those for which no
investigation is required – the module loader!) once all remaining
functionality is stable.
This seemingly simple function has required the greatest deal of thought of
any issue I've encountered so far. The initial problem relates to flow
control, and the absence of any natural mechanism to block a producer (file
server) while intermediary pipe buffers (i.e. the SSH connection) are filled.
Even when flow control exists, an additional problem arises since with Mitogen
there is no guarantee that one SSH connection = one target machine, especially
once connection delegation is implemented. Some kind of bandwidth sharing
mechanism must also exist, without poorly reimplementing the entirety of TCP/IP
in a Python script.
For the initial release I have settled on a basic design that should ensure the
available bandwidth is fully utilized, with each upload target having its file
data served on a first-come-first-served basis.
When any file transfer is active, one of the service threads in the associated
connection multiplexer process (the same ones used for setting up connections)
will be dedicated to a long-running loop that monitors every connected stream’s
transmit queue size, enqueuing additional file chunks as the queue drains.
Files are served one-at-a-time to make it more likely that if a run is
interrupted, rather than having every partial file transfer thrown away, at
least a few targets will have received the full file, allowing that copy to be
skipped when the play is restarted.
The initial implementation will almost certainly be replaced eventually, but
this basic design should be sufficient for what is needed today, and should
continue to suffice when connection delegation is implemented.
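In outline the serving loop looks something like this, with the queue and
stream objects standing in for Mitogen's internals:

    import time

    CHUNK = 128 * 1024
    LOW_WATER = 1024 * 1024   # keep roughly this much queued per stream

    def serve_files(streams):
        while any(s.pending_files for s in streams):
            for stream in streams:
                # First-come-first-served: always serve the oldest file,
                # topping the transmit queue up only as it drains.
                if stream.pending_files and stream.tx_queue_size() < LOW_WATER:
                    chunk = stream.pending_files[0].read(CHUNK)
                    if chunk:
                        stream.enqueue(chunk)
                    else:
                        stream.pending_files.pop(0)   # file complete
            time.sleep(0.01)   # the real loop blocks in the multiplexer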
Testing / CI
The smattering of unit and integration tests that exist are running and
passing under Travis CI. In preparation
for a release, master is considered always-healthy and my development
has moved to a new dmw branch.
I’m taking a “mostly top down” approach to testing, written in the form of
Ansible playbooks, as this gives the widest degree of coverage, ensuring that
high level Ansible behaviour is matched with/without the extension installed.
For each new test written, the result must pass under regular Ansible in
addition to Ansible with the extension.
“Bottom up” type tests are written as needs arise, usually when Ansible’s user
interface doesn’t sufficiently expose whatever is being tested.
Also visible in Travis is a debops_common target: this is running all 255
tasks from DebOps' common.yml against
a Docker instance. It’s the first in what should be 4-5 similar DebOps jobs,
deploying real software with the final extension.
I have begun exploring integrating the extension with Ansible’s own integration
tests, but it looks likely this is too large a job for Travis. Work here is
ongoing.
This is the first in what I hope will be at least a bi-weekly series to keep
backers up to date on the current state of delivering the Mitogen extension
for Ansible. I’m trying
to use every second I have wisely until every major time risk is taken care
of, so please forgive the knowledge-dump style of this post :)
Well ahead of time. Some exciting new stuff popped up, none of it intractably
difficult.
I have some fabulous news on funding: in addition to what was already public
on Kickstarter, significant additional funding has become available, enough
that I should be able to dedicate full time to the project for at least
another 10 weeks!
Naturally this has some fantastic implications, including making it
significantly likely that I'll be able to implement Topology-aware File
Synchronization.
I could not previously commit to Python 3 support, due to worry it would
become a huge and destabilizing time sink, ruining any chance of delivering
more immediately useful work.
The missing piece (exception syntax) needed to support everything from Python
2.4 all the way to
3.x has been found - it came via an extraordinarily fruitful IRC chat with the
Ansible guys, and was originally implemented in Ansible itself by Marius
Gedminas. With this last piece of the puzzle, the
only bugs left to worry about are renamed imports and the usual bytes/str
battles. Both are trivial to address with strong tests - something already due
for the coming weeks. It now seems almost guaranteed Python 3 will be
completed as part of this work, although I am still holding off on a 100%
commitment until more pressing concerns are addressed.
New Risk: multiplexer throughput
Some truly insane performance bugs have been found and fixed already,
particularly around the stress caused by delivering huge single
messages, however during that work
a new issue was found: IO multiplexer throughput truly sucks for many small
messages.
This doesn't impact things much except in one area: file transfer. While I
haven’t implemented a final solution for file transfer yet, as part of that I
will need to address what (for now) seems a hard single-thread performance
limit: Mitogen’s current IO loop cannot push more than ~300MiB/sec in
128KiB-sized chunks, or to put it another way, best case 3MiB/sec given 100
targets.
Single thread performance: the obvious solution is sharding the
multiplexer across multiple processes, and already that was likely required
for completing the multithreaded connect work. This is a straightforward
change that promises to comfortably saturate a Gigabit Ethernet port using a
2011 era Macbook while leaving plenty of room for components further up
(Ansible) and down (ssh) the stack.
TTY layer: I’ve already implemented some fixes for this (increase buffer
sizes, reduce loop iterations), but found some ugly new problems as a result:
the TTY layer in every major UNIX has, at best, around a 4KiB buffer, forcing
many syscalls and loop iterations, and it seems on no OS is this buffer
tunable. Fear not, there is already a kick-ass solution for this.
This problem should disappear entirely by the time real file transfer support
is implemented - today the extension is still delivering files as a single
large message. The blocker to fixing that is a missing flow control mechanism
to prevent saturation of the message queue, which requires a little research.
This hopefully isn’t going to be a huge amount of work, and I’ve already got a
bunch of no-brainer yet hacky ways to fix it.
New risk: task isolation
It was only a matter of time, but the first isolation-related bug was
found, due to a class variable in a
built-in Ansible module that persists some state across invocations of the
module’s main() function. I’d been expecting something of this sort, so
already had ideas for solving it when it came up, and really it was quite a
surprise that only one such bug was reported out of all those reports from
testers.
The obvious solution is forking a child for each task by
default, however as always the devil is in the details,
and in many intractable ways forking actually introduces state sharing
problems far deadlier than those it promises to solve, in addition to
introducing a huge (3ms on Xeon) penalty that is needless in most cases.
Basically forking is absolute hell to get right - even for a tiny 2 kLOC
library written almost entirely by one author who wrote his first fork()
call somewhere in the region of 20 years ago, and I’m certain this is liable
to become a support nightmare.
The most valuable de facto protection afforded by fork - memory safety - is
pretty redundant in an almost perfectly memory-safe language like Python;
indeed that safety is much of why the language is so popular at all.
Meanwhile forking is needed anyway for robust implementation of asynchronous
tasks, so while implementing it would never have been wasted work, it is not
obvious to me that forking could or should ever become the default mode. It
amounts to a very ripe field for impossible to spot bugs of much harder
classes than the simple solution of running everything in a single process,
where we only need to care about version conflicts, crap monkey patches,
needlessly global variables and memory/resource leaks.
I’m still exploring the solution space for this one, current thinking is
maybe (maybe! this is totally greenfield) something like:
Built-in list of fixups for ridiculously easy to repair bugs, like the
yum_repository example above.
Whitelist for in-process execution any module known (and manually audited)
to be perfectly safe. Common with_items modules like lineinfile easily
fit in this class.
Whitelist for in-process safe but nonetheless leaky modules, such as the
buggy yum_repository module above that simply needs its bytecode
re-executed (100usec) to paper over the bug. Can’t decide whether to keep
this mode or not - or simply merge it with the above mode.
Default to forking (3ms - max 333 with_items/sec) for all unknown bespoke
(user) modules and built-in modules of dubious quality, with a
mitogen_task_isolation variable permitting the mode to be overridden by
the user on a per-task basis. “Oh that one loop is eating 45 minutes? Try
it with mitogen_task_isolation=none”
All the Mitogen-side forking bits are implemented already, and I’m deferring
the Ansible-side bits to be done simultaneous to supporting exotic module
types, since that whole chunk of code needs a rewrite and there is no point in
rewriting it twice.
Meanwhile whatever the outcome of this work, be assured you will always have
your cake and eat it - this project is all about fixing performance, not
regressing it. I hope this entire topic becomes a tiny implementation detail
in the coming weeks.
On the testing front I was absolutely overjoyed to discover
DebOps by way of a Mitogen bug report. This deserves a
whole article on its own, meanwhile it represents what is likely to be a huge
piece of the testing puzzle.
A big chunk is already implemented in order to fix an unrelated bug! The
default pool size is 16 threads in one process, so there will only be a minor
performance penalty for the first task to run when the number of targets
exceeds 16. Meanwhile, the queue size is adjustable via an environment
variable. I’ll tidy this up later.
Even though it basically already exists, I’m not yet focused on making
multithreaded connect work - including analysing the various performance
weirdness that appears when running Mitogen against multiple targets. These
definitely exist, I just haven’t made time yet to determine whether it’s an
Ansible-side scaling issue or a Mitogen-side issue. Stay tuned and don’t
worry! Multi-target runs are already zippy, and I’m certain any issues found
can be addressed.
At least a full day will be dedicated to nothing but coming up with new attack
scenarios, meanwhile I’m feeling pretty good about security already. The
fabulous Alex Willmer has been busily inventing
new cPickle attack scenarios, and some of them are absolutely fantastically
scary! He’s sitting on at least one exciting new attack that represents a
no-brainer decider on the viability of keeping cPickle or replacing it.
Serialization aside, I’ve been busy comparing Ansible’s existing security
model to what the extension provides today, and have at least identified
unidirectional routing mode as a
must-have for delivering the extension. Regarding that, it is possible to have
a single playbook safely target 2 otherwise completely partitioned networks.
Today with Mitogen, one network could route messages towards workers in the
other network using the controller as a bridge. While this should be harmless
(given existing security mitigations), it still introduces a scary capability
for an attacker that shouldn’t exist.
Really screwed up on planning here - turns out Ansible on Windows does not use
Python whatsoever, and so implementing the support in Mitogen would mean
increasing the installation requirements for Windows targets. That’s stupid,
it violates Ansible’s zero-install design and was explicitly a non-goal from
the get go.
Meanwhile WinRM has extremely poor options for bidirectional IO, and likely
viable Mitogen support for Windows will include introducing a, say,
SSL-encrypted reverse connection from the target machine in order to get a
usable bidirectional channel.
I will shortly be polling everyone who has pledged towards the project, and if
nobody speaks up to save Windows, it’s being pushed to the back of the queue.
A big, big thanks, once again!
It goes without saying but none of this work has been a lone effort, starting
from planning, article review, funding, testing, and an endless series of
suggestions, questions and recommendations coming from so many people. Thanks
to everyone, whether you contributed a single $1 or a single typo bug report.
Super busy, but also super on target! Until next time..
It’s been an incredibly intense first week crowdfunding the Mitogen
extension for Ansible involving far more effort than anticipated, where I
have worked almost flat out from waking until the early hours just to ensure
any queries are answered thoroughly. I cannot complain, because it has been so
much fun that I’d change almost nothing of the experience, and already the
campaign has reached 46% from the exposure it received.
As a recap Mitogen is a library for writing distributed programs that require
zero deployment, with the prototype extension implementing an architectural
change that vastly improves Ansible’s performance in common scenarios,
laying a framework to extend this advantage far beyond simple overhead
reduction.
A great deal of work has simply been staying on top of bug reports and ensuring
experiences with the prototype are solid – for each report from one tester, we
can assume 10 more hit the same bug but did not or could not report it.
Of the many reports received, I have addressed almost all of them promptly.
One standout was a report via Reddit
of a performance improvement so fantastical that it exceeds even my most
contrived overhead-heavy example:
"With mitogen my playbook runtime went from 45 minutes to just
under 3 minutes. Awesome work!"
This is a common theme – anywhere with_items
appears, Mitogen has the most profound impact. The obvious reason is that
during loops the same module is executed repeatedly, and after one iteration it
is guaranteed to be compiled and ready on the target.
So many lessons!
Developing the campaign from a thought exercise one idle Sunday evening into an
actually practical project has taken a lot of work – far more than I
anticipated, and at almost every step I have learned something novel. This is
all reusable knowledge for anyone attempting a similar project in future, and
I will write it up as time permits.
Regardless of outcomes the campaign has already proven one very exciting
result: real users will stake real money towards something as seemingly
mundane as free infrastructure, and I think that’s beyond amazing. In a world
content to throw millions of dollars at junk ICOs almost weekly, crowdfunding
free software seems to me a practice that should happen far more often.
I wish to thank everyone for the support shown thus far, and I’d encourage you
to consider tapping that Ansible user you know on the shoulder to let them know
about the project. For those working close to infrastructure consulting, please
consider using the final week to corner your boss regarding associating your
company logo with a sexy project that promises to receive many eyeballs over
the coming years.
Allegedly on site as a developer, two summers ago I found myself in a situation
you are no doubt familiar with, where despite preferences unrelated problems
inevitably gravitate towards whoever can deal with them. Following an
exhausting day spent watching a dog-slow Ansible job fail repeatedly, one
evening I dusted off a personal aid to help me relax: an ancient, perpetually
unfinished hobby project whose sole function until then had simply been to
remind me things can always improve.
Something of a miracle had struck by the early hours of the next morning, as almost
every outstanding issue had been solved, and to my disbelief the code ran
reliably. 18 months later and for the first time in living memory, I am excited
to report delivery of that project, one of sufficient complexity as to have
warranted extreme persistence - in this case from concept to implementation,
over more than a decade.
The miracle? It comes in the form of Mitogen
- a tiny Python library you won't have heard of, but one that I hope, as an
Ansible user, you will soon eternally be glad for, on discovering ansible-playbook
now completes in very reasonable time even in the face of deeply unreasonable
networks.
Mitogen is a library for writing distributed programs that require zero
deployment, specifically designed to fit the needs of infrastructure software
like Ansible. Without upfront configuration it supports any UNIX machine
featuring an installed Python interpreter, which is to say almost all of them.
While the concept is hard to explain - even to fellow engineers - its value is
easy to grasp:
This trace shows two Ansible runs of a basic
100-step playbook over a 1 ms latency network against a single target host.
The first run employs SSH
pipelining, Ansible’s current most optimal configuration, where it consumes
almost 4.5 Mbytes network bandwidth in a running time of 59 secs.
The second uses the prototype Mitogen extension
for Ansible, with a far more reasonable 90 Kbytes consumed in 8.1 secs.
An unmodified playbook executes over 7 times faster while consuming 50x less
network bandwidth.
Less than half the CPU time was consumed on the host machine, meaning that
by one metric it should handle at least twice as many targets. Crucially
no changes were required to the target machine, including new software or
nasty on-disk caches to contend with.
While only pure overhead is measured above, the benefits very much extend
to real-world scenarios. See the documentation
(1.75x time) and issue #85 (4.2x time, 3.1x CPU) for examples.
How is this possible?
Mitogen is perhaps most easily described as a kind of network-capable fork() on
steroids. It allows programs to establish lazily-loaded duplicates on remote
hosts, without requiring any upfront remote disk writes, and to communicate
with those copies once they exist. The copies can in turn recursively split to
produce further children - with bidirectional message routing between every
copy handled automatically.
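In code, the minimal experience looks something like the sketch below
(hostname invented; context.call() is the documented way to invoke a function
in a child):

    import mitogen.master

    def get_uptime():
        # Executes on the remote machine; imports belong there too.
        with open('/proc/uptime') as fp:
            return float(fp.read().split()[0])

    broker = mitogen.master.Broker()
    router = mitogen.master.Router(broker)
    try:
        # Nothing exists on the target beyond sshd and Python.
        host = router.ssh(hostname='web1.example.com')
        print('up %.0f seconds' % host.call(get_uptime))
    finally:
        broker.shutdown()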
In the context of Ansible, unlike with SSH pipelining where up to one SSH
invocation, sudo invocation and script compilation are required for every
playbook step, and with all scripts re-uploaded for each step, with Mitogen
only one of each exists per target for the duration of the playbook run, with
all code cached in RAM between steps. Absolutely everything is reused,
saving 300-800 ms on every step.
The extension represents around a week’s work, replaces hundreds of lines of
horrid shell-related code in Ansible, and is already at the point where on one
real-world playbook, Ansible is only 2% slower than equivalent SSH
commands. Presently connection establishment is single-threaded, so the
prototype is only good for a few hosts, but rest assured this limitation's days
are numbered.
Not just a speed up, a paradigm shift you’ll adore
If this seems impressive and couldn’t be improved upon, prepare for some deep
shocks. You can think of the extension not just as a performance
improvement, but something of a surreptitious beachhead from which I intend
to thoroughly assault your sense of reality.
This performance is a side effect of a far more interesting property: Ansible
is no longer running on just the host machine, but temporarily distributed
throughout the target network for the duration of the run, with bidirectional
communication between all pieces, and you won't believe the crazy functionality
this enables.
What if I told you it were possible not only to eliminate that final 2%, but
turn it sharply negative, while simultaneously reducing resource consumption?
“Surely Ansible can’t execute faster than equivalent raw SSH commands?” You
bet it can! And if you care about such things, this could be yours by
Autumn. Read on..
Pushing brains into the ether, no evil agents required
As I teased last
year, Ansible takes its name from a faster-than-light communication
device from science fiction, yet despite these improvements it is still
fundamentally bound by the speed with which information physically propagates.
Pull and agent-based tooling is strongly advantageous here: control flow occurs
at the same point as the measurements necessary to inform that flow, and no
penalty is incurred for traversing the network.
Today, reducing latency in Ansible means running it within the target network,
or in pull mode,
where the playbook is stored on the target alongside for example, secrets for
decrypting any vaults, and the hairy mechanics required to keep that in sync
and executing when appropriate. This is a far cry from the simplicity of
tapping ansible-playbook live.yml on your laptop, and so it is an option of
last resort.
What would be amazing is some hybrid where we could have the performance and
scaleability benefits of pull, combined with the stateless simplicity of push,
without introducing dedicated hosts or permanent caches and agents running on
the target machines, that amount to persistent intermediate state and
introduce huge headaches of their own, all without sacrificing the fabulous
ability to shut everything down with a simple CTRL+C.
The opening volley: connection delegation
As a first step to exploiting previously impossible functionality, I will
enhance the extension to support delegating connection establishment to a
machine on the target network, avoiding the cost of establishing hundreds of
SSH connections over a low throughput, high latency network link.
Unlike with SSH proxying, this has the huge benefit of caching and serving
Ansible code from RAM on the intermediary, avoiding uploading approximately
50KiB of code for every playbook step, and ensuring those cached responses are
delivered over the low latency LAN fabric on the target network. For 100 target
machines, this replaces the transmission of 5 Mbytes of data for every
playbook step with on the order of kilobytes worth of tiny remote procedure
calls.
All the Mitogen-side infrastructure for this exists today, and is already used
to implement become support.
It could be flipped on with a few lines of code in the Ansible extension, but
there are a few more importer
bugs to fix before it’ll work perfectly.
Finally as a reminder, since Mitogen operates recursively delegation also
operates recursively, with code caching and connection establishment
happening at each hop. Not only is this useful for navigating slow links and
complicated firewall setups, as we'll see, it enables some exciting new
possibilities.
Ansible is intended to manage many machines simultaneously, and while the
extension's improvements presently work well for single-machine playbooks, that
alone is a niche application for many users.
Having the newfound ability to delegate connection establishment to an
intermediary on the target network, far away from our laptop’s high latency 3G
connection, and with the ability to further sub-delegate from that
intermediary, we can implement a divide and conquer strategy, forming a large
tree comprising the final network of target machines for the playbook run, with
responsibility for caching and connection multiplexing evenly divided across
the tree, neatly avoiding single resource bottlenecks.
I will rewrite Mitogen’s connection establishment to be asynchronous: creation
of many downstream connections can be scheduled in parallel, with the ability
to enqueue commands prior to completion, including recursive commands that
would cause those connections to in turn be used as intermediaries.
The cost of establishing connections should become only the cost of code upload
(~50KiB) and the latency of a single SSH connection per tree layer, as
connections at each layer occur in parallel. For an imaginary 1,700 node
cluster split into quarters of 17 racks and 25 nodes per rack, connection via a
300 ms 3G network should complete in well under 15 seconds.
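A back-of-envelope check of that figure, with every number below an assumption
for illustration rather than a measurement:

    WAN_RTT = 0.300        # laptop -> datacentre over 3G, seconds
    LAN_RTT = 0.001        # within the datacentre
    SSH_RTTS = 5           # round-trips to establish one SSH session
    UPLOAD = 50 * 1024     # ~50KiB of code per fresh connection
    UPLINK = 1e6 / 8       # assume ~1Mbit/sec 3G upstream, bytes/sec

    # 4 quarters x 17 racks x 25 nodes == 1,700 targets in three layers.
    # Connections within a layer proceed in parallel, so each layer costs
    # one SSH setup plus one code upload; only the first pays WAN latency.
    total = SSH_RTTS * WAN_RTT + UPLOAD / UPLINK          # first layer
    total += 2 * (SSH_RTTS * LAN_RTT + UPLOAD / 12.5e6)   # two LAN layers
    print('%.1f seconds' % total)   # ~2s, comfortably under 15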
Topology-aware file synchronization
So you have a playbook on your laptop deploying a Django application via the
module, to 100 Ubuntu machines running in a datacentre 300 ms away. Each run of
the playbook entails a groan followed by a long walk, as a 3.8 second rsync run is
invoked 100 times via your 3G connection, just to synchronize a 3 Mbyte asset
the design team won’t stop tweaking. Not only are there 6 minutes of roundtrips
buried in those invocations, but that puny 3G connection is forced to send a
total of 300 Mbytes toward the target network.
What is the point of continually re-sending that file to the same set of
machines in some far-off network? What if it could be uploaded exactly once,
then automatically cached and redistributed within the target network,
producing exactly one upload per layer in the hierarchy:
Why stop at delegating connection establishment and module caching? Now we have
a partial copy of Ansible within the network, nothing prevents implementing all
kinds of smarts. Here is another feature that is a cinch to build once
bidirectional communication exists between topology-aware code, which the
prototype extension already provides today.
After a brutal 4 hour meeting involving 10 executives our hero Bob, Senior
Disaster Architect III, emerges bloodstained yet victorious against the
tyrannical security team, as his backends can talk with impunity to the entire
Internet just so apt-get can reach packages.debian.org for the 15
seconds Bob’s daily Ansible CI job requires.
That evening, having regaled his giddy betrothed (HR Coordinator II) with his
heroic story of war, Bob catches a brief yet chilling glimmer of doubt for all
that transpired. “Was there another way?” he sleepily ponders, before
succumbing to a cosier battle waged by those fatigued and heavy eyelids.
Suddenly aware again, Bob emerges bathed in a mysterious utopian dreamscape
where CI jobs executed infinitely quickly, war and poverty did not exist, and
the impossible had always been possible.
Building on Mitogen’s
message routing, forwarding all kinds of pipes and network sockets becomes
trivial, including schemes that would allow exposing a transient, locked down
HTTP proxy to Bob’s apt-get invocation only for as long as necessary, all
with a few lines of YAML in a playbook.
While this is already possible with SSH forwarding, the hand-configuration
involved is messy, and becomes extremely hairy when the target of the forward
is not the host machine. My initial goal is to support forwarding of UNIX and
TCP sockets, as they cover all use cases I have in mind. Speaking of which..
Topology-aware Git pull
Another common security fail seen in Ansible playbooks is to call Git directly
from target machines, including granting those machines access to a Git
server. This is a horrid violation: even read-only access implies the machine
needs permanent firewall rules that shouldn’t exist, just for the scant
moments a pull is in progress. Granting backends access to a site as complex as
GitHub.com, you may as well abandon all outbound firewalling, as this is enough
for even the puniest script kiddy to exfiltrate a production database.
What if Git could run with the permissions of the local Ansible user, on the
user’s own machine, and be served efficiently to the target machines only for
the duration of the push, faster than 100 machines talking to GitHub.com, and
only to the single read-only repository intended?
Building on generalized forwarding, topology-aware Git repeats all the caching
and single-upload tricks of file synchronization, but this time implementing
the Git protocol between each node.
In the scheme I will implement, a single round-trip is necessary for git-fetch-pack to pull
just the changed objects from the laptop over the high latency 3G link, before
propagating at LAN speeds throughout the target network, with git-ls-remote output
delivered as part of the message that initiates the pull. Not only is the
result more efficient than a normal git-pull, but backends no
longer require network access to Git.
The final word: Inversion of control
Remember we talked about making Ansible run faster than equivalent SSH
commands? Well, today Ansible requires one network round-trip per playbook
step, so just like SSH, it must pay the penalty for every round-trip unless
something gives, and that something is the partial delegation of control to the
target machine itself.
With inversion of control, the role of ansible-playbook simply becomes that
of shipping code and selective chunks of data to target machines, where those
machines can execute and make control decisions without necessitating a
conversation with the master after each step, just to figure out what to do
next.
Ansible has all the framework to enable implementing this today, by
significantly extending the prototype extension’s existing strategy plug-in,
and teaching it how to automatically send and wait on batches of tasks, rather
than on single tasks at a time.
Aside from improved performance, the semantics of the existing linear
strategy will be preserved, and playbooks need not be changed to cope: on the
target machine tasks will not suddenly begin running concurrently, or in any
order different to previously.
App-level connection persistence
As a final battle against latency during playbook development and debugging, I
will support detaching the connection tree from ansible-playbook on exit,
and teach the extension to reuse it at startup. This will reduce the overhead
of repeat runs, especially against many targets, to the order of hundreds of
milliseconds, as no new SSH connections, module compilations or code uploads
need occur.
Connection persistence opens the floodgates for adding sweet new tooling,
although I’m not sure how desirable it is to expose an implementation detail
like this forever, while also extending the interface provided by Ansible
itself. As a simple example, we could provide an ansible-ssh tool that
reuses the connection tree along with Ansible’s tunnelling, delegation, dynamic
inventory and authentication configuration to forward a pipe to a remote shell.
The cost of slow tooling
Ansible has over
28,500 stars on GitHub, representing just those users who have a GitHub
account and ever thought to star it, and appears to grow by 150 stars per week.
Around London the going rate to hire one user is $100/hour, and conservatively,
we could expect that user is trotting out a 15 minute run of ansible-playbook
live.yml at least once per week.
We can expect that if Ansible is running merely twice as slowly as necessary,
7.5 minutes of that run is lost productivity, and across those 28,500 users,
the economic cost is in the region of $356,250 per invocation or
$17,100,000 per year. In reality the average user is running Ansible far
more often, including thousands of times per minute under various CI systems
worldwide, and those runs often last far longer than 15 minutes, but I’d
recommend that mental guesstimation is left as an exercise to readers who are
already blind drunk.
The future is beautiful if you want it to be
My name is David, and nothing jinxes my day quite like slow tooling. I have
poured easily 500 hours in some form into this project over a decade and on my
own time. The project has now reached an inflection point where the fun part is
over, the science is done and the effect is real, and only a small, highly
predictable set of milestones remain to deliver what I hope you agree is a much
better future.
Before reading I doubt you would have believed it possible to provide the
features described without a complex infrastructure running in the target
network, now I hope you’ll join me in disproving one final impossibility.
While everything here will exist in time, it cannot exist in 2018 without
your support, and that’s why I’d like to try something crazy, that would
allow me to devote myself to delivering a vastly improved daily routine for
thousands of people just like you and me.
You may have guessed already: I want you to crowdfund awesome tooling.
What value would you place on an extra productive hour every working week? In
the UK that’s an easy question: it’s around $4,800 per year. And what risk is
there to contributing $100 to an already proven component? I hope you’ll agree
this too is a no-brainer, both for you and your employer.
To encourage success I’m offering a unique permanent placement of your brand on
the GitHub repository and documentation. Funds will be returned if the minimum
goal cannot be reached, however just 3 weeks are sufficient to ensure a well
tested extension, with my full attention given to every bug, ready to save many
hours right on time to enjoy the early sunlight of Spring.
Totalling much less than the economic damage caused by a single run of today’s
Ansible, the grand plan is divided into incrementally related stretch goals. I
cannot imagine this will achieve full funding, but if it does, as a finale
I’ll deliver a feature built on Ansible that you never dreamed possible.
As a modern area, deployment tooling is exposed to the ebb and flow of the
software industry far more than typical, and unexpected disruption happens
continuously. Without ongoing evolution, exposure to buggy and unfamiliar new
tooling is all but guaranteed, with benefits barely justifying the cost of
their integration. As we know all too well, rational ideas like cost/benefit
rarely win the hearts of buzzword-hungry and youthful infrastructure teams, so
counterarguments must be presented another way.
As a recent example there is growing love for mgmt, which is designed from the
outset as an agent-based reactive distributed system, much as Mitogen nudges
Ansible towards. However unlike mgmt, Ansible preserves its zero-install and
agentless nature, while laying a sound framework for significantly more
exciting features. If that alone does not win loyalty, we’re at least
guaranteed that every migration-triggering new feature implemented in such
systems can be headed off with minimal effort, long into the foreseeable
future.