A beta is coming soon! Aside from async tasks, the master branch
is looking great. Since the last update there have been many features and
fixes, but there are important forks in the road ahead, particularly around
efficient support for many-host runs. Read on..
File transfer previously worked by constructing one RPC representing the
complete file, which for large files resulted in an explosion in memory usage
on each machine as the message was enqueued and transferred, with communication
at each hop blocked until the message was delivered. A rewrite has been needed
since the original code was written, but a simple solution proved elusive.
Today file transfer is all but solved: files are streamed in 128KiB-sized
messages, using a dedicated service that aggregates pending transfers by their
most directly connected stream, serving one file at a time before progressing
to the next transfer. An initial burst of 128KiB chunks is generated to fill a
link with a 1MiB BDP, with
further chunks sent as acknowledgements begin to arrive from the receiver. As
an optimization, files 32KiB or smaller are still delivered in a single RPC,
avoiding one roundtrip in a common scenario.
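For illustration, here is the scheme in miniature as a Python sketch: the
128KiB, 1MiB and 32KiB figures come from the description above, while the
send/acknowledgement plumbing is an invented stand-in rather than Mitogen's
real API.

    CHUNK_SIZE = 128 * 1024       # one message per 128KiB chunk
    WINDOW = 1024 * 1024          # initial burst sized for a 1MiB BDP
    SINGLE_RPC_LIMIT = 32 * 1024  # files this small go in a single RPC

    def stream_file(data, send, wait_ack):
        # `send` and `wait_ack` stand in for the real message bus.
        if len(data) <= SINGLE_RPC_LIMIT:
            send(data)            # common case: one round-trip saved
            return
        chunks = [data[i:i + CHUNK_SIZE]
                  for i in range(0, len(data), CHUNK_SIZE)]
        in_flight = 0
        while chunks or in_flight:
            # Burst until the window is full...
            while chunks and in_flight * CHUNK_SIZE < WINDOW:
                send(chunks.pop(0))
                in_flight += 1
            # ...then let each acknowledgement clock out the next chunk.
            wait_ack()
            in_flight -= 1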
Compared to sftp(1) or scp(1), the new service has vastly
lower setup overhead (1 RTT vs. 5) and far better safety properties, ensuring
concurrent use of the API by unrelated ansible-playbook runs
cannot create a situation where an inconsistent file may be observed by users,
or a corrupt
file is deployed with no indication a problem exists.
Since file transfer is implemented in terms of Mitogen's message bus, it is
agnostic to Connection Delegation, allowing streaming file transfers between
proxied targets regardless of how the connection is set up.
Some minor problems remain: the scheduler cannot detect a timed-out transfer,
risking a cascading hang when Connection Delegation is in use. This is not a
regression, since stock Ansible does not support this mode of operation. In
both cases during normal operation, the timeout will eventually be
noticed when the underlying SSH connection times out.
Connection Delegation enables Ansible to use one or more intermediary machines to
reach a target machine or container, with connections and code uploads
deduplicated at each hop in the path. For an Ansible run against many
containers on one target host, only one SSH connection to the target need
exist, and module code need only be uploaded once on that connection.
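At the Mitogen library level, delegation looks roughly like the following
sketch (hostnames invented; consult Mitogen's documentation for the definitive
API):

    import mitogen.master

    broker = mitogen.master.Broker()
    router = mitogen.master.Router(broker)
    try:
        # One SSH connection to the intermediary...
        bastion = router.ssh(hostname='bastion.example.com')
        # ...through which further connections are proxied, with module
        # code uploaded to the bastion only once.
        web1 = router.ssh(hostname='web1', via=bastion)
        web2 = router.ssh(hostname='web2', via=bastion)
    finally:
        broker.shutdown()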
This feature exists today and works well, however some
important functionality is still missing. Presently intermediary connection
setup is single-threaded, non-Python (i.e. Ansible) module uploads are
duplicated, and the code to infer intermediary connection configurations using
the APIs available in Ansible is.. hairy at best.
Fixing deduplication and single-threaded connection setup entails starting a
service thread pool within each interpreter that will act as an intermediary.
This requires some reworking of the nascent
service framework, also making it easier to use for non-Ansible programs,
and lays the groundwork for Topology-aware File Synchronization.
From the department of surprises, this one is a true classic. Ansible supports
an undocumented (upstream docs patch) but
nonetheless commonly used mechanism for bundling third party modules and
overriding built-in support modules as part of the ZIP file deployed to the
target. It implements this by virtualizing a core Ansible package namespace:
ansible.module_utils, causing what Python finds there to vary on a
per-task basis, and crucially, to have its implementation diverge entirely from
the equivalent import in the Ansible controller process.
Suffice to say I nearly lost my mind on discovering this "feature", not
due to the functionality it provides, but the manner in which it opts to
provide it. Rather than loading a core package namespace as a regular Python
package using Mitogen's built-in mechanism, every Ansible module must undergo
additional dependency scanning using its unique search path, and any
dependencies found must correctly override existing loaded modules appearing in
the target interpreter's namespace at runtime.
Given Mitogen's intended single-reusable-interpreter design, there is no way to
support this without tempting strange behaviours appearing across tasks whose
ansible.module_utils search path varies. While it is easy
to arrange for ansible.module_utils.third_party_module to be
installed, it is impossible to uninstall it while ensuring every reference to
the previous implementation, including instances of every type defined by it,
are extricated from the reusable interpreter post-execution, which is necessary
if the next module to use the interpreter imports an entirely distinct
implementation of ansible.module_utils.third_party_module.
Today, instead, the interpreter forks when an extended or overridden module is
found, and a custom importer is used to implement the overrides. This
introduces an unavoidable inefficiency when the feature is in use, but it is
still far better than always forking, or running the risk of varying
module_utils search paths causing unfixable crashes.
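To make the mechanism concrete, here is a toy rendition of the idea in Python
3 syntax (not Mitogen's actual importer): a meta-path finder installed in the
forked child serves overridden ansible.module_utils submodules from an
in-memory mapping, shadowing whatever would otherwise be imported.

    import importlib.abc
    import importlib.util
    import sys

    # Module name -> source text, as discovered by dependency scanning.
    OVERRIDES = {
        'ansible.module_utils.third_party_module': 'VERSION = "bundled"\n',
    }

    class OverrideFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
        def find_spec(self, fullname, path=None, target=None):
            if fullname in OVERRIDES:
                return importlib.util.spec_from_loader(fullname, self)
            return None

        def exec_module(self, module):
            source = OVERRIDES[module.__name__]
            exec(compile(source, module.__name__, 'exec'), module.__dict__)

    # Installed only in the forked child, so overrides never leak back
    # into the reusable parent interpreter.
    sys.meta_path.insert(0, OverrideFinder())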
Presently the container must have Python installed, matching Ansible's
existing behaviour, but it occurred to me that when the host machine has
Python installed, there is no reason Python needs to exist within the
container. This would be a powerful feature made easy by Mitogen's design,
and in a common use case, would support running auditing/compliance playbooks
against app containers that were otherwise never customized for use with
Ansible.
Su Become Method Support
Low-hanging fruit from the original crowdfunding plan. Now su(1) may
be used for privilege escalation as easily as sudo(1).
Sudo/Su Connection Types
To support testing and somewhat uncommon use cases where a large number of user
accounts may be targeted for parallel deployment on a small number of
machines, there now exist explicit mitogen_sudo and
mitogen_su connection types that, in combination with Connection
Delegation, allow a single SSH connection to exist to a remote machine while
exposing user accounts as individual (and therefore parallelizable) targets in
the inventory.
This sits somewhere between "hack" and "gorgeous" (I really have no idea which),
however it does make it simple to exploit Ansible's parallelism in certain
setups, such as traditional web hosting where each customer exists as a UNIX
account on a small number of machines.
Unidirectional routing now exists and is always enabled for Ansible. This
prohibits what was previously a new communication style available to targets:
one that, although ideally benign and potentially very powerful, fundamentally
altered Ansible's security model and risked solution acceptance. It was
possible for targets to send each other messages, and although permission
checks occur on reception and thus the messages should be harmless, this
represented the ability for otherwise air-gapped networks to be temporarily
bridged for the duration of a run.
Mitogen supports new Blob() and Secret() string wrappers
whose repr() contains a substitute for the actual value. These are
employed in the Ansible extension, ensuring passwords and bulk file transfer
data are no longer logged when verbose output is enabled. The types are
preserved on deserialization, ensuring log messages generated by targets
receive identical treatment.
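Usage is as simple as wrapping the sensitive value; the wrappers live in
mitogen.core, and the repr text suggested below is indicative rather than
exact:

    import mitogen.core

    password = mitogen.core.Secret(u'hunter2')
    print(repr(password))          # substitute text, never 'hunter2'

    payload = mitogen.core.Blob(b'x' * 1048576)
    print(repr(payload))           # a short summary, not 1MiB of noise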
Ongoing work on the asynchronous task implementation has caused it to evolve
once again, this time to make use of a new subtree
detachment feature in the core library. The new approach is about 70% of
what is needed for the final design, with one major hitch remaining.
Since an asynchronous task must outlive its parent, it must have a copy of
every dependency needed by the module it will execute prior to disconnecting
from the parent. This is exorbitantly fiddly work, interacting with many
aspects including not least custom module_utils, and represents
the last major obstacle in producing a functionally complete extension release.
Industrial grade multiplexing
Mitogen now supports swapping its IO multiplexer implementation
depending on the host operating system, blasting through the maximum file
descriptor limit of select(2), and ensuring this is no longer a
hindrance for many-target runs. Children initially use the select(2)
multiplexer (tiny and guaranteed available) until they become parents, when the
implementation is transparently swapped for the real deal.
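The selection logic amounts to a few feature tests. The sketch below is
illustrative only; the real poller classes live inside Mitogen:

    import select

    class SelectPoller: pass   # wraps select.select(); always available
    class EpollPoller: pass    # wraps select.epoll(); Linux, no FD limit
    class KqueuePoller: pass   # wraps select.kqueue(); BSD and macOS

    def preferred_poller():
        # Children keep SelectPoller; a parent upgrades to the best
        # facility the OS offers, escaping select(2)'s FD_SETSIZE limit.
        if hasattr(select, 'epoll'):
            return EpollPoller
        if hasattr(select, 'kqueue'):
            return KqueuePoller
        return SelectPoller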
In future some interface tweaks are desirable to make full use of the new
multiplexers: at least epoll(7) supports options that significantly
reduce the system calls necessary to configure it. Although I have not measured
a performance regression due to these calls, their presence is bothersome.
As expected, growing pains appeared when real multiplexing was implemented. For
testing I adopted a network of VMs running DebOps' common.yml, with a
quota for up to 500 targets, but so far, it is not possible to approach that
without drowning in the kinks that start to appear. While some of these almost
certainly lie on the Mitogen side, when profiling with only 40 targets enabled,
inefficiencies in Mitogen are buried in the report by extreme inefficiencies
present in Ansible itself.
And with that we reach a nexus: we have almost exhausted what can be
accomplished working from the bottom-up, profiling on a micro scale is no
longer sufficient to meet project goals, while fixing problems identified
through profiling on a macro scale exceeds the project scope. Therefore,
(lightning bolts, wild cackles), a new plan emerges..
Branching for a beta
With the exception of async tasks I consider the master branch to be in
excellent health - for smaller target counts. For larger runs, wider-reaching
work is necessary, but it does not make sense to disrupt the existing design
due to it. Therefore master will be branched with the new branch
kept open for fixes, not least the final pieces of async, while continuing work
in parallel on a new increment.
Vanilla Ansible forks each time it executes a task, with the corresponding
action plug-in gaining control of the main thread until completion, upon which
all state aside from the task result is lost. When running under the extension,
a connection multiplexer process is forked once at startup, and a separate
broker thread exists in each forked task subprocess that connects back to the
connection multiplexer process over a UNIX socket - necessary in the current
design to have a persistent location to manage connections.
The new design comes in the form of a complete reworking of
linear strategy. Today's extension wraps Ansible's strategies while
preserving their process and execution model. To implement the enhancements
above sensibly, additional persistence is required and it becomes necessary to
tackle a strategy implementation head-on.
The old desire for per-CPU connection multiplexers is incorporated, but moves
those multiplexers back into Ansible, much like the pre-crowdfund extension.
The top-level controller process gains a Mitogen broker thread with per-CPU
forked children acting as connection multiplexers, and hosting service threads
on which action plug-ins can sleep. Unlike vanilla Ansible, these processes
exist for the duration of the run rather than per-task.
From the vantage point of only $ncpus processes, it is easy to fix
template precompilation, plug-in path caching, connection caching,
target<->worker affinity, and ensuring task variable generation is
parallelized. Some sizeable obstacles exist, not least:
Liberal shared data structure mutation in the task executor that
must be fixed to handle threading, mostly contained to
Preserving the existing callback plug-in model. Callbacks must always fire
in the top-level process.
Synchronization or serialization overhead: pick one. Either the strategy
logic runs duplicated in each child (requiring coordination with the
top-level process), or it runs once in the parent, and configuration must
be serialized for every task.
Can't this be done upstream?
It should, but I've
experimented and there simply isn't time. If >1 week is reasonable to add missing
documentation, there is no hope real patches will land before full-time
work must conclude. For upstreaming to happen the onus lies with the 20+ strong
permanent team; it's simply not possible for me to commit unbounded time to
landing even trivial changes, a far cry from contributing occasional patches
to a privately maintained fork.
At least 16k words have been spent since conversations started around September
2017, and while they bore some fruit over time, few actionable outcomes have
resulted, and the detectable levels of team-originated engagement regarding the
work have been minimal. There is no expectation of fireworks, however it may be
helpful to realize that after 3 months no evidence exists of any team member
testing the code and experiencing success or failure, let alone a report of such.
It's sufficient to say that after so long I find this increasingly troublesome,
and while I cannot hope to understand internal priorities, as an outside
contributor funded by end users, soliciting engagement on a well-documented
enhancement that in some scenarios nets an order of magnitude performance
improvement to a commercial product, some rather basic questions come to mind.
There is a final uneasy aspect to upstreaming, and it is that of being left
with the task of cleaning up, with no guarantee the mess won't simply return.
Some of this code is in an abject (253 LOC, 37 locals)
state (279 LOC, 24 locals) of
sin (306 LOC, 38 locals) - poor form for
2018, in a product less than 72 months old that has been funded almost
since inception. While I have begun refactoring the strategy plug-in within the
confines of the Mitogen repository, responsibility for benefitting from that
work in mainline rests with others.
A very rough branch exists
for this, and I’m landing volleys of fixes when I have downtime between bigger
pieces of work. Ideally this should have been ready for the end of April, but
it may take a few weeks more.
I originally hoped to have a clear board before starting this, instead it is
being interwoven as busywork when I need a break from whatever else I'm working
on.
Done: multiplexer throughput
The situation has improved massively.
Hybrid TTY/socketpair mode is a
thing and as promised it significantly helps, just not quite as much as I'd
hoped.
Today on a 2011-era Macbook Pro Mitogen can pump an SSH client/daemon at around
13MB/sec, whereas scp in the same configuration hits closer to 19MB/sec. In
the case of SSH, moving beyond this is not possible without a patched SSH
installation, since SSH hard-wires its buffer sizes around 16KB, with no
ability to override them at runtime.
With multiple SSH connections that 13MB should cleanly multiply up, since every
connection can be served in a single IO loop iteration.
A bunch of related performance fixes were landed, including removal of yet
another special case for handling deferred function calls, only taking locks
when necessary, and reducing the frequency of the stream implementations
modifying the status of their descriptors' readability/writeability.
As we’re in the ballpark of existing tools, I’m no longer considering this
as much of a priority as before. There is definitely more low-hanging fruit,
but out-of-the-box behaviour should no longer raise eyebrows.
Done: task isolation
As before, by default each script is compiled once, however it is now
re-executed in a spotless namespace prior to each invocation, working around
any globals/class variable sharing issues that may be present. The cost of this
is negligible, on the order of 100 usec.
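The trick is nothing more than compile-once, exec-per-invocation into a fresh
namespace. A self-contained illustration:

    module_source = '''
    STATE = []                 # would otherwise leak between tasks
    STATE.append(1)
    print(len(STATE))
    '''
    code = compile(module_source, '<ansible_module>', 'exec')  # once

    for _ in range(3):
        namespace = {'__name__': '__main__'}   # spotless every time
        exec(code, namespace)                  # prints 1, 1, 1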
When this is insufficient, a mitogen_task_isolation=fork per-task variable
exists to allow explicitly forcing a particular module to run in a new process.
Enabling this by default causes something on the order of a 33% slowdown, which
is much better than expected, but still not good enough to enable forking by
default.
Aside from building up a blacklist of modules that should always be forked,
task isolation is pretty much all done, with just a few performance
regressions remaining to fix in the forking case.
Done: exotic module support
Every style of Ansible module is supported aside from the prehistoric
"module replacer" type. That means today all of these work and are covered by automated tests:
Built-in new-style Python scripts
User-supplied new-style Python scripts
Ancient key=value style input scripts
Statically linked Go programs
Python module support was updated to remove the monkey-patching in use before.
Instead, sys.stdin, sys.stdout and sys.stderr are redirected to
StringIO objects, allowing a much larger variety of custom user scripts to be
run in-process even when they don’t use the new-style Ansible module APIs.
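The shape of that redirection, reduced to its essence (the extension's real
code also feeds stdin content and emulates exit behaviour more carefully):

    import sys
    from io import StringIO

    def run_in_process(code, stdin_text=''):
        saved = sys.stdin, sys.stdout, sys.stderr
        sys.stdin = StringIO(stdin_text)
        sys.stdout, sys.stderr = StringIO(), StringIO()
        try:
            exec(code, {'__name__': '__main__'})
        except SystemExit:
            pass   # many modules finish via sys.exit()
        finally:
            out, err = sys.stdout.getvalue(), sys.stderr.getvalue()
            sys.stdin, sys.stdout, sys.stderr = saved
        return out, err

    print(run_in_process(compile('print("hi")', '<module>', 'exec')))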
Done: free strategy support
The "free" strategy can now be used by specifying ANSIBLE_STRATEGY=mitogen_free. The mitogen strategy is now an alias of mitogen_linear.
Done: temporary file handling
This should be identical to Ansible’s handling in all cases.
Done: interpreter recycling
An upper bound exists to prevent a remote machine from being spammed with
thousands of Python interpreters, which was previously possible when e.g. using
a with_items loop that templatized become_user.
Once 20 interpreters exist, the extension shuts down the most recently created
interpreter before starting a new one. This strategy isn't perfect, but it
should suffice to avoid raised eyebrows in most common cases for the time
being.
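The policy is nothing fancier than a capped list; a toy rendition (names
invented, the real bookkeeping lives in the extension):

    MAX_INTERPRETERS = 20
    _live = []   # ordered oldest -> newest

    def spawn_interpreter(connect, key):
        if len(_live) >= MAX_INTERPRETERS:
            _live.pop().shutdown()   # drop the most recently created
        interp = connect(key)        # e.g. a new become_user interpreter
        _live.append(interp)
        return interp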
Done: precise standard IO emulation
Ansible’s complex semantics for when it does/does not merge stdout and
stderr during module runs are respected in every case, including
emulation of extraneous \r characters. This may seem like a tiny and
pointless nit, however it is almost certainly the difference between a tested
real-world playbook succeeding under the extension or breaking horribly.
Done: async tasks
We’re on the third iteration of asynchronous tasks, and I really don’t want to
waste any more time on it. The new implementation works a lot more like
Ansible’s existing implementaion, for as much as that implementation can be
said to “work” at all.
Done: better error messages
Connection errors no longer crash with an inscrutable stack trace, but trigger
Ansible’s internal error handling by raising the right exception types.
Mitogen’s logging integration with the Ansible display framework is much
improved, and errors and warnings correctly show up on the console in red
without having to specify -vvv.
Still more work to do on this when internal RPCs fail, but that’s less likely
to be triggered than a connection error.
New debugging mode
An “emergency” debugging mode has been added, in the form of
MITOGEN_DUMP_THREAD_STACKS=1. When this is present, every interpreter will
dump the stack of every thread into the logging framework every 5 seconds,
allowing hangs to be more easily diagnosed directly from the controller.
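The underlying mechanism is simple enough to sketch in a few lines, assuming
nothing of Mitogen's internals: a daemon thread walks sys._current_frames()
on an interval and hands each stack to the logging framework.

    import logging, sys, threading, time, traceback

    def dump_stacks(interval=5):
        while True:
            time.sleep(interval)
            for thread_id, frame in sys._current_frames().items():
                stack = ''.join(traceback.format_stack(frame))
                logging.getLogger('stacks').debug(
                    'thread %x:\n%s', thread_id, stack)

    logging.basicConfig(level=logging.DEBUG)
    threading.Thread(target=dump_stacks, daemon=True).start()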
While adding this, it struck me that there is a really sweet piece of
functionality missing here that would be easy to add – an interactive
debugger. This might turn up in the form of an in-process web server allowing
viewing the full context hierarchy, and running code snippets against remotely
executing stacks, much like Werkzeug’s interactive debugger.
In addition to simply not being my focus recently, a lot of the new
functionality has introduced import statements that impact code running in
the target, and so performance has likely slipped a little from the original
posted benchmarks, most likely during run startup in the presence of a
high-latency link.
I will be back to investigate these problems (and fix those for which no
investigation is required – the module loader!) once all remaining
functionality is stable.
This seemingly simple function has required the greatest deal of thought of
any issue I've encountered so far. The initial problem relates to flow
control, and the absence of any natural mechanism to block a producer (file
server) while intermediary pipe buffers (i.e. the SSH connection) are filled.
Even when flow control exists, an additional problem arises since with Mitogen
there is no guarantee that one SSH connection = one target machine, especially
once connection delegation is implemented. Some kind of bandwidth sharing
mechanism must also exist, without poorly reimplementing the entirety of TCP/IP
in a Python script.
For the initial release I have settled on a basic design that should ensure the
available bandwidth is fully utilized, with each upload target having its file
data served on a first-come-first-served basis.
When any file transfer is active, one of the service threads in the associated
connection multiplexer process (the same ones used for setting up connections)
will be dedicated to a long-running loop that monitors every connected stream’s
transmit queue size, enqueuing additional file chunks as the queue drains.
Files are served one-at-a-time to make it more likely that if a run is
interrupted, rather than having every partial file transfer thrown away, at
least a few targets will have received the full file, allowing that copy to be
skipped when the play is restarted.
The initial implementation will almost certainly be replaced eventually, but
this basic design should be sufficient for what is needed today, and should
continue to suffice when connection delegation is implemented.
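In outline the serving loop looks something like this, with the queue and
stream objects standing in for Mitogen's internals:

    import time

    CHUNK = 128 * 1024
    LOW_WATER = 1024 * 1024   # keep roughly this much queued per stream

    def serve_files(streams):
        while any(s.pending_files for s in streams):
            for stream in streams:
                # First-come-first-served: always serve the oldest file,
                # topping the transmit queue up only as it drains.
                if stream.pending_files and stream.tx_queue_size() < LOW_WATER:
                    chunk = stream.pending_files[0].read(CHUNK)
                    if chunk:
                        stream.enqueue(chunk)
                    else:
                        stream.pending_files.pop(0)   # file complete
            time.sleep(0.01)   # the real loop blocks in the multiplexer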
Testing / CI
The smattering of unit and integration tests that exist are running and
passing under Travis CI. In preparation
for a release, master is considered always-healthy and my development
has moved to a new dmw branch.
I’m taking a “mostly top down” approach to testing, written in the form of
Ansible playbooks, as this gives the widest degree of coverage, ensuring that
high level Ansible behaviour is matched with/without the extension installed.
For each new test written, the result must pass under regular Ansible in
addition to Ansible with the extension.
“Bottom up” type tests are written as needs arise, usually when Ansible’s user
interface doesn’t sufficiently expose whatever is being tested.
Also visible in Travis is a debops_common target: this is running all 255
tasks from DebOps' common.yml against
a Docker instance. It’s the first in what should be 4-5 similar DebOps jobs,
deploying real software with the final extension.
I have begun exploring integrating the extension with Ansible’s own integration
tests, but it looks likely this is too large a job for Travis. Work here is
ongoing.
This is the first in what I hope will be at least a bi-weekly series to keep
backers up to date on the current state of delivering the Mitogen extension
for Ansible. I’m trying
to use every second I have wisely until every major time risk is taken care
of, so please forgive the knowledge-dump style of this post :)
Well ahead of time. Some exciting new stuff popped up, none of it intractably
difficult.
I have some fabulous news on funding: in addition to what was already public
on Kickstarter, significant additional funding has become available, enough
that I should be able to dedicate full time to the project for at least
another 10 weeks!
Naturally this has some fantastic implications, including making it
significantly likely that I'll be able to implement Topology-aware File
Synchronization.
I could not previously commit to Python 3 support, due to worry it would
become a huge and destabilizing time sink, ruining any chance of delivering
more immediately useful work.
The missing piece (exception syntax) needed to support everything from Python
2.4 all the way to
3.x has been found - it came via an extraordinarily fruitful IRC chat with the
Ansible guys, and was originally implemented in Ansible itself by Marius
Gedminas. With this last piece of the puzzle, the
only bugs left to worry about are renamed imports and the usual bytes/str
battles. Both are trivial to address with strong tests - something already due
for the coming weeks. It now seems almost guaranteed Python 3 will be
completed as part of this work, although I am still holding off on a 100%
commitment until more pressing concerns are addressed.
New Risk: multiplexer throughput
Some truly insane performance bugs have been found and fixed already,
particularly around the stress caused by delivering huge single
messages, however during that work
a new issue was found: IO multiplexer throughput truly sucks for many small
messages.
This doesn't impact things much except in one area: file transfer. While I
haven’t implemented a final solution for file transfer yet, as part of that I
will need to address what (for now) seems a hard single-thread performance
limit: Mitogen’s current IO loop cannot push more than ~300MiB/sec in
128KiB-sized chunks, or to put it another way, best case 3MiB/sec given 100
targets.
Single thread performance: the obvious solution is sharding the
multiplexer across multiple processes, and already that was likely required
for completing the multithreaded connect work. This is a straightforward
change that promises to comfortably saturate a Gigabit Ethernet port using a
2011 era Macbook while leaving plenty of room for components further up
(Ansible) and down (ssh) the stack.
TTY layer: I’ve already implemented some fixes for this (increase buffer
sizes, reduce loop iterations), but found some ugly new problems as a result:
the TTY layer in every major UNIX has, at best, around a 4KiB buffer, forcing
many syscalls and loop iterations, and it seems on no OS is this buffer
tunable. Fear not, there is already a kick-ass solution for this.
This problem should disappear entirely by the time real file transfer support
is implemented - today the extension is still delivering files as a single
large message. The blocker to fixing that is a missing flow control mechanism
to prevent saturation of the message queue, which requires a little research.
This hopefully isn’t going to be a huge amount of work, and I’ve already got a
bunch of no-brainer yet hacky ways to fix it.
New risk: task isolation
It was only a matter of time, but the first isolation-related bug was
found, due to a class variable in a
built-in Ansible module that persists some state across invocations of the
module’s main() function. I’d been expecting something of this sort, so
already had ideas for solving it when it came up, and really it was quite a
surprise that only one such bug was reported out of all those reports from
testers.
The obvious solution is forking a child for each task by
default, however as always the devil is in the details,
and in many intractable ways forking actually introduces state sharing
problems far deadlier than those it promises to solve, in addition to
introducing a huge (3ms on Xeon) penalty that is needless in most cases.
Basically forking is absolute hell to get right - even for a tiny 2 kLOC
library written almost entirely by one author who wrote his first fork()
call somewhere in the region of 20 years ago, and I’m certain this is liable
to become a support nightmare.
The most valuable de facto protection afforded by fork - memory safety - is
pretty redundant in an almost perfectly memory-safe language like Python;
indeed that safety is much of why the language is so popular at all.
Meanwhile forking is needed anyway for robust implementation of asynchronous
tasks, so while implementing it would never have been wasted work, it is not
obvious to me that forking could or should ever become the default mode. It
amounts to a very ripe field for impossible to spot bugs of much harder
classes than the simple solution of running everything in a single process,
where we only need to care about version conflicts, crap monkey patches,
needlessly global variables and memory/resource leaks.
I’m still exploring the solution space for this one, current thinking is
maybe (maybe! this is totally greenfield) something like:
Built-in list of fixups for ridiculously easy to repair bugs, like the
yum_repository example above.
Whitelist for in-process execution any module known (and manually audited)
to be perfectly safe. Common with_items modules like lineinfile easily
fit in this class.
Whitelist for in-process safe but nonetheless leaky modules, such as the
buggy yum_repository module above that simply needs its bytecode
re-executed (100usec) to paper over the bug. Can’t decide whether to keep
this mode or not - or simply merge it with the above mode.
Default to forking (3ms - max 333 with_items/sec) for all unknown bespoke
(user) modules and built-in modules of dubious quality, with a
mitogen_task_isolation variable permitting the mode to be overridden by
the user on a per-task basis. “Oh that one loop is eating 45 minutes? Try
it with mitogen_task_isolation=none”
All the Mitogen-side forking bits are implemented already, and I’m deferring
the Ansible-side bits to be done simultaneous to supporting exotic module
types, since that whole chunk of code needs a rewrite and there is no point in
rewriting it twice.
Meanwhile whatever the outcome of this work, be assured you will always have
your cake and eat it - this project is all about fixing performance, not
regressing it. I hope this entire topic becomes a tiny implementation detail
in the coming weeks.
On the testing front I was absolutely overjoyed to discover
DebOps by way of a Mitogen bug report. This deserves a
whole article on its own, meanwhile it represents what is likely to be a huge
piece of the testing puzzle.
A big chunk is already implemented in order to fix an unrelated bug! The
default pool size is 16 threads in one process, so there will only be a minor
performance penalty for the first task to run when the number of targets
exceeds 16. Meanwhile, the queue size is adjustable via an environment
variable. I’ll tidy this up later.
Even though it basically already exists, I’m not yet focused on making
multithreaded connect work - including analysing the various performance
weirdness that appears when running Mitogen against multiple targets. These
definitely exist, I just haven’t made time yet to determine whether it’s an
Ansible-side scaling issue or a Mitogen-side issue. Stay tuned and don’t
worry! Multi-target runs are already zippy, and I’m certain any issues found
can be addressed.
At least a full day will be dedicated to nothing but coming up with new attack
scenarios, meanwhile I’m feeling pretty good about security already. The
fabulous Alex Willmer has been busily inventing
new cPickle attack scenarios, and some of them are absolutely fantastically
scary! He’s sitting on at least one exciting new attack that represents a
no-brainer decider on the viability of keeping cPickle or replacing it.
Serialization aside, I’ve been busy comparing Ansible’s existing security
model to what the extension provides today, and have at least identified
unidirectional routing mode as a
must-have for delivering the extension. Regarding that, it is possible to have
a single playbook safely target 2 otherwise completely partitioned networks.
Today with Mitogen, one network could route messages towards workers in the
other network using the controller as a bridge. While this should be harmless
(given existing security mitigations), it still introduces a scary capability
for an attacker that shouldn’t exist.
Really screwed up on planning here - turns out Ansible on Windows does not use
Python whatsoever, and so implementing the support in Mitogen would mean
increasing the installation requirements for Windows targets. That’s stupid,
it violates Ansible’s zero-install design and was explicitly a non-goal from
the get go.
Meanwhile WinRM has extremely poor options for bidirectional IO, and likely
viable Mitogen support for Windows will include introducing a, say,
SSL-encrypted reverse connection from the target machine in order to get a
usable bidirectional channel.
I will shortly be polling everyone who has pledged towards the project, and if
nobody speaks up to save Windows, it’s being pushed to the back of the queue.
A big, big thanks, once again!
It goes without saying but none of this work has been a lone effort, starting
from planning, article review, funding, testing, and an endless series of
suggestions, questions and recommendations coming from so many people. Thanks
to everyone, whether you contributed a single $1 or a single typo bug report.
Super busy, but also super on target! Until next time..
It’s been an incredibly intense first week crowdfunding the Mitogen
extension for Ansible involving far more effort than anticipated, where I
have worked almost flat out from waking until the early hours just to ensure
any queries are answered thoroughly. I cannot complain, because it has been so
much fun that I’d change almost nothing of the experience, and already the
campaign has reached 46% from the exposure it received.
As a recap Mitogen is a library for writing distributed programs that require
zero deployment, with the prototype extension implementing an architectural
change that vastly improves Ansible’s performance in common scenarios,
laying a framework to extend this advantage far beyond simple overhead
reduction.
A great deal of work has simply been staying on top of bug reports and ensuring
experiences with the prototype are solid – for each report from one tester, we
can assume 10 more hit the same bug but did not or could not report it.
Of the many reports received, I have addressed almost all of them promptly.
One standout was a report via Reddit
of a performance improvement so fantastical that it exceeds even my most
contrived overhead-heavy example:
"With mitogen my playbook runtime went from 45 minutes to just
under 3 minutes. Awesome work!"
This is a common theme – anywhere with_items
appears, Mitogen has the most profound impact. The obvious reason is that
during loops the same module is executed repeatedly, and after one iteration it
is guaranteed to be compiled and ready on the target.
So many lessons!
Developing the campaign from a thought exercise one idle Sunday evening into an
actually practical project has taken a lot of work – far more than I
anticipated, and at almost every step I have learned something novel. This is
all reusable knowledge for anyone attempting a similar project in future, and
I will write it up as time permits.
Regardless of outcomes the campaign has already proven one very exciting
result: real users will stake real money towards something as seemingly
mundane as free infrastructure, and I think that’s beyond amazing. In a world
content to throw millions of dollars at junk ICOs almost weekly, crowdfunding
free software seems to me a practice that should happen far more often.
I wish to thank everyone for the support shown thus far, and I’d encourage you
to consider tapping that Ansible user you know on the shoulder to let them know
about the project. For those working close to infrastructure consulting, please
consider using the final week to corner your boss regarding associating your
company logo with a sexy project that promises to receive many eyeballs over
the coming years.
Allegedly on site as a developer, two summers ago I found myself in a situation
you are no doubt familiar with, where despite preferences unrelated problems
inevitably gravitate towards whoever can deal with them. Following an
exhausting day spent watching a dog-slow Ansible job fail repeatedly, one
evening I dusted off a personal aid to help me relax: an ancient, perpetually
unfinished hobby project whose sole function until then had simply been to
remind me things can always improve.
Something of a miracle had struck by the early hours of the next morning, as almost
every outstanding issue had been solved, and to my disbelief the code ran
reliably. 18 months later and for the first time in living memory, I am excited
to report delivery of that project, one of sufficient complexity as to have
warranted extreme persistence - in this case from concept to implementation,
over more than a decade.
The miracle? It comes in the form of Mitogen
- a tiny Python library you won't have heard of, but one that I hope, as an
Ansible user, you will soon eternally be glad for, on discovering ansible-playbook
now completes in very reasonable time even in the face of deeply unreasonable
networks.
Mitogen is a library for writing distributed programs that require zero
deployment, specifically designed to fit the needs of infrastructure software
like Ansible. Without upfront configuration it supports any UNIX machine
featuring an installed Python interpreter, which is to say almost all of them.
While the concept is hard to explain - even to fellow engineers - its value is
easy to grasp:
This trace shows two Ansible runs of a basic
100-step playbook over a 1 ms latency network against a single target host.
The first run employs SSH
pipelining, Ansible’s current most optimal configuration, where it consumes
almost 4.5 Mbytes network bandwidth in a running time of 59 secs.
The second uses the prototype Mitogen extension
for Ansible, with a far more reasonable 90 Kbytes consumed in 8.1 secs.
An unmodified playbook executes over 7 times faster while consuming 50x less
network bandwidth.
Less than half the CPU time was consumed on the host machine, meaning that
by one metric it should handle at least twice as many targets. Crucially
no changes were required to the target machine, including new software or
nasty on-disk caches to contend with.
While only pure overhead is measured above, the benefits very much extend
to real-world scenarios. See the documentation
(1.75x time) and issue #85 (4.2x time, 3.1x CPU) for examples.
How is this possible?
Mitogen is perhaps most easily described as a kind of network-capable fork() on
steroids. It allows programs to establish lazily-loaded duplicates on remote
hosts, without requiring any upfront remote disk writes, and to communicate
with those copies once they exist. The copies can in turn recursively split to
produce further children - with bidirectional message routing between every
copy handled automatically.
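In code, the minimal experience looks something like the sketch below
(hostname invented; context.call() is the documented way to invoke a function
in a child):

    import mitogen.master

    def get_uptime():
        # Executes on the remote machine; imports belong there too.
        with open('/proc/uptime') as fp:
            return float(fp.read().split()[0])

    broker = mitogen.master.Broker()
    router = mitogen.master.Router(broker)
    try:
        # Nothing exists on the target beyond sshd and Python.
        host = router.ssh(hostname='web1.example.com')
        print('up %.0f seconds' % host.call(get_uptime))
    finally:
        broker.shutdown()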
In the context of Ansible, unlike with SSH pipelining where up to one SSH
invocation, sudo invocation and script compilation are required for every
playbook step, and with all scripts re-uploaded for each step, with Mitogen
only one of each exists per target for the duration of the playbook run, with
all code cached in RAM between steps. Absolutely everything is reused,
saving 300-800 ms on every step.
The extension represents around a week’s work, replaces hundreds of lines of
horrid shell-related code in Ansible, and is already at the point where on one
real-world playbook, Ansible is only 2% slower than equivalent SSH
commands. Presently connection establishment is single-threaded, so the
prototype is only good for a few hosts, but rest assured this limitation's days
are numbered.
Not just a speed up, a paradigm shift you’ll adore
If this seems impressive and couldn’t be improved upon, prepare for some deep
shocks. You can think of the extension not just as a performance
improvement, but something of a surreptitious beachhead from which I intend
to thoroughly assault your sense of reality.
This performance is a side effect of a far more interesting property: Ansible
is no longer running on just the host machine, but temporarily distributed
throughout the target network for the duration of the run, with bidirectional
communication between all pieces, and you won't believe the crazy functionality
this enables.
What if I told you it were possible not only to eliminate that final 2%, but
turn it sharply negative, while simultaneously reducing resource consumption?
“Surely Ansible can’t execute faster than equivalent raw SSH commands?” You
bet it can! And if you care about such things, this could be yours by
Autumn. Read on..
Pushing brains into the ether, no evil agents required
As I teased last
year, Ansible takes its name from a faster-than-light communication
device from science fiction, yet despite these improvements it is still
fundamentally bound by the speed with which information physically propagates.
Pull and agent-based tooling is strongly advantageous here: control flow occurs
at the same point as the measurements necessary to inform that flow, and no
penalty is incurred for traversing the network.
Today, reducing latency in Ansible means running it within the target network,
or in pull mode,
where the playbook is stored on the target alongside for example, secrets for
decrypting any vaults, and the hairy mechanics required to keep that in sync
and executing when appropriate. This is a far cry from the simplicity of
tapping ansible-playbook live.yml on your laptop, and so it is an option of
last resort.
What would be amazing is some hybrid where we could have the performance and
scaleability benefits of pull, combined with the stateless simplicity of push,
without introducing dedicated hosts or permanent caches and agents running on
the target machines, that amount to persistent intermediate state and
introduce huge headaches of their own, all without sacrificing the fabulous
ability to shut everything down with a simple CTRL+C.
The opening volley: connection delegation
As a first step to exploiting previously impossible functionality, I will
enhance the extension to support delegating connection establishment to a
machine on the target network, avoiding the cost of establishing hundreds of
SSH connections over a low throughput, high latency network link.
Unlike with SSH proxying, this has the huge benefit of caching and serving
Ansible code from RAM on the intermediary, avoiding uploading approximately
50KiB of code for every playbook step, and ensuring those cached responses are
delivered over the low latency LAN fabric on the target network. For 100 target
machines, this replaces the transmission of 5 Mbytes of data for every
playbook step with on the order of kilobytes worth of tiny remote procedure
calls.
All the Mitogen-side infrastructure for this exists today, and is already used
to implement become support.
It could be flipped on with a few lines of code in the Ansible extension, but
there are a few more importer
bugs to fix before it’ll work perfectly.
Finally as a reminder, since Mitogen operates recursively delegation also
operates recursively, with code caching and connection establishment
happening at each hop. Not only is this useful for navigating slow links and
complicated firewall setups, as we'll see, it enables some exciting new
possibilities.
Ansible is intended to manage many machines simultaneously, and while the
extension's improvements presently work well for single-machine playbooks, that
alone is a niche application for many users.
Having the newfound ability to delegate connection establishment to an
intermediary on the target network, far away from our laptop’s high latency 3G
connection, and with the ability to further sub-delegate from that
intermediary, we can implement a divide and conquer strategy, forming a large
tree comprising the final network of target machines for the playbook run, with
responsibility for caching and connection multiplexing evenly divided across
the tree, neatly avoiding single resource bottlenecks.
I will rewrite Mitogen’s connection establishment to be asynchronous: creation
of many downstream connections can be scheduled in parallel, with the ability
to enqueue commands prior to completion, including recursive commands that
would cause those connections to in turn be used as intermediaries.
The cost of establishing connections should become only the cost of code upload
(~50KiB) and the latency of a single SSH connection per tree layer, as
connections at each layer occur in parallel. For an imaginary 1,700 node
cluster split into quarters of 17 racks and 25 nodes per rack, connection via a
300 ms 3G network should complete in well under 15 seconds.
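A back-of-envelope check of that figure, with every number below an assumption
for illustration rather than a measurement:

    WAN_RTT = 0.300        # laptop -> datacentre over 3G, seconds
    LAN_RTT = 0.001        # within the datacentre
    SSH_RTTS = 5           # round-trips to establish one SSH session
    UPLOAD = 50 * 1024     # ~50KiB of code per fresh connection
    UPLINK = 1e6 / 8       # assume ~1Mbit/sec 3G upstream, bytes/sec

    # 4 quarters x 17 racks x 25 nodes == 1,700 targets in three layers.
    # Connections within a layer proceed in parallel, so each layer costs
    # one SSH setup plus one code upload; only the first pays WAN latency.
    total = SSH_RTTS * WAN_RTT + UPLOAD / UPLINK          # first layer
    total += 2 * (SSH_RTTS * LAN_RTT + UPLOAD / 12.5e6)   # two LAN layers
    print('%.1f seconds' % total)   # ~2s, comfortably under 15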
Topology-aware file synchronization
So you have a playbook on your laptop deploying a Django application via the
module, to 100 Ubuntu machines running in a datacentre 300 ms away. Each run of
the playbook entails a groan followed by a long walk, as a 3.8 second rsync run is
invoked 100 times via your 3G connection, just to synchronize a 3 Mbyte asset
the design team won’t stop tweaking. Not only are there 6 minutes of roundtrips
buried in those invocations, but that puny 3G connection is forced to send a
total of 300 Mbytes toward the target network.
What is the point of continually re-sending that file to the same set of
machines in some far-off network? What if it could be uploaded exactly once,
then automatically cached and redistributed within the target network,
producing exactly one upload per layer in the hierarchy:
Why stop at delegating connection establishment and module caching? Now we have
a partial copy of Ansible within the network, nothing prevents implementing all
kinds of smarts. Here is another feature that is a cinch to build once
bidirectional communication exists between topology-aware code, which the
prototype extension already provides today.
After a brutal 4 hour meeting involving 10 executives our hero Bob, Senior
Disaster Architect III, emerges bloodstained yet victorious against the
tyrannical security team, as his backends can talk with impunity to the entire
Internet just so apt-get can reach packages.debian.org for the 15
seconds Bob’s daily Ansible CI job requires.
That evening, having regaled his giddy betrothed (HR Coordinator II) with his
heroic story of war, Bob catches a brief yet chilling glimmer of doubt for all
that transpired. “Was there another way?” he sleepily ponders, before
succumbing to a cosier battle waged by those fatigued and heavy eyelids.
Suddenly aware again, Bob emerges bathed in a mysterious utopian dreamscape
where CI jobs executed infinitely quickly, war and poverty did not exist, and
the impossible had always been possible.
Building on Mitogen’s
message routing, forwarding all kinds of pipes and network sockets becomes
trivial, including schemes that would allow exposing a transient, locked down
HTTP proxy to Bob’s apt-get invocation only for as long as necessary, all
with a few lines of YAML in a playbook.
While this is already possible with SSH forwarding, the hand-configuration
involved is messy, and becomes extremely hairy when the target of the forward
is not the host machine. My initial goal is to support forwarding of UNIX and
TCP sockets, as they cover all use cases I have in mind. Speaking of which..
Topology-aware Git pull
Another common security fail seen in Ansible playbooks is to call Git directly
from target machines, including granting those machines access to a Git
server. This is a horrid violation: even read-only access implies the machine
needs permanent firewall rules that shouldn’t exist, just for the scant
moments a pull is in progress. Granting backends access to a site as complex as
GitHub.com, you may as well abandon all outbound firewalling, as this is enough
for even the puniest script kiddy to exfiltrate a production database.
What if Git could run with the permissions of the local Ansible user, on the
user’s own machine, and be served efficiently to the target machines only for
the duration of the push, faster than 100 machines talking to GitHub.com, and
only to the single read-only repository intended?
Building on generalized forwarding, topology-aware Git repeats all the caching
and single-upload tricks of file synchronization, but this time implementing
the Git protocol between each node.
In the scheme I will implement, a single round-trip is necessary for git-fetch-pack to pull
just the changed objects from the laptop over the high latency 3G link, before
propagating at LAN speeds throughout the target network, with git-ls-remote output
delivered as part of the message that initiates the pull. Not only is the
result more efficient than a normal git-pull, but backends no
longer require network access to Git.
The final word: Inversion of control
Remember we talked about making Ansible run faster than equivalent SSH
commands? Well, today Ansible requires one network round-trip per playbook
step, so just like SSH, it must pay the penalty for every round-trip unless
something gives, and that something is the partial delegation of control to the
target machine itself.
With inversion of control, the role of ansible-playbook simply becomes that
of shipping code and selective chunks of data to target machines, where those
machines can execute and make control decisions without necessitating a
conversation with the master after each step, just to figure out what to do
next.
Ansible has all the framework to enable implementing this today, by
significantly extending the prototype extension’s existing strategy plug-in,
and teaching it how to automatically send and wait on batches of tasks, rather
than on single tasks at a time.
Aside from improved performance, the semantics of the existing linear
strategy will be preserved, and playbooks need not be changed to cope: on the
target machine tasks will not suddenly begin running concurrently, or in any
order different to previously.
App-level connection persistence
As a final battle against latency during playbook development and debugging, I
will support detaching the connection tree from ansible-playbook on exit,
and teach the extension to reuse it at startup. This will reduce the overhead
of repeat runs, especially against many targets, to the order of hundreds of
milliseconds, as no new SSH connections, module compilations or code uploads
need occur.
Connection persistence opens the floodgates for adding sweet new tooling,
although I’m not sure how desirable it is to expose an implementation detail
like this forever, while also extending the interface provided by Ansible
itself. As a simple example, we could provide an ansible-ssh tool that
reuses the connection tree along with Ansible’s tunnelling, delegation, dynamic
inventory and authentication configuration to forward a pipe to a remote shell.
The cost of slow tooling
Ansible has over
28,500 stars on GitHub, representing just those users who have a GitHub
account and ever thought to star it, and appears to grow by 150 stars per week.
Around London the going rate to hire one user is $100/hour, and conservatively,
we could expect that user is trotting out a 15 minute run of ansible-playbook
live.yml at least once per week.
We can expect that if Ansible is running merely twice as slowly as necessary,
7.5 minutes of that run is lost productivity, and across those 28,500 users,
the economic cost is in the region of $356,250 per invocation or
$17,100,000 per year. In reality the average user is running Ansible far
more often, including thousands of times per minute under various CI systems
worldwide, and those runs often last far longer than 15 minutes, but I’d
recommend that mental guesstimation is left as an exercise to readers who are
already blind drunk.
The future is beautiful if you want it to be
My name is David, and nothing jinxes my day quite like slow tooling. I have
poured easily 500 hours in some form into this project over a decade and on my
own time. The project has now reached an inflection point where the fun part is
over, the science is done and the effect is real, and only a small, highly
predictable set of milestones remain to deliver what I hope you agree is a much
better future.
Before reading I doubt you would have believed it possible to provide the
features described without a complex infrastructure running in the target
network, now I hope you’ll join me in disproving one final impossibility.
While everything here will exist in time, it cannot exist in 2018 without
your support, and that’s why I’d like to try something crazy, that would
allow me to devote myself to delivering a vastly improved daily routine for
thousands of people just like you and me.
You may have guessed already: I want you to crowdfund awesome tooling.
What value would you place on an extra productive hour every working week? In
the UK that’s an easy question: it’s around $4,800 per year. And what risk is
there to contributing $100 to an already proven component? I hope you’ll agree
this too is a no-brainer, both for you and your employer.
To encourage success I’m offering a unique permanent placement of your brand on
the GitHub repository and documentation. Funds will be returned if the minimum
goal cannot be reached, however just 3 weeks are sufficient to ensure a well
tested extension, with my full attention given to every bug, ready to save many
hours right on time to enjoy the early sunlight of Spring.
Totalling much less than the economic damage caused by a single run of today’s
Ansible, the grand plan is divided into incrementally related stretch goals. I
cannot imagine this will achieve full funding, but if it does, as a finale
I’ll deliver a feature built on Ansible that you never dreamed possible.
As a modern area, deployment tooling is exposed to the ebb and flow of the
software industry far more than typical, and unexpected disruption happens
continuously. Without ongoing evolution, exposure to buggy and unfamiliar new
tooling is all but guaranteed, with benefits barely justifying the cost of
their integration. As we know all too well, rational ideas like cost/benefit
rarely win the hearts of buzzword-hungry and youthful infrastructure teams, so
counterarguments must be presented another way.
As a recent example there is growing love for mgmt, which is designed from the
outset as an agent-based reactive distributed system, much as Mitogen nudges
Ansible towards. However unlike mgmt, Ansible preserves its zero-install and
agentless nature, while laying a sound framework for significantly more
exciting features. If that alone does not win loyalty, we’re at least
guaranteed that every migration-triggering new feature implemented in such
systems can be headed off with minimal effort, long into the foreseeable
future.