This is the second update on the status of developing the Mitogen extension for Ansible, only 2 weeks late!
Too long, didn’t read
Gearing up to remove the scary warning labels and release a beta! Running a little behind, but not terribly. Every major risk is solved except file transfer, which should be addressed this week.
23 days, 257 commits, 186 files changed, 7292 insertions(+), 1503 deletions(-)
Just tuning in?
Started: Python 3 Support
A very rough branch exists for this, and I’m landing volleys of fixes when I have downtime between bigger pieces of work. Ideally this should have been ready by the end of April, but it may take a few weeks more.
I originally hoped to have a clear board before starting this; instead it is being interwoven as busywork when I need a break from whatever else I’m working on.
Done: multiplexer throughput
The situation has improved massively. Hybrid TTY/socketpair mode is a thing and, as promised, it helps significantly, just not quite as much as I hoped.
Today on a 2011-era Macbook Pro Mitogen can pump an SSH client/daemon at around 13MB/sec, whereas scp in the same configuration hits closer to 19MB/sec. In the case of SSH, moving beyond this is not possible without a patched SSH installation, since SSH hard-wires its buffer sizes around 16KB, with no ability to override them at runtime.
With multiple SSH connections that 13MB/sec should cleanly multiply up, since every connection can be served in a single IO loop iteration.
A bunch of related performance fixes were landed, including removal of yet another special case for handling deferred function calls, only taking locks when necessary, and reducing how often the stream implementations modify the readability/writability status of their descriptors.
As we’re in the ballpark of existing tools, I’m no longer considering this as much of a priority as before. There is definitely more low-hanging fruit, but out-of-the-box behaviour should no longer raise eyebrows.
Done: task isolation
As before, by default each script is compiled once, however it is now re-executed in a spotless namespace prior to each invocation, working around any globals/class variable sharing issues that may be present. The cost of this is negligible, on the order of 100 usec.
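For illustration, here is a minimal sketch of the idea in plain Python; this is not the extension’s actual code, and the helper names are invented:

```python
# Compile the module source once, then exec the cached code object into a
# brand new namespace for every invocation, so module-level globals cannot
# leak state between tasks.

_code_cache = {}

def run_module(path, source):
    code = _code_cache.get(path)
    if code is None:
        code = compile(source, path, 'exec')   # pay the compilation cost only once
        _code_cache[path] = code

    # A fresh dict per invocation: any globals or class variables the module
    # mutates are discarded when the call completes.  The re-exec itself costs
    # on the order of 100 usec for a typical module.
    namespace = {'__name__': '__main__', '__file__': path}
    exec(code, namespace)
    return namespace
```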
When this is insufficient, a mitogen_task_isolation=fork per-task variable exists to allow explicitly forcing a particular module to run in a new process. Enabling this by default causes something on the order of a 33% slowdown, which is much better than expected, but still not good enough to enable forking by default.
Aside from building up a blacklist of modules that should always be forked, task isolation is pretty much all done, with just a few performance regressions remaining to fix in the forking case.
Done: exotic module support
Every style of Ansible module is supported aside from the prehistoric “module replacer” type. That means today all of these work and are covered by automated tests.
Python module support was updated to remove the monkey-patching in use before. Instead, sys.stdin, sys.stdout and sys.stderr are redirected to StringIO objects, allowing a much larger variety of custom user scripts to be run in-process even when they don’t use the new-style Ansible module APIs.
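The technique itself is simple enough to sketch in a few lines; this shows the general idea only, not the extension’s real code:

```python
import sys
from io import StringIO

def run_in_process(code_obj):
    # Swap sys.stdin/stdout/stderr for in-memory StringIO objects, run the
    # module, then restore the real objects and hand back the captured output.
    saved = sys.stdin, sys.stdout, sys.stderr
    sys.stdin = StringIO()       # the module sees an empty stdin
    sys.stdout = StringIO()
    sys.stderr = StringIO()
    try:
        try:
            exec(code_obj, {'__name__': '__main__'})
        except SystemExit:
            pass                 # modules routinely call sys.exit(); keep their output
    finally:
        out, err = sys.stdout.getvalue(), sys.stderr.getvalue()
        sys.stdin, sys.stdout, sys.stderr = saved
    return out, err

# out, err = run_in_process(compile("print('hello from a module')", '<module>', 'exec'))
```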
Done: free strategy support
The "free" strategy can now be used by specifying ANSIBLE_STRATEGY=mitogen_free
. The mitogen
strategy is now an alias of mitogen_linear
.
Done: temporary file handling
This should be identical to Ansible’s handling in all cases.
Done: interpreter recycling
An upper bound exists to prevent a remote machine from being spammed with thousands of Python interpreters, which was previously possible when e.g. using a with_items loop that templatized become_user.
Once 20 interpreters exist, the extension shuts down the most recently created interpreter before starting a new one. This strategy isn’t perfect, but it should suffice to avoid raised eyebrows in most common cases for the time being.
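The policy amounts to something like the sketch below; the class and method names are invented for illustration and this is not the extension’s real API:

```python
MAX_INTERPRETERS = 20   # upper bound per target before recycling kicks in

class InterpreterPool(object):
    def __init__(self):
        self._alive = []   # (key, interpreter) pairs, ordered oldest to newest

    def get(self, key, construct):
        # Reuse an existing interpreter when e.g. the same become_user recurs.
        for existing_key, interp in self._alive:
            if existing_key == key:
                return interp

        # At the cap: shut one down before spawning another on the target, so
        # a templatized with_items loop can no longer pile up thousands.
        if len(self._alive) >= MAX_INTERPRETERS:
            _, victim = self._alive.pop()   # most recently created, as described above
            victim.shutdown()

        interp = construct(key)
        self._alive.append((key, interp))
        return interp
```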
Done: precise standard IO emulation
Ansible’s complex semantics for when it does/does not merge stdout and stderr during module runs are respected in every case, including emulation of extraneous \r characters. This may seem like a tiny and pointless nit, however it is almost certainly the difference between a tested real-world playbook succeeding under the extension or breaking horribly.
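The specific quirk being reproduced is roughly this; a sketch of the mechanism, not the extension’s exact code:

```python
def emulate_tty_newlines(output):
    # A TTY in its default mode translates "\n" to "\r\n" on output, so module
    # output that historically travelled over ssh/sudo via a pty is littered
    # with carriage returns.  Playbooks sometimes depend on that, so faithful
    # emulation means re-adding them.
    return output.replace('\n', '\r\n')

# emulate_tty_newlines('line 1\nline 2\n') == 'line 1\r\nline 2\r\n'
```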
Done: async tasks
We’re on the third iteration of asynchronous tasks, and I really don’t want to waste any more time on it. The new implementation works a lot more like Ansible’s existing implementation, for as much as that implementation can be said to “work” at all.
Done: better error messages
Connection errors no longer crash with an inscrutable stack trace, but trigger Ansible’s internal error handling by raising the right exception types.
Mitogen’s logging integration with the Ansible display framework is much improved, and errors and warnings correctly show up on the console in red without having to specify -vvv.
Still more work to do on this when internal RPCs fail, but that’s less likely to be triggered than a connection error.
New debugging mode
An “emergency” debugging mode has been added, in the form of MITOGEN_DUMP_THREAD_STACKS=1. When this is present, every interpreter will dump the stack of every thread into the logging framework every 5 seconds, allowing hangs to be more easily diagnosed directly from the controller machine’s logs.
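For the curious, what such a mode has to do is roughly the following; this is a sketch, not the extension’s exact code:

```python
import logging
import os
import sys
import threading
import time
import traceback

LOG = logging.getLogger('mitogen')

def _dump_stacks_forever(interval=5):
    # Every few seconds, format the stack of every thread in this interpreter
    # and ship it to the logging framework, which already forwards log records
    # back towards the controller.
    while True:
        time.sleep(interval)
        for thread_id, frame in sys._current_frames().items():
            stack = ''.join(traceback.format_stack(frame))
            LOG.debug('stack for thread %r:\n%s', thread_id, stack)

if os.environ.get('MITOGEN_DUMP_THREAD_STACKS'):
    thread = threading.Thread(target=_dump_stacks_forever)
    thread.daemon = True    # never keep an interpreter alive just for debugging
    thread.start()
```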
While adding this, it struck me that there is a really sweet piece of functionality missing here that would be easy to add – an interactive debugger. This might turn up in the form of an in-process web server allowing viewing the full context hierarchy, and running code snippets against remotely executing stacks, much like Werkzeug’s interactive debugger.
Performance regressions
In addition to simply not being my focus recently, a lot of the new functionality has introduced import statements that impact code running in the target, and so performance has likely slipped a little from the original posted benchmarks, most likely during run startup in the presence of a high latency network.
I will be back to investigate these problems (and fix those for which no investigation is required – the module loader!) once all remaining functionality is stable.
File Transfer
This seemingly simple function has required the greatest deal of thought of any issue I’ve encountered so far. The initial problem relates to flow control, and the absence of any natural mechanism to block a producer (file server) while intermediary pipe buffers (i.e. the SSH connection) are filled.
Even when flow control exists, an additional problem arises since with Mitogen there is no guarantee that one SSH connection = one target machine, especially once connection delegation is implemented. Some kind of bandwidth sharing mechanism must also exist, without poorly reimplementing the entirety of TCP/IP in a Python script.
For the initial release I have settled on a basic design that should ensure the available bandwidth is fully utilized, with each upload target having its file data served on a first-come-first-served basis.
When any file transfer is active, one of the service threads in the associated connection multiplexer process (the same ones used for setting up connections) will be dedicated to a long-running loop that monitors every connected stream’s transmit queue size, enqueuing additional file chunks as the queue drains.
Files are served one-at-a-time to make it more likely that if a run is interrupted, rather than having every partial file transfer thrown away, at least a few targets will have received the full file, allowing that copy to be skipped when the play is restarted.
The initial implementation will almost certainly be replaced eventually, but this basic design should be sufficient for what is needed today, and should continue to suffice when connection delegation is implemented.
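Roughly, the serving loop might look like the sketch below; the stream attributes and methods it uses (pending_files, queued_chunks, enqueue) are invented for illustration rather than taken from the real implementation:

```python
import time

CHUNK_SIZE = 128 * 1024   # size of each file chunk enqueued per stream
LOW_WATER = 4             # top a stream up when fewer chunks than this remain queued

def serve_files(streams):
    # One dedicated service thread watches every connected stream's transmit
    # queue and tops it up as it drains, serving one file per target at a time
    # on a first-come-first-served basis.
    while any(stream.pending_files for stream in streams):
        for stream in streams:
            if not stream.pending_files:
                continue
            fp = stream.pending_files[0]            # one file at a time per target
            while stream.queued_chunks() < LOW_WATER:
                chunk = fp.read(CHUNK_SIZE)
                if not chunk:
                    stream.pending_files.pop(0)     # finished; the next file may start
                    break
                stream.enqueue(chunk)               # non-blocking: the queue absorbs it
        time.sleep(0.05)                            # crude pacing, for the sketch only
```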
Testing / CI
The smattering of unit and integration tests that exist are running and passing under Travis CI. In preparation for a release, master is considered always-healthy and my development has moved to a new dmw branch.
I’m taking a “mostly top down” approach to testing, written in the form of Ansible playbooks, as this gives the widest degree of coverage, ensuring that high level Ansible behaviour is matched with/without the extension installed. For each new test written, the result must pass under regular Ansible in addition to Ansible with the extension.
“Bottom up” type tests are written as needs arise, usually when Ansible’s user interface doesn’t sufficiently expose whatever is being tested.
Also visible in Travis is a debops_common target: this is running all 255 tasks from DebOps common.yml against a Docker instance. It’s the first in what should be 4-5 similar DebOps jobs, deploying real software with the final extension.
I have begun exploring integrating the extension with Ansible’s own integration tests, but it looks likely this is too large a job for Travis. Work here is ongoing.
Security
A few items have been chipped off the list.
Access policies were added to mitogen.service: a service cannot be exposed without attaching an access policy to it. Notably absent is unidirectional routing mode; I will make time to finish that shortly.
User bug fixes
Summary
Super busy, slightly behind! Until next time..
This is the first in what I hope will be at least a bi-weekly series to keep backers up to date on the current state of delivering the Mitogen extension for Ansible. I’m trying to use every second I have wisely until every major time risk is taken care of, so please forgive the knowledge-dump style of this post :)
Haven't been following?
Too long, didn’t read
Well ahead of time. Some exciting new stuff popped up, none of it intractably scary.
Funding Update
I have some fabulous news on funding: in addition to what was already public on Kickstarter, significant additional funding has become available, enough that I should be able to dedicate full time to the project for at least another 10 weeks!
Naturally this has some fantastic implications, including making it significantly likely that I’ll be able to implement Topology-aware File Synchronization.
I could not commit to this due to worrying that Python 3 would become a huge and destabilizing time sink, ruining any chance of delivering more immediately useful functionality.
The missing piece (exception syntax) to support from Python 2.4 all the way to 3.x has been found - it came via an extraordinarily fruitful IRC chat with the Ansible guys, and was originally implemented in Ansible itself by Marius Gedminas. With this last piece of the puzzle, the only bugs left to worry about are renamed imports and the usual bytes/str battles. Both are trivial to address with strong tests - something already due for the coming weeks. It now seems almost guaranteed Python 3 will be completed as part of this work, although I am still holding off on a 100% commitment until more pressing concerns are addressed.
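For anyone curious what that looks like in practice, the trick (a sketch of the technique rather than a quote of the actual code) is to avoid both the Python 2-only and Python 3-only except-clause spellings and pull the active exception out of sys.exc_info() instead:

```python
import sys

def risky():
    raise ValueError('boom')

# "except ValueError, e" is a syntax error on Python 3, and
# "except ValueError as e" is a syntax error on Python 2.4/2.5, but fetching
# the exception object via sys.exc_info() parses everywhere from 2.4 to 3.x.
try:
    risky()
except ValueError:
    e = sys.exc_info()[1]
    print('caught: %s' % (e,))
```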
New Risk: multiplexer throughput
Some truly insane performance bugs have been found and fixed already, particularly around the stress caused by delivering huge single messages, however during that work a new issue was found: IO multiplexer throughput truly sucks for many small messages.
This doesn’t impact things much except in one area: file transfer. While I haven’t implemented a final solution for file transfer yet, as part of that I will need to address what (for now) seems a hard single-thread performance limit: Mitogen’s current IO loop cannot push more than ~300MiB/sec in 128KiB-sized chunks, or to put it another way, best case 3MiB/sec given 100 targets.
Single thread performance: the obvious solution is sharding the multiplexer across multiple processes, and that was likely required anyway for completing the multithreaded connect work. This is a straightforward change that promises to comfortably saturate a Gigabit Ethernet port using a 2011-era Macbook while leaving plenty of room for components further up (Ansible) and down (ssh) the stack.
TTY layer: I’ve already implemented some fixes for this (increased buffer sizes, fewer loop iterations), but found some ugly new problems as a result: the TTY layer in every major UNIX has, at best, around a 4KiB buffer, forcing many syscalls and loop iterations, and it seems this buffer is not tunable on any OS. Fear not, there is already a kick-ass solution for this too.
This problem should disappear entirely by the time real file transfer support is implemented - today the extension is still delivering files as a single large message. The blocker to fixing that is a missing flow control mechanism to prevent saturation of the message queue, which requires a little research. This hopefully isn’t going to be a huge amount of work, and I’ve already got a bunch of no-brainer yet hacky ways to fix it.
New risk: task isolation
It was only a matter of time, but the first isolation-related bug was found, due to a class variable in a built-in Ansible module that persists some state across invocations of the module’s main() function. I’d been expecting something of this sort, so already had ideas for solving it when it came up, and really it was quite a surprise that only one such bug was reported out of all those reports from initial testers.
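For readers unfamiliar with the failure mode, here is a contrived illustration of the bug class; it is not the actual module involved:

```python
# A class attribute initialized at import time silently survives between calls
# to main() when the module stays resident in a single long-lived process.

class Repo(object):
    seen = []                       # class variable: shared by every instance

    def add(self, name):
        self.seen.append(name)      # mutates the shared list, not a per-instance copy

def main(name):
    repo = Repo()
    repo.add(name)
    return list(Repo.seen)

print(main('epel'))      # ['epel']
print(main('updates'))   # ['epel', 'updates'] - state leaked from the previous run
```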
The obvious solution is forking a child for each task by default, however as always the devil is in the details, and in many intractable ways forking actually introduces state sharing problems far deadlier than those it promises to solve, in addition to introducing a huge (3ms on Xeon) penalty that is needless in most cases. Basically forking is absolute hell to get right - even for a tiny 2 kLOC library written almost entirely by one author who wrote his first fork() call somewhere in the region of 20 years ago, and I’m certain this is liable to become a support nightmare.
The most valuable de facto protection afforded by fork, memory safety, is pretty redundant in an almost perfectly memory safe language like Python; that safety is a large part of why the language is so popular in the first place.
Meanwhile forking is needed anyway for a robust implementation of asynchronous tasks, so while implementing it would never have been wasted work, it is not obvious to me that forking could or should ever become the default mode. It amounts to a very ripe field for impossible-to-spot bugs of much harder classes than the simple solution of running everything in a single process, where we only need to care about version conflicts, crap monkey patches, needlessly global variables and memory/resource leaks.
I’m still exploring the solution space for this one, current thinking is maybe (maybe! this is totally greenfield) something like:
Built-in list of fixups for ridiculously easy to repair bugs, like the yum_repository example above.
Whitelist for in-process execution of any module known (and manually audited) to be perfectly safe. Common with_items modules like lineinfile easily fit in this class.
Whitelist for in-process safe but nonetheless leaky modules, such as the buggy yum_repository module above that simply needs its bytecode re-executed (100usec) to paper over the bug. Can’t decide whether to keep this mode or not - or simply merge it with the above mode.
Default to forking (3ms - max 333 with_items/sec) for all unknown bespoke (user) modules and built-in modules of dubious quality, with a mitogen_task_isolation variable permitting the mode to be overridden by the user on a per-task basis. “Oh that one loop is eating 45 minutes? Try it with mitogen_task_isolation=none”
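To make the shape of that list concrete, here is a very rough sketch of how the decision might eventually look; every name below is invented for illustration and none of it is implemented as shown:

```python
FIXUPS = {'yum_repository': 'reset_leaky_class_vars'}   # known bugs with built-in repairs
INPROCESS_SAFE = {'lineinfile'}                          # manually audited as perfectly safe
REEXEC_ONLY = set()                                      # leaky, but fine after a bytecode re-exec

def choose_isolation(module_name, task_vars):
    override = task_vars.get('mitogen_task_isolation')
    if override:
        return override                   # per-task user override always wins
    if module_name in FIXUPS or module_name in INPROCESS_SAFE:
        return 'none'                     # run directly in the existing process
    if module_name in REEXEC_ONLY:
        return 'reexec'                   # ~100usec to re-run the bytecode
    return 'fork'                         # ~3ms default for bespoke or dubious modules
```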
All the Mitogen-side forking bits are implemented already, and I’m deferring the Ansible-side bits to be done simultaneously with supporting exotic module types, since that whole chunk of code needs a rewrite and there’s no point in rewriting it twice.
Meanwhile whatever the outcome of this work, be assured you will always have your cake and eat it - this project is all about fixing performance, not regressing it. I hope this entire topic becomes a tiny implementation detail in the coming weeks.
CI
On the testing front I was absolutely overjoyed to discover DebOps by way of a Mitogen bug report. This deserves a whole article on its own, meanwhile it represents what is likely to be a huge piece of the testing puzzle.
Multithreaded connect
A big chunk is already implemented in order to fix an unrelated bug! The default pool size is 16 threads in one process, so there will only be a minor performance penalty for the first task to run when the number of targets exceeds 16. Meanwhile, the queue size is adjustable via an environment variable. I’ll tidy this up later.
Even though it basically already exists, I’m not yet focused on making multithreaded connect work - including analysing the various performance weirdness that appears when running Mitogen against multiple targets. These definitely exist, I just haven’t made time yet to determine whether it’s an Ansible-side scaling issue or a Mitogen-side issue. Stay tuned and don’t worry! Multi-target runs are already zippy, and I’m certain any issues found can be addressed.
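The basic shape is just a fixed pool of worker threads establishing connections in parallel, something like the sketch below; the environment variable and helper names are invented for illustration:

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Illustrative only: a fixed-size pool of connection workers.
POOL_SIZE = int(os.environ.get('EXAMPLE_POOL_SIZE', '16'))

def connect_all(hosts, connect_one):
    # With 16 workers, the first task against e.g. 50 hosts pays roughly
    # ceil(50 / 16) sequential connection rounds instead of 50.
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        return list(pool.map(connect_one, hosts))
```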
Security
At least a full day will be dedicated to nothing but coming up with new attack scenarios, meanwhile I’m feeling pretty good about security already. The fabulous Alex Willmer has been busily inventing new cPickle attack scenarios, and some of them are absolutely fantastically scary! He’s sitting on at least one exciting new attack that represents a no-brainer decider on the viability of keeping cPickle or replacing it.
Serialization aside, I’ve been busy comparing Ansible’s existing security model to what the extension provides today, and have at least identified unidirectional routing mode as a must-have for delivering the extension. Regarding that mode: it is possible for a single playbook to safely target 2 otherwise completely partitioned networks, but today with Mitogen one network could route messages towards workers in the other network using the controller as a bridge. While this should be harmless (given existing security mitigations), it still introduces a scary capability for an attacker that shouldn’t exist.
Some more security bugs I’m fixing here
Deferring Windows support
Really screwed up on planning here - turns out Ansible on Windows does not use Python whatsoever, and so implementing the support in Mitogen would mean increasing the installation requirements for Windows targets. That’s stupid: it violates Ansible’s zero-install design and was explicitly a non-goal from the get-go.
Meanwhile WinRM has extremely poor options for bidirectional IO, and viable Mitogen support for Windows will likely involve introducing a, say, SSL-encrypted reverse connection from the target machine in order to get efficient IO.
I will shortly be polling everyone who has pledged towards the project, and if nobody speaks up to save Windows, it’s being pushed to the back of the queue.
A big, big thanks, once again!
It goes without saying that none of this work has been a lone effort, spanning planning, article review, funding, testing, and an endless series of suggestions, questions and recommendations from so many people. Thanks to everyone, whether you contributed a single $1 or a single typo bug report.
Summary
Super busy, but also super on target! Until next time..