November 18, 2015

Introducing sd-event

The Event Loop API of libsystemd

When we began working on systemd we built it around a hand-written ad-hoc event loop, wrapping Linux epoll. The more our project grew the more we realized the limitations of using raw epoll:

  • As we used timerfd for our timer events, each event source cost one file descriptor and we had many of them! File descriptors are a scarce resource on UNIX, as RLIMIT_NOFILE is typically set to 1024 or similar, limiting the number of available file descriptors per process to 1021, which isn't particularly a lot.

  • Ordering of event dispatching became a nightmare. In many cases, we wanted to make sure that a certain kind of event would always be dispatched before another kind of event, if both happen at the same time. For example, when the last process of a service dies, we might be notified about that via a SIGCHLD signal, via an sd_notify() "STATUS=" message, and via a control group notification. We wanted to get these events in the right order, to know when it's safe to process and subsequently release the runtime data systemd keeps about the service or process: it shouldn't be done if there are still events about it pending.

  • For each program we added to the systemd project we noticed we were adding similar code, over and over again, to work with epoll's complex interfaces. For example, finding the right file descriptor and callback function to dispatch an epoll event to, without running into invalidated pointer issues is outright difficult and requires non-trivial code.

  • Integrating child process watching into our event loops was much more complex than one could hope, and even more so if child process events should be ordered against each other and unrelated kinds of events.

Eventually, we started working on sd-bus. At the same time we decided to seize the opportunity, put together a proper event loop API in C, and then not only port sd-bus on top of it, but also the rest of systemd. The result of this is sd-event. After almost two years of development we declared sd-event stable in systemd version 221, and published it as official API of libsystemd.


sd-event.h, of course, is not the first event loop API around, and it doesn't implement any really novel concepts. When we started working on it we tried to do our homework, and checked the various existing event loop APIs, maybe looking for candidates to adopt instead of doing our own, and to learn about the strengths and weaknesses of the various implementations existing. Ultimately, we found no implementation that could deliver what we needed, or where it would be easy to add the missing bits: as usual in the systemd project, we wanted something that allows us access to all the Linux-specific bits, instead of limiting itself to the least common denominator of UNIX. We weren't looking for an abstraction API, but simply one that makes epoll usable in system code.

With this blog story I'd like to take the opportunity to introduce you to sd-event, and explain why it might be a good candidate to adopt as event loop implementation in your project, too.

So, here are some features it provides:

  • I/O event sources, based on epoll's file descriptor watching, including edge triggered events (EPOLLET). See sd_event_add_io(3).

  • Timer event sources, based on timerfd_create(), supporting the CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTIME clocks, as well as the CLOCK_REALTIME_ALARM and CLOCK_BOOTTIME_ALARM clocks that can resume the system from suspend. When creating timer events a required accuracy parameter may be specified which allows coalescing of timer events to minimize power consumption. For each clock only a single timer file descriptor is kept, and all timer events are multiplexed with a priority queue. See sd_event_add_time(3).

  • UNIX process signal events, based on signalfd(2), including full support for real-time signals, and queued parameters. See sd_event_add_signal(3).

  • Child process state change events, based on waitid(2). See sd_event_add_child(3).

  • Static event sources, of three types: defer, post and exit, for invoking calls in each event loop, after other event sources or at event loop termination. See sd_event_add_defer(3).

  • Event sources may be assigned a 64bit priority value, that controls the order in which event sources are dispatched if multiple are pending simultanously. See sd_event_source_set_priority(3).

  • The event loop may automatically send watchdog notification messages to the service manager. See sd_event_set_watchdog(3).

  • The event loop may be integrated into foreign event loops, such as the GLib one. The event loop API is hence composable, the same way the underlying epoll logic is. See sd_event_get_fd(3) for an example.

  • The API is fully OOM safe.

  • A complete set of documentation in UNIX man page format is available, with sd-event(3) as the entry page.

  • It's pretty widely available, and requires no extra dependencies. Since systemd is built on it, most major distributions ship the library in their default install set.

  • After two years of development, and after being used in all of systemd's components, it has received a fair share of testing already, even though we only recently decided to declare it stable and turned it into a public API.

Note that sd-event has some potential drawbacks too:

  • If portability is essential to you, sd-event is not your best option. sd-event is a wrapper around Linux-specific APIs, and that's visible in the API. For example: our event callbacks receive structures defined by Linux-specific APIs such as signalfd.

  • It's a low-level C API, and it doesn't isolate you from the OS underpinnings. While I like to think that it is relatively nice and easy to use from C, it doesn't compromise on exposing the low-level functionality. It just fills the gaps in what's missing between epoll, timerfd, signalfd and related concepts, and it does not hide that away.

Either way, I believe that sd-event is a great choice when looking for an event loop API, in particular if you work on system-level software and embedded, where functionality like timer coalescing or watchdog support matter.

Getting Started

Here's a short example how to use sd-event in a simple daemon. In this example, we'll not just use sd-event.h, but also sd-daemon.h to implement a system service.

#include <alloca.h>
#include <endian.h>
#include <errno.h>
#include <netinet/in.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

#include <systemd/sd-daemon.h>
#include <systemd/sd-event.h>

static int io_handler(sd_event_source *es, int fd, uint32_t revents, void *userdata) {
        void *buffer;
        ssize_t n;
        int sz;

        /* UDP enforces a somewhat reasonable maximum datagram size of 64K, we can just allocate the buffer on the stack */
        if (ioctl(fd, FIONREAD, &sz) < 0)
                return -errno;
        buffer = alloca(sz);

        n = recv(fd, buffer, sz, 0);
        if (n < 0) {
                if (errno == EAGAIN)
                        return 0;

                return -errno;

        if (n == 5 && memcmp(buffer, "EXIT\n", 5) == 0) {
                /* Request a clean exit */
                sd_event_exit(sd_event_source_get_event(es), 0);
                return 0;

        fwrite(buffer, 1, n, stdout);
        return 0;

int main(int argc, char *argv[]) {
        union {
                struct sockaddr_in in;
                struct sockaddr sa;
        } sa;
        sd_event_source *event_source = NULL;
        sd_event *event = NULL;
        int fd = -1, r;
        sigset_t ss;

        r = sd_event_default(&event);
        if (r < 0)
                goto finish;

        if (sigemptyset(&ss) < 0 ||
            sigaddset(&ss, SIGTERM) < 0 ||
            sigaddset(&ss, SIGINT) < 0) {
                r = -errno;
                goto finish;

        /* Block SIGTERM first, so that the event loop can handle it */
        if (sigprocmask(SIG_BLOCK, &ss, NULL) < 0) {
                r = -errno;
                goto finish;

        /* Let's make use of the default handler and "floating" reference features of sd_event_add_signal() */
        r = sd_event_add_signal(event, NULL, SIGTERM, NULL, NULL);
        if (r < 0)
                goto finish;
        r = sd_event_add_signal(event, NULL, SIGINT, NULL, NULL);
        if (r < 0)
                goto finish;

        /* Enable automatic service watchdog support */
        r = sd_event_set_watchdog(event, true);
        if (r < 0)
                goto finish;

        if (fd < 0) {
                r = -errno;
                goto finish;
        } = (struct sockaddr_in) {
                .sin_family = AF_INET,
                .sin_port = htobe16(7777),
        if (bind(fd, &, sizeof(sa)) < 0) {
                r = -errno;
                goto finish;

        r = sd_event_add_io(event, &event_source, fd, EPOLLIN, io_handler, NULL);
        if (r < 0)
                goto finish;

        (void) sd_notifyf(false,
                          "STATUS=Daemon startup completed, processing events.");

        r = sd_event_loop(event);

        event_source = sd_event_source_unref(event_source);
        event = sd_event_unref(event);

        if (fd >= 0)
                (void) close(fd);

        if (r < 0)
                fprintf(stderr, "Failure: %s\n", strerror(-r));

        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;

The example above shows how to write a minimal UDP/IP server, that listens on port 7777. Whenever a datagram is received it outputs its contents to STDOUT, unless it is precisely the string EXIT\n in which case the service exits. The service will react to SIGTERM and SIGINT and do a clean exit then. It also notifies the service manager about its completed startup, if it runs under a service manager. Finally, it sends watchdog keep-alive messages to the service manager if it asked for that, and if it runs under a service manager.

When run as systemd service this service's STDOUT will be connected to the logging framework of course, which means the service can act as a minimal UDP-based remote logging service.

To compile and link this example, save it as event-example.c, then run:

$ gcc event-example.c -o event-example `pkg-config --cflags --libs libsystemd`

For a first test, simply run the resulting binary from the command line, and test it against the following netcat command line:

$ nc -u localhost 7777

For the sake of brevity error checking is minimal, and in a real-world application should, of course, be more comprehensive. However, it hopefully gets the idea across how to write a daemon that reacts to external events with sd-event.

For further details on the functions used in the example above, please consult the manual pages: sd-event(3), sd_event_exit(3), sd_event_source_get_event(3), sd_event_default(3), sd_event_add_signal(3), sd_event_set_watchdog(3), sd_event_add_io(3), sd_notifyf(3), sd_event_loop(3), sd_event_source_unref(3), sd_event_unref(3).


So, is this the event loop to end all other event loops? Certainly not. I actually believe in "event loop plurality". There are many reasons for that, but most importantly: sd-event is supposed to be an event loop suitable for writing a wide range of applications, but it's definitely not going to solve all event loop problems. For example, while the priority logic is important for many usecase it comes with drawbacks for others: if not used carefully high-priority event sources can easily starve low-priority event sources. Also, in order to implement the priority logic, sd-event needs to linearly iterate through the event structures returned by epoll_wait(2) to sort the events by their priority, resulting in worst case O(n*log(n)) complexity on each event loop wakeup (for n = number of file descriptors). Then, to implement priorities fully, sd-event only dispatches a single event before going back to the kernel and asking for new events. sd-event will hence not provide the theoretically possible best scalability to huge numbers of file descriptors. Of course, this could be optimized, by improving epoll, and making it support how todays's event loops actually work (after, all, this is the problem set all event loops that implement priorities -- including GLib's -- have to deal with), but even then: the design of sd-event is focussed on running one event loop per thread, and it dispatches events strictly ordered. In many other important usecases a very different design is preferable: one where events are distributed to a set of worker threads and are dispatched out-of-order.

Hence, don't mistake sd-event for what it isn't. It's not supposed to unify everybody on a single event loop. It's just supposed to be a very good implementation of an event loop suitable for a large part of the typical usecases.

Note that our APIs, including sd-bus, integrate nicely into sd-event event loops, but do not require it, and may be integrated into other event loops too, as long as they support watching for time and I/O events.

And that's all for now. If you are considering using sd-event for your project and need help or have questions, please direct them to the systemd mailing list.

November 08, 2015

systemd.conf 2015 Summary

systemd.conf 2015 is Over Now!

Last week our first systemd.conf conference took place at betahaus, in Berlin, Germany. With almost 100 attendees, a dense schedule of 23 high-quality talks stuffed into a single track on just two days, a productive hackfest and numerous consumed Club-Mates I believe it was quite a success!

If you couldn't attend the conference, you may watch all talks on our YouTube Channel. The slides are available online, too.

Many photos from the conference are available on the Google Events Page. Enjoy!

I'd specifically like to thank Daniel Mack, Chris Kühl and Nils Magnus for running the conference, and making sure that it worked out as smoothly as it did! Thank you very much, you did a fantastic job!

I'd also specifically like to thank the CCC Video Operation Center folks for the excellent video coverage of the conference. Not only did they implement a live-stream for the entire talks part of the conference, but also cut and uploaded videos of all talks to our YouTube Channel within the same day (in fact, within a few hours after the talks finished). That's quite an impressive feat!

The folks from LinuxTag e.V. put a lot of time and energy in the organization. It was great to see how well this all worked out! Excellent work!

(BTW, LinuxTag e.V. and the CCC Video Operation Center folks are willing to help with the organization of Free Software community events in Germany (and Europe?). Hence, if you need an entity that can do the financial work and other stuff for your Free Software project's conference, consider pinging LinuxTag, they might be willing to help. Similar, if you are organizing such an event and are thinking about providing video coverage, consider pinging the the CCC VOC folks! Both of them get our best recommendations!)

I'd also like to thank our conference sponsors! Specifically, we'd like to thank our Gold Sponsors Red Hat and CoreOS for their support. We'd also like to thank our Silver Sponsor Codethink, and our Bronze Sponsors Pengutronix, Pantheon, Collabora, Endocode, the Linux Foundation, Samsung and Travelping, as well as our Cooperation Partners LinuxTag and, and our Media Partner

Last but not least I'd really like to thank our speakers and attendees for presenting and participating in the conference. Of course, the conference we put together specifically for you, and we really hope you had as much fun at it as we did!

Thank you all for attending, supporting, and organizing systemd.conf 2015! We are looking forward to seeing you and working with you again at systemd.conf 2016!


November 06, 2015

Trinity 1.6

As alluded to in my last post, a few days ago I released a new version of Trinity.
The bulk of the work in this release happened prior to my burn out back in July. The combination of everything described in that post, and general unhappiness in my last job etc led to me just wanting to walk away from everything for an indeterminate amount of time.

Distance is good. I’ve continued to poke at trinity in small amounts since then. At last weeks kernel summit, a number of people expressed just how useful they find Trinity and how much they were bummed to find out I wasn’t working on it any more. With that feedback, I felt motivated to clean the decks and get 1.6 out. There’s a short description of most of the bigger changes below, but there were probably a whole bunch more changes made that I forgot to highlight in the shortlog.

With that release wrapped up, and with the fresh perspective of having been ‘away’ from the project for a while, when I was travelling last week, I started work on some new features, starting with implementing a generic object cache instead of hard coding a “remember this” set of functionality for every single object type a syscall could return. A relatively small amount of code, which should make life easier to support recycling syscall results for syscalls other than mmap (which is all that’s implemented right now).

So,.. while I’m working on this stuff again, it’s not the comeback many would like. I don’t know just how much time I’m going to have to devote to working on Trinity. From time to time, I suspect I’ll find some intersection between my work at Facebook and the sort of targeted testing that Trinity is useful for, but it’s not my primary focus, and probably won’t be again. Additionally, I’ve got a bunch of ideas for new projects I’m itching to work on that spawned from discussions last week, so “spare time” hacking effort might be devoted more to them in future.

tl;dr: Don’t send me feature requests. I’ve got more than enough ideas for stuff *I* want to implement. Diffs speak louder than words.

Summary of some of the bigger changes to Trinity since the last (1.5) tarball release include:

  • Assorted improvements to the tuned random number generation.
    (Including one particularly stupid bug where sometimes all child processes would get the same seed, and end up doing the same syscalls. oops)
  • Various networking related improvements/fixes:
    • tcp: add TCP_TIMESTAMP, TCP_NOTSENT_LOWAT & TCP_CC_INFO socket options.
    • ipv6: Improved generation of random addresses. (No longer just localhost)
    • ipv6: Added 14 missing socket options.
    • ipv6: Now passes correct lengths for socket options. (Note: This change may break older glibcs: See this patch.)
    • Beginnings of some better proto-alg sockaddr generation.
    • Recognise PF_IB and PF_MPLS network protocols
    • Socket generation improvements. (Picks right socket type to go with protocol)
    • Now supports an ARG_SOCKETINFO for syscalls that operate primarily on sockets. (Still occasionally passes random fd’s)
    • accept,accept4,bind,connect,getpeername,getsockname,recv,setsockopt,send converted to use ARG_SOCKETINFO.
    • setsockopt now also matches the protocol of the socket passed to the right setsockopt args.
    • netlink socket generation fix (pid is a portid, not a process id)
    • The -P parameter no longer accepts the incomprehensible numeric form of arguments, just names.
    • The PF_ prefix to the -P parameter is now optional, so you can just say ‘UNIX’ instead of ‘PF_UNIX’.
  • Updates to keep up with new upstream kernel changes.
    • Updated perf_event_open syscall to include 4.1 changes
    • Updated syscall lists
      alpha: execveat, getrandom, memfd_create
      s390[x]: execveat, NUMA related syscalls
      parisc: execveat
    • mips: add new prctls for PR_SET_FP_MODE / PR_GET_FP_MODE
    • Support for new fallocate flags (FALLOC_FL_INSERT_RANGE)
  • Watchdog:
    • Remove some false-positive triggering checks from the watchdog.
    • Watchdog process is now nice’d to -19
    • Monitor how many processes are currently stalled.
    • If all child processes are stalled, send SIGKILLs to 50%
  • Misc:
    • New fd generators for drm dumb buffers & inotify watches.
    • blacklist /dev/sd* from the fd list, so we can be a bit safer when running as root with –dropprivs
    • Fixed the ‘bind process to CPU’ code to only pick online CPUs.
    • Self-corruption checks added to child processes, like the watchdog code already did.
    • Remove guard pages around shm.
    • In debug mode, write protect the shm before making syscalls.
    • Refactoring of logging code.
    • Various code cleanups as usual.
    • No longer tries to mmap 1GB pages if running with less than 8GB free.

The post Trinity 1.6 appeared first on

November 05, 2015

Native file choosers in Gtk+

Recently I have been working on support for a filechooser portal for sandboxed applications (in xdg-app). The way these work is that the application triggers a file chooser, then the actual user interaction happens outside the sandbox, and the application only recieves the file data after the user finished the interaction.

Ideally something like this would be completely hidden by the toolkit, and the application would just use the regular file chooser APIs. However, the Gtk+ filechooser APIs expose too much details about the file chooser dialog, which means it has to be a regular in-process widget. Unfortunately this means we can’t replace it by an out-of-process dialog.

What we need is a Gtk+ API for the file chooser that hides the details of how the dialog works. At this point I realized that this is something that has been requested a lot in a different context. Such an API would allow us to plug in platform-native file chooser dialogs.

So, I got to work, and today I landed support for native Windows file choosers in Gtk:

win32 file chooser in gtkWe will also look at implementing an OSX version of this to ensure that the APIs work for the common cases.

Now that we have this people can start porting their applications (which is quite trivial in the common case, as the APIs is very similar to the old APIs). And once the apps are ported they will automatically get support for the filechooser portal to make them work in a sandboxed environment.

So, anyone maintaining a Gtk 3 application that want to work better on other platforms, or when sandboxed should take a look at the new GtkFileChooserNative API. Right now it only exists in git, but once there is a new release it should show up in the API docs.

Cool hardware in Tokyo

At the Mitaka Summit, we finally got some interesting kit exhibited, after the relatively lean summits in Atlanta and Vancouver. Unfortunately, the lightning in the Marketplace was very weird and pictures came out poorly.

My personal favourite is probably the flash array by SanDisk. It's nothing but JBOF, the host connection is SAS. You'd think any idiot could slap a few flash chips on cards and plug them into backplane... But just look how elegant it is. The capacity of the 2U box is 512 TB, but the whole thing only consumes 700 W maximum. It's brilliant, really.

Unfortunately, I don't have a good picture, but the second best was Ericksson's passive optical backplane. It promises to make your cables last forever: just swap out optronics when new bit rates come along. Even a terabit! Now it may actually be a misguided product. If they cannot get 3rd party vendors to build modules for it, the whole things comes crashing to the ground. Ditto if they build, but overprice. But the audacity of making something that's different is to be acknowledged. And frankly I'm not a fan of re-cabling when new servers come about.

Intel wins a consolation prize for preservance. They quietly presented some kind of next-generation multiblock computer, with pieces connected by serial cables. Finally, the future dreamed by the creators of Infiniband is here - only 15 years late, and still we don't know if it is viable.

There was also a bunch of fairly mundane boxes. Various also-run flash vendors were present, of course. Interestingly, SolidFire had a booth, but without anything eye-catching. Resting on the laurels? IBM brought their newest PowerPC, which was mostly remarkable for still existing. That sort of thing.

November 04, 2015

kernel summit 2015 wrap-up

Exhausting travel aside, kernel summit in Seoul was a good use of time.
Most of the sessions didn’t feel as interactive as prior years, in part I think because there really wasn’t a lot of objection, even to some
of the more controversial things. Kees’ security talk went over pretty well even if it did depress most the people in the room. Hopefully something good will come of it. The restartable sequences feature got talked about but didn’t get much (if any) real pushback.

There were a few hallway discussions surrounding various upcoming
kernel functionality that didn’t get ‘airtime’ in the sessions.
The kernel TLS stuff was probably discussed more in depth at netconf, and assorted VM features were covered more at LSFMM
earlier this year. Quite a few people talking excitedly about eBPF, both from a networking point of view, and soon.. tracing.
Quite a few people still seem concerned (rightly) about the upcoming unpriveledged bpf syscall.

It seems by fracturing the kernel summit into lots of smaller events the deep-dives into new features/problems happens there, leaving the kernel summit more for executive summary type talks, and as has been the general push over the last decade more and more process related discussions.

On process, Sasha’s discussion on stable was probably the most interesting to me personally. GregKH agreed to make 4.4 the next LTS starting a new tradition of “the next LTS is the one after the kernel summit”. We’ll see how that works out.

Chris Mason gave a “what went good/bad when facebook moved to 4.0” talk. Which for the most part, was all good. There are a few small things that are still being shaken out, but it’s by no means awful.

I had a lot of hallway conversations that began “so, trinity..”
The short answer there is that I’m still working on it, though at a much reduced pace than I was a year ago. It was good to hear feedback
from pretty much everyone I talked to that it was something that people value, which was a good motivator. More on that later.

I also had a lot of people asking a lot of questions about my Facebook bootcamp experience. I’ll do a longer write-up of that soon.

The post kernel summit 2015 wrap-up appeared first on

October 28, 2015

Darcy on the future of storage

Quick comment on the following:

Good morning, madam. What kind of storage system would you like me to build for you today?

Scary thought. That means that selling storage products is going to be hard for all of us. We'll be selling components, both hardware and software, or we'll be selling integration and support services. Somebody will always pay to have somebody else assemble the parts, maybe add some light customization, and support the result. There's a nice living to be made there... but no empires.

Why is it a problem that no empires are to be built? It's only a problem for an empire-builder like I dunno... Sam Altman or something. Darcy is an old engineer, not a startup founder. A good one, too. His kids aren't going to go to bed hungry.

We've been at this dance before with Linux. People have been asking if Red Hat was going to be like Microsoft, and I told everyone: nope. We're transfering the wealth that the proprietary lock-in vendors were collecting back to the users. That was the whole idea. In the process, we're collecting less - a more reasonable amount, necessary to put stuff together and make it run. Therefore, we're not going to be as wealthy off users' backs. But the society as a whole benefits.

So cry me a river. Not scary at all. But RTWT, I think he's drawing a truthful outline overall.

P.S. Another thing, what's magical about storage? Why, I can go build spacecraft when storage goes bust. Or whatever. Of course it's a pity for all the storage-specific techniques and skills that I accumulated, but eh. As long as we leave behind the good code (and docs), it's all good.

October 18, 2015

Second Round of systemd.conf 2015 Sponsors

Second Round of systemd.conf 2015 Sponsors

We are happy to announce the second round of systemd.conf 2015 sponsors! In addition to those from the first announcement, we have:

Our second Gold sponsor is Red Hat!

What began as a better way to build software—openness, transparency, collaboration—soon shifted the balance of power in an entire industry. The revolution of choice continues. Today Red Hat® is the world's leading provider of open source solutions, using a community-powered approach to provide reliable and high-performing cloud, virtualization, storage, Linux®, and middleware technologies.

A Bronze sponsor is Samsung:

From the beginning we have established a very fast pace and are currently one of the biggest and fastest growing modern-technology R&D centers in East-Central Europe. We have started with designing subsystems for digital satellite television, however, we have quickly expanded the scope of our interest. Currently, it includes advanced systems of digital television, platform convergence, mobile systems, smart solutions, and enterprise solutions. Also a vital role in our activity plays the quality and certification center, which controls the conformity of Samsung Electronics products with the highest standards of quality and reliability.

A Bronze sponsor is travelping:

Travelping is passionate about networks, communications and devices. We empower our customers to deploy and operate networks using our state of the art products, solutions and services. Our products and solutions are based on our industry proven physical and virtual appliance platforms. These purpose built platforms ensure best in class performance, scalability and reliability combined with consistent end to end management capabilities. To build this products, Travelping has developed a own embedded, cross platform Linux distribution called which incorporates the systemd service manager and tools.

A Bronze sponsor is Collabora:

Collabora has over 10 years of experience working with top tier OEMs & silicon manufacturers worldwide to develop products based on Open Source software. Through the use of Open Source technologies and methodologies, Collabora helps clients in multiple market segments gain faster time to market and save millions of dollars in licensing and maintenance costs. Collabora has already brought to market several products relying on systemd extensively.

A Bronze sponsor is Endocode:

Endocode AG. An employee-owned, software engineering company from Berlin. Open Source is our heart and soul.

A Bronze sponsor is the Linux Foundation:

The Linux Foundation advances the growth of Linux and offers its collaborative principles and practices to any endeavor.

We are Cooperating with LinuxTag e.V. on the organization:

LinuxTag is Europe's leading organizer of Linux and Open Source events. Born of the community and in business for 20 years, we organize LinuxTag, an annual conference and exhibition attracting thousands of visitors. We also participate and cooperate in organizing workshops, tutorials, seminars, and other events together with and for the Open Source community. Selected events include non-profit workshops, the German Kernel Summit at FrOSCon, participation in the Open Tech Summit, and others. We take care of the organizational framework of systemd.conf 2015. LinuxTag e.V. is a non-profit organization and welcomes donations of ideas and workforce.

A Media Partner is Golem: is an up to date online-publication intended for professional computer users. It provides technology insights of the IT and telecommunications industry. offers profound and up to date information on significant and trending topics. Online- and IT-Professionals, marketing managers, purchasers, and readers inspired by technology receive substantial information on product, market and branding potentials through tests, interviews und market analysis.

We'd like to thank our sponsors for their support! Without sponsors our conference would not be possible!

The Conference s SOLD OUT since a few weeks. We no longer accept registrations, nor paper submissions.

For further details about systemd.conf consult the conference website.

See the the first round of sponsor announcements!

See you in Berlin!

October 06, 2015

Nightly development builds using xdg-app

When reporting a bug in some software it is often common that the developer asks you to check if the bug is fixed in the latest development version. This makes a lot of sense from the perspective of a developer that gets a lot of bug reports. However, unless you are very experienced with building software this is prohibitively hard.

This is an area where xdg-app shines, because it allows you to create binary builds of desktop applications that work on any distribution. In order to demonstrate this I set up an automated build system that builds Gimp and Inkscape from the development branch every day and produces a new binary that you can easily install and run:

screenshot of app iconsLaunching the apps

To make it easy to use I also created packages of xdg-app for some common distributions.

For more information about how to use these builds, see the nightly builds page.

October 05, 2015

Pics Up

Чёт я под настроение выложил картинки с этой недели на форумы Авиабазы. Anglophones are welcome to pictures at least.

October 03, 2015

Software that liberates people: feels about FSF@30 and OSFeels@1

tl;dr: I want to liberate people; software is a (critical) tool to that end. There is a conference this weekend that understands that, but I worry it isn’t FSF’s.

Feelings are facts, by wrote, CC BY 2.0

This morning, social network chatter reminded me of FSF‘s 30th birthday celebration. These travel messages were from friends who I have a great deal of love and respect for, and represent a movement to which I essentially owe my adult life.

Despite that, I had lots of mixed feels about the event. I had a hard time capturing why, though.

While I was still processing these feelings, late tonight, Twitter reminded me of a new conference also going on this weekend, appropriately called Open Source and Feelings. (I badly wanted to submit a talk for it, but a prior commitment kept me from both it and FSF@30.)

I saw the OSFeels agenda for the first time tonight. It includes:

  • Design and empathy (learning to build open software that empowers all users, not just the technically sophisticated)
  • Inclusive development (multiple talks about this, including non-English, family, and people of color) (so that the whole planet can access, and participate in developing, open software)
  • Documentation (so that users understand open software)
  • Communications skills (so that people feel welcome and engaged to help develop open software)

This is an agenda focused on liberating human beings by developing software that serves their needs, and engaging them in the creation of that software. That is incredibly exciting. I’ve long thought (following Sen and Nussbaum’s capability approach) that it is not sufficient to free people; they must be empowered to actually enjoy the benefits of that freedom. This is a conference that seems to get that, and I can’t wait to go (and hopefully speak!) next year.

The Free Software Foundation event’s agenda:

  • licenses
  • crypto
  • boot firmware
  • federation

These are important topics. But there is clearly a difference in focus here — technology first, not people. No mention of community, or of design.

This difference in focus is where this morning’s conflicted feels came from. On the one hand, I support FSF, because they’ve done an incredible amount to make the world a better place. (OSFeels can take open development for granted precisely because FSF fought so many battles about source code.) But precisely because I support FSF, I’d challenge it, in the next 15 years, to become more clearly and forcefully dedicated to liberating people. In this world, FSF would talk about design, accessibility, and inclusion as much as licensing, and talk about community-building protocols as much as communication protocols. This is not impossible: LibrePlanet had at least some people-focused talks (e.g.), and inclusion and accessibility are a genuine concern of staff, even if they didn’t rise to today’s agenda. But it would still be a big change, because at the deepest level, it would require FSF to see source code as just one of many requirements for freedom, rather than “the point of free software“.

At the same time, OSFeels is clearly filled with people who see the world through a broad, thoughtful ethical lens. It is a sad sign, both for FSF and how it is perceived, that such a group uses the deliberately apolitical language of openness rather than the language of a (hopefully) aligned ethical movement — free software. I’ll look forward to the day (maybe FSF’s 45th (or 31st!) birthday) that both groups can speak and work together about their real shared concern: software that liberates people. I’d certainly have no conflicted feelings about signing up for a conference on that :)

September 23, 2015

Playing games with runtime extensions

One of the core ideas of xdg-app is that users should be running the same build of everything that the developers tested on. Not only does this mean that you can trust the testing that went into the app, but it also means that an app can run on multiple distributions, and on different version of the same distribution.

However, an application does not have bundle everything. Instead the app specifies a dependency on a runtime, which contains the base system libraries. I like to compare this to dynamic libraries, where xdg-app is “dynamically linked” to its runtime, whereas container systems (like docker) are “statically linked” (by shipping a complete runtime in each app).

This may seem weird and contrary to the first paragraph, but it turns out that this is pretty much a requirement. We want third parties to be able to produce a binary that will keep running “forever”, but the base system may need fixes or support for new hardware. We can’t expect every vendor to rebuild every application (for example some old game) each time something needs fixing on the lower levels. So, therefore we allow updates to the runtime separately from the app (although any update must be compatible).

An app can only depend on one runtime, and everything not in the runtime must be bundled with the application. There is (by design) no way to depend on multiple runtimes, or have runtimes depend on each other. However, there is something called runtime extensions. Extensions are a way to split off and make optional parts of the runtime and recombine them at runtime.

For instance, I’m working on a runtime called org.freedesktop.Platform that has the basic freedesktop libraries (X11, Mesa, DBus, etc). It has this snippet in the configuration:

[Extension org.freedesktop.Platform.Timezones]

This means that whenever another runtime called org.freedesktop.Platform.Timezones is installed its contents will replace the directory share/zoneinfo in the runtime. This is very useful as it allows the timezone info (which changes frequently) to be updated separately from the runtime.

It also has this:

[Extension org.freedesktop.Platform.Locale]

This means that if a runtime like is installed, it will replace the contents of share/runtime/locale/sv in the runtime. During the build all the locale data and translations are separated out into per-language runtimes which can be installed separately.

And finally, it has:

[Extension org.freedesktop.Platform.GL]

There is no (official) runtime with this name, but if one is installed it will appear in lib/GL, and the main runtime has been programmed to look into this directory for libGL.

The idea here is that if your system uses an OpenGL driver that does not ship with the regular runtime (i.e. Mesa), or needs a more recent version of it, then you can create your own runtime with this name and get your drivers into the runtime.

For a long time this has been a theoretical solution, but recently I acquired an NVidia card in order to test this. The result is this script, which takes an upstream nvidia driver release and converts it into a runtime that matches the (soon to be released) 1.2 version of the Freedesktop runtime.

To verify that it works I created an xdg-app bundle for the Unreal Editor. Here are some screenshots of a sandboxed version of unreal to show this working:

Launching the unreal editorLaunching the unreal editor
Editing the sample projectEditing the sample project

While runtimes can’t have dependencies on other runtimes, they can be build from the same base, and thus be compatible. For instance, the official Gnome runtime takes the Freedesktop runtime and adds the Gnome modules to it. Since we have the same ABI  we can reuse the same extensions. The two runtimes have different versions (Gnome is 3.18, Freedesktop is 1.2), so we have to specify the version (which is otherwise infered by the runtime version). Here is how the Gnome runtime config looks:

[Extension org.freedesktop.Platform.GL]

I hope to have stable builds out of the Freedesktop 1.2 and Gnome 3.18 runtimes shortly, so that other people can play with them. Unfortunately I’m not allowed to distribute the unreal editor app.

September 22, 2015

systemd.conf close to being sold out!

Only 14 tickets still available!

systemd.conf 2015 is close to being sold out, there are only 14 tickets left now. If you haven't bought your ticket yet, now is the time to do it, because otherwise it will be too late and all tickets will be gone!

Why attend? At this conference you'll get to meet everybody who is involved with the systemd project and learn what they are working on, and where the project will go next. You'll hear from major users and projects working with systemd. It's the primary forum where you can make yourself heard and get first hand access to everybody who's working on the future of the core Linux userspace!

To get an idea about the schedule, please consult our preliminary schedule.

In order to register for the conference, please visit the registration page.

We are still looking for sponsors. If you'd like to join the ranks of systemd.conf 2015 sponsors, please have a look at our Becoming a Sponsor page!

For further details about systemd.conf consult the conference website.

September 21, 2015

Media unit for geeks with kids?

Phoenix is growing up quickly and pretty soon he’ll be crawling around the house. So it’s time for babyproofing.

For the past year, I’ve been looking all over the internet for decent media units that we could get. IKEA used to have some good ones, but it doesn’t look like they have any decent model anymore.

So I turn to the geeky side of the internet, as I’m sure there’s lots of people out there who’ve gone through the same problem with an infant growing up.

So far, I’m thinking:

  • closed at the front, except for a big slot large enough to fit my central speaker (I admit I went large with a PolkAudio A4)
  • thick glass – the kind that lets IR through, but not babies when they smash into it
  • plenty of holes out the back for ventilation – in fact, mostly open
  • useful leads for cables if possible
  • 50-60 inch wide because the TV needs to go on top
  • high enough – at least 80 cm. So many units are low, why?
  • deep enough – so many media units do not even fit a standard AV receiver, let alone leave enough space for air to circulate so the unit doesn’t burn up
  • cubby holes/shelves high enough so said unit fits as well
  • not butt ugly or escaped from the eighties
  • can hold A/V receiver, standard Digital TV unit, router, a NAS, a PS3, and an Atari VCS 2600. Bonus points for space left over for a future Megadrive or NES.
  • easy to attach to a wall
  • built-in custom rack for Atari VCS 2600 cartridges (though I’d begrudgingly accept a unit that ticks all the other boxes)

Any requirements I’m missing? Anyone want to share which unit made them happy?

Update: if it matters, this is for a smallish appartment in Manhattan – preference for no DIY.

flattr this!

September 15, 2015

Preliminary systemd.conf 2015 Schedule

A Preliminary systemd.conf 2015 Schedule is Now Online!

We are happy to announce that an initial, preliminary version of the systemd.conf 2015 schedule is now online! (Please ignore that some rows in the schedule link the same session twice on that page. That's a bug in the web site CMS we are working on to fix.)

We got an overwhelming number of high-quality submissions during the CfP! Because there were so many good talks we really wanted to accept, we decided to do two full days of talks now, leaving one more day for the hackfest and BoFs. We also shortened many of the slots, to make room for more. All in all we now have a schedule packed with fantastic presentations!

The areas covered range from containers, to system provisioning, stateless systems, distributed init systems, the kdbus IPC, control groups, systemd on the desktop, systemd in embedded devices, configuration management and systemd, and systemd in downstream distributions.

We'd like to thank everybody who submited a presentation proposal!

Also, don't forget to register for the conference! Only a limited number of registrations are available due to space constraints! Register here!.

We are still looking for sponsors. If you'd like to join the ranks of systemd.conf 2015 sponsors, please have a look at our Becoming a Sponsor page!

For further details about systemd.conf consult the conference website.

September 11, 2015

TLS Security In Firefox 40

What do people at Mozilla think is going to happen when I need to access a website and Firefox says that TLS parameters are insecure and thus I cannot? I'm going to use Chrome, that's what. Or maybe even a hacked Midori, where I can adjust build-time parameters of gcr.

That company went way downhill when they kicked Eich out.

August 27, 2015

systemd.conf 2015 CfP REMINDER

LAST REMINDER! systemd.conf 2015 Call for Presentations ends August 31st!

Here's the last reminder that the systemd.conf 2015 CfP ends on August 31st 11:59:59pm Central European Time (that's monday next week)! Make sure to submit your proposals until then!

Please submit your proposals on our website!

And don't forget to register for the conference! Only a limited number of registrations are available due to space constraints! Register here!.

For further details about systemd.conf consult the conference website.

August 24, 2015

First Round of systemd.conf 2015 Sponsors

First Round of systemd.conf 2015 Sponsors

We are happy to announce the first round of systemd.conf 2015 sponsors!

Our first Gold sponsor is CoreOS!

CoreOS develops software for modern infrastructure that delivers a consistent operating environment for distributed applications. CoreOS's commercial offering, Tectonic, is an enterprise-ready platform that combines Kubernetes and the CoreOS stack to run Linux containers. In addition CoreOS is the creator and maintainer of open source projects such as CoreOS Linux, etcd, fleet, flannel and rkt. The strategies and architectures that influence CoreOS allow companies like Google, Facebook and Twitter to run their services at scale with high resilience. Learn more about CoreOS here, Tectonic here, or follow CoreOS on Twitter @coreoslinux.

A Silver sponsor is Codethink:

Codethink is a software services consultancy, focusing on engineering reliable systems for long-term deployment with open source technologies.

A Bronze sponsor is Pantheon:

Pantheon is a platform for professional website development, testing, and deployment. Supporting Drupal and WordPress, Pantheon runs over 100,000 websites for the world's top brands, universities, and media organizations on top of over a million containers.

A Bronze sponsor is Pengutronix:

Pengutronix provides consulting, training and development services for Embedded Linux to customers from the industry. The Kernel Team ports Linux to customer hardware and has more than 3100 patches in the official mainline kernel. In addition to lowlevel ports, the Pengutronix Application Team is responsible for board support packages based on PTXdist or Yocto and deals with system integration (this is where systemd plays an important role). The Graphics Team works on accelerated multimedia tasks, based on the Linux kernel, GStreamer, Qt and web technologies.

We'd like to thank our sponsors for their support! Without sponsors our conference would not be possible!

We'll shortly announce our second round of sponsors, please stay tuned!

If you'd like to join the ranks of systemd.conf 2015 sponsors, please have a look at our Becoming a Sponsor page!

Reminder! The systemd.conf 2015 Call for Presentations ends on monday, August 31st! Please make sure to submit your proposals on the CfP page until then!

Also, don't forget to register for the conference! Only a limited number of registrations are available due to space constraints! Register here!.

For further details about systemd.conf consult the conference website.

August 18, 2015

systemd.conf 2015 Call for Presentations

REMINDER! systemd.conf 2015 Call for Presentations ends August 31st!

We'd like to remind you that the systemd.conf 2015 Call for Presentations ends on August 31st! Please submit your presentation proposals before that data on our website.

We are specifically interested in submissions from projects and vendors building today's and tomorrow's products, services and devices with systemd. We'd like to learn about the problems you encounter and the benefits you see! Hence, if you work for a company using systemd, please submit a presentation!

We are also specifically interested in submissions from downstream distribution maintainers of systemd! If you develop or maintain systemd packages in a distribution, please submit a presentation reporting about the state, future and the problems of systemd packaging so that we can improve downstream collaboration!

And of course, all talks regarding systemd usage in containers, in the cloud, on servers, on the desktop, in mobile and in embedded are highly welcome! Talks about systemd networking and kdbus IPC are very welcome too!

Please submit your presentations until August 31st!

And don't forget to register for the conference! Only a limited number of registrations are available due to space constraints! Register here!.

Also, limited travel and entry fee sponsorship is available for community contributors. Please contact us for details!

For further details about the CfP consult the CfP page.

For further details about systemd.conf consult the conference website.

systemd.conf 2015 Call for Papers

REMINDER! systemd.conf 2015 Call for Papers ends August 31st!

We'd like to remind you that the systemd.conf 2015 Call for Presentations ends on August 31st! Please submit your presentation proposals before that data on our website.

We are specifically interested in submissions from projects and vendors building today's and tomorrow's products, services and devices with systemd. We'd like to learn about the problems you encounter and the benefits you see! Hence, if you work for a company using systemd, please submit a presentation!

We are also specifically interested in submissions from downstream distribution maintainers of systemd! If you develop or maintain systemd packages in a distribution, please submit a presentation reporting about the state, future and the problems of systemd packaging so that we can improve downstream collaboration!

And of course, all talks regarding systemd usage in containers, in the cloud, on servers, on the desktop, in mobile and in embedded are highly welcome! Talks about systemd networking and kdbus IPC are very welcome too!

Please submit your presentations until August 31st!

And don't forget to register for the conference! Only a limited number of registrations are available due to space constraints! Register here!.

Also, limited travel and entry fee sponsorship is available for community contributors. Please contact us for details!

For further details abou the CfP consult the CfP page.

For further details about systemd.conf consult the conference website.

First Round of systemd.conf 2015 Sponsors

First Round of systemd.conf 2015 Sponsors

We are happy to announce the first round of systemd.conf 2015 sponsors!

Our first Silver sponsor is CoreOS!

CoreOS develops software for modern infrastructure that delivers a consistent operating environment for distributed applications. CoreOS's commercial offering, Tectonic, is an enterprise-ready platform that combines Kubernetes and the CoreOS stack to run Linux containers. In addition CoreOS is the creator and maintainer of open source projects such as CoreOS Linux, etcd, fleet, flannel and rkt. The strategies and architectures that influence CoreOS allow companies like Google, Facebook and Twitter to run their services at scale with high resilience. Learn more about CoreOS here, Tectonic here, or follow CoreOS on Twitter @coreoslinux.

A Bronze sponsor is Codethink:

Codethink is a software services consultancy, focusing on engineering reliable systems for long-term deployment with open source technologies.

A Bronze sponsor is Pantheon:

Pantheon is a platform for professional website development, testing, and deployment. Supporting Drupal and WordPress, Pantheon runs over 100,000 websites for the world's top brands, universities, and media organizations on top of over a million containers.

A Bronze sponsor is Pengutronix:

Pengutronix provides consulting, training and development services for Embedded Linux to customers from the industry. The Kernel Team ports Linux to customer hardware and has more than 3100 patches in the official mainline kernel. In addition to lowlevel ports, the Pengutronix Application Team is responsible for board support packages based on PTXdist or Yocto and deals with system integration (this is where systemd plays an important role). The Graphics Team works on accelerated multimedia tasks, based on the Linux kernel, GStreamer, Qt and web technologies.

We'd like to thank our sponsors for their support! Without sponsors our conference would not be possible!

We'll shortly announce our second round of sponsors, please stay tuned!

If you'd like to join the ranks of systemd.conf 2015 sponsors, please have a look at our Becoming a Sponsor page!

Reminder! The systemd.conf 2015 Call for Presentations ends on monday, August 31st! Please make sure to submit your proposals on the CfP page until then!

Also, don't forget to register for the conference! Only a limited number of registrations are available due to space constraints! Register here!.

For further details about systemd.conf consult the conference website.

August 14, 2015

Tablet Uber Alles Or Is It

Given the trouble with modern laptops, I'm seriously thinking if I should make a jump to a gigantic tablet with a keyboard. You run "make" on VM. Not enough RAM? Order in the cloud! The idea was planted in my mind by that jerk Atwood, who penned an article claiming a death of PC. And a month ago I saw someone at Python meetup using Canopy. It kinda worked, actually. I expect Github Atom to be even better.

Unfortunately, there are problems in 3 broad categories still.

First, the hotspot Internet connectivity sucks. It is plain unreliable. VPN, ssh, and IRC are often blocked; it's necessary to remember "Connectivity Through Anything" lessons and tehcniques. When it works, it's often slow. These problems extend to venues such as Intel's Executive Briefing Center. If "executives" eating their awesome snacks cannot obtain a decent WiFi, what hope do I have? I do not have cellphone data, but I hear bitching about it.

Second, the usual questions about privacy and security apply. Non-proprietary tablets suck immensely, from what I heard.

Third, tablets top out at 10..11 inch. Sorry, but that is not enough to kill laptops while laptops continue to be made. Certainly, Atwood made an argument that as tablets absorb users, PC makers will stop. The day the last one quits, we'll have to use the least shitty tablet regardless of size. But today is not that day.

UPDATE: 3 weeks after this post, Apple unveiled a 12.9" (2732 x 2048) iPad Pro, with a keyboard as a factory option.

User-facing hardware

New business trip, new hardware pictures.

It was almost a year, and I'm still looking for a decent laptop, same criteria. I saw a couple of guys using Lenovo X1 Carbon, which looks good. Most importantly, the left Ctrl is now extends to its proper position. Almost a winner, but unfortunately, there are issues. Apparently, the screen on the X1 is not touching the main frame flat when it's closed, so a bundle of clothing pressing in the middle between the hinges is capable to making a nasty crack in plastic. Not acceptable for what is a $1,400 laptop even with Amazon's "discount" of $900. Way to go, Lenovo. Almost had me this time.

Meanwhile, a $500 Dell Vostro continues to soldier on. It's showing its age: building Ceph with "make -j${N}" requires more RAM that it has for any reasonable N, and dialog windows started to outgrow its screen (notably, some of GNOME preferences). I still need a laptop, but can't find a suitable one. The Lenovo X1 tops out at 8GB, which was another strike against it.

I was a little sad when Google stopped making Nexus 7. I have a 2013 version and it is quite good. In the same meeting, I bumped into a guy with a projected update to Nexus 7 that became orphaned when Google pulled the plug. ASUS continued to build them and market them as "MemoPad 7". However, taking the page from Microsoft playbook with their "Surface" and "Surface Pro", ASUS sell "MemoPad 7" versions ranging from worthless piece of junk with 1024x600 to actual Nexus 7 replacements with 1920x1200. Allegedly, the battery life and speed are much improved by using Intel's embedded Atom core. Some of the ARM-optimized apps may not work (example is some kind of music editing thing for podcasters).

August 13, 2015

The case of the mysterious disappearing I211

Day one of unemployed life saw me finally getting around to the first of several hardware related maintenance items that I’ve been putting off until I’ve had the time.

I got a lot of life out of my desktop machine that I had been using since 2007. Earlier this year, I decided it was long overdue an upgrade, and ended up building a ridiculously over-specced machine in the hopes it too would last me a while. After some research, I ended up with a 6-core Haswell-E i7-5820K, and a frankly ridiculously over-featured motherboard.
Once I had delved through the absurd number of BIOS options to convince it that I *really* didn’t want to overclock my CPU or my RAM, or anything else, it was very stable.

It has exceeded all my expectations. In the time it took my old desktop to build one kernel, I can build kernel .deb’s for every machine I own, and still have time spare. It’s an absolute beast.

One of the features that sold me on this board was the two onboard ethernet ports. I had been wanting to do a bunch of networking experiments, and the possibility of using bonding, without having to screw around with add-in cards was appealing.

So I was a little irked one evening after updating its BIOS, to notice that the bond only had one interface active. After some investigation, I noticed that the PCI ID of one of the onboard NICs had changed.

What was once

00:19.0 Ethernet controller: Intel Corporation Ethernet Connection (2) I218-V (rev 05)
08:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)

Was now

00:19.0 Ethernet controller: Intel Corporation Ethernet Connection (2) I218-V (rev 05)
08:00.0 Ethernet controller: Intel Corporation Device 1532 (rev 03)

My I211 had changed its PCI ID, and the e1000 driver wouldn’t bind to this new device.

At first I thought “Cool, some kind of NIC firmware update”, and assumed that e1000 hadn’t been updated yet to support this new feature. Googling for “i211 1532” told a much sadder story however.

If you read the spec update for the i211, you find this interesting table:

I211 Device ID Code Vendor ID Device ID Revision ID1
WGI211AT (not programmed/factory default) 0x8086 0x1532 0x3
WGI211AT (programmed) 0x8086 0x1539 0x3

Uh, not cool. Somehow the BIOS update procedure had wiped the NVRAM on the NIC.

A long protracted conversation with ASUS support followed, including such gems as “I understand you’re seeing blue screens” and “Have you tried removing the DIMMs, rubbing the contacts with an eraser and replacing them”. Eventually I think they got to the end of their script, and agreed to RMA the board. Somewhat annoying, given there’s probably a tool somewhere that can rewrite the flash, but Intel only seems to make that available to integrators, not end-users, and the ASUS representatives denied all knowledge.

It was gone for about two weeks, and finally returned yesterday. Its PCI ID is 0x1539 again, and it has its old MAC address once more. (I’m now hesitant to ever upgrade the BIOS on this machine again). So what happened ? Anyone’s guess, but this isn’t the first time I’ve seen this happen. We had a bunch of these NICs at Akamai too that occasionally had the same thing happen to them.

The whole thing is reminiscent of a painful old bug where ftrace would corrupt the e1000e ROM. Hopefully Linux isn’t to blame this time.

So, long story short: If you see an i211 with a PCI ID of 1532, you’re looking at an RMA.

The post The case of the mysterious disappearing I211 appeared first on

August 11, 2015

Moving on from Akamai.

Today was my last day at Akamai. It’s been brief (Just over seven months), but things weren’t really working out for me there for a number of reasons. I’ve mentioned to a number of people who have known about my decision for a while, that it’s not that it’s a bad place to work, but it never felt like a good fit for me, and I came to realize that I’ve spent most of this last year being in denial of just how unhappy I was, in the hope “things would get better”.

There are a lot of smart people working there, working on really difficult problems, but a lot of those problems just don’t align with my interests, especially when they don’t always involve contributing code back upstream. [clarification: There is some upstream work going on there, just not as much as I’d like].

Add to this my disdain for some of the proprietary tooling that’s prevalent there, and it was becoming clear it was not a matter of “if”, but “when” I was going to leave. As an example; I joked a few months ago to co-workers “next time I’m looking for a job, the first question I ask is ‘do you use perforce’?”. Only it wasn’t really a joke, I was dead serious. User-hostile software has no place in my life.
Even little things like “let’s use git” translating to “let’s license Atlassian stash” rather than “run a git-daemon somewhere” started getting me down.

The final project I worked on there was a continuous rebase strategy for the kernel, moving away from perforce to git. It’s a move in the right direction, but ultimately, not the sort of work that gets me excited, and it’s going to be a multi-year project before it starts really bearing fruit. Given how perforce is ingrained in so many of Akamai’s systems, it would also have been extremely unlikely I’d have been able to purge all knowledge of ever having used it.

The rebase work itself also started to bother me that many of the kernel changes we made had no chance of ever even being submitted, let alone accepted upstream. (In part because many of them are very unique to Akamai’s CDN — you won’t find any of the trickery employed there described in a Richard Stevens book, and they’re unlikely to ever be official RFC’s due to the competitive edge they gain from those changes).
There are exceptions to all of this, and the kernel team is trying to do a better job there with upstreaming most of the newer changes, but many of the older legacy patches are under-documented, and/or understood well by few people, with the original authors no longer around, making it a frustrating exercise to get up to speed; especially when you’re trying to learn what the upstream code is doing at the same time.

Someone with less experience dealing exclusively with open-source for most of their career would probably find many of my reasons for leaving trivial. Those same people would probably find Akamai a great place to work. There are a lot of opportunities there if you have a higher tolerance for such things than I did. It was eye-opening recently, mentoring some of the interns there. Optimism. The unjaded outlook that comes with youth. Not getting bent out of shape at crappy tooling because they don’t know different. It made me realize I wasn’t going to ever be like this here.

On a particularly bad day a few weeks back, a recruiter reached out to me, to find out if I was interested in a second chance at an offer I received last time I was looking for a new job. It worked. Enduring an unhappy situation in the hopes things will get better isn’t a great strategy when there are other options.

So, I start at Facebook in September.

I have no delusions that things are going to be perfect there, but at least from the outside right now, the grass looks greener. I feel bad walking away from problems unfinished, but going home miserable or angry or some other negative emotion every day was really starting to get take its toll. It’s not a healthy way to live.

When I was interviewing last December, I read Being Geek to death, so it’s fitting that I’ve picked it up again recently. One paragraph in particular jumps out at me.

My single worst gig was one where I got everything I wanted out the of the offer letter, but in my exuberance for being highly valued, I totally forgot that my gut read on the gig was "meh". Ninety days later, I couldn't care less that I got a 15% raise and a sign-on bonus. I couldn't stand the mundanity of the daily work, and I happily resigned a few months later, taking both a pay cut and returning my sign-on bonus for the opportunity to work at Netscape.

Anachronisms and minor details aside, that paragraph played through my head this afternoon as I wrote the check to pay back the remainder of my sign-on bonus. I wasn’t quite thinking “meh”, but I knew I was making compromises on what I really valued from day one.

Walking away from unvested RSUs, giving up this months paycheck, and writing that check stings a little, but when I did my exit interview this morning, I knew that I too, was “happily resigning” for a great opportunity.

I’m feeling uncharacteristically optimistic right now. Hopefully it’ll last.

I’ll be in Seattle next week, but due to complications with my registration being transferred to another Akamai employee, I won’t actually be at the Linux plumbers conf. If you’re also going to be there and want to catch up, drop me a mail, or <ahem> hit me up on facebook.

The post Moving on from Akamai. appeared first on

git submodule

It's a familiar sign to anyone dealing with a project that includes submodules: you run "make" and see something like this:

rgw/ In member function ‘virtual int RGWMongooseFrontend::run()’:
rgw/ error: ‘struct mg_callbacks’ has no member named ‘log_access’
cb.log_access = rgw_civetweb_log_access_callback;

Ah, yes. Submodule civetweb is obviously out of date. Type "git submodule init; git submodule update" and... nothing happens. The goddamn submodules are stuck.

At this point, running "git diff origin" produces an output like:

--- a/ceph-object-corpus
+++ b/ceph-object-corpus
@@ -1 +1 @@
-Subproject commit 20351c6bae6dd4802936a5a9fd76e41b8ce2bad0
+Subproject commit bb3cee6b85b93210af5fb2c65a33f3000e341a11

So yeah, obviously you fetched the right thing from the origin, but you cannot merge or rebase no matter what. You may spend a good part of a hackathon reading man pages for git subcommands, all for naught.

Fortunately, the stuck submodules can be worked around, by looking at the "git diff origin" above, then doing this:

git update-index --replace --cacheinfo 160000,20351c6bae6dd4802936a5a9fd76e41b8ce2bad0,ceph-object-corpus

You get the idea: force the right commit from the origin into the local index. This allows "git submodule update" to clone and checkout the right thing and you're off to the races. The fixups in the index will stick out in "git status", so create an empty commit to get rid of them (but only after "git submodule update").

When you're done, you might want to kick in the nuts whoever chose to use submodules in your project.

P.S. "git --version" yields "git version 2.4.3".

P.P.S. You verify what you have in the index by running "git ls-files -s ceph-object-corpus" (or src/civetweb). The mode must be 160000 and the hash should match the upstream. Note that "git diff origin" continues to display a disparity until you've run the "git submodules update".

the future is here


10005 zaitcev   20   0  809920 755384  13220 R  99.7 12.5   0:20.47 cc1plus
 9894 zaitcev   20   0 1946748 1.806g  15800 R  99.3 31.4   1:46.60 cc1plus
 9956 zaitcev   20   0 1652076 1.524g  15832 R  99.0 26.5   1:30.64 cc1plus
   72 root      20   0       0      0      0 S   4.0  0.0   0:04.60 kswapd0
 9957 zaitcev   20   0   56648  43536   1436 S   2.7  0.7   0:00.49 as
 9895 zaitcev   20   0   79480  66368   1480 S   2.0  1.1   0:00.89 as
 2870 zaitcev   20   0 1989524 533104 160868 S   1.3  8.9  60:28.10 firefox
 2035 zaitcev   20   0 2018216 166872  20028 S   0.7  2.8  16:50.66 gnome-sh

That's right, boys and girls, a compiler with a bigger resident size than Firefox. Three times bigger.

August 02, 2015

DNF - Debugging Not Finished

It's 100% like CKS said:

[root@kvm-rei zaitcev]# dnf check-update openstack-swift
Last metadata expiration check performed 0:10:02 ago on Sun Aug  2 18:42:13 2015.

openstack-swift.noarch                   2.3.0-2.fc23                    rawhide
[root@kvm-rei zaitcev]# dnf update openstack-swift
Last metadata expiration check performed 0:10:07 ago on Sun Aug  2 18:42:13 2015.
Dependencies resolved.
Nothing to do.
[root@kvm-rei zaitcev]# rpm -q openstack-swift
[root@kvm-rei zaitcev]# 

Searching for a good way to make it unstuck.

July 29, 2015

Conference submission and voting

Generally I feel that I do not do any work that's important enough to present at conferences. My previous presentation was at OLS back in 2005, concerning usbmon. The usbmon is something a guy learning C would program: it's a circular buffer into which kernel drops tracing events; Wireshark pulls them out. Hardly a conference material, but at the time I thought it was supremely important to proseltize the basic techniques of always-on tracing, because it would improve the quality and the ease of debugging of the kernel overall. I really wanted FireWire guys to adopt a similar tracing scheme, because it was a hell on a stick debugging juju with just printk(). Needless to say, that was a miserable failure, as was FireWire itself. I don't think anyone who came to listen to my presentation in Ottawa received their money's worth.

Or did they? Recently an epiphany occured to me. I really should not even think if anyone is interested. That is conference organizers' job, not mine! As a result, I sent a proposal to OpenStack Tokyo, entitled "The Plot to Destroy OpenStack Swift Using C++: Enhancements of Swift API Compatibility in Ceph RADOS Gateway". It's basically a compendum of practical issues that occur when running Swift apps on top of Ceph RGW and what we do to help people do that.

The things are a little different from 10 years ago, because attendees can vote on the submissions. This sounds democratic. I went through all submissions on the storage track and voted them according to my preference. It took a very long time and I suspect that I was crowdsourced by the organizers in the best traditions of Web 2.0. I wonder if they'll even read the abstracts. :-)

July 28, 2015

Announcing systemd.conf 2015

Announcing systemd.conf 2015

We are happy to announce the inaugural systemd.conf 2015 conference of the systemd project.

The conference takes place November 5th-7th, 2015 in Berlin, Germany.

Only a limited number of tickets are available, hence make sure to sign up quickly.

For further details consult the conference website.

July 23, 2015

Wikimania 2015 – random thoughts and observations

Random thoughts from Wikimania, 2015 edition (2013, 2014):

"Wikimania 2015 Reception at Laboratorio Arte Alameda - 02" by Jarek Tuszynski,  under CC BY 4.0
Wikimania 2015 Reception at Laboratorio Arte Alameda – 02” by Jarek Tuszynski, under CC BY 4.0
  • Dancing: After five Wikimedia events (not counting WMF all-hands) I was finally dragged onto the dance floor on the last night. I’ll never be Garfield, but I had fun anyway. The amazing setting did not hurt.
  • Our hosts: The conference was excellently organized and run. I’ve never had Mexico City high on my list of “places I must see” but it moved up many spots after this trip.
  • First timers: I always enjoy talking to people who have never been to Wikimania before. They almost always seem to have enjoyed it, but of course the ones I talk to are typically the ones who are more outgoing and better equipped to enjoy things. I do hope we’re also being welcome to people who don’t already know folks, or who aren’t as outgoing.
  • Luis von Ahn: Good to chat briefly with my long-ago classmate. I thought the Q&A section of his talk was one of the best I’ve seen in a long time. There were both good questions and interesting answers, which is more rare than it should be.
  • Keynotes: I’d love to have one keynote slot each year for a contributor to talk about their work within the movement. Finding the right person would be a challenge, of course, as could language barriers, but it seems like it should be doable.
  • US English: I was corrected on my Americanisms and the occasional complexity of my sentence structure. It was a good reminder that even for fairly sophisticated speakers of English as a second language, California-English is not terribly clear. This is especially true when spoken. Verbose slides can help, which is a shame, since I usually prefer minimal slides. I will try to work on that in the future, and see how we can help other WMFers do the same.
  • Mobile: Really hope someday we can figure out how to make the schedule legible on a mobile device :) Good reminder we’ve got a long way to go there.
  • Community engagement: I enjoyed my departments “engage with” session, but I think next year we need to make it more interactive—probably with something like an introduction/overview followed by a World Cafe-style discussion. One thing we did right was to take questions on written cards. This helped indicate what the most important topics were (when questions were repeated), avoided the problem of lecture-by-question, and opened the floor to people who might otherwise be intimidated because of language barriers or personality. Our booth was also excellent and I’m excited to see some of the stories that came out of it.
  • Technology and culture: After talking about how we’d used cards to change the atmosphere of a talk, someone deliberately provoked me: shouldn’t we address on-wiki cultural issues the same way, by changing the “technology” used for discussion? I agree that technology can help improve things, and we should think about it more than we do (e.g.) but ultimately it can only be part of the solution – our most difficult problems will definitely require work on culture as well as interfaces. (Surprisingly, my 2009 post on this topic holds up pretty well.)
  • Who is this for? I’ve always felt there was some tension around whether the conference is for “us” or for the public, but never had language for it. An older gentleman who I spoke with for a while finally gave me the right term: is it an annual meeting or is it a public conference? Nothing I saw here changed my position, which is that it is more annual meeting than public conference, at least until we get much better at turning new users into long-term users.
  • Esino Lario looks like it will be a lot of fun. I strongly support the organizing committee’s decision to focus less on brief talks and more on longer, more interactive conversations. That is clearly the best use of our limited time together. I’m also excited that they’re looking into blind submissions (which I suggested in my Wikimania post from last year).
  • Being an exec: I saw exactly one regular talk that was not by my department, though I did have lots and lots of conversations. I’m still not sure how I feel about this tradeoff, but I know it will become even harder if we truly do transition to a model with more workshops/conversations and fewer lectures, since those will be both more valuable and more time-consuming/less flexible.
  • Some day: I wrote most of this post in the Mexico City airport, and saw that there are flights from there to La Habana. I hope someday we can do a Wikimania there.

July 20, 2015

Fedora 22 killed IPv6 and I'm fine

I upgraded Fedora on my home router to F22 and immediately IPv6 disappeared on the internal network. The problem is that radvd started throwing its usual "no linklocal address configured on ethmain.5" (although the message is only visible with "IgnoreIfMissing off;"), which leads to "interface ethmain.5 does not exist or is not set up properly". With the default IgnoreIfMissing, radvd continues running but refuses to work, quietly. Needless to say, the interface has a perfectly valid link-local address, same as it had in F21 before the upgrade.

There used to be a time when I took a problem like this as an affront to the idea of IPv6 superiority and the reputation of Fedora as a platform for roll-your-own home router. Now though, I don't give a rat's tail for IPv6. Let Comcast and Google care and pay someone to care. Okay, I lied. I cared enough for file a bug 1244428, but I'm not rushing to build from SRPMs, reinstall old versions, and such.

(Frankly, if we just engage Lennart's attention for an hour, he'll incoporate a perfectly serviceable radvd function into systemd-networkd. Of course, one would need journalctl to see any messages from it, but since it is certain to work, nobody would actually attempt that. The bug report would go unanswered like radvd bugs today, but again, there would not be any bugs.)

UPDATE: The root cause turned out to be an incorrect link-local address after all. I presumed that RFC meant whole fe80::/10 prefix to be used, so each interface had a different address within a node. Ergo, fe80:0:0:1::1/64 address. As it turns out, I may have confused Link-Local address with Site-Local address. The RFC 4291 specifies a fe80::/64 prefix and in F22 the radvd started enforcing it. Note that apparently the lower part has to be unique across the link.

July 16, 2015

NetworkManager 1.0.4 Released!

Just a quick note that we’ve released the latest stable NetworkManager, version 1.0.4.  This is mainly a bugfix release, though it does have a couple new features and larger changes.  Some of those are:

  • Some configuration options can now be changed without restarting the NM daemon.  Those include the ‘dns’, ‘connectivity’, and ‘ignore-carrier’ settings.
  • Devices that have only an IPv6 link-local address are no longer assumed to be connected; by default whenever an interface is set “up” the kernel will assign an IPv6 link-local address which could theoretically be used for communication.  NetworkManager used to interpret this as the interface being available for network communication, while this is rarely what users want or expect.
  • Correct routing is now maintained when two interfaces of the same priority are connected to the same network.
  • udev rules can now be used to tell NetworkManager to manage or unmanage specific devices.
  • Connections with children (bridge, bond, team, etc) can now optionally bring up their slave interfaces when the master is started.
  • Many, many bugs and crashers have also been fixed in the core daemon, the libraries, and nmcli.

Grab the 1.0.4 release here!

We’re well into the development cycle of NetworkManager 1.2 as well, where among other things, we’ll finally be moving to GDBus instead of dbus-glib.  We’ll also have support for structured logging with journald, indicating that certain connections are metered, advertising LLDP-discovered information, built-in IPv4 link-local addressing to get rid of avahi-autoipd, improvements to Wi-Fi Access Point scanning, less verbose logging by default, improvements to DNS handling, and more.  Look for it later this year!


July 12, 2015

Future development of Trinity.

It’s been an odd few weeks regarding Trinity based things.

First an email from a higher-up at my former employer asking (paraphrased)..

"That thing we asked you to stop working on when you worked here, any chance now you've left you'll implement these features."

I’m still trying to get my head around the thought process that led to that being a reasonable thing to ask. I’ve made the occasional commit over the last six months, but it’s mostly been code motion, clean-up work, and things like syscall table updates. New feature development came to a halt long ago.

It’s no coincidence that the number of bugs reported found with Trinity have dropped off sharply since the beginning of the year, and I don’t think it’s because the Linux kernel suddenly got lots better. Rather, it’s due to the lack of real ongoing development to “try something else” when some approaches dry up. Sadly we now live in a world where it’s easier to get paid to run someone else’s fuzzer these days than it is to develop one.

Then earlier this week, came the revelation that the only people prepared to fund that kind of new feature development are pretty much the worst people.

Apparently Hacking Team modified Trinity to fuzz ioctl() on Android, which yielded some results. I’ve done no analysis on whether those crashes are are exploitable/fixed/only relevant to Android etc. (Frankly, I’m past caring). I’m not convinced their approach is particularly sound even if it was finding results Trinity wasn’t, so it looks unlikely there are even ideas to borrow here. (We all already knew that ioctl was ripe with bugs, and had practically zero coverage testing).

It bothers me that my work was used as a foundation for their hack-job. Then again, maybe if I hadn’t released Trinity, they’d have based on iknowthis, or some other less useful fuzzer. None of this really should surprise me. I’ve known for some time that there are some “security” people that have their own modifications they have no intention of sending my way. Thanks to the way that people that release 0-days are revered in this circus, there’s no incentive for people to share their modifications if it means that someone else might beat them to finding their precious bugs.

It’s unfortunate that this project has attracted so many awful people. When I began it, the motivation had nothing to do with security. Back in 2010 we were inundated in weird oopses that we couldn’t reproduce, many times triggered by jvm’s. I came up with the idea that maybe a fuzzer could create a realistic enough workload to tickle some of those same bugs. Turned out I was right, and so began a series of huge page and other VM related bug fixes.

In the five years that I’ve made Trinity available, I’ve received notable contributions from perhaps a half dozen people. In return I’ve made my changes available before I’d even given them runtime myself.

It’s a project everyone wants to take from, but no-one wants to give back to.

And that’s why for the foreseeable future, I’m unlikely to make public any further feature work I do on it.
I’m done enabling assholes.

The post Future development of Trinity. appeared first on

July 02, 2015

Time for a new GPG key

My GPG key has lasted me well, over 18 years, but it's a v2 key and therefore no longer supported by newer versions of GnuPG. So it's time to move to a new one. I've made a transition statement available. If you signed my old key please consider signing the new one.

June 24, 2015

xdg-app moving to

For anyone following the development of xdg-app, all development have now moved to Here is where things are happening now:

June 18, 2015

The new sd-bus API of systemd

With the new v221 release of systemd we are declaring the sd-bus API shipped with systemd stable. sd-bus is our minimal D-Bus IPC C library, supporting as back-ends both classic socket-based D-Bus and kdbus. The library has been been part of systemd for a while, but has only been used internally, since we wanted to have the liberty to still make API changes without affecting external consumers of the library. However, now we are confident to commit to a stable API for it, starting with v221.

In this blog story I hope to provide you with a quick overview on sd-bus, a short reiteration on D-Bus and its concepts, as well as a few simple examples how to write D-Bus clients and services with it.

What is D-Bus again?

Let's start with a quick reminder what D-Bus actually is: it's a powerful, generic IPC system for Linux and other operating systems. It knows concepts like buses, objects, interfaces, methods, signals, properties. It provides you with fine-grained access control, a rich type system, discoverability, introspection, monitoring, reliable multicasting, service activation, file descriptor passing, and more. There are bindings for numerous programming languages that are used on Linux.

D-Bus has been a core component of Linux systems since more than 10 years. It is certainly the most widely established high-level local IPC system on Linux. Since systemd's inception it has been the IPC system it exposes its interfaces on. And even before systemd, it was the IPC system Upstart used to expose its interfaces. It is used by GNOME, by KDE and by a variety of system components.

D-Bus refers to both a specification, and a reference implementation. The reference implementation provides both a bus server component, as well as a client library. While there are multiple other, popular reimplementations of the client library – for both C and other programming languages –, the only commonly used server side is the one from the reference implementation. (However, the kdbus project is working on providing an alternative to this server implementation as a kernel component.)

D-Bus is mostly used as local IPC, on top of AF_UNIX sockets. However, the protocol may be used on top of TCP/IP as well. It does not natively support encryption, hence using D-Bus directly on TCP is usually not a good idea. It is possible to combine D-Bus with a transport like ssh in order to secure it. systemd uses this to make many of its APIs accessible remotely.

A frequently asked question about D-Bus is why it exists at all, given that AF_UNIX sockets and FIFOs already exist on UNIX and have been used for a long time successfully. To answer this question let's make a comparison with popular web technology of today: what AF_UNIX/FIFOs are to D-Bus, TCP is to HTTP/REST. While AF_UNIX sockets/FIFOs only shovel raw bytes between processes, D-Bus defines actual message encoding and adds concepts like method call transactions, an object system, security mechanisms, multicasting and more.

From our 10year+ experience with D-Bus we know today that while there are some areas where we can improve things (and we are working on that, both with kdbus and sd-bus), it generally appears to be a very well designed system, that stood the test of time, aged well and is widely established. Today, if we'd sit down and design a completely new IPC system incorporating all the experience and knowledge we gained with D-Bus, I am sure the result would be very close to what D-Bus already is.

Or in short: D-Bus is great. If you hack on a Linux project and need a local IPC, it should be your first choice. Not only because D-Bus is well designed, but also because there aren't many alternatives that can cover similar functionality.

Where does sd-bus fit in?

Let's discuss why sd-bus exists, how it compares with the other existing C D-Bus libraries and why it might be a library to consider for your project.

For C, there are two established, popular D-Bus libraries: libdbus, as it is shipped in the reference implementation of D-Bus, as well as GDBus, a component of GLib, the low-level tool library of GNOME.

Of the two libdbus is the much older one, as it was written at the time the specification was put together. The library was written with a focus on being portable and to be useful as back-end for higher-level language bindings. Both of these goals required the API to be very generic, resulting in a relatively baroque, hard-to-use API that lacks the bits that make it easy and fun to use from C. It provides the building blocks, but few tools to actually make it straightforward to build a house from them. On the other hand, the library is suitable for most use-cases (for example, it is OOM-safe making it suitable for writing lowest level system software), and is portable to operating systems like Windows or more exotic UNIXes.

GDBus is a much newer implementation. It has been written after considerable experience with using a GLib/GObject wrapper around libdbus. GDBus is implemented from scratch, shares no code with libdbus. Its design differs substantially from libdbus, it contains code generators to make it specifically easy to expose GObject objects on the bus, or talking to D-Bus objects as GObject objects. It translates D-Bus data types to GVariant, which is GLib's powerful data serialization format. If you are used to GLib-style programming then you'll feel right at home, hacking D-Bus services and clients with it is a lot simpler than using libdbus.

With sd-bus we now provide a third implementation, sharing no code with either libdbus or GDBus. For us, the focus was on providing kind of a middle ground between libdbus and GDBus: a low-level C library that actually is fun to work with, that has enough syntactic sugar to make it easy to write clients and services with, but on the other hand is more low-level than GDBus/GLib/GObject/GVariant. To be able to use it in systemd's various system-level components it needed to be OOM-safe and minimal. Another major point we wanted to focus on was supporting a kdbus back-end right from the beginning, in addition to the socket transport of the original D-Bus specification ("dbus1"). In fact, we wanted to design the library closer to kdbus' semantics than to dbus1's, wherever they are different, but still cover both transports nicely. In contrast to libdbus or GDBus portability is not a priority for sd-bus, instead we try to make the best of the Linux platform and expose specific Linux concepts wherever that is beneficial. Finally, performance was also an issue (though a secondary one): neither libdbus nor GDBus will win any speed records. We wanted to improve on performance (throughput and latency) -- but simplicity and correctness are more important to us. We believe the result of our work delivers our goals quite nicely: the library is fun to use, supports kdbus and sockets as back-end, is relatively minimal, and the performance is substantially better than both libdbus and GDBus.

To decide which of the three APIs to use for you C project, here are short guidelines:

  • If you hack on a GLib/GObject project, GDBus is definitely your first choice.

  • If portability to non-Linux kernels -- including Windows, Mac OS and other UNIXes -- is important to you, use either GDBus (which more or less means buying into GLib/GObject) or libdbus (which requires a lot of manual work).

  • Otherwise, sd-bus would be my recommended choice.

(I am not covering C++ specifically here, this is all about plain C only. But do note: if you use Qt, then QtDBus is the D-Bus API of choice, being a wrapper around libdbus.)

Introduction to D-Bus Concepts

To the uninitiated D-Bus usually appears to be a relatively opaque technology. It uses lots of concepts that appear unnecessarily complex and redundant on first sight. But actually, they make a lot of sense. Let's have a look:

  • A bus is where you look for IPC services. There are usually two kinds of buses: a system bus, of which there's exactly one per system, and which is where you'd look for system services; and a user bus, of which there's one per user, and which is where you'd look for user services, like the address book service or the mail program. (Originally, the user bus was actually a session bus -- so that you get multiple of them if you log in many times as the same user --, and on most setups it still is, but we are working on moving things to a true user bus, of which there is only one per user on a system, regardless how many times that user happens to log in.)

  • A service is a program that offers some IPC API on a bus. A service is identified by a name in reverse domain name notation. Thus, the org.freedesktop.NetworkManager service on the system bus is where NetworkManager's APIs are available and org.freedesktop.login1 on the system bus is where systemd-logind's APIs are exposed.

  • A client is a program that makes use of some IPC API on a bus. It talks to a service, monitors it and generally doesn't provide any services on its own. That said, lines are blurry and many services are also clients to other services. Frequently the term peer is used as a generalization to refer to either a service or a client.

  • An object path is an identifier for an object on a specific service. In a way this is comparable to a C pointer, since that's how you generally reference a C object, if you hack object-oriented programs in C. However, C pointers are just memory addresses, and passing memory addresses around to other processes would make little sense, since they of course refer to the address space of the service, the client couldn't make sense of it. Thus, the D-Bus designers came up with the object path concept, which is just a string that looks like a file system path. Example: /org/freedesktop/login1 is the object path of the 'manager' object of the org.freedesktop.login1 service (which, as we remember from above, is still the service systemd-logind exposes). Because object paths are structured like file system paths they can be neatly arranged in a tree, so that you end up with a venerable tree of objects. For example, you'll find all user sessions systemd-logind manages below the /org/freedesktop/login1/session sub-tree, for example called /org/freedesktop/login1/session/_7, /org/freedesktop/login1/session/_55 and so on. How services precisely label their objects and arrange them in a tree is completely up to the developers of the services.

  • Each object that is identified by an object path has one or more interfaces. An interface is a collection of signals, methods, and properties (collectively called members), that belong together. The concept of a D-Bus interface is actually pretty much identical to what you know from programming languages such as Java, which also know an interface concept. Which interfaces an object implements are up the developers of the service. Interface names are in reverse domain name notation, much like service names. (Yes, that's admittedly confusing, in particular since it's pretty common for simpler services to reuse the service name string also as an interface name.) A couple of interfaces are standardized though and you'll find them available on many of the objects offered by the various services. Specifically, those are org.freedesktop.DBus.Introspectable, org.freedesktop.DBus.Peer and org.freedesktop.DBus.Properties.

  • An interface can contain methods. The word "method" is more or less just a fancy word for "function", and is a term used pretty much the same way in object-oriented languages such as Java. The most common interaction between D-Bus peers is that one peer invokes one of these methods on another peer and gets a reply. A D-Bus method takes a couple of parameters, and returns others. The parameters are transmitted in a type-safe way, and the type information is included in the introspection data you can query from each object. Usually, method names (and the other member types) follow a CamelCase syntax. For example, systemd-logind exposes an ActivateSession method on the org.freedesktop.login1.Manager interface that is available on the /org/freedesktop/login1 object of the org.freedesktop.login1 service.

  • A signature describes a set of parameters a function (or signal, property, see below) takes or returns. It's a series of characters that each encode one parameter by its type. The set of types available is pretty powerful. For example, there are simpler types like s for string, or u for 32bit integer, but also complex types such as as for an array of strings or a(sb) for an array of structures consisting of one string and one boolean each. See the D-Bus specification for the full explanation of the type system. The ActivateSession method mentioned above takes a single string as parameter (the parameter signature is hence s), and returns nothing (the return signature is hence the empty string). Of course, the signature can get a lot more complex, see below for more examples.

  • A signal is another member type that the D-Bus object system knows. Much like a method it has a signature. However, they serve different purposes. While in a method call a single client issues a request on a single service, and that service sends back a response to the client, signals are for general notification of peers. Services send them out when they want to tell one or more peers on the bus that something happened or changed. In contrast to method calls and their replies they are hence usually broadcast over a bus. While method calls/replies are used for duplex one-to-one communication, signals are usually used for simplex one-to-many communication (note however that that's not a requirement, they can also be used one-to-one). Example: systemd-logind broadcasts a SessionNew signal from its manager object each time a user logs in, and a SessionRemoved signal every time a user logs out.

  • A property is the third member type that the D-Bus object system knows. It's similar to the property concept known by languages like C#. Properties also have a signature, and are more or less just variables that an object exposes, that can be read or altered by clients. Example: systemd-logind exposes a property Docked of the signature b (a boolean). It reflects whether systemd-logind thinks the system is currently in a docking station of some form (only applies to laptops …).

So much for the various concepts D-Bus knows. Of course, all these new concepts might be overwhelming. Let's look at them from a different perspective. I assume many of the readers have an understanding of today's web technology, specifically HTTP and REST. Let's try to compare the concept of a HTTP request with the concept of a D-Bus method call:

  • A HTTP request you issue on a specific network. It could be the Internet, or it could be your local LAN, or a company VPN. Depending on which network you issue the request on, you'll be able to talk to a different set of servers. This is not unlike the "bus" concept of D-Bus.

  • On the network you then pick a specific HTTP server to talk to. That's roughly comparable to picking a service on a specific bus.

  • On the HTTP server you then ask for a specific URL. The "path" part of the URL (by which I mean everything after the host name of the server, up to the last "/") is pretty similar to a D-Bus object path.

  • The "file" part of the URL (by which I mean everything after the last slash, following the path, as described above), then defines the actual call to make. In D-Bus this could be mapped to an interface and method name.

  • Finally, the parameters of a HTTP call follow the path after the "?", they map to the signature of the D-Bus call.

Of course, comparing an HTTP request to a D-Bus method call is a bit comparing apples and oranges. However, I think it's still useful to get a bit of a feeling of what maps to what.

From the shell

So much about the concepts and the gray theory behind them. Let's make this exciting, let's actually see how this feels on a real system.

Since a while systemd has included a tool busctl that is useful to explore and interact with the D-Bus object system. When invoked without parameters, it will show you a list of all peers connected to the system bus. (Use --user to see the peers of your user bus instead):

$ busctl
NAME                                       PID PROCESS         USER             CONNECTION    UNIT                      SESSION    DESCRIPTION
:1.1                                         1 systemd         root             :1.1          -                         -          -
:1.11                                      705 NetworkManager  root             :1.11         NetworkManager.service    -          -
:1.14                                      744 gdm             root             :1.14         gdm.service               -          -
:1.4                                       708 systemd-logind  root             :1.4          systemd-logind.service    -          -
:1.7200                                  17563 busctl          lennart          :1.7200       session-1.scope           1          -
org.freedesktop.NetworkManager             705 NetworkManager  root             :1.11         NetworkManager.service    -          -
org.freedesktop.login1                     708 systemd-logind  root             :1.4          systemd-logind.service    -          -
org.freedesktop.systemd1                     1 systemd         root             :1.1          -                         -          -
org.gnome.DisplayManager                   744 gdm             root             :1.14         gdm.service               -          -

(I have shortened the output a bit, to make keep things brief).

The list begins with a list of all peers currently connected to the bus. They are identified by peer names like ":1.11". These are called unique names in D-Bus nomenclature. Basically, every peer has a unique name, and they are assigned automatically when a peer connects to the bus. They are much like an IP address if you so will. You'll notice that a couple of peers are already connected, including our little busctl tool itself as well as a number of system services. The list then shows all actual services on the bus, identified by their service names (as discussed above; to discern them from the unique names these are also called well-known names). In many ways well-known names are similar to DNS host names, i.e. they are a friendlier way to reference a peer, but on the lower level they just map to an IP address, or in this comparison the unique name. Much like you can connect to a host on the Internet by either its host name or its IP address, you can also connect to a bus peer either by its unique or its well-known name. (Note that each peer can have as many well-known names as it likes, much like an IP address can have multiple host names referring to it).

OK, that's already kinda cool. Try it for yourself, on your local machine (all you need is a recent, systemd-based distribution).

Let's now go the next step. Let's see which objects the org.freedesktop.login1 service actually offers:

$ busctl tree org.freedesktop.login1
  │ ├─/org/freedesktop/login1/seat/seat0
  │ └─/org/freedesktop/login1/seat/self
  │ ├─/org/freedesktop/login1/session/_31
  │ └─/org/freedesktop/login1/session/self

Pretty, isn't it? What's actually even nicer, and which the output does not show is that there's full command line completion available: as you press TAB the shell will auto-complete the service names for you. It's a real pleasure to explore your D-Bus objects that way!

The output shows some objects that you might recognize from the explanations above. Now, let's go further. Let's see what interfaces, methods, signals and properties one of these objects actually exposes:

$ busctl introspect org.freedesktop.login1 /org/freedesktop/login1/session/_31
NAME                                TYPE      SIGNATURE RESULT/VALUE                             FLAGS
org.freedesktop.DBus.Introspectable interface -         -                                        -
.Introspect                         method    -         s                                        -
org.freedesktop.DBus.Peer           interface -         -                                        -
.GetMachineId                       method    -         s                                        -
.Ping                               method    -         -                                        -
org.freedesktop.DBus.Properties     interface -         -                                        -
.Get                                method    ss        v                                        -
.GetAll                             method    s         a{sv}                                    -
.Set                                method    ssv       -                                        -
.PropertiesChanged                  signal    sa{sv}as  -                                        -
org.freedesktop.login1.Session      interface -         -                                        -
.Activate                           method    -         -                                        -
.Kill                               method    si        -                                        -
.Lock                               method    -         -                                        -
.PauseDeviceComplete                method    uu        -                                        -
.ReleaseControl                     method    -         -                                        -
.ReleaseDevice                      method    uu        -                                        -
.SetIdleHint                        method    b         -                                        -
.TakeControl                        method    b         -                                        -
.TakeDevice                         method    uu        hb                                       -
.Terminate                          method    -         -                                        -
.Unlock                             method    -         -                                        -
.Active                             property  b         true                                     emits-change
.Audit                              property  u         1                                        const
.Class                              property  s         "user"                                   const
.Desktop                            property  s         ""                                       const
.Display                            property  s         ""                                       const
.Id                                 property  s         "1"                                      const
.IdleHint                           property  b         true                                     emits-change
.IdleSinceHint                      property  t         1434494624206001                         emits-change
.IdleSinceHintMonotonic             property  t         0                                        emits-change
.Leader                             property  u         762                                      const
.Name                               property  s         "lennart"                                const
.Remote                             property  b         false                                    const
.RemoteHost                         property  s         ""                                       const
.RemoteUser                         property  s         ""                                       const
.Scope                              property  s         "session-1.scope"                        const
.Seat                               property  (so)      "seat0" "/org/freedesktop/login1/seat... const
.Service                            property  s         "gdm-autologin"                          const
.State                              property  s         "active"                                 -
.TTY                                property  s         "/dev/tty1"                              const
.Timestamp                          property  t         1434494630344367                         const
.TimestampMonotonic                 property  t         34814579                                 const
.Type                               property  s         "x11"                                    const
.User                               property  (uo)      1000 "/org/freedesktop/login1/user/_1... const
.VTNr                               property  u         1                                        const
.Lock                               signal    -         -                                        -
.PauseDevice                        signal    uus       -                                        -
.ResumeDevice                       signal    uuh       -                                        -
.Unlock                             signal    -         -                                        -

As before, the busctl command supports command line completion, hence both the service name and the object path used are easily put together on the shell simply by pressing TAB. The output shows the methods, properties, signals of one of the session objects that are currently made available by systemd-logind. There's a section for each interface the object knows. The second column tells you what kind of member is shown in the line. The third column shows the signature of the member. In case of method calls that's the input parameters, the fourth column shows what is returned. For properties, the fourth column encodes the current value of them.

So far, we just explored. Let's take the next step now: let's become active - let's call a method:

# busctl call org.freedesktop.login1 /org/freedesktop/login1/session/_31 org.freedesktop.login1.Session Lock

I don't think I need to mention this anymore, but anyway: again there's full command line completion available. The third argument is the interface name, the fourth the method name, both can be easily completed by pressing TAB. In this case we picked the Lock method, which activates the screen lock for the specific session. And yupp, the instant I pressed enter on this line my screen lock turned on (this only works on DEs that correctly hook into systemd-logind for this to work. GNOME works fine, and KDE should work too).

The Lock method call we picked is very simple, as it takes no parameters and returns none. Of course, it can get more complicated for some calls. Here's another example, this time using one of systemd's own bus calls, to start an arbitrary system unit:

# busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager StartUnit ss "cups.service" "replace"
o "/org/freedesktop/systemd1/job/42684"

This call takes two strings as input parameters, as we denote in the signature string that follows the method name (as usual, command line completion helps you getting this right). Following the signature the next two parameters are simply the two strings to pass. The specified signature string hence indicates what comes next. systemd's StartUnit method call takes the unit name to start as first parameter, and the mode in which to start it as second. The call returned a single object path value. It is encoded the same way as the input parameter: a signature (just o for the object path) followed by the actual value.

Of course, some method call parameters can get a ton more complex, but with busctl it's relatively easy to encode them all. See the man page for details.

busctl knows a number of other operations. For example, you can use it to monitor D-Bus traffic as it happens (including generating a .cap file for use with Wireshark!) or you can set or get specific properties. However, this blog story was supposed to be about sd-bus, not busctl, hence let's cut this short here, and let me direct you to the man page in case you want to know more about the tool.

busctl (like the rest of system) is implemented using the sd-bus API. Thus it exposes many of the features of sd-bus itself. For example, you can use to connect to remote or container buses. It understands both kdbus and classic D-Bus, and more!


But enough! Let's get back on topic, let's talk about sd-bus itself.

The sd-bus set of APIs is mostly contained in the header file sd-bus.h.

Here's a random selection of features of the library, that make it compare well with the other implementations available.

  • Supports both kdbus and dbus1 as back-end.

  • Has high-level support for connecting to remote buses via ssh, and to buses of local OS containers.

  • Powerful credential model, to implement authentication of clients in services. Currently 34 individual fields are supported, from the PID of the client to the cgroup or capability sets.

  • Support for tracking the life-cycle of peers in order to release local objects automatically when all peers referencing them disconnected.

  • The client builds an efficient decision tree to determine which handlers to deliver an incoming bus message to.

  • Automatically translates D-Bus errors into UNIX style errors and back (this is lossy though), to ensure best integration of D-Bus into low-level Linux programs.

  • Powerful but lightweight object model for exposing local objects on the bus. Automatically generates introspection as necessary.

The API is currently not fully documented, but we are working on completing the set of manual pages. For details see all pages starting with sd_bus_.

Invoking a Method, from C, with sd-bus

So much about the library in general. Here's an example for connecting to the bus and issuing a method call:

#include <stdio.h>
#include <stdlib.h>
#include <systemd/sd-bus.h>

int main(int argc, char *argv[]) {
        sd_bus_error error = SD_BUS_ERROR_NULL;
        sd_bus_message *m = NULL;
        sd_bus *bus = NULL;
        const char *path;
        int r;

        /* Connect to the system bus */
        r = sd_bus_open_system(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect to system bus: %s\n", strerror(-r));
                goto finish;

        /* Issue the method call and store the respons message in m */
        r = sd_bus_call_method(bus,
                               "org.freedesktop.systemd1",           /* service to contact */
                               "/org/freedesktop/systemd1",          /* object path */
                               "org.freedesktop.systemd1.Manager",   /* interface name */
                               "StartUnit",                          /* method name */
                               &error,                               /* object to return error in */
                               &m,                                   /* return message on success */
                               "ss",                                 /* input signature */
                               "cups.service",                       /* first argument */
                               "replace");                           /* second argument */
        if (r < 0) {
                fprintf(stderr, "Failed to issue method call: %s\n", error.message);
                goto finish;

        /* Parse the response message */
        r = sd_bus_message_read(m, "o", &path);
        if (r < 0) {
                fprintf(stderr, "Failed to parse response message: %s\n", strerror(-r));
                goto finish;

        printf("Queued service job as %s.\n", path);


        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;

Save this example as bus-client.c, then build it with:

$ gcc bus-client.c -o bus-client `pkg-config --cflags --libs libsystemd`

This will generate a binary bus-client you can now run. Make sure to run it as root though, since access to the StartUnit method is privileged:

# ./bus-client
Queued service job as /org/freedesktop/systemd1/job/3586.

And that's it already, our first example. It showed how we invoked a method call on the bus. The actual function call of the method is very close to the busctl command line we used before. I hope the code excerpt needs little further explanation. It's supposed to give you a taste how to write D-Bus clients with sd-bus. For more more information please have a look at the header file, the man page or even the sd-bus sources.

Implementing a Service, in C, with sd-bus

Of course, just calling a single method is a rather simplistic example. Let's have a look on how to write a bus service. We'll write a small calculator service, that exposes a single object, which implements an interface that exposes two methods: one to multiply two 64bit signed integers, and one to divide one 64bit signed integer by another.

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <systemd/sd-bus.h>

static int method_multiply(sd_bus_message *m, void *userdata, sd_bus_error *ret_error) {
        int64_t x, y;
        int r;

        /* Read the parameters */
        r = sd_bus_message_read(m, "xx", &x, &y);
        if (r < 0) {
                fprintf(stderr, "Failed to parse parameters: %s\n", strerror(-r));
                return r;

        /* Reply with the response */
        return sd_bus_reply_method_return(m, "x", x * y);

static int method_divide(sd_bus_message *m, void *userdata, sd_bus_error *ret_error) {
        int64_t x, y;
        int r;

        /* Read the parameters */
        r = sd_bus_message_read(m, "xx", &x, &y);
        if (r < 0) {
                fprintf(stderr, "Failed to parse parameters: %s\n", strerror(-r));
                return r;

        /* Return an error on division by zero */
        if (y == 0) {
                sd_bus_error_set_const(ret_error, "net.poettering.DivisionByZero", "Sorry, can't allow division by zero.");
                return -EINVAL;

        return sd_bus_reply_method_return(m, "x", x / y);

/* The vtable of our little object, implements the net.poettering.Calculator interface */
static const sd_bus_vtable calculator_vtable[] = {
        SD_BUS_METHOD("Multiply", "xx", "x", method_multiply, SD_BUS_VTABLE_UNPRIVILEGED),
        SD_BUS_METHOD("Divide",   "xx", "x", method_divide,   SD_BUS_VTABLE_UNPRIVILEGED),

int main(int argc, char *argv[]) {
        sd_bus_slot *slot = NULL;
        sd_bus *bus = NULL;
        int r;

        /* Connect to the user bus this time */
        r = sd_bus_open_user(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect to system bus: %s\n", strerror(-r));
                goto finish;

        /* Install the object */
        r = sd_bus_add_object_vtable(bus,
                                     "/net/poettering/Calculator",  /* object path */
                                     "net.poettering.Calculator",   /* interface name */
        if (r < 0) {
                fprintf(stderr, "Failed to issue method call: %s\n", strerror(-r));
                goto finish;

        /* Take a well-known service name so that clients can find us */
        r = sd_bus_request_name(bus, "net.poettering.Calculator", 0);
        if (r < 0) {
                fprintf(stderr, "Failed to acquire service name: %s\n", strerror(-r));
                goto finish;

        for (;;) {
                /* Process requests */
                r = sd_bus_process(bus, NULL);
                if (r < 0) {
                        fprintf(stderr, "Failed to process bus: %s\n", strerror(-r));
                        goto finish;
                if (r > 0) /* we processed a request, try to process another one, right-away */

                /* Wait for the next request to process */
                r = sd_bus_wait(bus, (uint64_t) -1);
                if (r < 0) {
                        fprintf(stderr, "Failed to wait on bus: %s\n", strerror(-r));
                        goto finish;


        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;

Save this example as bus-service.c, then build it with:

$ gcc bus-service.c -o bus-service `pkg-config --cflags --libs libsystemd`

Now, let's run it:

$ ./bus-service

In another terminal, let's try to talk to it. Note that this service is now on the user bus, not on the system bus as before. We do this for simplicity reasons: on the system bus access to services is tightly controlled so unprivileged clients cannot request privileged operations. On the user bus however things are simpler: as only processes of the user owning the bus can connect no further policy enforcement will complicate this example. Because the service is on the user bus, we have to pass the --user switch on the busctl command line. Let's start with looking at the service's object tree.

$ busctl --user tree net.poettering.Calculator

As we can see, there's only a single object on the service, which is not surprising, given that our code above only registered one. Let's see the interfaces and the members this object exposes:

$ busctl --user introspect net.poettering.Calculator /net/poettering/Calculator
NAME                                TYPE      SIGNATURE RESULT/VALUE FLAGS
net.poettering.Calculator           interface -         -            -
.Divide                             method    xx        x            -
.Multiply                           method    xx        x            -
org.freedesktop.DBus.Introspectable interface -         -            -
.Introspect                         method    -         s            -
org.freedesktop.DBus.Peer           interface -         -            -
.GetMachineId                       method    -         s            -
.Ping                               method    -         -            -
org.freedesktop.DBus.Properties     interface -         -            -
.Get                                method    ss        v            -
.GetAll                             method    s         a{sv}        -
.Set                                method    ssv       -            -
.PropertiesChanged                  signal    sa{sv}as  -            -

The sd-bus library automatically added a couple of generic interfaces, as mentioned above. But the first interface we see is actually the one we added! It shows our two methods, and both take "xx" (two 64bit signed integers) as input parameters, and return one "x". Great! But does it work?

$ busctl --user call net.poettering.Calculator /net/poettering/Calculator net.poettering.Calculator Multiply xx 5 7
x 35

Woohoo! We passed the two integers 5 and 7, and the service actually multiplied them for us and returned a single integer 35! Let's try the other method:

$ busctl --user call net.poettering.Calculator /net/poettering/Calculator net.poettering.Calculator Divide xx 99 17
x 5

Oh, wow! It can even do integer division! Fantastic! But let's trick it into dividing by zero:

$ busctl --user call net.poettering.Calculator /net/poettering/Calculator net.poettering.Calculator Divide xx 43 0
Sorry, can't allow division by zero.

Nice! It detected this nicely and returned a clean error about it. If you look in the source code example above you'll see how precisely we generated the error.

And that's really all I have for today. Of course, the examples I showed are short, and I don't get into detail here on what precisely each line does. However, this is supposed to be a short introduction into D-Bus and sd-bus, and it's already way too long for that …

I hope this blog story was useful to you. If you are interested in using sd-bus for your own programs, I hope this gets you started. If you have further questions, check the (incomplete) man pages, and inquire us on IRC or the systemd mailing list. If you need more examples, have a look at the systemd source tree, all of systemd's many bus services use sd-bus extensively.

June 17, 2015

Testing rawhide apps using xdg-app

An important aspect of xdg-app is application sandboxing, which will require application changes to use sandbox-specific APIs. However, xdg-app is also a good way to deploy and run non-sandboxed (or partially sandboxed) regular applications.

A very interesting usecase for this is to have an image-based operating system, for instance a Workstation spin of Fedora Atomic. Such a system would have a basic workstation installation with a read-only /usr, and atomic updates/rollback. However, installing an application is painful, and customizing yor install in that way undoes many of the advantages of an image-based OS.

With xdg-app you can install apps into /var (or $HOME) and have them fully integrate with the system, while still being isolated from changes to the host. This makes for a great combination, just like atomic + docker is a good combination for the server space.

I’ve spent some time recently making a prototype runtime based on the Fedora packages, as reported on the desktop list. This is kind of interesting as it lets you test applications from rawhide on fedora 21 or 22. Just install xdg-app from fedora-updates and then install the runtime:

$ xdg-app add-remote --no-gpg-verify --user fedora
$ xdg-app install-runtime --user fedora org.fedoraproject.Platform 23

And then you can try gedit 3.17.0:

$ xdg-app install-app --user fedora org.gnome.gedit
$ xdg-app run org.gnome.gedit

Or evince 3.17.2:

$ xdg-app install-app --user fedora org.gnome.evince
$ xdg-app run org.gnome.evince

Once installed you can also just start them from the desktop environment as usual.  They should be there like any regular application as the desktop files and icons are exported to the host.

June 11, 2015

Rich and comments

Rich Jones posted an article about being banned by Boing-Boing, supposedly for bringing attention to their use of affiliate links (the practice that Gamergate groups criticized as well — and scored a regulatory win against). Meanwhile, all my comments at Rich's blog are blackholed, which is quite ironic. Generally, I am not into this "blog comment" thing. Ani-nouto never had any comments and is doing great that way. But some people like comments, so I leave them as necessary.

June 06, 2015

What tools are changing our world next?

Quick brain dump after a bike ride home: free software took a huge leap in the late 90s and early 00s in large part because of non-ideological advantages that the rest of the world is now competing with or surpassing:

HDR automatically created from old pictures of Muir Woods by Google Photos.
HDR automatically created by Google Photos from my old pictures of Muir Woods. Not perfect, but better than I ever bothered to do!
  • Collaboration tools: Because we got to the ‘net first, our tools for collaborating with each other were simply better than what proprietary developers were doing: cvs, mailman, wiki, etc., were all better than the silo’d old-school tools. Modern best-of-breed collaboration tools have all learned from what we did and added proprietary sauce on top: github, slack, Google Docs, etc. So our tools that are now (at best) as productive as our proprietary counterparts, and sometimes less productive but ideologically agreeable.
  • Release processes: “Release early/release often” made us better partners for our users. We’re now actively behind here: compare how often a mobile app or web user gets updates, exactly as the author intended, relative to a user of a modern Linux distro.
  • Zero cost: We did things for no (direct) cost by subsidizing our work through college, startups, or consulting gigs; now everyone has a subsidize-by-selling-something-else model (usually advertising, though sometimes freemium). Again, advantage (mostly?) lost.
  • Knowing our users: We knew a lot about our users, because we were our biggest users, and we talked to other users a lot; this was more effective than what passed for software design in the late 90s. This has been eclipsed by extensive a/b testing throughout the industry, and (to a lesser extent) by more extensive usage of direct user testing and design-thinking.

None of these are terribly original observations – all of these have been remarked on before. But after playing some with Google Photos this weekend, I’m ready to add another one to the list:

Worth asking what your project is doing that could be radically changed if your competitors get access to new technology. For example, for Wikipedia:

  • Collaborating: Wiki was best-of-breed (or close); it isn’t anymore. Visual Editor helps get editing back to par, but the social aspect of collaboration is still lacking relative to the expectations of many users.
  • Knowledge creation: big groups of humans, working together wiki-style, is the state of the art for creating useful, non-BS knowledge at scale. With the aforementioned machine learning, I suspect this will no longer the case in a (growing) number of domains.

I’m sure there are others…

May 29, 2015

Cool hardware in Vancouver

There wasn't much, but more than in Atlanta. The most "pro" looking kit was presented by NEC: basically a bladeserver, but the "blades" are SBCs, each of them accompained by a dedicated drive card. I can see downsides of this design, but very cute.

Unfortunately, they only offer CPU cards based on Atom. No ARM or anything.

The only other interesting booth belonged to StackVelocity, a subsidiary of JB Circuits that does custom design.

I'm sorry to say, their wares looked decidedly pedestrian, which is to be expected: their sales point is low cost, and stuff of that nature underpins the modern datacenter. One curious thing, however, is the variety of flash cards they offer. Basically Fusion-IO on budget. One was particularly tricky by having 2 layers. At first I even thought it could have flash chips mounted sideways, but nope, the science of low-cost computing is not there yet.

P.S. NEC also sell the same chassis with CPU cards instead of drive cards under the index "DX1000".

Semi-hard numbers from Rackspace

Previously in hard numbers: China, Wikimedia, Amazon S3. Rackspace previously reported in creiht's preso 18 months ago. This time, scotty went public at the Vancouver (Liberty) summit with the following:

> 50 billion objects
> 100 PB data (sanitized number, but way higher than 85 PB)
= 6 global clusters
3:1 PUT:GET ratio
10k+ requests/second

The number of objects is roughly 40 times less than in Amazon S3.

May 19, 2015

GDB Preattach

In firefox development, it’s normal to do most development tasks via the mach command. Build? Use mach. Update UUIDs? Use mach. Run tests? Use mach. Debug tests? Yes, mach mochitest --debugger gdb.

Now, normally I run gdb inside emacs, of course. But this is hard to do when I’m also using mach to set up the environment and invoke gdb.

This is really an Emacs bug. GUD, the Emacs interface to all kinds of debuggers, is written as its own mode, but there’s no really great reason for this. It would be way cooler to have an adaptive shell mode, where running the debugger in the shell would magically change the shell-ish buffer into a gud-ish buffer. And somebody — probably you! — should work on this.

But anyway this is hard and I am lazy. Well, sort of lazy and when I’m not lazy, also unfocused, since I came up with three other approaches to the basic problem. Trying stuff out and all. And these are even the principled ways, not crazy stuff like screenify.

Oh right, the basic problem.  The basic problem with running gdb from mach is that then you’re just stuck in the terminal. And unless you dig the TUI, which I don’t, terminal gdb is not that great to use.

One of the ideas, in fact the one this post is about, since this post isn’t about the one that I couldn’t get to work, or the one that is also pretty cool but that I’m not ready to talk about, was: hey, can’t I just attach gdb to the test firefox? Well, no, of course not, the test program runs too fast (sometimes) and racing to attach is no fun. What would be great is to be able to pre-attach — tell gdb to attach to the next instance of a given program.

This requires kernel support. Once upon a time there were some gdb and kernel patches (search for “global breakpoints”) to do this, but they were never merged. Though hmm! I can do some fun kernel stuff with SystemTap…

Specifically what I did was write a small SystemTap script to look for a specific exec, then deliver a SIGSTOP to the process. Then the script prints the PID of the process. On the gdb side, there’s a new command written in Python that invokes the SystemTap script, reads the PID, and invokes attach. It’s a bit hacky and a bit weird to use (the SIGSTOP appears in gdb to have been delivered multiple times or something like that). But it works!

It would be better to have this functionality directly in the kernel. Somebody — probably you! — should write this. But meanwhile my hack is available, along with a few other gdb scxripts, in my gdb helpers github repository.

May 06, 2015

How Mitchell Baker made me to divorce

Well, nearly did. Deleting history in Firefox 37 is very slow and the UI locks up while you do that. "Very slow" means an operation that takes 13 minutes (not exaggerating - it's reproducible). The UI lock-up means a non-dismissable context menu floating over everything; Firefox itself being, of course, entirely unresponsive. See the screencap.

The screencap is from Linux where I confirmed the problem, but the story started on Windows, where my wife tried to tidy up a bit. So, when Firefox locked up, she killed it, and repeated the process a few times. And what else would you do? We are not talking about hanging up for seconds - it literally was many minutes. Firefox did not pop a dialog with "Please wait, deleting 108,534 objects with separate SQLite transactions", a progress gauge, and a "Cancel" button. Instead, it pretended to lock up.

Interestingly enough, remember when Firefox had a default to keep the history for a week? This mode is gone now - FF keeps the history potentially forever. Instead, it offers a technical limit: 108,534 entries are saved in the "Places" database at the most, in order to prevent SQLite from eating all your storage. Now I understand why my brown "visited" links never go back to blue anymore.

The problem is, there's no alternative. I tried to use Midori as my main browser for a month or two in early 2014, but it was a horrible crash city. I had no choice but to give up and go back to Firefox and its case of Featuritis Obesum.

Come work with me – developer edition!

It has been a long time since I was able to say to developer friends “come work with me” in anything but the most abstract “come work under the same roof” kind of sense. But today I can say to developers “come work with me” and really mean it. Which is fun :)

By Supercarwaar (Own work) [CC BY-SA 3.0], via Wikimedia Commons
By Supercarwaar, CC BY-SA 3.0
Details: Wikimedia’s new community tech team is hiring for a community tech developer and a team lead. This will be extremely community-intensive work, so if you enjoy and get energy from working with a community and helping them achieve their goals, this could be a great role for you. This team will work intensely with my department to ensure that we’re correctly identifying and prioritizing the needs of our most active editors. If that sounds like fun, get in touch :)

[And I realize that I’ve been bad and not posted here, so here’s my new job announce: “my department” is the Foundation’s new Community Engagement department, where we work to support healthy contributor communities and help WMF-community collaboration. It is a detour from law, but I’ve always said law was just a way to help people do their thing — so in that sense is the same thing I’ve always been doing. It has been an intense roller coaster of a first two months, and I look forward to much more of the same.]

May 05, 2015

Thoughts on a feedback loop for Trinity.

With the success that afl has been having on fuzzing userspace, I’ve been revisiting an idea that Andi Kleen gave me years ago for trinity, which was pretty much the same thing but for kernel space. I.e., a genetic algorithm that rates how successful the last fuzz attempt was, and makes a decision on whether to mutate that last run, or do something completely new.

It’s something I’ve struggled to get my head around for a few years. The mutation part would be fairly easy. We would need to store the parameters from the last run, and extrapolate out a set of ->mutate functions from the existing ->sanitize functions that currently generate arguments.

The difficult part is the “how successful” measurement. Typically, we don’t really get anything useful back from a syscall other than “we didn’t crash”, which isn’t particularly useful in this case. What we really want is “did we execute code that we’ve not previously tested”. I’ve done some experiments with code coverage in the past. Explorations of the GCOV feature in the kernel didn’t really get very far however for a few reasons (primarily that it really slowed things down too much, and also I was looking into this last summer, when the initial cracks were showing that I was going to be leaving Red Hat, so my time investment for starting large new projecs was limited).

After recent discussions at work surrounding code coverage, I got thinking about this stuff again, and trying to come up with workable alternatives. I started wondering if I could use the x86 performance counters for this. Basically counting the number of instructions executed between system call enter/exit. The example code that Vince Weaver wrote for perf_event_open looked like a good starting point. I compiled it and ran it a few times.

$ ./a.out 
Measuring instruction count for this printf
Used 3212 instructions
$ ./a.out 
Measuring instruction count for this printf
Used 3214 instructions

Ok, so there’s some loss of precision there, but we can mask off the bottom few bits. A collision isn’t the end of the world for what we’re using this for. That’s just measuring userspace however. What happens if we tell it to measure the kernel, and measure say.. getpid().

$ ./a.out 
Used 9283 instructions
$ ./a.out 
Used 9367 instructions

Ok, that’s a lot more precision we’ve lost. What the hell.
Given how much time he’s spent on this stuff, I emailed Vince, and asked if he had insight as to why the counters weren’t deterministic across different runs. He had actually written a paper on the subject. Turns out we’re also getting event counts here for page faults, hardware interrupts, timers, etc.
x86 counters lack the ability to say “only generate events if RIP is within this range” or anything similar, so it doesn’t look like this is going to be particularly useful.

That’s kind of where I’ve stopped with this for now. I don’t have a huge amount of time to work on this, but had hoped that I could hack up something basic using the perf counters, but it looks like even if it’s possible, it’s going to be a fair bit more work than I had anticipated.

It occurred to me after posting this that measuring instructions isn’t going to work regardless of the amount of precision the counters offer. Consider a syscall that operates on vma’s for example. Over the lifetime of a process, the number of executed instructions of a call to such a syscall will vary even with the same input parameters, as the lengths of various linked lists that have to be walked will change. Number of instructions, or number of branches taken/untaken etc just isn’t a good match for this idea. Approximating “have we been here before” isn’t really achievable with this approach afaics, so I’m starting to think something like the initial gcov idea is the only way this could be done.

The post Thoughts on a feedback loop for Trinity. appeared first on

Reach the Top With NetworkManager 1.0.2

Summit - Asbjørn Floden (CC BY-NC 2.0)Summit – Asbjørn Floden (CC BY-NC 2.0)

Just this morning Lubomir released NetworkManager 1.0.2, the latest of the 1.0 stable series.  It’s  a great cleanup and bugfix release with contributions from lots of community members in many different areas of the project!

Some highlights of new functionality and fixes:

  • Wi-Fi device band capability indications, requested by the GNOME Shell team
  • Devices set to ignore carrier that use DHCP configurations will now wait a period of time for the carrier to appear, instead of failing immediately
  • Startup optimizations allow networking-dependent services to be started much earlier by systemd
  • Memory usage reductions through many memory leak fixes and optimizations
  • teamd interface management is now more robust and teamd is respawned when it terminates
  • dnsmasq is now respawned when it terminates in the local caching nameserver configuration
  • Fixes for an IPv6 DoS issue CVE-2015-2924, similar to one fixed recently in the kernel
  • IPv6 Dynamic DNS updates sent through DHCP now work more reliably (and require a fully qualified name, per the RFCs)
  • An IPv6 router solicitation loop due to a non-responsive IPv6 router has been fixed

While the list of generally interesting enhancements may be short, it masks 373 git commits and over 50 bugzilla issues fixed.  It’s a great release and we recommend that everyone upgrade.

Next up is NetworkManager 1.2, with DNS improvements, Wi-Fi scanning and AP list fixes for mobile uses, NM-in-containers improvements (no udev required!), even less dependence on the obsolete dbus-glib, less logging noise, device management fixes, continuing removal of external dependencies (like avahi-autoipd), configuration reload-ability, and much more!

May 04, 2015

kernel code coverage brain dump.

Someone at work recently asked me about code coverage tooling for the kernel. I played with this a little last year. At the time I was trying to figure out just how much of certain syscalls trinity was exercising. I ended up being a little disappointed at the level of post-processing tools to deal with the information presented, and added some things to my TODO list to find some time to hack up something, which quickly bubbled its way to the bottom.

As I did a write-up based on past experiences with this stuff, I figured I’d share.

requires kernel built with
Note: Setting GCOV_PROFILE_ALL incurs some performance penalty, so any resulting kernel built with this option should _never_ be used for any kind of performance tests.
I can’t exaggerate this enough, it’s miserably slow. Disk operations that took minutes for me now took hours. As example:


# time dd if=/dev/zero of=output bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 0.409712 s, 1.3 GB/s
0.00user 0.40system 0:00.41elapsed 99%CPU (0avgtext+0avgdata 2980maxresident)k
136inputs+1024000outputs (1major+340minor)pagefaults 0swaps


# time dd if=/dev/zero of=output bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 6.17212 s, 84.9 MB/s
0.00user 7.17system 0:07.22elapsed 99%CPU (0avgtext+0avgdata 2940maxresident)k
0inputs+1024000outputs (0major+338minor)pagefaults 0swaps

From 41 seconds, to over 7 minutes. Ugh.

If we *didn’t* set GCOV_PROFILE_ALL, we’d have to recompile just the files we cared about with the relevant gcc profiling switches. It’s kind of a pain.

For all this to work, gcov expects to see a source tree, with:

  • .o objects
  • source files
  • .gcno files (these are generated during the kernel build)
  • .gcda files containing the runtime counters. These come from sysfs on the running kernel.

After booting the kernel, a subtree appears in sysfs at /sys/kernel/debug/gcov/
These directories mirror the kernel source tree, but instead of source files, now contain files that can be fed to the gcov tool. There will be a .gcda file, and a .gcno symlink back to the source tree (with complete path). Ie, /sys/kernel/debug/mm for example contains (among others..)

-rw------- 1 root root 0 Mar 24 11:46 readahead.gcda
lrwxrwxrwx 1 root root 0 Mar 24 11:46 readahead.gcno -> /home/davej/build/linux-dj/mm/readahead.gcno

It is likely the symlink will be broken on the test machine, because the path doesn’t exist, unless you nfs mount the source code from the built kernel for eg.

I hacked up the script below, which may or may not be useful for anyone else (honestly, it’s way easier to just use nfs).
Run it from within a kernel source tree, and it will populate the source tree with the relevant gcda files, and generate the .gcov output file.

obj=$(echo "$1" | sed 's/\.c/\.o/')
if [ ! -f $obj ]; then

dirname=$(dirname $1)
gcovfn=$(echo "$(basename $1)" | sed 's/\.c/\.gcda/')
if [ -f /sys/kernel/debug/gcov$pwd/$dirname/$gcovfn ]; then
  cp /sys/kernel/debug/gcov$pwd/$dirname/$gcovfn $dirname
  gcov -f -r -o $1 $obj
  if [ -f $(basename $1).gcov ]; then
    mv $(basename $1).gcov $dirname
  echo "no gcov data for /sys/kernel/debug/gcov$pwd/$dirname/$gcovfn"

Take that script, and run it like so..

$ cd kernel-source-tree
$ find . -type f -name "*.c" -exec "{}" \;

Running for eg, mm/mmap.c will cause gcov to spit out a mmap.c.gcov file (in the current directory) that has coverage information that looks like..

   135684:  269:static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
        -:  270:{
   135684:  271:        struct vm_area_struct *next = vma->vm_next;
        -:  272:
   135684:  273:        might_sleep();
   135686:  274:        if (vma->vm_ops && vma->vm_ops->close)
     5080:  275:                vma->vm_ops->close(vma);
   135686:  276:        if (vma->vm_file)
    90302:  277:                fput(vma->vm_file);
        -:  278:        mpol_put(vma_policy(vma));
   135686:  279:        kmem_cache_free(vm_area_cachep, vma);
   135686:  280:        return next;
        -:  281:}

The numbers on the left being the number of times that line of code was executed.
Lines beginning with ‘-‘ have no coverage information for whatever reason.
If a branch is not taken, it gets prefixed with ‘#####’, like so..

  4815374:  391:                if (vma->vm_start < pend) {
    #####:  392:                        pr_emerg("vm_start %lx < pend %lx\n",
        -:  393:                                  vma->vm_start, pend);
        -:  394:                        bug = 1;
        -:  395:                }

There are some cases that need a little more digging to explain. eg:

    88105:  237:static void __remove_shared_vm_struct(struct vm_area_struct *vma,
        -:  238:                struct file *file, struct address_space *mapping)
        -:  239:{
    88105:  240:        if (vma->vm_flags & VM_DENYWRITE)
    15108:  241:                atomic_inc(&file_inode(file)->i_writecount);
    88105:  242:        if (vma->vm_flags & VM_SHARED)
        -:  243:                mapping_unmap_writable(mapping);
        -:  244:
        -:  245:        flush_dcache_mmap_lock(mapping);
    88105:  246:        vma_interval_tree_remove(vma, &mapping->i_mmap);
        -:  247:        flush_dcache_mmap_unlock(mapping);
    88104:  248:}

In this example, lines 245 & 247 have no hitcount, even though there’s no way they could have been skipped.
If we look at the definition of flush_dcache_mmap_(un)lock, we see..
#define flush_dcache_mmap_lock(mapping) do { } while (0)
So the compiler never emitted any code, and hence, it gets treated the same way as the blank lines.

There is a /sys/kernel/debug/gcov/reset file that can be written to to reset the counters before each test if desired.

Additional thoughts

  • Not sure how inlining affects things.
  • There needs to be some element of post-processing, to work out percentages of code coverage etc, which may involve things like stripping out comments/preprocessor defines.
  • debug kernels differ in functionality in various low level features. For example LOCKDEP will fundamentally change the way spinlocks work. For coverage purposes though, we can choose to not care and stop drilling down at certain levels.
  • Whatever does the post-processing of results may need to aggregate results from multiple test machines. Think of the situation where we’re running a client/server test: Both machines will be running different code paths.
  • ggcov has some interesting looking tooling for visually displaying results.

The post kernel code coverage brain dump. appeared first on

May 01, 2015

Trinity socket improvements

I’ve been wanting to get back to working on the networking related code in trinity for a long time. I recently carved out some time in the evenings to make a start on some of the lower hanging fruit.

Something that bugged me a while is that we create a bunch of sockets on startup, and then when we call for eg, setsockopt() on that socket, the socket options we pass have more chance of not being the correct protocol for the protocol the socket was created for. This isn’t always a bad thing; for eg, one of the oldest kernel bugs trinity found was found by setting TCP options on a non-TCP socket. But doing this the majority of the time is wasteful, as we’ll just get -EINVAL most the time.

We actually have the necessary information in trinity to know what kind of socket we were dealing with in a socketinfo struct.

struct socket_triplet {
        unsigned int family;
        unsigned int type;
        unsigned int protocol;

struct socketinfo {
        struct socket_triplet triplet;
        int fd; 

We just had it at the wrong level of abstraction. setsockopt only ever saw a file descriptor. We could have searched through the fd arrays looking for the socketinfo that matched, but that seems like a lame solution. So I changed the various networking syscalls to take a ARG_SOCKETINFO instead of an ARG_FD. As a side-effect, we actually pass sockets to those syscalls more than say, a perf fd, or an epoll fd, or ..

There is still a small chance we pass some crazy fd, just to cover the crazy cases, though those cases don’t tend to trip things up much any more.

After passing down the triplet, it was a simple case of annotating the structures containing the various setsockopt function pointers to indicate which family they belonged to. AF_INET was the only complication, which needed special casing due to the multiple protocols for which we have setsockopt() functions. Creation of a second table, using the protocol instead of the family was enough for the matching code.

There are still a ton of improvements I want to make to this code, but it’s going to take a while, so it’s good when some mostly trivial changes like the above come together quickly.

The post Trinity socket improvements appeared first on

April 14, 2015

the more things change.. 4.0

$ ping gelk
PING ( 56(84) bytes of data.
WARNING: kernel is not very fresh, upgrade is recommended.
$ uname -r

Remember that one time the kernel versioning changed and nothing in userspace broke ? Me either.

Why people insist on trying to think they can get this stuff right is beyond me.


update: this was already fixed, almost exactly a year ago in the ping git tree. The (now removed) commentary kind of explains why they cared. Sigh.

The post the more things change.. 4.0 appeared first on

March 31, 2015

Official GNOME SDK runtime builds are out

As people who have followed the work on sandboxed applications know, we have promised a developer preview for GNOME 3.16. Well, 3.16 has now been released, so the time is now!

I spent last week setting up an build system on the GNOME infrastructure, and the output of this is finally available at:

This repository contains the gnome 3.16 runtimes, org.gnome.Platform, as well as a smaller one that is useful for less integrated apps (like games) called org.freedesktop.Platform. It also has corresponding develoment runtimes (org.gnome.Sdk and org.freedesktop.Sdk) that you can use to create applications for the platforms.

This is a developer preview, so consider these builds weakly supported. This means I will try to keep them somewhat updated if there are major issues and that I will keep them API and ABI stable. I will probably also pick up at least some 3.16.x minor releases as they are released.

I also did the first official release of xdg-app. For easy testing this is available for Fedora 21 and 22 as a copr repo.

Testing the SDK

Using the repo above makes it really easy to test this. Just install the xdg-app package from copr, log out+in (needed update the environment for the session), then follow these instructions (as a regular user):

  1. Install the Gnome SDK public key into  /usr/share/ostree/trusted.gpg.d, (or alternatively, use –no-gpg-verify when you add the remote below).
  2. Install the basic Gnome and freedesktop runtimes:
    $ xdg-app add-remote --user gnome-sdk
    $ xdg-app install-runtime --user gnome-sdk org.gnome.Platform 3.16
    $ xdg-app install-runtime --user gnome-sdk org.freedesktop.Platform 1.0
  3. Optionally install some locale packs:
    $ xdg-app install-runtime --user gnome-sdk 3.16
    $ xdg-app install-runtime --user gnome-sdk 1.0
  4. Install some apps from my repository of test apps:
    $ xdg-app add-remote --user --no-gpg-verify test-apps
    $ xdg-app install-app --user test-apps org.gnome.gedit
    $ xdg-app install-app --user test-apps org.freedesktop.glxgears
  5. Run the apps! You should find gedit listed among the regular applications in the shell as it exports a desktop file. But you can also run them manually like this:
    $ xdg-app run org.gnome.gedit
    $ xdg-app run org.freedesktop.glxgears
  6. I also packaged the latest gnome builder from git. It requires the full sdk which takes a bit longer to download:
    $ xdg-app install-runtime --user gnome-sdk org.gnome.Sdk 3.16
    $ xdg-app install-app --user test-apps org.gnome.Builder

All the above install the apps into your home-directory (in ~/.local/share/xdg-app) . You can also run the commands as root and skip the –user arguments to do system-wide application installs.

Future work

With the basics now laid down to run current applications in a minimally isolated environment the next step is to work on the sandboxing aspects more. This will require lots of work, both in the system side (things like kdbus), the desktop (add sandbox aware APIs, make pulseaudio protect clients from each other, etc)  and in modifying applications.

If you’re interested in this, you can follow the work on the wiki.

Building your own apps

If you download the SDKs you have enough tooling to build your own applications. There are some documentations on how to do this here.

I also created a git repository with the scripts I used to build the test applications above. It uses the gnome-sdk-bundles repostory which has some tooling and specfiles to easily bundle dependencies with the application.

Building the SDK

If you ever want to build the SDK yourself, it is available at:

This repository contains the desktop specific parts of the SDK, which is layered on a core Yocto layer. When you build the SDK this will be automatically checked out and built from:

However, if you don’t want to build all of this you can download the pre-build images from and put them in the freedesktop-sdk-base/images/x86_64 subdirectory of gnome-sdk-images. This can save you a lot of time and space.

March 22, 2015

Fedora at Midwest Rep Rap Fest 2015

I attended Midwest Rep Rap Fest 2015 this weekend, in Goshen, Indiana. Goshen is about 45 minutes outside of South Bend (the nearest regional airport). This part of Indiana is noteworthy for a few reasons, including the fact that Matthew Miller, the Fedora Project Leader, is from there. It also has a very large Amish population, which makes it one of the few places I've attended a conference where most of the local businesses have a place to tie up your horses. The Midwest Rep Rap Fest is an event dedicated to Open Source 3d printers (and their surrounding ecosystem). The primary sponsor of the event is SeeMeCNC, a local vendor that makes open source hardware delta 3d printers. A Delta printer is a 3d printer with a circular stationary bed. Attached to the bed are three vertical rods which serve as tracks for three geared motors. The motors move up and down the rods, and are connected to a central extruder which hangs down the center. The extruder is moved in three dimensions by moving the supports along their tracks. Watching a Delta 3d printer do its thing is pretty amazing, it seems to dance like a trapeze artists as it dips and swoops to print the object.

The Delta type of 3d printer was the most common printer at the event, many people had either bought SeeMeCNC printers or had built their own off their open source design. The SeeMeCNC team brought their super-sized Delta, which they think is the largest Delta printer in the world. It was easily 30 feet tall and barely fit in the building we were using (which is saying something, because we were in an exhibition hall at the local state fairgrounds). The owner of the company decided to see how big of a Delta printer he could build, and this was the result!

The printer used a shop vac to blow plastic pellets up a plastic hose into the giant heated end. Originally, they were trying to print a giant model of Groot (shown in progress in my picture above), but they had to leave it running overnight on Friday and when we came back Saturday morning, the print had failed because it had run out of plastic pellets! Later on, they printed a very large basket/vase with it (after fixing it so that it wouldn't run out of plastic).

Fedora had a table in the main room. I brought two open source 3d printers from Lulzbot and controlled them both from my laptop running Fedora 21. My larger printer, the Taz 4, was configured with a dual extruder addon, and I spent four hours on Friday calibrating it to print properly. On Saturday morning, I printed my first completely successful dual color print, a red and white tree frog!

The eyes didn't come out perfect, but it all came out aligned and in one piece. Several people offered me tips and advice on how to improve the print quality with the dual-extruder setup. One of the nice things about the Rep Rap fest was the extremely friendly nature of the community. Everyone was eager to help everyone else solve problems or improve their printers/prints. I used Pronterface to control the Taz 4, since it was better suited to handle the dual extruder controls.

My smaller printer, the Lulzbot Mini, was controlled with Cura-Lulzbot (a package which got added to Fedora a few days before the show!). Cura has a very fast and high quality slicer, but with less options for tweaking it than slic3r (the traditional open source slicing tool) does. 3d printers depend on a slicing tool to take a 3d model and convert it into the GCode machine instructions that tell the printer where to move and when to extract plastic. Cura also has a more polished UI than Pronterface.

The Lulzbot Mini is able to self level, self clean, and self calibrate, which almost eliminates the prep time before a print! One of the vendors at the show was Taulman, who is constantly innovating new filaments for 3d printing. They announced a new filament the weekend of the Rep Rap Fest, 910, and they gave me a sample to try out on the Mini. The Mini can print filaments with a melting point of 300 degrees Celsius or less, so it was well suited for the 910. 910 was interesting because it was incredibly strong, almost as good as polycarbonate! It was also translucent, which made it ideal for me to finish a project I've been working on for a long time: my 3d printed TARDIS model!

I printed four window panels and a topper piece for the lantern on the roof. A few other people had TARDIS models (including one that had storage drawers inside it), but mine was the biggest (and I think, the nicest).

One of Fedora's neighbors was mUVe, an open source SLA 3d printer. SLA 3d printers use a liquid resin and a DLP projector to make incredibly accurate 3d models that would be difficult or impossible to print on other kinds of 3d printers. It seemed like everyone was printing the same Groot model at the event, and they printed one that came out looking incredible. The inventor of the hardware was working their table, and we talked for a while about the importance of open source in hardware. He felt strongly that it was mandatory for him to release his work into open source so that other people could innovate and improve upon the designs he'd created. The mUVe printer was one of the largest SLA printers I've ever seen and the quality of its prints was amazing. The biggest downside is the complexity, it involves chemicals in the resin and in curing the prints once they have finished, but in my opinion, it was worth it. The cost was in the $1500-2000 price range, but he said he's working on something awesome that will bring that cost down. They used Creation Workshop to slice and control their printer, which was new to me, but it was also open source. It's C# though, but I want to see if I can get it working in Mono on Fedora. (They were also in the greater Detroit area, so I encouraged them to come out and demo it at Penguicon!)

Another neighbor had 3d printed an amazingly intricate "home clock". They had used a famous woodworking pattern, converted each of the pieces to a 3d model, then printed them. Each piece was then smoothed and attached together. The only piece they didn't print was the clock at the center! On the table, the top of the clock was taller than me (and I'm 6'4"). It didn't look 3d printed, it looked too nice! It took them 3 months to print it all. The owner said that if you're able to cut this model from wood and assemble it properly, you're considered to be a master in their community. Everyone was definitely in awe of it in this community.

It seemed like everyone showing off something at this event had a clever hack of their own. Some people were creating amazing models, some people had built new open source printers. One printer had color changing LED strips attached underneath it which changed from red to green to indicate the progress of the printing job. Another printer had a Raspberry Pi with camera wired into it so you had a "printer's eye view" as it printed. There was a custom 3d scanner designed to scan people's heads and torsos to make printable busts. There was even a printer that looked like some sort of industrial robot gone mad! The one thing these all had in common? They were open source. No one here was questioning open source, it was just the way they operated, sharing what they knew and building off each other's successes (and failures). There were a few MakerBot Replicators, but all of them had been hacked in some way.

Attendance at this years event was both up and down. There were more people and companies exhibiting at the event, including Texas Instruments, Hackaday, Lulzbot, Taulman, and Printed Solid. Printed Solid was giving out free samples of some amazing ColorFabb filament. I came home with some BronzeFill (prints into a bronze like material that when polished is heavy and shiny), a new flexible filament, and some carbon-fiber infused filament! They also had some really fantastic glow in the dark filament, but no samples of that were available (and I didn't have the spare cash to buy a full spool). General attendance at the event was about 750 people, which was down from last year (around 1000). The general consensus was that the event wasn't doing all it could to advertise itself, and the location wasn't exactly optimal (45 minutes from the nearest regional airport, almost 2 hours from a major airport). The majority of visitors were local to the Indiana/Michigan area. The event staff said that next year they plan on rebranding the event to a more general FOSS 3d printing event (not limiting themselves to the Midwest region of the US). I think that is the right decision, since they are the only open source 3d printing event that I'm aware of, and I'd really love to see them grow into something bigger and more accessible.

Oh, did I mention we had a celebrity at the event? Ben Heck was there with his Delta printer! He's built a pinball machine. I might want to be him a little bit (but I'm not). He was very friendly and cool, spent a lot of time talking to the other makers and attendees.

Thanks to Ben Williams, Fedora had a very nice booth setup. We had our Fedora tablecloth and lots of stickers to give away. I brought a good sampling of models I'd printed with Fedora and my 3d printers, and I had a lot of good conversations about using Linux and open source to power 3d printing and 3d model creation. My coworker (and celebrity writer) Brian Proffitt stopped by on Saturday and helped out at the table for a while. I was supposed to have Fedora 21 media to hand out, but the promised shipment never arrived. The computers there were a mix of Windows and Linux, very few Macs in this community. Several people were using Fedora, but most of the Linux instances were Debian.

The Fedora event box needs a little love, there wasn't very much in it that was useful anymore. The OLPC in it is very old now, and since the current OLPC hardware runs Android these days, it isn't as "cool" as it used to be. I restocked it with Fedora bubble stickers, but it probably needs a plan to revitalize it.

All in all, it was a very fun weekend event and a great opportunity to connect with the open source 3d printer community. I think it is the responsibility of Fedora (and Red Hat) to reach out to the maker communities and help them be open source in their own ways, and this was an excellent opportunity to do exactly that. Is there a Maker event happening somewhere near you? You can sign up to represent Fedora at that event like I did at MRRF: Fedora Event Calendar

March 16, 2015

virgil3d local rendering test harness

So I've still been working on the virgil3d project along with part time help from Marc-Andre and Gerd at Red Hat, and we've been making steady progress. This post is about a test harness I just finished developing for adding and debugging GL features.

So one of the more annoying issuess with working on virgil has been that while working on adding 3D renderer features or trying to track down a piglit failure, you generally have to run a full VM to do so. This adds a long round trip in your test/development cycle.

I'd always had the idea to do some sort of local system renderer, but there are some issues with calling GL from inside a GL driver. So my plan was to have a renderer process which loads the renderer library that qemu loads, and a mesa driver that hooks into the software rasterizer interfaces. So instead of running llvmpipe or softpipe I have a virpipe gallium wrapper, that wraps my virgl driver and the sw state tracker via a new vtest winsys layer for virgl.

So the virgl pipe driver sits on top of the new winsys layer, and the new winsys instead of using the Linux kernel DRM apis just passes the commands over a UNIX socket to a remote server process.

The remote server process then uses EGL and the renderer library, forks a new copy for each incoming connection and dies off when the rendering is done.

The final rendered result has to be read back over the socket, and then the sw winsys is used to putimage the rendering onto the screen.

So this system is probably going to be slower in raw speed terms, but for developing features or debugging fails it should provide an easier route without the overheads of the qemu process. I was pleasantly surprised it only took two days to pull most of this test harness together which was neat, I'd planned much longer for it!

The code lives in two halves. virgl-mesa-driver

[updated: pushed into the main branches]

Also the virglrenderer repo is standalone now, it also has a bunch of unit tests in it that are run using valgrind also, in an attempt to lock down some more corners of the API and test for possible ways to escape the host.

March 13, 2015

LSF/MM 2015 recap.

It’s been a long week.
Spent Monday/Tuesday at LSFMM. This year it was in Boston, which was convenient in that I didn’t have to travel anywhere, but less convenient in that I had to get up early and do a rush-hour commute to get to the conference location in time. At least the weather got considerably better this week compared to the frankly stupid amount of snow we’ve had over the last month.
LWN did their usual great write-up which covers everything that was talked about in a lot more detail than my feeble mind can remember.

A lot of things from last years event seem to still be getting a lot of discussion. SMR drives & persistent memory being the obvious stand-outs. Lots of discussion surrounding various things related to huge pages (so much so one session overran and replaced a slot I was supposed to share with Sasha, not that I complained. It was interesting stuff, and I learned a few new reasons to dislike the way we handle hugepages & forking), and I lost track how many times the GFP_NOFAIL discussion came up.

In a passing comment in one session, one of the people Intel sent (Dave Hansen iirc) mentioned that Intel are now shipping a 18 core/36 thread CPU. A bargain at just $4642. Especially when compared to this madness.

A few days before the event, I had been asked if I wanted to do a “how Akamai uses Linux” type talk at LSFMM, akin to what Chris Mason did re: facebook at last years event. I declined, given I’m still trying to figure that out myself. Perhaps another time.

Wednesday/Thursday, I attended Vault at the same location.
My take-away’s:

  • There was generally a lot more positive vibes around btrfs this year. Even with Josef playing bad cop to Chris’ good cop talk, things generally seemed to be moving away from a “everything is awful” toward “this actually works…” though with the qualifier “.. for facebook’s workload”. Josef did touch on one area that btrfs does still suck, which apparently is database workloads (iirc, due to the copy-on-write nature of btrfs). The spurious ENOSPC failures of the past should hopefully stay in the past. Things generally on the up and up. (Though, this does include the linecount, which has now passed 100KLOC, more than double that of XFS or ext*. Scary).
  • Equally positive vibes surrounding XFS. We celebrated the 20 year anniversary at one evening event, making us all feel just that little bit more like an old fart club. Interesting talk toward the end by Dave Chinner about the future of XFS, and how the current surge of development in XFS is probably its last for various scaling reasons as disks continue to get bigger and bigger. Predicting the future is always hard, but if what Dave said was true, things will start to get ‘interesting’ in about 5 years time, given every other filesystem we support in Linux has the same issues (or worse).
  • People still care a lot about NFS. Especially pNFS. Surprising amount of activity still happening.
  • Even when I worked there, I never really got Red Hat’s “big picture” wrt the several distributed filesystems they supported. Now that I’m not there, I feel even more out of the loop. “ceph is the way forward” “except when it’s glusterfs” or something. Oh, and GFS2 is still a thing apparently, for some reason.
  • As entertaining as Jeremy Allison might be, don’t go to a talk on Samba internals unless you work on it (in which case it’s too late for you). The horrors will likely keep you up at night.
  • Ted’s ext4 talk drew a decent crowd. As fancy as btrfs/xfs etc might be, a *lot* of people still give a crap about extN. Somehow I missed the addition of the ‘lazytime’ option to ext4. Seems neat. Played with it (and also the super-secret ‘dioread_nolock’ mount option). Saw another talk on orphan list scalability in ext4, which was interesting, but didn’t draw as big a crowd.

I got asked “What are you doing at Akamai ?” a lot. (answer right now: trying to bring some coherence to our multiple test infrastructures).
Second most popular question: “What are going to do after that ?”. (answer: unknown, but likely something more related to digging into networking problems rather than fighting shell scripts, perl and Makefiles).

All that, plus a lot of hallway conversations, long lunches, and evening activities that went on possibly a little later than they should have have led to me almost losing my voice today.
Really good use of time though. I had fun, and it’s always good to catch up with various people.

The post LSF/MM 2015 recap. appeared first on

March 03, 2015

Trinity 1.5 release.

As announced this morning, today I decided that things had slowed down (to an almost standstill of late) enough that it was worth making a tarball release of Trinity, to wrap up everything that’s gone in over the last year.

The email linked above covers most of the major changes, but a lot of the change over the last year has actually been groundwork for those features. Things like..

  • The post-mortem dumper needed the generation of the text and the writing to log files to be decoupled, which wasn’t particularly trivial.
  • Some features involved considerable rewrites. The fd generators are now pretty much isolated from each other, making adding a new one a simple task.
  • Handling of the mapping structs got a lot of cleanup (though there is definitely still a lot of room for improvement there, especially when we do things like splitting a mapping).
  • I should also mention the countless hours spent chasing down quite a few hard-to-reproduce bugs that are fixed in 1.5

As I mentioned in the announcement, I don’t see myself having a huge amount of time for at least this year to work on Trinity. I’ve had a number of people email me asking the status of some feature. Hopefully this demarkation point will answer the question.

So, it’s not abandoned, it just won’t be seeing the volume of change it has over the last few years. I expect my personal involvement will be limited to merging patches, and updating the syscall lists when new syscalls get added.

Trinity used to be on roughly a six month release schedule. We’ll see if by the end of the year there’s enough input from other people to justify doing a 1.6 release.

I’m also hopeful that time working on other projects mean I’ll come back to this at some point with fresh eyes. There are a number of features I wanted to implement that needed a lot more thought. Perhaps working on some other things for a while will give me the perspective necessary to realize those features.

The post Trinity 1.5 release. appeared first on

February 23, 2015

backup solutions.

For the longest time, my backup solution has been a series of rsync scripts that have evolved over time into a crufty mess. Having become spoiled on my mac with time machine, I decided to look into something better that didn’t involve a huge time investment on my part.

The general consensus seemed to be that for ready-to-use home-nas type devices, the way to go was either Synology, or Drobo. You just stick in some disks, and setup NFS/SAMBA etc with a bunch of mouse clicking. Perfect.

I had already decided I was going to roll with a 5 disk RAID6 setup, so bit the bullet and laid down $1000 for a Synology 8-Bay DS1815+. It came *triple* boxed, unlike the handful of 3TB HGST drives.
I chose the HGST’s after reading backblaze’s report on failure rates across several manufacturers, and figured that after the RAID6 overhead, 8TB would be more than enough for a long time, even at the rate I accumulate flac and wav files. Also, worst case, I still had 3 spare bays I could expand into later if needed.

Installation was a breeze. The plastic drive caddies felt a little flimsy, but the drives were secure once in them, even if they did feel like they were going to snap as I flexed them to pop them into place. After putting in all the drives, I connected the four ethernet ports, I powered it up.
After connecting to its web UI, it wanted to do a firmware update, like just about every internet connected device wants to do these days. It rebooted, and finally I could get about setting things up.

On first logging into the device over ssh, I think the first command I typed was uname. Seeing a 3.2 kernel surprised me a little. I got nervous thinking about how many VFS,EXT4,MD bugfixes hadn’t made their way back to long-term stable, and got the creeps a little. I decided to not think too much about it, and put faith in the Synology people doing backports (though I never got as far as looking into their kernel package).

The web ui is pretty slick, though felt a little sluggish at times. I set up my RAID6 volume with a bunch of clicks, and then listened as all those disks started clattering away. After creation, it wanted to do an initial parity scan. I set it going, and went to bed. The next morning before going to work, I checked on it, and noticed it wasn’t even at 20% done. I left it going while I went into the office the next day. I spent the night away from home, and so didn’t get back to it until another day later.

When I returned home, the volume was now ready, but I noticed the device was now noticeably hotter to touch than I remembered. I figured it had been hammering the disks non-stop for 24hrs, so go figure, and that it would probably cool off a little as it idled. As the device was now ready for exporting, I set up an nfs export, and then spent some time fighting uid mappings, as you do. The device does have ability to deal with LDAP and some other stuff that I’ve never had time to setup, so I did things the hard way. Once I had the export mounted, I started my first rsync from my existing backups.

While it was running, I remembered I had intended to set up bonding. A little bit of clicky-clicky later, it was done, and transfers started getting even faster. Very nice. I set up two bonds, with a pair of NICs in each. Given my desktop only has a dual NIC, that was good enough. Having a 2nd 2GigE bond I figured was nice in case I had multiple machines wanting to use it while I was doing a backup.

So the backup was going to take a while, so I left it running.
A few hours later, I got back to it, and again, it was getting really hot. There are two pretty big fans in the back of the units, and they were cranking out heat. Then, things started getting really weird. I noticed that the rsync had hung. I ctrl-c’d it, and tried logging into the device as root. It took _minutes_ to get a command prompt. I typed top and waited. About two minutes later top started. Then it spontaneously rebooted.

When it came back up, I logged in, and poked around the log files, and didn’t see anything out of the ordinary.
I restarted the rsync, and left it go for a while. About 20 minutes later, I came back to check on it again, and found that the box had just hung completely. The rsync was stalled, I couldn’t ssh in. I rebooted the device, cursed a bit, and then decided to think about it for a while, so never restarted the rsync. I clicked around in the interface, to see if there was anything I could turn on/off that would perhaps give me some clues wtf was going on.
Then it rebooted spontaneously again.

It was about this time I was ready to throw the damn thing out the window. I bought this thing because I wanted a turn-key solution that ‘just worked’, and had quickly come to realize that with this device when something went bad, I was pretty screwed. Sometimes “It runs Linux” just isn’t enough. For some people, the Synology might be a great solution, but it wasn’t for me. Reading some of the Amazon reviews, it seems there were a few people complaining about their units overheating, which might explain the random reboots I saw. For a device I wanted to leave switched on 24/7 and never think about, something that overheats (especially when I’m not at home) really doesn’t give me feel good vibes. Some of the other reviews on Amazon rave about the DS1815+. It may be that there was a bad batch, and I got unlucky, but I felt burnt on the whole experience, and even if I had got a replacement, I don’t know if I would have felt like I could have trusted this thing with my data.

I ended up returning it to Amazon for a refund, and used the money to buy a motherboard, cpu, ram etc to build a dedicated backup computer. It might not have the fancy web ui, and it might mean I’ll still be using my crappy rsync scripts, but when things go wrong, I generally have a much better chance of fixing the problems.

Other surprises: At one point, I opened the unit up to install an extra 4GB of RAM (It comes with just 2GB by default), I noticed that it runs off a single 250W power supply, which seemed surprising to me. I thought disks during spin-up used considerably more power, but apparently they’re pretty low power these days.

So, two weeks of wasted time, frustration, and failed experiments. Hopefully by next week I’ll have my replacement solution all set up and can move on to more interesting things instead of fighting appliances.

The post backup solutions. appeared first on

February 17, 2015

First fully sandboxed Linux desktop app

Its not a secret that I’ve been working on sandboxed desktop applications recently. In fact, I recently gave a talk at about it. However, up until now I’ve mainly been focusing on the bundling and deployment aspects of the problem. I’ve been running applications in their own environment, but having pretty open access to the system.

Now that the basics are working it’s time to start looking at how to create a real sandbox. This is going to require a lot of changes to the Linux stack. For instance, we have to use Wayland instead of X11, because X11 is impossible to secure. We also need to use kdbus to allow desktop integration that is properly filtered at the kernel level.

Recently Wayland has made some pretty big strides though, and we now have working Wayland sessions in Fedora 21. This means we can start testing real sandboxing for simple applications. To get something running I chose to focus on a game, because they require very little interaction with the system. Here is a video I made of Neverball, running in a minimal sandbox:

Click here to view the embedded video.

In this example we’re running a regular build of neverball in an environment which:

  • Is independent of the host distribution
  • Has no access to any system or user files other than the ones from the runtime and application itself
  • Has no access to any hardware devices, other than DRI (for GL rendering)
  • Has no network access
  • Can’t see any other processes in the system
  • Can only get input via Wayland
  • Can only show graphics via Wayland
  • Can only output audio via PulseAudio
  • … plus more sandboxing details

Yet the application is still simple to install and integrates nicely with the desktop. If you want to test it yourself, just follow the instructions on the project page and install org.neverball.Neverball.

Of course, there are still a lot to do here. For instance, PulseAudio doesn’t protect clients from each other, and for more complex applications we need to add new APIs to safely grant access to things like user files and devices. The sandbox details page has a more detailed list of what has to be done.

The road is long, but at least we have now started our journey!

February 16, 2015

NetworkManager for Administrators Part 1

4870003098_26ba44a08a_b(via scobleizer, CC BY 2.0)

NetworkManager is a system service that manages network interfaces and connections based on user or automatic configuration. It supports Ethernet, Bridge, Bond, VLAN, team, InfiniBand, Wi-Fi, mobile broadband (WWAN), PPPoE and other devices, and supports a variety of different VPN services.  You can manage it a couple different ways, from config files to a rich command-line client, a curses-like client for non-GUI systems, graphical clients for the major desktop environments, and even web-based management consoles like Cockpit.

There’s an old perception that NetworkManager is only useful on laptops for controlling Wi-Fi, but nothing could be further from the truth.  No laptop I know of has InfiniBand ports.  We recently released NetworkManager 1.0 with a whole load of improvements for workstations, servers, containers, and tiny systems from embedded to RaspberryPi.  In the spirit of making double-plus sure that everyone knows how capable and useful NetworkManager is, let’s take a magical journey into Administrator-land and start at the very bottom…

Daemon Configuration Files

Basic configuration is stored in /etc/NetworkManager/NetworkManager.conf in a standard key/value ini-style format.  The sections and values are well-described by ‘man NetworkManager.conf’.  A standard default configuration looks like this:


You can override default configuration through either command-line switches or by dropping “configuration snippets” into /etc/NetworkManager/conf.d.  These snippets use the same configuration options from ‘man NetworkManager.conf’ but are much easier to distribute among larger numbers of machines though packages or tools like Puppet, or even just to install features through your favorite package manager.  For example, in Fedora, there is a NetworkManager-config-connectivity-fedora RPM package that installs a snippet that enables connectivity checking to Fedora Project servers.  If you don’t care about connectivity checking, you simply ‘rpm -e NetworkManager-config-connectivity-fedora’ instead of tracking down and deleting /etc/NetworkManager/conf.d/20-connectivity-fedora.conf.

Just for kicks, let’s take a walk through the various configuration options, what they do, and why you might care about them in a server, datacenter, or minimal environment…

Configuration Snippets

First, each configuration “snippet” in /etc/NetworkManager/conf.d can override values set in earlier snippets, or even the default configuration (but not command-line options).  So the same option specified in 50-foobar.conf will override that option specified in 10-barfoo.conf.  Many options also support the “+” modifier, which allows their value to be added to earlier ones instead of replacing.  So “plugins+=something-else” will add “something-else” to the list, instead of overwriting any earlier values.  You’ll see why this is quite useful in a minute…

Dive Deep

plugins=ifcfg-rh | ifupdown | ifnet | ifcfg-suse | ibft (default empty)

This option enables or disables certain settings plugins, which are small loadable libraries that read and write distribution-specific network configuration.  For example, Fedora/RHEL would specify ‘plugins=ifcfg-rh’ for reading and writing the ifcfg file format, while Debian/Ubuntu would use ‘plugins=ifupdown’ for reading /etc/network/interfaces, and Gentoo would use ‘plugins=ifnet’.  If you know your distro’s config format like the back of your hand, NetworkManager doesn’t make you change it.

There is one default plugin though, ‘keyfile’, which NetworkManager uses to read and write configurations that the distro-specific plugins can’t handle.  These files go into /etc/NetworkManager/system-connections and are standard .ini-style key/value files.  If you’re interested in the key and value definitions, you can check out ‘man nm-settings’ and ‘man nm-settings-keyfiles’, or even look at some examples.

monitor-connection-files=yes | no (default no)

By popular demand, NetworkManager no longer watches configuration files for changes.  Instead, you make all the changes you want, and then explicitly tell NetworkManager when you’re done with “nmcli con reload” or “nmcli con load <filename>”.  This prevents reading partial configuration and allows you to double-check that everything is correct before making the configuration update.  Note that changes made through the D-Bus interface (instead of the filesystem) always happen immediately.

However, if you want the old behavior back, you can set this option to “yes”.

auth-polkit=yes | no (default yes)

If built with support for it, NetworkManager uses PolicyKit for fine-grained authorization of network actions.  This will be the subject of another article in this series, but the TLDR is that PolicyKit easily allows user A the permission to use WiFi while denying user B WiFi but allowing WWAN.  These things can be done with Unix groups, but that quickly gets unwieldy and isn’t fine-grained enough for some organizations.  In any case, PolicyKit is often unecessary on small, single-user systems or in datacenters with controlled access.  So even if your distribution builds NetworkManager with PolicyKit enabled, you can turn it off for simpler root-only operation.

dhcp=dhclient | dhcpcd | internal (default determined at build time, dhclient preferred if enabled)

With NetworkManager 1.0 we’ve added a new internal DHCP client (based off systemd code which was based off ConnMan code) which is smaller, faster, and lighter than dhclient or dhcpcd.  It doesn’t do DHCPv6 yet, but we’re working on that.  We think you’ll like it, and it’s certainly much less of a resource hog than a dhclient process for every interface. To use it, set this option to “internal” and restart NetworkManager.

If NetworkManager was built with support for dhclient or dhcpcd, you can use either of these clients by setting this option to the client’s name.  Note that if you enable both dhclient and dhcpcd, dhclient will be preferred for maximum compatibility.

no-auto-default= (default empty)

By default, NetworkManager will create an in-memory DHCP connection for every Ethernet interface on your system, which ensures that you have connectivity when bringing a new system up or booting a live DVD.  But that’s not ideal on large systems with many NICs, or on systems where you’d like to control initial network bring-up yourself.  In that case, you should set this option to “*” to disable the auto-Ethernet behavior for all interfaces, indicating that you’d like to create explicit configuration instead.  You can also use MAC addresses or interface names here too!  On Fedora we’ve created a package called NetworkManager-config-server that sets this option to “*” by default.

ignore-carrier= (default empty)

Trip over a cable?  Want to make sure a critical interface stays configured if the switch port goes down?  This option is for you!  Setting it to “*” (all interfaces) or using MAC addresses or interface names here will tell NetworkManager to ignore carrier events after the interface is configured.  For DHCP connections a carrier is obviously required for initial configuration, while static connections can start regardless of carrier status.  After that, feel free to unplug the cable every time Apple sells an iPhone!

configure-and-quit=yes | no (default no)

New with 1.0 is the “configure and quit” mode where NetworkManager configures interfaces (including, if desired, blocking startup until networking is active) and then quits, spawning small helpers to maintain DHCP leases and IPv6 address lifetimes if required.  In a datacenter or cloud where cycles are money, this can save you some cash and deliver a more stable setup with known behavior.

dns=dnsmasq | unbound | none | default (default empty, equivalent to “default”)

Want to control DNS yourself?  NetworkManager makes it easy!  Don’t want to?  NetworkManager makes that easy too! When you set this option to ‘dnsmasq’ NetworkManager will configure dnsmasq as a local caching nameserver, including split DNS for VPN tunnels.  If you set it to ‘none’ then NetworkManager won’t touch /etc/resolv.conf and you can use dispatcher scripts that NetworkManager calls at various points to set up DNS any way you choose.

Leaving the option empty or setting it to “default” asks NetworkManager to own resolv.conf, updating system DNS with any information from your explicit network settings or those received from automatic means like DHCP.

In the upcoming NetworkManager 1.2, DNS information is written to /var/lib/NetworkManager/resolv.conf and, if NM is allowed to manage /etc/resolv.conf, that file will be a symlink to the one in /var similar to systemd-resolvd.  This makes it easier for external tools to incorporate the DNS information that NetworkManager combines from multiple sources like DHCP, PPP, IPv6, VPNs, and more.

unmanaged-devices= (default empty)

Want to keep NetworkManager’s hands off a specific device?  That’s what this option is for, where you can use “interface-name:eth0″ or “mac:00:22:68:1c:59:b1″ to prevent automatic management of a device.  While there are some situations that require this, by default NetworkManager doesn’t touch virtual interfaces that it didn’t create, like bridges, bonds, VLANs, teams, macvlan, tun, tap, etc.  So while it’s unusual to need this option, we realize that NetworkManager can be used in concert with other tools, so it’s here if you do.

uri=  (default empty = disabled)
interval=(default 0 = disabled)
response=  (default “NetworkManager is online”)

Connectivity checking helps users log into captive ports and hotspots, while also providing information about whether or not the Internet is reachable.  When NetworkManager connects a network interface, it sends an HTTP request to the given URI and waits for the specified response.  If you’re connected to the Internet and the connectivity server isn’t down, the response should match and NetworkManager will change state from CONNECTED_SITE to CONNECTED.  It will also check connectivity every ‘interval’ seconds so that clients can report status to the user.

If you’re instead connected to a WiFi hotspot or some kind of captive portal like a hotel network, your DNS will be hijacked and the request will be redirected to an authentication server.  The response will be unexpected and NetworkManager will know that you’re behind a captive portal.  Clients like GNOME Shell will then indicate that you must authenticate before you can access the real Internet, and could provide an embedded web browser for this purpose.

Upstream connectivity checking is disabled by default, but some distribution variants (like Fedora Workstation) are now enabling it for desktops, laptops, and workstations.  On a server or embedded system, or where traffic costs a lot of money, you probably don’t want this feature enabled.  To turn it off you can either remove your distro-provided connectivity package (which just drops a file in /etc/NetworkManager/conf.d) or you can remove the options from NetworkManager.conf.

Special NetworkManager data files

In the normal course of network management sometimes non-configuration data needs to persist.  NetworkManager does this in the /var/lib/NetworkManager directory, which contains a few different files of interest:


This file contains the BSSIDs (MAC addresses) of WiFi access points that NetworkManager has connected to for each configured WiFi network.  NetworkManager doesn’t do this to spy on you (and the file is readable only by root), but instead to automatically connect to WiFi networks that do not broadcast their SSID.  You almost never need to touch this file, but if you are concerned about privacy feel free to delete this file periodically.


Each time you connect to a network, whether wired, WiFi, etc, NetworkManager updates the timestamp in this file.  This allows NetworkManager to determine which network you last used, which can be used to automatically connect you to more preferred networks.  NetworkManager also uses the timestamp as an indicator that you have successfully connected to the network before, which it uses when deciding whether or not to ask for your WiFi password when you get randomly disconnected or the driver fails.


This file stores persistent user-determined state for Airplane mode for each technology like WiFi, WWAN, and WiMAX.  Normally this is controlled by hardware buttons, but some systems don’t have hardware buttons or the drivers don’t work, plus that state is not persistent across boots.  So NetworkManager stores a user-defined state for each radio type and will ensure the radio stays in that state across reboots too.

DHCP lease and configuration files

When you obtain a DHCP lease, that lease may last longer than your connection to that network.  To ensure that you receive a nominally stable IP address the next time you connect, or to ensure that your TCP sessions are not broken if there is a network hiccup, NetworkManager stores the DHCP lease and attempts to acquire the same lease again.  These files are stored per-connection to ensure that a lease acquired on your home WiFi or ethernet network is not used for work or Starbucks.  Temporary DHCP configuration files are also stored here, which are constructed based on your preferences and on generic DHCP configuration files in /etc for each supported DHCP client.  If you want to wipe the DHCP slate clean, feel free to remove any of the lease or configuration files.

And that’s it for this time, stay tuned for the next part in this series!

January 30, 2015

Back from DevX hackfest

I’m now back from a week in Cambridge at the developer experience hackfest. This was a great event, it was a lot of fun to meet people again, and we got a lot of things done. I spent a lot of time talking to people about things related to xdg-app and sandboxed applications, both spreading information and actually implementing features.

I spent some time with Emmanuele, Ryan and Lars working on glib stuff, which resulted in the G_DECLARE_*_TYPE macros finally being merged. Additionally I reviewed the new list model abstraction which I hope we can land soon, and Ryan and I worked out a new fancy __attribute__(cleanup) approach that we hope to merge into glib soon.

We also worked a bit on Gtk+ OpenGL support. Based on feedback from early users we’re doing some changes in how GL contexts are created to allow you to configure them in more detail. We also decided that we want to completely drop support for legacy OpenGL contexts, as these had issues cooperating with Core 3.2 contexts, and because we don’t live in the 90s anymore. Carlos was working on converting GtkPopover to use (override redirect) toplevels on X11, and I gave him moral support and generally hated on ancient crappy X11 behaviour.

Props to Collabora and Philip for arranging a great event!

January 26, 2015

So much to learn..

At lunch a few days ago, I was discussing with a coworker how right now I’m feeling a little overwhelmed. Not completely unsurprising I suppose given the volume of new things I need to learn. But as I’m getting acclimated in my new job, it’s becoming clearer which things I either don’t know well enough or am completely unfamiliar with.

I’m only now realizing that not only the scope of what I’m working on changed, but how I work on things. During the ten years I worked on Fedora, because we were woefully understaffed and buried alive in bugs, there was never really time to spend a lot of time on an individual bug, unless a lot of users were hitting it. Because of this, the things that I ended up fixing were usually fairly small fixes. The occasional NULL pointer deference. Perhaps a use after free. Maybe linked-list corruption. Pretty basic stuff that anyone with familiarity with any part of the kernel could figure out reasonably quickly. Do enough of this, and it becomes less about engineering, and more about pattern recognition.

Even a lot of the bugs that trinity finds aren’t terribly invasive fixes.
(Screwed up error paths being probably the most common thing it picks up, because nothing else really ever tests them).

The more complicated bugs, like a WARN_ON being hit ? That could take a lot more understanding that could take a considerable time to get to.

As Fedora kernels were so close to mainline a lot of the time we could chase up the maintainer, pass the bug along, and move onto something else. The result of ten years of this way of working has meant I have a passing understanding how lots of parts of the kernel interact from a 10,000ft perspective, and perhaps some understanding of some nuances based on past interactions, but no deep architectural understanding of for eg: how an sk_buff traverses through the various parts of the network stack. A warning deep in the tcp guts, caused by a packet that’s been through bonding, netfilter, etc ? I’m not your guy. At least today.

Thankfully I’ve got some time to ramp up my learning on unfamiliar parts of the kernel, and get a better understanding of how things work on a deeper level. I suspect I’ll end up turning at least some of the stuff I learn into future posts here.

In the meantime, I’ve got a lot of reading to do.
I felt a little reassured at least when my coworker responded “yeah, me too”.

The post So much to learn.. appeared first on