February 03, 2012

The license term smorgasbord: copyleft, share-alike, reciprocal, viral, or hereditary?

I microblogged (diaspora, identica, twitter) the following statement a few weeks ago:

First new year’s resolution, 10 days late: I will use ‘hereditary license’ any time I am tempted to say ‘viral license.’

Surprisingly, this generated quite a few responses (on identica and elsewhere)- some people liked it, but many people had their own alternative to propose. So here are some longer-form thoughts.

There are four primary options that I am aware of when trying to find a one-word term for open source licenses that in some way compel distributors to also distribute code- i.e., the licenses called “copyleft” by those of us who have spent too much time with this stuff. The terms:

  • Copyleft: This is the common name when speaking to other people experienced in open source, so it’s obviously the first choice when you know your audience has at least some experience in open source. But to an audience not already involved in open source (the only time I’m ever even vaguely “tempted to say viral”), the phrase is completely non-obvious. It has zero evident meaning. In fact, it can actively confuse: it could mean the reverse of copyright, which to most people probably means “no license” or anti-copyright altogether. So it’s really not a good word to use for audiences who aren’t familiar with open source- which is to say, most audiences.
  • Viral: This is another old standby. Traditionally, the objection to this term has been that it is pejorative: no one likes viruses, so ‘viral’ is often seen as (and sometimes is) a deliberate attempt to frame copyleft licenses as inherently bad. That objection is certainly accurate, but I think there is another problem with this word: it implies that, like a virus, copyleft can spread to someone without their active involvement; you can “catch” it from the digital equivalent of being in the same room with someone, or not washing your hands. This isn’t the case – there must be a strong relationship between the copylefted code and the other code that might be required to be shared. This, to me, is where “viral” really fails to communicate. It makes people think that a copyleft is something that might “happen to them” instead of it being something that they have to be actively involved with.
  • Share-alike (or the related word “reciprocal”): Oddly, neither of these is used much outside of the Creative Commons world. Neither is a bad term: they are reasonably value-neutral, and they both imply that there must be an actively chosen relationship between the parties, which I think is important. But to me they don’t capture the why of the relationship; they make it sound like there might be a choice in the matter, or something you do because you’re a nice guy.
  • Hereditary: Richard Fontana traced this back to at least 2004, so it isn’t new, but without doubt this is the least used of the various terms I’m discussing here. At least for the moment, though, and for general audiences, I’m leaning towards thinking it is the best option. First, it implies that there has to be a real, derivative relationship between the two codebases; it can’t just happen at random (like viral implies). Second, it also implies that once the relationship exists, further licensing isn’t a choice- you must pass it on to the next folks who “inherit” from you. (Share-alike/reciprocal might imply that you do this because you’re a nice guy, which isn’t the case, or that you do it back to the original sharer- which also isn’t necessarily the case.) The analogy obviously isn’t perfect, most notably because a mere redistributor who hasn’t created a derivative work that “inherits” from the parent work is still bound by the license. But I think on balance that it has the fewest tradeoffs.

So there you go, for the dozen people who asked, and the hundreds, nay billions more who don’t care :)

Well hello OpenShift

I haven’t blogged lately. For some reason, since the popularization of Facebook posts and tweets, my ability to write more than a few coherent sentences has greatly diminished. Perhaps it is just me getting old, but change is what change does and a lot of change has happened recently. The biggest recent change is me getting a promotion to Senior Software Engineer and moving from the Fedora team to the OpenShift team inside of Red Hat. Yes, I have traded Beefy Miracles for Space Pandas and I think the change has done me some good. I have wanted to transition to a more customer-driven, structured part of Red Hat without sacrificing the excitement of working with a fast-moving project. OpenShift fit the bill very nicely with their agile development workflow in the emerging field of PaaS (Platform as a Service) cloud development. It is also nice having a large and growing team to work with.

My involvement with PyGObject

That being said, most of my hacking time will be spent on OpenShift-related projects, and while I had already transitioned out of day-to-day PyGObject maintainership some time ago, I will no longer have any real time to dedicate to the project (I’m actually learning Ruby right now). To tell the truth, not being able to put any more serious time into the project is one of the major reasons I decided I needed a change. There are a number of other people still contributing to the project, but it is sorely in need of a lead maintainer who can do releases, keep people on schedule and ping the right people when bugs languish. I feel PyGObject is in good shape, but as it begins to get more uptake, bug fixes need to be committed, edge cases corralled and the last mile traversed. I will still hang out in #python on GIMPNet and can be persuaded to look at patches or even write a few if you ping me and are nice.

Jobs

With me leaving the Fedora team, there is now an opening for someone to join the team. They are looking for an all-around FOSS rock star who can work in a number of different areas such as packaging, desktop and web development, and any number of miscellaneous skills you would encounter with any FOSS project. The main responsibilities would be maintaining, improving and integrating our infrastructure tools such as the Fedora Community Packages web app, the Bodhi update tool and the FAS accounts system, as well as developing tools to make it easier to contribute to Fedora. Most of the tools are written in Python, so being a Python expert is a big plus, as is having worked as part of a team on any major open source project. If that sounds like fun to you, send me your resume (I get a bonus if you get hired).

OpenShift is also expanding so if any of these jobs look more like your speed feel free to mail me also.


January 28, 2012

Difficulties of elisp

The thesis that underlies my project to translate the Emacs C code to Common Lisp is that Emacs Lisp is close enough to Common Lisp that the parts of the Emacs C code that implement Lisp can be dropped in favor of the generally superior CL implementation.  This is generally true, but there are a few difficult bits.

Symbols

The primary problem is the translation of symbols when used as variable references.  Consider this code:

(defvar global 73)
(defun function (argument)
  (let ((local (something-else)))
    (+ local argument global)))

More is going on here than meets the eye.

First, Emacs Lisp uses dynamic binding by default (optional lexical binding is a new feature in Emacs 24).  This applies to function arguments as well as other bindings.  So, you might think you could translate this straightforwardly to:

(defvar global 73)
(declaim (special global))
(defun function (argument)
  (declare (special argument))
  (let ((local (something-else)))
    (declare (special local))
    (+ local argument global)))

This was the approach taken by elisp.lisp; it defined macros for let and let* (but forgot defun) to do the dirty work:

(defmacro el::let* ((&rest vars) &rest forms)
  "Emacs-Lisp version of `let*' (everything special)."
  `(let* ,vars (declare (special ,@(mapcar #'from-list vars))) ,@forms))

But not so fast!  Emacs also has buffer-local variables.  These are variables where the value is associated with the current buffer; switching buffers makes a different binding visible to Lisp.  These require no special syntax, and a variable can be made buffer-local at any time.  So, we can break the above translation simply by evaluating:

(make-local-variable 'global)
(setq global 0)

Whoops!  Now the function will return the wrong result: the translation has no way to know that it should refer to the buffer-local value.  (Well, ok, pretend that the setq magically worked somehow…)

My idea for implementing this is pretty convoluted.  Actually I have two ideas, one “user” and one “kernel”:

User

I think it is possible to use define-symbol-macro on all symbols that come from Elisp, so that we can tell the CL compiler about the real implementation.  However, a symbol can either be treated as a variable, or it can be treated as a symbol-macro — not both at the same time.  So, we will need a second location of some kind to store the real value.  Right now I’m thinking a symbol in another package, but maybe a cons or some other object would work better. In either case, we’d need a macro, a setf method for its expansion, and some extra-tricky redefinitions of let and defun to account for this change.

This would look something like:

(define-symbol-macro global (elisp:get-elisp-value 'global))
(defsetf elisp:get-elisp-value elisp:set-elisp-value)
;; Details left as an exercise for the reader.

This solution then has to be applied to buffer-, keyboard-, and frame-local variables.
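To make the lookup rule concrete, here is a minimal sketch in Python rather than Common Lisp. It only models where a value would live and which one wins; the class names and the Python spellings of get_elisp_value and set_elisp_value are assumptions made for illustration, not part of any existing implementation.

```python
class Buffer:
    """A stand-in for an Emacs buffer that can hold buffer-local values."""

    def __init__(self, name):
        self.name = name
        self.local_values = {}  # symbol name -> buffer-local value


class ElispVariables:
    """Hypothetical 'second location' holding the real values of Elisp variables."""

    def __init__(self):
        self.global_values = {}  # symbol name -> default (global) value
        self.current_buffer = None

    def get_elisp_value(self, symbol):
        # A buffer-local value in the current buffer shadows the global one.
        buf = self.current_buffer
        if buf is not None and symbol in buf.local_values:
            return buf.local_values[symbol]
        return self.global_values[symbol]

    def set_elisp_value(self, symbol, value):
        # Assignment goes to the buffer-local slot if one exists, else to the global.
        buf = self.current_buffer
        if buf is not None and symbol in buf.local_values:
            buf.local_values[symbol] = value
        else:
            self.global_values[symbol] = value

    def make_local_variable(self, symbol):
        # The new buffer-local slot starts out with the currently visible value.
        local = self.current_buffer.local_values
        if symbol not in local:
            local[symbol] = self.get_elisp_value(symbol)


variables = ElispVariables()
variables.global_values["global"] = 73
scratch, notes = Buffer("*scratch*"), Buffer("notes")

variables.current_buffer = scratch
variables.make_local_variable("global")
variables.set_elisp_value("global", 0)
value_in_scratch = variables.get_elisp_value("global")

variables.current_buffer = notes
value_in_notes = variables.get_elisp_value("global")
```

Every read goes through the accessor, so switching the current buffer changes what a symbol evaluates to without any change at the point of use. That is the property a plain special declaration cannot provide.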

Kernel

The kernel method is a lot simpler to explain: hack a Common Lisp implementation to directly know about buffer-locals.  SMOP!  But on the whole I think this approach is to be less preferred.

Other Problems

Emacs Lisp also freely extends other typical data types with custom attributes.  I consider this part of the genius of Emacs; a more ordinary program would work within the strictures of some defined, external language, but Emacs is not so cautious or constrained.  (Emacs is sort of a case study in breaking generally accepted rules of programming; which makes one wonder whether those rules are any good at all.)

So, for example, strings in Emacs have properties as a built-in component.  The solution here is simple — we will just translate the Emacs string data type as a whole, something we probably have to do anyway, because Emacs also has its own idiosyncratic approach to different encodings.

In elisp, aref can be used to access elements of other vector-like objects, not just arrays; there are some other odd little cases like this.  This is also easily handled; but it left me wondering why things like aref aren’t generic methods in CL. It often seems to me that a simpler, more orthogonal language lies inside of CL, struggling to get free. I try not to think these thoughts, though, as that way lies Scheme and the ridiculous fragmentation that has left Lisp unpopular.

January 26, 2012

The Case for the /usr Merge

One of the features of Fedora 17 is the /usr merge, put forward by Harald Hoyer and Kay Sievers[1]. Since this feature was proposed, repetitive discussions have taken place all over the various Free Software communities, and usually the same questions were asked: what the reasons behind this feature were, and whether it makes sense to adopt the same scheme for distribution XYZ, too.

Especially in the non-Fedora world it appears to be socially unacceptable to actually have a look at the Fedora feature page (where many of the questions are already brought up and answered), which is very unfortunate. To improve the situation I spent some time today summarizing the reasons for the /usr merge independently. I'd hence like to direct you to this new page I put up, which tries to summarize the reasons for this, with an emphasis on the compatibility point of view:

The Case for the /usr Merge

Note that even though this page is in the systemd wiki, what it covers is mostly orthogonal to systemd. systemd supports both systems with a merged /usr and with a split /usr, and the /usr merge should be interesting for non-systemd distributions as well.

Primarily I put this together to have a nice place to point all those folks who continue to write me annoyed emails, even though I am actually not even working on all of this...

Enjoy the read!

Footnotes:

[1] And not actually by me, I am just a supportive spectator and am not doing any work on it. Unfortunately some tech press folks created the false impression I was behind this. But credit where credit is due, this is all Harald's and Kay's work.

Resources in glib

Last week I landed a new feature in GLib that I’ve been wanting to do for a long time: a resource framework. Resources are things that are naturally part of an application or library, but not really normal code. For instance, our code increasingly uses XML to describe user interfaces and menus.

Traditionally these either had to be manually inserted into the code, like so:

static const gchar *ui_info =
"<ui>"
"  <menubar name='MenuBar'>"
"    <menu action='FileMenu'>"
"      <menuitem action='Quit'/>"
"    </menu>"
"  </menubar>"
"</ui>";

Or a file was stored in /usr/share/$application/ and you had to write code to manually find and load the file, and cache it if used often. This is not a lot of code, but it can be tricky as all I/O code needs to handle errors and the external file makes it harder to make the library/app relocatable.

Instead, with resources you store your data as plain files in your source tree and edit them with your favourite editor, with full syntax highlighting, automatic indentation, etc. Then you reference these files from a resource description file (another XML file) and use glib-compile-resources to compile these into a binary chunk which is linked into the application.

The resource framework then automatically registers all such resource bundles in a global namespace, where you can quickly look up resource data by a resource path. There are API calls to get the data as a direct pointer, as well as streaming data, but in most cases there are helper functions that let you just specify the pathname. So, for instance, you can just do:

 gtk_builder_add_from_resource (builder, "/org/gnome/appname/menu.ui", &error);

Which would handle all the work for you. And while this looks like an I/O operation, it’s really just a hashtable lookup on linked-in data, reading from the (shared, read-only) data section in your executable, so it’s very fast and safe.

Additionally there are some tricks the resource compiler can do for you. For instance, you can specify that a resource should be compressed, which means that the data is stored compressed, and the APIs uncompress for you automatically. You can also specify that xml files should be pre-processed to strip away whitespace, which avoids wasting time and memory on something that is not useful at runtime.
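As a rough mental model of the lookup and the transparent compression, here is a toy sketch in Python. It is not GLib's API or its on-disk format; the ResourceBundle class, its method names and the resource paths are invented for illustration only.

```python
import zlib


class ResourceBundle:
    """Toy model: resource data keyed by path, optionally stored compressed."""

    def __init__(self):
        self._entries = {}  # path -> (is_compressed, stored_bytes)

    def add(self, path, data, compress=False):
        # Compression happens once, when the bundle is built.
        stored = zlib.compress(data) if compress else data
        self._entries[path] = (compress, stored)

    def lookup_data(self, path):
        # A plain dictionary lookup; decompression is hidden from the caller.
        is_compressed, stored = self._entries[path]
        return zlib.decompress(stored) if is_compressed else stored


bundle = ResourceBundle()
ui_xml = b"<ui><menubar name='MenuBar'/></ui>"
bundle.add("/org/example/app/menu.ui", ui_xml)
bundle.add("/org/example/app/style.css", b"a { color: #000; }\n" * 50, compress=True)

menu_data = bundle.lookup_data("/org/example/app/menu.ui")
css_data = bundle.lookup_data("/org/example/app/style.css")
```

The caller asks for a path and gets bytes back either way; whether the data was stored compressed is a build-time decision that the lookup hides.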

There is also support for resource:// URIs, which means you can easily reference resource data like icons from e.g. CSS and UI files.

On Linux we use some GCC-specific extensions to store the resources in separate ELF sections, which means it’s very easy to extract resource data from the binaries. GLib even ships with a tool that lets you do this:

$ gresource list libgtk-3.so
/org/gtk/libgtk/cursor/dnd-ask.png
/org/gtk/libgtk/cursor/dnd-copy.png
/org/gtk/libgtk/cursor/dnd-link.png
/org/gtk/libgtk/cursor/dnd-move.png
/org/gtk/libgtk/cursor/dnd-none.png
/org/gtk/libgtk/gtk-default.css
/org/gtk/libgtk/gtk-win32.css
$ gresource extract libgtk-3.so /org/gtk/libgtk/gtk-default.css
@define-color fg_color #000;
@define-color bg_color #dcdad5;
@define-color text_color #000;
...

If you’re interested in using resources in your application, check out the documentation, or look at this example commit that converts Nautilus to use resources.

GStreamer 0.11 Application Porting Hackfest

I’m in the quiet town of Malaga these three days to attend the GStreamer hackfest. The goal is to port applications over to the 0.11 API, which will eventually be 1.0. There are about 18 people here, which is a good number for a hackfest.

The goal for me is to figure out everything that needs to be done to have Flumotion working with GStreamer 0.11. It looks like there is more work than expected, since some of the things we rely on haven’t been ported successfully.

Luckily back in the day we spent quite a bit of time to layer parts as best as possible so they don’t depend too much on each other. Essentially, Flumotion adds a layer on top of GStreamer where GStreamer pipelines can be run in different processes and on different machines, and be connected to each other over the network. To that end, the essential communication between elements is abstracted and wrapped inside a data protocol, so that raw bytes can be transferred from one process to another, and the other end ends up receiving those same GStreamer buffers and events.

First up, there is the GStreamer Data protocol. Its job is to serialize buffers and events into a byte stream.
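To illustrate what serializing buffers and events into a byte stream involves, here is a small Python sketch of length-prefixed framing. The header layout (kind, timestamp, payload length) is an assumption chosen for the example and is not the actual GDP wire format.

```python
import io
import struct

# Hypothetical header: 1-byte kind, 8-byte timestamp, 4-byte payload length,
# all in network byte order. The real GDP header is different.
HEADER = struct.Struct("!BQI")
KIND_BUFFER, KIND_EVENT = 1, 2


def pack_packet(kind, timestamp, payload):
    """Serialize one buffer or event as header followed by payload."""
    return HEADER.pack(kind, timestamp, len(payload)) + payload


def read_packet(stream):
    """Read one packet from a byte stream; return None at a clean end of stream."""
    header = stream.read(HEADER.size)
    if not header:
        return None
    if len(header) < HEADER.size:
        raise ValueError("truncated header")
    kind, timestamp, length = HEADER.unpack(header)
    payload = stream.read(length)
    if len(payload) < length:
        raise ValueError("truncated payload")
    return kind, timestamp, payload


wire = pack_packet(KIND_EVENT, 0, b"caps: audio/x-raw") + pack_packet(KIND_BUFFER, 40000000, b"\x00\x01\x02")
stream = io.BytesIO(wire)
first = read_packet(stream)
second = read_packet(stream)
end = read_packet(stream)
```

Because every packet carries its own length, the receiving side can cut the raw byte stream back into the same sequence of buffers and events that the sender had.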

Second, there is the concept of streamheaders (which is related to the DELTA_UNIT flag in GStreamer). These are buffers that always need to be sent at the beginning of a new stream to be able to interpret the buffers coming after it. In 0.10, that meant that at least a GDP version of the caps needed to be in the streamheader (because the other side cannot interpret a running stream without its caps), and in more recent versions a new-segment event. These streamheaders are analogous to the new sticky event concept in 0.11 – some events, like CAPS and TAG and SEGMENT are now sticky to the pad, which means that a new element connected to that pad will always see those events to make sense of the new data it’s getting.

Third, the actual network communication is done using the multifdsink element (and an fdsrc element on the other side). This element just receives incoming buffers, keeps them on a global buffer list, and sends all of them to the various clients added to it by file descriptor. It understands about streamheaders, and makes sure clients get the right ones for wherever they end up in the buffer list. It manages the buffers, the speed of clients, the bursting behaviour, … It doesn’t require GDP at all to work – Flumotion uses this element to stream Ogg, mp3, asf, flv, webm, … to the outside world. But to send GStreamer buffers, it’s as simple as adding a gdppay before multifdsink, and a gdpdepay after fdsrc. Also, at the same level, there are tcpserversink/tcpclientsrc and tcpclientsink/tcpserversrc elements that do the same thing over a simple TCP connection.
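The core idea, fanning data out to file descriptors and replaying streamheaders to late joiners, can be sketched in a few lines of Python. This is a deliberately simplified model with an invented FanOutSink class; it leaves out the buffer list, client speed management and bursting that the real element handles.

```python
import os


class FanOutSink:
    """Toy fan-out: writes every pushed buffer to all client file descriptors."""

    def __init__(self):
        self.streamheaders = []  # buffers every new client must receive first
        self.client_fds = []

    def set_streamheaders(self, headers):
        self.streamheaders = list(headers)

    def add_client(self, fd):
        # A late joiner gets the streamheaders before any regular buffer.
        for header in self.streamheaders:
            os.write(fd, header)
        self.client_fds.append(fd)

    def push(self, data):
        # Small writes only; a real sink must handle partial writes and slow clients.
        for fd in self.client_fds:
            os.write(fd, data)


read_early, write_early = os.pipe()
read_late, write_late = os.pipe()

sink = FanOutSink()
sink.set_streamheaders([b"HDR"])
sink.add_client(write_early)
sink.push(b"one")
sink.add_client(write_late)  # joins after the first buffer was sent
sink.push(b"two")

early_data = os.read(read_early, 64)
late_data = os.read(read_late, 64)
for fd in (read_early, write_early, read_late, write_late):
    os.close(fd)
```

The late client misses the first buffer but still starts with the header, which is what lets it interpret the stream from wherever it joins. Since the sink only sees file descriptors, whatever set up the connection (pipes here) is irrelevant to it.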

Fourth, there is an interface between multifdsink/fdsrc and Python. We let Twisted set up the connections, and then steal the file descriptor and hand those off to multifdsink and fdsrc. This makes it very easy to set up all sorts of connections (like, say, in SSL, or just pipes) and do things to them before streaming (like, for example, authentication). But by passing the actual file descriptor, we don’t lose any performance – the low-level streaming is still done completely in C. This is a general design principle of Flumotion: use Python and Twisted for setup, teardown, and changes to the system, and where we need a lot of functionality and can sacrifice performance; but use C and GStreamer for the lower-level processor-intensive stuff, the things that happen in steady state, processing the signal.

So, there is work to do in GStreamer 0.11:

  • The GStreamer data protocol has not really been ported. gdppay/depay are still there, but don’t entirely work.
  • streamheaders in those elements will need adapting to handle sticky events.
  • multifdsink was moved to -bad and left with broken unit tests. There is now multisocketsink. But sadly it looks like GSocket isn’t meant to handle pure file descriptors (which we use in our component that records streams to disk, for example).
  • 0.11 doesn’t have the traditional Python bindings. It uses gobject-introspection instead. That will need a lot of work on the Flumotion side, and ideally we would want to keep the codebase working against both 0.10 and 0.11 as we did for the 0.8->0.10 move. Apparently these days you cannot mix gi-style binding with old-style binding anymore, because they create separate class trees. I assume this also means we need to port the glib2/gtk2 reactors in Twisted to using gobject-introspection.

So it looks like there is a lot of work to be done. Luckily Andoni arrived today too, so we can share some work.

After discussing with Wim, Tim, and Sebastien, my plan is:

  1. create a common base class for multihandlesink, and refactor multisocketsink and multifdsink as subclasses of it
  2. create g_value_transform functions to bytestreams for basic objects like Buffers and Events
  3. use these transform functions as the basis for a new version of GDP, which we’ll make typefindable this time around
  4. support sticky events
  5. ignore metadata for now, as it is not mandatory; although in the future we could let gdppay decide which metadata it wants to serialize, so the application can request to do so
  6. try multisocketsink as a transport for inside Flumotion and/or for the streaming components.
  7. In the latter case, do some stress testing – on our platform, we have pipelines with multifdsink running for months on end without crashing or leaking, sometimes going up to 10000 connections open.
  8. Make twisted reactors
  9. prototype flumotion-launch with 0.11 code by using gir

That’s probably not going to be finished over this week, but it’s a good start. Last night I started by fixing the unit tests for multifdsink, and now I started refactoring multisocketsink and multifdsink with that. I’ll first try and make unit tests for multisocketsink though, to verify that I’m refactoring properly.

January 25, 2012

Emacs and Common Lisp, Part 2

This is a followup to my earlier post on converting the Emacs C code into Common Lisp.  This one is a bit more technical, diving into some specifics of the conversion process.

Basics

One important fact is that we do not need to convert an arbitrary C program to Common Lisp.  This might or might not be efficiently possible, but we do not care.  We only need to convert Emacs.  This is simpler for two reasons.  First, we can just ignore any C construct that Emacs does not use.  If the translator barfs after some new update, we can fix it then.  Second, Emacs itself is already written in a relatively Lispy style, being a Lisp implementation itself.  We further exploit this by allowing the translator to know some details about Emacs.  As a trivial example, all the Smumble globals created by the DEFUN macro need not be translated into Common Lisp as structure constants; they are an artifact of the implementation, and will show up directly in the generated defuns instead.

What to ignore

A good portion of Emacs is simply redundant in the CL world.  There are a few types (cons, vector, integers, functions) that are shareable — in fact, sharing these is part of the goal of this effort.  There are also a number of functions which are effectively identical.  There are also entire redundant modules, like the garbage collector, or the bytecode interpreter.

The question is how to have the translator differentiate between what is useful and what is not, without breaking builds of future versions of Emacs.

I don’t currently think there is a high road to solving this problem.  For modules like the GC, I plan to have ad hoc translator rules for the particular source files.  For functions and data types, I’m adding new GCC attributes that I can use to mark the ignorable definitions.

Types

There are two type-related issues that arise when translating the source.

First, how should Emacs-specific types be represented?  Primarily these types are structures, like struct buffer or struct string (we cannot use the CL string type, because Emacs adds properties directly to the string, and Emacs has its own idiosyncratic character handling).  My answer here is to just straightforwardly translate them to defstruct.

The other question is when translating a C function, what do we do with the types of local variables?  For the most part I am pretending that they don’t exist.  This works fine except for local arrays and structures, but these are easily handled by initializing variables properly. My rationale is that while this is slower, it lets me get something working more quickly, and we can always update the translator to emit CL type declarations later on.

This simple approach doesn’t actually cover all the needed cases.  For example, there is code in Emacs that takes the address of a local variable and passes it somewhere.  This is easy to deal with; much of the remaining work is just digging through the code looking for special cases to clean up.

I’m similarly omitting type declarations from the generated structures.  One possible nice side effect of this approach is that it will make it easier to lift Emacs’ file-size restrictions, because there will no longer be any code assuming that the size is a fixnum.

Macros

Many low-level details of the Emacs implementation are hidden in macros.  For example, Emacs stuffs some type information into the low-order bits of pointers.  It uses macros to add or remove this information.  For this build, I redefine these macros to do nothing.  This makes the GCC Gimple representation much closer to the abstract meaning of the program, and thus simpler to translate.
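For readers who have not seen pointer tagging before, here is a small Python sketch of the idea. The number of tag bits and the tag values are made up for the example; they are not Emacs' actual layout, and the function names merely echo the C macros.

```python
TAG_BITS = 3
TAG_MASK = (1 << TAG_BITS) - 1

# Hypothetical tag values, chosen only for this illustration.
TAG_INT = 0b010
TAG_CONS = 0b011


def make_number(value):
    """Box an integer: shift it up and put the type tag in the low-order bits."""
    return (value << TAG_BITS) | TAG_INT


def xtype(word):
    """Extract the type tag from a tagged word."""
    return word & TAG_MASK


def xint(word):
    """Unbox an integer by dropping the tag bits (arithmetic shift keeps the sign)."""
    return word >> TAG_BITS


boxed = make_number(42)
boxed_negative = make_number(-5)
roundtrip = (xint(boxed), xint(boxed_negative))
is_integer = xtype(boxed) == TAG_INT
```

Once boxing and unboxing are redefined to do nothing, an expression such as PT + XINT (n) reduces to plain addition, which is why the compiler's intermediate form ends up so much closer to the program's abstract meaning.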

There are also some macros that are useful to redefine so that we can more easily hook into them from the translator.  For example, Emacs has a C macro INTEGERP that is used to check whether its argument is an integer.  Normally this macro uses bit twiddling to get its answer, but I redefine it like so:

#undef INTEGERP
extern Lisp_Object *INTEGERP (Lisp_Object)
    __attribute__((lisp_form("integerp")));

Example

The translator is not nearly complete, but it can already do a fair job at translating simple functions.  For example, here is “forward-point” from the Emacs C code:

DEFUN ("forward-point", Fforward_point, Sforward_point, 1, 1, 0,
       doc: /* Return buffer position N characters after (before if N negative) point.  */)
  (Lisp_Object n)
{
  CHECK_NUMBER (n);

  return make_number (PT + XINT (n));
}

Here is what the translator comes up with:

(defun Fforward_point (n)
  (let (
    temp-var-0
    Qintegerp.316
    temp-var-1
    current_buffer.317
    temp-var-2
    )
    (block nil (tagbody
      bb-0
        ; no gimple here
      bb-1
        ; no gimple here
      bb-2
        (setf temp-var-0 (integerp n))
        (if (== temp-var-0 nil)
          (go bb-3)
          (go bb-4))
      bb-3
        (setf Qintegerp.316 Qintegerp)
        (wrong_type_argument Qintegerp.316 n)
      bb-4
        (setf current_buffer.317 current_buffer)
        (setf temp-var-2 (buffer-pt current_buffer.317))
        (setf temp-var-1 (+ temp-var-2 n))
        (return temp-var-1)
  ))))

(defun elisp:forward-point (arg0)
  (Fforward_point arg0))

The output looks pretty weird, because the translator works after GCC’s CFG is built, and so the most straightforward translation is to use this mess with tagbody.  I doubt this matters much, but in any case the translator is readily hackable — it is still less than 400 lines of Python, including comments.
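To show why a tagbody is the most direct mapping from a control-flow graph, here is a hedged Python sketch that prints basic blocks as labels followed by forms. It is not the translator described above; the emit_tagbody function and its input shape are invented for the example.

```python
def emit_tagbody(blocks, indent="  "):
    """Render basic blocks as a Common Lisp tagbody.

    blocks is a list of (label, forms) pairs, where forms is a list of
    already-translated Lisp forms given as strings.
    """
    lines = ["(block nil (tagbody"]
    for label, forms in blocks:
        lines.append(indent + label)  # each basic block becomes a tagbody label
        for form in forms:
            lines.append(indent * 2 + form)
    lines.append("))")
    return "\n".join(lines)


blocks = [
    ("bb-2", ["(setf temp-var-0 (integerp n))",
              "(if (eq temp-var-0 nil) (go bb-3) (go bb-4))"]),
    ("bb-3", ["(wrong_type_argument Qintegerp n)"]),
    ("bb-4", ["(return (+ (buffer-pt current_buffer) n))"]),
]
lisp_text = emit_tagbody(blocks)
print(lisp_text)
```

Each edge in the graph turns into a go to the target block's label, so the emitter never has to rediscover loops or conditionals from the graph. That keeps the code short at the cost of unidiomatic output.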

One thing to note is the translation of “PT”.  This is actually a macro that refers to the current buffer:

#define PT (current_buffer->pt + 0)

The translator properly turns this into a reference to “buffer-pt”.

Another detail is the handling of packages.  My plan is to put the Emacs implementation into one package, and then any elisp into a second package called “elisp”.  A DEFUN in the C code will actually generate two functions: the internal one, and the elisp-visible one; hence the “elisp:” in the translation.

Next Steps

There’s still a good amount of work to be done.  The converter punts on various constructs; type translation is implemented but not actually wired up to anything; the translator should emit definitions for alien functions; and plenty more.

January 24, 2012

Nominated for OpenSource.com People’s Choice Award

Based on my series of MPL posts for opensource.com, I’ve been nominated for a “people’s choice award” as a top contributor to opensource.com. It’s a nice little honor. That said, there are lots of folks on the list of nominees who have written and thought far more than I have this year- so you should go check out the list and vote for one of them instead :)

January 20, 2012

Plumbers Wishlist, The Third Edition, a.k.a. "The Thank You Edition"

Last October we published a wishlist for plumbing related features we'd like to see added to the Linux kernel. Three months later it's time to publish a short update, and explain what has been implemented in the kernel, what people have started working on, and what's still missing.

The full, updated list is available on Google Docs.

In general, I must say that the list turned out to be a great success. It shows how awesome the Open Source community is: Just ask nicely and there's a good chance they'll fulfill your wishes! Thank you very much, Linux community!

We'd like to thank everybody who worked on any of the features on that list: Lucas De Marchi, Andi Kleen, Dan Ballard, Li Zefan, Kirill A. Shutemov, Davidlohr Bueso, Cong Wang, Lennart Poettering, Kay Sievers.

Of the items on the list 5 have been fully implemented and are already part of a released kernel, or already merged for inclusion for the next kernels being released.

For 4 further items patches have been posted, and I am hoping they'll get merged eventually. Davidlohr, Wang, Zefan, Kirill, it would be great if you'd continue working on your patches, as we think they are following the right approach[1] even if there was some opposition to them on LKML. So, please keep pushing to solve the outstanding issues and thanks for your work so far!

Footnotes

[1] Yes, I still believe that tmpfs quota should be implemented via resource limits, as everything else wouldn't work, as we don't want to implement complex and fragile userspace infrastructure to racily upload complex quota data for all current and future UIDs ever used on the system into each tmpfs mount point at mount time.

systemd for Administrators, Part XII

Here's the twelfth installment of my ongoing series on systemd for Administrators:

Securing Your Services

One of the core features of Unix systems is the idea of privilege separation between the different components of the OS. Many system services run under their own user IDs thus limiting what they can do, and hence the impact they may have on the OS in case they get exploited.

This kind of privilege separation only provides very basic protection, however, since in general system services run this way can still do at least as much as a normal local user, though not as much as root. For security purposes it is however very interesting to limit even further what services can do, and shut them off from a couple of things that normal users are allowed to do.

A great way to limit the impact of services is by employing MAC technologies such as SELinux. If you are interested in locking down your server, running SELinux is a very good idea. systemd enables developers and administrators to apply additional restrictions to local services independently of a MAC. Thus, regardless of whether you are able to make use of SELinux, you may still enforce certain security limits on your services.

In this iteration of the series we want to focus on a couple of these security features of systemd and how to make use of them in your services. These features take advantage of a couple of Linux-specific technologies that have been available in the kernel for a long time, but never have been exposed in a widely usable fashion. These systemd features have been designed to be as easy to use as possible, in order to make them attractive to administrators and upstream developers:

  • Isolating services from the network
  • Service-private /tmp
  • Making directories appear read-only or inaccessible to services
  • Taking away capabilities from services
  • Disallowing forking, limiting file creation for services
  • Controlling device node access of services

All options described here are documented in systemd's man pages, notably systemd.exec(5). Please consult these man pages for further details.

All these options are available on all systemd systems, regardless of whether SELinux or any other MAC is enabled.

All these options are relatively cheap, so if in doubt use them. Even if you might think that your service doesn't write to /tmp and hence enabling PrivateTmp=yes (as described below) might not be necessary, due to today's complex software it's still beneficial to enable this feature, simply because libraries you link to (and plug-ins to those libraries) which you do not control might need temporary files after all. Example: you never know what kind of NSS module your local installation has enabled, and what that NSS module does with /tmp.

These options are hopefully interesting both for administrators to secure their local systems, and for upstream developers to ship their services secure by default. We strongly encourage upstream developers to consider using these options by default in their upstream service units. They are very easy to make use of and have major benefits for security.

Isolating Services from the Network

A very simple but powerful configuration option you may use in systemd service definitions is PrivateNetwork=:

...
[Service]
ExecStart=...
PrivateNetwork=yes
...

With this simple switch a service and all the processes it consists of are entirely disconnected from any kind of networking. Network interfaces become unavailable to the processes; the only one they'll see is the loopback device "lo", and it is isolated from the real host loopback. This is a very powerful protection from network attacks.

Caveat: Some services require the network to be operational. Of course, nobody would consider using PrivateNetwork=yes on a network-facing service such as Apache. However, even for non-network-facing services network support might be necessary, and this is not always obvious. Example: if the local system is configured for an LDAP-based user database, doing glibc name lookups with calls such as getpwnam() might end up resulting in network access. That said, even in those cases it is more often than not OK to use PrivateNetwork=yes, since user IDs of system service users are required to be resolvable even without any network around. That means as long as the only user IDs your service needs to resolve are below the magic 1000 boundary, using PrivateNetwork=yes should be OK.

Internally, this feature makes use of network namespaces of the kernel. If enabled a new network namespace is opened and only the loopback device configured in it.
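One way to see the effect from inside a service is to list the network interfaces the process can see. The following is only a minimal sketch and not part of systemd: the function name visible_interfaces is made up for illustration, and it assumes Python 3.3 or later on Linux, where socket.if_nameindex() is available. With PrivateNetwork=yes in effect one would expect it to report only "lo".

```python
import socket


def visible_interfaces():
    """Return the names of the network interfaces visible to this process."""
    # socket.if_nameindex() returns a list of (index, name) tuples.
    return sorted(name for _index, name in socket.if_nameindex())


if __name__ == "__main__":
    names = visible_interfaces()
    print("visible interfaces:", ", ".join(names) or "(none)")
    if names == ["lo"]:
        print("only loopback is visible; the network looks isolated")
```

Running the same script once from a normal shell and once from a unit with PrivateNetwork=yes makes the difference easy to compare.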

Service-Private /tmp

Another very simple but powerful configuration switch is PrivateTmp=:

...
[Service]
ExecStart=...
PrivateTmp=yes
...

If enabled this option will ensure that the /tmp directory the service will see is private and isolated from the host system's /tmp. /tmp traditionally has been a shared space for all local services and users. Over the years it has been a major source of security problems for a multitude of services. Symlink attacks and DoS vulnerabilities due to guessable /tmp temporary files are common. By isolating the service's /tmp from the rest of the host, such vulnerabilities become moot.

For Fedora 17 a feature has been accepted in order to enable this option across a large number of services.

Caveat: Some services actually misuse /tmp as a location for IPC sockets and other communication primitives, even though this is almost always a vulnerability (simply because if you use it for communication you need guessable names, and guessable names make your code vulnerable to DoS and symlink attacks) and /run is the much safer replacement for this, simply because it is not a location writable to unprivileged processes. For example, X11 places its communication sockets below /tmp (which is actually secure -- though still not ideal -- in this exception since it does so in a safe subdirectory which is created at early boot.) Services which need to communicate via such communication primitives in /tmp are not candidates for PrivateTmp=. Thankfully these days only very few services misusing /tmp like this remain.

Internally, this feature makes use of file system namespaces of the kernel. If enabled a new file system namespace is opened inheriting most of the host hierarchy with the exception of /tmp.
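For code that has to keep using a shared /tmp, the guessable-name problem described above can at least be avoided when creating files. Here is a minimal sketch using Python's standard tempfile module; the helper name write_scratch_file and the "myservice-" prefix are made up for illustration and do not come from any particular service.

```python
import os
import tempfile


def write_scratch_file(data, directory=None):
    """Write data to a freshly created temporary file and return its path."""
    # mkstemp() picks an unpredictable name and creates the file
    # exclusively, readable and writable only by the owner, so nobody
    # can pre-create the path or plant a symlink at it.
    fd, path = tempfile.mkstemp(prefix="myservice-", dir=directory)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


if __name__ == "__main__":
    path = write_scratch_file(b"hello\n")
    print("wrote", path)
    os.unlink(path)
```

This does not replace PrivateTmp=yes, since libraries and plug-ins you do not control may still create files in /tmp in less careful ways.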

Making Directories Appear Read-Only or Inaccessible to Services

With the ReadOnlyDirectories= and InaccessibleDirectories= options it is possible to make the specified directories read-only or entirely inaccessible (for both reading and writing) to the service:

...
[Service]
ExecStart=...
InaccessibleDirectories=/home
ReadOnlyDirectories=/var
...

With these two configuration lines the whole tree below /home becomes inaccessible to the service (i.e. the directory will appear empty and with 000 access mode), and the tree below /var becomes read-only.

Caveat: Note that ReadOnlyDirectories= currently is not recursively applied to submounts of the specified directories (i.e. mounts below /var in the example above stay writable). This is likely to get fixed soon.

Internally, this is also implemented based on file system namespaces.
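To check how a directory actually looks from inside a service, a small probe can help. This is a rough sketch under a few assumptions: the function name probe_directory is hypothetical, the classification is a simplification of what the kernel reports, and a process running as root bypasses ordinary permission checks, so the result is most meaningful for unprivileged service users.

```python
import os
import tempfile


def probe_directory(path):
    """Classify path as 'inaccessible', 'read-only' or 'writable'."""
    # A directory we can neither read nor enter counts as inaccessible
    # (this is also the answer for a path that does not exist).
    if not os.access(path, os.R_OK | os.X_OK):
        return "inaccessible"
    try:
        # Actually try to create a file; this also catches read-only mounts.
        with tempfile.TemporaryFile(dir=path):
            pass
    except OSError:
        return "read-only"
    return "writable"


if __name__ == "__main__":
    for directory in ("/home", "/var", "/tmp"):
        print(directory, "->", probe_directory(directory))
```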

Taking Away Capabilities From Services

Another very powerful security option in systemd is CapabilityBoundingSet= which allows you to limit in a relatively fine-grained fashion which kernel capabilities a started service retains:

...
[Service]
ExecStart=...
CapabilityBoundingSet=CAP_CHOWN CAP_KILL
...

In the example above only the CAP_CHOWN and CAP_KILL capabilities are retained by the service, and the service and any processes it might create have no chance to ever acquire any other capabilities again, not even via setuid binaries. The list of currently defined capabilities is available in capabilities(7). Unfortunately some of the defined capabilities are overly generic (such as CAP_SYS_ADMIN), however they are still a very useful tool, in particular for services that otherwise run with full root privileges.

To identify precisely which capabilities are necessary for a service to run cleanly is not always easy and requires a bit of testing. To simplify this process a bit, it is possible to blacklist certain capabilities that are definitely not needed instead of whitelisting all that might be needed. Example: the CAP_SYS_PTRACE is a particularly powerful and security relevant capability needed for the implementation of debuggers, since it allows introspecting and manipulating any local process on the system. A service like Apache obviously has no business in being a debugger for other processes, hence it is safe to remove the capability from it:

...
[Service]
ExecStart=...
CapabilityBoundingSet=~CAP_SYS_PTRACE
...

The ~ character prefixed to the value assignment here inverts the meaning of the option: instead of listing all capabilities the service will retain, you may list the ones it will not retain.

Caveat: Some services might get confused if certain capabilities are made unavailable to them. Thus when determining the right set of capabilities to keep around you need to do this carefully, and it might be a good idea to talk to the upstream maintainers since they should know best which operations a service might need to run successfully.

Caveat 2: Capabilities are not a magic wand. You probably want to combine them and use them in conjunction with other security options in order to make them truly useful.

To easily check which processes on your system retain which capabilities use the pscap tool from the libcap-ng-utils package.

Making use of systemd's CapabilityBoundingSet= option is often a simple, discoverable and cheap replacement for patching all system daemons individually to control the capability bounding set on their own.
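If pscap is not at hand, the bounding set of a process can also be read from the CapBnd line in /proc/PID/status and decoded by hand. The sketch below is illustrative only: the helper names are made up, it is Linux-specific, and the table lists just the handful of capabilities mentioned in this article (bit numbers as defined in the kernel's capability header), not the full list from capabilities(7).

```python
import os

# Bit numbers of a few capabilities; deliberately incomplete.
CAP_BITS = {
    0: "CAP_CHOWN",
    5: "CAP_KILL",
    19: "CAP_SYS_PTRACE",
    21: "CAP_SYS_ADMIN",
    24: "CAP_SYS_RESOURCE",
}


def decode_caps(mask):
    """Return the names of the known capabilities set in the bit mask."""
    return [name for bit, name in sorted(CAP_BITS.items()) if mask & (1 << bit)]


def bounding_set(pid="self"):
    """Read the capability bounding set from /proc, or None if unavailable."""
    path = "/proc/%s/status" % pid
    if not os.path.exists(path):
        return None
    with open(path) as status:
        for line in status:
            if line.startswith("CapBnd:"):
                return int(line.split()[1], 16)
    return None


if __name__ == "__main__":
    mask = bounding_set()
    if mask is None:
        print("no CapBnd information available on this system")
    else:
        print("CapBnd = %#x" % mask)
        print("known capabilities retained:", " ".join(decode_caps(mask)))
```

For a service started with CapabilityBoundingSet=CAP_CHOWN CAP_KILL one would expect only bits 0 and 5 to be set, i.e. a mask of 0x21.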

Disallowing Forking, Limiting File Creation for Services

Resource Limits may be used to apply certain security limits on services being run. Primarily, resource limits are useful for resource control (as the name suggests...) not so much access control. However, two of them can be useful to disable certain OS features: RLIMIT_NPROC and RLIMIT_FSIZE may be used to disable forking and disable writing of any files with a size > 0:

...
[Service]
ExecStart=...
LimitNPROC=1
LimitFSIZE=0
...

Note that this will work only if the service in question drops privileges and runs under a (non-root) user ID of its own or drops the CAP_SYS_RESOURCE capability, for example via CapabilityBoundingSet= as discussed above. Without that a process could simply increase the resource limit again thus voiding any effect.

Caveat: LimitFSIZE= is pretty brutal. If the service attempts to write a file with a size > 0, it will immediately be sent SIGXFSZ, which unless caught terminates the process. Also, creating files with size 0 is still allowed, even if this option is used.

For more information on these and other resource limits, see setrlimit(2).
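To confirm from inside a service that the limits really are in effect, they can be queried with getrlimit(). A minimal sketch using Python's standard resource module follows; the helper name show_limit is made up, and it assumes a Unix system where RLIMIT_NPROC and RLIMIT_FSIZE are defined.

```python
import resource


def show_limit(name):
    """Return (soft, hard) for a limit such as 'NPROC', as readable strings."""
    soft, hard = resource.getrlimit(getattr(resource, "RLIMIT_" + name))

    def fmt(value):
        return "infinity" if value == resource.RLIM_INFINITY else str(value)

    return fmt(soft), fmt(hard)


if __name__ == "__main__":
    for name in ("NPROC", "FSIZE"):
        soft, hard = show_limit(name)
        print("Limit%s: soft=%s hard=%s" % (name.upper(), soft, hard))
```

Under the unit fragment above one would expect it to report 1 for NPROC and 0 for FSIZE. Note that Python itself ignores SIGXFSZ, so a Python service hitting the file size limit sees a failed write rather than being killed.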

Controlling Device Node Access of Services

Device nodes are an important interface to the kernel and its drivers. Since drivers tend to get much less testing and security checking than the core kernel they often are a major entry point for security hacks. systemd allows you to control access to devices individually for each service:

...
[Service]
ExecStart=...
DeviceAllow=/dev/null rw
...

This will limit access to /dev/null and only this device node, disallowing access to any other device nodes.

The feature is implemented on top of the devices cgroup controller.

Other Options

Besides the easy to use options above there are a number of other security relevant options available. However they usually require a bit of preparation in the service itself and hence are probably primarily useful for upstream developers. These options are RootDirectory= (to set up chroot() environments for a service) as well as User= and Group= to drop privileges to the specified user and group. These options are particularly useful to greatly simplify writing daemons, where all the complexities of securely dropping privileges can be left to systemd, and kept out of the daemons themselves.

If you are wondering why these options are not enabled by default: some of them simply break semantics of traditional Unix, and to maintain compatibility we cannot enable them by default. For example, since traditional Unix enforced that /tmp was a shared namespace, and processes could use it for IPC, we cannot just go and turn that off globally, just because /tmp's role in IPC is now replaced by /run.

And that's it for now. If you are working on unit files for upstream or in your distribution, please consider using one or more of the options listed above. If your service is secure by default by taking advantage of these options, this will not only help your users but also make the Internet a safer place.

January 19, 2012

Moved to Blogspot

http://wtogami.blogspot.com/
All further blog entries are now on Blogspot.

January 16, 2012

FUDCon Thoughts - Eucalyptus

At FUDCon, Greg DeKoenigsberg said something that resonated with me. I'm paraphrasing, but it was this: after 10 years of being at Red Hat, it is easy to forget that most of the rest of the world, even in the IT industry, doesn't really understand Open Source. To most folks at Red Hat, myself included, it is intuitive, something we understand and accept.

In listening to Greg talk about Eucalyptus, I forced myself to set aside any biases I may have had about that company or their history, and really try to remember that for better or worse, they were a company founded by graduate students, who really didn't understand business or open source, and they initially operated on the assumption that in order to survive as a new company, they needed to make decisions to establish the business. A lot of those decisions weren't Open Source friendly, but it is far more likely that this was out of ignorance than malice.

When I look at Eucalyptus today, I see two very smart decisions, and those decisions have names: Andy Grimm and Greg DeKoenigsberg. I've worked with both for years, and I have a huge amount of respect for both. Even above that: I know that they truly get Open Source, and that neither of them sees it as a loss leader for a proprietary offering, or a "Childrens Edition". By bringing them onboard at Eucalyptus, at least in my eyes, it shows that the decision makers at Eucalyptus:
* Realize that they're not operating as a true Open Source Company today
* Want to change that sooner rather than later (understanding that change doesn't happen overnight)
* Are empowering people with the right expertise to enact that change

I respect that greatly. Greg told me about some interesting changes coming soon for Eucalyptus, and since I'm not sure what is public and what is not, I will keep those under my hat, but I am definitely going to be watching them closely over the next year or so. Actions speak louder than words, but so far, they're moving in a better direction. I hope that it leads them away from an Open Core model.

ADDENDUM: It has been pointed out to me that there are other awesome Eucalyptus folks who understand Open Source, like Garrett Holmstrom. It was in no way my intent to omit them, or to imply that all Eucalyptus employees who are not Greg or Andy do not understand Open Source; that is obviously false.

PulseAudio vs. AudioFlinger

Arun put an awesome article up, detailing how PulseAudio compares to Android's AudioFlinger in terms of power consumption and suchlike. Suffice to say, PulseAudio rocks, but go and read the whole thing, it's worth it.

Apparently, AudioFlinger is a great choice if you want to shorten your battery life.

January 09, 2012

Valgrind and GDB

Valgrind 3.7.0 now includes an embedded gdbserver, which is wired to the valgrind innards in the most useful way possible.  What this means is that you can now run valgrind in a special mode (simply pass --vgdb-error=0), then attach to it from gdb, just as if you were attaching to a remote target.  Valgrind will helpfully tell you exactly how to do this.  Then you can debug as usual, and also query valgrind’s internal state as you do so.  Valgrind will also cause the program to stop if it hits some valgrind event, like a use of an uninitialized value.

For example, consider this incorrect program, e.c:

#include 
int main ()
{
  int x;
  x = x > 0 ? x : x + 1;
  return x;
}

After compiling it (calling it /tmp/e), we can start valgrind:

$ valgrind --vgdb-error=0 /tmp/e
==20836== Memcheck, a memory error detector
==20836== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==20836== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==20836== Command: /tmp/e
==20836==
==20836== (action at startup) vgdb me ...
==20836==
==20836== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==20836==   /path/to/gdb /tmp/e
==20836== and then give GDB the following command
==20836==   target remote | vgdb --pid=20836
==20836== --pid is optional if only one valgrind process is running
==20836==

Now, in Emacs (or another console if you insist) we start gdb on /tmp/e and enter the command above. Valgrind has paused our program at the first instruction. Now we can “continue” to let it run:

Reading symbols from /tmp/e...done.
(gdb) target remote | vgdb --pid=20836
Remote debugging using | vgdb --pid=20836
relaying data between gdb and process 20836
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib64/ld-2.14.so.debug...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
[Switching to Thread 20836]
0x0000003a15c016b0 in _start () from /lib64/ld-linux-x86-64.so.2
(gdb) c
Continuing.

Now the inferior stops, because we hit a use of an uninitialized value:

Program received signal SIGTRAP, Trace/breakpoint trap.
0x000000000040047c in main () at e.c:5
5	  x = x > 0 ? x : x + 1;
(gdb)

If we look back at the valgrind window, we see:

==20836== Conditional jump or move depends on uninitialised value(s)
==20836==    at 0x40047C: main (e.c:5)

(It would be nice if this showed up in gdb; I’m not sure why it doesn’t.)

Valgrind also provides ways to examine what is happening, via gdb’s monitor command. This is helpfully documented online:

(gdb) monitor help
general valgrind monitor commands:
  help [debug]             : monitor command help. With debug: + debugging commands
[... lots more here ...]

A few improvements are possible; e.g., right now it is not possible to start a new program using valgrind from inside gdb. This would be a nice addition (I think something like “target valgrind“, but other maintainers have other ideas).

I think this is a major step forward for debugging. Thanks to Philippe Waroquiers and Julian Seward for making it happen.

Blogs

Unwittingly, I set myself up for a yearly exercise, but let's look for a silver lining, shall we?

Blog            Posts  Delta
Ani-nouto         149   -220
Meenuvia           93    -28
LiveJournal        27     -5
Tracy Today         1     -3
Mixi                0
Duke City Fix       0
Advogato            0

Looks like I'm taking my life back from the Internet.

January 08, 2012

GCC and Python

When writing an earlier post, I realized I haven’t yet written about the Python plugin for GCC.

This is awesome! It is by far the simplest way to write a GCC plugin. The primary reason is that the author, the amazing David Malcolm, has put a lot of effort into the polish: this plugin is the simplest one to build (“make” works for me, with the Fedora 15 system GCC) and also the one with the best documentation.

Why would you want to write a plugin? Pretty much every program — and especially every C program, as C has such bad metaprogramming support — has rules which cannot be expressed directly in the language. One usually resorts to various tricks, and in extremis patch review by all-knowing maintainers, to preserve these. The plugin offers another way out: write a Python script to automate the checking.

I’ve already written a couple of custom checkers for use on GDB (which is a very idiosyncratic C program, basically written in an oddball dialect of C++), which have found real bugs.  These checkers cover things that no generic static analysis tool would ever correctly check, e.g., for the proper use of GDB’s exception handling system.  The exception checker, which we use to verify that we’re correctly bridging between Python’s exception system and GDB’s, took less than a day to write.

January 07, 2012

13. Breakpoints

Phil Muldoon added support for breakpoints to the Python API in gdb this past year.  While work here is ongoing, you can already use it to do neat things which can’t be done from the gdb CLI.

The interface to breakpoints is straightforward.  There is a new Breakpoint class which you can instantiate.  Objects of this type have various attributes and methods, corresponding roughly to what is available from the CLI — with one nice exception.

The new bit is that you can subclass Breakpoint and provide a stop method.  This method is called when the breakpoint is hit and gets to determine whether the breakpoint should cause the inferior to stop.  This lets you implement special breakpoints that collect data, but that don’t interfere with other gdb operations.

If you are a regular gdb user, you might think that this is possible by something like:

break file.c:73
commands
  silent
  python collect_some_data()
  cont
end

Unfortunately, however, this won’t work — if you try to “next” over this breakpoint, your “next” will be interrupted, and the “cont” will cause your inferior to start running free again, instead of stopping at the next line as you asked it to.  Whoops!

Here’s some example code that adds a new “lprintf” command.  This is a “logging printf” — you give it a location and (gdb-style) printf arguments, and it arranges to invoke the printf at that location, without ever interrupting other debugging.

This code is a little funny in that the new breakpoint will still show up in “info break“.  Eventually (this is part of the ongoing changes) you’ll be able to make new breakpoints show up there however you like; but meanwhile, it is handy not to mark these as internal breakpoints, so that you can easily delete or disable them (or even make them conditional) using the normal commands.

import gdb

class _LPrintfBreakpoint(gdb.Breakpoint):
    def __init__(self, spec, command):
        super(_LPrintfBreakpoint, self).__init__(spec, gdb.BP_BREAKPOINT,
                                                 internal = False)
        self.command = command

    def stop(self):
        gdb.execute(self.command)
        return False

class _LPrintfCommand(gdb.Command):
    """Log some expressions at a location, using 'printf'.
    Usage: lprintf LINESPEC, FORMAT [, ARG]...
    Insert a breakpoint at the location given by LINESPEC.
    When the breakpoint is hit, do not stop, but instead pass
    the remaining arguments to 'printf' and continue.
    This can be used to easily add dynamic logging to a program
    without interfering with normal debugger operation."""

    def __init__(self):
        super(_LPrintfCommand, self).__init__('lprintf',
                                              gdb.COMMAND_DATA,
                                              # Not really the correct
                                              # completer, but ok-ish.
                                              gdb.COMPLETE_SYMBOL)

    def invoke(self, arg, from_tty):
        (remaining, locations) = gdb.decode_line(arg)
        if remaining is None:
            raise gdb.GdbError('printf format missing')
        remaining = remaining.strip(',')
        if locations is None:
            raise gdb.GdbError('no matching locations found')

        spec = arg[0:- len(remaining)]
        _LPrintfBreakpoint(spec, 'printf ' + remaining)

_LPrintfCommand()

January 06, 2012

12. Events

There have been many new Python scripting features added to gdb since my last post on the topic.  The one I want to focus on today is event generation.

I wrote a little about events in gdb-python post #9 — but a lot has changed since then.  A Google SoC student, Oguz Kayral, wrote better support for events in 2009.  Then, Sami Wagiaalla substantially rewrote it and put it into gdb.

In the new approach, gdb provides a number of event registries.  An event registry is just an object with connect and disconnect methods.  Your code can use connect to register a callback with a registry; the callback is just any callable object.  The event is passed to the callable as an argument.

Each registry emits specific events — “emitting” an event just means calling all the callables that were connected to the registry.  For example, the gdb.events.stop registry emits events when an inferior or thread has stopped for some reason.  The event describes the reason for the stop — e.g., a breakpoint was hit, or a signal was delivered.
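The registry idea is easy to model outside of gdb. The following toy class is only an illustration of the connect/disconnect/emit pattern described above; it is not gdb's implementation, and the emit method in particular is a made-up name, since in gdb it is the debugger itself that fires the events.

```python
class EventRegistry(object):
    """A toy event registry with connect and disconnect methods."""

    def __init__(self):
        self._callbacks = []

    def connect(self, callback):
        # Any callable object can be registered.
        self._callbacks.append(callback)

    def disconnect(self, callback):
        self._callbacks.remove(callback)

    def emit(self, event):
        # "Emitting" just means calling every connected callable with
        # the event. Iterate over a copy so that a callback may
        # disconnect itself while being called.
        for callback in list(self._callbacks):
            callback(event)


if __name__ == "__main__":
    stop = EventRegistry()
    seen = []

    def remember(event):
        seen.append(event)

    stop.connect(remember)
    stop.emit("breakpoint hit")
    stop.disconnect(remember)
    stop.emit("signal delivered")
    print(seen)  # prints ['breakpoint hit']
```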

Here’s a script showing this feature in action.  It arranges for a notification to pop up if your program stops unexpectedly — if your program exits normally, nothing is done.  Something like this could be handy for automating testing under gdb; you could augment it by having gdb automatically exit if gdb.events.exited fires.  You could also augment it by setting a conditional breakpoint to catch a rarely-seen condition; then just wait for the notification to appear.

To try this out, just “source” it into gdb.  Then, run your program in various ways.

import gdb, threading, Queue, gtk, glib, os, pynotify

(read_pipe, write_pipe) = os.pipe()

event_queue = Queue.Queue()

def send_to_gtk(func):
    event_queue.put(func)
    os.write(write_pipe, 'x')

def on_stop_event(event):
    n = pynotify.Notification('Your program stopped in gdb')
    n.show()

class GtkThread(threading.Thread):
    def handle_queue(self, source, condition):
        global event_queue
        os.read(source, 1)
        func = event_queue.get()
        func()

    def run(self):
        global read_pipe
        glib.io_add_watch(read_pipe, glib.IO_IN, self.handle_queue)
        gtk.main()

gdb.events.stop.connect(on_stop_event)

gtk.gdk.threads_init()

pynotify.init('gdb')

t = GtkThread()
t.setDaemon(True)
t.start()

SElinux in Fedora 16

Perhaps even a flatworm can learn to turn left in a T-maze when he's hit with an electric shock, but every time I install a new Fedora, I try to leave SElinux enabled. In Fedora 15 it lasted for almost a week. This time, things went sour in 10 minutes.

Thought 1: The troubleshooter is quite nice, they really are getting civilized these days. Poor Polkit is doing isatty(), let's see if we can help.

Thought 2: Well, this is a bit ugly, but hey, it's copy-pasteable! Hurray for supporting common UNIX tools...

Thought 3: I am dumber than a flatworm. How could I believe that this time they got it right?

Actually, I lied. I disabled the sorry thing the moment I saw this:

The reason I continue to suffer through this exercise twice a year is that the security issue is important, and the shifting focus of attacks toward non-daemon, non-system applications (read: browsers) is a matter of great concern for me. Every time Dan Walsh posts something nice to his LJ it looks like progress is being made, but then I install a new Fedora and feel despair. It is as if some fundamental architecture is broken or something.

January 05, 2012

Emacs and Common Lisp

Recently I’ve been thinking about how to rebase Emacs on Common Lisp.

First, why rebase?  Performance is the biggest reason.  Emacs Lisp is a very basic lisp implementation.  It has a primitive garbage collector and basic execution model, and due to how it is written, it is quite hard to improve this in place.

Second, why Common Lisp?  Two reasons: first, Emacs Lisp resembles Common Lisp in many ways; elisp is like CL’s baby brother.  Second, all of the hard problems in Lisp execution have already been solved excellently by existing, free-software CL implementations.  In particular, the good CL implementations have much better garbage collectors, native compilation, threads, and FFI; we could expose the latter two to elisp in a straightforward way.

By “rebase” I mean something quite ambitious — rewrite the C source of Emacs into Common Lisp.  I think this can largely be automated via a GCC plugin (e.g., written using David Malcolm’s Python plugin).  Full automation would let the CL port be just another way to build Emacs, following the upstream development directly until all the Emacs maintainers can be convinced to drop C entirely (cough, cough).

Part of the rewrite would be dropping code that can be shared with CL.  For example, we don’t need to translate the Emacs implementation of “cons“, we can just use the CL one.

Some CL glue would be needed to make this all work properly.  These days it can’t be quite as small as elisp.lisp, but it still would not need to be very big.  The trickiest problem is dealing with buffer-local variables; but I think that can be done by judicious use of define-symbol-macro in the elisp reader.

Emacs might be the only program in the world that would see a performance improvement from rewriting in CL :-) .  The reason for this is simple: Emacs’ performance is largely related to how well it executes lisp code, and how well the GC works.

January 04, 2012

sys/sdt.h

Here’s a homework problem for you: design a static probe point API that:

  • Consists of a single header file,
  • Works for C, C++, and assembly,
  • Allows probes to have arguments,
  • Does not require any overhead for computing the arguments if they are already live,
  • Does not require debuginfo for debug tools to extract argument values,
  • Has overhead no greater than a single nop when no debugger is attached, and
  • Needs no dynamic relocations.

I wouldn’t have accepted this task, but Roland McGrath, in a virtuoso display of ELF and GCC asm wizardry, wrote <sys/sdt.h> for SystemTap.  Version 3 has all the properties listed above.  I’m pretty amazed by it.

This past year, Sergio Durigan Junior and I added support for this to gdb.  It is already in Fedora, of course, and it will be showing up in upstream gdb soon.

The way I think about these probes is that they let you name a place in your code in a way that is relatively independent of source changes.  gdb can already address functions nicely ("break function") or lines ("break file.c:73") — but sometimes I’d like a stable breakpoint location that is not on a function boundary; but using line numbers in a .gdbinit or other script is hard, because line numbers change when I edit.

We’ve also added probes to a few libraries in the distro, for gdb to use internally.  For example, we added probes to the unwind functions in libgcc, so that gdb can properly “next” over code that throws exceptions.  And, we did something similar for longjmp in glibc.  You can dump the probes from a library with readelf -n, or with “info probes” in gdb.

The probes were designed to be source-compatible with DTrace static probes.  So, if you are already using those, you can just install the appropriate header from SystemTap.  Otherwise, adding the probes is quite easy… see the instructions, but be warned that they focus a bit too much on DTrace compatibility; you probably don’t want the .d file and the semaphore, which just slow things down.  Instead I recommend just including the header and using the macros directly.

January 03, 2012

Personal MPL acknowledgments

This morning I hit publish on the announcement of MPL 2.0, finishing a two year process. The official announcement had a number of acknowledgements for the many people who helped out along the way, but I wanted to take to my personal blog to add a few personal notes.

“thank you note for every language,” by woodleywonderworks, used under CC-BY 2.0.
First, Gerv Markham. Gerv has in many ways been central to Mozilla’s open source mission for a while, and it would have been easy for him to feel threatened when I parachuted in and began working on the license. Instead, he’s been helpful, patient, and constructive- everything you’d want from a co-worker and team member.

Second, Brett Smith, of the FSF: Brett has brought a very professional, constructive approach to working with me on the license. Without his dedication and patience the new GPL compatibility approach would not have succeeded. Aaron Williamson and James Vasile at SFLC and Richard Fontana at Red Hat were also instrumental in this, and again gave freely of their time when they didn’t have to. They also kept a straight face when I suggested the new approach, which probably helped a great deal in getting it done. :)

Third, Karen Copenhaver and Heather Meeker were incredible pros in helping push out the critical betas- they helped me go through the important issues and get the language right, in a way only people with decades of experience can do. That they were willing to give their time to Mozilla and to me was incredibly generous- most partners at major law firms aren’t willing to take those steps. And I’m not just saying that because Heather is now my boss. ;)

Finally, and most importantly, Mitchell Baker and Harvey Anderson: Mitchell and Harvey took a gamble when they brought me on board this project- one they didn’t need to do. This has been their baby for the past ten years, and they could have done this work themselves, or just let the license continue to age gracefully (as it was doing). Not only did they give me this terrific opportunity, they then opened up their calendars and their minds to me. As a result, I’ve had a terrific educational experience, learning the nooks and crannies of the license as well as lots about how to draft a document that stands the test of time. (Rumor has it that Mitchell wrote the original in a month, which I still find mind-boggling, and I can confirm that the text is still burned into her brain in high resolution.) It has really been an honor and a privilege for me to be involved with them and in this process, so I’m deeply thankful for their encouragement and invitation to participate.

I’ll probably write more here soon about the license and the process, but thanking people was really at the top of my priority list.

New year, new bugs.

This is how the year begins for the Fedora kernel.

Open bugs
F15   F16   Rawhide   Total
394   339   143       876

To be continued..

New year, new bugs. is a post from: codemonkey.org.uk


January 02, 2012

How do you manage mailing lists?

Every new year is a time of cleaning. After getting back to Inbox 0, my next target is my mailing list subscriptions.

It must be something psychological, but I cannot bring myself to unsubscribe from some of these mailing lists. I don’t check on them daily, but once in a while it’s darn useful to search through my local copy of mails on, say, selinux, and find solutions for a problem I’m having.

However, all this mailing list mail gives me a lot of headaches. My email client is slow, and I want it to be fast for the real mail I’m getting (from actual people, needing actual work). It’s hard to track the mails that matter – all my list mail gets put into folders automatically with some procmail magic, but that also means that some of the things I should be paying more attention to are just another bold folder in Evolution somewhere down the mail tree. And lastly, the server where I host my mail, which I share with friends, gets too much traffic, and syncing three different Evolution instances with it over IMAP is a big part of the burden.

I vastly preferred the newsreader model of old, and I think the de facto standard of mailing lists really is a mistake. But I’m not sure what to replace it with.

What I want:

  1. have selected mailing list archives be available on my machines, locally
  2. have them synced/updated automatically
  3. have them out of the way of my normal mail usage except when I need them

I’ve been considering getting a separate email account just for email lists for this purpose, although I don’t look forward much to having to change all my subscriptions, and would first like to hear from other people how this approach works out for them.

There used to be a push towards web-based mailing list subscriptions, but I don’t know if anyone is really seriously using that, and I would like to have the option of reading these mailing list archives offline.

How do you separate your ‘real’ mail from your mailing list mail? How do you handle them?

December 31, 2011

Making of the SONIK Gravitation music video

The inspiration for the Sonik video for Gravitation came from a local friend of ours, a talented and world-renowned photographer, Adrian Brannan. Ade is famous for his analogue photo collages (please give him a 'like' on his Facebook page):

We often wondered how the same effect would look if rendered with video. With video you've got the extra element of time, each segment of the mosaic can be running from a different starting point, with a different speed, and even a different direction. In addition the segments themselves can move over time. Would this end up with an effect that was just too much of a mess? Or would it give an effect that helps visualise the consequence of spacetime?

We started by taking several videos at three different locations over the period of a year with a Kodak Zi8 camera: a motorway bridge over the M74, just outside the Buchanan shopping center in Glasgow, and a bench in Strathclyde park. Lining up the images was done roughly by using lines drawn on acetate stuck over the camera screen.

The software to do the mosaic effect was hand-written. We used a simple scripting language, Perl, and the image library GD. On a relatively modern Linux PC running Fedora 16 we can render near real-time 720p HD even when handling 300 segments of mosaic. A simple language controls which parts of the screen come from which video, and the first half of the music video uses this with simple effects having just a few boxes overlaid:

Later in the video things get more complicated, using randomisation to pick the location and movement of each segment:

We used our scripts to create a number of ~13 second segments, then put them all together using kdenlive. The intro and outro were taken from a different video from a hotel room in London Victoria; the intro using a 'miniature' effect, and outro using the randomised segments applied to a single video.

The Perl script and a 5-frame example are available to download: 2011-sonik-vid-example.tar.bz2 (1.4M)

Watch the full video, or click through to YouTube to see it in HD:

December 30, 2011

using xargs on a list of paths with spaces in a file

Every few weeks I have to spend an hour figuring out exactly the same non-googleable thing I’ve already needed to figure out. So this time it’s going on my blog.

The problem is simple: given an input file listing paths, one per line, which probably contain spaces – how do I run a shell command that converts each line to a single shell argument ?

Today, my particular case was a file /tmp/dirs on my NAS which lists all directories in one of my dirvish vaults that contain files bigger than a GB. For some reason not everything is properly hardlinked, but running hardlink on the vault blows up because there are so many files in there.

Let’s see if wordpress manages to not mangle the following shell line.

perl -p -e 's@\n@\000@g' /tmp/dirs | xargs -0 /root/hardlink.py -f -p -t -c --dry-run
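For comparison, here is a small sketch of two other ways to get one argument per line. It assumes the paths never contain newlines, and the second variant relies on the -d option, which is a GNU xargs extension. The sample paths are made up, and printf stands in for the real command so you can see where each argument starts and ends.

```shell
# Build a scratch list of paths, one per line, some containing spaces.
list=$(mktemp)
printf '%s\n' '/vault/my photos' '/vault/big files/2011' '/vault/plain' > "$list"

# Variant 1: turn newlines into NULs with tr, then let xargs -0 split on NUL.
# printf prints each argument it receives in brackets, one per line.
out_tr=$(tr '\n' '\0' < "$list" | xargs -0 printf '[%s]\n')

# Variant 2 (GNU xargs only): split on newlines directly with -d.
out_d=$(xargs -d '\n' printf '[%s]\n' < "$list")

echo "$out_tr"
```

Both variants should hand each line to the command as a single argument, spaces included.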

December 29, 2011

A note on 2011

The best thing I did for myself in 2011 was to get back on a bicycle after not being on one for 15+ years, and after never actually being comfortable on one. I’m not going to be racing any time soon, but I now really look forward to a bike ride as part of the average weekend and even the average vacation.

image

Yesterday was a nice punctuation mark to that, featuring a long ride down to the ocean, a great view, and a very satisfying fish and chips.

I am definitely enjoying some time out of the office and looking forward to a great 2012- hope my friends are too. Now I just need to figure out what life improvement can trump “get back on a bike.” Suggestions welcome :)

N900 life support

I don’t want to simply be complaining about Nokia’s sad Linux story ending.

It’s obvious that things aren’t going to get better though for Maemo device owners.

Here are two things that caused me trouble over the last few months, and the fix that did it for me, in case you were suffering from the same problems.

      The Facebook photo sharing functionality just stopped working for me. It would upload the file, then give me an error without specifying any reason. I had simply assumed the Facebook API had changed, and since afaik this plugin is sadly closed-source (what on earth possessed Nokia to make a social media sharing component closed to begin with is beyond me – what kind of IP secrets could you possibly have in there?) I thought I would have to do without for now. And it really is a hassle to manually copy photos off then share them from the desktop.

      Turns out that I simply had to re-authenticate the Sharing Account in Settings. No idea why – maybe Facebook changed some authentication system in the last few months ? Now it works again.

      (As a side note, this plugin’s behaviour is really annoying when it comes to uploading photos. It will always try to upload as soon as you connect to a network, although usually all you get is access to some web page on which you have to authenticate, usually by paying, to get on to the net. The sharing plugin already blasts photos at facebook, then fails, gives you a non-useful error message, and then sits there forever without any option to retry. All you can do is cancel the transfer, in which case you will have to re-upload the photos from your library. After some time I figured out that a reboot caused it to retry all pending uploads on the next network connect, and then after that I figured out a kill of a sharing manager process did the same thing. But really, Nokia engineers – a simple ‘retry’ button was too hard ?)

      For the last two weeks my GPS stopped getting a lock completely. This wreaked havoc on my barriosquare/foursquare checkins as well, which simply don’t register without a GPS lock (yes I still have a half-done port of bsq to the new foursquare API, but last time I tried I was still stuck on the simply terrible browser coming with the device that seems to be unable to properly complete SSL requests in emulator mode).

      I stumbled across this page and simply changed the AGPS server to google’s. Worked like a charm on the next connect. Nokia, I don’t know what you did to your AGPS server – surely other phones you have are using it too, not just the Maemo ones ?

      I swore I wasn’t going to buy an N9 because there’s no point in buying an EOL’d phone if I plan to develop for it. The reviews when it actually came out almost persuaded me to get it, and the five minutes I got to play with Luis de Bethencourt’s phone got me really close. I’ve even seen plans in Belgium offering this phone! But really, there isn’t much point if Nokia isn’t going to support this phone any more and services are just going to get worse, and important parts of the stack remain closed and thus unfixable down the road.

      This month’s Android course at work at least got me familiar with developing for the phone and I was actually impressed by Eclipse this time around, and while Java still seems like a bitch to program in, the whole emulator setup is easy to use… Who knows, my next phone may in fact be an Android.

      In the meantime, it’s nice to see that some of my fixes get noticed. That motivates me to possibly fix that other annoyance in erminig-ng – all-day events triggering an alarm at midnight and waking me up :)

December 22, 2011

update on hotplug server

No new videos yet, need to fix some more rendering bugs so it looks nicer :)

So I've been working towards 3 setups:

a) intel rendering + nouveau offload
b) nouveau rendering + DVI output + intel LVDS output
c) hotplug USB with either intel or nvidia rendering.

Categorisation of device roles:
I've identified 4 device roles so far:
preferred master: the device is happy to be master
reluctant master: the device can be a master but would rather not be
offload slave: device can be used as an additional DRI2 renderer for a master
output slave: device can be used as an additional output for a master

For the 3 setups above:
a) intel would be preferred master, nvidia would be offload slave
b) nvidia would be preferred master, intel would be output slave
c) usb devices would be output slaves, however if no master exists, usb device would be reluctant master.

I've rebased the prime work[1] on top of the dma-buf upstream work, and worked through most of the lifetime problems. Some locking issues still exist, and I'll have to get back to them. But the code works and doesn't oops randomly which is good.

prime is the kernel magic needed for this work, as it allows sharing of a buffer between two drm drivers, so for (a) it shares the dri2 front pixmap between devices, for (b/c) it shares a pixmap that the rendering gpu copies dirty updates to and the output slaves use as their scanout pixmap.

So I've done nearly all the work to share between intel and nouveau and I've done the kernel driver work for udl, but I haven't done the last piece in userspace for (c), which is to use the shared pixmap as usb scanout via the modesetting driver.

Today I hacked in a switch on the first randr command, so I can start the X server with intel as master and nouveau in offload mode. I can run gears on intel or nouveau, then after the randr command and another randr command to set a mode, the X server migrates everything to the nouveau driver, puts it in master mode, and places the intel driver into output slave mode. It seems to render my xterm + metacity content fine.

So the current short-term TODO is:
fix some issues with my nouveau/exa port rendering
fix some issues with xcompmgr
add usb output slave support.

Medium-term TODO:
work out how to control this stuff via the randr protocol. How much information do we need to expose to clients about GPUs, and how do we control them? There are open issues with atomicity of updates, to avoid major ugliness. Switching from intel master to nvidia master + intel outputs means we have to reconfigure the Intel output to point at the new pixmap, but the more steps we put in there for clients to do, the more ugliness and flashing we'll see on screen. However, we probably want a lot of this to be client driven (i.e. gnome-settings-daemon).

Longer term TODO:
Get GLX_ARB_robustness done, now that Ian has done the context creation stuff, this should be a lot more trivial. (so trivial someone else could do it :)

[1] http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-prime-dmabuf

ssh friction

I haven’t been too good this year at removing friction from my workflow. Today I wanted to change that. And the random friction thrown my way today has to do with ssh.

You see, somewhere along the line I read that it is a good idea to create separate keys for separate identities. So I have an identity for all work-related stuff (which I consider ‘ring 1’: it’s unlikely to change but everyone can get fired or change jobs), one for personal stuff on machines I actually control (‘ring 0’: they’d have to pry it out of my dead hands), another for my ‘public online default’ identity (‘ring 2’: I can always pull a whytheluckystiff and pull myself off the net and reinvent myself), and then per-project identities (‘ring 3’: I may lose interest in being a fedora or gstreamer contributor without massive changes in my personality).

I started splitting ring 3 per project when it made sense – for example, Fedora recently enforced a key change even if your account wasn’t compromised and even if you already have a strong passphrase on your key (like I had), and of course a massive flamefest ensued. I shrugged and decided to split off a new key and set that on all my machines.

But the problem is, this whole tower of ssh doesn’t really work well in practice. I chose a long passphrase for the new fedora keys, but obviously I do not want to type that every time I clone a package or commit changes. So I use ssh-agent. In theory, ssh-agent adds your keys and asks you for the passphrase once, and is then able to offer those identities to the other side.

The problem is a lot of ssh servers out there only give you a few tries. So your ssh agent will offer identity by identity until it gets refused. If my fedora identity was added as the fourth identity I lose – I can’t clone a package.

Specifying IdentityFile in the ssh config is useless. It is poorly documented, but IdentityFile files actually come after your ssh-agent identities. So your agent blasts all the wrong keys at the host first, and you get denied.

So you can specify IdentitiesOnly to make sure that only the identity file you want is being used. Sadly in that case it will not use the ssh-agent at all, so it will ask you for the password to your key file – which is exactly what you wanted an agent to avoid in the first place.

Now obviously ssh has all the pieces it needs to Do The Right Thing. If my config says to use this identity and this identity only, ssh should be able to request ssh-agent to present that identity, and that identity only, and make the login happen without any password.

Surely I must be missing something obvious. Surely one of you uberhackers out there has set up the same thing as me. Why don’t you comment about it here and help the rest of us?
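For reference, this is the kind of per-host stanza being described, as a sketch only. The host name and key path are hypothetical, and the stanza is written to a scratch file here; in real use it would go in ~/.ssh/config. Whether the agent ends up supplying the key with this setup may depend on the OpenSSH version and, as far as I know, on ssh being able to read the matching public key file next to the private key, so treat it as something to try rather than a confirmed fix.

```shell
# Write the stanza to a scratch file for the demo.
# Host name and key path are hypothetical placeholders.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Offer only the key named here, even if the agent holds others.
Host pkgs.example.org
    IdentitiesOnly yes
    IdentityFile ~/.ssh/id_rsa_fedora
EOF

cat "$conf"
```

After adding the key with ssh-add, something like `ssh -v -F "$conf" pkgs.example.org` shows in the debug output which identities actually get offered and in what order.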

December 21, 2011

Semantic

I’ve been running an Emacs built from bzr for a while now.  I did this so I could try a newer version of Semantic; the one in Emacs 23 is just too broken to use.

Semantic, in case you haven’t heard of it, is an ambitious project to turn Emacs into an IDE.  Really it is quite a crazy project in some ways — it includes its own implementation of CLOS (which opens a strange Emacs maintenance debate: how can CLOS be ok but the CL package not be?) and a port of Bison to elisp (but again, strange: the port is really a pure port, it does not use sexps as the input — bizarre).

Semantic is now usable in Emacs; I’ve found a few buglets, but nothing serious.  In fact, now I find it indispensable.

I have it configured in the most Emacsy way of all: I didn’t make any changes, I just enabled it with M-x semantic-mode.  Then I visited a bunch of gdb source files.  Semantic started indexing them in the background.

Now when I want to jump to a declaration or definition of a function, I use C-c , J.  The key binding is nuts, of course, but I’ve been too lazy to rebind it yet.  Anyway, this acts basically like M-., except you don’t have to ever run etags.  Wonderful.

Semantic has some other nice features, too, but I haven’t really used them yet.  If you’re using it I’d love to hear what you do with it.

 

December 16, 2011

Modern Linux forming new social mores

Was at a meeting today and at some point a whole side of the table was lined with half-closed laptops - except one that was closed fully. You could play "spot the GNOME 2.x user" (hello, Richard). That is because GNOME 3 on Fedora 16 has a reliable suspend, so everyone has it enabled by default, because that's what one normally wants. At a meeting, people used to close the tops to direct their attention to the presentation, or to indicate that they do. But that was then. Now, doing so triggers suspend, and that bumps you off VPN or weird WiFi providers like BitBuzz. Result is funny-looking and awkward. My neighbour eventually resorted to locking the screen (which blanks) instead of closing the top. Although blocking the distraction effectively, this has the disadvantage of not sending the right signal to the presenter, but tough cookies.

I think ideally I might like some kind of hotkey-suspend or whatnot, and disable suspend-on-close, but I dunno. Seems kind of bothersome to RTFM for GNOME. Worse, getting everyone in a meeting to agree upon a non-default configuration seems like a non-starter.

December 08, 2011

Enterprise Linux 6.1 to 6.2 risk report

Red Hat Enterprise Linux 6.2 was released this week (Dec 2011), just over six months since the release of 6.1 in May 2011. So let's use this opportunity to take a quick look back over the vulnerabilities and security updates made in that time, specifically for Red Hat Enterprise Linux 6 Server.

Errata count

The chart below illustrates the total number of security updates issued for Red Hat Enterprise Linux 6 Server if you had installed 6.1, up to and including the 6.2 release, broken down by severity. It's split into two columns, one for the packages you'd get if you did a default install, and the other if you installed every single package (which is unlikely as it would involve a bit of manual effort to select every one). For a given installation, the number of package updates and vulnerabilities that affected you will depend on exactly what you have installed or removed.

So, for a default install, from release of 6.1 up to and including 6.2, we shipped 36 advisories to address 121 vulnerabilities. 2 advisories were rated critical, 10 were important, and the remaining 24 were moderate and low.

Or, for all packages, from release of 6.1 up to and including 6.2, we shipped 88 advisories to address 218 vulnerabilities. 10 advisories were rated critical, 16 were important, and the remaining 62 were moderate and low.

Critical vulnerabilities

The 10 critical advisories addressed 31 critical vulnerabilities across 3 components:

  1. Two updates to the OpenJDK 6 Java Runtime (June 2011, October 2011) where a malicious web site presenting a Java applet could potentially run arbitrary code as the user running a web browser.
  2. Four updates to Firefox (June 2011, August 2011, September 2011, November 2011) where a malicious web site could potentially run arbitrary code as the user running Firefox.
  3. Four updates to Thunderbird (June 2011, August 2011, September 2011, November 2011) where a malicious email message could potentially run arbitrary code as the user running Thunderbird.

Updates to correct all of the 31 critical vulnerabilities were available via Red Hat Network either the same day or the next calendar day after the issues were public.

Other significant vulnerabilities

Although not in the definition of critical severity, also of interest during this period were a few flaws that were high risk or easily exploitable:

  • A flaw in Bind, CVE-2011-4313, fixed by RHSA-2011:1458, where a malicious client could cause Bind to stop responding, a denial of service attack. This flaw was discovered by it being accidentally triggered in the wild.

  • A flaw in the Apache HTTP Server, CVE-2011-3192, fixed by RHSA-2011:1245, where a remote attacker could cause a denial of service attack. This was discovered due to a public exploit.

  • A flaw in RPM, CVE-2011-3378, fixed by RHSA-2011:1349, where a specially-crafted RPM package, when queried or installed, could cause rpm to crash or, potentially, execute arbitrary code prior to any signature checking. We're not aware of any working exploits for this issue.

  • Updates to blacklist the DigiNotar Certificate Authority.

Previous update releases

To compare these statistics with previous update releases we need to take into account that the time between each update release is different. So looking at a default installation and calculating the number of advisories per month gives the following chart:

This data is interesting to get a feel for the risk of running Enterprise Linux 6 Server, but isn't really useful for comparisons with other major versions, distributions, or operating systems -- for example, a default install of either Red Hat Enterprise Linux 4AS or 6 Server does not include Firefox, but a default install of 5 Server does. You can use our public security measurement data and tools, and run your own custom metrics for any given Red Hat product, package set, timescales, and severity range of interest.

November 25, 2011

Gtk+ work on windows

The last few weeks I have been working on the Gtk+ win32 port. Since the client side windows work landed in Gtk+ 2.18 the Windows port has been a bit broken, but now I finally sat down and fixed the remaining issues. So, the newly released Gtk+ 2.24.8 is now officially the best ever Gtk+ 2.x release on windows.

Then I forward ported all the work to the current Gtk+ 3 tree. It was mostly trivial, but one thing that changed a lot in Gtk+ 3 is theming. The old windows theme was mostly a custom theming engine, but in Gtk+ 3 we want to use engines less in favour of CSS. So, I had to completely redo the windows theme using CSS.

I added a few CSS extensions that access the win32 theming APIs, so you can get theme backgrounds, theme colors and theme sizes. Then the rest is just traditional CSS to bind the things together.

Here is an image of the current state:

Widget Factory on Windows

There are clearly still some issues that need fixing, but it works impressively well for just being some CSS. Check it out.

November 21, 2011

fakemail is handy

For debugging mail problems, fakemail is really, really handy, e.g. when debugging some emailmerge stuff in LibreOffice recently, where I had a bug which required generating a couple of hundred emails in quick succession to trigger.

November 18, 2011

Introducing the Journal

In the past weeks we have been working on a major new addition to systemd that will hopefully positively change the Linux ecosystem in a number of ways. But see for yourself, check out the full explanation of what we have implemented in the design document we put up on Google Docs.

Vulnerability Acknowledgements for Red Hat online services

When we get notified of a security issue affecting a Red Hat product in advance we give an acknowledgement in the security advisory and in our CVE database.

We've now created a page to give acknowledgements to the companies and individuals that report issues in our online services, such as finding a cross-site scripting flaw in a Red Hat web site, or a vulnerability in OpenShift.

November 14, 2011

D-Feet 0.1.14 “Iconic” has been released

After being pinged in Bugzilla I finally set some time out to do a new release of D-Feet – The D-Bus Debugger.  This release was aimed at getting some of the bug fixes that have been sitting in git out into an official tarball.  The only big change is I took some time out this morning to hack up a new hi-res icon for better integration into gnome-shell.  It is based off of the system-search icon in gnome-icon-themes and I now provide a source SVG so those with better art skills than myself can fix it up a bit.

Seriously though, please do feel free to commit fixes and ping me to do releases.  I originally wrote it as a one-off tutorial because many similar projects failed to grasp the correct semantics of dealing with D-Bus messages at such a low level.  I hear D-Feet is used quite a bit but my own time is limited for such a project as I no longer use it every day.  In that respect, if someone wanted to take over upstream maintainership I am sure there is a community of D-Bus developers who would be grateful for faster release cycles and more attentive bug fixes.

Latest release: http://download.gnome.org/sources/d-feet/0.1/d-feet-0.1.14.tar.xz (sha256)

Project page: https://live.gnome.org/DFeet


November 13, 2011

libexttextcat 3.2.0

Released libexttextcat 3.2.0 (Extended Text Categorization, used to guess the language that input text is written in). It can be found in this download dir. No code changes from 3.1.1, but it adds a large collection of extra language signatures, giving libexttextcat nearly the same language support as LibreOffice, modulo languages that LibreOffice supports which don’t have a convenient UDHR translation to use as a basis to generate a language fingerprint.

Fedora 16 upgrade

A new Fedora, a new decision on which machines to upgrade. Usually I try to stagger the three machines I use most – my work desktop, my home desktop, and my laptop. I had updated work machine and laptop to F-15 when it came out, and kept my home desktop at F-14.

I actually have two or three root partitions on each of those machines, and I typically do a fresh install on a separate root, so I can try things, poke around, and make sure everything I will need works. When I do the install, I don’t mount my /home partition, because I don’t want to have the new version upgrade things for me on my user config.

I have a pretty long checklist by now that I go through on each install/upgrade: installing the packages I use a lot, setting up specific configuration, copying over ssh keys, …

I actually liked F-15 a lot, and though GNOME 3 has its issues (which I still want to document in a separate post), I overall enjoyed the experience. At home, I noticed myself using the windows key or moving my mouse to the top left corner expecting something to happen.

That is how you know you really are ready for GNOME3.

So I thought, what the heck, let’s get to upgrading all of them. I started with my laptop, as usual. That mostly went fine, except for hurdle number one. My laptop actually has /home encrypted. And I did not add it to my custom layout in anaconda. So, the system dropped me in a rescue shell after booting. It took me quite a while to figure out that I had to copy over /etc/crypttab from the old system. After that, things worked again.

Arguably, hurdle #1 may not be Fedora’s fault. Maybe normal users don’t encrypt home drives, or use custom partitioning like I do (although on a few fedora upgrades this saved my bacon when it turned out certain things I needed didn’t work in the new Fedora, like VMWare).

And yes, GNOME 3.2 is a slight improvement. Enough to make a difference at least. All the usual applications seem to work, so I can now mount my old /home directory.

That’s when I ran into hurdle number 2: the default uid/gid numbering change. My thomas user now was 1000:1000 as opposed to 500:500 on all my machines before Fedora 16.

In this day and age, I still have to shell it up to fix things like that:
find / -uid 500 -exec chown 1000 {} \;
find / -gid 500 -exec chown :1000 {} \;

If I had less shame I’d tell you how embarrassing it is if you do this for a few users on your system, and start thinking “let’s put this in a for loop”, and because it’s already 1 AM you start doing things like
for a in 0 1 3; do find / -uid 50$a -exec chown 10$a {} \; ; find / -gid 50$a -exec chown 100$a {} \; done
Note how I got the number of 0’s wrong in the first find, and how I actually forgot the : in the second. You can imagine how amusing it is to fix the effect of those commands.
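For what it’s worth, here is a sketch of a more careful version of that loop. It prints the commands instead of running them, so they can be read over before anything touches the disk. The function names are made up for this example, and it assumes every old id simply maps to the old id plus 500.

```shell
# Map an old-style id (500, 501, ...) onto the new range (1000, 1001, ...).
remap_id() {
    echo $(( $1 + 500 ))
}

# Print the find commands for each old id instead of running them.
# $1 is the directory to start from; remaining arguments are old ids.
plan_remap() {
    root=$1
    shift
    for old in "$@"; do
        new=$(remap_id "$old")
        # chown changes the owner; chgrp changes the group, so no ':' to forget.
        echo "find $root -uid $old -exec chown -h $new {} +"
        echo "find $root -gid $old -exec chgrp -h $new {} +"
    done
}

plan_remap / 500 501 503
```

Once the printed commands look right, they could be piped to a root shell, for example `plan_remap / 500 501 503 | sh`. The -h option makes chown and chgrp change symlinks themselves rather than whatever they point to.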

But I’m a shameful person so I won’t tell you about this bit. Instead, suffice it to say that this took a long time.

Ok, so now /home is mounted on the laptop, and for the most part things worked fine.

On comes the weekend, so I turn to the home machine. I tend to keep the work machine for last, because I don’t want to spend work time on fixing distro problems. And I usually take a whole weekend to upgrade at home. The home machine turned out to be more of a problem. I ran headlong into hurdle number three. You see, there is this new thing called GPT for your partition table, and it is now the default, and it means that fdisk will no longer work, and now you should use gdisk (which sadly is not installed on the rescue bit of the install DVD, boo!), and this is all so we can have grub2, which is supposed to be better or something.

I’m sure one day I will be thankful. But on my home machine, I didn’t know any of this, and just had anaconda tell me something about the boot image being too large and there was no space for it and my system may not boot. (I am not sure why I did not run into this problem on my laptop – presumably, looking at the disk layout now, because I kept the original install, which includes Windows, and just shrunk that and added linux – so it’s probably the windows thing doing the booting). And sure enough, the Fedora 16 install did not boot. It dropped me into my friend, the shell.

So here’s the thing. This new way of doing things needs more space than your average MBR, so you actually need to create a primary partition for this, and it needs to be in the first 2 TiB. So you know what time it is now. It’s resize-o-clock time – I get to learn the joys and mysteries of shrinking ext4-on-software-raid so I can make space for this new partition, which doesn’t need to be big, apparently 5 MB is more than enough. Aren’t I happy now that I stubbornly stuck to having a /boot partition as the primary one on my machines, so I can just shrink that a little?

So shrinking an ext partition I already had down pat. I learnt about shrinking software raid partitions, and again I got into the land of not understanding which of the many types of numbers (sectors ? blocks ? bytes ? cylinders ? Mebi vs Mega ?) are understood the same way by the tools, or not understanding how much of those numbers you need to count extra because of the layer of indirection being added (encryption on logical volume on LVM on software RAID anyone ?). So to be safe I end up shrinking 10% on each layer of the onion as I go deeper – then let the tools handle growing to the maximum space again, since that’s the one thing they’re usually decent at.
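To make the Mebi vs Mega difference concrete, here is a small arithmetic sketch. It assumes 512-byte logical sectors, which is an assumption and not something to take on faith (on Linux, `blockdev --getss` on the device reports the real value). The function names are made up for the example.

```shell
# Assumes 512-byte logical sectors; check the real value before relying on it.
SECTOR_BYTES=512

# Mebibytes (2^20 bytes each) to sectors.
mib_to_sectors() {
    echo $(( $1 * 1024 * 1024 / SECTOR_BYTES ))
}

# Megabytes (10^6 bytes each) to sectors, rounded up so the result is never too small.
mb_to_sectors() {
    echo $(( ($1 * 1000 * 1000 + SECTOR_BYTES - 1) / SECTOR_BYTES ))
}

mib_to_sectors 5   # 10240
mb_to_sectors 5    # 9766
```

So "5 MB" and "5 MiB" already differ by 474 sectors at this size, which is the kind of gap that bites when one tool counts in one unit and the next tool in the other.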

But you know, if I’ve done all this, I want to get it right. I don’t want a stinking BIOS boot partition sitting after my /boot partitions. That’s not how F16 sets it up by default. But I have never actually moved a partition. So, download gparted, look at it, figure out how it can let me do that, make sure I ask it to count by cylinders so it doesn’t leave gaps, be puzzled at why it doesn’t let me use fractions for MiB sizes of partitions, and work around it in some other way. And so I finally have those two software raid /boot-wearing partitions where I want them – sitting right behind this new BIOS partition.

I create a new partition in fdisk (which is what I’m used to), but I can’t actually set the partition type to EF02, which has four characters where I expect two. But really that is what BIOS BOOT should be.

And now the internet tells me I need to set some flag on it using a tool called parted – some flag called bios_grub. Except when I type that magical command that sets the flag, it tells me it can’t exist:
[root@otto ~]# parted /dev/sda set 6 bios_grub
parted: invalid token: bios_grub
Flag to Invert?

Isn’t this tool nicely written for only the writer of the tool instead of for human beings? Of course I don’t know this when it barfs this at me, but at the end of this story I figured a bunch of things out that this tool could have told me.

You see, invalid token just means that it doesn’t accept the flag named bios_grub. I know this because I’m a programmer so I know the programmer used a token parser – a thing normal people shouldn’t have to know about. What’s that you’re asking? Flag to Invert? How about the Belgian flag, I would quite like to see the colors go in the opposite direction. No, that’s a prompt to choose a different flag to invert than bios_grub. Apparently bios_grub is a flag, not a setting, and I’m trying to invert it, instead of setting it. Can you tell me what flags you do know about, dear parted ?

(parted) help set
set NUMBER FLAG STATE change the FLAG on partition NUMBER

NUMBER is the partition number used by Linux. On MS-DOS disk labels,
the primary partitions number from 1 to 4, logical partitions from 5
onwards.
FLAG is one of: boot, root, swap, hidden, raid, lvm, lba, hp-service,
palo, prep, msftres, bios_grub, atvrecv, diag, legacy_boot
STATE is one of: on, off

Wait, what ? You do know about bios_grub ? But you don’t let me set it ?

I seriously spent 30 minutes on trying to figure that one out.

In the end, it’s because a) I should run gdisk b) parted won’t let you set that flag on a normal MBR drive c) gdisk should convert to using GPT and d) the messages gdisk prints by default are SUPER scary and the docs say that this is intentional to keep away stupid Windows users (I am not making this shit up). Well, that’s why I use software RAID, isn’t it ? How about we take our chances, dive in deep, and let this gdisk thing do the conversion to GPT on the first disk. Gulp.

OK, I got lucky. That actually worked. I can now create this partition, with the proper flag set. While I’m at it, why don’t we try this ‘sort partitions’ option in gdisk so that this new partition, which is now at the start, but listed as number 4 out of 4, shows up as number 1. Sure, it will renumber all other partitions, but let’s just hope that most things use UUIDs and labels and what not by now, and if not I should be able to figure things out.

In what feels like Day 5 in a two-day weekend, the system now boots! I actually see a new grub (wait, why is that text-mode only again ? Fedora guys, you spent years to make everything look graphical, because that was some huge important feature that mostly got in my way when it took longer than it was supposed to and I had no way to see why except reboot and remove quiet and rhgb from the options, and now you suddenly let grub2 take that back from you? Show us some spine, please), and the system shows me plymouth again. Until it doesn’t anymore, and drops me into a terminal screen.

Hurdle number four. Can you guess what it is ? Go on, take a stab. If you’ve updated your system, I’m sure you know the answer. I’ll give you some whitespace to think about it…


















SELinux. Riding in to relabel my file system to save it from the evil people out there. And sure, it warns me. This may take a long time. And then it proceeds to throw asterisks in my face. Lots of asterisks. It’s not the first time this happens. But every time it does, I cannot help but wonder one thing.

Who thought it was a great idea to throw asterisks at the user? How many asterisks am I supposed to expect? Never mind that you can’t actually count them unless you glue your eyeball to the screen, because there are so many they actually scroll off at the top. You know, if you squint hard enough, you can see the maniacally laughing face of the programmer who thought this was a nice way of showing progress. Never mind that tools like fsck can show a progress bar that actually means something (if you trick it into sending data to file descriptor 0) in a sensible way – one line on the console, and visible progress towards an end goal of 100%.
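For contrast with the wall of asterisks, here is roughly what a one-line progress display looks like. This is a sketch only, not how the relabel tool or fsck actually does it; the function names are hypothetical.

```python
import sys


def render_bar(done, total, width=40):
    """Return a one-line bar such as '[#####.....]  50%'."""
    if total <= 0:
        done, total = 1, 1          # nothing to do counts as finished
    filled = done * width // total  # integer math avoids float rounding
    percent = done * 100 // total
    return "[" + "#" * filled + "." * (width - filled) + "] %3d%%" % percent


def show_progress(done, total, stream=sys.stderr):
    """Redraw the bar in place instead of scrolling the console."""
    # '\r' moves the cursor back to the start of the current line
    stream.write("\r" + render_bar(done, total))
    stream.flush()
```

Calling show_progress(files_done, files_total) after each chunk of work keeps everything on one line and always shows how far there is left to go.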

If only I could guess what a long time is going to end up being. Is it a ‘get a drink’ amount of time? Or ‘watch some dexter’. Or nookie time? Or, get the hell out of the house and do all the shopping for the next three hours because there’s no way you’ll be doing anything useful with this system for that long?

So I do all of those things, twice, and one even four times (I won’t tell you which but I ended up having to pee a lot), and I come back, and the system has rebooted, and there’s actually a GUI asking me to log in.

You know, this Fedora 16 better be frigging spectacular after this six day weekend.

I log in, follow my standard upgrade checklist, try out some of my tools. Media keys don’t seem to work as before for my prototype music player (it flashes a nasty forbidden sign at me), and even though I set up to have nothing happen on inserting audio CDs (because my LEGO robot is inserting CDs into an external drive about fifty times a day), Rhythmbox craps on and FORCES me to select which of the many CDs with exactly the same name that audio CD might be. So, par for the course so far.

Maybe a reboot will fix that, it may not know about those settings until I have everything installed and upgraded. And if I reboot, I’d better convert my second drive to GPT and fix my /boot and set that flag and all that. So I do. And for some reason I can’t figure out how to tell software raid that sda2 and sdb2 (which are both still perfectly mountable as ext file systems and were part of the previous RAID-1 /boot array before I resized them) really are a software raid. So there’s this point where I’ve wasted more time on trying that than it would have taken me to actually manually type every byte on that /boot partition, and I just give up and recreate a software raid on those two partitions and copy stuff over.

And then I reboot. And won’t you know it. Effing goddamn selinux relabel all over again. In fact, this way too long entry was typed completely in less than half the time selinux took to complete some work it had already done an hour ago.

I better have a working system after this last relabel finishes. Now excuse me while I go make some comfort food, potatoes and beans and runny eggs with butter sauce. I’m going to eat it while my good friend Dexter comes back from a long holiday. It’s the only thing that is going to get me out of this weekend funk. And you know who I will be thinking about every time my friend Dexter tells me of a problem he solved…

November 11, 2011

When do we push most advisories?

We pushed an update to Flash Player for Red Hat Enterprise Linux Supplementary today, on a Friday, because it fixed Critical vulnerabilities. But we try not to push updates on a Friday unless they are critical and already public.

So let's take a look at the most common times and days we push advisories for Red Hat Enterprise Linux 4, 5, and 6 (including Supplementary) using a heatmap:

[Figure: heatmap of advisory pushes by day of week and hour]

The more advisories pushed for a given date and hour, the darker that section of the graph is. So the most popular times for pushing advisories are Tuesdays at 10am and 2pm Eastern US time, Fridays are pretty light for pushes, and there was nothing during the weekends. The spread of the graph shows that we push advisories when they are ready, rather than waiting for a fixed day and time, in order to reduce the risk to users.
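The bucketing behind such a heatmap can be sketched in a few lines. This is not the R code used for the graph; it assumes the push timestamps are already in US Eastern time, and the sample data is made up purely for illustration.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def push_counts(timestamps):
    """Count pushes per (day of week, hour) bucket."""
    counts = Counter()
    for ts in timestamps:
        counts[(DAYS[ts.weekday()], ts.hour)] += 1
    return counts


# Made-up timestamps, not real advisory data
sample = [
    datetime(2011, 11, 8, 10, 5),    # a Tuesday
    datetime(2011, 11, 8, 10, 40),   # same Tuesday, same hour
    datetime(2011, 11, 11, 14, 0),   # a Friday
]
counts = push_counts(sample)
print(counts.most_common(1))
```

Each bucket's count then maps to a colour intensity: the higher the count, the darker the cell.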

All the data used to create this graph is available as part of our public metrics. Thanks to Sami Kerola for the R code from which I based my graph generation.

November 10, 2011

Blue sky, white sand, and NetworkManager 0.9.2

The view as I type 'git tag 0.9.2'... (shazwan cc-by-2.0)

There’s no better way to celebrate the release of NetworkManager 0.9.2 than a sip of ice-cold cocktail.  It’s something pink-colored — I don’t know what — and it’s phenomenal.  And if I ever run out, I just ring a bell and somebody fills it up!  It’s basically like Paradise, except Paradise doesn’t have the latest version of NetworkManager.  Here’s a hot tip: make your first half billion and buy yourself a private island.  Then move there and write open-source software for fun.  It’s a pretty great life.  After a hard day on the beach bending networks to my will, I wind down by building buried hatches solely to confuse the island’s next owner (I’m trading up to a private archipelago in a few years).

But I digress.  So many people contributed to this release.  Unfortunately they couldn’t all fly out to my private island for the release party so I’ll just have to call them out by name instead.  I’m sure they’ll take Internet fame as a consolation prize, right?  So a huge thanks to Alfredo Matos, Colin Walters, Dan Winship, David Rothlisberger, Evan Broder, Florian Echtler, Gary Ching-Pang Lin, Jirí Klimeš, Larry Reaves, Ludwig Nussel, Mathieu Trudel-Lapierre, Michael Stapelberg, Thomas Bechtold, Thomas Graf, Thomas Jarosch, Tore Anderson, Michael Biebl, Vincent Untz, Anders Feder, Giovanni Campagna, Murilo Opsfelder, David Woodhouse, all our translators, and all our testers.  It wouldn’t have happened without you.

This release packs in some great stuff aside from the usual bug fixes and pixie dust: translated country names in the mobile broadband provider wizard, VPN details in the applet’s Connection Information dialog, auto-unlocking of GSM modems, support for libnl2 and libnl3, better IPv6 handling, enhancements for nmcli, rfkill fixes for EeePCs, GObject Introspection updates, better cooperation with unmanaged devices, timestamps for VPN connections, increased dnsmasq cache size, and more.  Get your tarballs on:

http://ftp.gnome.org/pub/gnome/sources/NetworkManager/0.9/
http://ftp.gnome.org/pub/gnome/sources/network-manager-applet/0.9/
http://ftp.gnome.org/pub/gnome/sources/NetworkManager-openconnect/0.9/
http://ftp.gnome.org/pub/gnome/sources/NetworkManager-openswan/0.9/
http://ftp.gnome.org/pub/gnome/sources/NetworkManager-openvpn/0.9/
http://ftp.gnome.org/pub/gnome/sources/NetworkManager-pptp/0.9/
http://ftp.gnome.org/pub/gnome/sources/NetworkManager-vpnc/0.9/

0.9.4: a Smörgåsbord of Freaking Awesome

What’s even more exciting is what’s all piled up for 0.9.4.  We’ve killed WEXT and now use the more robust nl80211 for talking to well-behaved kernel drivers.  We’ve uncoupled IPv4 and IPv6 addressing so that when one completes the connection is usable while we wait for the other one to complete or time out.  We’ve added bonding support, and VLANs and bridges are next.  We’ll have better firewall interaction.  We’ll probably have connectivity detection as well.  Many of these features are finished and merged to git master already.

Hey, 0.8.6 is out too!

If you’re into anachronisms, then we’ve got another release for you too.  0.8.6 got tagged earlier this week, and it’s got IPv6 fixes, auto-unlocking of GSM modems, improved usability of IP address and routing entry in the editor, notifications of mobile broadband changes, VPN information in the Connection Information dialog, better handling of gadget devices, retry of Ethernet connections on carrier bounces, allowing certificate paths in keyfile plugins, MAC address blacklists, on-the-fly recognition of newly installed VPN plugins, subject verification of 802.1x certificates, builds without PolicyKit, and much more.

So really it’s just raining NetworkManager goodness.  Except here on my island, where it’s always sunny, breezy, and absolutely perfect.  Time for another drink.  Cheers.

Playing the GNOME 3 Desktop Quest

Chris Siebenmann mentioned in e-mail to me that a new terminal window opens if one Ctrl-clicks on it in Activities in GNOME 3 (there are more tips in comments of his blog post). Previously, I found that right-click there brings up a menu that can open new windows, but this takes it to a new level. Owen Taylor, you are so devious!

Back in 1985 or so, a company where I worked obtained a very expensive and unique piece of equipment: an IBM PC with a hard drive. Someone installed a game on it, called "Leisure Suit Larry" one or the other, and the office took turns playing it. The game was a quest with "quasi-natural" command input. At some point in the game, the player has a dull knife that does not cut. Someone heard from someone that it needs to get sharpened, but nothing we tried worked. Nobody in the firm knew English, so I remember trying "make knife sharp", "knife sharp", "sharp knife", and any other permutation including "knife sharp do". The right answer was "sharpen knife".

GNOME 3 reminds me of that quest. Random inputs that make complete sense to Owen, same as the conjugation into imperative made complete sense to an English speaker, but entirely opaque and mysterious to people who are not in on the secret.

Package repository updated for Fedora 16

Following up on last night’s post, I updated my package repositories with all packages I currently work on before sending them off to Fedora.

The repository is updated with chromaprint, gstreamer-chromaprint, longomatch, and some of my own projects: moap, mach, morituri, and savon.

November 09, 2011

Mach 1.0.0 “Madera” released

Another November, another Fedora. 16 came out, so it was time to update mach again.

And today I thought, is there any reason mach isn’t 1.0 yet ? Am I going to do anything more to this piece of code before I want to call it that ?

And the answer is, no. It’s the first Python application I’ve written, and I’m not particularly proud of the code, but I’m happy I’ve made good use of it for so long, and that it helped push packaging approaches forward and sparked ideas for the Fedora build system.

Since I didn’t like the original code for mach2 (there was a version 1 which was Makefile-based), I started a rewrite with unit tests, better code layout, decent classes for abstracting distro-specific stuff, and so on.

The experience of how mock was created based off mach2 was a slightly sour one however, so I wasn’t really motivated to finish the mach3 rewrite. Sometimes that’s the drawback of open source – sure, forking is specifically allowed, so don’t whine about it when it happens. But when it’s done gratuitously, with no serious attempt at collaborating, it doesn’t feel like it’s in the spirit of open source.

Anyway, that was a long time ago. mach2 as it is today, is done. It really only needs updating for newer versions. As long as it works for me, it’s unlikely I will continue mach3, but who knows?

Enjoy the release!

November 07, 2011

Kernel Hackers Panel

At LinuxCon Europe/ELCE I had the chance to moderate the kernel hackers panel with Linus Torvalds, Alan Cox, Paul McKenney and Thomas Gleixner on stage. I like to believe it went quite well, but check it out for yourself: a video recording is now available online.

For me personally I think the most notable topic covered was Control Groups, and the clarification that they are something that is needed even though their implementation right now is in many ways less than perfect. But in the end there is no reasonable way around it: much like SMP, it is technology that complicates things substantially but is ultimately unavoidable.

Other videos from ELCE are online now, too.

November 04, 2011

Snow storm 2011

Sunday morning I woke up to this sight. An entire tree top, about 20-25ft long, had come down and blocked both garage doors. I borrowed a saw from the neighbors across the street (thank you) and ended up spending most of Sunday sawing this treetop into pieces and dragging those pieces off the driveway. Luckily the damage to the roof was absolutely minimal.

November 01, 2011

libabc

At the Kernel Summit in Prague last week Kay Sievers and I led a session on developing shared userspace libraries, for kernel hackers. More and more userspace interfaces of the kernel (for example many which deal with storage, audio, resource management, security, file systems or a number of other subsystems) nowadays rely on a dedicated userspace component. As people who work primarily in the plumbing layer of the Linux OS we noticed over and over again that these libraries, written by people who usually are at home on the kernel side of things, make the same mistakes repeatedly, thus making life for the users of the libraries unnecessarily difficult. In our session we tried to point out a number of these things, and in particular places where the usual kernel hacking style translates badly into userspace shared library hacking. Our hope is that maybe a few kernel developers have a look at our list of recommendations and consider the points we are raising.

To make things easy we have put together an example skeleton library we dubbed libabc, whose README file includes all our points in terse form. It's available on kernel.org:

The git repository and the README.

This list of recommendations draws inspiration from David Zeuthen's and Ulrich Drepper's well known papers on the topic of writing shared libraries. In the README linked above we try to distill this wealth of information into a terse list of recommendations, with a couple of additions and with a strict focus on a kernel hacker background.

Please have a look, and even if you are not a kernel hacker there might be something useful to know in it, especially if you work on the lower layers of our stack.

If you have any questions or additions, just ping us, or comment below!

October 27, 2011

OSPF in the home network

Having VLAN at home is quite reasonable, having DNS may be excusable, but to run OSPF one must be a real nerd. Normally, I am not, but at a certain point I made a decision to use routed VM networking, rather than bridged, and things pretty much snowballed from there. The last straw was a weird regression in Fedora 15, which does not accept IPv6 statics which Fedora 14 did.

KVM host, zebra.conf:

! zebra.conf
hostname simbelmyne.zaitcev.lan
password grumble

interface eth0

interface virbr0
! we have to run radvd because of virsh, so don't interfere into virbr0
  ipv6 nd suppress-ra

log file /var/log/quagga/zebra.log

KVM host, ospfd.conf:

password grumble
!
router ospf
  router-id 192.168.128.10
  network 192.168.0.0/16 area 0
  redistribute connected
!
log file /var/log/quagga/ospfd.log

KVM host, ospf6d.conf:

password grumble

router ospf6
  redistribute connected route-map TBD
  interface eth0 area 0.0.0.0
  interface virbr0 area 0.0.0.0
!!  network fd2d:acfb:74cc::/48 area 0

!ipv6 prefix-list test-prefix seq 1000 deny any
ipv6 prefix-list TBD-prefix permit fd2d:acfb:74cc:3::/64
ipv6 prefix-list TBD-prefix permit 2001:470:86b3:3::/64

route-map TBD permit 10
  match ipv6 address prefix-list TBD-prefix

log file /var/log/quagga/ospf6d.log

Router, zebra.conf:

! zebra.conf
hostname elanor.zaitcev.lan
password grumble

!ipv6 route ::/0 sit1

interface ethmain.5
  ipv6 nd suppress-ra

interface wlanhome
  ipv6 nd suppress-ra

interface ethmain.3
  ipv6 nd suppress-ra

log file /var/log/quagga/zebra.log

Router, ospfd.conf:

password grumble
router ospf
  router-id 192.168.128.1
  network 192.168.0.0/16 area 0
  redistribute connected
  default-information originate
log file /var/log/quagga/ospfd.log

Router, ospf6d.conf:

! ospf6d.conf
password grumble

router ospf6
  router-id 192.168.128.1
!!  redistribute kernel because it's the only way to fetch the default
!!  neither static nor connected work on it for some reason
  redistribute kernel route-map TBD
!  redistribute static
  redistribute connected
  interface ethmain.5 area 0.0.0.0
  interface wlanhome area 0.0.0.0
  interface ethmain.3 area 0.0.0.0

ipv6 prefix-list TBD-prefix permit ::/0
!ipv6 prefix-list TBD-prefix permit fd2d:acfb:74cc::/48
!ipv6 prefix-list TBD-prefix permit 2001:470:1f05:51::/64
!ipv6 prefix-list TBD-prefix permit 2001:470:86b3::/48
!!
route-map TBD permit 10
  match ipv6 address prefix-list TBD-prefix

log file /var/log/quagga/ospf6d.log

The biggest sticking point was the v6 default route. Defining it in zebra.conf and "redistribute static" plainly had no effect. The v4 side has "default-information originate" for that, but it's not documented for v6. Therefore, the weird workarounds with route map.
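To see why the route map on the router lets only the default route through, it helps to model the prefix list. The sketch below assumes that a prefix-list entry without ge/le modifiers matches only that exact prefix, which is my understanding of Quagga's behaviour and not something the configs above prove; the function name is hypothetical.

```python
import ipaddress

# Entries taken from the two ospf6d.conf files above
ROUTER_TBD_PREFIX = ["::/0"]
KVM_HOST_TBD_PREFIX = ["fd2d:acfb:74cc:3::/64", "2001:470:86b3:3::/64"]


def prefix_list_permits(route, entries):
    """True if `route` is exactly equal to one of the permitted prefixes."""
    net = ipaddress.ip_network(route)
    # Assumption: no ge/le modifiers, so only an exact match counts
    return any(net == ipaddress.ip_network(entry) for entry in entries)


print(prefix_list_permits("::/0", ROUTER_TBD_PREFIX))
print(prefix_list_permits("fd2d:acfb:74cc:3::/64", ROUTER_TBD_PREFIX))
print(prefix_list_permits("fd2d:acfb:74cc:3::/64", KVM_HOST_TBD_PREFIX))
```

Under that assumption, "redistribute kernel route-map TBD" on the router picks up the kernel's default route and nothing else, while the KVM host only announces its two /64 networks.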

October 26, 2011

aggressive, repulsive, spam

I've been using the same email address for more than 10 years now, so it would be putting it mildly to say that I get a lot of spam. Lately, I've noticed a trend, a set of emails that don't trigger the spam detection routines at all, but they're clearly unwanted and unsolicited. They look like this:

To: Tom Callaway <tcallawa@redhat.com>
From: "Arnie Boigner" <a.boigner@technology-infrastructure-services.com>
Subject: RE: Meeting Invitation
Tom,

I hope my last message found you well. I wanted to follow up and see when we could speak further about my company's virtualization storage solutions.

Thank you,

Arnie Boigner
----- Original Message -----

From: Arnie Boigner [mailto:a.boigner@technology-infrastructure-services.com]
Sent: Thursday, September 29, 2011
To: Tom Callaway
Subject: Meeting Invitation

Hi Tom,

I trust you're well. I'll be very brief. If you're like most of the companies I work with, you're saving money through virtualization, but you're spending more and more on data management. That's tough in a good economy, much less the one we're in. Executives tell us they struggle with:



Best Regards,

Arnie Boigner
Account Executive

*****
Now, noteworthy is the following:

1. Arnie Boigner is a real human being. He played football at Ohio Northern University from 1989-1992 and has 3,068 yards rushing.
2. He works for Nimble Storage, a recent startup, who neither I nor Red Hat have any relationship with, nor will we likely _EVER_.
3. His email isn't coming through a Nimble Storage owned domain, it's going through a dummy domain (technology-infrastructure-services.com). This domain is owned/managed by domainsbyproxy.com, a company owned by Go Daddy CEO Bob Parsons that exists solely to hide true site ownership and circumvent IANA rules. As if you expected better from Go Daddy, the scumlord of domain registration.
4. The "Original Message" quoted never was sent to me. It's entirely made up.
5. I'm getting a LOT of these, all with different unique sounding dummy domain names that all refer to different companies, and similar subjects, such as "Re: Schedule Discussion", "Re: Brief Discussion", "Re: Schedule Availability", "Re: Request a Call", "Re: Schedule a Call", etc. See the pattern? Looks like it started at the beginning of October.
6. I'm getting them repeated in a loop. Arnie? He's "emailed" me the same spam on October 5th and again today. Note that this isn't him sending me one message, then replying to it, it's the same email with bogus "Original Message" quoting at the bottom. (The earlier mail had Subject "Re: Meeting information", as opposed to "RE: Meeting information".)

From this evidence, I can only conclude that there is a new "marketing" company using spammer techniques to try to get leads. The companies who have contracted with this scheme should be ashamed of themselves, and I strongly encourage you to not do business with them.

Companies I know (thanks to their toxic spam) to have partnered with this spammer:
Acorn Systems
ANX
Barry-Wehmiller International Resources
BTM
Caliber Point
CollabNet (Double shame on you.)
Datrose
Nimble Storage
Prevari
Results Positive
ScanOne
Tidemark

List of known bogus domain names in this racket:
technology-infrastructure-services.com
solutionmethodologies.com
tech-software-solutions.com
talentoutsourceresources.com
globalengineering-resources.com
apworkflow-automation.com
soft-dev-resources.com
information-risk-mgmt.com
businessprocessesoutsourced.com
complexity-reduction.com
businessapplication-solutions.com
technicalresourcesdelivered.com

All of these domains seem to be using the same DNS servers:
NS2.MAXIMUMASP.COM
NS1.MAXIMUMASP.COM
I don't know if those are the DNS servers that DomainsByProxy uses by default or not. MaximumASP.com is owned by Network Solutions, which appears to be a somewhat legitimate hosting company, so I'm guessing that they're not the spammers, just hosting their DNS.
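Since the usual spam detection misses these messages, the domain list above lends itself to a simple local filter. This is a minimal sketch using Python's standard email module; the function names are hypothetical, and matching on the From domain alone is an assumption that will miss any new domains the racket adds.

```python
from email import message_from_string
from email.utils import parseaddr

# Domains copied from the list above
BOGUS_DOMAINS = {
    "technology-infrastructure-services.com",
    "solutionmethodologies.com",
    "tech-software-solutions.com",
    "talentoutsourceresources.com",
    "globalengineering-resources.com",
    "apworkflow-automation.com",
    "soft-dev-resources.com",
    "information-risk-mgmt.com",
    "businessprocessesoutsourced.com",
    "complexity-reduction.com",
    "businessapplication-solutions.com",
    "technicalresourcesdelivered.com",
}


def sender_domain(raw_message):
    """Return the lower-cased domain of the From header, or ''."""
    msg = message_from_string(raw_message)
    _name, addr = parseaddr(msg.get("From", ""))
    return addr.rpartition("@")[2].lower()


def is_racket_mail(raw_message):
    """True if the sender's domain is on the known-bogus list."""
    return sender_domain(raw_message) in BOGUS_DOMAINS


sample = (
    'From: "Arnie Boigner" <a.boigner@technology-infrastructure-services.com>\n'
    "Subject: RE: Meeting Invitation\n"
    "\n"
    "Tom,\n"
)
print(is_racket_mail(sample))
```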

Now, I'm not interested in filling out the DomainsByProxy form to report this spammer 20-something times, if for no other reason than I strongly suspect it will result in GoDaddy spamming me (they want a full header copy of each spam email). I'm mostly relying on the internet to shame these companies.

I'd love to know who this spammer is, so I can shame them equally.

October 25, 2011

VLAN in a home network

It took me 2.5 years since the arrival of a home router with 1 Ethernet port onboard to figure out that VLAN is the way to go [1]. Until today, I got by with USB Ethernet, which actually worked extremely well. However, VLAN offers some advantages, chief of which is the capability to have several networks, including a separate, less trusted network for TiVo and Wii. The switch was $114.

The idea of VLAN would probably never occur to me, but I read about it in a book about WRT54, which employs this architecture in order to save on Ethernet ports. And unlike WRT54, I have enough capacity in the single 1GE chip.

The biggest issue thus far was that while it would be very attractive to have the main VLAN untagged in the trunk, it appears impossible. For some weird reason, if you have, say, VLAN 5 untagged (eth0), and VLAN 3 tagged (eth0.3), dhcpd running on eth0 sees traffic arriving to eth0.3 too. Moreover, it manages to reply to it and NAKs requests. For that reason, I had to make all VLANs in the trunk tagged. This precludes emergency replacement of the switch.

UPDATE: The post became a little bit out of date. After sitting on the all-tagged setup for a few days, it occurred to me that I lose access to the router if the switch goes kaputt, and vice versa (due to letting the switch acquire its management address with DHCP on a now-tagged LAN). Therefore, now we are back to tagged+untagged, but it's different from what it was before.

Before, the untagged was the main Ethernet, legacy. Basically I introduced the switch as a normal switch and then added VLANs on top of it. Perfect if you need constant network access throughout the process.

Now, the untagged VLAN is a special segment with 2 fixed addresses. There is no traffic on it and no DHCP, except the management. It is to be used in case of emergencies. Either router or switch can be easily replaced with a working piece of equipment. Everything else is on tagged VLANs in the trunk. The dhcpd listens on tagged interfaces and avoids the bug.

Finally, for the record, the VLANs are configured by the good old: "cp ifcfg-eth0 ifcfg-eth0.5 && echo VLAN=yes >> ifcfg-eth0.5". It seems strange that the VLAN ID is only configured in the filename, but that appears to be the right way to do it in Fedora 15. There is no VLAN_ID=5 anywhere.
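The naming convention above, where the VLAN ID lives only in the device name, can be illustrated with a small parser. This is a sketch of the convention, not of what the Fedora network scripts actually execute; the function name is hypothetical.

```python
def parse_vlan_device(name):
    """Split 'eth0.5' into ('eth0', 5); untagged names give (name, None)."""
    if name.startswith("ifcfg-"):
        name = name[len("ifcfg-"):]      # accept config file names too
    parent, sep, vid = name.rpartition(".")
    # 802.1Q allows VLAN IDs 1 through 4094
    if sep and parent and vid.isdigit() and 1 <= int(vid) <= 4094:
        return parent, int(vid)
    return name, None


print(parse_vlan_device("ifcfg-eth0.5"))
print(parse_vlan_device("eth0"))
```

The same split applies to the router interfaces named ethmain.5 and ethmain.3 in the OSPF post.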

[1] Actually, the box has an internal Mini-PCIe connector, but it is occupied by the WiFi, a b43 scavenged from an old laptop.

October 24, 2011

GStreamer Conference number 2

I’m in Prague right now for the second GStreamer conference. Prague is as pretty as I remember it from eighteen years ago when I was still in high school and we had our yearly school trip.

It’s great to see a mix of familiar and new faces again. 11 years ago GStreamer was made public, and I joined a year later around the 0.1.1 release if I recall. And now it’s this huge living breathing thing.

Tomorrow I will be giving a talk about Flumotion here, at 12.00 in the main room. If you’re interested in GStreamer beyond mere playback, this talk is for you. The only sad part is that my good friend Jan Schmidt will be talking about Bluray at the same time, but I’m relying on Ubicast to record it properly so I can see it later!

October 22, 2011

Prague

If you make it to Prague the coming week for the LinuxCon/ELCE/GStreamer/Kernel Summit/... superconference, make sure not to miss:

All of that at the Clarion Hotel. See you in Prague!

October 21, 2011

hotplug demo video (teaching X.org new tricks).

So today I managed to see something on screen doing X.org hotplug work. So I present to you live X.org plugging.

http://www.youtube.com/watch?v=g54y80blzRU

It is pretty much a laptop running the xf86-video-modesetting driver with my server, an xterm, and metacity. Plug in a USB DisplayLink device, with a kernel drm driver I wrote for it (and a sneaky xrefresh in the background), and the USB device displays the xterm and metacity.

So what actually happens?

The X server at the start had loaded drivers using udev, and a new driver ABI. It exports one X protocol screen and plugs an internal DrvScreen into it.

When the hotplug happens, the server inits another DrvScreen for the new device, and plugs it into the single protocol screen. It also copies all the driver level pixmaps/pictures/GCs into the new driver. The single protocol screen at the bottom layer multiplexes across the plugged in drvscreens.
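The plugging described above can be modelled as a toy. This is emphatically not the X server code (which is C); the class and method names are invented for illustration, and only pixmaps are tracked, whereas the real work also covers pictures and GCs.

```python
class DrvScreen:
    """Toy stand-in for one driver-level screen (one GPU or USB device)."""

    def __init__(self, name):
        self.name = name
        self.pixmaps = {}

    def create_pixmap(self, pixmap_id, width, height):
        self.pixmaps[pixmap_id] = (width, height)


class ProtocolScreen:
    """Toy stand-in for the single screen that X clients see."""

    def __init__(self):
        self.drvscreens = []
        self.pixmaps = {}

    def create_pixmap(self, pixmap_id, width, height):
        # Multiplex: every plugged-in driver screen gets its own copy
        self.pixmaps[pixmap_id] = (width, height)
        for drv in self.drvscreens:
            drv.create_pixmap(pixmap_id, width, height)

    def plug(self, drv):
        # Hotplug: replay existing resources into the new driver screen
        for pixmap_id, (width, height) in self.pixmaps.items():
            drv.create_pixmap(pixmap_id, width, height)
        self.drvscreens.append(drv)


screen = ProtocolScreen()
gpu = DrvScreen("laptop GPU")
screen.plug(gpu)
screen.create_pixmap(1, 640, 480)
usb = DrvScreen("USB display")
screen.plug(usb)                    # usb receives a copy of pixmap 1
screen.create_pixmap(2, 10, 10)     # now goes to both driver screens
```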

This is like Xinerama pushed down a lot further in the stack, so instead of doing iterations at the protocol level, we do it down at the acceleration level. Also I have randr code hooked up so all the outputs appear no matter what GPU they are from.

This isn't exactly what I want for USB hotplug, ideally we'd use the main GPU to render stuff and only scanout using the USB device, but this is step one. I also need the ability to add/remove drvscreens and all the associated data in order to support dynamic GPU switching.

The real solution is still a long way off, but this is a small light in a long tunnel. I've been hacking on this on and off for over a year now, so it's nice to see something visible for the first time.

CTL/CJK format character previews

As Lior Kaplan demonstrated at LibreOffice 2011 Paris, our format character preview really sucks for CTL and CJK users. If no CTL/CJK text is selected then no CTL sample text is shown, and the CJK sample text is from the fontname itself. Many font names are just Latin text, so give no indication what they look like in the actual script/language that is being written in.

e.g. the old dialog for CTL will only preview some Western text if no text is selected, with no attempt to show any sample CTL text, or even the CTL fontname. For CJK it will additionally show the fontname of the CJK font in the preview, which isn’t helpful if the CJK fontname contains no CJK glyphs.

Simply adding the CTL fontname wouldn’t help much, seeing as the fontname is David CLM. So, currently, reusing the preview text used in the font drop-down as a first stab at “doing the right thing” gives me…

Code for all this is mostly in svtools/source/misc/sampletext.cxx where there is now some hugely over-engineered set of heuristics to guess the best script a font is tuned for and various functions to generate suitable text when all we have is the font, versus the font+language vs just the language and if we want a short identifier to classify what script a font might be good to render vs a longer sequence of sample text for a font preview.

Probably best to drop rendering the fontname in the Western case for the text preview and use some sample text there too, at least for the mixed Western+CTL+CJK case, as it’s confusing to have a font name rendered and some sample text in another font.

After the initial posting, there were some comments about the hideous rendering of the Hebrew text, which appears to be an artefact of using David CLM. Here’s what it looks like with David, i.e. it’s the rendering using that font that misplaces the Nikud, not me. Whether this is an interesting bug in our renderer, or maybe glyph fallback, or the font itself is probably worthy of investigation.

October 20, 2011

Plumbers Wishlist, The Second Edition

Two weeks ago we published a Plumber's Wishlist for Linux. So far, this has already created lively discussions in the community (as reported on LWN among others), and patches for a few of the items listed have already been posted (thanks a lot to those who worked on this, your contributions are much appreciated!).

We have now prepared a second version of the wish list. It includes a number of additions (tmpfs quota! hostname change notifications! and more!) and updates to the previous items, including links to patches, and references to other interesting material.

We hope to update this wishlist from time to time, so stay tuned!

And now, go and read the new wishlist!

October 19, 2011

PhagsPa and Tai Le, sample text?

Looking through my fonts that are clearly tuned for a single specific script, there remain two scripts that niggle me as I don’t have suitable sample text for them, i.e. PhagsPa and Tai Le. I’m looking for a short snippet of sample text in those scripts which is suitable to stick into the font dropdown preview. Ideally something fairly equivalent to “Alphabet”, “Script”, “PhagsPa/Tai Le” or “Tibetan/Tai Lü”.

October 17, 2011

Google doesn't like my name

Nice one, Google suspended my Google+ account because I created it under, well, my name, which is "Lennart Poettering", and Google+ thinks that isn't my name, even though it says so in my passport and in almost every document I own, and I was never aware I had any other name. This is ridiculous. Google, give me my name back! This is a really uncool move.

Your Questions for the Kernel Developer Panel at LinuxCon in Prague

I am currently collecting questions for the kernel developer panel at LinuxCon in Prague. If there's something you'd like the panelists to respond to, please post it on the thread, and I'll see what I can do. Thank you!

October 14, 2011

A Big Loss

Google announced today that they'll be shutting down Google Code Search in January. I am quite sure that this will be a massive loss for the Free Software community. The ability to learn from other people's code is a key idea of Free Software. There's simply no better way to do that than with a source code search engine. The day Google Code Search is shut down will be a sad day for the Free Software community.

Of course, there are a couple of alternatives around, but they all have one thing in common: they, uh, don't even remotely compare to the completeness, performance and simplicity of the Google Code Search interface, and have serious usability issues. (For example: koders.com is really really slow, and splits up identifiers you search for at underscores, which kinda makes it useless for looking for almost any kind of code.)

I think it must be of genuine interest to the Free Software community to have a capable replacement for Google Code Search ready for the day it is turned off. In fact, it is probably something the various foundations which promote Free Software, like the FSF or the Linux Foundation, should be looking into. There are very few better ways to get Free Software into the heads and minds of engineers than by examples -- examples consisting of real life code they can find with a source code search engine. I believe a source code search engine is probably among the best vehicles to promote Free Software towards engineers, in particular if it were itself Free Software (in contrast to Google Code Search).

Ideally, all software available on web sites like SourceForge, Freshmeat, or github should be indexed. But there's also a chance for distributions here: indexing the sources of all packages a distribution like Debian or Fedora includes would be a great tool for developers. In fact, a distribution offering this functionality might benefit from it, as it attracts developer interest in the distribution.

It's sad that Google Code Search will be gone soon. But maybe there's something positive in the bad news here, and a chance to create something better, more comprehensive, that is free, and promotes our ideals better than Google ever could. Maybe there's a chance here for the Open Source foundations, for the distributions and for the communities to create a better replacement!