August 27, 2015

systemd.conf 2015 CfP REMINDER

LAST REMINDER! systemd.conf 2015 Call for Presentations ends August 31st!

Here's the last reminder that the systemd.conf 2015 CfP ends on August 31st at 11:59:59pm Central European Time (that's Monday next week)! Make sure to submit your proposals by then!

Please submit your proposals on our website!

And don't forget to register for the conference! Only a limited number of registrations are available due to space constraints! Register here!

For further details about systemd.conf consult the conference website.

August 24, 2015

First Round of systemd.conf 2015 Sponsors

We are happy to announce the first round of systemd.conf 2015 sponsors!

Our first Silver sponsor is CoreOS!

CoreOS develops software for modern infrastructure that delivers a consistent operating environment for distributed applications. CoreOS's commercial offering, Tectonic, is an enterprise-ready platform that combines Kubernetes and the CoreOS stack to run Linux containers. In addition, CoreOS is the creator and maintainer of open source projects such as CoreOS Linux, etcd, fleet, flannel and rkt. The strategies and architectures that influence CoreOS allow companies like Google, Facebook and Twitter to run their services at scale with high resilience. Learn more about CoreOS at https://coreos.com/ and Tectonic at https://tectonic.com/, or follow CoreOS on Twitter @coreoslinux.

A Bronze sponsor is Codethink:

Codethink is a software services consultancy, focusing on engineering reliable systems for long-term deployment with open source technologies.

A Bronze sponsor is Pantheon:

Pantheon is a platform for professional website development, testing, and deployment. Supporting Drupal and WordPress, Pantheon runs over 100,000 websites for the world's top brands, universities, and media organizations on top of over a million containers.

A Bronze sponsor is Pengutronix:

Pengutronix provides consulting, training and development services for Embedded Linux to customers from industry. The Kernel Team ports Linux to customer hardware and has more than 3100 patches in the official mainline kernel. In addition to low-level ports, the Pengutronix Application Team is responsible for board support packages based on PTXdist or Yocto and deals with system integration (this is where systemd plays an important role). The Graphics Team works on accelerated multimedia tasks, based on the Linux kernel, GStreamer, Qt and web technologies.

We'd like to thank our sponsors for their support! Without sponsors our conference would not be possible!

We'll announce our second round of sponsors shortly, so please stay tuned!

If you'd like to join the ranks of systemd.conf 2015 sponsors, please have a look at our Becoming a Sponsor page!

Reminder! The systemd.conf 2015 Call for Presentations ends on Monday, August 31st! Please make sure to submit your proposals on the CfP page by then!

Also, don't forget to register for the conference! Only a limited number of registrations are available due to space constraints! Register here!

For further details about systemd.conf consult the conference website.

August 18, 2015

systemd.conf 2015 Call for Presentations

REMINDER! systemd.conf 2015 Call for Presentations ends August 31st!

We'd like to remind you that the systemd.conf 2015 Call for Presentations ends on August 31st! Please submit your presentation proposals before that date on our website.

We are specifically interested in submissions from projects and vendors building today's and tomorrow's products, services and devices with systemd. We'd like to learn about the problems you encounter and the benefits you see! Hence, if you work for a company using systemd, please submit a presentation!

We are also specifically interested in submissions from downstream distribution maintainers of systemd! If you develop or maintain systemd packages in a distribution, please submit a presentation reporting on the state, the future and the problems of systemd packaging so that we can improve downstream collaboration!

And of course, all talks regarding systemd usage in containers, in the cloud, on servers, on the desktop, and in mobile and embedded environments are highly welcome! Talks about systemd networking and kdbus IPC are very welcome too!

Please submit your presentations by August 31st!

And don't forget to register for the conference! Only a limited number of registrations are available due to space constraints! Register here!

Also, limited travel and entry fee sponsorship is available for community contributors. Please contact us for details!

For further details about the CfP consult the CfP page.

For further details about systemd.conf consult the conference website.

August 14, 2015

Tablet Uber Alles Or Is It

Given the trouble with modern laptops, I'm seriously wondering if I should make the jump to a gigantic tablet with a keyboard. You run "make" on a VM. Not enough RAM? Order more in the cloud! The idea was planted in my mind by that jerk Atwood, who penned an article claiming the death of the PC. And a month ago I saw someone at a Python meetup using Canopy. It kinda worked, actually. I expect GitHub Atom to be even better.

Unfortunately, there are still problems in three broad categories.

First, the hotspot Internet connectivity sucks. It is plain unreliable. VPN, ssh, and IRC are often blocked; it's necessary to remember the "Connectivity Through Anything" lessons and techniques. When it works, it's often slow. These problems extend to venues such as Intel's Executive Briefing Center. If "executives" eating their awesome snacks cannot obtain decent WiFi, what hope do I have? I do not have cellphone data, but I hear plenty of bitching about it.

Second, the usual questions about privacy and security apply. Non-proprietary tablets suck immensely, from what I heard.

Third, tablets top out at 10 or 11 inches. Sorry, but that is not enough to kill laptops while laptops continue to be made. Certainly, Atwood made the argument that as tablets absorb users, PC makers will stop making them. The day the last one quits, we'll have to use the least shitty tablet regardless of size. But today is not that day.

User-facing hardware

New business trip, new hardware pictures.

It has been almost a year, and I'm still looking for a decent laptop, with the same criteria. I saw a couple of guys using the Lenovo X1 Carbon, which looks good. Most importantly, the left Ctrl key now extends to its proper position. Almost a winner, but unfortunately there are issues. Apparently, the screen on the X1 does not sit flat against the main frame when it's closed, so a bundle of clothing pressing in the middle between the hinges is capable of making a nasty crack in the plastic. Not acceptable for what is a $1,400 laptop, even with Amazon's "discount" of $900. Way to go, Lenovo. Almost had me this time.

Meanwhile, a $500 Dell Vostro continues to soldier on. It's showing its age: building Ceph with "make -j${N}" requires more RAM than it has for any reasonable N, and dialog windows have started to outgrow its screen (notably, some of the GNOME preferences). I still need a laptop, but can't find a suitable one. The Lenovo X1 tops out at 8GB, which was another strike against it.

I was a little sad when Google stopped making the Nexus 7. I have the 2013 version and it is quite good. In the same meeting, I bumped into a guy with a projected update to the Nexus 7 that became orphaned when Google pulled the plug. ASUS continued to build them and market them as the "MemoPad 7". However, taking a page from the Microsoft playbook with their "Surface" and "Surface Pro", ASUS sells "MemoPad 7" versions ranging from a worthless piece of junk with a 1024x600 screen to actual Nexus 7 replacements with 1920x1200. Allegedly, the battery life and speed are much improved by using Intel's embedded Atom core. Some of the ARM-optimized apps may not work (one example is some kind of music editing thing for podcasters).

August 13, 2015

The case of the mysterious disappearing I211

Day one of unemployed life saw me finally getting around to the first of several hardware related maintenance items that I’ve been putting off until I’ve had the time.

I got a lot of life out of my desktop machine that I had been using since 2007. Earlier this year, I decided it was long overdue an upgrade, and ended up building a ridiculously over-specced machine in the hopes it too would last me a while. After some research, I ended up with a 6-core Haswell-E i7-5820K, and a frankly ridiculously over-featured motherboard.
Once I had delved through the absurd number of BIOS options to convince it that I *really* didn’t want to overclock my CPU or my RAM, or anything else, it was very stable.

It has exceeded all my expectations. In the time it took my old desktop to build one kernel, I can build kernel .deb’s for every machine I own, and still have time spare. It’s an absolute beast.

One of the features that sold me on this board was the two onboard ethernet ports. I had been wanting to do a bunch of networking experiments, and the possibility of using bonding, without having to screw around with add-in cards was appealing.

So I was a little irked one evening after updating its BIOS, to notice that the bond only had one interface active. After some investigation, I noticed that the PCI ID of one of the onboard NICs had changed.

What was once

00:19.0 Ethernet controller: Intel Corporation Ethernet Connection (2) I218-V (rev 05)
08:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)

Was now

00:19.0 Ethernet controller: Intel Corporation Ethernet Connection (2) I218-V (rev 05)
08:00.0 Ethernet controller: Intel Corporation Device 1532 (rev 03)

My I211 had changed its PCI ID, and the e1000 driver wouldn’t bind to this new device.

At first I thought “Cool, some kind of NIC firmware update”, and assumed that e1000 hadn’t been updated yet to support this new feature. Googling for “i211 1532” told a much sadder story however.

If you read the spec update for the i211, you find this interesting table:

I211 Device ID Code                          Vendor ID   Device ID   Revision ID
WGI211AT (not programmed/factory default)    0x8086      0x1532      0x3
WGI211AT (programmed)                        0x8086      0x1539      0x3

Uh, not cool. Somehow the BIOS update procedure had wiped the NVRAM on the NIC.

A long protracted conversation with ASUS support followed, including such gems as “I understand you’re seeing blue screens” and “Have you tried removing the DIMMs, rubbing the contacts with an eraser and replacing them”. Eventually I think they got to the end of their script, and agreed to RMA the board. Somewhat annoying, given there’s probably a tool somewhere that can rewrite the flash, but Intel only seems to make that available to integrators, not end-users, and the ASUS representatives denied all knowledge.

It was gone for about two weeks, and finally returned yesterday. Its PCI ID is 0x1539 again, and it has its old MAC address once more. (I’m now hesitant to ever upgrade the BIOS on this machine again.) So what happened? Anyone’s guess, but this isn’t the first time I’ve seen this happen. We had a bunch of these NICs at Akamai too that occasionally had the same thing happen to them.

The whole thing is reminiscent of a painful old bug where ftrace would corrupt the e1000e ROM. Hopefully Linux isn’t to blame this time.

So, long story short: If you see an i211 with a PCI ID of 1532, you’re looking at an RMA.
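
If you want to check a board for this failure mode, the numeric IDs are easy to spot with lspci. A minimal sketch (the 8086:1532 and 8086:1539 values are the ones from the spec-update table above):

# Show Ethernet controllers with numeric [vendor:device] IDs
lspci -nn | grep -i ethernet
# A programmed i211 reports [8086:1539]; [8086:1532] means the NVRAM is back
# at factory defaults and, per the above, you're looking at an RMA.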

August 11, 2015

Moving on from Akamai.

Today was my last day at Akamai. It’s been brief (Just over seven months), but things weren’t really working out for me there for a number of reasons. I’ve mentioned to a number of people who have known about my decision for a while, that it’s not that it’s a bad place to work, but it never felt like a good fit for me, and I came to realize that I’ve spent most of this last year being in denial of just how unhappy I was, in the hope “things would get better”.

There are a lot of smart people working there, working on really difficult problems, but a lot of those problems just don’t align with my interests, especially when they don’t always involve contributing code back upstream. [clarification: There is some upstream work going on there, just not as much as I’d like].

Add to this my disdain for some of the proprietary tooling that’s prevalent there, and it was becoming clear it was not a matter of “if”, but “when” I was going to leave. As an example; I joked a few months ago to co-workers “next time I’m looking for a job, the first question I ask is ‘do you use perforce’?”. Only it wasn’t really a joke, I was dead serious. User-hostile software has no place in my life.
Even little things like “let’s use git” translating to “let’s license Atlassian stash” rather than “run a git-daemon somewhere” started getting me down.

The final project I worked on there was a continuous rebase strategy for the kernel, moving away from perforce to git. It’s a move in the right direction, but ultimately, not the sort of work that gets me excited, and it’s going to be a multi-year project before it starts really bearing fruit. Given how perforce is ingrained in so many of Akamai’s systems, it would also have been extremely unlikely I’d have been able to purge all knowledge of ever having used it.

It also bothered me during the rebase work that many of the kernel changes we made had no chance of ever even being submitted, let alone accepted upstream. (In part because many of them are unique to Akamai’s CDN — you won’t find any of the trickery employed there described in a Richard Stevens book, and they’re unlikely to ever become official RFCs due to the competitive edge gained from those changes).
There are exceptions to all of this, and the kernel team is trying to do a better job there with upstreaming most of the newer changes, but many of the older legacy patches are under-documented, and/or understood well by few people, with the original authors no longer around, making it a frustrating exercise to get up to speed; especially when you’re trying to learn what the upstream code is doing at the same time.

Someone with less experience dealing exclusively with open-source for most of their career would probably find many of my reasons for leaving trivial. Those same people would probably find Akamai a great place to work. There are a lot of opportunities there if you have a higher tolerance for such things than I did. It was eye-opening recently, mentoring some of the interns there. Optimism. The unjaded outlook that comes with youth. Not getting bent out of shape at crappy tooling because they don’t know different. It made me realize I wasn’t going to ever be like this here.

On a particularly bad day a few weeks back, a recruiter reached out to me, to find out if I was interested in a second chance at an offer I received last time I was looking for a new job. It worked. Enduring an unhappy situation in the hopes things will get better isn’t a great strategy when there are other options.

So, I start at Facebook in September.

I have no delusions that things are going to be perfect there, but at least from the outside right now, the grass looks greener. I feel bad walking away from problems unfinished, but going home miserable or angry or some other negative emotion every day was really starting to take its toll. It’s not a healthy way to live.

When I was interviewing last December, I read Being Geek to death, so it’s fitting that I’ve picked it up again recently. One paragraph in particular jumps out at me.

My single worst gig was one where I got everything I wanted out of the offer letter, but in my exuberance for being highly valued, I totally forgot that my gut read on the gig was "meh". Ninety days later, I couldn't care less that I got a 15% raise and a sign-on bonus. I couldn't stand the mundanity of the daily work, and I happily resigned a few months later, taking both a pay cut and returning my sign-on bonus for the opportunity to work at Netscape.

Anachronisms and minor details aside, that paragraph played through my head this afternoon as I wrote the check to pay back the remainder of my sign-on bonus. I wasn’t quite thinking “meh”, but I knew I was making compromises on what I really valued from day one.

Walking away from unvested RSUs, giving up this month’s paycheck, and writing that check stings a little, but when I did my exit interview this morning, I knew that I, too, was “happily resigning” for a great opportunity.

I’m feeling uncharacteristically optimistic right now. Hopefully it’ll last.

I’ll be in Seattle next week, but due to complications with my registration being transferred to another Akamai employee, I won’t actually be at the Linux plumbers conf. If you’re also going to be there and want to catch up, drop me a mail, or <ahem> hit me up on facebook.

git submodule

It's a familiar sign to anyone dealing with a project that includes submodules: you run "make" and see something like this:

rgw/rgw_main.cc: In member function ‘virtual int RGWMongooseFrontend::run()’:
rgw/rgw_main.cc:993:8: error: ‘struct mg_callbacks’ has no member named ‘log_access’
cb.log_access = rgw_civetweb_log_access_callback;
^

Ah, yes. Submodule civetweb is obviously out of date. Type "git submodule init; git submodule update" and... nothing happens. The goddamn submodules are stuck.

At this point, running "git diff origin" produces an output like:

--- a/ceph-object-corpus
+++ b/ceph-object-corpus
@@ -1 +1 @@
-Subproject commit 20351c6bae6dd4802936a5a9fd76e41b8ce2bad0
+Subproject commit bb3cee6b85b93210af5fb2c65a33f3000e341a11

So yeah, obviously you fetched the right thing from the origin, but you cannot merge or rebase no matter what. You may spend a good part of a hackathon reading man pages for git subcommands, all for naught.

Fortunately, the stuck submodules can be worked around, by looking at the "git diff origin" above, then doing this:

git update-index --replace --cacheinfo 160000,20351c6bae6dd4802936a5a9fd76e41b8ce2bad0,ceph-object-corpus

You get the idea: force the right commit from the origin into the local index. This allows "git submodule update" to clone and checkout the right thing and you're off to the races. The fixups in the index will stick out in "git status", so create an empty commit to get rid of them (but only after "git submodule update").
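
For reference, here is the whole recovery sequence condensed into one place (a sketch only; the hash and path are the ones from the "git diff origin" example above, so substitute whatever your own diff shows):

# Force the commit recorded at the origin into the local index
git update-index --replace \
    --cacheinfo 160000,20351c6bae6dd4802936a5a9fd76e41b8ce2bad0,ceph-object-corpus
# Now the submodule machinery will clone and check out that commit
git submodule update
# Sanity check: mode 160000 and the hash forced above
git ls-files -s ceph-object-corpus
# Finally, commit so the index fixup stops showing up in "git status"
git commit -m "unstick ceph-object-corpus submodule"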

When you're done, you might want to kick in the nuts whoever chose to use submodules in your project.

P.S. "git --version" yields "git version 2.4.3".

P.P.S. You can verify what you have in the index by running "git ls-files -s ceph-object-corpus" (or src/civetweb). The mode must be 160000 and the hash should match the upstream. Note that "git diff origin" continues to display a disparity until you've run "git submodule update".

the future is here

Behold:


  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
10005 zaitcev   20   0  809920 755384  13220 R  99.7 12.5   0:20.47 cc1plus
 9894 zaitcev   20   0 1946748 1.806g  15800 R  99.3 31.4   1:46.60 cc1plus
 9956 zaitcev   20   0 1652076 1.524g  15832 R  99.0 26.5   1:30.64 cc1plus
   72 root      20   0       0      0      0 S   4.0  0.0   0:04.60 kswapd0
 9957 zaitcev   20   0   56648  43536   1436 S   2.7  0.7   0:00.49 as
 9895 zaitcev   20   0   79480  66368   1480 S   2.0  1.1   0:00.89 as
 2870 zaitcev   20   0 1989524 533104 160868 S   1.3  8.9  60:28.10 firefox
 2035 zaitcev   20   0 2018216 166872  20028 S   0.7  2.8  16:50.66 gnome-sh

That's right, boys and girls, a compiler with a bigger resident size than Firefox. Three times bigger.

August 02, 2015

DNF - Debugging Not Finished

It's 100% like CKS said:

[root@kvm-rei zaitcev]# dnf check-update openstack-swift
Last metadata expiration check performed 0:10:02 ago on Sun Aug  2 18:42:13 2015.

openstack-swift.noarch                   2.3.0-2.fc23                    rawhide
[root@kvm-rei zaitcev]# dnf update openstack-swift
Last metadata expiration check performed 0:10:07 ago on Sun Aug  2 18:42:13 2015.
Dependencies resolved.
Nothing to do.
Complete!
[root@kvm-rei zaitcev]# rpm -q openstack-swift
openstack-swift-2.2.0-3.fc22.noarch
[root@kvm-rei zaitcev]# 

Still searching for a good way to get it unstuck.
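
One thing worth trying (a sketch, not a guaranteed fix): force-expire the cached metadata so dnf re-reads the repos instead of trusting the stale cache.

# Throw away the cached repo metadata and re-check against fresh data
dnf clean metadata
dnf --refresh check-update openstack-swift
# The heavier hammer, if the cache itself is suspect
dnf clean all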

July 29, 2015

Conference submission and voting

Generally I feel that I do not do any work that's important enough to present at conferences. My previous presentation was at OLS back in 2005, concerning usbmon. The usbmon is something a guy learning C would program: it's a circular buffer into which the kernel drops tracing events; Wireshark pulls them out. Hardly conference material, but at the time I thought it was supremely important to proselytize the basic techniques of always-on tracing, because it would improve the quality and the ease of debugging of the kernel overall. I really wanted the FireWire guys to adopt a similar tracing scheme, because it was hell on a stick debugging juju with just printk(). Needless to say, that was a miserable failure, as was FireWire itself. I don't think anyone who came to listen to my presentation in Ottawa received their money's worth.

Or did they? Recently an epiphany occurred to me. I really should not even think about whether anyone is interested. That is the conference organizers' job, not mine! As a result, I sent a proposal to OpenStack Tokyo, entitled "The Plot to Destroy OpenStack Swift Using C++: Enhancements of Swift API Compatibility in Ceph RADOS Gateway". It's basically a compendium of practical issues that occur when running Swift apps on top of Ceph RGW and what we do to help people do that.

Things are a little different from 10 years ago, because attendees can vote on the submissions. This sounds democratic. I went through all the submissions on the storage track and voted on them according to my preference. It took a very long time and I suspect that I was crowdsourced by the organizers in the best traditions of Web 2.0. I wonder if they'll even read the abstracts. :-)

July 28, 2015

Announcing systemd.conf 2015

We are happy to announce the inaugural systemd.conf 2015 conference of the systemd project.

The conference takes place November 5th-7th, 2015 in Berlin, Germany.

Only a limited number of tickets are available, hence make sure to sign up quickly.

For further details consult the conference website.

July 23, 2015

Wikimania 2015 – random thoughts and observations

Random thoughts from Wikimania, 2015 edition (2013, 2014):

"Wikimania 2015 Reception at Laboratorio Arte Alameda - 02" by Jarek Tuszynski,  under CC BY 4.0
Wikimania 2015 Reception at Laboratorio Arte Alameda – 02” by Jarek Tuszynski, under CC BY 4.0
  • Dancing: After five Wikimedia events (not counting WMF all-hands) I was finally dragged onto the dance floor on the last night. I’ll never be Garfield, but I had fun anyway. The amazing setting did not hurt.
  • Our hosts: The conference was excellently organized and run. I’ve never had Mexico City high on my list of “places I must see” but it moved up many spots after this trip.
  • First timers: I always enjoy talking to people who have never been to Wikimania before. They almost always seem to have enjoyed it, but of course the ones I talk to are typically the ones who are more outgoing and better equipped to enjoy things. I do hope we're also being welcoming to people who don't already know folks, or who aren't as outgoing.
  • Luis von Ahn: Good to chat briefly with my long-ago classmate. I thought the Q&A section of his talk was one of the best I’ve seen in a long time. There were both good questions and interesting answers, which is more rare than it should be.
  • Keynotes: I’d love to have one keynote slot each year for a contributor to talk about their work within the movement. Finding the right person would be a challenge, of course, as could language barriers, but it seems like it should be doable.
  • US English: I was corrected on my Americanisms and the occasional complexity of my sentence structure. It was a good reminder that even for fairly sophisticated speakers of English as a second language, California-English is not terribly clear. This is especially true when spoken. Verbose slides can help, which is a shame, since I usually prefer minimal slides. I will try to work on that in the future, and see how we can help other WMFers do the same.
  • Mobile: Really hope someday we can figure out how to make the schedule legible on a mobile device :) Good reminder we’ve got a long way to go there.
  • Community engagement: I enjoyed my department's "engage with" session, but I think next year we need to make it more interactive—probably with something like an introduction/overview followed by a World Cafe-style discussion. One thing we did right was to take questions on written cards. This helped indicate what the most important topics were (when questions were repeated), avoided the problem of lecture-by-question, and opened the floor to people who might otherwise be intimidated because of language barriers or personality. Our booth was also excellent and I'm excited to see some of the stories that came out of it.
  • Technology and culture: After talking about how we’d used cards to change the atmosphere of a talk, someone deliberately provoked me: shouldn’t we address on-wiki cultural issues the same way, by changing the “technology” used for discussion? I agree that technology can help improve things, and we should think about it more than we do (e.g.) but ultimately it can only be part of the solution – our most difficult problems will definitely require work on culture as well as interfaces. (Surprisingly, my 2009 post on this topic holds up pretty well.)
  • Who is this for? I’ve always felt there was some tension around whether the conference is for “us” or for the public, but never had language for it. An older gentleman who I spoke with for a while finally gave me the right term: is it an annual meeting or is it a public conference? Nothing I saw here changed my position, which is that it is more annual meeting than public conference, at least until we get much better at turning new users into long-term users.
  • Esino Lario looks like it will be a lot of fun. I strongly support the organizing committee’s decision to focus less on brief talks and more on longer, more interactive conversations. That is clearly the best use of our limited time together. I’m also excited that they’re looking into blind submissions (which I suggested in my Wikimania post from last year).
  • Being an exec: I saw exactly one regular talk that was not by my department, though I did have lots and lots of conversations. I’m still not sure how I feel about this tradeoff, but I know it will become even harder if we truly do transition to a model with more workshops/conversations and fewer lectures, since those will be both more valuable and more time-consuming/less flexible.
  • Some day: I wrote most of this post in the Mexico City airport, and saw that there are flights from there to La Habana. I hope someday we can do a Wikimania there.

July 20, 2015

Fedora 22 killed IPv6 and I'm fine

I upgraded Fedora on my home router to F22 and immediately IPv6 disappeared on the internal network. The problem is that radvd started throwing its usual "no linklocal address configured on ethmain.5" (although the message is only visible with "IgnoreIfMissing off;"), which leads to "interface ethmain.5 does not exist or is not set up properly". With the default IgnoreIfMissing, radvd continues running but refuses to work, quietly. Needless to say, the interface has a perfectly valid link-local address, same as it had in F21 before the upgrade.

There used to be a time when I took a problem like this as an affront to the idea of IPv6 superiority and to the reputation of Fedora as a platform for a roll-your-own home router. Now though, I don't give a rat's tail for IPv6. Let Comcast and Google care and pay someone to care. Okay, I lied. I cared enough to file bug 1244428, but I'm not rushing to build from SRPMs, reinstall old versions, and such.

(Frankly, if we just engage Lennart's attention for an hour, he'll incorporate a perfectly serviceable radvd function into systemd-networkd. Of course, one would need journalctl to see any messages from it, but since it is certain to work, nobody would actually attempt that. The bug report would go unanswered like radvd bugs today, but again, there would not be any bugs.)

UPDATE: The root cause turned out to be an incorrect link-local address after all. I presumed that the RFC meant the whole fe80::/10 prefix to be used, so each interface had a different address within a node. Ergo, the fe80:0:0:1::1/64 address. As it turns out, I may have confused the Link-Local address with the Site-Local address. RFC 4291 specifies a fe80::/64 prefix, and in F22 radvd started enforcing it. Note that apparently the lower part has to be unique across the link.
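
For anyone hitting the same radvd complaint, a quick sanity check looks roughly like this (a sketch; the interface name is the one from this post, and the -c/-C switches are radvd's config-test and config-file flags per its man page):

# Show the link-local (scope link) addresses radvd will look at;
# after the fix above they should sit inside fe80::/64
ip -6 addr show dev ethmain.5 scope link
# Ask radvd to parse the config and report problems without daemonizing
radvd -c -C /etc/radvd.conf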

July 16, 2015

NetworkManager 1.0.4 Released!

Just a quick note that we’ve released the latest stable NetworkManager, version 1.0.4. This is mainly a bugfix release, though it does have a couple of new features and larger changes. Some of those are:

  • Some configuration options can now be changed without restarting the NM daemon.  Those include the ‘dns’, ‘connectivity’, and ‘ignore-carrier’ settings.
  • Devices that have only an IPv6 link-local address are no longer assumed to be connected; by default whenever an interface is set “up” the kernel will assign an IPv6 link-local address which could theoretically be used for communication.  NetworkManager used to interpret this as the interface being available for network communication, while this is rarely what users want or expect.
  • Correct routing is now maintained when two interfaces of the same priority are connected to the same network.
  • udev rules can now be used to tell NetworkManager to manage or unmanage specific devices (a rough sketch follows after this list).
  • Connections with children (bridge, bond, team, etc) can now optionally bring up their slave interfaces when the master is started.
  • Many, many bugs and crashers have also been fixed in the core daemon, the libraries, and nmcli.
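
A rough sketch of the udev hook mentioned above, assuming the NM_UNMANAGED udev property that NetworkManager documents for this purpose; the rule file name and the MAC address are placeholders:

# Drop a rule that marks one specific NIC as unmanaged, then reload udev rules
cat > /etc/udev/rules.d/99-unmanaged-example.rules <<'EOF'
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="aa:bb:cc:dd:ee:ff", ENV{NM_UNMANAGED}="1"
EOF
udevadm control --reload-rules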

Grab the 1.0.4 release here!

We’re well into the development cycle of NetworkManager 1.2 as well, where among other things, we’ll finally be moving to GDBus instead of dbus-glib.  We’ll also have support for structured logging with journald, indicating that certain connections are metered, advertising LLDP-discovered information, built-in IPv4 link-local addressing to get rid of avahi-autoipd, improvements to Wi-Fi Access Point scanning, less verbose logging by default, improvements to DNS handling, and more.  Look for it later this year!

 

July 12, 2015

Future development of Trinity.

It’s been an odd few weeks regarding Trinity based things.

First, an email from a higher-up at my former employer, asking (paraphrased):

"That thing we asked you to stop working on when you worked here, any chance now you've left you'll implement these features."

I’m still trying to get my head around the thought process that led to that being a reasonable thing to ask. I’ve made the occasional commit over the last six months, but it’s mostly been code motion, clean-up work, and things like syscall table updates. New feature development came to a halt long ago.

It’s no coincidence that the number of bugs found with Trinity has dropped off sharply since the beginning of the year, and I don’t think it’s because the Linux kernel suddenly got lots better. Rather, it’s due to the lack of real ongoing development to “try something else” when some approaches dry up. Sadly, we now live in a world where it’s easier to get paid to run someone else’s fuzzer than it is to develop one.

Then earlier this week, came the revelation that the only people prepared to fund that kind of new feature development are pretty much the worst people.

Apparently Hacking Team modified Trinity to fuzz ioctl() on Android, which yielded some results. I’ve done no analysis on whether those crashes are exploitable/fixed/only relevant to Android etc. (Frankly, I’m past caring). I’m not convinced their approach is particularly sound even if it was finding results Trinity wasn’t, so it looks unlikely there are even ideas to borrow here. (We all already knew that ioctl was rife with bugs, and had practically zero coverage testing).

It bothers me that my work was used as a foundation for their hack-job. Then again, maybe if I hadn’t released Trinity, they’d have based it on iknowthis, or some other less useful fuzzer. None of this really should surprise me. I’ve known for some time that there are some “security” people who have their own modifications they have no intention of sending my way. Thanks to the way that people who release 0-days are revered in this circus, there’s no incentive for people to share their modifications if it means that someone else might beat them to finding their precious bugs.

It’s unfortunate that this project has attracted so many awful people. When I began it, the motivation had nothing to do with security. Back in 2010 we were inundated with weird oopses that we couldn’t reproduce, many of them triggered by JVMs. I came up with the idea that maybe a fuzzer could create a realistic enough workload to tickle some of those same bugs. Turned out I was right, and so began a series of huge-page and other VM-related bug fixes.

In the five years that I’ve made Trinity available, I’ve received notable contributions from perhaps a half dozen people. In return I’ve made my changes available before I’d even given them runtime myself.

It’s a project everyone wants to take from, but no-one wants to give back to.

And that’s why for the foreseeable future, I’m unlikely to make public any further feature work I do on it.
I’m done enabling assholes.

July 02, 2015

Time for a new GPG key

My GPG key has lasted me well, over 18 years, but it's a v2 key and therefore no longer supported by newer versions of GnuPG. So it's time to move to a new one. I've made a transition statement available. If you signed my old key please consider signing the new one.

June 24, 2015

xdg-app moving to freedesktop.org

For anyone following the development of xdg-app, all development has now moved to freedesktop.org. Here is where things are happening now:

June 18, 2015

The new sd-bus API of systemd

With the new v221 release of systemd we are declaring the sd-bus API shipped with systemd stable. sd-bus is our minimal D-Bus IPC C library, supporting as back-ends both classic socket-based D-Bus and kdbus. The library has been part of systemd for a while, but has only been used internally, since we wanted the liberty to still make API changes without affecting external consumers of the library. However, now we are confident enough to commit to a stable API for it, starting with v221.

In this blog story I hope to provide you with a quick overview of sd-bus, a short reiteration of D-Bus and its concepts, as well as a few simple examples of how to write D-Bus clients and services with it.

What is D-Bus again?

Let's start with a quick reminder what D-Bus actually is: it's a powerful, generic IPC system for Linux and other operating systems. It knows concepts like buses, objects, interfaces, methods, signals, properties. It provides you with fine-grained access control, a rich type system, discoverability, introspection, monitoring, reliable multicasting, service activation, file descriptor passing, and more. There are bindings for numerous programming languages that are used on Linux.

D-Bus has been a core component of Linux systems for more than 10 years. It is certainly the most widely established high-level local IPC system on Linux. Since systemd's inception it has been the IPC system on which systemd exposes its interfaces. And even before systemd, it was the IPC system Upstart used to expose its interfaces. It is used by GNOME, by KDE and by a variety of system components.

D-Bus refers to both a specification, and a reference implementation. The reference implementation provides both a bus server component, as well as a client library. While there are multiple other, popular reimplementations of the client library – for both C and other programming languages –, the only commonly used server side is the one from the reference implementation. (However, the kdbus project is working on providing an alternative to this server implementation as a kernel component.)

D-Bus is mostly used as local IPC, on top of AF_UNIX sockets. However, the protocol may be used on top of TCP/IP as well. It does not natively support encryption, hence using D-Bus directly on TCP is usually not a good idea. It is possible to combine D-Bus with a transport like ssh in order to secure it. systemd uses this to make many of its APIs accessible remotely.

A frequently asked question about D-Bus is why it exists at all, given that AF_UNIX sockets and FIFOs already exist on UNIX and have been used for a long time successfully. To answer this question let's make a comparison with popular web technology of today: what AF_UNIX/FIFOs are to D-Bus, TCP is to HTTP/REST. While AF_UNIX sockets/FIFOs only shovel raw bytes between processes, D-Bus defines actual message encoding and adds concepts like method call transactions, an object system, security mechanisms, multicasting and more.

From our 10+ years of experience with D-Bus we know today that while there are some areas where we can improve things (and we are working on that, both with kdbus and sd-bus), it generally appears to be a very well designed system that has stood the test of time, aged well and is widely established. Today, if we'd sit down and design a completely new IPC system incorporating all the experience and knowledge we gained with D-Bus, I am sure the result would be very close to what D-Bus already is.

Or in short: D-Bus is great. If you hack on a Linux project and need a local IPC, it should be your first choice. Not only because D-Bus is well designed, but also because there aren't many alternatives that can cover similar functionality.

Where does sd-bus fit in?

Let's discuss why sd-bus exists, how it compares with the other existing C D-Bus libraries and why it might be a library to consider for your project.

For C, there are two established, popular D-Bus libraries: libdbus, as it is shipped in the reference implementation of D-Bus, as well as GDBus, a component of GLib, the low-level tool library of GNOME.

Of the two libdbus is the much older one, as it was written at the time the specification was put together. The library was written with a focus on being portable and to be useful as back-end for higher-level language bindings. Both of these goals required the API to be very generic, resulting in a relatively baroque, hard-to-use API that lacks the bits that make it easy and fun to use from C. It provides the building blocks, but few tools to actually make it straightforward to build a house from them. On the other hand, the library is suitable for most use-cases (for example, it is OOM-safe making it suitable for writing lowest level system software), and is portable to operating systems like Windows or more exotic UNIXes.

GDBus is a much newer implementation. It has been written after considerable experience with using a GLib/GObject wrapper around libdbus. GDBus is implemented from scratch and shares no code with libdbus. Its design differs substantially from libdbus: it contains code generators to make it specifically easy to expose GObject objects on the bus, or to talk to D-Bus objects as GObject objects. It translates D-Bus data types to GVariant, which is GLib's powerful data serialization format. If you are used to GLib-style programming then you'll feel right at home; hacking D-Bus services and clients with it is a lot simpler than using libdbus.

With sd-bus we now provide a third implementation, sharing no code with either libdbus or GDBus. For us, the focus was on providing kind of a middle ground between libdbus and GDBus: a low-level C library that actually is fun to work with, that has enough syntactic sugar to make it easy to write clients and services with, but on the other hand is more low-level than GDBus/GLib/GObject/GVariant. To be able to use it in systemd's various system-level components it needed to be OOM-safe and minimal. Another major point we wanted to focus on was supporting a kdbus back-end right from the beginning, in addition to the socket transport of the original D-Bus specification ("dbus1"). In fact, we wanted to design the library closer to kdbus' semantics than to dbus1's, wherever they are different, but still cover both transports nicely. In contrast to libdbus or GDBus portability is not a priority for sd-bus, instead we try to make the best of the Linux platform and expose specific Linux concepts wherever that is beneficial. Finally, performance was also an issue (though a secondary one): neither libdbus nor GDBus will win any speed records. We wanted to improve on performance (throughput and latency) -- but simplicity and correctness are more important to us. We believe the result of our work delivers our goals quite nicely: the library is fun to use, supports kdbus and sockets as back-end, is relatively minimal, and the performance is substantially better than both libdbus and GDBus.

To decide which of the three APIs to use for your C project, here are short guidelines:

  • If you hack on a GLib/GObject project, GDBus is definitely your first choice.

  • If portability to non-Linux kernels -- including Windows, Mac OS and other UNIXes -- is important to you, use either GDBus (which more or less means buying into GLib/GObject) or libdbus (which requires a lot of manual work).

  • Otherwise, sd-bus would be my recommended choice.

(I am not covering C++ specifically here, this is all about plain C only. But do note: if you use Qt, then QtDBus is the D-Bus API of choice, being a wrapper around libdbus.)

Introduction to D-Bus Concepts

To the uninitiated D-Bus usually appears to be a relatively opaque technology. It uses lots of concepts that appear unnecessarily complex and redundant on first sight. But actually, they make a lot of sense. Let's have a look:

  • A bus is where you look for IPC services. There are usually two kinds of buses: a system bus, of which there's exactly one per system, and which is where you'd look for system services; and a user bus, of which there's one per user, and which is where you'd look for user services, like the address book service or the mail program. (Originally, the user bus was actually a session bus -- so that you get multiple of them if you log in many times as the same user --, and on most setups it still is, but we are working on moving things to a true user bus, of which there is only one per user on a system, regardless how many times that user happens to log in.)

  • A service is a program that offers some IPC API on a bus. A service is identified by a name in reverse domain name notation. Thus, the org.freedesktop.NetworkManager service on the system bus is where NetworkManager's APIs are available and org.freedesktop.login1 on the system bus is where systemd-logind's APIs are exposed.

  • A client is a program that makes use of some IPC API on a bus. It talks to a service, monitors it and generally doesn't provide any services on its own. That said, lines are blurry and many services are also clients to other services. Frequently the term peer is used as a generalization to refer to either a service or a client.

  • An object path is an identifier for an object on a specific service. In a way this is comparable to a C pointer, since that's how you generally reference a C object, if you hack object-oriented programs in C. However, C pointers are just memory addresses, and passing memory addresses around to other processes would make little sense, since they of course refer to the address space of the service; the client couldn't make sense of them. Thus, the D-Bus designers came up with the object path concept, which is just a string that looks like a file system path. Example: /org/freedesktop/login1 is the object path of the 'manager' object of the org.freedesktop.login1 service (which, as we remember from above, is still the service systemd-logind exposes). Because object paths are structured like file system paths they can be neatly arranged in a tree, so that you end up with a veritable tree of objects. For example, you'll find all user sessions systemd-logind manages below the /org/freedesktop/login1/session sub-tree, for example called /org/freedesktop/login1/session/_7, /org/freedesktop/login1/session/_55 and so on. How services precisely label their objects and arrange them in a tree is completely up to the developers of the services.

  • Each object that is identified by an object path has one or more interfaces. An interface is a collection of signals, methods, and properties (collectively called members) that belong together. The concept of a D-Bus interface is actually pretty much identical to what you know from programming languages such as Java, which also know an interface concept. Which interfaces an object implements is up to the developers of the service. Interface names are in reverse domain name notation, much like service names. (Yes, that's admittedly confusing, in particular since it's pretty common for simpler services to reuse the service name string also as an interface name.) A couple of interfaces are standardized though and you'll find them available on many of the objects offered by the various services. Specifically, those are org.freedesktop.DBus.Introspectable, org.freedesktop.DBus.Peer and org.freedesktop.DBus.Properties.

  • An interface can contain methods. The word "method" is more or less just a fancy word for "function", and is a term used pretty much the same way in object-oriented languages such as Java. The most common interaction between D-Bus peers is that one peer invokes one of these methods on another peer and gets a reply. A D-Bus method takes a couple of parameters, and returns others. The parameters are transmitted in a type-safe way, and the type information is included in the introspection data you can query from each object. Usually, method names (and the other member types) follow a CamelCase syntax. For example, systemd-logind exposes an ActivateSession method on the org.freedesktop.login1.Manager interface that is available on the /org/freedesktop/login1 object of the org.freedesktop.login1 service.

  • A signature describes a set of parameters a function (or signal, property, see below) takes or returns. It's a series of characters that each encode one parameter by its type. The set of types available is pretty powerful. For example, there are simpler types like s for string, or u for 32bit integer, but also complex types such as as for an array of strings or a(sb) for an array of structures consisting of one string and one boolean each. See the D-Bus specification for the full explanation of the type system. The ActivateSession method mentioned above takes a single string as parameter (the parameter signature is hence s), and returns nothing (the return signature is hence the empty string). Of course, the signature can get a lot more complex, see below for more examples.

  • A signal is another member type that the D-Bus object system knows. Much like a method it has a signature. However, they serve different purposes. While in a method call a single client issues a request on a single service, and that service sends back a response to the client, signals are for general notification of peers. Services send them out when they want to tell one or more peers on the bus that something happened or changed. In contrast to method calls and their replies they are hence usually broadcast over a bus. While method calls/replies are used for duplex one-to-one communication, signals are usually used for simplex one-to-many communication (note however that that's not a requirement, they can also be used one-to-one). Example: systemd-logind broadcasts a SessionNew signal from its manager object each time a user logs in, and a SessionRemoved signal every time a user logs out.

  • A property is the third member type that the D-Bus object system knows. It's similar to the property concept known by languages like C#. Properties also have a signature, and are more or less just variables that an object exposes, that can be read or altered by clients. Example: systemd-logind exposes a property Docked of the signature b (a boolean). It reflects whether systemd-logind thinks the system is currently in a docking station of some form (only applies to laptops …).
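
As a concrete illustration of reading such a property, here is roughly what it looks like with the busctl tool that is introduced in the shell section below (busctl's get-property verb; the value printed is only illustrative and will depend on your machine):

$ busctl get-property org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager Docked
b false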

So much for the various concepts D-Bus knows. Of course, all these new concepts might be overwhelming. Let's look at them from a different perspective. I assume many of the readers have an understanding of today's web technology, specifically HTTP and REST. Let's try to compare the concept of a HTTP request with the concept of a D-Bus method call:

  • An HTTP request is something you issue on a specific network. It could be the Internet, or it could be your local LAN, or a company VPN. Depending on which network you issue the request on, you'll be able to talk to a different set of servers. This is not unlike the "bus" concept of D-Bus.

  • On the network you then pick a specific HTTP server to talk to. That's roughly comparable to picking a service on a specific bus.

  • On the HTTP server you then ask for a specific URL. The "path" part of the URL (by which I mean everything after the host name of the server, up to the last "/") is pretty similar to a D-Bus object path.

  • The "file" part of the URL (by which I mean everything after the last slash, following the path, as described above), then defines the actual call to make. In D-Bus this could be mapped to an interface and method name.

  • Finally, the parameters of a HTTP call follow the path after the "?", they map to the signature of the D-Bus call.

Of course, comparing an HTTP request to a D-Bus method call is a bit like comparing apples and oranges. However, I think it's still useful to get a bit of a feeling of what maps to what.

From the shell

So much for the concepts and the gray theory behind them. Let's make this exciting and actually see how this feels on a real system.

For a while now, systemd has included a tool called busctl that is useful for exploring and interacting with the D-Bus object system. When invoked without parameters, it will show you a list of all peers connected to the system bus. (Use --user to see the peers of your user bus instead):

$ busctl
NAME                                       PID PROCESS         USER             CONNECTION    UNIT                      SESSION    DESCRIPTION
:1.1                                         1 systemd         root             :1.1          -                         -          -
:1.11                                      705 NetworkManager  root             :1.11         NetworkManager.service    -          -
:1.14                                      744 gdm             root             :1.14         gdm.service               -          -
:1.4                                       708 systemd-logind  root             :1.4          systemd-logind.service    -          -
:1.7200                                  17563 busctl          lennart          :1.7200       session-1.scope           1          -
[…]
org.freedesktop.NetworkManager             705 NetworkManager  root             :1.11         NetworkManager.service    -          -
org.freedesktop.login1                     708 systemd-logind  root             :1.4          systemd-logind.service    -          -
org.freedesktop.systemd1                     1 systemd         root             :1.1          -                         -          -
org.gnome.DisplayManager                   744 gdm             root             :1.14         gdm.service               -          -
[…]

(I have shortened the output a bit to keep things brief.)

The list begins with a list of all peers currently connected to the bus. They are identified by peer names like ":1.11". These are called unique names in D-Bus nomenclature. Basically, every peer has a unique name, and they are assigned automatically when a peer connects to the bus. They are much like an IP address if you so will. You'll notice that a couple of peers are already connected, including our little busctl tool itself as well as a number of system services. The list then shows all actual services on the bus, identified by their service names (as discussed above; to discern them from the unique names these are also called well-known names). In many ways well-known names are similar to DNS host names, i.e. they are a friendlier way to reference a peer, but on the lower level they just map to an IP address, or in this comparison the unique name. Much like you can connect to a host on the Internet by either its host name or its IP address, you can also connect to a bus peer either by its unique or its well-known name. (Note that each peer can have as many well-known names as it likes, much like an IP address can have multiple host names referring to it).
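
To get a feeling for how a well-known name is used in practice, here is a rough example of calling a method on systemd-logind by that name. ListSessions is assumed here as a convenient parameterless method on the manager object (it is not shown in the listings above), and the reply is merely illustrative, matching the single session from the earlier output:

$ busctl call org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager ListSessions
a(susso) 1 "1" 1000 "lennart" "seat0" "/org/freedesktop/login1/session/_31"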

OK, that's already kinda cool. Try it for yourself, on your local machine (all you need is a recent, systemd-based distribution).

Let's now go the next step. Let's see which objects the org.freedesktop.login1 service actually offers:

$ busctl tree org.freedesktop.login1
└─/org/freedesktop/login1
  ├─/org/freedesktop/login1/seat
  │ ├─/org/freedesktop/login1/seat/seat0
  │ └─/org/freedesktop/login1/seat/self
  ├─/org/freedesktop/login1/session
  │ ├─/org/freedesktop/login1/session/_31
  │ └─/org/freedesktop/login1/session/self
  └─/org/freedesktop/login1/user
    ├─/org/freedesktop/login1/user/_1000
    └─/org/freedesktop/login1/user/self

Pretty, isn't it? What's even nicer, though the output does not show it, is that there's full command line completion available: as you press TAB, the shell will auto-complete the service names for you. It's a real pleasure to explore your D-Bus objects that way!

The output shows some objects that you might recognize from the explanations above. Now, let's go further. Let's see what interfaces, methods, signals and properties one of these objects actually exposes:

$ busctl introspect org.freedesktop.login1 /org/freedesktop/login1/session/_31
NAME                                TYPE      SIGNATURE RESULT/VALUE                             FLAGS
org.freedesktop.DBus.Introspectable interface -         -                                        -
.Introspect                         method    -         s                                        -
org.freedesktop.DBus.Peer           interface -         -                                        -
.GetMachineId                       method    -         s                                        -
.Ping                               method    -         -                                        -
org.freedesktop.DBus.Properties     interface -         -                                        -
.Get                                method    ss        v                                        -
.GetAll                             method    s         a{sv}                                    -
.Set                                method    ssv       -                                        -
.PropertiesChanged                  signal    sa{sv}as  -                                        -
org.freedesktop.login1.Session      interface -         -                                        -
.Activate                           method    -         -                                        -
.Kill                               method    si        -                                        -
.Lock                               method    -         -                                        -
.PauseDeviceComplete                method    uu        -                                        -
.ReleaseControl                     method    -         -                                        -
.ReleaseDevice                      method    uu        -                                        -
.SetIdleHint                        method    b         -                                        -
.TakeControl                        method    b         -                                        -
.TakeDevice                         method    uu        hb                                       -
.Terminate                          method    -         -                                        -
.Unlock                             method    -         -                                        -
.Active                             property  b         true                                     emits-change
.Audit                              property  u         1                                        const
.Class                              property  s         "user"                                   const
.Desktop                            property  s         ""                                       const
.Display                            property  s         ""                                       const
.Id                                 property  s         "1"                                      const
.IdleHint                           property  b         true                                     emits-change
.IdleSinceHint                      property  t         1434494624206001                         emits-change
.IdleSinceHintMonotonic             property  t         0                                        emits-change
.Leader                             property  u         762                                      const
.Name                               property  s         "lennart"                                const
.Remote                             property  b         false                                    const
.RemoteHost                         property  s         ""                                       const
.RemoteUser                         property  s         ""                                       const
.Scope                              property  s         "session-1.scope"                        const
.Seat                               property  (so)      "seat0" "/org/freedesktop/login1/seat... const
.Service                            property  s         "gdm-autologin"                          const
.State                              property  s         "active"                                 -
.TTY                                property  s         "/dev/tty1"                              const
.Timestamp                          property  t         1434494630344367                         const
.TimestampMonotonic                 property  t         34814579                                 const
.Type                               property  s         "x11"                                    const
.User                               property  (uo)      1000 "/org/freedesktop/login1/user/_1... const
.VTNr                               property  u         1                                        const
.Lock                               signal    -         -                                        -
.PauseDevice                        signal    uus       -                                        -
.ResumeDevice                       signal    uuh       -                                        -
.Unlock                             signal    -         -                                        -

As before, the busctl command supports command line completion, hence both the service name and the object path used are easily put together on the shell simply by pressing TAB. The output shows the methods, properties and signals of one of the session objects currently made available by systemd-logind. There's a section for each interface the object implements. The second column tells you what kind of member is shown in the line. The third column shows the signature of the member; in the case of method calls that's the input parameters, and the fourth column shows what is returned. For properties, the fourth column shows their current value.

So far, we have only explored. Let's take the next step now: let's become active - let's call a method:

# busctl call org.freedesktop.login1 /org/freedesktop/login1/session/_31 org.freedesktop.login1.Session Lock

I don't think I need to mention this anymore, but anyway: again there's full command line completion available. The third argument is the interface name, the fourth the method name, and both can be easily completed by pressing TAB. In this case we picked the Lock method, which activates the screen lock for the specific session. And yupp, the instant I pressed enter on this line my screen lock turned on (this only works on DEs that correctly hook into systemd-logind; GNOME works fine, and KDE should work too).

The Lock method call we picked is very simple, as it takes no parameters and returns none. Of course, it can get more complicated for some calls. Here's another example, this time using one of systemd's own bus calls, to start an arbitrary system unit:

# busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager StartUnit ss "cups.service" "replace"
o "/org/freedesktop/systemd1/job/42684"

This call takes two strings as input parameters, as we denote in the signature string that follows the method name (as usual, command line completion helps you get this right). Following the signature, the next two parameters are simply the two strings to pass. The specified signature string hence indicates what comes next. systemd's StartUnit method call takes the unit name to start as the first parameter, and the mode in which to start it as the second. The call returned a single object path value. It is encoded the same way as the input parameters: a signature (just o for the object path) followed by the actual value.

Of course, some method call parameters can get a ton more complex, but with busctl it's relatively easy to encode them all. See the man page for details.
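
To give a rough idea what that looks like: arrays are encoded by passing the number of entries first, followed by the entries themselves, and variants by passing the signature of the contents followed by the value. A purely illustrative example, using systemd's SetUnitProperties call whose signature is "sba(sv)", might hence look like this (do check the signature with busctl introspect before trying something similar):

# busctl call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager SetUnitProperties "sba(sv)" "cups.service" true 1 "CPUShares" "t" 512

Here "cups.service" and true fill in the s and b parameters, and the trailing 1 "CPUShares" "t" 512 encodes an a(sv) array with a single entry: the property name, the variant's signature, and its value.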

busctl knows a number of other operations. For example, you can use it to monitor D-Bus traffic as it happens (including generating a .cap file for use with Wireshark!) or you can set or get specific properties. However, this blog story was supposed to be about sd-bus, not busctl, hence let's cut this short here, and let me direct you to the man page in case you want to know more about the tool.
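
Just as a quick taste before we move on, reading a single property is a one-liner. Here's the Active property of the session object we introspected above (the output uses the same signature-plus-value encoding as method call results):

$ busctl get-property org.freedesktop.login1 /org/freedesktop/login1/session/_31 org.freedesktop.login1.Session Active
b true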

busctl (like the rest of systemd) is implemented using the sd-bus API. Thus it exposes many of the features of sd-bus itself. For example, you can use it to connect to remote or container buses. It understands both kdbus and classic D-Bus, and more!

sd-bus

But enough! Let's get back on topic, let's talk about sd-bus itself.

The sd-bus set of APIs is mostly contained in the header file sd-bus.h.

Here's a random selection of features of the library that make it compare well with the other implementations available.

  • Supports both kdbus and dbus1 as back-end.

  • Has high-level support for connecting to remote buses via ssh, and to buses of local OS containers.

  • Powerful credential model, to implement authentication of clients in services. Currently 34 individual fields are supported, from the PID of the client to the cgroup or capability sets.

  • Support for tracking the life-cycle of peers in order to release local objects automatically when all peers referencing them have disconnected.

  • The client builds an efficient decision tree to determine which handlers to deliver an incoming bus message to.

  • Automatically translates D-Bus errors into UNIX style errors and back (this is lossy though), to ensure best integration of D-Bus into low-level Linux programs.

  • Powerful but lightweight object model for exposing local objects on the bus. Automatically generates introspection as necessary.

The API is currently not fully documented, but we are working on completing the set of manual pages. For details see all pages starting with sd_bus_.

Invoking a Method, from C, with sd-bus

So much about the library in general. Here's an example for connecting to the bus and issuing a method call:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <systemd/sd-bus.h>

int main(int argc, char *argv[]) {
        sd_bus_error error = SD_BUS_ERROR_NULL;
        sd_bus_message *m = NULL;
        sd_bus *bus = NULL;
        const char *path;
        int r;

        /* Connect to the system bus */
        r = sd_bus_open_system(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect to system bus: %s\n", strerror(-r));
                goto finish;
        }

        /* Issue the method call and store the response message in m */
        r = sd_bus_call_method(bus,
                               "org.freedesktop.systemd1",           /* service to contact */
                               "/org/freedesktop/systemd1",          /* object path */
                               "org.freedesktop.systemd1.Manager",   /* interface name */
                               "StartUnit",                          /* method name */
                               &error,                               /* object to return error in */
                               &m,                                   /* return message on success */
                               "ss",                                 /* input signature */
                               "cups.service",                       /* first argument */
                               "replace");                           /* second argument */
        if (r < 0) {
                fprintf(stderr, "Failed to issue method call: %s\n", error.message);
                goto finish;
        }

        /* Parse the response message */
        r = sd_bus_message_read(m, "o", &path);
        if (r < 0) {
                fprintf(stderr, "Failed to parse response message: %s\n", strerror(-r));
                goto finish;
        }

        printf("Queued service job as %s.\n", path);

finish:
        sd_bus_error_free(&error);
        sd_bus_message_unref(m);
        sd_bus_unref(bus);

        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}

Save this example as bus-client.c, then build it with:

$ gcc bus-client.c -o bus-client `pkg-config --cflags --libs libsystemd`

This will generate a binary bus-client you can now run. Make sure to run it as root though, since access to the StartUnit method is privileged:

# ./bus-client
Queued service job as /org/freedesktop/systemd1/job/3586.

And that's it already, our first example. It showed how we invoked a method call on the bus. The actual function call of the method is very close to the busctl command line we used before. I hope the code excerpt needs little further explanation. It's supposed to give you a taste of how to write D-Bus clients with sd-bus. For more information please have a look at the header file, the man page or even the sd-bus sources.
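
One more thing worth knowing at this point: for simple cases like reading a single property you don't even have to construct and parse messages yourself, as sd-bus ships convenience calls such as sd_bus_get_property_string(). Here's a minimal sketch of how that could look, reading the Id property of the logind session object from the busctl examples above (the session path will differ on your system, and the returned string needs to be freed):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <systemd/sd-bus.h>

int main(int argc, char *argv[]) {
        sd_bus_error error = SD_BUS_ERROR_NULL;
        sd_bus *bus = NULL;
        char *id = NULL;
        int r;

        /* Connect to the system bus */
        r = sd_bus_open_system(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect to system bus: %s\n", strerror(-r));
                goto finish;
        }

        /* Read the "Id" property of the session object as a string */
        r = sd_bus_get_property_string(bus,
                                       "org.freedesktop.login1",              /* service to contact */
                                       "/org/freedesktop/login1/session/_31", /* object path */
                                       "org.freedesktop.login1.Session",      /* interface name */
                                       "Id",                                  /* property name */
                                       &error,
                                       &id);
        if (r < 0) {
                fprintf(stderr, "Failed to get property: %s\n", error.message);
                goto finish;
        }

        printf("Session Id is %s.\n", id);

finish:
        free(id);
        sd_bus_error_free(&error);
        sd_bus_unref(bus);

        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}

It builds the same way as bus-client.c above.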

Implementing a Service, in C, with sd-bus

Of course, just calling a single method is a rather simplistic example. Let's have a look at how to write a bus service. We'll write a small calculator service that exposes a single object, which implements an interface with two methods: one to multiply two 64bit signed integers, and one to divide one 64bit signed integer by another.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <systemd/sd-bus.h>

static int method_multiply(sd_bus_message *m, void *userdata, sd_bus_error *ret_error) {
        int64_t x, y;
        int r;

        /* Read the parameters */
        r = sd_bus_message_read(m, "xx", &x, &y);
        if (r < 0) {
                fprintf(stderr, "Failed to parse parameters: %s\n", strerror(-r));
                return r;
        }

        /* Reply with the response */
        return sd_bus_reply_method_return(m, "x", x * y);
}

static int method_divide(sd_bus_message *m, void *userdata, sd_bus_error *ret_error) {
        int64_t x, y;
        int r;

        /* Read the parameters */
        r = sd_bus_message_read(m, "xx", &x, &y);
        if (r < 0) {
                fprintf(stderr, "Failed to parse parameters: %s\n", strerror(-r));
                return r;
        }

        /* Return an error on division by zero */
        if (y == 0) {
                sd_bus_error_set_const(ret_error, "net.poettering.DivisionByZero", "Sorry, can't allow division by zero.");
                return -EINVAL;
        }

        return sd_bus_reply_method_return(m, "x", x / y);
}

/* The vtable of our little object, implements the net.poettering.Calculator interface */
static const sd_bus_vtable calculator_vtable[] = {
        SD_BUS_VTABLE_START(0),
        SD_BUS_METHOD("Multiply", "xx", "x", method_multiply, SD_BUS_VTABLE_UNPRIVILEGED),
        SD_BUS_METHOD("Divide",   "xx", "x", method_divide,   SD_BUS_VTABLE_UNPRIVILEGED),
        SD_BUS_VTABLE_END
};

int main(int argc, char *argv[]) {
        sd_bus_slot *slot = NULL;
        sd_bus *bus = NULL;
        int r;

        /* Connect to the user bus this time */
        r = sd_bus_open_user(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect to system bus: %s\n", strerror(-r));
                goto finish;
        }

        /* Install the object */
        r = sd_bus_add_object_vtable(bus,
                                     &slot,
                                     "/net/poettering/Calculator",  /* object path */
                                     "net.poettering.Calculator",   /* interface name */
                                     calculator_vtable,
                                     NULL);
        if (r < 0) {
                fprintf(stderr, "Failed to issue method call: %s\n", strerror(-r));
                goto finish;
        }

        /* Take a well-known service name so that clients can find us */
        r = sd_bus_request_name(bus, "net.poettering.Calculator", 0);
        if (r < 0) {
                fprintf(stderr, "Failed to acquire service name: %s\n", strerror(-r));
                goto finish;
        }

        for (;;) {
                /* Process requests */
                r = sd_bus_process(bus, NULL);
                if (r < 0) {
                        fprintf(stderr, "Failed to process bus: %s\n", strerror(-r));
                        goto finish;
                }
                if (r > 0) /* we processed a request, try to process another one, right-away */
                        continue;

                /* Wait for the next request to process */
                r = sd_bus_wait(bus, (uint64_t) -1);
                if (r < 0) {
                        fprintf(stderr, "Failed to wait on bus: %s\n", strerror(-r));
                        goto finish;
                }
        }

finish:
        sd_bus_slot_unref(slot);
        sd_bus_unref(bus);

        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}

Save this example as bus-service.c, then build it with:

$ gcc bus-service.c -o bus-service `pkg-config --cflags --libs libsystemd`

Now, let's run it:

$ ./bus-service

In another terminal, let's try to talk to it. Note that this service is now on the user bus, not on the system bus as before. We do this for simplicity's sake: on the system bus access to services is tightly controlled, so unprivileged clients cannot request privileged operations. On the user bus, however, things are simpler: as only processes of the user owning the bus can connect, no further policy enforcement will complicate this example. Because the service is on the user bus, we have to pass the --user switch on the busctl command line. Let's start by looking at the service's object tree.

$ busctl --user tree net.poettering.Calculator
└─/net/poettering/Calculator

As we can see, there's only a single object on the service, which is not surprising, given that our code above only registered one. Let's see the interfaces and the members this object exposes:

$ busctl --user introspect net.poettering.Calculator /net/poettering/Calculator
NAME                                TYPE      SIGNATURE RESULT/VALUE FLAGS
net.poettering.Calculator           interface -         -            -
.Divide                             method    xx        x            -
.Multiply                           method    xx        x            -
org.freedesktop.DBus.Introspectable interface -         -            -
.Introspect                         method    -         s            -
org.freedesktop.DBus.Peer           interface -         -            -
.GetMachineId                       method    -         s            -
.Ping                               method    -         -            -
org.freedesktop.DBus.Properties     interface -         -            -
.Get                                method    ss        v            -
.GetAll                             method    s         a{sv}        -
.Set                                method    ssv       -            -
.PropertiesChanged                  signal    sa{sv}as  -            -

The sd-bus library automatically added a couple of generic interfaces, as mentioned above. But the first interface we see is actually the one we added! It shows our two methods, and both take "xx" (two 64bit signed integers) as input parameters, and return one "x". Great! But does it work?

$ busctl --user call net.poettering.Calculator /net/poettering/Calculator net.poettering.Calculator Multiply xx 5 7
x 35

Woohoo! We passed the two integers 5 and 7, and the service actually multiplied them for us and returned a single integer 35! Let's try the other method:

$ busctl --user call net.poettering.Calculator /net/poettering/Calculator net.poettering.Calculator Divide xx 99 17
x 5

Oh, wow! It can even do integer division! Fantastic! But let's trick it into dividing by zero:

$ busctl --user call net.poettering.Calculator /net/poettering/Calculator net.poettering.Calculator Divide xx 43 0
Sorry, can't allow division by zero.

Nice! It detected this nicely and returned a clean error about it. If you look at the source code example above you'll see precisely how we generated the error.
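
By the way, talking to this little service from C looks just like the first client example, only with the user bus and our own names and signatures plugged in. Here's a minimal sketch of such a client, assuming the calculator service above is running on your user bus:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <systemd/sd-bus.h>

int main(int argc, char *argv[]) {
        sd_bus_error error = SD_BUS_ERROR_NULL;
        sd_bus_message *m = NULL;
        sd_bus *bus = NULL;
        int64_t result;
        int r;

        /* The calculator service lives on the user bus */
        r = sd_bus_open_user(&bus);
        if (r < 0) {
                fprintf(stderr, "Failed to connect to user bus: %s\n", strerror(-r));
                goto finish;
        }

        /* Call Multiply(5, 7) on the calculator object */
        r = sd_bus_call_method(bus,
                               "net.poettering.Calculator",   /* service to contact */
                               "/net/poettering/Calculator",  /* object path */
                               "net.poettering.Calculator",   /* interface name */
                               "Multiply",                    /* method name */
                               &error,
                               &m,
                               "xx",                          /* input signature */
                               (int64_t) 5,
                               (int64_t) 7);
        if (r < 0) {
                fprintf(stderr, "Failed to issue method call: %s\n", error.message);
                goto finish;
        }

        /* The reply carries a single 64bit signed integer */
        r = sd_bus_message_read(m, "x", &result);
        if (r < 0) {
                fprintf(stderr, "Failed to parse response message: %s\n", strerror(-r));
                goto finish;
        }

        printf("5 * 7 = %lld\n", (long long) result);

finish:
        sd_bus_error_free(&error);
        sd_bus_message_unref(m);
        sd_bus_unref(bus);

        return r < 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}

Build it like the earlier examples with pkg-config --cflags --libs libsystemd, and run it as your regular user while the service is up.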

And that's really all I have for today. Of course, the examples I showed are short, and I don't get into detail here on what precisely each line does. However, this is supposed to be a short introduction into D-Bus and sd-bus, and it's already way too long for that …

I hope this blog story was useful to you. If you are interested in using sd-bus for your own programs, I hope this gets you started. If you have further questions, check the (incomplete) man pages, and ask us on IRC or the systemd mailing list. If you need more examples, have a look at the systemd source tree: all of systemd's many bus services use sd-bus extensively.

June 17, 2015

Testing rawhide apps using xdg-app

An important aspect of xdg-app is application sandboxing, which will require application changes to use sandbox-specific APIs. However, xdg-app is also a good way to deploy and run non-sandboxed (or partially sandboxed) regular applications.

A very interesting use case for this is to have an image-based operating system, for instance a Workstation spin of Fedora Atomic. Such a system would have a basic workstation installation with a read-only /usr, and atomic updates/rollback. However, installing an application is painful, and customizing your install in that way undoes many of the advantages of an image-based OS.

With xdg-app you can install apps into /var (or $HOME) and have them fully integrate with the system, while still being isolated from changes to the host. This makes for a great combination, just like atomic + docker is a good combination for the server space.

I’ve spent some time recently making a prototype runtime based on the Fedora packages, as reported on the desktop list. This is kind of interesting as it lets you test applications from rawhide on fedora 21 or 22. Just install xdg-app from fedora-updates and then install the runtime:

$ xdg-app add-remote --no-gpg-verify --user fedora http://fedorapeople.org/~alexl/repo/
$ xdg-app install-runtime --user fedora org.fedoraproject.Platform 23

And then you can try gedit 3.17.0:

$ xdg-app install-app --user fedora org.gnome.gedit
$ xdg-app run org.gnome.gedit

Or evince 3.17.2:

$ xdg-app install-app --user fedora org.gnome.evince
$ xdg-app run org.gnome.evince

Once installed, you can also just start them from the desktop environment as usual. They should show up like any regular application, as the desktop files and icons are exported to the host.

June 11, 2015

Rich and comments

Rich Jones posted an article about being banned by Boing-Boing, supposedly for bringing attention to their use of affiliate links (the practice that Gamergate groups criticized as well — and scored a regulatory win against). Meanwhile, all my comments at Rich's blog are blackholed, which is quite ironic. Generally, I am not into this "blog comment" thing. Ani-nouto never had any comments and is doing great that way. But some people like comments, so I leave them as necessary.

June 06, 2015

What tools are changing our world next?

Quick brain dump after a bike ride home: free software took a huge leap in the late 90s and early 00s in large part because of non-ideological advantages that the rest of the world is now competing with or surpassing:

HDR automatically created by Google Photos from my old pictures of Muir Woods. Not perfect, but better than I ever bothered to do!
  • Collaboration tools: Because we got to the ‘net first, our tools for collaborating with each other were simply better than what proprietary developers were doing: cvs, mailman, wiki, etc., were all better than the silo’d old-school tools. Modern best-of-breed collaboration tools have all learned from what we did and added proprietary sauce on top: github, slack, Google Docs, etc. So our tools are now (at best) as productive as our proprietary counterparts, and sometimes less productive but ideologically agreeable.
  • Release processes: “Release early/release often” made us better partners for our users. We’re now actively behind here: compare how often a mobile app or web user gets updates, exactly as the author intended, relative to a user of a modern Linux distro.
  • Zero cost: We did things for no (direct) cost by subsidizing our work through college, startups, or consulting gigs; now everyone has a subsidize-by-selling-something-else model (usually advertising, though sometimes freemium). Again, advantage (mostly?) lost.
  • Knowing our users: We knew a lot about our users, because we were our biggest users, and we talked to other users a lot; this was more effective than what passed for software design in the late 90s. This has been eclipsed by extensive a/b testing throughout the industry, and (to a lesser extent) by more extensive usage of direct user testing and design-thinking.

None of these are terribly original observations – all of these have been remarked on before. But after playing some with Google Photos this weekend, I’m ready to add another one to the list:

It's worth asking what your project is doing that could be radically changed if your competitors get access to new technology. For example, for Wikipedia:

  • Collaborating: Wiki was best-of-breed (or close); it isn’t anymore. Visual Editor helps get editing back to par, but the social aspect of collaboration is still lacking relative to the expectations of many users.
  • Knowledge creation: big groups of humans, working together wiki-style, is the state of the art for creating useful, non-BS knowledge at scale. With the aforementioned machine learning, I suspect this will no longer be the case in a (growing) number of domains.

I’m sure there are others…

May 29, 2015

Cool hardware in Vancouver

There wasn't much, but more than in Atlanta. The most "pro" looking kit was presented by NEC: basically a bladeserver, but the "blades" are SBCs, each of them accompanied by a dedicated drive card. I can see downsides of this design, but very cute.

Unfortunately, they only offer CPU cards based on Atom. No ARM or anything.

The only other interesting booth belonged to StackVelocity, a subsidiary of JB Circuits that does custom design.

I'm sorry to say, their wares looked decidedly pedestrian, which is to be expected: their sales point is low cost, and stuff of that nature underpins the modern datacenter. One curious thing, however, is the variety of flash cards they offer. Basically Fusion-IO on a budget. One was particularly tricky, with 2 layers. At first I even thought it could have flash chips mounted sideways, but nope, the science of low-cost computing is not there yet.

P.S. NEC also sells the same chassis with CPU cards instead of drive cards under the index "DX1000".

Semi-hard numbers from Rackspace

Previously in hard numbers: China, Wikimedia, Amazon S3. Rackspace previously reported in creiht's preso 18 months ago. This time, scotty went public at the Vancouver (Liberty) summit with the following:

> 50 billion objects
> 100 PB data (sanitized number, but way higher than 85 PB)
= 6 global clusters
3:1 PUT:GET ratio
10k+ requests/second

The number of objects is roughly 40 times less than in Amazon S3.

May 19, 2015

GDB Preattach

In firefox development, it’s normal to do most development tasks via the mach command. Build? Use mach. Update UUIDs? Use mach. Run tests? Use mach. Debug tests? Yes, mach mochitest --debugger gdb.

Now, normally I run gdb inside emacs, of course. But this is hard to do when I’m also using mach to set up the environment and invoke gdb.

This is really an Emacs bug. GUD, the Emacs interface to all kinds of debuggers, is written as its own mode, but there’s no really great reason for this. It would be way cooler to have an adaptive shell mode, where running the debugger in the shell would magically change the shell-ish buffer into a gud-ish buffer. And somebody — probably you! — should work on this.

But anyway this is hard and I am lazy. Well, sort of lazy and when I’m not lazy, also unfocused, since I came up with three other approaches to the basic problem. Trying stuff out and all. And these are even the principled ways, not crazy stuff like screenify.

Oh right, the basic problem.  The basic problem with running gdb from mach is that then you’re just stuck in the terminal. And unless you dig the TUI, which I don’t, terminal gdb is not that great to use.

One of the ideas, in fact the one this post is about, since this post isn’t about the one that I couldn’t get to work, or the one that is also pretty cool but that I’m not ready to talk about, was: hey, can’t I just attach gdb to the test firefox? Well, no, of course not, the test program runs too fast (sometimes) and racing to attach is no fun. What would be great is to be able to pre-attach — tell gdb to attach to the next instance of a given program.

This requires kernel support. Once upon a time there were some gdb and kernel patches (search for “global breakpoints”) to do this, but they were never merged. Though hmm! I can do some fun kernel stuff with SystemTap…

Specifically what I did was write a small SystemTap script to look for a specific exec, then deliver a SIGSTOP to the process. Then the script prints the PID of the process. On the gdb side, there’s a new command written in Python that invokes the SystemTap script, reads the PID, and invokes attach. It’s a bit hacky and a bit weird to use (the SIGSTOP appears in gdb to have been delivered multiple times or something like that). But it works!

It would be better to have this functionality directly in the kernel. Somebody — probably you! — should write this. But meanwhile my hack is available, along with a few other gdb scripts, in my gdb helpers github repository.

May 06, 2015

How Mitchell Baker made me divorce

Well, nearly did. Deleting history in Firefox 37 is very slow and the UI locks up while you do that. "Very slow" means an operation that takes 13 minutes (not exaggerating - it's reproducible). The UI lock-up means a non-dismissable context menu floating over everything; Firefox itself being, of course, entirely unresponsive. See the screencap.

The screencap is from Linux, where I confirmed the problem, but the story started on Windows, where my wife tried to tidy up a bit. So, when Firefox locked up, she killed it, and repeated the process a few times. And what else would you do? We are not talking about a hang of a few seconds - it literally was many minutes. Firefox did not pop up a dialog saying "Please wait, deleting 108,534 objects with separate SQLite transactions", with a progress gauge and a "Cancel" button. Instead, it simply appeared to lock up.

Interestingly enough, remember when Firefox had a default to keep the history for a week? That mode is gone now - FF keeps the history potentially forever. Instead, it imposes a technical limit: at most 108,534 entries are saved in the "Places" database, in order to prevent SQLite from eating all your storage. Now I understand why my brown "visited" links never go back to blue anymore.

The problem is, there's no alternative. I tried to use Midori as my main browser for a month or two in early 2014, but it was a horrible crash city. I had no choice but to give up and go back to Firefox and its case of Featuritis Obesum.

Come work with me – developer edition!

It has been a long time since I was able to say to developer friends “come work with me” in anything but the most abstract “come work under the same roof” kind of sense. But today I can say to developers “come work with me” and really mean it. Which is fun :)

By Supercarwaar (own work), CC BY-SA 3.0, via Wikimedia Commons
Details: Wikimedia’s new community tech team is hiring for a community tech developer and a team lead. This will be extremely community-intensive work, so if you enjoy and get energy from working with a community and helping them achieve their goals, this could be a great role for you. This team will work intensely with my department to ensure that we’re correctly identifying and prioritizing the needs of our most active editors. If that sounds like fun, get in touch :)

[And I realize that I’ve been bad and not posted here, so here’s my new job announce: “my department” is the Foundation’s new Community Engagement department, where we work to support healthy contributor communities and help WMF-community collaboration. It is a detour from law, but I’ve always said law was just a way to help people do their thing — so in that sense is the same thing I’ve always been doing. It has been an intense roller coaster of a first two months, and I look forward to much more of the same.]

May 05, 2015

Thoughts on a feedback loop for Trinity.

With the success that afl has been having on fuzzing userspace, I’ve been revisiting an idea that Andi Kleen gave me years ago for trinity, which was pretty much the same thing but for kernel space. I.e., a genetic algorithm that rates how successful the last fuzz attempt was, and makes a decision on whether to mutate that last run, or do something completely new.

It’s something I’ve struggled to get my head around for a few years. The mutation part would be fairly easy. We would need to store the parameters from the last run, and extrapolate out a set of ->mutate functions from the existing ->sanitize functions that currently generate arguments.

The difficult part is the “how successful” measurement. Typically, we don't really get anything useful back from a syscall other than "we didn't crash", which isn't particularly useful in this case. What we really want is "did we execute code that we've not previously tested". I've done some experiments with code coverage in the past. Explorations of the GCOV feature in the kernel didn't really get very far, however, for a few reasons (primarily that it really slowed things down too much, and also I was looking into this last summer, when the initial cracks were showing that I was going to be leaving Red Hat, so my time investment for starting large new projects was limited).

After recent discussions at work surrounding code coverage, I got thinking about this stuff again, and trying to come up with workable alternatives. I started wondering if I could use the x86 performance counters for this. Basically counting the number of instructions executed between system call enter/exit. The example code that Vince Weaver wrote for perf_event_open looked like a good starting point. I compiled it and ran it a few times.

$ ./a.out 
Measuring instruction count for this printf
Used 3212 instructions
$ ./a.out 
Measuring instruction count for this printf
Used 3214 instructions

Ok, so there's some loss of precision there, but we can mask off the bottom few bits. A collision isn't the end of the world for what we're using this for. That's just measuring userspace, however. What happens if we tell it to measure the kernel, and measure, say, getpid()?

$ ./a.out 
Used 9283 instructions
$ ./a.out 
Used 9367 instructions

Ok, that's a lot more precision we've lost. What the hell.

Given how much time he's spent on this stuff, I emailed Vince, and asked if he had insight as to why the counters weren't deterministic across different runs. He had actually written a paper on the subject. Turns out we're also getting event counts here for page faults, hardware interrupts, timers, etc.

x86 counters lack the ability to say "only generate events if RIP is within this range" or anything similar, so it doesn't look like this is going to be particularly useful.

That’s kind of where I’ve stopped with this for now. I don’t have a huge amount of time to work on this, but had hoped that I could hack up something basic using the perf counters, but it looks like even if it’s possible, it’s going to be a fair bit more work than I had anticipated.
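
For reference, a minimal sketch of this kind of instruction counter, using the standard perf_event_open interface (this is just a rough illustration of the approach, not Vince's example code):

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <linux/perf_event.h>

/* glibc provides no wrapper for perf_event_open */
static int perf_event_open(struct perf_event_attr *attr, pid_t pid,
                           int cpu, int group_fd, unsigned long flags)
{
        return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
        struct perf_event_attr attr;
        uint64_t count;
        int fd;

        memset(&attr, 0, sizeof(attr));
        attr.type = PERF_TYPE_HARDWARE;
        attr.size = sizeof(attr);
        attr.config = PERF_COUNT_HW_INSTRUCTIONS;
        attr.disabled = 1;
        attr.exclude_hv = 1;
        /* exclude_kernel is left at 0 so kernel-side instructions are counted
         * too, which needs a permissive perf_event_paranoid setting or root */

        fd = perf_event_open(&attr, 0, -1, -1, 0);      /* this process, any CPU */
        if (fd < 0) {
                perror("perf_event_open");
                return 1;
        }

        ioctl(fd, PERF_EVENT_IOC_RESET, 0);
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

        getpid();                                       /* the work being measured */

        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
        if (read(fd, &count, sizeof(count)) != (ssize_t) sizeof(count)) {
                perror("read");
                return 1;
        }

        printf("Used %llu instructions\n", (unsigned long long) count);
        close(fd);
        return 0;
}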

update:
It occurred to me after posting this that measuring instructions isn’t going to work regardless of the amount of precision the counters offer. Consider a syscall that operates on vma’s for example. Over the lifetime of a process, the number of executed instructions of a call to such a syscall will vary even with the same input parameters, as the lengths of various linked lists that have to be walked will change. Number of instructions, or number of branches taken/untaken etc just isn’t a good match for this idea. Approximating “have we been here before” isn’t really achievable with this approach afaics, so I’m starting to think something like the initial gcov idea is the only way this could be done.

The post Thoughts on a feedback loop for Trinity. appeared first on codemonkey.org.uk.

Reach the Top With NetworkManager 1.0.2

Summit – Asbjørn Floden (CC BY-NC 2.0)

Just this morning Lubomir released NetworkManager 1.0.2, the latest of the 1.0 stable series.  It’s  a great cleanup and bugfix release with contributions from lots of community members in many different areas of the project!

Some highlights of new functionality and fixes:

  • Wi-Fi device band capability indications, requested by the GNOME Shell team
  • Devices set to ignore carrier that use DHCP configurations will now wait a period of time for the carrier to appear, instead of failing immediately
  • Startup optimizations allow networking-dependent services to be started much earlier by systemd
  • Memory usage reductions through many memory leak fixes and optimizations
  • teamd interface management is now more robust and teamd is respawned when it terminates
  • dnsmasq is now respawned when it terminates in the local caching nameserver configuration
  • Fixes for an IPv6 DoS issue CVE-2015-2924, similar to one fixed recently in the kernel
  • IPv6 Dynamic DNS updates sent through DHCP now work more reliably (and require a fully qualified name, per the RFCs)
  • An IPv6 router solicitation loop due to a non-responsive IPv6 router has been fixed

While the list of generally interesting enhancements may be short, it masks 373 git commits and over 50 bugzilla issues fixed.  It’s a great release and we recommend that everyone upgrade.

Next up is NetworkManager 1.2, with DNS improvements, Wi-Fi scanning and AP list fixes for mobile uses, NM-in-containers improvements (no udev required!), even less dependence on the obsolete dbus-glib, less logging noise, device management fixes, continuing removal of external dependencies (like avahi-autoipd), configuration reload-ability, and much more!

May 04, 2015

kernel code coverage brain dump.

Someone at work recently asked me about code coverage tooling for the kernel. I played with this a little last year. At the time I was trying to figure out just how much of certain syscalls trinity was exercising. I ended up being a little disappointed at the level of post-processing tools to deal with the information presented, and added some things to my TODO list to find some time to hack up something, which quickly bubbled its way to the bottom.

As I did a write-up based on past experiences with this stuff, I figured I’d share.

gcov/gprof
requires a kernel built with:
CONFIG_GCOV_KERNEL=y
GCOV_PROFILE_ALL=y
GCOV_FORMAT_AUTODETECT=y
Note: Setting GCOV_PROFILE_ALL incurs some performance penalty, so any resulting kernel built with this option should _never_ be used for any kind of performance tests.
I can't stress this enough: it's miserably slow. Disk operations that took minutes for me now took hours. As an example:

Before:

# time dd if=/dev/zero of=output bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 0.409712 s, 1.3 GB/s
0.00user 0.40system 0:00.41elapsed 99%CPU (0avgtext+0avgdata 2980maxresident)k
136inputs+1024000outputs (1major+340minor)pagefaults 0swaps

After:

# time dd if=/dev/zero of=output bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 6.17212 s, 84.9 MB/s
0.00user 7.17system 0:07.22elapsed 99%CPU (0avgtext+0avgdata 2940maxresident)k
0inputs+1024000outputs (0major+338minor)pagefaults 0swaps

From under half a second to over seven seconds elapsed. Ugh.

If we *didn’t* set GCOV_PROFILE_ALL, we’d have to recompile just the files we cared about with the relevant gcc profiling switches. It’s kind of a pain.

For all this to work, gcov expects to see a source tree, with:

  • .o objects
  • source files
  • .gcno files (these are generated during the kernel build)
  • .gcda files containing the runtime counters. These come from debugfs on the running kernel.

After booting the kernel, a subtree appears in debugfs at /sys/kernel/debug/gcov/
These directories mirror the kernel source tree, but instead of source files, they now contain files that can be fed to the gcov tool. There will be a .gcda file, and a .gcno symlink back to the source tree (with the complete path). For example, /sys/kernel/debug/gcov/home/davej/build/linux-dj/mm contains (among others..)

-rw------- 1 root root 0 Mar 24 11:46 readahead.gcda
lrwxrwxrwx 1 root root 0 Mar 24 11:46 readahead.gcno -> /home/davej/build/linux-dj/mm/readahead.gcno

It is likely the symlink will be broken on the test machine, because the path doesn't exist there, unless you NFS-mount the source tree from the build machine, for example.

I hacked up the script below, which may or may not be useful for anyone else (honestly, it's way easier to just use NFS).
Run it from within a kernel source tree, and it will populate the source tree with the relevant .gcda files and generate the .gcov output files.

  
#!/bin/sh
# gen-gcov-data.sh
# Run from the root of the kernel source tree, passing a single .c file.

# Map foo/bar.c to foo/bar.o; skip files that were never compiled.
obj=$(echo "$1" | sed 's/\.c/\.o/')
if [ ! -f "$obj" ]; then
  exit
fi

pwd=$(pwd)
dirname=$(dirname "$1")
gcovfn=$(echo "$(basename "$1")" | sed 's/\.c/\.gcda/')

# Copy the runtime counters out of debugfs and run gcov against them,
# then move the annotated .gcov output next to the source file.
if [ -f /sys/kernel/debug/gcov$pwd/$dirname/$gcovfn ]; then
  cp /sys/kernel/debug/gcov$pwd/$dirname/$gcovfn $dirname
  gcov -f -r -o $1 $obj

  if [ -f $(basename $1).gcov ]; then
    mv $(basename $1).gcov $dirname
  fi
else
  echo "no gcov data for /sys/kernel/debug/gcov$pwd/$dirname/$gcovfn"
fi

Take that script, and run it like so..

$ cd kernel-source-tree
$ find . -type f -name "*.c" -exec gen-gcov-data.sh "{}" \;

Running it on, for example, mm/mmap.c (gen-gcov-data.sh mm/mmap.c) will cause gcov to spit out an mmap.c.gcov file (which the script then moves into mm/) that has coverage information that looks like..

 
   135684:  269:static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
        -:  270:{
   135684:  271:        struct vm_area_struct *next = vma->vm_next;
        -:  272:
   135684:  273:        might_sleep();
   135686:  274:        if (vma->vm_ops && vma->vm_ops->close)
     5080:  275:                vma->vm_ops->close(vma);
   135686:  276:        if (vma->vm_file)
    90302:  277:                fput(vma->vm_file);
        -:  278:        mpol_put(vma_policy(vma));
   135686:  279:        kmem_cache_free(vm_area_cachep, vma);
   135686:  280:        return next;
        -:  281:}

The numbers on the left are the number of times that line of code was executed.
Lines beginning with ‘-‘ have no coverage information for whatever reason.
If a branch is not taken, it gets prefixed with ‘#####’, like so..

 
  4815374:  391:                if (vma->vm_start < pend) {
    #####:  392:                        pr_emerg("vm_start %lx < pend %lx\n",
        -:  393:                                  vma->vm_start, pend);
        -:  394:                        bug = 1;
        -:  395:                }

There are some cases that need a little more digging to explain. eg:

    88105:  237:static void __remove_shared_vm_struct(struct vm_area_struct *vma,
        -:  238:                struct file *file, struct address_space *mapping)
        -:  239:{
    88105:  240:        if (vma->vm_flags & VM_DENYWRITE)
    15108:  241:                atomic_inc(&file_inode(file)->i_writecount);
    88105:  242:        if (vma->vm_flags & VM_SHARED)
        -:  243:                mapping_unmap_writable(mapping);
        -:  244:
        -:  245:        flush_dcache_mmap_lock(mapping);
    88105:  246:        vma_interval_tree_remove(vma, &mapping->i_mmap);
        -:  247:        flush_dcache_mmap_unlock(mapping);
    88104:  248:}

In this example, lines 245 & 247 have no hitcount, even though there’s no way they could have been skipped.
If we look at the definition of flush_dcache_mmap_(un)lock, we see..
#define flush_dcache_mmap_lock(mapping) do { } while (0)
So the compiler never emitted any code, and hence, it gets treated the same way as the blank lines.

There is a /sys/kernel/debug/gcov/reset file that can be written to in order to reset the counters before each test, if desired.
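
For example, something along these lines should do it; any write to the file clears the counters:

# echo 0 > /sys/kernel/debug/gcov/reset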

Additional thoughts

  • Not sure how inlining affects things.
  • There needs to be some element of post-processing, to work out percentages of code coverage etc, which may involve things like stripping out comments/preprocessor defines.
  • debug kernels differ in functionality in various low level features. For example LOCKDEP will fundamentally change the way spinlocks work. For coverage purposes though, we can choose to not care and stop drilling down at certain levels.
  • Whatever does the post-processing of results may need to aggregate results from multiple test machines. Think of the situation where we’re running a client/server test: Both machines will be running different code paths.
  • ggcov has some interesting looking tooling for visually displaying results.

The post kernel code coverage brain dump. appeared first on codemonkey.org.uk.

May 01, 2015

Trinity socket improvements

I’ve been wanting to get back to working on the networking related code in trinity for a long time. I recently carved out some time in the evenings to make a start on some of the lower hanging fruit.

Something that has bugged me for a while is that we create a bunch of sockets on startup, and then when we call, for example, setsockopt() on one of those sockets, the socket options we pass quite often don't match the protocol the socket was created for. This isn't always a bad thing; for example, one of the oldest kernel bugs trinity found was triggered by setting TCP options on a non-TCP socket. But doing this the majority of the time is wasteful, as we'll just get -EINVAL most of the time.

We actually have the necessary information in trinity to know what kind of socket we are dealing with: a socketinfo struct.

struct socket_triplet {
        unsigned int family;
        unsigned int type;
        unsigned int protocol;
};

struct socketinfo {
        struct socket_triplet triplet;
        int fd; 
};

We just had it at the wrong level of abstraction. setsockopt only ever saw a file descriptor. We could have searched through the fd arrays looking for the socketinfo that matched, but that seemed like a lame solution. So I changed the various networking syscalls to take an ARG_SOCKETINFO instead of an ARG_FD. As a side-effect, we now actually pass sockets to those syscalls more often than, say, a perf fd, or an epoll fd, or ..

There is still a small chance we pass some crazy fd, just to cover the crazy cases, though those cases don’t tend to trip things up much any more.

After passing down the triplet, it was a simple case of annotating the structures containing the various setsockopt function pointers to indicate which family they belong to. AF_INET was the only complication, which needed special casing due to the multiple protocols for which we have setsockopt() functions. Creating a second table, keyed on the protocol instead of the family, was enough for the matching code.

There are still a ton of improvements I want to make to this code, but it’s going to take a while, so it’s good when some mostly trivial changes like the above come together quickly.

The post Trinity socket improvements appeared first on codemonkey.org.uk.

April 14, 2015

the more things change.. 4.0


$ ping gelk
PING gelk.kernelslacker.org (192.168.42.30) 56(84) bytes of data.
WARNING: kernel is not very fresh, upgrade is recommended.
...
$ uname -r
4.0.0

Remember that one time the kernel versioning changed and nothing in userspace broke? Me either.

Why people insist on trying to think they can get this stuff right is beyond me.

YOU’RE PING. WHY DO YOU EVEN CARE WHAT KERNEL VERSION IS RUNNING.

update: this was already fixed, almost exactly a year ago in the ping git tree. The (now removed) commentary kind of explains why they cared. Sigh.

The post the more things change.. 4.0 appeared first on codemonkey.org.uk.

March 31, 2015

Official GNOME SDK runtime builds are out

As people who have followed the work on sandboxed applications know, we have promised a developer preview for GNOME 3.16. Well, 3.16 has now been released, so the time is now!

I spent last week setting up a build system on the GNOME infrastructure, and the output of this is finally available at:

http://sdk.gnome.org/repo/

This repository contains the GNOME 3.16 runtime, org.gnome.Platform, as well as a smaller one that is useful for less integrated apps (like games), called org.freedesktop.Platform. It also has corresponding development runtimes (org.gnome.Sdk and org.freedesktop.Sdk) that you can use to create applications for the platforms.

This is a developer preview, so consider these builds weakly supported. This means I will try to keep them somewhat updated if there are major issues and that I will keep them API and ABI stable. I will probably also pick up at least some 3.16.x minor releases as they are released.

I also did the first official release of xdg-app. For easy testing this is available for Fedora 21 and 22 as a copr repo.

Testing the SDK

Using the repo above makes it really easy to test this. Just install the xdg-app package from copr, log out and back in (needed to update the environment for the session), then follow these instructions (as a regular user):

  1. Install the Gnome SDK public key into /usr/share/ostree/trusted.gpg.d (or alternatively, use --no-gpg-verify when you add the remote below).
  2. Install the basic Gnome and freedesktop runtimes:
    $ xdg-app add-remote --user gnome-sdk http://sdk.gnome.org/repo/
    $ xdg-app install-runtime --user gnome-sdk org.gnome.Platform 3.16
    $ xdg-app install-runtime --user gnome-sdk org.freedesktop.Platform 1.0
  3. Optionally install some locale packs:
    $ xdg-app install-runtime --user gnome-sdk org.gnome.Platform.Locale.se 3.16
    $ xdg-app install-runtime --user gnome-sdk org.freedesktop.Platform.Locale.se 1.0
  4. Install some apps from my repository of test apps:
    $ xdg-app add-remote --user --no-gpg-verify test-apps https://people.gnome.org/~alexl/test-apps/repo/
    $ xdg-app install-app --user test-apps org.gnome.gedit
    $ xdg-app install-app --user test-apps org.freedesktop.glxgears
  5. Run the apps! You should find gedit listed among the regular applications in the shell as it exports a desktop file. But you can also run them manually like this:
    $ xdg-app run org.gnome.gedit
    $ xdg-app run org.freedesktop.glxgears
  6. I also packaged the latest gnome builder from git. It requires the full sdk which takes a bit longer to download:
    $ xdg-app install-runtime --user gnome-sdk org.gnome.Sdk 3.16
    $ xdg-app install-app --user test-apps org.gnome.Builder

All the above installs the apps into your home directory (in ~/.local/share/xdg-app). You can also run the commands as root and skip the --user arguments to do system-wide application installs.

Future work

With the basics now laid down to run current applications in a minimally isolated environment, the next step is to work on the sandboxing aspects more. This will require lots of work, both on the system side (things like kdbus), in the desktop (adding sandbox-aware APIs, making pulseaudio protect clients from each other, etc.) and in modifying applications.

If you’re interested in this, you can follow the work on the wiki.

Building your own apps

If you download the SDKs you have enough tooling to build your own applications. There is some documentation on how to do this here.

I also created a git repository with the scripts I used to build the test applications above. It uses the gnome-sdk-bundles repository, which has some tooling and specfiles to easily bundle dependencies with the application.

Building the SDK

If you ever want to build the SDK yourself, it is available at:

https://git.gnome.org/browse/gnome-sdk-images

This repository contains the desktop specific parts of the SDK, which is layered on a core Yocto layer. When you build the SDK this will be automatically checked out and built from:

https://git.gnome.org/browse/freedesktop-sdk-base

However, if you don’t want to build all of this you can download the pre-build images from http://sdk.gnome.org/images/x86_64/ and put them in the freedesktop-sdk-base/images/x86_64 subdirectory of gnome-sdk-images. This can save you a lot of time and space.

March 22, 2015

Fedora at Midwest Rep Rap Fest 2015

I attended Midwest Rep Rap Fest 2015 this weekend, in Goshen, Indiana. Goshen is about 45 minutes outside of South Bend (the nearest regional airport). This part of Indiana is noteworthy for a few reasons, including the fact that Matthew Miller, the Fedora Project Leader, is from there. It also has a very large Amish population, which makes it one of the few places I've attended a conference where most of the local businesses have a place to tie up your horses. The Midwest Rep Rap Fest is an event dedicated to Open Source 3d printers (and their surrounding ecosystem). The primary sponsor of the event is SeeMeCNC, a local vendor that makes open source hardware delta 3d printers. A Delta printer is a 3d printer with a circular stationary bed. Attached to the bed are three vertical rods which serve as tracks for three geared motors. The motors move up and down the rods, and are connected to a central extruder which hangs down the center. The extruder is moved in three dimensions by moving the supports along their tracks. Watching a Delta 3d printer do its thing is pretty amazing; it seems to dance like a trapeze artist as it dips and swoops to print the object.

The Delta type of 3d printer was the most common printer at the event, many people had either bought SeeMeCNC printers or had built their own off their open source design. The SeeMeCNC team brought their super-sized Delta, which they think is the largest Delta printer in the world. It was easily 30 feet tall and barely fit in the building we were using (which is saying something, because we were in an exhibition hall at the local state fairgrounds). The owner of the company decided to see how big of a Delta printer he could build, and this was the result!



The printer used a shop vac to blow plastic pellets up a plastic hose into the giant heated end. Originally, they were trying to print a giant model of Groot (shown in progress in my picture above), but they had to leave it running overnight on Friday and when we came back Saturday morning, the print had failed because it had run out of plastic pellets! Later on, they printed a very large basket/vase with it (after fixing it so that it wouldn't run out of plastic).

Fedora had a table in the main room. I brought two open source 3d printers from Lulzbot and controlled them both from my laptop running Fedora 21. My larger printer, the Taz 4, was configured with a dual extruder addon, and I spent four hours on Friday calibrating it to print properly. On Saturday morning, I printed my first completely successful dual color print, a red and white tree frog!



The eyes didn't come out perfect, but it all came out aligned and in one piece. Several people offered me tips and advice on how to improve the print quality with the dual-extruder setup. One of the nice things about the Rep Rap fest was the extremely friendly nature of the community. Everyone was eager to help everyone else solve problems or improve their printers/prints. I used Pronterface to control the Taz 4, since it was better suited to handle the dual extruder controls.

My smaller printer, the Lulzbot Mini, was controlled with Cura-Lulzbot (a package which got added to Fedora a few days before the show!). Cura has a very fast and high quality slicer, but with fewer options for tweaking than slic3r (the traditional open source slicing tool) has. 3d printers depend on a slicing tool to take a 3d model and convert it into the GCode machine instructions that tell the printer where to move and when to extrude plastic. Cura also has a more polished UI than Pronterface.

The Lulzbot Mini is able to self level, self clean, and self calibrate, which almost eliminates the prep time before a print! One of the vendors at the show was Taulman, who is constantly innovating new filaments for 3d printing. They announced a new filament, 910, the weekend of the Rep Rap Fest, and gave me a sample to try out on the Mini. The Mini can print filaments with a melting point of 300 degrees Celsius or less, so it was well suited for the 910. 910 was interesting because it was incredibly strong, almost as good as polycarbonate! It was also translucent, which made it ideal for me to finish a project I've been working on for a long time: my 3d printed TARDIS model!



I printed four window panels and a topper piece for the lantern on the roof. A few other people had TARDIS models (including one that had storage drawers inside it), but mine was the biggest (and I think, the nicest).

One of Fedora's neighbors was mUVe, an open source SLA 3d printer. SLA 3d printers use a liquid resin and a DLP projector to make incredibly accurate 3d models that would be difficult or impossible to print on other kinds of 3d printers. It seemed like everyone was printing the same Groot model at the event, and they printed one that came out looking incredible. The inventor of the hardware was working their table, and we talked for a while about the importance of open source in hardware. He felt strongly that it was mandatory for him to release his work into open source so that other people could innovate and improve upon the designs he'd created. The mUVe printer was one of the largest SLA printers I've ever seen and the quality of its prints was amazing. The biggest downside is the complexity: it involves chemicals in the resin and in curing the prints once they have finished, but in my opinion, it was worth it. The cost was in the $1500-2000 price range, but he said he's working on something awesome that will bring that cost down. They used Creation Workshop to slice and control their printer, which was new to me, but it was also open source. It's written in C#, though, so I want to see if I can get it working in Mono on Fedora. (They were also in the greater Detroit area, so I encouraged them to come out and demo it at Penguicon!)



Another neighbor had 3d printed an amazingly intricate "home clock". They had used a famous woodworking pattern, converted each of the pieces to a 3d model, then printed them. Each piece was then smoothed and attached together. The only piece they didn't print was the clock at the center! On the table, the top of the clock was taller than me (and I'm 6'4"). It didn't look 3d printed; it looked too nice! It took them 3 months to print it all. The owner said that if you're able to cut this model from wood and assemble it properly, you're considered to be a master in the woodworking community. Everyone at the event was definitely in awe of it.



It seemed like everyone showing off something at this event had a clever hack of their own. Some people were creating amazing models, some people had built new open source printers. One printer had color changing LED strips attached underneath it which changed from red to green to indicate the progress of the printing job. Another printer had a Raspberry Pi with camera wired into it so you had a "printer's eye view" as it printed. There was a custom 3d scanner designed to scan people's heads and torsos to make printable busts. There was even a printer that looked like some sort of industrial robot gone mad! The one thing these all had in common? They were open source. No one here was questioning open source, it was just the way they operated, sharing what they knew and building off each other's successes (and failures). There were a few MakerBot Replicators, but all of them had been hacked in some way.

Attendance at this year's event was both up and down. There were more people and companies exhibiting at the event, including Texas Instruments, Hackaday, Lulzbot, Taulman, and Printed Solid. Printed Solid was giving out free samples of some amazing ColorFabb filament. I came home with some BronzeFill (it prints into a bronze-like material that, when polished, is heavy and shiny), a new flexible filament, and some carbon-fiber infused filament! They also had some really fantastic glow in the dark filament, but no samples of that were available (and I didn't have the spare cash to buy a full spool). General attendance at the event was about 750 people, which was down from last year (around 1000). The general consensus was that the event wasn't doing all it could to advertise itself, and the location wasn't exactly optimal (45 minutes from the nearest regional airport, almost 2 hours from a major airport). The majority of visitors were local to the Indiana/Michigan area. The event staff said that next year they plan on rebranding the event to a more general FOSS 3d printing event (not limiting themselves to the Midwest region of the US). I think that is the right decision, since they are the only open source 3d printing event that I'm aware of, and I'd really love to see them grow into something bigger and more accessible.

Oh, did I mention we had a celebrity at the event? Ben Heck was there with his Delta printer! He's built a pinball machine. I might want to be him a little bit (but I'm not). He was very friendly and cool, and spent a lot of time talking to the other makers and attendees.

Thanks to Ben Williams, Fedora had a very nice booth setup. We had our Fedora tablecloth and lots of stickers to give away. I brought a good sampling of models I'd printed with Fedora and my 3d printers, and I had a lot of good conversations about using Linux and open source to power 3d printing and 3d model creation. My coworker (and celebrity writer) Brian Proffitt stopped by on Saturday and helped out at the table for a while. I was supposed to have Fedora 21 media to hand out, but the promised shipment never arrived. The computers there were a mix of Windows and Linux, with very few Macs in this community. Several people were using Fedora, but most of the Linux instances were Debian.



The Fedora event box needs a little love; there wasn't very much in it that was useful anymore. The OLPC in it is very old now, and since the current OLPC hardware runs Android these days, it isn't as "cool" as it used to be. I restocked it with Fedora bubble stickers, but it probably needs a plan to revitalize it.

All in all, it was a very fun weekend event and a great opportunity to connect with the open source 3d printer community. I think it is the responsibility of Fedora (and Red Hat) to reach out to the maker communities and help them be open source in their own ways, and this was an excellent opportunity to do exactly that. Is there a Maker event happening somewhere near you? You can sign up to represent Fedora at that event like I did at MRRF: Fedora Event Calendar

March 16, 2015

virgil3d local rendering test harness

So I've still been working on the virgil3d project along with part time help from Marc-Andre and Gerd at Red Hat, and we've been making steady progress. This post is about a test harness I just finished developing for adding and debugging GL features.

So one of the more annoying issues with working on virgil has been that while working on adding 3D renderer features or trying to track down a piglit failure, you generally have to run a full VM to do so. This adds a long round trip to your test/development cycle.

I'd always had the idea to do some sort of local system renderer, but there are some issues with calling GL from inside a GL driver. So my plan was to have a renderer process which loads the renderer library that qemu loads, and a mesa driver that hooks into the software rasterizer interfaces. So instead of running llvmpipe or softpipe I have a virpipe gallium wrapper, that wraps my virgl driver and the sw state tracker via a new vtest winsys layer for virgl.

So the virgl pipe driver sits on top of the new winsys layer, and the new winsys instead of using the Linux kernel DRM apis just passes the commands over a UNIX socket to a remote server process.

The remote server process then uses EGL and the renderer library, forks a new copy for each incoming connection and dies off when the rendering is done.

The final rendered result has to be read back over the socket, and then the sw winsys is used to putimage the rendering onto the screen.

So this system is probably going to be slower in raw speed terms, but for developing features or debugging fails it should provide an easier route without the overheads of the qemu process. I was pleasantly surprised it only took two days to pull most of this test harness together, which was neat; I'd planned for it to take much longer!

The code lives in two halves.
http://cgit.freedesktop.org/~airlied/virglrenderer
http://cgit.freedesktop.org/~airlied/mesa virgl-mesa-driver

[updated: pushed into the main branches]

The virglrenderer repo is standalone now, and it also has a bunch of unit tests that are run under valgrind, in an attempt to lock down some more corners of the API and test for possible ways to escape the host.

March 13, 2015

LSF/MM 2015 recap.

It’s been a long week.
Spent Monday/Tuesday at LSFMM. This year it was in Boston, which was convenient in that I didn’t have to travel anywhere, but less convenient in that I had to get up early and do a rush-hour commute to get to the conference location in time. At least the weather got considerably better this week compared to the frankly stupid amount of snow we’ve had over the last month.
LWN did their usual great write-up which covers everything that was talked about in a lot more detail than my feeble mind can remember.

A lot of things from last year’s event seem to still be getting a lot of discussion. SMR drives & persistent memory being the obvious stand-outs. Lots of discussion surrounding various things related to huge pages (so much so that one session overran and replaced a slot I was supposed to share with Sasha, not that I complained. It was interesting stuff, and I learned a few new reasons to dislike the way we handle hugepages & forking), and I lost track of how many times the GFP_NOFAIL discussion came up.

In a passing comment in one session, one of the people Intel sent (Dave Hansen iirc) mentioned that Intel are now shipping an 18 core/36 thread CPU. A bargain at just $4642. Especially when compared to this madness.

A few days before the event, I had been asked if I wanted to do a “how Akamai uses Linux” type talk at LSFMM, akin to what Chris Mason did re: facebook at last year’s event. I declined, given I’m still trying to figure that out myself. Perhaps another time.

Wednesday/Thursday, I attended Vault at the same location.
My take-aways:

  • There were generally a lot more positive vibes around btrfs this year. Even with Josef playing bad cop to Chris’ good cop talk, things generally seemed to be moving away from an “everything is awful” toward “this actually works…” though with the qualifier “.. for facebook’s workload”. Josef did touch on one area where btrfs does still suck, which apparently is database workloads (iirc, due to the copy-on-write nature of btrfs). The spurious ENOSPC failures of the past should hopefully stay in the past. Things generally on the up and up. (Though, this does include the linecount, which has now passed 100KLOC, more than double that of XFS or ext*. Scary).
  • Equally positive vibes surrounding XFS. We celebrated the 20 year anniversary at one evening event, making us all feel just that little bit more like an old fart club. Interesting talk toward the end by Dave Chinner about the future of XFS, and how the current surge of development in XFS is probably its last for various scaling reasons as disks continue to get bigger and bigger. Predicting the future is always hard, but if what Dave said was true, things will start to get ‘interesting’ in about 5 years time, given every other filesystem we support in Linux has the same issues (or worse).
  • People still care a lot about NFS. Especially pNFS. Surprising amount of activity still happening.
  • Even when I worked there, I never really got Red Hat’s “big picture” wrt the several distributed filesystems they supported. Now that I’m not there, I feel even more out of the loop. “ceph is the way forward” “except when it’s glusterfs” or something. Oh, and GFS2 is still a thing apparently, for some reason.
  • As entertaining as Jeremy Allison might be, don’t go to a talk on Samba internals unless you work on it (in which case it’s too late for you). The horrors will likely keep you up at night.
  • Ted’s ext4 talk drew a decent crowd. As fancy as btrfs/xfs etc might be, a *lot* of people still give a crap about extN. Somehow I missed the addition of the ‘lazytime’ option to ext4. Seems neat. Played with it (and also the super-secret ‘dioread_nolock’ mount option). Saw another talk on orphan list scalability in ext4, which was interesting, but didn’t draw as big a crowd.

I got asked “What are you doing at Akamai ?” a lot. (answer right now: trying to bring some coherence to our multiple test infrastructures).
Second most popular question: “What are you going to do after that ?”. (answer: unknown, but likely something more related to digging into networking problems rather than fighting shell scripts, perl and Makefiles).

All that, plus a lot of hallway conversations, long lunches, and evening activities that went on possibly a little later than they should have, has led to me almost losing my voice today.
Really good use of time though. I had fun, and it’s always good to catch up with various people.

The post LSF/MM 2015 recap. appeared first on codemonkey.org.uk.

March 03, 2015

Trinity 1.5 release.

As announced this morning, today I decided that things had slowed down (to an almost standstill of late) enough that it was worth making a tarball release of Trinity, to wrap up everything that’s gone in over the last year.

The email linked above covers most of the major changes, but a lot of the change over the last year has actually been groundwork for those features. Things like..

  • The post-mortem dumper needed the generation of the text and the writing to log files to be decoupled, which wasn’t particularly trivial.
  • Some features involved considerable rewrites. The fd generators are now pretty much isolated from each other, making adding a new one a simple task.
  • Handling of the mapping structs got a lot of cleanup (though there is definitely still a lot of room for improvement there, especially when we do things like splitting a mapping).
  • I should also mention the countless hours spent chasing down quite a few hard-to-reproduce bugs that are fixed in 1.5.

As I mentioned in the announcement, I don’t see myself having a huge amount of time for at least this year to work on Trinity. I’ve had a number of people email me asking about the status of some feature. Hopefully this demarcation point will answer the question.

So, it’s not abandoned, it just won’t be seeing the volume of change it has over the last few years. I expect my personal involvement will be limited to merging patches, and updating the syscall lists when new syscalls get added.

Trinity used to be on roughly a six month release schedule. We’ll see if by the end of the year there’s enough input from other people to justify doing a 1.6 release.

I’m also hopeful that time working on other projects mean I’ll come back to this at some point with fresh eyes. There are a number of features I wanted to implement that needed a lot more thought. Perhaps working on some other things for a while will give me the perspective necessary to realize those features.

The post Trinity 1.5 release. appeared first on codemonkey.org.uk.

February 23, 2015

backup solutions.

For the longest time, my backup solution has been a series of rsync scripts that have evolved over time into a crufty mess. Having become spoiled on my mac with time machine, I decided to look into something better that didn’t involve a huge time investment on my part.

The general consensus seemed to be that for ready-to-use home-nas type devices, the way to go was either Synology, or Drobo. You just stick in some disks, and set up NFS/Samba etc with a bunch of mouse clicking. Perfect.

I had already decided I was going to roll with a 5 disk RAID6 setup, so I bit the bullet and laid down $1000 for a Synology 8-Bay DS1815+. It came *triple* boxed, unlike the handful of 3TB HGST drives.
I chose the HGSTs after reading Backblaze’s report on failure rates across several manufacturers, and figured that after the RAID6 overhead, 8TB would be more than enough for a long time, even at the rate I accumulate flac and wav files. Also, worst case, I still had 3 spare bays I could expand into later if needed.

Installation was a breeze. The plastic drive caddies felt a little flimsy, but the drives were secure once in them, even if they did feel like they were going to snap as I flexed them to pop them into place. After putting in all the drives, I connected the four ethernet ports and powered it up.
After connecting to its web UI, it wanted to do a firmware update, like just about every internet connected device wants to do these days. It rebooted, and finally I could get about setting things up.

On first logging into the device over ssh, I think the first command I typed was uname. Seeing a 3.2 kernel surprised me a little. I got nervous thinking about how many VFS, EXT4, and MD bugfixes hadn’t made their way back to long-term stable, and got the creeps a little. I decided to not think too much about it, and put faith in the Synology people doing backports (though I never got as far as looking into their kernel package).

The web ui is pretty slick, though felt a little sluggish at times. I set up my RAID6 volume with a bunch of clicks, and then listened as all those disks started clattering away. After creation, it wanted to do an initial parity scan. I set it going, and went to bed. The next morning before going to work, I checked on it, and noticed it wasn’t even at 20% done. I left it going while I went into the office the next day. I spent the night away from home, and so didn’t get back to it until another day later.

When I returned home, the volume was now ready, but I noticed the device was now noticeably hotter to touch than I remembered. I figured it had been hammering the disks non-stop for 24hrs, so go figure, and that it would probably cool off a little as it idled. As the device was now ready for exporting, I set up an nfs export, and then spent some time fighting uid mappings, as you do. The device does have the ability to deal with LDAP and some other stuff that I’ve never had time to set up, so I did things the hard way. Once I had the export mounted, I started my first rsync from my existing backups.

While it was running, I remembered I had intended to set up bonding. A little bit of clicky-clicky later, it was done, and transfers started getting even faster. Very nice. I set up two bonds, with a pair of NICs in each. Given my desktop only has a dual NIC, that was good enough. Having a 2nd 2GigE bond I figured was nice in case I had multiple machines wanting to use it while I was doing a backup.

The backup was going to take a while, so I left it running.
A few hours later, I got back to it, and again, it was getting really hot. There are two pretty big fans in the back of the unit, and they were cranking out heat. Then, things started getting really weird. I noticed that the rsync had hung. I ctrl-c’d it, and tried logging into the device as root. It took _minutes_ to get a command prompt. I typed top and waited. About two minutes later top started. Then it spontaneously rebooted.

When it came back up, I logged in, and poked around the log files, and didn’t see anything out of the ordinary.
I restarted the rsync, and let it go for a while. About 20 minutes later, I came back to check on it again, and found that the box had just hung completely. The rsync was stalled, and I couldn’t ssh in. I rebooted the device, cursed a bit, and then decided to think about it for a while, so never restarted the rsync. I clicked around in the interface, to see if there was anything I could turn on/off that would perhaps give me some clues wtf was going on.
Then it rebooted spontaneously again.

It was about this time I was ready to throw the damn thing out the window. I bought this thing because I wanted a turn-key solution that ‘just worked’, and had quickly come to realize that with this device when something went bad, I was pretty screwed. Sometimes “It runs Linux” just isn’t enough. For some people, the Synology might be a great solution, but it wasn’t for me. Reading some of the Amazon reviews, it seems there were a few people complaining about their units overheating, which might explain the random reboots I saw. For a device I wanted to leave switched on 24/7 and never think about, something that overheats (especially when I’m not at home) really doesn’t give me feel good vibes. Some of the other reviews on Amazon rave about the DS1815+. It may be that there was a bad batch, and I got unlucky, but I felt burnt on the whole experience, and even if I had got a replacement, I don’t know if I would have felt like I could have trusted this thing with my data.

I ended up returning it to Amazon for a refund, and used the money to buy a motherboard, cpu, ram etc to build a dedicated backup computer. It might not have the fancy web ui, and it might mean I’ll still be using my crappy rsync scripts, but when things go wrong, I generally have a much better chance of fixing the problems.

Other surprises: at one point, when I opened the unit up to install an extra 4GB of RAM (it comes with just 2GB by default), I noticed that it runs off a single 250W power supply, which seemed surprising to me. I thought disks during spin-up used considerably more power, but apparently they’re pretty low power these days.

So, two weeks of wasted time, frustration, and failed experiments. Hopefully by next week I’ll have my replacement solution all set up and can move on to more interesting things instead of fighting appliances.

The post backup solutions. appeared first on codemonkey.org.uk.

February 17, 2015

First fully sandboxed Linux desktop app

It’s not a secret that I’ve been working on sandboxed desktop applications recently. In fact, I recently gave a talk at devconf.cz about it. However, up until now I’ve mainly been focusing on the bundling and deployment aspects of the problem. I’ve been running applications in their own environment, but with pretty open access to the system.

Now that the basics are working it’s time to start looking at how to create a real sandbox. This is going to require a lot of changes to the Linux stack. For instance, we have to use Wayland instead of X11, because X11 is impossible to secure. We also need to use kdbus to allow desktop integration that is properly filtered at the kernel level.

Recently Wayland has made some pretty big strides though, and we now have working Wayland sessions in Fedora 21. This means we can start testing real sandboxing for simple applications. To get something running I chose to focus on a game, because they require very little interaction with the system. Here is a video I made of Neverball, running in a minimal sandbox:

Click here to view the embedded video.

In this example we’re running a regular build of neverball in an environment which:

  • Is independent of the host distribution
  • Has no access to any system or user files other than the ones from the runtime and application itself
  • Has no access to any hardware devices, other than DRI (for GL rendering)
  • Has no network access
  • Can’t see any other processes in the system
  • Can only get input via Wayland
  • Can only show graphics via Wayland
  • Can only output audio via PulseAudio
  • … plus more sandboxing details

Yet the application is still simple to install and integrates nicely with the desktop. If you want to test it yourself, just follow the instructions on the project page and install org.neverball.Neverball.

Of course, there is still a lot to do here. For instance, PulseAudio doesn’t protect clients from each other, and for more complex applications we need to add new APIs to safely grant access to things like user files and devices. The sandbox details page has a more detailed list of what has to be done.

The road is long, but at least we have now started our journey!

February 16, 2015

NetworkManager for Administrators Part 1

(Photo via scobleizer, CC BY 2.0)

NetworkManager is a system service that manages network interfaces and connections based on user or automatic configuration. It supports Ethernet, Bridge, Bond, VLAN, team, InfiniBand, Wi-Fi, mobile broadband (WWAN), PPPoE and other devices, and supports a variety of different VPN services.  You can manage it a couple different ways, from config files to a rich command-line client, a curses-like client for non-GUI systems, graphical clients for the major desktop environments, and even web-based management consoles like Cockpit.

There’s an old perception that NetworkManager is only useful on laptops for controlling Wi-Fi, but nothing could be further from the truth.  No laptop I know of has InfiniBand ports.  We recently released NetworkManager 1.0 with a whole load of improvements for workstations, servers, containers, and tiny systems from embedded to RaspberryPi.  In the spirit of making double-plus sure that everyone knows how capable and useful NetworkManager is, let’s take a magical journey into Administrator-land and start at the very bottom…

Daemon Configuration Files

Basic configuration is stored in /etc/NetworkManager/NetworkManager.conf in a standard key/value ini-style format.  The sections and values are well-described by ‘man NetworkManager.conf’.  A standard default configuration looks like this:

[main]
plugins=ifcfg-rh

You can override default configuration through either command-line switches or by dropping “configuration snippets” into /etc/NetworkManager/conf.d.  These snippets use the same configuration options from ‘man NetworkManager.conf’ but are much easier to distribute among larger numbers of machines through packages or tools like Puppet, or even just to install features through your favorite package manager.  For example, in Fedora, there is a NetworkManager-config-connectivity-fedora RPM package that installs a snippet that enables connectivity checking to Fedora Project servers.  If you don’t care about connectivity checking, you simply ‘rpm -e NetworkManager-config-connectivity-fedora’ instead of tracking down and deleting /etc/NetworkManager/conf.d/20-connectivity-fedora.conf.

Just for kicks, let’s take a walk through the various configuration options, what they do, and why you might care about them in a server, datacenter, or minimal environment…

Configuration Snippets

First, each configuration “snippet” in /etc/NetworkManager/conf.d can override values set in earlier snippets, or even the default configuration (but not command-line options).  So the same option specified in 50-foobar.conf will override that option specified in 10-barfoo.conf.  Many options also support the “+” modifier, which allows their value to be added to earlier ones instead of replacing.  So “plugins+=something-else” will add “something-else” to the list, instead of overwriting any earlier values.  You’ll see why this is quite useful in a minute…
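To make that concrete, here’s what a hypothetical pair of snippets (using the file names from above) might look like; the second one adds to the plugin list instead of replacing it:

# /etc/NetworkManager/conf.d/10-barfoo.conf
[main]
plugins=ifcfg-rh

# /etc/NetworkManager/conf.d/50-foobar.conf
[main]
plugins+=something-else

With both installed, the effective plugin list ends up as ifcfg-rh plus something-else; if the second snippet had used a plain “plugins=” instead, it would have replaced the earlier value entirely.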

Dive Deep

[main]
plugins=ifcfg-rh | ifupdown | ifnet | ifcfg-suse | ibft (default empty)

This option enables or disables certain settings plugins, which are small loadable libraries that read and write distribution-specific network configuration.  For example, Fedora/RHEL would specify ‘plugins=ifcfg-rh’ for reading and writing the ifcfg file format, while Debian/Ubuntu would use ‘plugins=ifupdown’ for reading /etc/network/interfaces, and Gentoo would use ‘plugins=ifnet’.  If you know your distro’s config format like the back of your hand, NetworkManager doesn’t make you change it.

There is one default plugin though, ‘keyfile’, which NetworkManager uses to read and write configurations that the distro-specific plugins can’t handle.  These files go into /etc/NetworkManager/system-connections and are standard .ini-style key/value files.  If you’re interested in the key and value definitions, you can check out ‘man nm-settings’ and ‘man nm-settings-keyfiles’, or even look at some examples.
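As a purely illustrative example (the name and UUID here are made up; see ‘man nm-settings’ for the real key definitions), a minimal wired connection in keyfile format could look roughly like this:

[connection]
id=my-office-wired
uuid=2b8dd6dd-5a47-4a1e-9a5e-0123456789ab
type=ethernet

[ipv4]
method=auto

[ipv6]
method=auto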

[main]
monitor-connection-files=yes | no (default no)

By popular demand, NetworkManager no longer watches configuration files for changes.  Instead, you make all the changes you want, and then explicitly tell NetworkManager when you’re done with “nmcli con reload” or “nmcli con load <filename>”.  This prevents reading partial configuration and allows you to double-check that everything is correct before making the configuration update.  Note that changes made through the D-Bus interface (instead of the filesystem) always happen immediately.

However, if you want the old behavior back, you can set this option to “yes”.

[main]
auth-polkit=yes | no (default yes)

If built with support for it, NetworkManager uses PolicyKit for fine-grained authorization of network actions.  This will be the subject of another article in this series, but the TLDR is that PolicyKit easily allows user A the permission to use WiFi while denying user B WiFi but allowing WWAN.  These things can be done with Unix groups, but that quickly gets unwieldy and isn’t fine-grained enough for some organizations.  In any case, PolicyKit is often unnecessary on small, single-user systems or in datacenters with controlled access.  So even if your distribution builds NetworkManager with PolicyKit enabled, you can turn it off for simpler root-only operation.

[main]
dhcp=dhclient | dhcpcd | internal (default determined at build time, dhclient preferred if enabled)

With NetworkManager 1.0 we’ve added a new internal DHCP client (based off systemd code which was based off ConnMan code) which is smaller, faster, and lighter than dhclient or dhcpcd.  It doesn’t do DHCPv6 yet, but we’re working on that.  We think you’ll like it, and it’s certainly much less of a resource hog than a dhclient process for every interface. To use it, set this option to “internal” and restart NetworkManager.

If NetworkManager was built with support for dhclient or dhcpcd, you can use either of these clients by setting this option to the client’s name.  Note that if you enable both dhclient and dhcpcd, dhclient will be preferred for maximum compatibility.

[main]
no-auto-default= (default empty)

By default, NetworkManager will create an in-memory DHCP connection for every Ethernet interface on your system, which ensures that you have connectivity when bringing a new system up or booting a live DVD.  But that’s not ideal on large systems with many NICs, or on systems where you’d like to control initial network bring-up yourself.  In that case, you should set this option to “*” to disable the auto-Ethernet behavior for all interfaces, indicating that you’d like to create explicit configuration instead.  You can also use MAC addresses or interface names here too!  On Fedora we’ve created a package called NetworkManager-config-server that sets this option to “*” by default.
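On such a system, a hypothetical snippet (say, /etc/NetworkManager/conf.d/10-no-auto-default.conf) could be as simple as:

[main]
no-auto-default=*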

[main]
ignore-carrier= (default empty)

Trip over a cable?  Want to make sure a critical interface stays configured if the switch port goes down?  This option is for you!  Setting it to “*” (all interfaces) or using MAC addresses or interface names here will tell NetworkManager to ignore carrier events after the interface is configured.  For DHCP connections a carrier is obviously required for initial configuration, while static connections can start regardless of carrier status.  After that, feel free to unplug the cable every time Apple sells an iPhone!
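For example, a snippet applying this to every interface would just be:

[main]
ignore-carrier=*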

[main]
configure-and-quit=yes | no (default no)

New with 1.0 is the “configure and quit” mode where NetworkManager configures interfaces (including, if desired, blocking startup until networking is active) and then quits, spawning small helpers to maintain DHCP leases and IPv6 address lifetimes if required.  In a datacenter or cloud where cycles are money, this can save you some cash and deliver a more stable setup with known behavior.

[main]
dns=dnsmasq | unbound | none | default (default empty, equivalent to “default”)

Want to control DNS yourself?  NetworkManager makes it easy!  Don’t want to?  NetworkManager makes that easy too! When you set this option to ‘dnsmasq’ NetworkManager will configure dnsmasq as a local caching nameserver, including split DNS for VPN tunnels.  If you set it to ‘none’ then NetworkManager won’t touch /etc/resolv.conf and you can use dispatcher scripts that NetworkManager calls at various points to set up DNS any way you choose.

Leaving the option empty or setting it to “default” asks NetworkManager to own resolv.conf, updating system DNS with any information from your explicit network settings or those received from automatic means like DHCP.
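As a concrete example, handing local caching DNS over to dnsmasq is a one-line snippet:

[main]
dns=dnsmasq

and swapping in “dns=none” instead tells NetworkManager to leave /etc/resolv.conf alone entirely.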

In the upcoming NetworkManager 1.2, DNS information is written to /var/lib/NetworkManager/resolv.conf and, if NM is allowed to manage /etc/resolv.conf, that file will be a symlink to the one in /var, similar to systemd-resolved.  This makes it easier for external tools to incorporate the DNS information that NetworkManager combines from multiple sources like DHCP, PPP, IPv6, VPNs, and more.

[keyfile]
unmanaged-devices= (default empty)

Want to keep NetworkManager’s hands off a specific device?  That’s what this option is for, where you can use “interface-name:eth0” or “mac:00:22:68:1c:59:b1” to prevent automatic management of a device.  While there are some situations that require this, by default NetworkManager doesn’t touch virtual interfaces that it didn’t create, like bridges, bonds, VLANs, teams, macvlan, tun, tap, etc.  So while it’s unusual to need this option, we realize that NetworkManager can be used in concert with other tools, so it’s here if you do.
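For example, to keep eth0 entirely out of NetworkManager’s hands, a snippet would contain:

[keyfile]
unmanaged-devices=interface-name:eth0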

[connectivity]
uri=  (default empty = disabled)
interval=(default 0 = disabled)
response=  (default “NetworkManager is online”)

Connectivity checking helps users log into captive portals and hotspots, while also providing information about whether or not the Internet is reachable.  When NetworkManager connects a network interface, it sends an HTTP request to the given URI and waits for the specified response.  If you’re connected to the Internet and the connectivity server isn’t down, the response should match and NetworkManager will change state from CONNECTED_SITE to CONNECTED.  It will also check connectivity every ‘interval’ seconds so that clients can report status to the user.

If you’re instead connected to a WiFi hotspot or some kind of captive portal like a hotel network, your DNS will be hijacked and the request will be redirected to an authentication server.  The response will be unexpected and NetworkManager will know that you’re behind a captive portal.  Clients like GNOME Shell will then indicate that you must authenticate before you can access the real Internet, and could provide an embedded web browser for this purpose.
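Conceptually, the check boils down to something like the rough Python sketch below. The URI and response string here are placeholders standing in for whatever your configuration specifies, and the real implementation (in C) handles far more corner cases; this is just to show the idea:

import urllib.error
import urllib.request

CHECK_URI = "http://example.com/nm-check"      # stand-in for the configured uri=
EXPECTED = "NetworkManager is online"          # the default expected response=

def connectivity_state():
    try:
        body = urllib.request.urlopen(CHECK_URI, timeout=10).read()
    except urllib.error.HTTPError as e:
        body = e.read()                        # an error page, often a captive-portal login
    except OSError:
        return "NONE"                          # request failed outright: no connectivity
    if EXPECTED in body.decode("utf-8", errors="replace"):
        return "FULL"                          # expected response: CONNECTED
    return "PORTAL"                            # unexpected response: likely a captive portal

print(connectivity_state())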

Upstream connectivity checking is disabled by default, but some distribution variants (like Fedora Workstation) are now enabling it for desktops, laptops, and workstations.  On a server or embedded system, or where traffic costs a lot of money, you probably don’t want this feature enabled.  To turn it off you can either remove your distro-provided connectivity package (which just drops a file in /etc/NetworkManager/conf.d) or you can remove the options from NetworkManager.conf.

Special NetworkManager data files

In the normal course of network management sometimes non-configuration data needs to persist.  NetworkManager does this in the /var/lib/NetworkManager directory, which contains a few different files of interest:

seen-bssids

This file contains the BSSIDs (MAC addresses) of WiFi access points that NetworkManager has connected to for each configured WiFi network.  NetworkManager doesn’t do this to spy on you (and the file is readable only by root), but instead to automatically connect to WiFi networks that do not broadcast their SSID.  You almost never need to touch this file, but if you are concerned about privacy feel free to delete this file periodically.

timestamps

Each time you connect to a network, whether wired, WiFi, etc, NetworkManager updates the timestamp in this file.  This allows NetworkManager to determine which network you last used, which can be used to automatically connect you to more preferred networks.  NetworkManager also uses the timestamp as an indicator that you have successfully connected to the network before, which it uses when deciding whether or not to ask for your WiFi password when you get randomly disconnected or the driver fails.

NetworkManager.state

This file stores persistent user-determined state for Airplane mode for each technology like WiFi, WWAN, and WiMAX.  Normally this is controlled by hardware buttons, but some systems don’t have hardware buttons or the drivers don’t work, plus that state is not persistent across boots.  So NetworkManager stores a user-defined state for each radio type and will ensure the radio stays in that state across reboots too.

DHCP lease and configuration files

When you obtain a DHCP lease, that lease may last longer than your connection to that network.  To ensure that you receive a nominally stable IP address the next time you connect, or to ensure that your TCP sessions are not broken if there is a network hiccup, NetworkManager stores the DHCP lease and attempts to acquire the same lease again.  These files are stored per-connection to ensure that a lease acquired on your home WiFi or ethernet network is not used for work or Starbucks.  Temporary DHCP configuration files are also stored here, which are constructed based on your preferences and on generic DHCP configuration files in /etc for each supported DHCP client.  If you want to wipe the DHCP slate clean, feel free to remove any of the lease or configuration files.

And that’s it for this time, stay tuned for the next part in this series!

January 30, 2015

Back from DevX hackfest

I’m now back from a week in Cambridge at the developer experience hackfest. This was a great event; it was a lot of fun to meet people again, and we got a lot of things done. I spent a lot of time talking to people about things related to xdg-app and sandboxed applications, both spreading information and actually implementing features.

I spent some time with Emmanuele, Ryan and Lars working on glib stuff, which resulted in the G_DECLARE_*_TYPE macros finally being merged. Additionally I reviewed the new list model abstraction which I hope we can land soon, and Ryan and I worked out a new fancy __attribute__(cleanup) approach that we hope to merge into glib soon.

We also worked a bit on Gtk+ OpenGL support. Based on feedback from early users we’re doing some changes in how GL contexts are created to allow you to configure them in more detail. We also decided that we want to completely drop support for legacy OpenGL contexts, as these had issues cooperating with Core 3.2 contexts, and because we don’t live in the 90s anymore. Carlos was working on converting GtkPopover to use (override redirect) toplevels on X11, and I gave him moral support and generally hated on ancient crappy X11 behaviour.

Props to Collabora and Philip for arranging a great event!

January 26, 2015

So much to learn..

At lunch a few days ago, I was discussing with a coworker how right now I’m feeling a little overwhelmed. Not completely surprising, I suppose, given the volume of new things I need to learn. But as I’m getting acclimated in my new job, it’s becoming clearer which things I either don’t know well enough or am completely unfamiliar with.

I’m only now realizing that not only has the scope of what I’m working on changed, but also how I work on things. During the ten years I worked on Fedora, because we were woefully understaffed and buried alive in bugs, there was never really time to spend a lot of time on an individual bug, unless a lot of users were hitting it. Because of this, the things that I ended up fixing were usually fairly small fixes. The occasional NULL pointer dereference. Perhaps a use after free. Maybe linked-list corruption. Pretty basic stuff that anyone with familiarity with any part of the kernel could figure out reasonably quickly. Do enough of this, and it becomes less about engineering, and more about pattern recognition.

Even a lot of the bugs that trinity finds aren’t terribly invasive fixes.
(Screwed up error paths being probably the most common thing it picks up, because nothing else really ever tests them).

The more complicated bugs, like a WARN_ON being hit ? That could require a lot more understanding, which could take a considerable amount of time to build up.

As Fedora kernels were so close to mainline, a lot of the time we could chase up the maintainer, pass the bug along, and move onto something else. The result of ten years of this way of working is that I have a passing understanding of how lots of parts of the kernel interact from a 10,000ft perspective, and perhaps some understanding of some nuances based on past interactions, but no deep architectural understanding of, for example, how an sk_buff traverses the various parts of the network stack. A warning deep in the tcp guts, caused by a packet that’s been through bonding, netfilter, etc ? I’m not your guy. At least today.

Thankfully I’ve got some time to ramp up my learning on unfamiliar parts of the kernel, and get a better understanding of how things work on a deeper level. I suspect I’ll end up turning at least some of the stuff I learn into future posts here.

In the meantime, I’ve got a lot of reading to do.
I felt a little reassured at least when my coworker responded “yeah, me too”.

The post So much to learn.. appeared first on codemonkey.org.uk.

January 21, 2015

Thoughts on long-term stable kernels.

I remembered something that I found eye-opening while interviewing/phone-screening over the last few months.

The number of companies that base their systems (especially those that don’t actually distribute their kernels outside the company) on the long-term stable releases of Linux caught me by surprise. We dismissed the idea of basing on long-term stable releases in Fedora after giving it a try circa Fedora 14, and it generally being a disaster because the bugs being fixed didn’t match up much to the bugs our users were seeing. We found that we got more bugs we cared about being fixed by sticking to the rolling model that Fedora is known for.

After discussing this with several potential employers, I now have a different perspective on this, and can see the appeal if you have a limited use-case, and only care about a small subset of hardware.

The general purpose “one size fits all” model that distribution kernels have to fit is a much bigger problem to solve, and with the feedback loop of “stable release -> bug -> report upstream -> fixed in mainline -> backport to next stable release” being so long, it’s not really a surprise that just having users be closer to bleeding edge gets a higher volume of bugs fixed (though at the cost of all the shiny new bugs that get introduced along the way) unless you have a RHEL-like small army of developers backporting fixes.

Finally, nearly everyone I talked to who uses long-term stable was also carrying some ‘extras’, such as updated drivers for hardware they cared about (in some cases, out of tree variants that aren’t even synced with linux-next yet).

It’s a complicated mess out there. “We run linux stable” doesn’t necessarily tell you the whole picture.

The post Thoughts on long-term stable kernels. appeared first on codemonkey.org.uk.

January 19, 2015

The Whole Damn World Takes Effect to NetworkManager 1.0


2004

Facebook launched.

The first Ubuntu release appeared.

It was the Year of the Linux Desktop.

Novell had just bought Ximian and Mono happened.

Google IPOed.

Firefox 1.0 showed up.

This was your cellphone and PDAs were still a thing.

This love took you over and made you think you got it.

And NetworkManager was first released.

Fast forward to 2014…


NetworkManager 1.0!

Right before the 2014 holidays, and more than 10 years after the first line of NetworkManager was typed, we released version 1.0.  A huge milestone on the way to making NetworkManager more cooperative, more flexible, more configurable, and more useful than ever before.

How you ask?

1: libnm: the new GLib client library

For all the GLib/GObject users out there, we’ve rebuilt libnm-util and libnm-glib from the ground up into a new single library called libnm.  It uses GDBus instead of dbus-glib.  It provides GIO-style asynchronous methods. It also exposes IP addresses, MAC addresses, and other properties as strings instead of byte arrays, and combines the old NMClient and NMRemoteSettings objects into a single NMClient object, among other things.

from gi.repository import GLib, NM

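# Print the IPv4 addresses currently configured on every device
# that NetworkManager knows about.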
for dev in NM.Client.new(None).get_devices():
    ipcfg = dev.get_ip4_config()
    if ipcfg:
        for addr in ipcfg.get_addresses():
            print "(%s) %s/%d" % (dev.get_iface(), addr.get_address(), addr.get_prefix())

2: a smaller, faster DHCP client

While it doesn’t do DHCPv6 (yet!) this internal client (based off systemd/connman code) is much faster than dhclient and dhcpcd, and doesn’t consume huge amounts of memory like dhclient.  Use the ‘dhcp=internal’ option in NetworkManager.conf to enable it and let us know how it works.  We’ll be adding DHCPv6 support and enhancing the recognized options in the near future.
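If you want to try it, the whole change is one line in NetworkManager.conf (or a snippet in /etc/NetworkManager/conf.d):

[main]
dhcp=internal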

3: configure and quit

Have a more static configuration and still want to use NetworkManager configuration and API to manage it?  The ‘configure-and-quit=yes’ option in NetworkManager.conf will configure your interfaces and quit the NM process, spawning small helpers to preserve DHCP and IPv6 addresses.  This saves cycles (and therefore money) and is simpler to manage.

4: more cooperative

Continuing the trend, NetworkManager 1.0 does a much better job of leaving externally configured interfaces alone until you tell it to do something.  In addition to improvements for IPv6 sysctl recognition and user-added route preservation, externally created virtual interfaces are no longer automatically set IFF_UP, and NetworkManager handles external master/slave relationship changes more smoothly.

5: more powerful nmcli

We’ve added PolicyKit and interactive password support to nmcli, allowing full command-line-only operation for most network connections, even for less privileged users.  There’s a new ‘nmcli dev connect’ command that brings up an interface using the best available connection.  You can also delete virtual interfaces directly through nmcli.

6: improved IPv6

We’ve ensured that if network interfaces are supposed to be down and unconfigured, that the kernel doesn’t assign a link-local address to them, to prevent potential security issues when you think networking is down.  We’ve also added support for IPv6 WWAN connections and fixes to respect router-delivered MTUs.

7: Bluetooth DUN support

Bluez5 changed API for Dial-Up-Networking functionality, which broke the NetworkManager support.  At long last we’ve added that support back, no thanks to Bluez.  Happy mobile networking!

8: more flexible and cooperative routing

Every interface that can have a default route now gets one, and NetworkManager manages the priorities to ensure they don’t conflict.  Plus, if you need to, you can manually manage priorities on a per-connection basis to prefer WiFi over WWAN or WWAN over ethernet, or whatever you need.

9: fewer dependencies

We’ve also removed some direct dependencies (PolicyKit), slimmed down code, and split functionality into selectable plugins, leading to easier installs on limited systems and better configurability.

That’s just the tip of the iceberg; we’ve improved almost every part of NetworkManager and we’re not stopping there.  We’re planning improvements to container use-cases, WiFi, VPNs, power savings, client APIs, and much  more.  2015 is gonna be a great year, and not just because the version number is greater than 1!

January 15, 2015

UAT on RTL-SDR update

About a year ago, when I started playing with ADS-B over 1090ES, I noticed that small airplanes heavily favour UAT on 978 MHz, because it's cheaper. Thus, for the purposes of independent on-board traffic, it would be important to tap directly into UAT. If I mingle with airliners and their 1090ES, it's in controlled airspace anyway (yes, I know that most collisions happen in controlled airspace, but I'm grasping at excuses to play with UAT here, okay).

UAT poses a large challenge for RTL-SDR because of its relatively high data rate: 1.041667 Mbit/s. RTL 2832U can only sample at 3.2 MS/s, and is only stable at 2.8 MS/s. Theoretically it should be enough, but everything I saw out there only works with 8 samples per bit, for weak signals. So, for initial experiments, I thought to try a trick: self-clocking. I set the sample rate to 2083334, and then do no clock recovery whatsoever. It hits where it hits, and if the sample points land well on the bits, the packet is recovered; otherwise it's lost.
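The gist of that trick, written as a rough numpy sketch rather than the actual code in the repo (which deals with the raw unsigned 8-bit I/Q straight from the dongle), is something like this:

import numpy as np

SAMPLE_RATE = 2083334   # exactly 2 samples per bit at 1.041667 Mbit/s

def slice_bits(iq):
    # iq: complex baseband samples at SAMPLE_RATE, centered on 978 MHz
    # Instantaneous frequency from the phase difference of adjacent samples
    dphi = np.angle(iq[1:] * np.conj(iq[:-1]))
    # UAT is binary FSK: one sign of the deviation is a 1, the other is a 0.
    # No clock recovery at all: just take every other sample as a bit decision.
    return (dphi[::2] > 0).astype(np.uint8)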

I threw together some code, ran it, and it didn't work. Only received white noise. Oh, well. I moved the repo into a dusty corner of Github and forgot about it.

Fast forward a year, and a gentleman by the name of Oliver Jowett noticed a big problem: the phase was computed incorrectly. I fixed that up and suddenly saw some bits recovered (as it happened, zeroes and ones were swapped, which Oliver had to correct again).

After that, things started to move forward. Having bits recovered allowed me to measure reception, and I found that the antenna that I built for the 978 MHz band was much worse than the stock antenna for TV. Imagine my surprise and disappointment: all that soldering for nothing. I don't know where I screwed up, but some suggest that the computer and dongle produce RF noise that screws with the antenna, and a length of coax helps with that despite the losses incurred by the coax (/u/christ0ph liked that in particular).

This is bad.

This is good. Or better at least.

From now on, it's the error recovery. Unfortunately, I have no clue what this means:

The FEC parity generation shall be based on a systematic RS 256-ary code with 8-bit code word symbols. FEC parity generation for each of the six blocks shall be a RS (92,72) code.

Quoted from Annex 10, Volume III, 12.4.4.2.1.

P.S. Observing Oliver's key involvement, the cynical may conclude that the whole premise of Open Source is that you write drek that does not work, upload it to Github, and someone fixes it up for you, free of charge. ESR wrote a whole book about it ("with enough eyes all bugs are shallow")!

January 08, 2015

the new job reveal.

I let the cat out of the bag earlier this afternoon on twitter. Next Monday is day one of my new job at Akamai. The scope of my new role is pretty wide, ranging from the usual kernel debugging type work, to helping stabilize production releases, proactively finding new bugs, misc QA work, development of some new tooling and a whole bunch of other stuff I can’t talk about just yet. (And probably a whole slew of things I don’t even know about yet).

It seemed almost serendipitous that I’ve ended up here. Earlier this year, I read Fatal System Error, a book detailing how Prolexic got founded. It’s a fascinating story beginning with DDoS’s of offshore gambling sites and ending with Russian organised crime syndicates. I don’t know how much of it got embellished for the book, but it’s a good read all the same. Also, Amazon usually has used copies for 1 cent. Anyway, a month after I read that book, Akamai acquired Prolexic.

Shortly afterwards, I found myself reading another Akamai-related book, No Better Time, the biography of the late founder of Akamai, Danny Lewin. After reading it, I decided it was at least going to be worth interviewing there.

The job search led me to a few possibilities, and the final decision to go with Akamai wasn’t an easy one to make. The combination of interesting work, an “easy to commute to” office (I’ll be at the Kendall square office in Cambridge, MA) and a small team that seemed easy to get along with (famous last words) is what decided it. (That, and all the dubstep).

It’s going to be an interesting challenge ahead to switch from the mindset of “bug that affects a single computer, or a handful of users” to “something that could bring down a 200,000 node cluster”, but I think I’m up for it. One thing I am definitely looking forward to is only caring about contemporary hardware, and a limited number of platforms.

I apologize in advance for any unexpected internet outages. I swear it was like that when I found it.

the new job reveal. is a post from: codemonkey.org.uk

Swift and balance

Swift is on the cusp of getting yet another intricate mechanism that regulates how partitions are placed: so-called "overload". But new users of Swift still keep asking what weights are, and now this? I am not entirely sure it's necessary, but here's a simple explanation of why we're ending up with attempts at this complexity (thanks to John Dickinson on IRC).

Suppose you have a system that spreads your replicas (of partitions) across (failure) zones. This works great as long as your zones are about the same size, usually a rack. But then one day you buy a new rack with 8TB drives and suddenly the new zone is several times larger than the others. If you do not adjust anything, it ends up only a quarter full at best.

So, fine, we add "weights". Now a zone that has weight 100.0 gets 2 times more replicas than a zone with weight 50.0. This allows you to fill zones better, but this must compromise your dispersion and thus durability. Suppose you only have 4 racks: three with 2TB drives and one with 8TB drives. Not an unreasonable size for a small cloud. So, you set weights to 25, 25, 25, 100. With a replication factor of 3, there's still a good probability (which I'm unable to calculate, although I feel it ought to be easy for someone better educated) that the bigger node will end up with 2 replicas for some partitions. Once that node goes down, you lose redundancy completely for those partitions.
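Just to put numbers on that intuition (this is plain proportional arithmetic, not the actual ring builder):

weights = [25, 25, 25, 100]   # three racks of 2TB drives, one of 8TB drives
replicas = 3

total = float(sum(weights))
for zone, w in enumerate(weights):
    print("zone %d wants %.2f replicas of each partition" % (zone, replicas * w / total))

The big zone "wants" about 1.71 replicas of every partition, so a strictly proportional placement has to put 2 replicas of many partitions into that single zone.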

In the small-cloud example above, if you care about your customers' data, you have to eat the imbalance and underutilization until you retire the 2TB drives [1].

<clayg> torgomatic: well if you have 6 failure domains in a tier but their sized 10000 10 10 10 10 10 - you're still sorta screwed

My suggestion would be to just ignore all the complexity we thoughtfully provided for the people with "screwed" clusters. Deploy and maintain your cluster to make it easy for the placement and replication: have a good number of more or less uniform zones that are well aligned to natural failure domains. Everything else is a workaround -- even weights.

P.S. Kinda wondering how Ceph deals with these issues. It is more automagic when it decides what to store where, but surely there ought to be a good and bad way to add OSDs.

[1] Strictly speaking, other options exist. You can delegate to another tier by tying 2 small racks into a zone: yet another layer of Swift's complexity. Or, you could put new 8TB drives on trays and stuff them into existing nodes. But considering those options muddies the waters.

UPDATE: See the changelog for better placement in Swift 2.2.2.

Some closure on a particularly nasty bug.

For the final three months of my tenure at Red Hat, I was chasing what was possibly the most frustrating bug I’d encountered since I had started work there. I had been fixing up various bugs in Trinity over the tail end of summer that meant on a good kernel, it would run and run for quite some time. I still didn’t figure out exactly what the cause of a self-corruption was, but narrowed it down enough that I could come up with a workaround. With this workaround in place, every so often, the kernel’s NMI watchdog would decide that a process had wedged, and eventually the box would grind to a halt. Just to make it doubly annoying, the hang would happen at seemingly random intervals. Sometimes I could trigger it within an hour, sometimes it would take 24 hours.

I spent a lot of time trying to narrow down exactly the circumstances that would trigger this, without much luck. A lot of time was wasted trying to bisect where this was introduced, based upon the bad assumption that earlier kernels were ‘good’. During all this, a lot of theories were being thrown around, and people started looking at multiple areas of the kernel: the watchdog code, the scheduler, FPU context saving, the page fault handling, and time management. Along the way several suspect areas were highlighted, and some things got fixed/cleaned up, but ultimately they didn’t solve the problem. (A number of other suspect areas of code were highlighted that don’t have commits yet.)

Then, right down to the final week before I gave all my hardware back to Red Hat, Linus managed to reproduce similar symptoms, by scribbling directly to the HPET. He came up with a hack that at least made the kernel survive for him. When I tried the same patch, the machine ran for three days before I interrupted it. The longest it had ever run.

The question remains: what was scribbling over the HPET in my case? The /dev/hpet node doesn’t allow writing, even as root. You can mmap /dev/mem if you know the address of the HPET and write to it directly (a rough sketch of what that would look like follows the list below), but..
1. That would be a root-only possibility, and..
2. Trinity blacklists /dev/mem, and never touches it.
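(For the curious, here is a rough sketch of the kind of poke ruled out above. This is purely my own illustration, not anything trinity contains, and it assumes the HPET lives at the conventional 0xfed00000 with its main counter at offset 0xf0, i.e. physical address 0xfed000f0. On kernels built with CONFIG_STRICT_DEVMEM the mmap would simply be refused.)

    # Hypothetical illustration only: a root-only scribble over the HPET
    # main counter via /dev/mem.  Do not run this on a machine you care about.
    import mmap
    import os

    HPET_BASE = 0xFED00000      # conventional HPET MMIO base on x86 (assumption)
    MAIN_COUNTER = 0xF0         # main counter register, i.e. 0xfed000f0

    fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
    try:
        regs = mmap.mmap(fd, mmap.PAGESIZE, offset=HPET_BASE)
        # Overwriting the main counter is exactly the kind of write that would
        # make the clocksource appear to jump or stop.
        regs[MAIN_COUNTER:MAIN_COUNTER + 4] = (0xDEADBEEF).to_bytes(4, "little")
        regs.close()
    finally:
        os.close(fd)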

The only two plausible scenarios I could think of were:

  • Trinity generated 0xfed000f0 as a random address, and passed that to a syscall which wrote to it. This seems pretty unlikely, and hopefully the kernel has sufficient access_ok() checks on addresses passed in from userspace. Just to be sure, I had hardwired trinity to pass in that address, and couldn’t reproduce the bug.
  • A hardware bug.
    I’m actually starting to believe this may be the case. When trinity drives the CPU load up past a certain threshold, for whatever reason, the HPET stops ticking and corrupts itself. It still seems a bit “out there”, but is more believable than the other theory at least. An interesting data point showed up when googling for the DMI string of the affected machine. Someone else had seen ‘random lockups’ that looked very similar a year earlier. The associated bugzilla had a few more traces.

So that’s where the story (mostly) ends. When I left Red Hat, I gave that (possibly flawed) machine back. Linus’ hacky workaround didn’t get committed, but he and John Stultz continue to go back and forth on hardening the clock management code in the face of screwed-up hardware, so maybe soon we’ll see something real get committed there.

It was an interesting (though downright annoying) bug that took a lot longer to get any kind of closure on than expected. Some things I learned from this experience:

  • Keep better notes.
    Every week that passed, I wished I had written down what I had done the week before. With everything else going on in my life over the last few months, I neglected to document things as well as I could have, and only had old emails to fall back on. Not every bug drags on for months like this, but when you over-optimistically think a bug is going to be solved in a few days, you tend not to bother taking extensive notes on what has been tried so far.
  • Google for the DMI string of the affected hardware pretty early on.
    That might have given us some clues a lot sooner as to what was going on. Or maybe not, but still – more data.
  • The more people looked into this bug, the more “this doesn’t look right” code was found. There’s never just one bug.

Some closure on a particularly nasty bug. is a post from: codemonkey.org.uk

January 07, 2015

continuity of various projects

I’ve had a bunch of people emailing me asking how my new job will affect various things I’ve worked on over the last few years. For the most part, not massively, but there are some changes ahead.

  • Trinity.
    This will proceed pretty much as it has over the last year, perhaps with a little more focus on various areas of the kernel than it has in the past. (Not being intentionally elusive, I’m not entirely sure myself yet).
    From discussions I’ve had so far, there may even be a spin-off into a separate tool; we’ll see.
    • I’ve been wanting to get back to the network related code for some time, and that’s probably going to happen soon-ish.
    • There is still work that could be done with the VM/FS code I’ve worked on over the last year and a half, but it’s already finding bugs, so it’s "good enough" for a while.
    • There are also still a few lingering bugs that I need to one day sit down and figure out, but they happen so infrequently that I’ve not found the time so far.
    • Finally, there is also a bunch of other feature work that needs fleshing out that I don’t see myself getting to any time soon, that I’ll dump in the TODO over the next week.
  • Upstream testing.
    Things like my daily running of various stress tools against Linus’ master branch with debug builds won’t happen to the level they were previously. The good news is that instead I’ll be doing a lot more testing on various stable branches, which I never did a whole lot of in the past. I expect over time as I get a better feel for my workload I might be able to ramp back up master testing somewhat, but it will be a secondary thing that I do in addition to everything else. Right now, I’m not really doing any of this, so other people running things like trinity against 3.19/3.20 would probably be a good idea if you like collecting crashes, at least until I find my feet again.
  • Coverity scans.
    • I’ll keep doing the scan.coverity runs, hopefully once per -rc. I’ve lapsed a little right now, but will pick this back up again soon to get things back up to current. It takes a lot longer to run a whole scan from home now that I don’t have access to my 24-thread Nehalem (was: < 1 hour for compile, pack & upload to Coverity; now: ~4 hours just for the compile), so I’ll automate these to run overnight when Linus makes a new snapshot, at least until I put together something a little faster than my ~7 year old Core Duo.
    • I’m not sure I’m going to have a huge amount of time for triage work yet, so we’ll see how that works out. I might have to focus just on certain areas.
    • There will be some element of related work in my new job, but I’m not sure to what degree yet, more info on that as I figure it out.
  • Fedora.
    The biggest change of all.
    • I’m just not going to have to maintain packages, read mail etc for Fedora, so those all got orphaned yesterday.
    • Josh & Justin pretty much handled all of the Fedora kernel work for the last year or so, so me walking away is not going to make a huge difference there.
    • I might still occasionally take a peek at Fedora bugzilla to see if there’s anything similar to a particular bug, but don’t expect to be doing triage work.
    • I’ll still keep a Fedora box or two at home for a while, but work-wise, I’m expecting a lot more Debian in my life. It’s been over a decade since I last used it seriously. That should prove to be fun.
  • Conferences etc.
    I’ll hopefully be seeing some familiar faces again later this year, and possibly meeting some new ones. The only real change here will be the lack of fudcon/flock/whatever it’s called this week.

So that’s about it. For at least the first few months of 2015, I expect to be absorbed in getting acclimated to my new job, so I won’t be as visible as usual, but I’m not going to disappear from the Linux community forever, something I made clear I didn’t want to happen at every place I interviewed.

continuity of various projects is a post from: codemonkey.org.uk

January 05, 2015

Going beyond ZFS by accident

Yesterday, CKS wrote an article that tries to redress the balance a little in the coverage of ZFS. Apparently, detractors of ZFS were pulling quotes from his operational gripes, so he came out forcefully with the observation that ZFS remains the only viable advanced filesystem (on Linux or not). CKS has no time for your btrfs bullshit.

The situation where weselovskys of the world hold the only viable advanced filesystem hostage and call everyone else "jackass" is very sad for Linux, but it may not be quite so dire, because it's possible that events are overtaking btrfs and ZFS. I am talking about the march of super-advanced, distributed filesystems downstream.

It all started with the move beyond POSIX, which, admittedly, seemed very silly at the time. The early DHT was laughable and I remember how they struggled for years to even enable writes. However, useful software has been developed since then.

The poster child here is Sage's Ceph, which relies on plain old XFS for back-end storage, composes an object store out of nodes (called RADOS), and adds a POSIX layer on top for those who want it. It is in field use at Dreamhost. I can easily see someone using it where a ZFS-backed NFS/CIFS cluster would otherwise be deployed.

Another piece of software that I place in the same category is OpenStack Swift. I know, Swift is not competing with ZFS directly. The consistency of its metadata layer is not sufficient to emulate POSIX anyway. However, you get all those built-in checksums and all that durability jazz that CKS wants. And Swift aims even further up in scale than Ceph, being in field use at Rackspace. So, what seems to be happening is that folks who really need to go large are willing at times to forsake even compatibility with POSIX, in part to get the benefits that ZFS provides to CKS. Mercado Libre is one well-hyped case of migration from a pile of NFS filers to a Swift cluster.

Now that these systems are established and have proven themselves, I see constant efforts to take them downscale. The original Swift 1.0 did not even work right if you had fewer than 3 nodes (strictly speaking, if you had fewer zones than the replication factor). This has since been fixed by so-called "as good as possible placement" around 1.13 Havana, so you can easily run a 1-node Swift. Ceph, similarly, would not consider PGs on the same node healthy, and it's still a bit of a PITA even in Firefly. So yeah, there are issues, but we're working on it. And in the process, we're definitely coming for chunks of ZFS's space.

blitz2 for GNOME catastrophe

After putting blitz2 on my Nexus and poking at GUI buttons, I reckoned that it might be time to stop typing in a terminal like a caveman on Linux too (I'm joking, but only just). And the project rolled along smoothly for a while. What took me 2 months to accomplish on Android took only 2 days in GNOME. However, 2 lines of code from the end, it all came to an abrupt halt when I found out that it is impossible to access the clipboard from a JavaScript application. See GNOME bugs 579312, 712752.

January 03, 2015

2 months in

Today is my two month monthiversary at my new job. I haven’t had time so far to sit back, reflect and let people know, but now, while packing boxes for our upcoming move downtown, I welcome the distraction.

I dove into the black hole. I joined the borg collective. I’m now working for the little search engine that could.

I sure had my reservations while contemplating this choice. This is the first job I’ve had that I had to interview for – and quite a bit, I might add (though I have to admit that curiosity about the interviewing process is what made me go for the interviews in the first place – I wasn’t even considering a different job at that time). My first job, a four month high school math teaching stint right after I graduated, was suggested to me by an ex-girlfriend, and I was immediately accepted after talking to the headmaster (that job is still a fond memory for many reasons). For my first real job, I informally chatted over dinner with one of the four founders, and then I started working for them without knowing if they were going to pay me. They ended up doing so by the end of the month, and that was that. The next job was offered to me over IRC, and from that Fluendo and Flumotion were born. None of these were through a standard job interview, and when I interviewed at Google I had much more experience on the other side of the interviewing table.

From a bunch of small startups to a company the scale of Google is a big step up, so that was my main reservation. Am I going to be able to adapt to a big company’s way of working? On the other hand, I reasoned, I don’t really know what it’s like to work for a big company, and clearly Google is one of the best of those to work for. I’d rather try out working for a big company while I’m still considered relatively young job-market-wise, so I rack up some experience with both sides of this coin during my professionally mobile years.

But I’m not going to lie either – seeing that giant curious machine from the inside, learning how they do things, being allowed to pierce the veil and peek behind the curtain – there is a curiosity here that was waiting to be satisfied. Does a company like this have all the big problems solved already? How do they handle things I’ve had to learn on the fly without anyone else to learn from? I was hiring and leading a small group of engineers – how does a company that big handle that on an industrial scale? How does a search query really work? How many machines are involved?

And Google is delivering in spades on that front. From the very first day, there’s an openness and a sharing of information that I did not expect. (This also explains why I’ve always felt that people who joined Google basically disappeared into a black hole – in return for this openness, you are encouraged to swear yourself to secrecy towards the outside world. I’m surprised that that can work as an approach, but it seems to.) By day two we did our first commit (obviously nothing that goes to production, but still). In my first week I found my way to the elusive (to me at least) rooftop terrace by searching through internal documentation. The view was totally worth it.

So far, in my first two months, I’ve only had good surprises. I think that’s normal – even the noogler training itself tells you about the happiness curve, and how positive and excited you feel the first few months. It was easy to make fun of some of the perks from an outside perspective, but what you couldn’t tell from that outside perspective is how these perks are just manifestations of common engineering sense on a company level. You get excellent free lunches so that you go eat with your team mates or run into colleagues and discuss things, without losing brain power on deciding where to go eat (I remember the spreadsheet we had in Barcelona for a while for bike lunch once a week) or losing too much time doing so (in Barcelona, all of the options in the office building were totally shit. If you cared about food it was not uncommon to be out of the office area for ninety minutes or more). You get snacks and drinks so that you know that’s taken care of for you and you don’t have to worry about getting any or leaving your workplace for them. There are hammocks and nap pods so you can take a nap and be refreshed in the afternoon. You get massage points for massages because a healthy body makes for a healthy mind. You get a health plan where the good options get subsidized because Google takes that same data-driven approach to HR and figured out how much they save by not having sick employees. None of these perks are altruistic as such, but there is also no pretense of them being so. They are just good business sense – keep your employees healthy, productive, focused on their work, and provide the best possible environment to do their best work in. I don’t think I will ever make fun of free food perks again given that the food is this good, and possibly my favorite part of the day is the smoothie I pick up from the cafe on the way in every morning. It’s silly, it’s small, and they probably only do it so that I get enough vitamins to not get the flu in winter and miss work, but it works wonders on me and my morning mood.

I think the bottom line here is that you get treated as a responsible adult by default in this company. I remember silly discussions we had at Flumotion about developer productivity. Of course, that was just a breakdown of a conversation that inevitably stooped to the level of measuring hours worked as a measurement of developer productivity, simply because that’s the end point of any conversation on that topic that spirals out of control. Counting hours worked was the only thing that both sides of that conversation understood as a concept, and paying for hours worked was the only thing that both sides agreed on as a basic rule. But I still considered it a major personal fault to have let the conversation back then get to that point; it was simply too late by then to steer it back in the right direction. At Google? There is no discussion about hours worked, work schedule, expected productivity in terms of hours, or any of that. People get treated like responsible adults, are involved in their short-, mid- and long-term planning, feel responsible for their objectives, and allocate their time accordingly. I’ve come in really early and I’ve come in late (by some personal definition of “on time” that, ever since my second job 15 years ago, I was lucky enough to define as ’10 AM’). I’ve left early on some days and stayed late on more days. I’ve seen people go home early, and I’ve seen people stay late on a Friday night so they could launch a benchmark that was going to run all weekend so there’d be useful data on Monday. I asked my manager one time if I should let him know if I get in later because of a doctor’s visit, and he told me he didn’t need to know, but it helps if I put it on the calendar in case people wanted to have a meeting with me at that hour.

And you know what? It works. Getting this amount of respect by default, and seeing a standard to live up to set all around you – it just makes me want to work even harder to be worthy of that respect. I never had any trouble motivating myself to do work, but now I feel an additional external motivation, one this company has managed to create and maintain over the fifteen+ years they’ve been in business. I think that’s an amazing achievement.

So far, so good, fingers crossed, touch wood and all that. It’s quite a change from what came before, and it’s going to be quite the ride. But I’m ready for it.

(On a side note – the only time my habit of wearing two different shoes was ever considered a no-no was at my previous job – the dysfunctional one where they still owe me money, among other stunts they pulled. I think I can now empirically elevate my shoe habit to a litmus test for a decent job, and I should have listened to my gut on the last one. Live and learn!)


December 31, 2014

blitz2 for Android

An additional upside for blitz2 is that an HTTP client is available on just about any platform. So if I want to share the clipboard with my Nexus tablet, I can, without running an sshd on it.

This is my first Android app, and I didn't touch Java in many years. So, first impressions.

I forgot how insanely wordy Java is. And doing anything takes effort, with all the factories, accessors, and whatnot.

I like checked exceptions. Too bad Python doesn't have them (probably impossible by the very nature of a dynamic language, but I've been bitten by an unexpected exception floating up from the depth of the stack before).

Android docs are excellent and one almost never needs to search for answers. Unfortunately, I managed to step into one such case: the so-called "Up" navigation. My chosen API level is 11. The contemporary docs explain how to emulate "Up" using compatibility libraries for APIs before mine, and explain how to use onNavigateUp(), which comes in API level 16. But there's absolutely nothing, anywhere, that tells you how to do it in API 11. I was walking in circles for days. The answer is actually a secret ID namespace, specifically android.R.id.home. I would never have figured it out if not for random pieces of code on the Internet. Good grief, Google. So close to perfect marks.

Oh, and one more thing: Googlers score good sanity points for reimplementing a stock Java API for HTTP (HttpURLConnection and friends). They could've easily rolled their own, but they didn't. They wrote their own runtime, but it's fully compatible with Oracle's, including the dark corners of SSL. This lets you mostly debug the difficult parts on a Linux box. Very nice. Just to see what it could have been otherwise, look at their gratuitously incompatible Base64.

UPDATE: I forgot to mention that I started with Eclipse, but it was entirely unusable due to crashing all the time (about once an hour, for no discernible reason). I was on Fedora 20 at the time. So, I used the command-line tools instead, and that worked like a charm. There's a Makefile in the blitz2 repo linked above.

December 22, 2014

Cheating around taskotron in Fedora

Yesterday's ntp vulnerability uncovered a trick for Fedora maintainers. You know how it's super annoying that you cannot push an update to F20 without F21? You must herd updates and can never do them in parallel, or else taskotron ruins innocent updates. But at the time of this writing the fixes are live in F20, but not in F21. How does Miroslav do it?

The answer is easy: he keeps ntp intentionally a few releases back in older Fedora (4.2.6p5-19 in F20), so he can bump it with impunity without regard to the newer Fedora (4.2.6p5-25 in F21). Of course, if someone were to upgrade to F21 today, he'd go from a fixed ntp to a broken ntp, but hey... at least the automated checks are defeated.

This challenge is similar to writing super ugly OpenStack code that passes PEP8 checks, only this time the outcome is actually dangerous.

December 19, 2014

Moving on from Red Hat.

After eleven and a half years, today is my final day at Red Hat.
I’ll write more about what comes next in the new year.

In the meantime, here’s a slightly edited version of a mail I sent internally yesterday.


In 2003, I got an email from Michael Johnson, about a secretive new thing Red Hat was working on called "Fedora". No-one was quite sure what it was going to be (some may argue we're still figuring it out), but he was pretty sure I'd want to be a part of it. "How'd you feel about taking care of _any_ kernel problems that come in for this thing?" he asked. I was terrified, but excited at the opportunities to learn a lot of stuff outside my usual areas of expertise.

With barely any real detail as to what I was signing up for, I jumped at the opportunity. Within my first few months, I had some concerns over whether or not I had made a good decision. Then Michael left for rPath, and I seriously started to have my doubts.

While everyone was figuring out what Fedora was going to be, I was thrown in at the deep end. "Here's Red Hat Linux 7, 8 and 9, you maintain the kernel for those now. Go." I remember looking at bugzilla, scrolling through page after page of bugs, thinking "This is going to be a nightmare". At the same time, RHEL 3 was really starting to take shape. I looked at what the guys working on RHEL were doing and thought "Well, this sucks, but those guys.. they _really_ have work to do". As much as I was buried alive in work, I relished every moment of it, learning as much as I could in what little spare time I had.

Then Fedora finally happened. For those not around back then, Fedora Core 1 was pretty much what Red Hat Linux 10 would have been from a kernel pov. A nasty hairball of patches that weren't going upstream (execshield! 4g4g! Tux! CIPE!) that even their authors had stopped maintaining, and a bunch of features backported from 2.5 to 2.4. I get the shakes when I think back to the horrors of maintaining that mess, but like the horrors of RHL before it, it was an amazing learning experience (mostly "what not to do").

But for all its warts, Fedora gained traction, and after Fedora 2 moved to a 2.6 kernel, things really started to take shape. As Fedora's community started to grow, things got even busier in bugzilla than RHL had ever been.

Then somehow I got talked into also being RHEL4 kernel maintainer for a while.
It turned out that juggling Fedora 3, Fedora 4, Rawhide, RHEL4 GA, and RHEL4 U1 means you don't get a lot of time to sleep. So after finding another sucker to deal with the RHEL work, I moved back to just doing Fedora work, and in another big turning point, we started to slowly grow out the Fedora kernel team.

Over the years that followed, the only thing that remained constant was the inflow of bugs. At any given time we had a thousand or so bugs open, with at best 3 people, at worst 1 person working on them. I'm incredibly proud of what we've managed to achieve with the Fedora kernel. More than just the base for RHEL, it changed the whole landscape of upstream kernel development.

  • Our insistence on shipping the latest code, with as few 'special sauce' patches as possible, won over a lot of upstream developers that wouldn't have given us the time of day for similar bugs back in the RHL days. Sometimes painful for our users, but Linux as a whole got better because of our stance here.
  • Decisions like Fedora enabling debug options by default in betas shook out an unbelievable number of bugs almost as soon as they got introduced. Again, painful for users, but from a quality standpoint, we found a ton of bugs in code others were racing to ship first and call "enterprise ready".
  • Fedora enabling features sometimes before they were fully baked got us a lot of love from their respective upstream maintainers.

Despite this progress though, I always felt we were on a treadmill making no real forward progress. That constant 1000 or so bugs kept nagging at me. As fast as we closed them out, a new batch would arrive.

In more recent years, we tried to split the workload within the team so we could do more proactive bug-finding before users even find them. My own 'trinity' project has found so many serious bugs (filesystem corruptors, root holes, vm corner cases, the list goes on) that it got to be almost a full time job just tracking everything.

I used to feel that leaving Red Hat wasn't something I could do. On a few occasions I actually turned down offers from potential employers, because "What about the Fedora kernel?". For the first time since the project began, I feel like I've left things in more than capable hands, and I'm sure things will continue to move in the right direction.

3 RHL's. 5 and a half RHEL's. 21 Fedoras. You don't even want to know how much hardware I've destroyed in the line of duty in this time. It's been uh, an experience.

So, after all this time, one thing I have learned, is that all this was definitely one of my better decisions. I hope that my next decision turns out to be an equally good one.

Moving on from Red Hat. is a post from: codemonkey.org.uk

December 12, 2014

blitz2

You know how some people attach several monitors to one PC? I don't. I just have several PCs. But then I want copy-paste to work transparently (as transparently as possible). For several years I used blitz to copy the clipboard around. It works well enough, but once you have 3 computers, it gets somewhat cumbersome to type the hostname. Also, it always bothered me how it rides on ssh authentication. I wanted something independent from ssh.

Behold blitz2. Instead of passing the clipboard to the host where it's needed directly, the clipboard is uploaded to an HTTP server. Seems more complex at first, but it's actually much better, because previously the PC where you copy had to authenticate to the PC where you paste. Now the authentication is symmetric. So, all clients are configured exactly the same, and all can upload and download the clipboard no matter who trusts what ssh keys.
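To make the shape of this concrete, here is a minimal sketch of the idea (not blitz2's actual code or protocol; the endpoint URL and the shared bearer token are my assumptions): every client pushes its clipboard to one well-known HTTP server and pulls it back on whatever machine needs it, so no client ever has to trust another client's ssh keys.

    # A minimal sketch of the blitz2 idea, not its actual implementation:
    # every client talks to one shared HTTP server and authenticates the same
    # way in both directions (a shared bearer token here, purely as an assumption).
    import urllib.request

    SERVER = "https://clip.example.org/clip"              # hypothetical endpoint
    HEADERS = {"Authorization": "Bearer shared-secret"}   # same config on every client

    def copy(text):
        """Upload the local clipboard contents to the server."""
        req = urllib.request.Request(SERVER, data=text.encode("utf-8"),
                                     headers=HEADERS)     # POST by default
        urllib.request.urlopen(req).close()

    def paste():
        """Fetch whatever was uploaded last, from any machine."""
        req = urllib.request.Request(SERVER, headers=HEADERS)  # plain GET
        with urllib.request.urlopen(req) as resp:
            return resp.read().decode("utf-8")

The symmetry is the point: the machine you copy on and the machine you paste on run exactly the same code with exactly the same configuration, and neither needs to know the other's hostname or ssh keys.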

December 02, 2014

Free-riding and copyleft in cultural commons like Flickr

Flickr recently started selling prints of Creative Commons Attribution-Share Alike photos without sharing any of the revenue with the original photographers. When people were surprised, Flickr said “if you don’t want commercial use, switch the photo to CC non-commercial”.

This seems to have mostly caused two reactions:

  1. “This is horrible! Creative Commons is horrible!”
  2. “Commercial reuse is explicitly part of the license; I don’t understand the anger.”

I think it makes sense to examine some of the assumptions those users (and many license authors) may have had, and what that tells us about license choice and design going forward.

Free ride!!, by Dhinakaran Gajavarathan (https://www.flickr.com/photos/dhinakaran/), under CC BY 2.0

Free riding is why we share-alike…

As I’ve explained before here, a major reason why people choose copyleft/share-alike licenses is to prevent free rider problems: they are OK with you using their thing, but they want the license to nudge (or push) you in the direction of sharing back/collaborating with them in the future. To quote Elinor Ostrom, who won a Nobel for her research on how commons are managed in the wild, “[i]n all recorded, long surviving, self-organized resource governance regimes, participants invest resources in monitoring the actions of each other so as to reduce the probability of free riding.” (emphasis added)

… but share-alike is not always enough

Copyleft is one of our mechanisms for this in our commons, but it isn’t enough. I think experience in free/open/libre software shows that free rider problems are best prevented when three conditions are present:

  • The work being created is genuinely collaborative — i.e., many authors who contribute similarly to the work. This reduces the cost of free riding to any one author. It also makes it more understandable/tolerable when a re-user fails to compensate specific authors, since there is so much practical difficulty for even a good-faith reuser to evaluate who should get paid and contact them.
  • There is a long-term cost to not contributing back to the parent project. In the case of Linux and many large software projects, this long-term cost is about maintenance and security: if you’re not working with upstream, you’re not going to get the benefit of new fixes, and will pay a cost in backporting security fixes.
  • The license triggers share-alike obligations for common use cases. The copyleft doesn’t need to perfectly capture all use cases. But if at least some high-profile use cases require sharing back, that helps discipline other users by making them think more carefully about their obligations (both legal and social/organizational).

Alternately, you may be able to avoid damage from free rider problems by taking the Apache/BSD approach: genuinely, deeply educating contributors, before they contribute, that they should only contribute if they are OK with a high level of free riding. It is hard to see how this can work in a situation like Flickr’s, because contributors don’t have extensive community contact. [1]

The most important takeaway from this list is that if you want to prevent free riding in a community-production project, the license can’t do all the work itself — other frictions that somewhat slow reuse should be present. (In fact, my first draft of this list didn’t mention the license at all — just the first two points.)

Flickr is practically designed for free riding

Flickr fails on all the points I’ve listed above — it has no frictions that might discourage free riding.

  • The community doesn’t collaborate on the works. This makes the selling a deeply personal, “expensive” thing for any author who sees their photo for sale. It is very easy for each of them to find their specific materials being reused, and see a specific price being charged by Yahoo that they’d like to see a slice of.
  • There is no cost to re-users who don’t contribute back to the author—the photo will never develop security problems, or get less useful with time.
  • The share-alike doesn’t kick in for virtually any reuses, encouraging Yahoo to look at the relationship as a purely legal one, and encouraging them to forget about the other relationships they have with Flickr users.
  • There is no community education about the expectations for commercial use, so many people don’t fully understand the licenses they’re using.

So what does this mean?

This has already gone on too long, but a quick thought: what this suggests is that if you have a community dedicated to creating a cultural commons, it needs some features that discourage free riding — and critically, mere copyleft licensing might not be good enough, because of the nature of most production of commons of cultural works. In Flickr’s case, maybe this should simply have included not doing this, or making some sort of financial arrangement despite what was legally permissible; for other communities and other circumstances other solutions to the free-rider problem may make sense too.

And I think this argues for consideration of non-commercial licenses in some circumstances as well. This doesn’t make non-commercial licenses more palatable, but since commercial free riding is typically people’s biggest concern, and other tools may not be available, it is entirely possible it should be considered more seriously than free and open source software dogma might have you believe.

  [1] It is open to discussion, I think, whether this works in Wikimedia Commons, and how it can be scaled as Commons grows.

November 30, 2014

23 years

(This post is only about music – for people not from Belgium, Luc de Vos, singer of Gorki, passed away yesterday at 52)

I am 15. I hear a song on the radio, and I don’t understand the lyrics. Why would you ask a piranha to devour you? Still, I’m intrigued. I’d only really gotten into music little by little. My earliest musical memory is hearing my parents’ record player playing ‘I want you’ by Bob Dylan. After that, it was my inexplicable arousal at seeing the ‘Hey You the Rock Steady Crew’ video in 1983 when I was 7, getting the Top Gun soundtrack on cassette (my first ever music purchase) in 1986, and watching the video for ‘I want your sex’ by George Michael in 1987 over and over on my recording of Veronica’s “Countdown”. At my confirmation (12 years old), when kids typically get some kind of bigger gift they’ve been dreaming of for a long time, I still chose a computer instead of a stereo.

I am 16, I just had my birthday. I am doing a summer job at my family’s company (which processes animal fat) and I am staying with my grandparents in Bavegem. With the money from my birthday I bought a portable stereo CD/cassette player for the incredible amount of 6000 BEF (or 150 euro as the kids would call it these days). I listen to nothing else for weeks on end. I can still hum the amazingly beautiful piano part that closes Mia from memory. It’s been my favorite song ever since.

I am 17, and learning the guitar. It turns out that Mia is quite complicated to get right, because of that perfect 3/4-5/4 tempo, or whatever you’d call it if you knew anything about music. It doesn’t help that I’m left-handed playing on a right-handed guitar, but I make the song my own. To this day though, I still can’t play and sing it at the same time. There is something about the timing of how that third line starts before the music starts, where he sings ‘Mensen als ik’, that I just can’t figure out. It’s magic – it makes this song all the better.

I am 17, and Gorky is now Gorki, with completely new band members. I see them live for the first time, at ‘De Kring’ in Merelbeke, with my best friend Jeremy. I wish I had bought all the t-shirts that night – they had a different one for each of the new songs. The album sounds so different – parts of it recorded in Africa. I don’t listen to that album enough, but I still love playing Berejager on guitar, such a beautiful intro.

I am 17, and it’s my last year of boy scout before becoming a leader. I have a mini-JIN camp called JINTRO during the year, that ends with a party. I dance with a girl to Mia, and one minute into the dance she says, ‘no no, we’re not going to do a one-tile-dance for the rest of the night. Here’s how you do it’ and she teaches me two basic moves to make a slow dance more interesting. Thank you, Karlien, for changing my life.

I am 18, and we travel through Catalunya with the boy and girl scouts group I’m in, and a local Catalan group. This is one of the CDs we brought with us as a sample of our own culture. The Catalans love it – they say it sounds like Bruce Springsteen. I can see where they’re coming from. At the end of the two weeks, the guitar player of their group nails down a really good version of Mia (without the words, of course).

I am 18, and have my first serious girlfriend. Mia is a song that runs through our history together – we must have danced to it at every party that played it (she messaged me yesterday that she immediately thought of me when she heard the news… just like I did of her). Back then, parties still had blocks of 3 slow songs every one or two hours. I miss that tradition… The moves that Karlien taught me put me well ahead of the pack of my fellow young adult males, and that paid off generously in the young adult females agreeing to dance with me at every party. (The theory of compounded interest clearly put in practice, now that I think of it)

I am 19, and one of my fellow boy scout leaders gives me an old demo cassette of Gorky. Among other things, it contains a cover of the Pixies’ “Monkey Gone to Heaven”, some of their songs that didn’t make their debut (but appeared on Boterhammen, like ‘Ik word oud’, or were turned into a b-side). It also contains the original version of Mia, as a fast-paced slurred-sung rocker. They made the right call slowing it down.

I am 21, and I have a radio show at a student radio station I helped start up. I am too young to know how the world really works and just send out interview requests to managers and record labels for bands that I like. In those days, I got to interview my favorite band, The Afghan Whigs, as well as other bands like Everclear and The Sheila Divine. But we also managed to get Luc De Vos as a guest on our radio show, and Jeremy and I interviewed him in between songs for an hour. (That tape is at my parents’ place. I have an Excel sheet that tells me exactly which box it’s in, and I hope I can recover it next time I go to Belgium.) I tell him about that demo tape that I have, and he asks for a copy. A little after that, I bring him a copy of that practice tape, I put ‘Congregation’ by The Afghan Whigs on the other side (because I want one of my favorite bands to know another one of my favorite bands), and I go past his house to drop it off. (From the news report this weekend I hear he still lived in the same street, so I can only assume he was still living in the same house he’s lived in for the last 17 years).

I am 22, and Luc De Vos plays solo at the university somewhere, in an auditorium. I think it was one of the first times he ever did that. He probably already read out a column he wrote. But I remember how amazing he was by himself, what beautiful versions he played of these songs that I knew so well, songs they usually didn’t play live because they were the slower ones. ‘Arme Jongen’, I remember him playing it there like it was yesterday.

I am 26, and I see him at various festivals, always there to either play or enjoy the music. I see him backstage with his son, recently born. He is walking around with some kind of elastic band tied around his waist that keeps his kid from running away more than ten meters from him, and it is hilarious to see in the backstage area.

Time starts moving quicker as I grow up, become an adult, and graduate from college. More and more albums. Every album still contained at least one killer song. ‘Leve de Lente’ still gives me goosebumps when those guitars crash in. ‘Vaarwel Lieveling’ is possibly his most underrated song – I don’t think I’ve ever heard that one played live. ‘Ode an die freude’, ‘We zijn zo jong’, ‘Duitsland wint altijd’ – I love the sound of resignation he has in his voice, like a deep sigh put to music. That album came with a floppy disk (!) with the lyrics. ‘Het voorspel was moordend’, ‘Tijdbom’ – while the music came back to being a bit more conventional, the lyrics got more hermetically sealed. I must admit that I slowly lost track – having moved to Barcelona at some point, it was much harder to catch them live of course. I know their first five albums the best, and while I still bought the others (having missed only one), none of them had the luxury of not having any other album in my collection to compete with like their debut album had. But there is no denying that when they were great, they were still amazing. A song like ‘Veronica komt naar je toe’ managed to pull together so many different things. The title was a recurring slogan of a Dutch channel that was popular among young people in Belgium, for lack of a Belgian alternative. Here’s a great song, with a great chorus, and his ability to sample just this one sentence to evoke a memory of youth every one of my generation remembers (while it evoked at the same time my personal memory of seeing ‘I want your sex’ on Veronica). And then he manages to evoke such a common feeling everyone has, where you are trying to grab that fleeting thing you were thinking just a second ago, straddling typically complicated-to-phrase words in Dutch with effortless ease – ‘Wat was het nu ook alweer/dat ik wou doen/het was iets belangrijks’ (or ‘What was it again/I wanted to do/it was something important’). In the beginning, his lyrics were quirky in ideas, but fairly straightforward in their phrasing. Further on in their career, they experimented quite a bit musically, but especially the lyrics could get complicated, and with exceptional and inventive phrasing.

I’m 31, and I live in Barcelona, but I travel back to Belgium because Gorki is playing their debut album, Gorky. I wrote about that concert back then, but that memory is still strong. I can’t believe that was 7 years ago…

I always enjoyed reading his columns in Zone 09 whenever I was in my hometown; I thought he had a great gift for writing. I noticed just now he left behind quite a few more books than I had, so I started tracking those down. So many of my memories have his music attached to them. His was the first band that opened me up to a wider range of music, away from the mainstream (not everybody would agree I guess, but I never considered them mainstream. Their debut album certainly was different enough from whatever was considered mainstream at the time, and as often happens this debut was only widely recognized several albums into their career, while at the same time those later albums never really got the same kind of traction.)

I loved his way of looking at the world, the way he described it in music, lyrics, writing, and interviews. Always with that cheeky look. Like, surprisingly it now turns out, so many of my generation, his music was intertwined with my growing up. Here’s a man I was hoping would live long and make much more music, and grow old playing hundreds of songs in bars and clubs, but it wasn’t to be. He set out to be a successful rock singer, whether that was tongue-in-cheek or not, and by all accounts he achieved what he set out to do. And everything he did, he did it for the best of reasons. He did it for ‘a fistful of bonnekes’.
