Kernel Planet
May 24, 2013
In addition to turning your Fedora 18 box into an iSCSI target, LIO also supports other SCSI transport layers (‘fabrics’), such as Fibre Channel, with the qla2xxx fabric.
The most crucial bit is to verify that the qla2xxx driver has initiator mode disabled — it should be operating in target mode only. You can check this with:
cat /sys/module/qla2xxx/parameters/qlini_mode
It should say ‘disabled’. If it doesn’t, create a file called /usr/lib/modprobe.d/qla2xxx.conf and put:
options qla2xxx qlini_mode=disabled
in it. Then, run ‘dracut -f’ to rebuild your initrd, and reboot.
Some of you may be wondering: why /usr/lib/modprobe.d instead of /etc/modprobe.d? This is because qla2xxx is likely loaded from the kernel’s initial ramdisk (initrd), and dracut, the initrd building tool, omits “host-specific” settings in /etc/modprobe.d. While you’re mucking around, make sure the firmware package for your qla device, such as ql2200-firmware or similar, is also installed.
targetcli won’t let you create a qla2xxx fabric if qlini_mode is wrong. Once it lets you create the qla fabric, you can add LUNs to it and grant access permissions to ACLs in exactly the same manner as with the other LIO fabrics.
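The qlini_mode check above can also be scripted. Here is a minimal Python sketch that assumes only the sysfs path quoted in this post; the function name `initiator_mode_disabled` is made up for illustration, and treating a missing file as "module not loaded" is an assumption.

```python
from pathlib import Path

# Path quoted in the post; it only exists while the qla2xxx module is loaded.
QLINI_MODE = Path("/sys/module/qla2xxx/parameters/qlini_mode")


def initiator_mode_disabled(param_file=QLINI_MODE):
    """Return True if the parameter reads 'disabled', False if it reads
    anything else, and None if the file is missing (assumed to mean the
    module is not loaded)."""
    try:
        value = Path(param_file).read_text().strip()
    except FileNotFoundError:
        return None
    return value == "disabled"


if __name__ == "__main__":
    state = initiator_mode_disabled()
    if state is None:
        print("qla2xxx does not appear to be loaded")
    elif state:
        print("initiator mode is disabled; target mode only")
    else:
        print("initiator mode is still enabled; see the modprobe.d fix above")
```

Passing the path as a parameter keeps the function easy to try against a scratch file before pointing it at a real target box.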
May 24, 2013 04:29 PM
|                          | 17  | 18  | 19  | rawhide | (total) |
| Open:                    | 276 | 380 | 104 | 66      | (826)   |
| Opened since 2013-05-17  | 4   | 32  | 6   | 3       | (45)    |
| Closed since 2013-05-17  | 3   | 11  | 4   | 3       | (21)    |
| Changed since 2013-05-17 | 16  | 48  | 14  | 5       | (83)    |
Weekly Fedora kernel bug statistics – May 24th 2013 is a post from: codemonkey.org.uk
May 24, 2013 04:10 PM
- Continued chasing the xfs bug. Confirmed that with the pending fix in the XFS tree, I can no longer reproduce the bug I saw. Then 3-4 hours later, it popped up again. Annoying. Finally got some debug info to Dave Chinner at the end of the day. Timezone differences making it even more annoying to debug.
- Fixed up a compile problem on older distros in trinity since the recent perf changes.
- Vince Weaver’s perf specific trinity fork is finding more bugs in the perf syscall already [1], [2].
- Merged trinity patch from Daniel Borkmann to add randomized seccomp filters generated by markov chain. Interesting stuff.
- Planning for some interesting testing work next week.
- Started some prep work for trinity 1.2
My 3.10-rc2 outstanding issues:
- RCU/NOHZ_FULL bugs [1], [2]
- RCU bootmem allocator trace.
- tickbroadcast bootmem allocator trace.
- XFS Slab corruption (pending fix)
- XFS xfs_setattr_size assertion
- T430s Lid events no longer put machine to sleep (actually an old bug since 3.9-rc1, but laptop is out of action until next week)
Puzzling website of the day: pain registers.
daily log May 23rd 2013. is a post from: codemonkey.org.uk
May 24, 2013 04:25 AM
May 23, 2013
Going to try and continue yesterday’s daily log format for a while.
- Grumbled at openvpn changing pathname for ‘plugin’ to ‘plugins’ breaking my vpn script.
- Bugzilla seemed unhappy. Gave up trying to look at it after it kept timing out.
- Continued poking at the XFS assertion from yesterday. Downgraded the compiler from F19’s 4.8.0 to 4.7.3. No luck. Couldn’t reproduce on 3.9, so started bisecting. Seemed to be caused by a patch I added recently to work around another XFS bug (slab corruption). I can’t win. Dave Chinner confused by my diagnosis. Bisect take 2 on that tomorrow.
- Vince Weaver posted a perf_event fuzzer based on trinity. Spent a while reading it over. Neat. Glad to see people taking an idea and running with it in new directions. The more test programs the better.
- Diagnosed yesterday’s “microcode loader got slow” bug. Turned out that I had somehow inadvertently set CONFIG_FW_LOADER_USER_HELPER, which incurs a 60 second timeout.
- While waiting for bisections, looked over some bugs in coverity’s database. Around 1500 untriaged. Would like to find time to work on that at some point.
Spent so much of the day bisecting/building/rebooting that I didn’t write much new code today. Ho-hum.
daily log May 22nd 2013. is a post from: codemonkey.org.uk
May 23, 2013 03:51 AM
May 22, 2013
As detailed in this blog post, I've expanded the set of man pages rendered in HTML at http://man7.org/linux/man-pages/ to include pages in addition to those provided by the man-pages project. This change has several purposes. One main purpose is to provide up-to-date and regularly updated HTML renderings of these man pages. (Most online man page renderings are out-of-date to some extent--in some cases, extremely out of date.) The other main purpose is to provide information on where to report bugs in each man page. To this end, each HTML rendering includes a COLOPHON that describes the origin of the page, notes the date when it was extracted, and provides information on where to report bugs in the page. (The man-pages project has already done this since December 2007, with the result that many more man page bugs are nowadays reported.)
Currently, man pages from nearly 40 projects are rendered, raising the number of pages rendered at man7.org from around 950 to around 1750. The projects that I have so far included have a bias that matches my interests: man-pages, projects related to low-level C and system programming (e.g., the ACL and extended attribute libraries), toolchain projects (e.g., gcc, gdb, Git, coreutils, binutils, util-linux), and other relevant tools (kmod, strace, ltrace, procps, expect) and tools relevant to manual pages (e.g., groff, man-db). The full list of projects and the corresponding man pages that are rendered can be found in the man pages by project index. I'm open to adding further projects to the rendered set, if they seem relevant. If you think there is a project that should be added, take a look at this blog post.
May 22, 2013 01:18 PM
Got back from vacation today (since last Thursday). Here’s how I spent the day.
- Caught up (skimmed) the 1500 postings to Linux-kernel and related mailing lists that had accumulated.
- Reviewed, applied and cleaned up my patch backlog for trinity.
- Caught up with direct mail that needed a response.
- Brought my test machines up to 3.10rc2, and restarted tests.
- Caught another pair of RCU/nohz bugs pretty quickly. [1][2].
- Checked on the RMA for my failed SSD. Still awaiting shipment of replacement.
- Received my ultrabay adaptor for my thinkpad. Surprised to find out that a full height SSD would fit into it.
- Pushed out a 3.9.3 update for F18
- Looked at bugzilla backlog. Swore a lot. 3.9.x rebase bugs started to trickle in.
- Rewrote a bunch of code surrounding trinity’s rand() usage.
- Finally got F19 installed via NFS on new test machine.
- Hit an XFS assertion.
- Then hit an i915 pineview kms console blanking bug.
- Noticed that x86 microcode loading had gotten really slow. It seems to be waiting a whole 60 seconds for each core.
a day in the life.. is a post from: codemonkey.org.uk
May 22, 2013 03:36 AM
May 17, 2013
A few years ago, I gave a history of the 2.6.32 stable kernel, and
mentioned the previous stable kernels as well. I'd like to apologize for not
acknowledging the work of Adrian Bunk in maintaining the 2.6.16 stable kernel
for 2 years after I gave up on it, allowing it to be used by many people for a
very long time.
I've updated the previous post with this information in it at the bottom, for
the archives. Again, many apologies, I never meant to ignore the work of this
developer.
May 17, 2013 04:34 PM
May 16, 2013
At Havana summit they were giving away a paper version of Joe Arnold's "Software Defined Storage with OpenStack Swift". Very useful book for anyone dealing with Swift, I would be glad to pay the cover price of $25. But even more interestingly than tips on care and feeding of Swift, Joe opens the whole book thus:
[...] a de-coupled management system so customers could achieve (1) amazing flexibility in terms of how (and where) they deployed their storage, (2) control of their data without being locked-in to a vendor and (3) private storage at public cloud prices.
These features are the essence of Software Defined Storage (SDS), a new term the meaning of which is being defined. [...] Key aspects of SDS are scalability, adaptability, and the ability to use most any hardware. Through this de-coupling, operators can now make choices on how their storage is scaled and managed and how users can store and access data — all driven programmatically for the entire storage tier, regardless of where the storage resources are deployed.
Parts of the above prompt questions. Firstly, what good is de-coupling in respect to lock-in? SwiftStack effectively locks in by owning the de-coupled management. Sure, you own your data and could, in theory, manage your Swift with another management plane... I do not expect anyone crazy enough to try switching by anything less than standing up a new cluster. In any case, that part is not important, IMHO. The important part is programmatic control.
The phrase "SDS" jumps off "Software-Defined Networking". When SDN came into OpenStack, I was quite skeptical about it. It seemed too much like vendor-driven marketing bullshit. However, as users deployed the Project Formerly Known as OpenStack Quantum, it became clear that SDN answers their needs. The chief need was the ability to shape networks programmatically, overlaid on top of the physical networking plant, in service of the VMs.
Before SDN, when all this cloud thing came about, practitioners also struggled with the definition of it, and in particular the difference from the plain old datacenter virtualization. The difference is the programmatic control throughout. RHEV (now oVirt) eventually grew an API, which blurred the lines. But in OpenStack it was the main feature from the start. So you can manage everything and anything programmatically, including, for example, running on bare hardware. One can say that cloud is "Software-Defined Computing".
So, how does this programmatic thing apply to Swift? Joe had interesting insights cunningly hidden in the book, like these:
In an SDS system, reliability is the responsibility of the software, not the hardware. Replication and data integrity tactics are used to ensure that data does not become corrupt and that lost data is recovered.
[...]
A crucial function of an SDS system is to orchestrate capacity — storage, networking, routing & services — for entire cluster.
Swift covers the first part well already. The second is missing, or "de-coupled".
For galactic fairness, he also wrote things that seem wrong-headed to me:
There is no application sharding or managing volumes which can drive operational knowledge and complexity into applications because the SDS system is one cohesive system. Users do not need to ask for or know 'which storage pool' should be used because there is only one namespace.
The problem with hiding the pools outside of namespace is that they become invisible to the programmatic control as well, and such control is essential to the very definition of SDS. Someone at Amazon made a brilliant decision to make buckets a unit of replication in S3, so they can be linked to a region. In effect this hides the complexity but exposes knowledge that an application needs. Thus, any S3 client can do what Joe considers SDS, but without any de-coupling, through the namespace and inside the API (or it can choose not to do it and just use a default region, for simplicity).
Joe's employees are hard at work implementing the vision as he outlined it, using the concept of regions that are internal to Swift cluster. The problem for everyone else, however, is how the programmatic control of that stack is exclusive to SwiftStack (with some useful things leaking into Swift, such as changeable replica count).
So, in the end, today Swift offers a solid foundation and parts of an SDS system, but the orchestration is "de-coupled" away elsewhere. Seems like a clear challenge to OpenStack to (re-)create the missing pieces.
P.S. I'd love to see the missing parts inside the Swift API and even namespace, although we have a problem here. Our Accounts and Containers are not guaranteed to live anywhere specifically or even on the same nodes. Changing that would be a step that I prefer. But Joe prefers to give up on plugging programmatic orchestration into the Swift API and just "de-couple" the heck out of it. John, our benevolent PTL, seems to toe that line. Maybe they are right.
P.P.S. The deal with the programmatic orchestration is something that "unified" storage projects have to address too. E.g. in GlusterFS a program can issue mkdir(2). Is this programmatic control? No, not enough. Okay, they have glusterfsd nowadays, I can create volfiles in there, is that SDS? That is getting closer!
May 16, 2013 04:19 PM
My bad luck with hardware continues.
At the beginning of this year, I bought an SSD for my laptop.
I previously wrote about the need to update smartmontools, which should now be updated everywhere. One thing I was not aware of at the time however, was that there’s a firmware update available. Had I known this, I would have applied it, because as soon as I hit the “400GB of lifetime writes” counter (coincidence?), it lost the ability to write to any block. It won’t even respond to secure erase commands.
The failure is exacerbated by the fact that the disk contains journalling filesystems in need of recovery. So if anything tries to mount them, it tries to write to the disk, and then the disk falls off the bus, requiring a power cycle to even see it again. The recovery tools provided by OCZ apparently try to mount every partition they find during boot up (derp).
So now it’s on its way back to OCZ for reflashing/replacement. Lesson learned.
If you have one of these, and hdparm -I shows you have firmware 1.03, you might want to update it to 2.0. There are flashing tools on OCZ’s site (in the form of bootable Linux images, using an insane desktop that looks like what hacker movies in the 1990s looked like). There’s no guarantee that the new firmware actually fixes whatever problem I’ve hit, due to the lack of changelogs, but given it was the first thing they asked me to try, I’m going to say there’s a strong possibility it’s a known bug.
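For anyone checking several machines, the firmware check can be scripted. The following Python sketch is only an illustration: it assumes the `hdparm -I` output contains a line labelled "Firmware Revision:", and the function names and the /dev/sda path in the usage note are hypothetical.

```python
import re
import subprocess


def firmware_revision(identify_text):
    """Return the value on the 'Firmware Revision:' line of `hdparm -I`
    output, or None if no such line is present (label is an assumption)."""
    match = re.search(r"^\s*Firmware Revision:\s*(\S+)", identify_text,
                      re.MULTILINE)
    return match.group(1) if match else None


def read_identify(device):
    """Run `hdparm -I` against a block device; this usually needs root."""
    result = subprocess.run(["hdparm", "-I", device], check=True,
                            capture_output=True, text=True)
    return result.stdout
```

Something like `firmware_revision(read_identify("/dev/sda")) == "1.03"` would then flag a drive still on the old firmware. Keeping the parsing separate from the hdparm call means the parser can be tried on saved output without touching a disk.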
PSA: OCZ Vector SSD firmware. is a post from: codemonkey.org.uk
May 16, 2013 04:16 PM
Last month Tommi found a kernel bug in perf_swevent_init using trinity, and posted a fix upstream. This apparently turned out to be a local root. Someone released an exploit for it this week. (interesting dissection of the exploit by spender here).
The code to fuzz perf_event_open was added to Trinity in November 2011. Yet for some reason, we only started to hit this recently. The sanitise routine for this syscall is still pretty basic, even after I added a little more to it yesterday. There’s probably more fruit on that branch somewhere.
There’s a date in the exploit code that claims it was written shortly after the affected code was merged upstream in 2010. Assuming that’s true, it’s taken way too long to find this. Trinity should have found this a lot sooner.
CVE-2013-2094. Another day, another fuzzed bug. is a post from: codemonkey.org.uk
May 16, 2013 03:01 PM
May 15, 2013
3.10rc1 came out a few days ago. At 12,000 changesets, lwn calls it the busiest such ever. Statements like that usually make me nervous. But things are generally in pretty good shape. Much better than 3.9rc1 was.
- There has been nowhere near the same level of fallout from trinity this cycle. The only bug I’m reliably hitting has been around for a while (connect vs sendmsg udpv6 oops)
- I hit a few crash-in-early-boot bugs that were a pain to debug. (fixes still pending merge)
- Some slab corruption found in XFS. (again, fixes pending merge). There’s some talk on lkml about an ext3 issue with the same symptoms, but I’ve not managed to reproduce this (yet?).
and that’s been about it.
Generally feeling pretty solid. Fedora 19 is still going to ship with 3.9, but we’ll likely have a 3.10.x update on day of release.
3.10rc1 testing status is a post from: codemonkey.org.uk
May 15, 2013 02:40 PM
One of the common arguments against solar as an energy source is that it’s just too variable. You can never count on it when you need it. What if clouds roll in and out? [1]
One counter-argument might be – well, you never know when anyone will turn on their AC, either, at least not minute-by-minute. The grid is a balancing act; unpredictable, random loads have the same effect as unpredictable, random generators.
To which one might then counter yes, but there are so many AC units out there, they average out, more or less, turning on and off at random times and smoothing things out in aggregate.
To which the solar advocate might reply OK, then with enough solar the peaks and valleys of generation should cancel out too, as clouds move out of one area into another. Does this seem likely to work out in practice?
To find out, I grabbed 5 minute data from about 40 Enphase systems in the twin cities on a highly variable, sporadically cloudy day. Because we don’t yet have a whole lot of solar here, and I didn’t want the one or two large commercial systems in the group to swamp the smaller residential systems, first I normalized them all to a % of their max output. (This might be cheating a little, but with a lot more systems randomly distributed in size and geography, the swamping-out effect should be minimized.) Here’s what just 4 of those systems look like; each is indeed pretty messy and unpredictable at the 5-minute range:
Then I averaged all of the systems. Here’s what the average looks like, compared to one of the individual systems:
It appears that things certainly do smooth out when we look at geographically distributed systems. If I were a grid operator, I might feel a lot better about that.
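The normalize-then-average step is simple enough to sketch in code. This is a minimal Python illustration of the approach described above, not the script actually used; the function names are invented, the "mean step" variability score is my own crude assumption, and the readings are made up rather than taken from the Enphase data.

```python
from statistics import mean


def normalize(series):
    """Scale one system's readings to a fraction of its own peak output."""
    peak = max(series)
    if peak <= 0:
        raise ValueError("system never produced any power")
    return [value / peak for value in series]


def fleet_average(systems):
    """Average the normalized readings of all systems, interval by interval."""
    normalized = [normalize(series) for series in systems]
    return [mean(samples) for samples in zip(*normalized)]


def mean_step(series):
    """Mean absolute change between consecutive readings (needs at least
    two readings); used here as a rough variability score."""
    return mean([abs(b - a) for a, b in zip(series, series[1:])])


# Made-up readings in watts, purely illustrative.
systems = [
    [200, 100, 200, 100],      # small system, clouded on odd intervals
    [2500, 5000, 2500, 5000],  # large system, clouded on even intervals
]
average = fleet_average(systems)
print("single system:", mean_step(normalize(systems[0])))
print("fleet average:", mean_step(average))
```

The two made-up systems are chosen to cancel perfectly, which real data will not do; the point is only that normalizing first stops the large system from dominating the average.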
The caveats might be that this is a very wide geographic range – I grabbed systems from all of the twin cities and suburbs. And that’s probably larger than the various sub-grids within the cities; what the variability is within those subgrids, or how this solar variability affects them, I’m not sure. And of course my initial normalization of all systems to the same size could be argued with.
There have been much more rigorous papers and presentations written on this as well, see for example “Quantifying PV Power Output Variability” by Thomas E. Hoff and Richard Perez in 1999, and “Implications of Wide-Area Geographic Diversity for Short-Term Variability of Solar Power” by Andrew Mills and Ryan Wiser at LBNL in 2010. But with the advent of 5-minute monitoring from systems like Enphase, I wonder if even better results could be found from this wealth of data.
[1] I’ll submit that a sporadically cloudy day is more trouble to a grid operator than a generally cloudy day. We often know if a day will be cloudy well ahead of time, and that doesn’t yield the minute-to-minute variations of a sporadically cloudy day. The grid is better, I think, at responding to these longer-term variations.
May 15, 2013 02:39 AM
May 14, 2013
...are here. Recording was going on, but I'm not sure if it is online somewhere...
May 14, 2013 12:54 PM
May 13, 2013
Whoops. Looks like I forgot to post my slides from last year’s LinuxCon Japan talk on the Linux kernel security subsystem.
Here they are:
http://namei.org/presentations/kernel-security-state-linuxconjp-2012b.pdf
I’ll be giving an update at the upcoming LinuxCon Japan in Tokyo in a couple of weeks.
May 13, 2013 11:14 AM
May 10, 2013
|                          | 17  | 18  | 19  | rawhide | (total) |
| Open:                    | 273 | 348 | 117 | 73      | (811)   |
| Opened since 2013-05-03  | 7   | 17  | 8   | 6       | (38)    |
| Closed since 2013-05-03  | 7   | 20  | 14  | 4       | (45)    |
| Changed since 2013-05-03 | 20  | 41  | 20  | 7       | (88)    |
Nothing terribly exciting in this week’s new bugs. Backlog continues to slowly get beaten down. Next week should see a rebase to 3.9 for F18.
Weekly Fedora kernel bug statistics – May 10th 2013 is a post from: codemonkey.org.uk
May 10, 2013 04:18 PM
May 09, 2013
This is not something to brag about, but apparently I managed to program computers for about 30 years without writing unit tests. Today it's rectified by adding a test to one of my projects voluntarily. I encountered the goodness of build-time testing when working on Jeff Garzik's Project Hail. And of course, OpenStack, including Swift, had them since forever. Those weren't my projects, however.
May 09, 2013 08:12 PM
I'm still running squeeze on my X60... and I decided that with wheezy becoming "stable", it was a good idea to upgrade. Before I started, I did back up my root filesystem (fortunately), with
cp -a --one-file-system / somewhere
Upgrade was a bit of fight (like aptitude trying to take hours of cpu time), but eventually I succeeded... Only to realize that system no longer boots into GUI and (worse) that gnome2 is gone. I'm not great fan of gnome3; definitely on X60, anyway. Its animations feel excessive even when system is unloaded, if there's some background load it quickly becomes unusable. I googled a bit, and it did not look like going back to gnome2 would be exactly easy.
So I went back from the backup. First, chromium refused to run because new version broke the config files. I restored those from backup. But next... strangely my self-compiled 3.9 kernel stopped working. Stock debian kernel kept running, but own kernel ran init then rsyslogd broke the boot.
Can you guess what went wrong?
pc jvgu bar svyr flfgrz bcgvba vf ernyyl onq vqrn; vg jvyy abg pbcl rirelguvat sebz lbhe / svyrflfgrz, va cnegvphyne vg jvyy abg pbcl /qri, orpnhfr gurer'f gz csf zbhagrq bire vg. Bhpu.
May 09, 2013 12:20 PM
May 07, 2013
I've been working on TPMs lately. It turns out that they're moderately awful, but what's significantly more awful is basically all the existing documentation. So here's some of what I've learned, presented in the hope that it saves someone else some amount of misery.
What is a TPM?
TPMs are devices that adhere to the Trusted Computing Group's Trusted Platform Module specification. They're typically microcontrollers[1] with a small amount of flash, and attached via either i2c (on embedded devices) or LPC[2] (on PCs). While designed for performing cryptographic tasks, TPMs are not cryptographic accelerators - in almost all situations, carrying out any TPM operations on the CPU instead would be massively faster[3]. So why use a TPM at all?
Keeping secrets with a TPM
TPMs can encrypt and decrypt things. They're not terribly fast at doing so, but they have one significant benefit over doing it on the CPU - they can do it with keys that are tied to the TPM. All TPMs have something called a Storage Root Key (or SRK) that's generated when the TPM is initially configured. You can ask the TPM to generate a new keypair, and it'll do so, encrypt them with the SRK (or another key descended from the SRK) and hand it back to you. Other than the SRK (and another key called the Endorsement Key, which we'll get back to later), these keys aren't actually kept on the TPM - the running OS stores them on disk. If the OS wants to encrypt or decrypt something, it loads the key into the TPM and asks it to perform the desired operation. The TPM decrypts the key and then goes to work on the data. For small quantities of data, the secret can even be stored in the TPM's nvram rather than on disk.
All of this means that the keys are tied to a system, which is great for security. An attacker can't obtain the decrypted keys, even if they have a keylogger and full access to your filesystem. If I encrypt my laptop's drive and then encrypt the decryption key with the TPM, stealing my drive won't help even if you have my passphrase - any other TPM simply doesn't have the keys necessary to give you access.
That's fine for keys which are system specific, but what about keys that I might want to use on multiple systems, or keys that I want to carry on using when I need to replace my hardware? Keys can optionally be flagged as migratable, which makes it possible to export them from the TPM and import them to another TPM. This seems like it defeats most of the benefits, but there's a couple of features that improve security here. The first is that you need the TPM ownership password, which is something that's set during initial TPM setup and then not usually used afterwards. An attacker would need to obtain this somehow. The other is that you can set limits on migration when you initially import the key. In this scenario the TPM will only be willing to export the key by encrypting it with a pre-configured public key. If the private half is kept offline, an attacker is still unable to obtain a decrypted copy of the key.
So I just replace the OS with one that steals the secret, right?
Say my root filesystem is encrypted with a secret that's stored on the TPM. An attacker can replace my kernel with one that grabs that secret once the TPM's released it. How can I avoid that?
TPMs have a series of Platform Configuration Registers (PCRs) that are used to record system state. These all start off programmed to zero, but applications can extend them at runtime by writing a sha1 hash into them. The new hash is concatenated to the existing PCR value and another sha1 calculated, and then this value is stored in the PCR. The firmware hashes itself and various option ROMs and adds those values to some PCRs, and then grabs the bootloader and hashes that. The bootloader then hashes its configuration and the files it reads before executing them.
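The extend operation is easy to model in software. This Python sketch shows only the hashing arithmetic described above; the function names are invented, the component strings are placeholders, and a real TPM does this internally rather than in the OS.

```python
import hashlib

PCR_SIZE = 20  # length of a sha1 digest in bytes


def extend(pcr, measurement):
    """Model of a PCR extend: new value = sha1(old value || sha1(measurement))."""
    digest = hashlib.sha1(measurement).digest()
    return hashlib.sha1(pcr + digest).digest()


def measure_chain(components):
    """Start from an all-zero PCR and extend it with each component in order."""
    pcr = bytes(PCR_SIZE)
    for blob in components:
        pcr = extend(pcr, blob)
    return pcr


# Placeholder stand-ins for the real boot components.
good = measure_chain([b"firmware", b"bootloader", b"kernel"])
evil = measure_chain([b"firmware", b"evil bootloader", b"kernel"])
print(good.hex())
print(evil.hex())
```

Because each new value depends on the previous one, a changed component alters every later value, and nothing measured afterwards can be chosen to steer the PCR back short of breaking sha1.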
This chain of trust means that you can verify that no prior system component has been modified. If an attacker modifies the bootloader then the firmware will calculate a different hash value, and there's no way for the attacker to force that back to the original value. Changing the kernel or the initrd will result in the same problem. Other than replacing the very low level firmware code that controls the root of trust, there's no way an attacker can replace any fundamental system components without changing the hash values.
TPMs support using these hash values to decide whether or not to perform a decryption operation. If an attacker replaces the initrd, the PCRs won't match and the TPM will simply refuse to hand over the secret. You can actually see this in use on Windows devices using Bitlocker - if you do anything that would change the PCR state (like booting into recovery mode), the TPM won't hand over the key and Bitlocker has to prompt for a recovery key. Choosing which PCRs to care about is something of a balancing act. Firmware configuration is typically hashed into PCR 1, so changing any firmware configuration options will change it. If PCR 1 is listed as one of the values that must match in order to release the secret, changing any firmware options will prevent the secret from being released. That's probably overkill. On the other hand, PCR 0 will normally contain the firmware hash itself. Including this means that the user will need to recover after updating their firmware, but failing to include it means that an attacker can subvert the system by replacing the firmware.
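The trade-off in choosing PCRs can be shown with a toy model. This is a Python sketch under loud assumptions: `ToySealedSecret` is a made-up class that offers no real protection (a real TPM keeps the secret encrypted and enforces the check in hardware), and the measurement strings are placeholders.

```python
import hashlib

ZERO = bytes(20)  # PCRs start out as all zeroes


def extend(pcr, measurement):
    """Model of a PCR extend: sha1(old value || sha1(measurement))."""
    return hashlib.sha1(pcr + hashlib.sha1(measurement).digest()).digest()


class ToySealedSecret:
    """Software stand-in for a secret sealed to a set of PCR values."""

    def __init__(self, secret, pcrs, selection):
        self._secret = secret
        # Remember the values of the chosen PCRs at seal time.
        self._policy = {index: pcrs[index] for index in selection}

    def unseal(self, pcrs):
        """Return the secret only if every selected PCR still matches."""
        if all(pcrs[i] == value for i, value in self._policy.items()):
            return self._secret
        raise PermissionError("PCR state differs from seal-time state")


# PCR 0: firmware hash, PCR 1: firmware configuration (as in the text).
boot = {0: extend(ZERO, b"firmware v1"), 1: extend(ZERO, b"settings A")}
sealed = ToySealedSecret(b"disk key", boot, selection=[0])

# Changing a firmware option only moves PCR 1, which is not in the policy.
new_settings = {0: boot[0], 1: extend(ZERO, b"settings B")}
print(sealed.unseal(new_settings))

# Replacing the firmware moves PCR 0, so the secret is withheld.
new_firmware = {0: extend(ZERO, b"firmware v2"), 1: boot[1]}
try:
    sealed.unseal(new_firmware)
except PermissionError as error:
    print("refused:", error)
```

Adding PCR 1 to `selection` would make the settings change fail too, which is the overkill case described above.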
What about using TPMs for DRM?
In theory you could populate TPMs with DRM keys for media playback, and seal them such that the hardware wouldn't hand them over. In practice this is probably too easily subverted or too user-hostile - changing default boot order in your firmware would result in validation failing, and permitting that would allow fairly straightforward subverted boot processes. You really need a finer grained policy management approach, and that's something that the TPM itself can't support.
This is where Remote Attestation comes in. Rather than keep any secrets on the local TPM, the TPM can assert to a remote site that the system is in a specific state. The remote site can then make a policy determination based on multiple factors and decide whether or not to hand over session decryption keys. The idea here is fairly straightforward. The remote site sends a nonce and a list of PCRs. The TPM generates a blob with the requested PCR values, sticks the nonce on, encrypts it and sends it back to the remote site. The remote site verifies that the reply was encrypted with an actual TPM key, makes sure that the nonce matches and then makes a policy determination based on the PCR state.
But hold on. How does the remote site know that the reply was encrypted with an actual TPM? When TPMs are built, they have something called an Endorsement Key (EK) flashed into them. The idea is that the only way to have a valid EK is to have a TPM, and that the TPM will never release this key to anything else. There's a couple of problems here. The first is that proving you have a valid EK to a remote site involves having a chain of trust between the EK and some globally trusted third party. Most TPMs don't have this - the only ones I know of that do are recent Infineon and STMicro parts. The second is that TPMs only have a single EK, and so any site performing remote attestation can cross-correlate you with any other site. That's a pretty significant privacy concern.
There's a theoretical solution to the privacy issue. TPMs never actually sign PCR quotes with the EK. Instead, TPMs can generate something called an Attestation Identity Key (AIK) and sign it with the EK. The OS can then provide this to a site called a PrivacyCA, which verifies that the AIK is signed by a real EK (and hence a real TPM). When a third party site requests remote attestation, the TPM signs the PCRs with the AIK and the third party site asks the PrivacyCA whether the AIK is real. You can have as many AIKs as you want, so you can provide each service with a different AIK.
As long as the PrivacyCA only keeps track of whether an AIK is valid and not which EK it was signed with, this avoids the privacy concerns - nobody would be able to tell that multiple AIKs came from the same TPM. On the other hand, it makes any PrivacyCA a pretty attractive target. Compromising one would not only allow you to fake up any remote attestation requests, it would let you violate user privacy expectations by seeing that (say) the TPM being used to attest to HolyScriptureVideos.com was also being used to attest to DegradingPornographyInvolvingAnimals.com.
Perhaps unsurprisingly (given the associated liability concerns), there are no public and trusted PrivacyCAs yet, and even if there were, (a) many computers are still being sold without TPMs and (b) even those with TPMs often don't have the EK certificate that would be required to make remote attestation possible. So while remote attestation could theoretically be used to impose DRM in a way that would require you to be running a specific OS, practical concerns make it pretty difficult for anyone to deploy that at any point in the near future.
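The nonce-and-quote exchange itself is small. The Python sketch below models it with one big substitution: an HMAC over a shared key stands in for the AIK signature, purely so the example runs with the standard library. A real TPM signs with an asymmetric key, so the remote site would hold only the public half. All names here are invented for illustration.

```python
import hashlib
import hmac
import os


def quote(aik_key, pcrs, selection, nonce):
    """TPM side: bind the requested PCR values to the verifier's nonce."""
    blob = b"".join(pcrs[i] for i in selection) + nonce
    tag = hmac.new(aik_key, blob, hashlib.sha1).digest()
    return blob, tag


def verify(aik_key, expected_pcrs, selection, nonce, blob, tag):
    """Remote side: check the tag first, then the nonce and PCR state."""
    good_tag = hmac.new(aik_key, blob, hashlib.sha1).digest()
    if not hmac.compare_digest(tag, good_tag):
        return False
    expected = b"".join(expected_pcrs[i] for i in selection) + nonce
    return hmac.compare_digest(blob, expected)


key = os.urandom(20)                        # stand-in for the AIK
pcrs = {0: b"\x01" * 20, 1: b"\x02" * 20}   # placeholder PCR contents
nonce = os.urandom(20)                      # chosen fresh by the remote site

blob, tag = quote(key, pcrs, [0, 1], nonce)
print(verify(key, pcrs, [0, 1], nonce, blob, tag))
# A recorded reply fails against a later challenge with a new nonce.
print(verify(key, pcrs, [0, 1], os.urandom(20), blob, tag))
```

The fresh nonce is what stops an attacker from replaying a quote captured while the machine was still in a good state.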
Is this just limited to early OS components?
Nope. The Linux kernel has support for measuring each binary run or each module loaded and extending PCRs accordingly. This makes it possible to ensure that the running binaries haven't been modified on disk. There's not a lot of distribution infrastructure for setting this up, but in theory a distribution could deploy an entirely signed userspace and allow the user to opt into only executing correctly signed binaries. Things get more interesting when you add interpreted scripts to the mix, so there's still plenty of work to do there.
So what can I actually use a TPM for?
Drive encryption is probably the best example (Bitlocker does it on Windows, and there's a LUKS-based implementation for Linux
here) - while in theory you could do things like use your TPM as a factor in two-factor authentication or tie your GPG key to it, there's not a lot of existing infrastructure for handling all of that. For the majority of people, the most useful feature of the TPM is probably the random number generator. rngd has support for pulling numbers out of it and stashing them in /dev/random, and it's probably worth doing that unless you have an Ivy Bridge or other CPU with an RNG.
Things get more interesting in more niche cases. Corporations can bind VPN keys to corporate machines, making it possible to impose varying security policies. Intel use the TPM as part of their anti-theft technology on education-oriented devices like the Classmate. And in the cloud, projects like Trusted Computing Pools use remote attestation to verify that compute nodes are in a known good state before scheduling jobs on them.
Is there a threat to freedom?
At the moment, probably not. The lack of any workable general purpose remote attestation makes it difficult for anyone to impose TPM-based restrictions on users, and any local code is obviously under the user's control - got a program that wants to read the PCR state before letting you do something? LD_PRELOAD something that gives it the desired response, or hack it so it ignores failure. It's just far too easy to circumvent.
Summary?
TPMs are useful for some very domain-specific applications, drive encryption and random number generation. The current state of technology doesn't make them useful for practical limitations of end-user freedom.
[1] Ranging from 8-bit things that are better suited to driving washing machines, up to full ARM cores
[2] "Low Pin Count", basically ISA without the slots.
[3] Loading a key and decrypting a 5 byte payload takes 1.5 seconds on my laptop's TPM.

May 07, 2013 05:18 PM
May 06, 2013
The CFP for the 2013 Linux Security Summit has been announced.
The summit will be held across the 19th and 20th of September in New Orleans, co-located again with LinuxCon and Linux Plumbers. Note that presenters and attendees at LSS must be registered as LinuxCon attendees.
We’ll be following a similar format to last year, with a day of refereed presentations, followed by subsystem updates and break-out sessions on the second day. We’ll probably finish up around lunchtime on the Friday for people needing to head home that day, but check the final schedule for details once it’s published.
The CFP is open until 14th June, with speaker notifications to be posted by 21st June.
If you’ve been doing cool and interesting work in Linux security, be sure to submit a proposal!
May 06, 2013 09:59 AM
May 04, 2013
I published a quick overview on how to do TSX profiling with Linux perf: Intel TSX profiling with Linux perf
This is a technical overview that assumes some prior knowledge of profiling. I apologize for the cumbersome title.
May 04, 2013 02:53 AM
May 03, 2013
| | 17 | 18 | 19 | rawhide | (total) |
| Open: | 270 | 345 | 126 | 70 | (811) |
| Opened since 2013-04-26 | 4 | 24 | 9 | 6 | (43) |
| Closed since 2013-04-26 | 12 | 28 | 8 | 9 | (57) |
| Changed since 2013-04-26 | 15 | 52 | 18 | 7 | (92) |
Weekly Fedora kernel bug statistics – May 03 2013 is a post from: codemonkey.org.uk
May 03, 2013 04:10 PM
April 30, 2013
I've released man-pages-3.51. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.
This is a relatively small release that has various fixes across a number of pages. Among the more notable changes in man-pages-3.51 are the following:
- Documentation of various /proc interfaces was added in a number of pages.
- Various architecture-specific details were added to the syscall(2) and clone(2) pages.
April 30, 2013 07:31 AM
I wrote a blog post and contributed to a paper on modern locking on Intel Xeon systems. My recent talk on this has also been covered by LWN here (still behind the paywall for a few days).
Summary is more or less: batch your locks. Don’t make critical sections too small. Having the smallest locks is not cool anymore.
April 30, 2013 04:21 AM
April 29, 2013
Huge number of bug closures this month. Unfortunately several hundred of them are the automated ‘faf’ bugs that were pretty useless.
(1. lots of tainted/virtualbox reports. 2. old kernels. 3. no human attached to them if we need to ask questions, which most of the time we do).
Even discounting those bugs, it’s been quite a productive month, with the total open count around 170 bugs lower than it was a month ago.
Monthly Fedora kernel bug statistics – April 2013 is a post from: codemonkey.org.uk
April 29, 2013 05:41 PM
April 27, 2013
April 26, 2013
Ok, so I have a small herd of computers -- Thinkpad X60, AMD desktop, ARM "tablet", Nokia N900, Nokia 6151, TI Chronos smartwatch. All but the 6151 and the Chronos run Linux, all are programmable, and I have root access on all but the Nokia 6151.
Yes, I should probably get some light and cheap Android phone as a 6151 replacement. Light and cheap because it will go on horseback with me, and it should have a powerful LED, at least 60mW... because that's important in the woods. I'm thinking about Alcatel v860 (old Android) or Gigabyte Gsmart g1342 (unsure about rooting)... suggestions welcome.
I somehow assumed that having machines under control and running Linux would make them well-behaved, solve world hunger, and make them behave as they do in movies... and yes, it is nice to be able to ssh around and to run same app on most machines.
But happy new world did not happen.
Phones have a concept of "behaving themselves", like a quiet profile, and I did program automatic profile switching -- sl4a for Android is nice, and that's part of the reason I ported it to maemo. Unfortunately, desktops/notebooks do not even have that much... and tend to wake me, or worse (wake my girlfriend).
Another thing PCs should learn from phones is concept of loud speaker and headset. Using headset is pretty much requirement for VoIP telephony, but incoming rings should go to the loudspeakers... even if headset is plugged in. (Currently, I have USB speakers connected for loudspeakers. Yes, twinkle can select soundcards for ringtone and voice separately; but these should be system-level options, not app-level.)
What more would I like? It should be easy to do remote access between my machines; I should not have to search for cell phone to read incoming SMS, or do anything with the small keyboard when I'm near PC. Yes, ssh exists and I use it a lot, and vnc for n900 exists, too... but it has all to be configured separately. When I open web page on a PC and start reading, it should be easy to finish reading on other computer. I now use chromium, so maybe it is doable, but I'd really prefer to use my server, not google.
I'd like my machines to synchronize contact and calendar entries. At least this seems to be mostly working, thanks to owncloud. I'd like machines to automatically back each other up. I have some scripts for that, but they are a bit hacky and far away from automatic. And there's more that should be done...
April 26, 2013 01:01 PM
April 25, 2013
A few years ago, I was fortunate enough to get given a USB EHCI debug cable. With traditional serial ports being a thing of the past that I haven’t seen on a new machine in a long time, it’s been a lifesaver. The number of kernel crashes I’ve been able to capture through using that cable that would have otherwise been lost is some ridiculously high immeasurable number. I’m saying I like this thing, a lot.
So much so that I wanted to buy more of them, so I wouldn't have to keep replugging it between test machines.
With multiple test machines constantly running, moving a single cable around is not really a practical solution.
The first problem, they aren’t cheap. $95 each. For basically two USB->serial chips, and some circuitry to make them handshake.
The bigger problem is that only one place seems to sell them, and they’ve been “out of stock, and in redesign” for a long time now.
I tried emailing the manufacturer Ajaystech, who seem to completely ignore their sales@ email address.
Disappointing.
In the absence of a replacement, I’m going to have to hope that netconsole works well enough on older machines, and in the future, dumps to pstore.
USB debug cables. is a post from: codemonkey.org.uk
April 25, 2013 05:27 PM
April 23, 2013
Greg Holt posted a change to Swift's object server and replicator, which adds a counting semaphore to the server. As best I can tell, the intent is to provide a form of load control, so that no more than N replicators talk to a given server. So:
    global_conf['replication_semaphore'] = [posix_ipc.Semaphore(
        None, flags=posix_ipc.O_CREAT | posix_ipc.O_EXCL,
        initial_value=int(preloaded_app_conf['replication_concurrency']))]
    .......
    try:
        self.replication_semaphore.acquire(0)
    except posix_ipc.BusyError:
        return HTTPServiceUnavailable()
    try:
        ....... replication service goes here
    finally:
        self.replication_semaphore.release()
This appears to be doing the job, but it adds a dependency on posix_ipc. The question arises: are we missing a traditional way to do this?
Since the semaphores are not taken and checked often in this application, once per replication request, something like creating a directory in /var/run, populating it with one file per server instance, and counting them with listdir would be acceptable (even with obvious inexact count), but we need something that cleans up in case of crashes.
Another trick I know is to create a pipe and prime it with N 1-byte tokens, but then again, crash cleanup is not bulletproof.
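For the record, the pipe trick looks roughly like this. It is only a sketch using the standard os module; the class name is made up, and it assumes the pipe is created once in the parent before the workers are forked so that they all share it.

```python
import os


class PipeSemaphore:
    """Counting semaphore built on a pipe holding one byte per free slot."""

    def __init__(self, slots: int):
        self.rfd, self.wfd = os.pipe()
        os.set_blocking(self.rfd, False)  # so acquire() never waits
        os.write(self.wfd, b"x" * slots)  # prime the pipe with N tokens

    def acquire(self) -> bool:
        """Take one token; return False if none are left."""
        try:
            return os.read(self.rfd, 1) == b"x"
        except BlockingIOError:
            return False

    def release(self) -> None:
        """Put the token back."""
        os.write(self.wfd, b"x")
```

The weakness is visible in release(): a worker that dies while holding a token never writes it back, so the slot is lost until everything is restarted.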
The best would be to identify an entity in Linux that counts opens. Something in /proc, perhaps? Open it, read it to see the open count, if it is too high, abort. Close when done. Kernel closes it in case of a crash, perfect. Well, I guess I just described POSIX semaphore, again. If only it was a part of Python...
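One thing in the standard library that the kernel does clean up on a crash is an flock() lock, which goes away when the holding descriptor is closed, including at process death. This is a different technique from the semaphore in the patch, and I have not tried it in Swift; the sketch below just assumes a directory of N lock files is acceptable, and the directory layout and function names are hypothetical.

```python
import fcntl
import os


def acquire_slot(lock_dir: str, slots: int):
    """Try to lock one of `slots` files; return an open fd, or None if all busy."""
    os.makedirs(lock_dir, exist_ok=True)
    for i in range(slots):
        fd = os.open(os.path.join(lock_dir, "slot%d" % i),
                     os.O_RDWR | os.O_CREAT, 0o600)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return fd  # holding this fd open holds the slot
        except BlockingIOError:
            os.close(fd)  # slot taken by someone else, try the next
    return None


def release_slot(fd: int) -> None:
    """Closing the descriptor drops the lock; the kernel does the same on exit."""
    os.close(fd)
```

A request handler would call acquire_slot(), return 503 on None, and call release_slot() in a finally block. The cost is up to N open() calls per request, which seems tolerable at once per replication request.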
UPDATE: Someone was urging Greg to use multiprocessing.Semaphore. Unfortunately, the docs for that contain ominous warnings of poor crash resistance. Cannot say more without looking at the implementation.
April 23, 2013 06:28 PM
April 22, 2013
I was a little distracted by local events last Friday, so here’s a belated post of the weekly bug stats for the kernel.
| | 17 | 18 | 19 | rawhide | (total) |
| Open: | 281 | 348 | 166 | 61 | (856) |
| Opened since 2013-04-12 | 6 | 19 | 21 | 1 | (47) |
| Closed since 2013-04-12 | 6 | 39 | 10 | 1 | (56) |
| Changed since 2013-04-12 | 20 | 58 | 28 | 4 | (110) |
Weekly Fedora kernel bug statistics – April 19th 2013 is a post from: codemonkey.org.uk
April 22, 2013 05:31 PM
April 21, 2013
I have been using xfig for a very long time, almost as long as I have been using gnuplot. But xfig has been getting a bit cranky lately, mostly in terms of font handling. I suspect that it is possible to make it handle fonts like it used to, but I decided to take this as a hint to try something that might actually be younger than the typical Linux kernel hacker. (Yes, I am getting a bit old to engage in ageism, but there you have it!)
I had tried inkscape some years back, but at the time it was not ready for prime time, at least not from the perspective of a long-time xfig user. But I had recently received a .svg, and had installed inkscape in order to be able to work with it. Besides, some of the more recent browsers can render directly from .svg, which may in the fullness of time remove the need to generate bitmap files for HTML documents.
So I gave inkscape a try.
The first pleasant surprise is that inkscape is able to import xfig's .fig file format. This import process is not perfect, for example, the fonts do not match exactly and arrowheads are sometimes imported as objects separate from the line that they are supposed to be attached to, but it is much nicer than recreating the diagram from scratch. In addition, in many cases, the import imperfections are not a problem, such as when the goal is simply to add something to the figure.
Of course, the menu layout is completely different than that of xfig, but this is not always a bad thing. For example, even given long familiarity with xfig, I found inkscape's object rotation to be much more powerful and easier to use than that of xfig. Object alignment and distribution is also much nicer in inkscape. The manual canvas configuration in inkscape is a step back from xfig's automation, but it might well be that I just haven't yet found the corresponding inkscape setting. Finally, the ability to directly generate .pdf files works more smoothly with pdflatex, which I use heavily. The fact that they get rotated 90 degrees was a bit surprising at first, but the \rotatebox{270} directive in LaTeX takes care of that.
So who knows? After more years than I care to recall, it might finally be time to bid xfig a fond farewell.
April 21, 2013 10:45 PM
April 20, 2013
You know what is the most annoying misfeature of Sharp Zaurus sl-5500? They spared few cents, and did not provide real time clock.
Do you know what is the most annoying misfeature of Nokia N900? Yes. They failed to provide real time clock. Actually... it seems to be a misfeature of the whole Nokia line, including the old Nokia 6151.
Now... if you try to swap simcards between two Nokia phones, that's when the fun starts... because you are likely to lose all the time sources.
Bad Nokia.
April 20, 2013 11:39 AM
April 18, 2013
So kernel 3.9 should be releasing really soon and so it's time for our regular look at what 3.10 brings to drm/i915:
For enthusiasts (i.e. people who like to see their hw burn down in flames ...) the improved overclocking support is certainly the interesting bit. Thanks to Ben Widawsky's patches we now correctly detect the gpu turbo limit and set the non-turbo frequency as the default limit to avoid hanging systems right on boot. So GPU overclocking should now work on Sandybridge and Ivybridge - apparently something changed on Haswell. Related is Chris Wilson's patch to tune the Haswell turbo support properly - Haswell has new frequency domains and so needs a different control table to ramp up the ring frequency when the GPU is busy.
A big pain relief, at least on affected systems, is Egbert Eich's hotplug irq storm mitigation work. With the modeset locking rework from 3.9 this should get rid of the last reason to boot with the drm_kms_helper.poll=0 option. Finally no longer a sluggish cursor and delayed screen updates!
Another neat improvement is the vt-switchless suspend/resume support from Jesse Barnes. This goes back to the user modesetting days, where the kernel forced a vt-switch to the vga linux console to make sure the X server properly saved and afterwards again restored the display state. With kernel modesetting that's pretty pointless and results in an ugly console cursor appearing quickly on resume (usually you can't spot it on suspend ...). But those days are past now.
Display-less gpu support is probably not something many people will care about. But Ben Widawsky's patches allow us to run on special Ivybridge server configurations, where all the display parts of the gpu are fused off and only the GT is used for e.g. video transcoding. Which is pretty cool since for a long time Intel graphics was only about display things, with a comparatively puny gpu attached to it. Times are changing ...
For driver internals the big change is the introduction of the pipe configuration tracking. This is pretty much just a continuation of the modeset rework started in 3.7 and serves two main purposes: First this will allow us to precompute the desired display pipe configuration before we start to touch the hardware. This is required to make atomic modeset operations actually useful: Userspace can then just ask the kernel which configurations work very quickly and more important without causing any flickering, before deciding on a given setup. The second reason is that for fastboot we need to track the display state left behind by the BIOS precisely - integrating that tracking into the established hw state readout and cross-check support will make sure that this tracking is actually reliable.
3.10 only contains the basic infrastructure though and moves only a few basic attributes over to it (like the pipe bpp values, dither settings, color space conversion and a few internal states used by different platforms to keep track of enabled chip functions). Our aim for 3.11 is to tackle the really big things like clock sharing, output mode and especially dotclock reconstruction and so lay the groundwork for solid fastboot support.
Again purely driver-internal was the massive low-level GTT interface rework from Ben Widawsky. This will help a lot in finally implementing real per-process gpu address spaces on Intel hardware, and it should also simplify enabling of future hardware platforms since that has now a clearly-defined and well-separated interface. Related is Imre Deak's little cleanup to abstract all our scatter-gather list walking with a for_each_sg_page iterator.
Finally I want to point out the pageflip improvements from Ville Syrjälä. Compositors should now no longer get stuck after a gpu hang, gen2-4 received vblank interrupt and pageflip completion interrupt handling fixes and modesetting or panning operations should now also be able to survive gpu hangs without resulting in deadlocks. And the best part is that we have full coverage for all these corner cases in our kernel test suite, so these bugs should be gone for good.
Last but not least there's been the usual big pile of small&large improvements all over: More vlv patches, backlight improvements and tons of bugfixes all over. For amusement maybe take a look at Chris Wilson's bring a bigger gun coherence fix.
April 18, 2013 09:29 PM
Techcrunch said it all:
At its AWS Summit in New York today, Amazon announced that its S3 storage service now holds more than 2 trillion objects. That’s up from 1 trillion last June and 1.3 trillion in November [...]. As Amazon’s Chief Evangelist for AWS Jeff Barr notes in a blog post today, it took Amazon six years to grow to get to 1 trillion stored objects, “and less than a year to double that number.” S3, Barr also writes, now regularly sees peaks of more than 1.1 million requests per second.
I don't think all Swift installations combined come anywhere close, although potentially they should be able to. Pretty easy, actually. Meanwhile, we hit local problems. At the Summit two days ago, a presenter from Korea Telecom said that some of their customers put 50 million (5e7) objects into a Swift container. Naturally, container listings and even lookups become challenging.
April 18, 2013 05:39 PM
April 17, 2013
openwlanmap was emptier than openbmap, but unfortunately I could not figure out how to upload there. Figuring out Openwlanmap upload took a while, too, but I succeeded in the end.
Oh, and based on openmobility feedback, I created a "fusion" script to determine position from multiple sources, assuming a limit on the speed the user moves at. In the process, I did some measurements. 95% of all "current GSM cell" situations seem to be less than 4.8km from the cell. That was expected. But the same number for wifi is very high: 95% of all "wifi detected" situations fit within 2.8km... Either omnidirectional wifi has way longer range than I expected, or there's something wrong in the measurements. That was with n900.
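For anyone who wants to check similar numbers on their own logs, the calculation is roughly the following. This is a sketch, not the actual fusion script: the function names are made up, the input is assumed to be pairs of (measured position, known cell or access point position), and the percentile is the simple nearest-rank variety.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS84 points, in kilometres."""
    r = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

Feeding percentile() the haversine_km() distance of every logged observation with pct=95 gives the "95% of situations are within N km" figure.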
April 17, 2013 12:59 PM
April 16, 2013

Continuing the Hardware at Havana theme, Facebook showed their "Open Hardware" platform. Like Google, they found it advantageous to design systems for themselves, but unlike Google, they opened the design. Anyone can download the detailed docs and anyone can contribute.
I was sceptical about open source for hardware. The material nature of hardware places a high barrier, so participation tends to be low. Nonetheless, the gentleman at the booth said that someone contributed an ARM server board. It slides into the slot under the drive tray where SAS extender goes in the reference design.
It shows how long I was out of the hardware, but apparently it's completely acceptable nowadays to place drives along the airflow. The Facebook tray is very much like Dell design that way, only using 3.5" drives. Fans are removable on drive trays but not on server bricks.
In a strange twist, Facebook trays take 12V DC power. Not telco-style 48V, nor conventional AC.
April 16, 2013 08:45 PM

Saw a cute thing at the "Havana" OpenStack Summit: a Dell ARM prototype. It is based on C5125 module for C5220 chassis. If you go to Dell website and check their gallery, it's readily apparent how busy the design of the x86-based module is when compared with the ARM (the x86 is AMD; Dell have an even worse module based on Intel -- 50% thicker).
Another thing, I love the way drives are bunched on trays. People keep chasing the individually accessible drives with the corresponding hit to density. If one abandons the hot-swap mania, he can do these nice trays. Also! This is the first time I see someone intelligently using the 2.5" drives. Only took 15 years. About the only critique I would level on the thing is that the air goes over the drives in sequence. There's a bit of spare width for staggering, so a few cheap baffles and/or raisers could even it out, but someone at Dell decided not to bother.
BTW, Dell people told me that the prototype is not slated for production. They are waiting for the 64-bit ARM. Typical. ARM servers are always one year away, for many years now. In the time we heard promises of ARM servers SpaceX built whole rocket and started launching it, but Netwinder seems harder to reproduce than Saturn V.
April 16, 2013 04:12 PM
April 12, 2013
| | 17 | 18 | 19 | rawhide | (total) |
| Open: | 278 | 371 | 152 | 62 | (863) |
| Opened since 2013-04-05 | 9 | 22 | 7 | 3 | (41) |
| Closed since 2013-04-05 | 7 | 40 | 62 | 3 | (112) |
| Changed since 2013-04-05 | 20 | 67 | 95 | 18 | (200) |
Weekly Fedora kernel bug statistics – April 12th 2013 is a post from: codemonkey.org.uk
April 12, 2013 10:02 PM
April 09, 2013
First things first: If you damage your hardware and burn down your house, it's not my problem. You've been warned!
So after a bit of irc discussions yesterday it turns out that you can overclock intel gpus by quite a margin. Which makes some sense now that the gfx performance of intel chips isn't something to be completely ashamed of.
You need a few ingredients:
- A Sandybridge/Ivybridge Intel GPU (Intel HD 2000/3000/2500/4000 in marketing speak).
- An "enthusiast" motherboard which allows you to set gpu overclocking settings (higher frequency and voltage).
- Latest drm-intel-nightly kernel branch from drm-intel git or at least a kernel built with this patch from Ben Widawsky. The patch should apply to pretty much any recent stable kernel. Without that patch gpu turbo support is broken.
- Preferably a desktop gpu with a big cooling rig. Overclocking on modern intel chips essentially just increases the turbo headroom into the unvalidated range (with the potential for hangs and corruptions in rendering). And if your cooling system can't keep up with the increased heat output the on-chip controller will quickly clamp your clocks down to the non-turbo frequency.
- Your favorite gpu benchmark. The reason for that is the thermal throttling when running in the turbo range: It happens behind the driver's back and the only observable effect is a slower gpu - the current frequency value in sysfs is still the same!
To check whether it all works out you can boot with drm.debug=0xe appended to your kernel cmdline and check dmesg for the gpu overclocking support debug message:
[98650.411179] [drm:gen6_enable_rps], overclocking supported, adjusting frequency max from 1300MHz to 1300MHz
Yeah, my system doesn't really support overclocking :(
Or check in sysfs in /sys/class/drm/card0 the various gt_*_freq_mhz files. You can also use those to limit the upper clock, which is useful for figuring out at which frequency your gpu is still stable for a given voltage setting.
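If you'd rather not cat each file by hand, a few lines of Python dump them all. This is just a convenience sketch: the function name is made up, and it assumes the files each hold a single integer in MHz, which is what I see here.

```python
import glob
import os


def read_gt_freqs(card_dir="/sys/class/drm/card0"):
    """Return {file name: MHz} for every gt_*_freq_mhz file under card_dir."""
    freqs = {}
    for path in sorted(glob.glob(os.path.join(card_dir, "gt_*_freq_mhz"))):
        with open(path) as f:
            freqs[os.path.basename(path)] = int(f.read().strip())
    return freqs


if __name__ == "__main__":
    for name, mhz in read_gt_freqs().items():
        print("%s: %d MHz" % (name, mhz))
```

Reading works as a normal user; changing a limit means writing a new value to the corresponding file as root.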
Limiting the lower clocks is usually not interesting, since it will only affect how much power the hardware can save under intermediate loads. And you want to aggressively reduce heat output to better use the turbo range when needed.
If you play around with this please drop a comment with your results - I only have boring Intel developer boards around here which don't support overclocking. So no fun for me. But freezer on #intel-gfx managed to overclock his desktop Ivybridge from 1.1GHz to 1.6GHz with only a small voltage increase, and fps in CS:S increased quite a bit due to that.
Happy overclocking and benchmarking
Update: In debugfs you can check the CAGF (current actual gt frequency) field in the i915_cur_delayinfo file for the real frequency, including any effects due to thermal throttling.
Update 2: Latest kernels (and so also 3.10) will limit the gpu frequency at boot-up to the non-overclocked range. This should prevent crashes while booting, but it also means that you need to manually set the overclocked frequency limit.
April 09, 2013 10:45 PM
April 08, 2013
I picked up TI Chronos after a long time -- like half a year. (Yes, I'd like to know if Pebble smartwatch is lighter than this... because chronos are a bit too heavy for me.) And I was surprised: time on watch actually matched time on thermometer.
...which was very surprising, given that DST changed in the meanwhile, and given that chronos had trouble keeping good time.
Well... it turned out I was really lucky: chronos was a day off. (And then I realized that I did not update thermometer to DST. I hate DST. Changing timezone twice a year is stupid.)
April 08, 2013 12:49 PM
April 07, 2013
Openmobility conference was nice... All the promising cellphone platforms around, 2 3D printers and a plane with camera as a bonus.
If someone is interested, slides are here. And yes, contributing GSM cells to the databases is as easy as walking around with an Android phone and a suitable application. Czech Republic has ~50% cell coverage now, so GSM positions are already usable, and it will only get better...
April 07, 2013 10:15 PM
April 05, 2013
(See here for an update to this)
A hardware vendor apparently had a copy of an AMI private key on a public FTP site. This is concerning, but it's not immediately obvious how dangerous this is for a few reasons. The first is that this is apparently the firmware signing key, not any of the Secure Boot keys. That means it can't be used to sign a UEFI executable or bootloader, so can't be used to sidestep Secure Boot directly. The second is that it's AMI's key, not a board vendor's - we don't (yet) know if this key is used to sign any actual shipping firmware images, or whether it's effectively a reference key. And, thirdly, the code apparently dates from early 2012 - even if it was an actual signing key, it may have been replaced before any firmware based on this code shipped.
But there's still the worst case scenario that this key is used to sign most (or all) AMI-based vendor firmware. Can this be used to subvert Secure Boot? Plausibly. The attack would involve producing a new, signed firmware image with Secure Boot either disabled or with an additional key installed, and then to reflash that firmware. Firmware images are very board-specific, so unless you're engaging in a very targeted attack you either need a large repository of firmware for every board you want to attack, or you need to perform in-place modification.
Taking a look at the firmware update tool used for AMI systems, the latter might be possible. It seems that the AMI firmware driver allows you to dump the existing ROM to a file. It'd then be a matter of pulling apart the firmware image, modifying the key database, putting it back together, signing it and flashing it. It looks like doing this does require that the user enter the firmware password if one's set, so the simplest mitigation strategy would be to do that.
So. If this key is used by most vendors shipping AMI-based firmware, and if it's a current (rather than test) key, then it may well be possible for it to be deployed in an automated malware attack that subverts the Secure Boot trust model on systems running AMI-based firmware. The obvious lesson here is that handing out your private keys to third parties that you don't trust is a pretty bad idea, as is including them in source repositories.
(Wow, was this really as long ago as 2004? How little things change)
April 05, 2013 11:21 PM
According to the update here, the signing keys are supposed to be replaced by the hardware vendor. If vendors do that, this ends up being uninteresting from a security perspective - you could generate a signed image, but nothing would trust it. It should be easy enough to verify, though. Just download a firmware image from someone using AMI firmware, pull apart the capsule file, decompress everything and check whether the leaked public key is present in the binaries.
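The last step of that check is just a byte search. A minimal sketch, assuming some other tool has already unpacked and decompressed the capsule into a directory, and that you have the leaked public key as raw bytes (the modulus, say); the function name and both of those inputs are hypothetical.

```python
import os


def files_containing(root: str, needle: bytes):
    """Yield paths under root whose contents include the byte string needle."""
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                if needle in f.read():  # firmware modules are small enough to slurp
                    yield path
```

An empty result only says the key wasn't found in that form. A vendor could store it in a different encoding, so a miss is weaker evidence than a hit.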
The real risk here is that even if most vendors have replaced that key, some may not have done. There's certainly an argument that shipping test keys at all increases the probability that a vendor will accidentally end up using those rather than generating their own, and it's difficult to rule out the possibility that that's happened.
April 05, 2013 11:21 PM
| | 17 | 18 | 19 | rawhide | (total) |
| Open: | 279 | 393 | 212 | 49 | (933) |
| Opened since 2013-03-29 | 14 | 20 | 10 | 3 | (47) |
| Closed since 2013-03-29 | 12 | 52 | 5 | 8 | (77) |
| Changed since 2013-03-29 | 29 | 132 | 178 | 5 | (344) |
Ton of F18 bugs closed out, and the rawhide bugs became F19 bugs now that we’ve branched.
Weekly Fedora kernel bug statistics – April 05 2013 is a post from: codemonkey.org.uk
April 05, 2013 04:13 PM
April 04, 2013
Czech newspapers have article on human vs. animal performance. They claim that horses have top speed around 30 km/h and can't compete with humans on long distances. Well, top horse speed is above 60 km/h (ponies can do 30km/h), and yes they can mostly beat humans on long distances.
Anyway, I always wanted to do horse vs. human races... I'll provide the horse (slightly-trained fjord). Any humans?
April 04, 2013 11:14 AM
April 03, 2013
Last week I mentioned that March had been a busy month for Trinity.
It wasn’t until I went to update the bugs found list that I realised just how busy. Some historical context: Throughout 2012, Trinity found a dozen or so bugs in a month. (Plus probably some that I wasn’t cc’d on).
The last few months, it’s been finding a lot more. February, it found 65 bugs. March, at least 85. (Actually even more, my count is based on unique Subject: lines in emails to linux-kernel, and sometimes multiple bugs were found/reported within the same thread. And again, I may have missed mails that I wasn’t cc’d on).
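The counting itself is nothing clever. Something along these lines would do it, given the Subject: lines pulled out of a mail folder; this is a sketch of the idea rather than the script actually used, and the function name is made up.

```python
import re

# Leading "Re:", "Fw:" or "Fwd:" markers, possibly repeated.
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*)+", re.IGNORECASE)


def unique_subjects(subjects):
    """Return the set of thread subjects with reply/forward prefixes stripped."""
    threads = set()
    for subject in subjects:
        threads.add(_PREFIX.sub("", subject).strip().lower())
    return threads
```

It undercounts in exactly the way described above: two different bugs reported in one thread share a subject and collapse into a single entry.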
This doesn’t necessarily mean that the kernel is getting buggier, but more likely, that trinity is getting a lot better at finding bugs.
Several new features introduced last month are responsible for a slew of the most recently reported bugs, and many of them were in code that was introduced into the kernel a while ago. (See for eg, the recent namespace bugs).
With the volume of bugs found getting so high, it’s too much work to do things like tracking down whether each bug got fixed and moving it to the ‘fixed’ page, so I’m probably going to have to come up with some better way of tracking bugs found with Trinity.
Trinity success rate. is a post from: codemonkey.org.uk
April 03, 2013 12:27 AM
April 01, 2013
Last week I used 2 BTC to support Jupiter Broadcasting’s Unfilter show (and their other shows, but only Unfilter takes BTC so far). Just now I noticed that someone made a 0.5BTC donation to my blog (I’ve had a BTC donation address in the sidebar of my blog for a few years now). Thanks!
As I promised to pass donations onwards, I googled for bitcoin donations, and chose the following places to give 0.05 BTC each:
- Juice Rap News for making high-baud political commentary (Unfilter in rap form)
- Freedom Box for actually doing something about Internet freedom.
- Torservers.net (as recommended by torproject.org) for the same.
- f-droid.org for keeping a healthy Open alternative.
- Bitcoin Foundation to support and strengthen the infrastructure that made this possible.
- The Free Software Foundation even though I don’t always agree with them.
- Wikileaks for recognizing something society needs, even if they stumble at delivery.
- The Internet Archive for something that only gets more useful over time.
There are two left to go, so I’ll keep an eye out for more opportunities to donate in the next few weeks…
-0.05
April 01, 2013 02:36 AM
March 30, 2013
So I've been involved in a recent dispute on the Wayland project, with a person I'd classify as a poisonous person. Basically a contributor who was doing more damage than good, and was causing unneeded disturbances. I won't comment any further on that here; I'm just setting the scene for writing this.
So every time something like this happens in a project, people emerge from the woodwork claiming that having public discussions about these sorts of things is bad for open source, or makes us look like a crowd of juvenile developers, and that you never see this sort of thing on closed source projects, or with open-source projects developed in-house and thrown over the wall. I've also recently seen this crop up when Linus flamed people, and everyone wondered why he didn't do it on some sort of private list or something.
Now I can only think these people either:
a) have never worked in a company on a major closed source project, or
b) if they have, it's been top-down development, where managers are telling them what to do, and maybe some architect dude has drawn a load of pretty pictures and docs. Of course the architect is never wrong, but it's above your pay grade to talk to someone of such authority, so when you find problems with the architecture you hack around them instead of growing a pair and standing your ground, or else you aren't good enough to notice anything wrong.
I've seen plenty of companies where developers leave due to in-fighting or transfer to a different department; this stuff never comes out, and you are all none the wiser.
So open source doesn't have top-down development, it's all bottom up. Most contributors to major projects do so with some ideas of what they want, but they aren't being driven by a management chain. However, it means that there is generally nobody to force someone into their views, and when two people collide (or in this case, one person and everyone else), something has to give, and it's best to give in public, so nobody can say it was some sort of cabal or closed decision.
Now open-source is about seeing the sausage making process: you get to see all the bits of stuff you don't want to think go into the sausages, you have to face a lot more truth, and you have to be willing to stand up against things without mummy manager to back you up. You can't have all the nice benefits of open-source development without having the bad side, the public blowups and discussion; it just can't work like that. If we take all those discussions to private lists or emails, where do you draw the line? Are the people on that private list some sort of shadowy cabal overlords? Do you want an open-source development model that isn't public?
I'm sure people will say "why can't we all just get along?" and "why can't everyone act mature?" Well, a) we are human, and b) there is no HR department frontend blocking the people at the gate, and there's no interview process to weed out undesirable traits before they join the project. So when someone submits patches that work you generally accept them as a contributor, and it can take a while before you realise they are doing more harm than good, at which point it's going to be public.
March 30, 2013 08:08 AM
March 29, 2013
I figured in addition to the weekly bug reports, I’m going to start posting an end of month report too, to get a bigger picture of what we’re dealing with in a typical month. I might even stop doing the weekly reports at some point.
So the highlights this month: We closed out 354, and got 259 new bugs. 405 other bugs had their status change in some way.
This also doesn’t count some residual post-EOL F16 activity.
It also doesn’t count a lot of f19/rawhide activity where we find bugs ourselves and report & get them fixed upstream.
During March, Trinity found a *lot* of bugs. Separate post on that next week.
Monthly Fedora kernel bug statistics – March 2013 is a post from: codemonkey.org.uk
March 29, 2013 04:30 PM
| | 17 | 18 | 19 | rawhide | (total) |
| Open: | 271 | 443 | 16 | 244 | (974) |
| Opened since 2013-03-22 | 7 | 39 | 13 | 5 | (64) |
| Closed since 2013-03-22 | 98 | 16 | 2 | 2 | (118) |
| Changed since 2013-03-22 | 28 | 75 | 13 | 14 | (130) |
Lots of closing out of old bugs in the last week.
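As a rough illustration of where that closing-out happened, the opened and closed figures for the week can be netted per release. This is only arithmetic on the numbers in the table; the variable names are made up for the sketch:

```python
# Opened and closed counts since 2013-03-22, in column order from the table.
releases = ["17", "18", "19", "rawhide"]
opened = [7, 39, 13, 5]
closed = [98, 16, 2, 2]

# Net change in open bugs per release (negative means more closed than opened).
net_change = {r: o - c for r, o, c in zip(releases, opened, closed)}
total_net_change = sum(net_change.values())

for release, delta in net_change.items():
    print(f"{release}: {delta:+d}")
print(f"total: {total_net_change:+d}")
```

On these figures almost all of the reduction comes from the F17 column.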
Weekly Fedora kernel bug statistics – March 29th 2013 is a post from: codemonkey.org.uk
March 29, 2013 04:23 PM
As usual, Murphy's law dictates that problems will occur at the worst
possible moment. One of my servers in the data center died on March 20,
and it was the machine which hosts the majority of the free software
projects that I've created or am involved in. From people.netfilter.org
to OpenPCD and OpenEZX to gpl-violations.org and virtually all
osmocom.org sites and services.
Recovery was slow as there is no hot spare and none of my other
machines in the data center have backplanes for the old SCA-80 hard
disks that are in use by that particular machine. So we had to send the
disks to Berlin, wait until I'm back there, and then manually rsync
everything over to a different box in the data center.
To my big surprise, not many complaints reached me (and yes, my
personal and/or business e-mail was not affected in any way).
Recovery is complete now, and I'm looking forward to things getting back
to normal soon.
March 29, 2013 01:00 AM
OsmoDevCon
2013 is getting closer every day, and I'm very much looking forward
to meeting the fellow developers of the various Osmocom sub-projects.
Organization-wise, the catering has now been sorted out, and Holger has
managed to get a test license for two ARFCN from the regulatory body
without any trouble.
This means that we're more or less all set. The key needs to be picked
up from IN-Berlin, and we need to bring some extra extension cords,
ethernet switch, power cords and other gear, but those are really only very
minor tasks.
There's not as much of a formal schedule as we had last year, which
is good as I hope it means we can focus on getting actual work done, as
opposed to spending most of the time updating one another about our
respective work and progress.
March 29, 2013 01:00 AM
March 27, 2013
I gave a presentation at Libreplanet this weekend on the topic of Secure Boot and Restricted Boot. There's a copy of the video here - it should be up on the conference site at some point. It turned out to be excellent timing, in that a group in Spain filed a complaint with the European Commission this morning arguing that Microsoft's imposition of Secure Boot on the x86 client PC market is anticompetitive. I suspect that this is unlikely to succeed (the Commission has already stated that the current implementation appears to conform to EU law), and I fear that it's going to make it harder to fight the real battle we face.
Secure Boot means different things to different people. I think the FSF's definition is a useful one - Secure Boot is any boot validation scheme in which ultimate control is in the hands of the owner of the device, while Restricted Boot is any boot validation scheme in which ultimate control is in the hands of a third party. What Microsoft require for x86 Windows 8 devices falls into the category of Secure Boot - assuming that OEMs conform to Microsoft's requirements, the user must be able to both disable Secure Boot entirely and also leave Secure Boot enabled, but with their own choice of trusted keys and binaries. If the FSF set up a signing service to sign operating systems that met all of their criteria for freeness, Microsoft's requirements would permit an end user to configure their system such that it refused to run non-free software. My system is configured to trust things shipped by Fedora or built locally by me, a decision that I can make because Microsoft require that OEMs support it. Any system that meets Microsoft's requirements is a system that respects the freedom of the computer owner to choose how restrictive their computer's boot policy is.
This isn't to say that it's ideal. The lack of any common UI or key format between hardware vendors makes it difficult for OS vendors to document the steps users must take to assert this freedom. The presence of Microsoft as the only widely trusted key authority leaves people justifiably concerned as to whether Microsoft will be equally aggressive in blacklisting its own products as it will be in blacklisting third party ones. Implementation flaws in a (very) small number of systems have resulted in correctly signed operating systems failing to boot, requiring users to update their firmware before being able to install anything but Windows.
But concentrating on these problems misses the wider point. The x86 market remains one where users are able to run whatever they want, but the x86 market is shrinking. Users are purchasing tablets and other ARM-based ultraportables. Some users are using phones as their primary computing device. In contrast to the x86 market, Microsoft's policies for the ARM market restrict user freedom. Windows Phone and Windows RT devices are required to boot only signed binaries, with no option for the end user to disable the signature validation or install their own keys. While the underlying technology is identical, this differing set of default policies means that Microsoft's ARM implementation is better described as Restricted Boot. The hardware vendors and Microsoft define which software will run on these systems. The owner gets no say.
And, unfortunately, Microsoft aren't alone. Apple, the single biggest vendor in this market, implement effectively identical restrictions. Some Android vendors provide unlockable bootloaders, but others (either through personal preference or at the behest of phone carriers) lock down their platforms. A naive user is likely to end up purchasing a device that will, in the absence of exploited security flaws, refuse to run if any system components are modified. Even in cases where the underlying components are built using free software, there's no guarantee that the user will have the ability to assert any of those freedoms.
Why does this matter? Some of these platforms (notably Windows RT and iOS, but also some Android-based devices) will even refuse to run unsigned applications. Users are unable to write their own software and distribute it to others without agreeing to often onerous restrictions. Users with the misfortune of living in the wrong country may be forbidden from even that opportunity. The vendor may choose to block applications that compete with their own, reducing innovation. The ability to explore and tinker with the components of the system is restricted, making it harder for users to learn how modern operating systems work. If I own a perfectly functional phone that no longer receives vendor updates, I don't even have the option of paying a third party to ensure that I can't be compromised by a malicious website and risk the loss of passwords or financial details. The user is directly harmed by these restrictions.
I won't argue that there are no benefits to curated software ecosystems. I won't even argue against devices shipping with a locked down policy by default. I will strongly argue that the owner of a device should not only have the freedom to choose whether they wish to remain within those locked-down boundaries, but should also have the freedom to impose their own boundaries. There should be no forced choice between freedom and security.
Those who argue against Secure Boot risk depriving us of the freedom to make a personal decision as to who we trust. Those who argue against Secure Boot while ignoring Restricted Boot risk depriving us of even more. The traditional PC market is decreasing in importance. Unless we do something about it, free software will be limited to a niche group of enthusiasts who've carefully chosen from a small set of devices that respect user freedom. We should have been campaigning against Restricted Boot 10 years ago. Don't delay it even further by fighting against implementations that already respect user freedom.
March 27, 2013 12:28 AM
March 26, 2013
This probably only interests you if you’re from Minnesota in the US and have Xcel as a utility provider. Xcel has recently proposed rate increases as high as 10.7%; they recently scaled that back to 8.2%, and discussions are still pending. However, in the reporting thus far I’ve not seen any details about proposed changes for residential customers. I went looking at the PUC docket system for details, and found document #201211-80322-06 in docket 12-961. There are probably other fee/rate/rider changes as well, but the top-line changes for residential seem to be:
That’s a $2 $3/month increase for “customer charge” and per-kWh increases as well. Other parts of the document seem to reveal no change in the WindSource program cost, and I can’t find any details of changes to net metering agreements. I’m not too broken up by this, I only buy on average about 110kWh/month net after my solar, and price increases raise my avoided costs and make solar pay off faster. I imagine it’s a hardship for some. If the increases went towards cleaner, more efficient energy I’d be happier with it; if it goes to corporate jets, a bit less so.
The above changes are before Xcel proposed a slightly lower increase; it’s unclear how that will affect residential customers at this point, I guess.
March 26, 2013 01:11 AM
March 23, 2013

Today marks the 1000th day of operation for the 2.53 kW solar array on our roof. Since we installed it:
- 8.9 MWh of energy has been produced (almost 9kWh/day on average)
- 6 tons of CO2 have been avoided
- $1,100 in electricity costs have been avoided ($850 if I don’t count the optional fee for wind power I’ve subscribed to for years)
- 70% of our electrical consumption has been covered
- No maintenance has been required (other than optional snow removal)
- My inverter manufacturer (Enphase) has gone public
- My panel manufacturer (Siliken) has gone bankrupt. (But warranties remain intact, in theory)
- Utility rebates from Xcel have dropped from $2.25/watt to $1.50 since we installed.
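The per-day average quoted in the list follows directly from the totals. A small sketch of that arithmetic, which also derives an implied average avoided cost per kWh (that last figure is computed here from the totals, not stated in the post):

```python
energy_kwh = 8.9 * 1000      # 8.9 MWh produced over the period
days = 1000                  # days of operation
avoided_cost_usd = 1100      # electricity costs avoided, wind fee included

daily_kwh = energy_kwh / days
cost_per_kwh = avoided_cost_usd / energy_kwh

print(f"average production: {daily_kwh:.1f} kWh/day")
print(f"implied avoided cost: ${cost_per_kwh:.3f}/kWh")
```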
We’ve recently embarked on a plan to renovate the house, and may end up with a new flat roof on the back. If so, I hope we can have space and budget to add another 1kW or so.

Minnesota has also continued to move forward on solar energy, with Ikea briefly holding the state record with a 1.1MW array on its roof, later eclipsed by a 2MW utility-scale system in Slayton, MN. My utility, Xcel Energy, wanted to eliminate the solar rebate altogether, but it was extended at a lower rate of $1.50/watt for now – appropriate, I feel, so that the solar industry has some planning runway, but rebates drop along with system costs. There has also been some discussion of a 10% solar standard for Minnesota. Solar Works in Minnesota!
March 23, 2013 08:02 PM
March 22, 2013
It looks like I'll be speaking at the openmobility conference in Bratislava. The talk will be about various positioning systems: basically GPS, and positioning using GSM and wifi.
March 22, 2013 08:02 PM
| | 17 | 18 | 19 | rawhide | (total) |
| Open: | 359 | 422 | 8 | 241 | (1030) |
| Opened since 2013-03-15 | 6 | 41 | 9 | 5 | (61) |
| Closed since 2013-03-15 | 20 | 53 | 2 | 12 | (87) |
| Changed since 2013-03-15 | 33 | 92 | 6 | 17 | (148) |
The arrival of F19 bugs.
The most notable thing this past week has been yet more intel-pstate bugs. That new cpufreq driver has been blowing up in a number of ways recently. Some of them are now fixed, but there are still 1-2 outstanding issues.
Weekly Fedora kernel bug statistics – March 22nd 2013 is a post from: codemonkey.org.uk
March 22, 2013 05:34 PM
March 21, 2013
...and yes, I did suspect a map format update. It turns out I was right. Thanks to the PocketBook hackers, monav works for me again... making the n900 way more useful. I just wonder if there's a better place to download map data than uloz.to...?
March 21, 2013 10:54 PM
March 20, 2013
With the imminent release of gcc 4.8, GCC has finally switched to C++ as the implementation language. As usual, LWN has excellent coverage. Those with long memories will remember Linux trying to use g++ back in 1992 and retreating in horror at the larger, slower code. The main benefit was stricter typechecking, particularly for enums (a great idea: I had -Wstrict-enum patches for gcc about 12 years ago, which was a superset of the -Wenum-compare we have now, but never got it merged).
With this in mind, and Ian Taylor’s bold assertion that “The C subset of C++ is as efficient as C”, I wanted to test what had changed with some actual measurements. So I grabbed gcc 4.7.2 (the last release which could do this), and built it with C and C++ compilers:
- ../gcc-4.7.2/configure --prefix=/usr/local/gcc-c --disable-bootstrap --enable-languages=c,c++ --disable-multiarch --disable-multilib
- ../gcc-4.7.2/configure --prefix=/usr/local/gcc-cxx --disable-bootstrap --enable-languages=c,c++ --disable-multiarch --disable-multilib --enable-build-with-cxx
The C++-compiled binaries are slightly larger, though that’s mostly debug info:
- -rwxr-xr-x 3 rusty rusty 1886551 Mar 18 17:13 /usr/local/gcc-c/bin/gcc
text data bss dec hex filename
552530 3752 6888 563170 897e2 /usr/local/gcc-c/bin/gcc
- -rwxr-xr-x 3 rusty rusty 1956593 Mar 18 17:13 /usr/local/gcc-cxx/bin/gcc
text data bss dec hex filename
552731 3760 7176 563667 899d3 /usr/local/gcc-cxx/bin/gcc
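To put numbers on "slightly larger", here is a quick sketch that compares the growth in file size with the growth in the sections reported by size. It only does arithmetic on the figures listed above; attributing the remainder to debug info is the post's claim, not something this calculation proves:

```python
# File sizes from ls and section totals (the "dec" column) from size.
file_size = {"c": 1886551, "cxx": 1956593}
section_total = {"c": 563170, "cxx": 563667}

file_growth = file_size["cxx"] - file_size["c"]
section_growth = section_total["cxx"] - section_total["c"]
# Rough split only: bss is counted in "dec" but takes no space in the file.
other_growth = file_growth - section_growth

print(f"file growth: {file_growth} bytes")
print(f"text+data+bss growth: {section_growth} bytes")
print(f"growth outside those sections: {other_growth} bytes")
```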
Then I used them both to compile a clean Linux kernel 10 times:
- for i in `seq 10`; do time make -s CC=/usr/local/gcc-c/bin/gcc 2>/dev/null; make -s clean; done
- for i in `seq 10`; do time make -s CC=/usr/local/gcc-cxx/bin/gcc 2>/dev/null; make -s clean; done
Using stats --trim-outliers, which throws away the best and worst runs, we have the times for the remaining 8 (C-compiled first, then C++-compiled):
- real 14m24.359000-35.107000(25.1521+/-0.62)s
user 12m50.468000-52.576000(50.912+/-0.23)s
sys 1m24.921000-27.465000(25.795+/-0.31)s
- real 14m27.148000-29.635000(27.8895+/-0.78)s
user 12m50.428000-52.852000(51.956+/-0.7)s
sys 1m26.597000-29.274000(27.863+/-0.66)s
So the C++-compiled binaries are measurably slower, though not noticeably: it’s about 865 seconds vs 868 seconds, or about 0.3%. Even if a kernel compile spends half its time linking, statting, etc, that’s under 1% slowdown.
And it’s perfectly explicable by the larger executable size. If we strip all the gcc binaries, and do another 10 runs of each (… flash forward to the next day.. oops, powerfail, make that 2 days later):
- real 14m24.659000-33.435000(26.1196+/-0.65)s
user 12m50.032000-57.701000(50.9755+/-0.36)s
sys 1m26.057000-28.406000(26.863+/-0.36)s
- real 14m26.811000-29.284000(27.1308+/-0.17)s
user 12m51.428000-52.696000(52.156+/-0.39)s
sys 1m26.157000-27.973000(26.869+/-0.41)s
Now the difference is 0.1%, pretty much in the noise.
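The percentages quoted for both sets of runs can be reproduced from the real-time figures. This sketch assumes the parenthesized value in each "real" line is the mean of the seconds field, which is a reading of the stats output format rather than something the post spells out:

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds

def slowdown_percent(baseline, candidate):
    """Relative slowdown of candidate over baseline, in percent."""
    return (candidate - baseline) / baseline * 100

# Mean "real" times: C-compiled gcc vs C++-compiled gcc.
unstripped_c = to_seconds(14, 25.1521)
unstripped_cxx = to_seconds(14, 27.8895)
stripped_c = to_seconds(14, 26.1196)
stripped_cxx = to_seconds(14, 27.1308)

unstripped_pct = slowdown_percent(unstripped_c, unstripped_cxx)
stripped_pct = slowdown_percent(stripped_c, stripped_cxx)

print(f"unstripped: {unstripped_c:.0f}s vs {unstripped_cxx:.0f}s, {unstripped_pct:.2f}% slower")
print(f"stripped:   {stripped_c:.0f}s vs {stripped_cxx:.0f}s, {stripped_pct:.2f}% slower")
```

Both results round to the figures given in the text, roughly 0.3% before stripping and 0.1% after.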
Summary: so whether you like C++ or not, the performance argument is moot.
March 20, 2013 10:52 PM
I've released man-pages-3.50. The release tarball is available on kernel.org. The browsable online pages can be found on man7.org. The Git repository for man-pages is available on kernel.org.
This is a relatively small release; the notable changes in man-pages-3.50 are as follows:
- I've rewritten the canonicalize_file_name(3) manual page, adding quite a bit more detail.
- I've updated and added several details to what was a rather outdated ioperm(2) manual page.
- The syscalls(2) page now includes the kernel version in which all system calls were added.
- Elie de Brauwer added descriptions for various ioctl() interfaces for /dev/random in the random(4) page.
March 20, 2013 10:28 PM
March 19, 2013
The problem with Samsung laptops bricking themselves turned out to be down to the UEFI variable store becoming more than 50% full and Samsung's firmware being dreadful, but the trigger was us writing a crash dump to the nvram. I ended up using this feature to help someone get a backtrace from a kernel oops during suspend today, and realised that it's not been terribly well publicised, so.
First, make sure pstore is mounted. If you're on 3.9 then do:
mount -t pstore /sys/fs/pstore /sys/fs/pstore
For earlier kernels you'll need to find somewhere else to stick it. If there's anything in there, delete it - we want to make sure there's enough space to save future dumps. Now reboot twice[1]. Next time you get a system crash that doesn't make it to system logs, mount pstore again and (with luck) there'll be a bunch of files there. For tedious reasons these need to be assembled in reverse order (part 12 comes before part 11, and so on) but you should have a crash log. Report that, delete the files again and marvel at the benefits that technology has brought to your life.
[1] UEFI implementations generally handle variable deletion by flagging the space as reclaimable rather than immediately making it available again. You need to reboot in order for the firmware to garbage collect it. Some firmware seems to require two reboot cycles to do this properly. Thanks, firmware.
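The reverse-order assembly described above can be sketched as follows. This is an illustration under assumptions: it treats a trailing number in each file name as the part number, and the "dmesg-efi-N" names in the demonstration are hypothetical, so check how your kernel actually names the files before relying on it:

```python
import re
import tempfile
from pathlib import Path

def assemble_crash_log(pstore_dir):
    """Concatenate pstore dump parts, highest part number first."""
    parts = []
    for path in Path(pstore_dir).iterdir():
        # Assumption: each part's file name ends in its part number.
        match = re.search(r"(\d+)$", path.name)
        if match and path.is_file():
            parts.append((int(match.group(1)), path))
    # Sort numerically, descending, so part 12 comes before part 11
    # and part 2 is not mistaken for coming before part 10.
    parts.sort(key=lambda item: item[0], reverse=True)
    return "".join(path.read_text(errors="replace") for _, path in parts)

# Demonstration against a temporary directory rather than /sys/fs/pstore.
with tempfile.TemporaryDirectory() as demo_dir:
    for number in range(1, 13):
        Path(demo_dir, f"dmesg-efi-{number}").write_text(f"part {number}\n")
    demo_log = assemble_crash_log(demo_dir)
    print(demo_log.splitlines()[:3])
```

On a real system you would call assemble_crash_log("/sys/fs/pstore") after mounting pstore, then save the result before deleting the files.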
March 19, 2013 06:31 PM
Content copyright by their respective authors.