October 27, 2016

Russ Allbery

DocKnot 1.00

I'm a bit of a perfectionist about package documentation, and I'm also a huge fan of consistency. As I've slowly accumulated more open source software packages (alas, fewer new ones these days since I have less day-job time to work on them), I've developed a standard format for package documentation files, particularly the README in the package and the web pages I publish. I've iterated on these, tweaking them and messing with them, trying to incorporate all my accumulated wisdom about what information people need.

Unfortunately, this gets very tedious, since I have dozens of packages at this point and rarely want to go through the effort of changing every one of them every time I come up with a better way of phrasing something or change some aspect of my standard package build configuration. I also have the same information duplicated in multiple locations (the README and the web page for the package). And there's a lot of boilerplate that's common for all of my packages that I don't want to keep copying (or changing when I do things like change all eyrie.org URLs to HTTPS).

About three years ago, I started seriously brainstorming ways of automating this process. I made a start on it during one self-directed day at my old job at Stanford, but only got it far enough to generate a few basic files. Since then, I keep thinking about it, keep wishing I had it, and keep not putting the work into finishing it.

During this vacation, after a week and a half of relaxing and reading, I finally felt like doing a larger development project and finally started working on this for long enough to build up some momentum. Two days later, and this is finally ready for an initial release.

DocKnot uses metadata (which I'm putting in docs/metadata) that's mostly JSON plus some documentation fragments, and it generates README, the web page for the package (in thread, the macro language I use for all my web pages), and README.md, a Markdown version of README that will look nice on GitHub (the other thing I've wanted to do but didn't want to tackle without this tool).

The templates that come with the package are all rather specific to me, particularly the thread template, which would be unusable by anyone else. I have no idea if anyone else will want to use this package (and right now the metadata format is entirely undocumented). But it's a shame not to release things as free software, and I suspect I may need to upload it to Debian since, technically, this tool is required to "build" the README file distributed with my packages, so here it is. I've also uploaded it to CPAN (it's my first experiment with the App::* namespace for things that aren't really meant to be used as a standalone Perl module).

You can get the latest version from the DocKnot distribution page (which is indeed generated with DocKnot). Also now generated with DocKnot are the rra-c-util and C TAP Harness distribution pages. Let me know if you see anything weird; there are doubtless still a few bugs.

27 October, 2016 05:07AM

October 26, 2016

hackergotchi for Christoph Egger

Christoph Egger

Installing a python systemd service?

As web search engines and IRC seem to be of no help, maybe someone here has a helpful idea. I have a service written in Python that comes with a .service file for systemd. I now want to build and install a working service file from the software's setup.py. I can override the build/build_py commands of setuptools, but that way I still lack knowledge of the bindir/prefix where my service script will be installed.


Turns out that if you override the install command (not install_data!), you will have self.root and self.install_scripts (and lots of other self.install_*). As a result, you can read the template and write the desired output file after calling the parent's run method. The fix was inspired by GateOne (which, however, doesn't get the --root parameter right: you need to strip self.root from the beginning of the path to actually make that work as intended).

import os

from setuptools.command.install import install  # distutils' install works the same way


class myinstall(install):
    # .service templates shipped with the source; the actual entries are
    # project-specific and omitted here
    _servicefiles = [
        ]

    def run(self):
        install.run(self)  # do the regular install first

        if not self.dry_run:
            bindir = self.install_scripts
            if self.root and bindir.startswith(self.root):
                bindir = bindir[len(self.root):]

            systemddir = os.path.join(self.root or "/", "lib/systemd/system")

            for servicefile in self._servicefiles:
                service = os.path.split(servicefile)[1]
                self.announce("Creating %s" % os.path.join(systemddir, service),
                              level=2)  # log at INFO level
                with open(servicefile) as servicefd:
                    servicedata = servicefd.read()

                with open(os.path.join(systemddir, service), "w") as servicefd:
                    servicefd.write(servicedata.replace("%BINDIR%", bindir))
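
To use this, the command class just needs to be hooked into setup() via cmdclass (standard setuptools usage; the package metadata below is of course only a placeholder):

from setuptools import setup

setup(
    name="myservice",                   # placeholder metadata
    version="1.0",
    scripts=["myservice"],              # the script that %BINDIR% will point at
    cmdclass={"install": myinstall},    # use the install class defined above
)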

Comments, suggestions and improvements, of course, welcome!

26 October, 2016 11:16AM

hackergotchi for Steinar H. Gunderson

Steinar H. Gunderson

Why does software development take so long?

Nageru 1.4.0 is out (and on its way through the Debian upload process right now), so now you can do live video mixing with multichannel audio to your heart's content. I've already blogged about most of the interesting new features, so instead, I'm trying to answer a question: What took so long?

To be clear, I'm not saying 1.4.0 took more time than I really anticipated (on the contrary, I pretty much understood the scope from the beginning, and there was a reason why I didn't go for building this stuff into 1.0.0); but if you just look at the changelog from the outside, it's not immediately obvious why “multichannel audio support” should take the better part of three months of development. What I'm going to say is of course going to be obvious to most software developers, but not everyone is one, and perhaps my experiences will be illuminating.

Let's first look at some obvious things that aren't the case: First of all, development is not primarily limited by typing speed. There are about 9,000 lines of new code in 1.4.0 (depending a bit on how you count), and if it were just about typing them in, I would be done in a day or two. On a good keyboard, I can type plain text at more than 800 characters per minute—but you hardly ever write code for even a single minute at that speed. Just as when writing a novel, most time is spent thinking, not typing.

I also didn't spend a lot of time backtracking; most code I wrote actually ended up in the finished product as opposed to being thrown away. (I'm not as lucky in all of my projects.) It's pretty common to do so if you're in an exploratory phase, but in this case, I had a pretty good idea of what I wanted to do right from the start, and that plan seemed to work. This wasn't a difficult project per se; it just needed to be done (which, in a sense, just increases the mystery).

However, even if this isn't at the forefront of science in any way (most code in the world is pretty pedestrian, after all), there's still a lot of decisions to make, on several levels of abstraction. And a lot of those decisions depend on information gathering beforehand. Let's take a look at an example from late in the development cycle, namely support for using MIDI controllers instead of the mouse to control the various widgets.

I've kept a pretty meticulous TODO list; it's just a text file on my laptop, but it serves the purpose of a ghetto bugtracker. For 1.4.0, it contains 83 work items (a single-digit number is not ticked off, mostly because I decided not to do those things), which corresponds roughly 1:2 to the number of commits. So let's have a look at what the ~20 MIDI controller items went into.

First of all, to allow MIDI controllers to influence the UI, we need a way of getting to it. Since Nageru is single-platform on Linux, ALSA is the obvious choice (if not, I'd probably have to look for a library to put in-between), but seemingly, ALSA has two interfaces (raw MIDI and sequencer). Which one do you want? It sounds like raw MIDI is what we want, but actually, it's the sequencer interface (it does more of the MIDI parsing for you, and generally is friendlier).

The first question is where to start picking events from. I went the simplest path and just said I wanted all events—anything else would necessitate a UI, a command-line flag, figuring out if we wanted to distinguish between different devices with the same name (and not all devices potentially even have names), and so on. But how do you enumerate devices? (Relatively simple, thankfully.) What do you do if the user inserts a new one while Nageru is running? (Turns out there's a special device you can subscribe to that will tell you about new devices.) What if you get an error on subscription? (Just print a warning and ignore it; it's legitimate not to have access to all devices on the system. By the way, for PCM devices, all of these answers are different.)

So now we have a sequencer device, how do we get events from it? Can we do it in the main loop? Turns out it probably doesn't integrate too well with Qt, but it's easy enough to put it in a thread. The class dealing with the MIDI handling now needs locking; what mutex granularity do we want? (Experience will tell you that you nearly always just want one mutex. Two mutexes give you all sorts of headaches with ordering them, and nearly never gives any gain.) ALSA expects us to poll() a given set of descriptors for data, but on shutdown, how do you break out of that poll to tell the thread to go away? (The simplest way on Linux is using an eventfd.)
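
That eventfd trick is worth seeing once. Nageru itself is C++, but the pattern is easy to sketch in Python (a minimal illustration, assuming Linux and Python 3.10+ for os.eventfd()): the worker thread polls its data descriptors together with an eventfd, and whoever wants the thread gone writes to the eventfd to wake the poll.

import os
import select
import threading

shutdown_fd = os.eventfd(0)

def worker(data_fd):
    poller = select.poll()
    poller.register(data_fd, select.POLLIN)
    poller.register(shutdown_fd, select.POLLIN)
    while True:
        for fd, _event in poller.poll():
            if fd == shutdown_fd:
                return                     # shutdown requested, leave the loop
            os.read(fd, 4096)              # handle incoming data here

r, w = os.pipe()                           # stand-in for the real (e.g. ALSA) descriptors
thread = threading.Thread(target=worker, args=(r,))
thread.start()
os.eventfd_write(shutdown_fd, 1)           # wakes the poll(), the thread exits
thread.join()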

There's a quirk where if you get two or more MIDI messages right after each other and only read one, poll() won't trigger to alert you there are more left. Did you know that? (I didn't. I also can't find it documented. Perhaps it's a bug?) It took me some looking into sample code to find it. Oh, and ALSA uses POSIX error codes to signal errors (like “nothing more is available”), but it doesn't use errno.

OK, so you have events (like “controller 3 was set to value 47”); what do you do about them? The meaning of the controller numbers is different from device to device, and there's no open format for describing them. So I had to make a format describing the mapping; I used protobuf (I have lots of experience with it) to make a simple text-based format, but it's obviously a nightmare to set up 50+ controllers by hand in a text file, so I had to make a UI for this. My initial thought was making a grid of spinners (similar to how the input mapping dialog already worked), but then I realized that there isn't an easy way to make headlines in Qt's grid. (You can substitute a label widget for a single cell, but not for an entire row. Who knew?) So after some searching, I found out that it would be better to have a tree view (Qt Creator does this), and then you can treat that more-or-less as a table for the rows that should be editable.

Of course, guessing controller numbers is impossible even in an editor, so I wanted it to respond to MIDI events. This means the editor needs to take over the role as MIDI receiver from the main UI. How do you do that in a thread-safe way? (Reuse the existing mutex; you don't generally want to use atomics for complicated things.) Thinking about it, shouldn't the MIDI mapper just support multiple receivers at a time? (Doubtful; you don't want your random controller fiddling during setup to actually influence the audio on a running stream. And would you use the old or the new mapping?)

And do you really need to set up every single controller for each bus, given that the mapping is pretty much guaranteed to be similar for them? Making a “guess bus” button doesn't seem too difficult, where if you have one correctly set up controller on the bus, it can guess from a neighboring bus (assuming a static offset). But what if there's conflicting information? OK; then you should disable the button. So now the enable/disable status of that button depends on which cell in your grid has the focus; how do you get at those events? (Install an event filter, or subclass the spinner.) And so on, and so on, and so on.

You could argue that most of these questions go away with experience; if you're an expert in a given API, you can answer most of these questions in a minute or two even if you haven't heard the exact question before. But you can't expect even experienced developers to be an expert in all possible libraries; if you know everything there is to know about Qt, ALSA, x264, ffmpeg, OpenGL, VA-API, libusb, microhttpd and Lua (in addition to C++11, of course), I'm sure you'd be a great fit for Nageru, but I'd wager that pretty few developers fit that bill. I've written C++ for almost 20 years now (almost ten of them professionally), and that experience certainly helps boost productivity, but I can't say I expect a 10x reduction in my own development time at any point.

You could also argue, of course, that spending so much time on the editor is wasted, since most users will only ever see it once. But here's the point: it's not actually a lot of time. The only reason why it seems like so much is that I bothered to write two paragraphs about it; it's not a particular pain point, it just adds to the total. Also, the first impression matters a lot—if the user can't get the editor to work, they also can't get the MIDI controller to work, and are likely to just go do something else.

A common misconception is that just switching languages or using libraries will help you a lot. (Witness the never-ending stream of software that advertises “written in Foo” or “uses Bar” as if it were a feature.) For the former, note that nothing I've said so far is specific to my choice of language (C++), and I've certainly avoided a bunch of battles by making that specific choice over, say, Python. For the latter, note that most of these problems are actually related to library use—libraries are great, and they solve a bunch of problems I'm really glad I didn't have to worry about (how should each button look?), but they still give their own interaction problems. And even when you're a master of your chosen programming environment, things still take time, because you have all those decisions to make on top of your libraries.

Of course, there are cases where libraries really solve your entire problem and your code gets reduced to 100 trivial lines, but that's really only when you're solving a problem that's been solved a million times before. Congrats on making that blog in Rails; I'm sure you're advancing the world. (To make things worse, usually this breaks down when you want to stray ever so slightly from what was intended by the library or framework author. What seems like a perfect match can suddenly become a development trap where you spend more of your time trying to become an expert in working around the given library than actually doing any development.)

The entire thing reminds me of the famous essay No Silver Bullet by Fred Brooks, but perhaps even more so, this quote from John Carmack's .plan has stuck with me (incidentally about mobile game development in 2006, but the basic story still rings true):

To some degree this is already the case on high end BREW phones today. I have a pretty clear idea what a maxed out software renderer would look like for that class of phones, and it wouldn't be the PlayStation-esq 3D graphics that seems to be the standard direction. When I was doing the graphics engine upgrades for BREW, I started along those lines, but after putting in a couple days at it I realized that I just couldn't afford to spend the time to finish the work. "A clear vision" doesn't mean I can necessarily implement it in a very small integral number of days.

In a sense, programming is all about what your program should do in the first place. The “how” question is just the “what”, moved down the chain of abstractions until it ends up where a computer can understand it, and at that point, the three words “multichannel audio support” have become those 9,000 lines that describe in perfect detail what's going on.

26 October, 2016 07:30AM

hackergotchi for Daniel Pocock

Daniel Pocock

FOSDEM 2017 Real-Time Communications Call for Participation

FOSDEM is one of the world's premier meetings of free software developers, with over five thousand people attending each year. FOSDEM 2017 takes place 4-5 February 2017 in Brussels, Belgium.

This email contains information about:

  • Real-Time communications dev-room and lounge,
  • speaking opportunities,
  • volunteering in the dev-room and lounge,
  • related events around FOSDEM, including the XMPP summit,
  • social events (the legendary FOSDEM Beer Night and Saturday night dinners provide endless networking opportunities),
  • the Planet aggregation sites for RTC blogs

Call for participation - Real Time Communications (RTC)

The Real-Time dev-room and Real-Time lounge are about all things involving real-time communication, including: XMPP, SIP, WebRTC, telephony, mobile VoIP, codecs, peer-to-peer, privacy and encryption. The dev-room is a successor to the previous XMPP and telephony dev-rooms. We are looking for speakers for the dev-room and volunteers and participants for the tables in the Real-Time lounge.

The dev-room is only on Saturday, 4 February 2017. The lounge will be present for both days.

To discuss the dev-room and lounge, please join the FSFE-sponsored Free RTC mailing list.

To be kept aware of major developments in Free RTC, without being on the discussion list, please join the Free-RTC Announce list.

Speaking opportunities

Note: if you used FOSDEM Pentabarf before, please use the same account/username

Real-Time Communications dev-room: deadline 23:59 UTC on 17 November. Please use the Pentabarf system to submit a talk proposal for the dev-room. On the "General" tab, please look for the "Track" option and choose "Real-Time devroom". Link to talk submission.

Other dev-rooms and lightning talks: some speakers may find their topic is in the scope of more than one dev-room. You are encouraged to apply to more than one dev-room and also to consider proposing a lightning talk, but please be kind enough to tell us if you do this by filling out the notes in the form.

You can find the full list of dev-rooms on this page and apply for a lightning talk at https://fosdem.org/submit

Main track: the deadline for main track presentations is 23:59 UTC 31 October. Leading developers in the Real-Time Communications field are encouraged to consider submitting a presentation to the main track.

First-time speaking?

FOSDEM dev-rooms are a welcoming environment for people who have never given a talk before. Please feel free to contact the dev-room administrators personally if you would like to ask any questions about it.

Submission guidelines

The Pentabarf system will ask for many of the essential details. Please remember to re-use your account from previous years if you have one.

In the "Submission notes", please tell us about:

  • the purpose of your talk
  • any other talk applications (dev-rooms, lightning talks, main track)
  • availability constraints and special needs

You can use HTML and links in your bio, abstract and description.

If you maintain a blog, please consider providing us with the URL of a feed with posts tagged for your RTC-related work.

We will be looking for relevance to the conference and dev-room themes: presentations aimed at developers of free and open source software, on RTC-related topics.

Please feel free to suggest a duration between 20 minutes and 55 minutes but note that the final decision on talk durations will be made by the dev-room administrators. As the two previous dev-rooms have been combined into one, we may decide to give shorter slots than in previous years so that more speakers can participate.

Please note FOSDEM aims to record and live-stream all talks. The CC-BY license is used.

Volunteers needed

To make the dev-room and lounge run successfully, we are looking for volunteers:

  • FOSDEM provides video recording equipment and live streaming; volunteers are needed to assist with this
  • organizing one or more restaurant bookings (depending upon the number of participants) for the evening of Saturday, 4 February
  • participation in the Real-Time lounge
  • helping attract sponsorship funds for the dev-room to pay for the Saturday night dinner and any other expenses
  • circulating this Call for Participation (text version) to other mailing lists

See the mailing list discussion for more details about volunteering.

Related events - XMPP and RTC summits

The XMPP Standards Foundation (XSF) has traditionally held a summit in the days before FOSDEM. There is discussion about a similar summit taking place on 2 and 3 February 2017. XMPP Summit web site - please join the mailing list for details.

We are also considering a more general RTC or telephony summit, potentially in collaboration with the XMPP summit. Please join the Free-RTC mailing list and send an email if you would be interested in participating, sponsoring or hosting such an event.

Social events and dinners

The traditional FOSDEM beer night occurs on Friday, 3 February.

On Saturday night, there are usually dinners associated with each of the dev-rooms. Most restaurants in Brussels are not very large, so these dinners have space constraints and reservations are essential. Please subscribe to the Free-RTC mailing list for further details about the Saturday night dinner options and how you can register for a seat.

Spread the word and discuss

If you know of any mailing lists where this CfP would be relevant, please forward this email (text version). If this dev-room excites you, please blog or microblog about it, especially if you are submitting a talk.

If you regularly blog about RTC topics, please send details about your blog to the planet site administrators:

  • All projects: Free-RTC Planet (http://planet.freertc.org), admin contact planet@freertc.org
  • XMPP: Planet Jabber (http://planet.jabber.org), admin contact ralphm@ik.nu
  • SIP: Planet SIP (http://planet.sip5060.net), admin contact planet@sip5060.net
  • SIP (Español): Planet SIP-es (http://planet.sip5060.net/es/), admin contact planet@sip5060.net

Please also link to the Planet sites from your own blog or web site as this helps everybody in the free real-time communications community.


For any private queries, contact us directly using the address fosdem-rtc-admin@freertc.org and for any other queries please ask on the Free-RTC mailing list.

The dev-room administration team:

26 October, 2016 06:39AM by Daniel.Pocock

hackergotchi for Joachim Breitner

Joachim Breitner

Showcasing Applicative

My plan for this week’s lecture of the CIS 194 Haskell course at the University of Pennsylvania is to dwell a bit on the concept of Functor, Applicative and Monad, and to highlight the value of the Applicative abstraction.

I quite like the example that I came up with, so I want to share it here. In the interest of long-term archival and stand-alone presentation, I include all the material in this post.1


In case you want to follow along, start with these imports:

import Data.Char
import Data.Maybe
import Data.List

import System.Environment
import System.IO
import System.Exit

The parser

The starting point for this exercise is a fairly standard parser-combinator monad, which happens to be the result of the students’ homework from last week:

newtype Parser a = P (String -> Maybe (a, String))

runParser :: Parser t -> String -> Maybe (t, String)
runParser (P p) = p

parse :: Parser a -> String -> Maybe a
parse p input = case runParser p input of
    Just (result, "") -> Just result
    _ -> Nothing -- handles both no result and leftover input

noParserP :: Parser a
noParserP = P (\_ -> Nothing)

pureParserP :: a -> Parser a
pureParserP x = P (\input -> Just (x,input))

instance Functor Parser where
    fmap f p = P $ \input -> do
	(x, rest) <- runParser p input
	return (f x, rest)

instance Applicative Parser where
    pure = pureParserP
    p1 <*> p2 = P $ \input -> do
        (f, rest1) <- runParser p1 input
        (x, rest2) <- runParser p2 rest1
        return (f x, rest2)

instance Monad Parser where
    return = pure
    p1 >>= k = P $ \input -> do
        (x, rest1) <- runParser p1 input
        runParser (k x) rest1

anyCharP :: Parser Char
anyCharP = P $ \input -> case input of
    (c:rest) -> Just (c, rest)
    []       -> Nothing

charP :: Char -> Parser ()
charP c = do
    c' <- anyCharP
    if c == c' then return ()
               else noParserP

anyCharButP :: Char -> Parser Char
anyCharButP c = do
    c' <- anyCharP
    if c /= c' then return c'
               else noParserP

letterOrDigitP :: Parser Char
letterOrDigitP = do
    c <- anyCharP
    if isAlphaNum c then return c else noParserP

orElseP :: Parser a -> Parser a -> Parser a
orElseP p1 p2 = P $ \input -> case runParser p1 input of
    Just r -> Just r
    Nothing -> runParser p2 input

manyP :: Parser a -> Parser [a]
manyP p = (pure (:) <*> p <*> manyP p) `orElseP` pure []

many1P :: Parser a -> Parser [a]
many1P p = pure (:) <*> p <*> manyP p

sepByP :: Parser a -> Parser () -> Parser [a]
sepByP p1 p2 = (pure (:) <*> p1 <*> (manyP (p2 *> p1))) `orElseP` pure []

A parser using this library for, for example, CSV files could take this form:

parseCSVP :: Parser [[String]]
parseCSVP = manyP parseLine
  where
    parseLine = parseCell `sepByP` charP ',' <* charP '\n'
    parseCell = do
        charP '"'
        content <- manyP (anyCharButP '"')
        charP '"'
        return content

We want EBNF

Often when we write a parser for a file format, we might also want to have a formal specification of the format. A common form for such a specification is EBNF. This might look as follows, for a CSV file:

cell = '"', {not-quote}, '"';
line = (cell, {',', cell} | ''), newline;
csv  = {line};

It is straightforward to create a Haskell data type to represent an EBNF syntax description. Here is a simple EBNF library (data type and pretty-printer) for your convenience:

data RHS
  = Terminal String
  | NonTerminal String
  | Choice RHS RHS
  | Sequence RHS RHS
  | Optional RHS
  | Repetition RHS
  deriving (Show, Eq)

ppRHS :: RHS -> String
ppRHS = go 0
  where
    go _ (Terminal s)     = surround "'" "'" $ concatMap quote s
    go _ (NonTerminal s)  = s
    go a (Choice x1 x2)   = p a 1 $ go 1 x1 ++ " | " ++ go 1 x2
    go a (Sequence x1 x2) = p a 2 $ go 2 x1 ++ ", "  ++ go 2 x2
    go _ (Optional x)     = surround "[" "]" $ go 0 x
    go _ (Repetition x)   = surround "{" "}" $ go 0 x

    surround c1 c2 x = c1 ++ x ++ c2

    p a n | a > n     = surround "(" ")"
          | otherwise = id

    quote '\'' = "\\'"
    quote '\\' = "\\\\"
    quote c    = [c]

type Production = (String, RHS)
type BNF = [Production]

ppBNF :: BNF -> String
ppBNF = unlines . map (\(i,rhs) -> i ++ " = " ++ ppRHS rhs ++ ";")

Code to produce EBNF

We had a good time writing combinators that create complex parsers from primitive pieces. Let us do the same for EBNF grammars. We could simply work on the RHS type directly, but we can do something more nifty: we create a data type that keeps track, via a phantom type parameter, of which Haskell type the given EBNF syntax describes:

newtype Grammar a = G RHS

ppGrammar :: Grammar a -> String
ppGrammar (G rhs) = ppRHS rhs

So a value of type Grammar t is a description of the textual representation of the Haskell type t.

Here is one simple example:

anyCharG :: Grammar Char
anyCharG = G (NonTerminal "char")

Here is another one. This one does not describe any interesting Haskell type, but is useful when spelling out the special characters in the syntax described by the grammar:

charG :: Char -> Grammar ()
charG c = G (Terminal [c])

A combinator that creates new grammar from two existing grammars:

orElseG :: Grammar a -> Grammar a -> Grammar a
orElseG (G rhs1) (G rhs2) = G (Choice rhs1 rhs2)

We want the convenience of our well-known type classes in order to combine these values some more:

instance Functor Grammar where
    fmap _ (G rhs) = G rhs

instance Applicative Grammar where
    pure x = G (Terminal "")
    (G rhs1) <*> (G rhs2) = G (Sequence rhs1 rhs2)

Note how the Functor instance does not actually use the function. How could it? There are no values inside a Grammar!

We cannot define a Monad instance for Grammar: We would start with (G rhs1) >>= k = …, but there is simply no way of getting a value of type a that we can feed to k. So we will do without a Monad instance. This is interesting, and we will come back to that later.

Like with the parser, we can now begin to build on the primitive example to build more complicated combinators:

manyG :: Grammar a -> Grammar [a]
manyG p = (pure (:) <*> p <*> manyG p) `orElseG` pure []

many1G :: Grammar a -> Grammar [a]
many1G p = pure (:) <*> p <*> manyG p

sepByG :: Grammar a -> Grammar () -> Grammar [a]
sepByG p1 p2 = ((:) <$> p1 <*> (manyG (p2 *> p1))) `orElseG` pure []

Let us run a small example:

dottedWordsG :: Grammar [String]
dottedWordsG = many1G (manyG anyCharG <* charG '.')

*Main> putStrLn $ ppGrammar dottedWordsG
'', ('', char, ('', char, ('', char, ('', char, ('', char, ('', …

Oh my, that is not good. Looks like the recursion in manyG does not work well, so we need to avoid that. But anyway, we want to be explicit in the EBNF grammars about where something can be repeated, so let us just make many a primitive:

manyG :: Grammar a -> Grammar [a]
manyG (G rhs) = G (Repetition rhs)

With this definition, we already get a simple grammar for dottedWordsG:

*Main> putStrLn $ ppGrammar dottedWordsG
'', {char}, '.', {{char}, '.'}

This already looks like a proper EBNF grammar. One thing that is not nice about it is that there is an empty string ('') in a sequence (…,…). We do not want that.

Why is it there in the first place? Because our Applicative instance is not lawful! Remember that pure id <*> g == g should hold. One way to achieve that is to improve the Applicative instance to optimize this case away:

instance Applicative Grammar where
    pure x = G (Terminal "")
    G (Terminal "") <*> G rhs2 = G rhs2
    G rhs1 <*> G (Terminal "") = G rhs1
    (G rhs1) <*> (G rhs2) = G (Sequence rhs1 rhs2)

Now we get what we want:

*Main> putStrLn $ ppGrammar dottedWordsG
{char}, '.', {{char}, '.'}

Remember our parser for CSV files above? Let me repeat it here, this time using only Applicative combinators, i.e. avoiding (>>=), (>>), return and do-notation:

parseCSVP :: Parser [[String]]
parseCSVP = manyP parseLine
  where
    parseLine = parseCell `sepByP` charP ',' <* charP '\n'
    parseCell = charP '"' *> manyP (anyCharButP '"') <* charP '"'

And now we try to rewrite the code to produce Grammar instead of Parser. This is straightforward with the exception of anyCharButP. The parser code for that is inherently monadic, and we just do not have a Monad instance. So we work around the issue by making that a “primitive” grammar, i.e. introducing a non-terminal in the EBNF without a production rule – pretty much like we did for anyCharG:

primitiveG :: String -> Grammar a
primitiveG s = G (NonTerminal s)

parseCSVG :: Grammar [[String]]
parseCSVG = manyG parseLine
  where
    parseLine = parseCell `sepByG` charG ',' <* charG '\n'
    parseCell = charG '"' *> manyG (primitiveG "not-quote") <* charG '"'

Of course the names parse… are not quite right any more, but let us just leave that for now.

Here is the result:

*Main> putStrLn $ ppGrammar parseCSVG
{('"', {not-quote}, '"', {',', '"', {not-quote}, '"'} | ''), '

The line break is weird. We do not really want newlines in the grammar. So let us make that primitive as well, and replace charG '\n' with newlineG:

newlineG :: Grammar ()
newlineG = primitiveG "newline"

Now we get

*Main> putStrLn $ ppGrammar parseCSVG
{('"', {not-quote}, '"', {',', '"', {not-quote}, '"'} | ''), newline}

which is nice and correct, but still not quite the easily readable EBNF that we saw further up.

Code to produce EBNF, with productions

We currently let our grammars produce only the right-hand side of one EBNF production, but really, we want to produce an RHS that may refer to other productions. So let us change the type accordingly:

newtype Grammar a = G (BNF, RHS)

runGrammer :: String -> Grammar a -> BNF
runGrammer main (G (prods, rhs)) = prods ++ [(main, rhs)]

ppGrammar :: String -> Grammar a -> String
ppGrammar main g = ppBNF $ runGrammer main g

Now we have to adjust all our primitive combinators (but not the derived ones!):

charG :: Char -> Grammar ()
charG c = G ([], Terminal [c])

anyCharG :: Grammar Char
anyCharG = G ([], NonTerminal "char")

manyG :: Grammar a -> Grammar [a]
manyG (G (prods, rhs)) = G (prods, Repetition rhs)

mergeProds :: [Production] -> [Production] -> [Production]
mergeProds prods1 prods2 = nub $ prods1 ++ prods2

orElseG :: Grammar a -> Grammar a -> Grammar a
orElseG (G (prods1, rhs1)) (G (prods2, rhs2))
    = G (mergeProds prods1 prods2, Choice rhs1 rhs2)

instance Functor Grammar where
    fmap _ (G bnf) = G bnf

instance Applicative Grammar where
    pure x = G ([], Terminal "")
    G (prods1, Terminal "") <*> G (prods2, rhs2)
        = G (mergeProds prods1 prods2, rhs2)
    G (prods1, rhs1) <*> G (prods2, Terminal "")
        = G (mergeProds prods1 prods2, rhs1)
    G (prods1, rhs1) <*> G (prods2, rhs2)
        = G (mergeProds prods1 prods2, Sequence rhs1 rhs2)

primitiveG :: String -> Grammar a
primitiveG s = G ([], NonTerminal s)

The use of nub when combining productions removes duplicates that might be used in different parts of the grammar. Not efficient, but good enough for now.

Did we gain anything? Not yet:

*Main> putStr $ ppGrammar "csv" (parseCSVG)
csv = {('"', {not-quote}, '"', {',', '"', {not-quote}, '"'} | ''), newline};

But we can now introduce a function that lets us give a name to a piece of the grammar:

nonTerminalG :: String -> Grammar a -> Grammar a
nonTerminalG name (G (prods, rhs))
  = G (prods ++ [(name, rhs)], NonTerminal name)

Ample use of this in parseCSVG yields the desired result:

parseCSVG :: Grammar [[String]]
parseCSVG = manyG parseLine
  where
    parseLine = nonTerminalG "line" $
        parseCell `sepByG` charG ',' <* newlineG
    parseCell = nonTerminalG "cell" $
        charG '"' *> manyG (primitiveG "not-quote") <* charG '"'

*Main> putStr $ ppGrammar "csv" (parseCSVG)
cell = '"', {not-quote}, '"';
line = (cell, {',', cell} | ''), newline;
csv = {line};

This is great!

Unifying parsing and grammar-generating

Note how similar parseCSVG and parseCSVP are! Would it not be great if we could implement that functionality only once, and get both a parser and a grammar description out of it? This way, the two would never be out of sync!

And surely this must be possible. The tool to reach for is of course to define a type class that abstracts over the parts where Parser and Grammar differ. So we have to identify all functions that are primitive in one of the two worlds, and turn them into type class methods. This includes char and orElse. It includes many, too: Although manyP is not primitive, manyG is. It also includes nonTerminal, which does not exist in the world of parsers (yet), but we need it for the grammars.

The primitiveG function is tricky. We use it in grammars when the code that we might use while parsing is not expressible as a grammar. So the solution is to let it take two arguments: A String, when used as a descriptive non-terminal in a grammar, and a Parser a, used in the parsing code.

Finally, the type classes that we expect, Applicative (and thus Functor), are added as constraints on our type class:

class Applicative f => Descr f where
    char :: Char -> f ()
    many :: f a -> f [a]
    orElse :: f a -> f a -> f a
    primitive :: String -> Parser a -> f a
    nonTerminal :: String -> f a -> f a

The instances are easily written:

instance Descr Parser where
    char = charP
    many = manyP
    orElse = orElseP
    primitive _ p = p
    nonTerminal _ p = p

instance Descr Grammar where
    char = charG
    many = manyG
    orElse = orElseG
    primitive s _ = primitiveG s
    nonTerminal s g = nonTerminalG s g

And we can now take the derived definitions, of which so far we had two copies, and define them once and for all:

many1 :: Descr f => f a -> f [a]
many1 p = pure (:) <*> p <*> many p

anyChar :: Descr f => f Char
anyChar = primitive "char" anyCharP

dottedWords :: Descr f => f [String]
dottedWords = many1 (many anyChar <* char '.')

sepBy :: Descr f => f a -> f () -> f [a]
sepBy p1 p2 = ((:) <$> p1 <*> (many (p2 *> p1))) `orElse` pure []

newline :: Descr f => f ()
newline = primitive "newline" (charP '\n')

And thus we now have our CSV parser/grammar generator:

parseCSV :: Descr f => f [[String]]
parseCSV = many parseLine
  where
    parseLine = nonTerminal "line" $
        parseCell `sepBy` char ',' <* newline
    parseCell = nonTerminal "cell" $
        char '"' *> many (primitive "not-quote" (anyCharButP '"')) <* char '"'

We can now use this definition both to parse and to generate grammars:

*Main> putStr $ ppGrammar2 "csv" (parseCSV)
cell = '"', {not-quote}, '"';
line = (cell, {',', cell} | ''), newline;
csv = {line};
*Main> parse parseCSV "\"ab\",\"cd\"\n\"\",\"de\"\n\n"
Just [["ab","cd"],["","de"],[]]

The INI file parser and grammar

As a final exercise, let us transform the INI file parser into a combined thing. Here is the parser (another artifact of last week’s homework) again using applicative style2:

parseINIP :: Parser INIFile
parseINIP = many1P parseSection
  where
    parseSection =
        (,) <$  charP '['
            <*> parseIdent
            <*  charP ']'
            <*  charP '\n'
            <*> (catMaybes <$> manyP parseLine)
    parseIdent = many1P letterOrDigitP
    parseLine = parseDecl `orElseP` parseComment `orElseP` parseEmpty

    parseDecl = Just <$> (
        (,) <$> parseIdent
            <*  manyP (charP ' ')
            <*  charP '='
            <*  manyP (charP ' ')
            <*> many1P (anyCharButP '\n')
            <*  charP '\n')

    parseComment =
        Nothing <$ charP '#'
                <* many1P (anyCharButP '\n')
                <* charP '\n'

    parseEmpty = Nothing <$ charP '\n'

Transforming that to a generic description is quite straight-forward. We use primitive again to wrap letterOrDigitP:

descrINI :: Descr f => f INIFile
descrINI = many1 parseSection
  where
    parseSection =
        (,) <$  char '['
            <*> parseIdent
            <*  char ']'
            <*  newline
            <*> (catMaybes <$> many parseLine)
    parseIdent = many1 (primitive "alphanum" letterOrDigitP)
    parseLine = parseDecl `orElse` parseComment `orElse` parseEmpty

    parseDecl = Just <$> (
        (,) <$> parseIdent
            <*  many (char ' ')
            <*  char '='
            <*  many (char ' ')
            <*> many1 (primitive "non-newline" (anyCharButP '\n'))
            <*  newline)

    parseComment =
        Nothing <$ char '#'
                <* many1 (primitive "non-newline" (anyCharButP '\n'))
                <* newline

    parseEmpty = Nothing <$ newline

This yields this not very helpful grammar (abbreviated here):

*Main> putStr $ ppGrammar2 "ini" descrINI
ini = '[', alphanum, {alphanum}, ']', newline, {alphanum, {alphanum}, {' '}…

But with a few uses of nonTerminal, we get something really nice:

descrINI :: Descr f => f INIFile
descrINI = many1 parseSection
  where
    parseSection = nonTerminal "section" $
        (,) <$  char '['
            <*> parseIdent
            <*  char ']'
            <*  newline
            <*> (catMaybes <$> many parseLine)
    parseIdent = nonTerminal "identifier" $
        many1 (primitive "alphanum" letterOrDigitP)
    parseLine = nonTerminal "line" $
        parseDecl `orElse` parseComment `orElse` parseEmpty

    parseDecl = nonTerminal "declaration" $ Just <$> (
        (,) <$> parseIdent
            <*  spaces
            <*  char '='
            <*  spaces
            <*> remainder)

    parseComment = nonTerminal "comment" $
        Nothing <$ char '#' <* remainder

    remainder = nonTerminal "line-remainder" $
        many1 (primitive "non-newline" (anyCharButP '\n')) <* newline

    parseEmpty = Nothing <$ newline

    spaces = nonTerminal "spaces" $ many (char ' ')

*Main> putStr $ ppGrammar "ini" descrINI
identifier = alphanum, {alphanum};
spaces = {' '};
line-remainder = non-newline, {non-newline}, newline;
declaration = identifier, spaces, '=', spaces, line-remainder;
comment = '#', line-remainder;
line = declaration | comment | newline;
section = '[', identifier, ']', newline, {line};
ini = section, {section};

Recursion (variant 1)

What if we want to write a parser/grammar-generator that is able to generate the following grammar, which describes terms that are additions and multiplications of natural numbers:

const = digit, {digit};
spaces = {' ' | newline};
atom = const | '(', spaces, expr, spaces, ')', spaces;
mult = atom, {spaces, '*', spaces, atom}, spaces;
plus = mult, {spaces, '+', spaces, mult}, spaces;
expr = plus;

The production of expr is recursive (via plus, mult, atom). We have seen above that simply defining a Grammar a recursively does not go well.

One solution is to add a new combinator for explicit recursion, which replaces nonTerminal in the type class:

class Applicative f => Descr f where
    recNonTerminal :: String -> (f a -> f a) -> f a

instance Descr Parser where
    recNonTerminal _ p = let r = p r in r

instance Descr Grammar where
    recNonTerminal = recNonTerminalG

recNonTerminalG :: String -> (Grammar a -> Grammar a) -> Grammar a
recNonTerminalG name f =
    let G (prods, rhs) = f (G ([], NonTerminal name))
    in G (prods ++ [(name, rhs)], NonTerminal name)

nonTerminal :: Descr f => String -> f a -> f a
nonTerminal name p = recNonTerminal name (const p)

runGrammer :: String -> Grammar a -> BNF
runGrammer main (G (prods, NonTerminal nt)) | main == nt = prods
runGrammer main (G (prods, rhs)) = prods ++ [(main, rhs)]

The change in runGrammer avoids adding a pointless expr = expr production to the output.

This lets us define a parser/grammar-generator for the arithmetic expressions given above:

data Expr = Plus Expr Expr | Mult Expr Expr | Const Integer
    deriving Show

mkPlus :: Expr -> [Expr] -> Expr
mkPlus = foldl Plus

mkMult :: Expr -> [Expr] -> Expr
mkMult = foldl Mult

parseExpr :: Descr f => f Expr
parseExpr = recNonTerminal "expr" $ \ exp ->
    ePlus exp

ePlus :: Descr f => f Expr -> f Expr
ePlus exp = nonTerminal "plus" $
    mkPlus <$> eMult exp
           <*> many (spaces *> char '+' *> spaces *> eMult exp)
           <*  spaces

eMult :: Descr f => f Expr -> f Expr
eMult exp = nonTerminal "mult" $
    mkMult <$> eAtom exp
           <*> many (spaces *> char '*' *> spaces *> eAtom exp)
           <*  spaces

eAtom :: Descr f => f Expr -> f Expr
eAtom exp = nonTerminal "atom" $
    aConst `orElse` eParens exp

aConst :: Descr f => f Expr
aConst = nonTerminal "const" $ Const . read <$> many1 digit

eParens :: Descr f => f a -> f a
eParens inner =
    id <$  char '('
       <*  spaces
       <*> inner
       <*  spaces
       <*  char ')'
       <*  spaces

And indeed, this works:

*Main> putStr $ ppGrammar "expr" parseExpr
const = digit, {digit};
spaces = {' ' | newline};
atom = const | '(', spaces, expr, spaces, ')', spaces;
mult = atom, {spaces, '*', spaces, atom}, spaces;
plus = mult, {spaces, '+', spaces, mult}, spaces;
expr = plus;

Recursion (variant 2)

Interestingly, there is another solution to this problem, which avoids introducing recNonTerminal and explicitly passing around the recursive call (i.e. the exp in the example). To implement that we have to adjust our Grammar type as follows:

newtype Grammar a = G ([String] -> (BNF, RHS))

The idea is that the list of strings contains those non-terminals that we are currently defining. So in nonTerminal, we check if the non-terminal to be introduced is currently in the process of being defined, and then simply ignore the body. This way, the recursion is stopped automatically:

nonTerminalG :: String -> (Grammar a) -> Grammar a
nonTerminalG name (G g) = G $ \seen ->
    if name `elem` seen
    then ([], NonTerminal name)
    else let (prods, rhs) = g (name : seen)
         in (prods ++ [(name, rhs)], NonTerminal name)

After adjusting the other primitives of Grammar (including the Functor and Applicative instances, which now again have nonTerminal) to type-check again, we observe that this parser/grammar generator for expressions, with genuine recursion, works now:

parseExp :: Descr f => f Expr
parseExp = nonTerminal "expr" $

ePlus :: Descr f => f Expr
ePlus = nonTerminal "plus" $
    mkPlus <$> eMult
           <*> many (spaces *> char '+' *> spaces *> eMult)
           <*  spaces

eMult :: Descr f => f Expr
eMult = nonTerminal "mult" $
    mkMult <$> eAtom
           <*> many (spaces *> char '*' *> spaces *> eAtom)
           <*  spaces

eAtom :: Descr f => f Expr
eAtom = nonTerminal "atom" $
    aConst `orElse` eParens parseExp

Note that the recursion is only going to work if there is at least one call to nonTerminal somewhere around the recursive calls. We still cannot implement many as naively as above.


If you want to play more with this: The homework is to define a parser/grammar-generator for EBNF itself, as specified in this variant:

identifier = letter, {letter | digit | '-'};
spaces = {' ' | newline};
quoted-char = non-quote-or-backslash | '\\', '\\' | '\\', '\'';
terminal = '\'', {quoted-char}, '\'', spaces;
non-terminal = identifier, spaces;
option = '[', spaces, rhs, spaces, ']', spaces;
repetition = '{', spaces, rhs, spaces, '}', spaces;
group = '(', spaces, rhs, spaces, ')', spaces;
atom = terminal | non-terminal | option | repetition | group;
sequence = atom, {spaces, ',', spaces, atom}, spaces;
choice = sequence, {spaces, '|', spaces, sequence}, spaces;
rhs = choice;
production = identifier, spaces, '=', spaces, rhs, ';', spaces;
bnf = production, {production};

This grammar is set up so that the precedence of , and | is correctly implemented: a , b | c will parse as (a, b) | c.

In this syntax for BNF, terminal characters are quoted, i.e. inside '…', a ' is replaced by \' and a \ is replaced by \\ – this is done by the function quote in ppRHS.

If you do this, you should able to round-trip with the pretty-printer, i.e. parse back what it wrote:

*Main> let bnf1 = runGrammer "expr" parseExpr
*Main> let bnf2 = runGrammer "expr" parseBNF
*Main> let f = Data.Maybe.fromJust . parse parseBNF. ppBNF
*Main> f bnf1 == bnf1
*Main> f bnf2 == bnf2

The last line is quite meta: We are using parseBNF as a parser on the pretty-printed grammar produced from interpreting parseBNF as a grammar.


We have again seen an example of the excellent support for abstraction in Haskell: Being able to define so very different things such as a parser and a grammar description with the same code is great. Type classes helped us here.

Note that it was crucial that our combined parser/grammars are only able to use the methods of Applicative, and not Monad. Applicative is less powerful, so by giving less power to the user of our Descr interface, the other side, i.e. the implementation, can be more powerful.

The reason why Applicative is ok, but Monad is not, is that in Applicative, the results do not affect the shape of the computation, whereas in Monad, the whole point of the bind operator (>>=) is that the result of the computation is used to decide the next computation. And while this is perfectly fine for a parser, it just makes no sense for a grammar generator, where there simply are no values around!

We have also seen that a phantom type, namely the parameter of Grammar, can be useful, as it lets the type system make sure we do not write nonsense. For example, the type of orElseG ensures that both grammars that are combined here indeed describe something of the same type.

  1. It seems to be the week of applicative-appraising blog posts: Brent has posted a nice piece about enumerations using Applicative yesterday.

  2. I like how in this alignment of <*> and <* the > point out where the arguments are that are being passed to the function on the left.

26 October, 2016 04:00AM by Joachim Breitner (mail@joachim-breitner.de)

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

Rblpapi 0.3.5

A new release of Rblpapi is now on CRAN. Rblpapi provides a direct interface between R and the Bloomberg Terminal via the C++ API provided by Bloomberg Labs (but note that a valid Bloomberg license and installation is required).

This is the sixth release since the package first appeared on CRAN last year. This release brings new functionality via new (getPortfolio()) and extended functions (getTicks()) as well as several fixes:

Changes in Rblpapi version 0.3.5 (2016-10-25)

  • Add new function getPortfolio to retrieve portfolio data via bds (John in #176)

  • Extend getTicks() to (optionally) return non-numeric data as part of data.frame or data.table (Dirk in #200)

  • Similarly extend getMultipleTicks (Dirk in #202)

  • Correct statement on timestamp for getBars (Closes issue #192)

  • Minor edits to a few files in order to either please R(-devel) CMD check --as-cran, or update documentation

Courtesy of CRANberries, there is also a diffstat report for this release. As always, more detailed information is on the Rblpapi page. Questions, comments etc should go to the issue tickets system at the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

26 October, 2016 02:14AM

October 25, 2016

hackergotchi for Laura Arjona Reina

Laura Arjona Reina

Rankings, Condorcet and free software: Calculating the results for the Stretch Artwork Survey

We had 12 candidates for the Debian Stretch Artwork and a survey was set up for allowing people to vote which one they prefer.

The survey was run in my LimeSurvey instance, surveys.larjona.net. LimeSurvey is a nice piece of free software with a lot of features. It provides a “Ranking” question type, and it was very easy to allow people to “vote” in the Debian style (Debian uses the Condorcet method in its elections).

However, although LimeSurvey offers statistics and even graphics to show the results of many types of questions, its output for the Ranking type is not useful, so I had to export the data and use another tool to find the winner.

Export the data from LimeSurvey

I’ve created a read-only user to visit the survey site. With this visitor you can explore the survey questionnaire, its results, and export the data.
Username: stretch
Password: artwork

First attempt, the quick and easy (and nonfree, I guess)

There is an online tool to calculate the Condorcet winner, http://www.ericgorr.net/condorcet/ 
The steps I followed to feed the tool with the data from LimeSurvey were these:
1.- Went to the admin interface of LimeSurvey, selected the Stretch artwork survey, responses and statistics, export results to application
2.- Selected “Completed responses only”, “Question codes”, “Answer codes”, and exported to CSV. (results_stretch1.csv)
3.- Opened the CSV with LibreOffice Calc, and removed these columns:
id    submitdate    lastpage    startlanguage
4.- Remove the first row containing the headers and saved the result (results_stretch2.csv)
5.- In commandline:
sort results_stretch2.csv | uniq -c > results_stretch3.csv
6.- Opened results_stretch3.csv with LibreOffice Calc and selected “merge delimiters” when importing.
7.- Removed the first column (blank), added a column between the numbers and the first ranked option, and filled that column with the “:” value. Saved (results_stretch4.csv)
8.- Opened results_stretch4.csv with my preferred editor and searched and replaced “,:,” with “:” and, after that, searched and replaced “,” with “>”. Saved the result (results_stretch5.csv)
9.- Went to http://condorcet.ericgorr.net/, selected Condorcet basic, “tell me some things”, and pasted the contents of results_stretch5.csv there.
The results are in results_stretch1.html
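
For the curious, the pairwise counting that such a Condorcet solver performs on input lines like “12:A>B>C” can be sketched in a few lines of Python. This is only an illustration of the method, not the code behind the online tool; the file name is the one produced in step 8 above.

from collections import defaultdict

def condorcet_winner(lines):
    wins = defaultdict(int)       # wins[(x, y)]: voters preferring x over y
    candidates = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        count, ranking = line.split(":")
        ranked = ranking.split(">")
        candidates.update(ranked)
        for i, x in enumerate(ranked):
            for y in ranked[i + 1:]:
                wins[(x, y)] += int(count)
    for c in candidates:
        if all(wins[(c, o)] > wins[(o, c)] for o in candidates if o != c):
            return c              # beats every other option head-to-head
    return None                   # no Condorcet winner exists

with open("results_stretch5.csv") as ballots:
    print(condorcet_winner(ballots))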

But where is the source code of this Condorcet tool?

I couldn’t find the source code (nor license) of the solver by Eric Gorr.
The tool is mentioned in http://www.accuratedemocracy.com/z_tools.htm where other tools are listed and when the tool is libre software, is noted so. But not in this case.
There, I found another tool, VoteEngine, which is open source, so I tried with that.

Second attempt: VoteEngine, a Free Open Source Software tool made with Python

I used a modification of voteengine-0.99 (the original zip is available in http://vote.sourceforge.net/ and a diff with the changes I made (basically, Numeric -> numpy and Int -> int, inorder that works in Debian stable), here.
Steps 1 to 4 are the same as in the first attempt.
5.- Sorted the 12 different options alphabetically and assigned a letter to each one (saved the assignments in a file called stretch_key.txt).
6.- Opened results_stretch2.csv with my favorite editor, and searched and replaced the names of the different options with their corresponding letters from the stretch_key.txt file.
Searched and replaced “,” with “ ” (space). Then saved the results into results_stretch3_voteengine.csv.
7.- Copied the input.txt file from voteengine-0.99 into stretch.txt and edited the options
to our needs. Pasted the contents of results_stretch3_voteengine.csv
at the end of stretch.txt
8.-In the commandline
./voteengine.py <stretch.txt  > winner.txt
(winner.txt contains the results for the Condorcet method).
9.- I edited stretch.txt again to change the method to Schulze and
calculated the results, and again with the Smith method. The winner in
the 3 methods is the same. I pasted the summary of these 3 methods
(Schulze and Smith provide a ranked list) in stretch_results.txt

If it can be done, it can be done with R…

I found the algstat R package, which includes a “condorcet” function, but I couldn’t make it work with the data.
I’m not sure how the data needs to be shaped. I’m sure that this can be done in R and the problem is me, in this case. Comments are welcome, and I’ll try to ask a friend whose R skills are better than mine!

And another SaaS

I found https://www.condorcet.vote/ and its source code. It would be interesting to deploy a local instance to drive future surveys, but this time I didn’t want to fight with PHP just to use the “solver” part, nor install another SaaS on my home server only to find that I need some other dependency or whatever.
I’ll keep an eye on this, though, because it looks like a modern and active project.

Finally, devotee

Well, and which software does Debian use for its elections?
There is a git repository with devotee that you can clone.
I found that although the tool is quite modular, it’s written specifically for the Debian case (votes received by mail, GPG signed, there is a quorum, and other particularities) and I was not sure if I could use it with my data. It is written in Perl, which I understand less well than the Python of VoteEngine.
Maybe I’ll return to it, though, when I have more time, to try to put our data in the shape of a typical tally.txt file and then see if the module solving the Condorcet winner can work for me.
That’s all, folks! (for now…)


You can coment on this blog post in this pump.io thread

Filed under: Tools Tagged: data mining, Debian, English, SaaS, statistics

25 October, 2016 08:11PM by larjona

Jose M. Calhariz

New packages for Amanda on the works

Because of the upgrade of perl, amanda is currently broken in Debian testing and unstable. The problem is known and I am working with my sponsor to create new packages that solve it. Please hang on a little longer.

25 October, 2016 06:41PM by Jose M. Calhariz

Bits from Debian

"softWaves" will be the default theme for Debian 9

The theme "softWaves" by Juliette Taka Belin has been selected as default theme for Debian 9 'stretch'.

softWaves Login screen. Click to see the whole theme proposal

After the Debian Desktop Team made the call for proposing themes, a total of twelve choices were submitted, and every Debian contributor had the opportunity to vote on them in a survey. We received 3,479 responses ranking the different choices, and softWaves was the winner among them.

We'd like to thank all the designers who participated, providing nice wallpapers and artwork for Debian 9, and encourage everybody interested in this area of Debian to join the Design Team. We are considering packaging all of them so they are easily available in Debian. If you want to help in this effort, or package any other artwork (for example, designed particularly to be accessibility-friendly), please contact the Debian Desktop Team, but hurry up, because the freeze for new packages in the next release of Debian starts on January 5th, 2017.

This is the second time that Debian ships a theme by Juliette Belin, who also created the theme "Lines" that enhances our current stable release, Debian 8. Congratulations, Juliette, and thank you very much for your continued commitment to Debian!

25 October, 2016 05:50PM by Laura Arjona Reina and Niels Thykier

Julian Andres Klode

Introducing DNS66, a host blocker for Android


I’m proud (yes, really) to announce DNS66, my host/ad blocker for Android 5.0 and newer. It’s been around since last Thursday on F-Droid, but it never really got a formal announcement.

DNS66 creates a local VPN service on your Android device, and diverts all DNS traffic to it, possibly adding new DNS servers you can configure in its UI. It can use hosts files for blocking whole sets of hosts or you can just give it a domain name to block (or multiple hosts files/hosts). You can also whitelist individual hosts or entire files by adding them to the end of the list. When a host name is looked up, the query goes to the VPN which looks at the packet and responds with NXDOMAIN (non-existing domain) for hosts that are blocked.
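
To illustrate just that blocking decision, here is a rough sketch in Python using the dnspython library; it only illustrates the idea, since DNS66 itself is implemented in Java:

import dns.message
import dns.rcode

def handle_query(wire, blocked_hosts):
    """Answer blocked names with NXDOMAIN; anything else should be forwarded."""
    query = dns.message.from_wire(wire)
    name = str(query.question[0].name).rstrip(".")
    if name in blocked_hosts:
        reply = dns.message.make_response(query)
        reply.set_rcode(dns.rcode.NXDOMAIN)
        return reply.to_wire()   # send this straight back to the client
    return None                  # not blocked: pass the query on to the real DNS server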

You can find DNS66 here:

F-Droid is the recommended source to install from. DNS66 is licensed under the GNU GPL 3, or (mostly) any later version.

Implementation Notes

DNS66’s core logic is based on another project,  dbrodie/AdBuster, which arguably has the cooler name. I translated that from Kotlin to Java, and cleaned up the implementation a bit:

All work is done in a single thread by using poll() to detect when to read/write stuff. Each DNS request is sent via a new UDP socket, and poll() polls over all UDP sockets, a Device Socket (for the VPN’s tun device) and a pipe (so we can interrupt the poll at any time by closing the pipe).

We literally redirect your DNS servers: if your device uses a given DNS server, all traffic to that server's address is routed to the VPN. The VPN only understands DNS traffic, though, so you might have trouble if your DNS server also happens to serve something else. I plan to change that at some point to emulate multiple DNS servers with fake IPs, but this was a first step to get it working with fallback: Android can now transparently fall back to other DNS servers without having to be aware that they are routed via the VPN.

We also need to deal with timing out queries that we received no answer for: DNS66 stores the query into a LinkedHashMap and overrides the removeEldestEntry() method to remove the eldest entry if it is older than 10 seconds or there are more than 1024 pending queries. This means that it only times out up to one request per new request, but it eventually cleans up fine.
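
In Python terms, the pending-query table works roughly like this; this is a sketch only, since the real code is Java and relies on LinkedHashMap's removeEldestEntry() hook:

import time
from collections import OrderedDict

MAX_PENDING = 1024
TIMEOUT = 10  # seconds

pending = OrderedDict()  # maps a query key to (timestamp, upstream socket)

def remember_query(key, sock):
    """Store a pending query and evict at most one stale entry per insertion."""
    pending[key] = (time.time(), sock)
    eldest_key, (eldest_ts, eldest_sock) = next(iter(pending.items()))
    if len(pending) > MAX_PENDING or time.time() - eldest_ts > TIMEOUT:
        del pending[eldest_key]
        eldest_sock.close()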


Filed under: Android, Uncategorized

25 October, 2016 04:20PM by Julian Andres Klode

hackergotchi for Michal Čihař

Michal Čihař

New features on Hosted Weblate

Today, a new version has been deployed on Hosted Weblate. It brings many long-requested features and enhancements.

Adding a project to your watched projects got way simpler: you can now do it on the project page using the watch button:

Watch project

Another feature that project admins will like is that they can now change project metadata without contacting me. This works at both the project and the component level:

Project settings

And to add something fancy, there is a new badge showing the status of translations into all languages. This is how it looks for Weblate itself:

Translation status

As you can see, it can get pretty big for projects with many translations, but it gives you a complete picture of the translation status.

You can find all these features in the upcoming Weblate 2.9, which should be released next week. The complete list of changes in Weblate 2.9 is described in our documentation.

Filed under: Debian English phpMyAdmin SUSE Weblate | 0 comments

25 October, 2016 04:00PM

hackergotchi for Jaldhar Vyas

Jaldhar Vyas

Aaargh gcc 5.x You Suck

I had to write a quick program today which is going to be run many thousands of times a day, so it has to run fast. I decided to do it in C++ instead of the usual Perl or JavaScript because it seemed appropriate, and I've been playing around a lot with C++ lately trying to update my knowledge of its modern features. So 200 LOC later I was almost done, and I ran the program through valgrind, a good habit I've been trying to instill. That's when I got a reminder of why I avoid C++.

==37698== HEAP SUMMARY:
==37698==     in use at exit: 72,704 bytes in 1 blocks
==37698==   total heap usage: 5 allocs, 4 frees, 84,655 bytes allocated
==37698== LEAK SUMMARY:
==37698==    definitely lost: 0 bytes in 0 blocks
==37698==    indirectly lost: 0 bytes in 0 blocks
==37698==      possibly lost: 0 bytes in 0 blocks
==37698==    still reachable: 72,704 bytes in 1 blocks
==37698==         suppressed: 0 bytes in 0 blocks

One of the things I've learnt, and have been trying to apply more rigorously, is to avoid manual memory management (new/delete) as much as possible in favor of modern C++ features such as std::unique_ptr. By my estimation there should be only three places in my code where memory is allocated, and none of them should leak. Where do the other allocations come from? And why is there a missing free (or delete)? Now the good news is that valgrind says the memory is not technically leaking: it is still reachable at exit, which is OK because the OS will reclaim it. But this program will run a lot, and I think it could still lead to problems over time, such as memory fragmentation, so I wanted to understand what was going on. Not to mention the bad aesthetics of it.

My first assumption (one which has served me well over the years) was that I had screwed up somewhere. Or perhaps it could be some behind-the-scenes compiler magic. It turned out to be the latter -- sort of, as I found out only after two hours of jiggling code in different ways and googling for clues. That's when I found this Stack Overflow question, which suggests that it is either a valgrind or a compiler bug. The answer specifically mentions gcc 5.1. I was using Ubuntu LTS, which has gcc 5.4, so I have just gone ahead and assumed all 5.x versions of gcc have this problem. Sure enough, compiling the same program on Debian stable, which has gcc 4.9, gave this...

==6045== HEAP SUMMARY:
==6045==     in use at exit: 0 bytes in 0 blocks
==6045==   total heap usage: 3 allocs, 3 frees, 10,967 bytes allocated
==6045== All heap blocks were freed -- no leaks are possible

...Much better. The executable was substantially smaller too. The time was not a total loss, however. I learned that valgrind is pronounced val-grinned (it's from Norse mythology), not val-grind as I had thought. So I have that going for me, which is nice.

25 October, 2016 06:45AM

Russ Allbery

Review: Lord of Emperors

Review: Lord of Emperors, by Guy Gavriel Kay

Series: Sarantine Mosaic #2
Publisher: Eos
Copyright: 2000
Printing: February 2001
ISBN: 0-06-102002-8
Format: Mass market
Pages: 560

Lord of Emperors is the second half of a work that began with Sailing to Sarantium and is best thought of as a single book split for publishing reasons. You want to read the two together and in order.

As is typical for this sort of two-part work, it's difficult to review the second half without spoilers. I'll be more vague about the plot and the characters than normal, and will mark one bit that's arguably a bit of a spoiler (although I don't think it would affect the enjoyment of the book).

At the end of Sailing to Sarantium, we left Crispin in the great city, oddly and surprisingly entangled with some frighteningly powerful people and some more mundane ones (insofar as anyone is mundane in a Guy Gavriel Kay novel, but more on that in a bit). The opening of Lord of Emperors takes a break from the city to introduce a new people, the Bassanids, and a new character, Rustem of Karakek. While Crispin is still the heart of this story, the thread that binds the entirety of the Sarantine Mosaic together, Rustem is the primary protagonist for much of this book. I had somehow forgotten him completely since my first read of this series many years ago. I have no idea how.

I mentioned in my review of the previous book that one of the joys of reading this series is competence porn: watching the work of someone who is extremely good at what they do, and experiencing vicariously some of the passion and satisfaction they have for their work. Kay's handling of Crispin's mosaics is still the highlight of the series for me, but Rustem's medical practice (and Strumosus, and the chariot races) comes close. Rustem is a brilliant doctor by the standards of the time, utterly frustrated with the incompetence of the Sarantine doctors, but also weaving his own culture's belief in omens and portents into his actions. He's more reserved, more laconic than Crispin, but is another character with focused expertise and a deep internal sense of honor, swept unexpectedly into broader affairs and attempting to navigate them by doing the right thing in each moment. Kay fills this book with people like that, and it's compelling reading.

Rustem's entrance into the city accidentally sets off a complex chain of events that draws together all of the major characters of Sailing to Sarantium and adds a few more. The stakes are no less than war and control of major empires, and here Kay departs firmly from recorded history into his own creation. I had mentioned in the previous review that Justinian and Theodora are the clear inspirations for this story; that remains true, and many other characters are easy to map, but don't expect history to go here the way that it did in our world. Kay's version diverges significantly, and dramatically.

But one of the things I love the most about this book is its focus on the individual acts of courage, empathy, and ethics of each of the characters, even when those acts do not change the course of empires. The palace intrigue happens, and is important, but the individual acts of Kay's large cast get just as much epic narrative attention even if they would never appear in a history book. The most globally significant moment of the book is not the most stirring; that happens slightly earlier, in a chariot race destined to be forgotten by history. And the most touching moment of the book is a moment of connection between two people who would never appear in history, over the life of a third, that matters so much to the reader only because of the careful attention to individual lives and personalities Kay has shown over the course of a hundreds of pages.

A minor spoiler follows in the next paragraph, although I don't think it affects the reading of the book.

One brilliant part of Kay's fiction is that he doesn't have many villains, and goes to some lengths to humanize the actions of nearly everyone in the book. But sometimes the author's deep dislike of one particular character shows through, and here it's Pertennius (the clear analogue of Procopius). In a way, one could say the entirety of the Sarantine Mosaic is a rebuttal of the Secret History. But I think Kay's contrast between Crispin's art (and Scortius's, and Strumosus's) and Pertennius's history has a deeper thematic goal. I came away from this book feeling like the Sarantine Mosaic as a whole stands in contrast to a traditional history, stands against a reduction of people to dates and wars and buildings and governments. Crispin's greatest work attempts to capture emotion, awe, and an inner life. The endlessly complex human relationships shown in this book running beneath the political events occasionally surface in dramatic upheavals, but in Kay's telling the ones that stay below the surface are just as important. And while much of the other art shown in this book differs from Crispin's in being inherently ephemeral, it shares that quality of being the art of life, of complexity, of people in dynamic, changing, situational understanding of the world, exercising competence in some area that may or may not be remembered.

Kay raises to the level of epic the bits of history that don't get recorded, and, in his grand and self-conscious fantasy epic style, encourages the reader to feel those just as deeply as the ones that will have later historical significance. The measure of people, their true inner selves, is often shown in moments that Pertennius would dismiss and consider unworthy of recording in his history.

End minor spoiler.

I think Lord of Emperors is the best part of the Sarantine Mosaic duology. It keeps the same deeply enjoyable view of people doing things they are extremely good at while correcting some of the structural issues in the previous book. Kay continues to use a large cast, and continues to cut between viewpoint characters to show each event from multiple angles, but he has a better grasp of timing and order here than in Sailing to Sarantium. I never got confused about the timeline, thanks in part to more frequent and more linear scene cuts. And Lord of Emperors passes, with flying colors, the hardest test of a novel with a huge number of viewpoint characters: when Kay cuts to a new viewpoint, my reaction is almost always "yes, I wanted to see what they were thinking!" and almost never "wait, no, go back!".

My other main complaint about Sailing to Sarantium was the treatment of women, specifically the irresistibility of female sexual allure. Kay thankfully tones that down a lot here. His treatment of women is still a bit odd — one notices that five women seem to all touch the lives of the same men, and little room is left for Platonic friendship between the genders — but they're somewhat less persistently sexualized. And the women get a great deal of agency in this book, and a great deal of narrative respect.

That said, Lord of Emperors is also emotionally brutal. It's beautifully done, and entirely appropriate to the story, and Kay does provide a denouement that takes away a bit of the sting. But it's still very hard to read in spots if you become as invested in the characters and in the world as I do. Kay is writing epic that borders on tragedy, and uses his full capabilities as a writer to make the reader feel it. I love it, but it's not a book that I want to read too often.

As with nearly all Kay, the Sarantine Mosaic as a whole is intentional, deliberate epic writing, wearing its technique on its sleeve and making no apologies. There is constant foreshadowing, constant attempts to draw larger conclusions or reveal great principles of human nature, and a very open, repeated stress on the greatness and importance of events while they're being described. This works for me, but it doesn't work for everyone. If it doesn't work for you, the Sarantine Mosaic is unlikely to change your mind. But if you're in the mood for that type of story, I think this is one of Kay's best, and Lord of Emperors is the best half of the book.

Rating: 10 out of 10

25 October, 2016 04:04AM

hackergotchi for Gunnar Wolf

Gunnar Wolf

On the results of vote "gr_private2"

Given that I started the GR process and called for discussion and votes, I feel it is somehow my duty to also give this process a simple wrap-up. Of course, I'll say many things already well known to my fellow Debian people, but non-Debianers read this as well.

So, for further context, if you need it, please read my previous blog post, where I was about to send out the call for votes. It summarizes the situation and the proposals; you will find we had a nice set of messages on debian-vote@lists.debian.org during September. I have to thank all the involved parties, especially Ian Jackson, who spent a lot of energy summing up the situation and clarifying the different bits for everyone involved.

So, we held the vote; you may be interested in looking at the detailed vote statistics for the 235 correctly received votes and, most importantly, the results:

Results for gr_private2

First of all, I'll say I'm actually surprised at the results: I expected Ian's proposal (acknowledge difficulty; I actually voted it as my top option) to win and mine (repeal the previous GR) to come last; it turns out the winning option was Iain's (remain private). But all in all, I am happy with the results. As I said during the discussion, I was quite disappointed with the outcome of the previous GR on this topic; and, yes, it seems the breaking point was when many people thought the privacy status of posted messages was in jeopardy. We cannot really know how that vote would have gone had we followed the strategy of amending the original resolution text instead of replacing it, but I believe it would have passed. One more surprise of this iteration was that I expected Further Discussion to be ranked higher, somewhere among the three explicit options. I am happy, of course, that we got such overwhelming clarity about what the project as a whole prefers.

And what was gained or lost with this whole exercise? Well, if nothing else, we get to stop lying. For over ten years, we have had an accepted resolution binding us to release the messages sent to debian-private under such-and-such conditions... but we never got around to implementing it. We now know that debian-private will remain private... but we should keep reminding ourselves to use the list as little as possible.

For a project such as Debian, which is often seen as a beacon of doing the right thing no matter what, I feel that being explicit about not lying to ourselves is of great importance. Yes, we have the principle of not hiding our problems, but it has long been argued that the use of this list is not hiding our problems. Private communication can happen whenever humans are involved, even if administratively we tried to avoid it.

Any of the three running options could have won and I'd have been happy. My #1 didn't win, but my #2 did. And, I am sure, it's for the best for the project as a whole.

25 October, 2016 01:46AM by gwolf

October 24, 2016

hackergotchi for Chris Lamb

Chris Lamb


Today marks the 13th anniversary of the day the last Concorde passenger flight from New York arrived in the UK. Every seat was filled, a feat that had become increasingly rare for a plane that was a technological marvel but a commercial flop…

  • Only 20 aircraft were ever built despite 100 orders, most of them cancelled in the early 1970s.
  • Taxiing to the runway consumed 2 tons of fuel.
  • The white colour scheme was specified to reduce the outer temperature by about 10°C.
  • In a promotional deal with Pepsi, F-BTSD was temporarily painted blue. Due to the change of colour, Air France were advised to remain at Mach 2 for no more than 20 minutes at a time.
  • At supersonic speed the fuselage would heat up and expand by as much as 30cm. The most obvious manifestation of this was a gap that opened up on the flight deck between the flight engineer's console and the bulkhead. On some aircraft conducting their final supersonic flight, the flight engineers placed their caps in this expanded gap, permanently wedging them in place as the gap shrank again.
  • At Concorde's altitude a breach of cabin integrity would result in a loss of pressure so severe that passengers would quickly suffer from hypoxia despite application of emergency oxygen. Concorde was thus built with smaller windows to reduce the rate of loss in such a breach.
  • The high cruising altitude meant passengers received almost twice the amount of radiation as on a conventional long-haul flight. To prevent excessive exposure, the flight deck included a radiometer; if the radiation level became too high, pilots would descend below 45,000 feet.
  • BA's service had a greater number of passengers who booked a flight and then failed to appear than any other aircraft in their fleet.
  • Market research later in Concorde's life revealed that customers thought Concorde was more expensive than it actually was. Ticket prices were progressively raised to match these perceptions.
  • The fastest transatlantic airliner flight was from New York JFK to London Heathrow on 7 February 1996 by British Airways' G-BOAD in 2 hours, 52 minutes, 59 seconds from takeoff to touchdown. It was aided by a 175 mph tailwind.

See also: A Rocket to Nowhere.

24 October, 2016 06:59PM

Reproducible builds folks

Reproducible Builds: week 78 in Stretch cycle

What happened in the Reproducible Builds effort between Sunday October 16 and Saturday October 22 2016:

Media coverage

Upcoming events


In order to build packages reproducibly, you not only need identical sources but also some external definition of the environment used for a particular build. This definition includes the inputs and the outputs and, in the Debian case, is available in a $package_$architecture_$version.buildinfo file.

We anticipate that the next dpkg upload to sid will create .buildinfo files by default. Whilst it's clear that we also need to teach dak to deal with them (#763822), it's not actually clear how to handle .buildinfo files after dak has processed them, nor how to make them available to the world.

To this end, Chris Lamb has started development on a proof-of-concept .buildinfo server to see what issues arise. Source
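
As an aside, .buildinfo files use the familiar deb822 format, so peeking into one only takes a couple of lines. This is a sketch assuming the python-debian package; the file name is a made-up example:

from debian import deb822

with open("hello_2.10-1_amd64.buildinfo") as f:
    info = deb822.Deb822(f)

print(info.get("Source"), info.get("Version"), info.get("Architecture"))
print(info.get("Environment"))  # the recorded build environment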

Reproducible work in other projects

  • Ximin Luo submitted a patch to GCC as a prerequisite for future patches to make debugging symbols reproducible.

Packages reviewed and fixed, and bugs filed

Reviews of unreproducible packages

99 package reviews have been added, 3 have been updated and 6 have been removed this week, adding to our knowledge about identified issues.

6 issue types have been added:

Weekly QA work

During reproducibility testing, some FTBFS bugs have been detected and reported by:

  • Chris Lamb (23)
  • Daniel Reichelt (2)
  • Lucas Nussbaum (1)
  • Santiago Vila (18)

diffoscope development


  • h01ger increased the diskspace for reproducible content on Jenkins. Thanks to ProfitBricks.
  • Valerie Young supplied a patch to make the Python SQL interface more SQLite/PostgreSQL agnostic.
  • lynxis worked hard to make LEDE and OpenWrt builds happen on two hosts.


Our poll to find a good time for an IRC meeting is still running until Tuesday, October 25th; please reply as soon as possible.

We need a logo! Some ideas and requirements for a Reproducible Builds logo have been documented in the wiki. Contributions very welcome, even if simply by forwarding this information.

This week's edition was written by Chris Lamb & Holger Levsen and reviewed by a bunch of Reproducible Builds folks on IRC.

24 October, 2016 04:10PM

Russ Allbery

Review: The Design of Everyday Things

Review: The Design of Everyday Things, by Don Norman

Publisher: Basic Books
Copyright: 2013
ISBN: 0-465-05065-4
Format: Trade paperback
Pages: 298

There are several editions of this book (the first under a different title, The Psychology of Everyday Things). This review is for the Revised and Expanded Edition, first published in 2013 and quite significantly revised compared to the original. I probably read at least some of the original for a class in human-computer interaction around 1994, but that was long enough ago that I didn't remember any of the details.

I'm not sure how much impact this book has had outside of the computer field, but The Design of Everyday Things is a foundational text of HCI (human-computer interaction) despite the fact that many of its examples and much of its analysis is not specific to computers. Norman's goal is clearly to write a book that's fundamental to the entire field of design; not having studied the field, I don't know if he succeeded, but the impact on computing was certainly immense. This is the sort of book that everyone ends up hearing about, if not necessarily reading, in college. I was looking forward to filling a gap in my general knowledge.

Having now read it cover-to-cover, would I recommend others invest the time? Maybe. But probably not.

There are several things this book does well. One of the most significant is that it builds a lexicon and a set of general principles that provide a way of talking about design issues. Lexicons are not the most compelling reading material (see also Design Patterns), but having a common language is useful. I still remember affordances from college (probably from this book or something else based on it). Norman also adds, and defines, signifiers, constraints, mappings, and feedback, and talks about the human process of building a conceptual model of the objects with which one is interacting.

Even more useful, at least in my opinion, is the discussion of human task-oriented behavior. The seven stages of action is a great systematic way of analyzing how humans perform tasks, where those actions can fail, and how designers can help minimize failure. One thing I particularly like about Norman's presentation here is the emphasis on the feedback cycle after performing a task, or a step in a task. That feedback, and what makes good or poor feedback, is (I think) an underappreciated part of design and something that too often goes missing. I thought Norman was a bit too dismissive of simple beeps as feedback (he thinks they don't carry enough information; while that's not wrong, I think they're far superior to no feedback at all), but the emphasis on this point was much appreciated.

Beyond these dry but useful intellectual frameworks, though, Norman seems to have a larger purpose in The Design of Everyday Things: making a passionate argument for the importance of design and for not tolerating poor design. This is where I think his book goes a bit off the rails.

I can appreciate the boosterism of someone who feels an aspect of creating products is underappreciated and underfunded. But Norman hammers on the unacceptability of bad design to the point of tedium, and seems remarkably intolerant of, and unwilling to confront, the reasons why products may be released with poor designs for their eventual users. Norman clearly wishes that we would all boycott products with poor designs and prize usability above most (all?) other factors in our decisions. Equally clearly, this is not happening, and Norman knows it. He even describes some of the reasons why not, most notably (and most difficultly) the fact that the purchasers of many products are not the eventual users. Stoves are largely sold to builders, not kitchen cooks. Light switches are laid out for the convenience of the electrician; here too, the motive for the builder to spend additional money on better lighting controls is unclear. So much business software is purchased by people who will never use it directly, and may have little or no contact with the people who do. These layers of economic separation result in deep disconnects of incentive structure between product manufacturers and eventual consumers.

Norman acknowledges this, writes about it at some length, and then seems to ignore the point entirely, returning to ranting about the deficiencies of obviously poor design and encouraging people to care more about design. This seems weirdly superficial in this foundational of a book. I came away half-convinced that these disconnects of incentive (and some related problems, such as the unwillingness to invest in proper field research or the elaborate, expensive, and lengthy design process Norman lays out as ideal) are the primary obstacle in the way of better-designed consumer goods. If that's the case, then this is one of the largest, if not the largest, obstacle in the way of doing good design, and I would have expected this foundational of a book to tackle it head-on and provide some guidance for how to fight back against this problem. But Norman largely doesn't.

There is some mention of this in the introduction. Apparently much of the discussion of the practical constraints on product design in the business world was added in this revised edition, and perhaps what I'm seeing is the limitations of attempting to revise an existing text. But that also implies that the original took an even harder line against poor design. Throughout, Norman is remarkably high-handed in his dismissal of bad design, focusing more on condemnation than on an investigation of why bad design might happen and what we, as readers, can learn from that process to avoid repeating it. Norman does provide extensive analysis of the design process and the psychology of human interaction, but still left me with the impression that he believes most design failures stem from laziness and stupidity. The negativity and frustration got a bit tedious by the middle of the book.

There's quite a lot here that someone working in design, particularly interface design, should be at least somewhat familiar with: affordances, signifiers, the importance of feedback, the psychological model of tasks and actions, and the classification of errors, just to name a few. However, I'm not sure this book is the best medium for learning those things. I found it a bit tedious, a bit too arrogant, and weirdly unconcerned with feasible solutions to the challenge of mismatched incentives. I also didn't learn that much from it; while the concepts here are quite important, most of them I'd picked up by osmosis from working in the computing field for twenty years.

In that way, The Design of Everyday Things reminded me a great deal of the Gang of Four's Design Patterns, even though it's a more readable book and less of an exercise in academic classification. The concepts presented are useful and important, but I'm not sure I can recommend the book as a book. It may be better to pick up the same concepts as you go, with the help of Internet searches and shorter essays.

Rating: 6 out of 10

24 October, 2016 04:17AM

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

World Marathon Majors: Five Star Finisher!

A little over eight years ago, I wrote a short blog post which somewhat dryly noted that I had completed the five marathons constituting the World Marathon Majors. I had completed Boston, Chicago and New York during 2007, adding London and then Berlin (with a personal best) in 2008. The World Marathon Majors existed then, but I was not aware of a website. The organisation was aiming to raise the profile of the professional and very high-end aspect of the sport. But marathoning is funny as they let somewhat regular folks like you and me into the same race. And I always wondered if someone kept track of regular folks completing the suite...

I have been running a little less the last few years, though I did get around to completing the Illinois Marathon earlier this year (I only tweeted about it and still have not added anything to the running section of my blog). But two weeks ago, I was once again handing out water cups at the Chicago Marathon, sending along two tweets when the elite wheelchair and elite male runners flew by. To the first, the World Marathon Majors account replied, which led me to their website. Which in turn led me to the Five Star Finisher page, and the newer / larger Six Star Finisher page now that Tokyo has been added.

And in short, one can now request one's record to be added (if they check out). So I did. And now I am on the Five Star Finisher page!

I don't think I'll ever surpass that as a runner. The table header and my row look like this:

Table header Dirk Eddelbuettel

If only my fifth / sixth grade physical education teacher could see that---he was one of those early running nuts from the 1970s and made us run towards / around this (by now enlarged) pond, and boy did I hate that :) Guess it did have some long-lasting effects. And I casually circled the lake a few years ago, starting much further away from my parents' place. Once you are in the groove for distance...

But leaving that aside, running has been fun, and with some luck I may have another one or two marathons or Ragnar Relays left in me. The only really bad part about this is that I may have to get myself to Tokyo after all (for something that is not an ISM workshop) ...

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

24 October, 2016 02:41AM

hackergotchi for Daniel Silverstone

Daniel Silverstone

Gitano - Approaching Release - Deprecated commands

As mentioned previously I am working toward getting Gitano into Stretch. Last time we spoke about lace, on which a colleague and friend of mine (Richard Maw) did a large pile of work. This time I'm going to discuss deprecation approaches and building more capability out of fewer features.

First, a little background -- Gitano is written in Lua, which is a deliberately small language whose authors spend more time thinking about what they can remove from the language spec than about what they could add in. I first came to Lua in the 3.2 days, a little before 4.0 came out. (The authors provide a lovely timeline in case you're interested.) With each of the releases of Lua which came after 3.2, I was struck by how the authors looked to take a number of features which the language had, and collapse them into more generic, more powerful, smaller, fewer features.

This approach to design stuck with me over the subsequent decade, and when I began Gitano I tried to have the smallest number of core features/behaviours, from which could grow the power and complexity I desired. Gitano is, at its core, a set of files in a single format (clod) stored in a consistent manner (Git) which mediate access to a resource (Git repositories). Some of those files result in emergent properties such as the concept of the 'owner' of a repository (though that can simply be considered the value of the project.owner property for the repository). Indeed the concept of the owner of a repository is a fiction generated by the ACL system with a very small amount of collusion from the core of Gitano. Yet until recently Gitano had a first class command set-owner which would alter that one configuration value.

[gitano]  set-description ---- Set the repo's short description (Takes a repo)
[gitano]         set-head ---- Set the repo's HEAD symbolic reference (Takes a repo)
[gitano]        set-owner ---- Sets the owner of a repository (Takes a repo)

Those of you with Gitano installations may see the above if you ask it for help. Yet you'll also likely see:

[gitano]           config ---- View and change configuration for a repository (Takes a repo)

The config command gives you access to the repository configuration file (which, yes, you could access over git instead, but the config command can be delegated in a more fine-grained fashion without having to write hooks). Given the config command has all the functionality of the three specific set-* commands shown above, it was time to remove the specific commands.


If you had automation which used the set-description, set-head, or set-owner commands then you will want to switch to the config command before you migrate your server to the current or any future version of Gitano.

In brief, where you had:

ssh git@gitserver set-FOO repo something

You now need:

ssh git@gitserver config repo set project.FOO something

It looks a little more wordy but it is consistent with the other features that are keyed from the project configuration, such as:

ssh git@gitserver config repo set cgitrc.section Fooble Section Name

And, of course, you can see what configuration is present with:

ssh git@gitserver config repo show

Or look at a specific value with:

ssh git@gitserver config repo show specific.key

As always, you can get more detailed (if somewhat cryptic) help with:

ssh git@gitserver help config
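
If your automation is scripted rather than typed by hand, the switch is a one-line change per call. As a purely hypothetical example, a small Python wrapper around the new form could look like this (the server and repository names are placeholders):

import subprocess

def gitano_config_set(repo, key, value, server="git@gitserver"):
    """Replacement for the old set-<key> commands: config <repo> set project.<key> <value>."""
    subprocess.run(
        ["ssh", server, "config", repo, "set", "project.{}".format(key), value],
        check=True,
    )

# What used to be: ssh git@gitserver set-description myrepo "Some text"
gitano_config_set("myrepo", "description", "Some text")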

Next time I'll try and touch on the new PGP/GPG integration support.

24 October, 2016 02:24AM by Daniel Silverstone

hackergotchi for Francois Marier

Francois Marier

Tweaking Referrers For Privacy in Firefox

The Referer header has been a part of the web for a long time. Websites rely on it for a few different purposes (e.g. analytics, ads, CSRF protection) but it can be quite problematic from a privacy perspective.

Thankfully, there are now tools in Firefox to help users and developers mitigate some of these problems.


In a nutshell, the browser adds a Referer header to all outgoing HTTP requests, revealing to the server on the other end the URL of the page you were on when you placed the request. For example, it tells the server where you were when you followed a link to that site, or what page you were on when you requested an image or a script. There are, however, a few limitations to this simplified explanation.

First of all, by default, browsers won't send a referrer if you place a request from an HTTPS page to an HTTP page. This would reveal potentially confidential information (such as the URL path and query string which could contain session tokens or other secret identifiers) from a secure page over an insecure HTTP channel. Firefox will however include a Referer header in HTTPS to HTTPS transitions unless network.http.sendSecureXSiteReferrer (removed in Firefox 52) is set to false in about:config.

Secondly, using the new Referrer Policy specification web developers can override the default behaviour for their pages, including on a per-element basis. This can be used both to increase or reduce the amount of information present in the referrer.

Legitimate Uses

Because the Referer header has been around for so long, a number of techniques rely on it.

Armed with the Referer information, analytics tools can figure out:

  • where website traffic comes from, and
  • how users are navigating the site.

Another place where the Referer is useful is as a mitigation against cross-site request forgeries. In that case, a website receiving a form submission can reject that form submission if the request originated from a different website.

It's worth pointing out that this CSRF mitigation might be better implemented via a separate header that could be restricted to particularly dangerous requests (i.e. POST and DELETE requests) and only include the information required for that security check (i.e. the origin).
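
As a rough illustration of such an origin-only check, here is a sketch in Python; it is not any particular framework's API, and example.org is a placeholder for your own site:

from urllib.parse import urlparse

ALLOWED_ORIGIN = ("https", "example.org")  # placeholder for your own origin

def csrf_check_passes(headers):
    """Accept a state-changing request only if it originates from our own site."""
    source = headers.get("Origin") or headers.get("Referer")
    if not source:
        return False  # policy decision: treat a missing header as suspicious
    parsed = urlparse(source)
    return (parsed.scheme, parsed.hostname) == ALLOWED_ORIGIN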

Problems with the Referrer

Unfortunately, this header also creates significant privacy and security concerns.

The most obvious one is that it leaks part of your browsing history to sites you visit as well as all of the resources they pull in (e.g. ads and third-party scripts). It can be quite complicated to fix these leaks in a cross-browser way.

These leaks can also lead to exposing private personally-identifiable information when it is part of the query string. One of the most high-profile examples is the accidental leakage of user searches by healthcare.gov.

Solutions for Firefox Users

While web developers can use the new mechanisms exposed through the Referrer Policy, Firefox users can also take steps to limit the amount of information they send to websites, advertisers and trackers.

In addition to enabling Firefox's built-in tracking protection by setting privacy.trackingprotection.enabled to true in about:config, which will prevent all network connections to known trackers, users can control when the Referer header is sent by setting network.http.sendRefererHeader to:

  • 0 to never send the header
  • 1 to send the header only when clicking on links and similar elements
  • 2 (default) to send the header on all requests (e.g. images, links, etc.)

It's also possible to put a limit on the maximum amount of information that the header will contain by setting the network.http.referer.trimmingPolicy to:

  • 0 (default) to send the full URL
  • 1 to send the URL without its query string
  • 2 to only send the scheme, host and port

or using the network.http.referer.XOriginTrimmingPolicy option (added in Firefox 52) to only restrict the contents of referrers attached to cross-origin requests.
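
To make the trimming levels concrete, here is a rough Python model of what each value keeps; it is an approximation, not Firefox's actual implementation:

from urllib.parse import urlsplit, urlunsplit

def trim_referrer(url, policy):
    """Approximate model of network.http.referer.trimmingPolicy values 0, 1 and 2."""
    parts = urlsplit(url)
    if policy == 0:
        return url                                                           # full URL
    if policy == 1:
        return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))  # drop the query string
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))             # scheme, host and port only

print(trim_referrer("https://example.org/search?q=secret", 2))  # -> https://example.org/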

Site owners can opt to share less information with other sites, but they can't share any more than what the user trimming policies allow.

Another approach is to disable the Referer when doing cross-origin requests (from one site to another). The network.http.referer.XOriginPolicy preference can be set to:

  • 0 (default) to send the referrer in all cases
  • 1 to send a referrer only when the base domains are the same
  • 2 to send a referrer only when the full hostnames match


If you try to remove all referrers (i.e. network.http.sendRefererHeader = 0), you will most likely run into problems on a number of sites, for example:

The first two have been worked around successfully by setting network.http.referer.spoofSource to true, an advanced setting which always sends the destination URL as the referrer, thereby not leaking anything about the original page.

Unfortunately, the last two are examples of the kind of breakage that can only be fixed through a whitelist (an approach supported by the smart referer add-on) or by temporarily using a different browser profile.

My Recommended Settings

As with my cookie recommendations, I recommend strengthening your referrer settings but not disabling (or spoofing) it entirely.

While spoofing does solve many of the breakage problems mentioned above, it also effectively disables the anti-CSRF protections that some sites may rely on and that have tangible user benefits. A better approach is to limit the amount of information that leaks through cross-origin requests.

If you are willing to live with some amount of breakage, you can simply restrict referrers to the same site by setting:

network.http.referer.XOriginPolicy = 2

or to sites which belong to the same organization (i.e. same ETLD/public suffix) using:

network.http.referer.XOriginPolicy = 1

This prevents leaks to third parties while giving websites all of the information that they can already see in their own server logs.

On the other hand, if you prefer a weaker but more compatible solution, you can trim cross-origin referrers down to just the scheme, hostname and port:

network.http.referer.XOriginTrimmingPolicy = 2

I have not yet found user-visible breakage using this last configuration. Let me know if you find any!

24 October, 2016 12:00AM

October 23, 2016

Carl Chenet

PyMoneroWallet: the Python library for the Monero wallet

Do you know the Monero cryptocurrency? It's a cryptocurrency, like Bitcoin, focused on security, privacy and untraceability. It's a great project launched in 2014, today traded as XMR on all cryptocurrency exchange platforms (like Kraken or Poloniex).

So what's new? In order to work with a Monero wallet from Python applications, I just wrote a Python library to use the Monero wallet: PyMoneroWallet


Using PyMoneroWallet is as easy as:

$ python3
>>> from monerowallet import MoneroWallet
>>> mw = MoneroWallet()
>>> mw.getbalance()
{'unlocked_balance': 2262265030000, 'balance': 2262265030000}

Lots of features are included; you should have a look at the documentation of the monerowallet module to know them all, but here are some of them quickly:

And so on. Have a look at the complete documentation for the full list of available functions.
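
As a slightly fuller example, here is a sketch that only uses the getbalance() call shown above; it assumes a local monero-wallet-rpc instance is running:

from monerowallet import MoneroWallet

wallet = MoneroWallet()
balances = wallet.getbalance()

# Monero amounts are expressed in atomic units: 1 XMR = 10**12 units.
print("total    : {:.12f} XMR".format(balances["balance"] / 10**12))
print("unlocked : {:.12f} XMR".format(balances["unlocked_balance"] / 10**12))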

UPDATE: I'm trying to launch a crowdfunding campaign for the PyMoneroWallet project. Feel free to comment in this thread on the official Monero forum to let them know you think PyMoneroWallet is a great idea 😉

Feel free to contribute to this young project and help spread the use of Monero by using PyMoneroWallet in your Python applications 🙂

23 October, 2016 10:00PM by Carl Chenet

Vincent Sanders

Rabbit of Caerbannog

Subsequent to my previous use of American Fuzzy Lop (AFL) on the NetSurf bitmap image library, I applied it to the GIF library which, after fixing the test runner, failed to produce any crashes but did result in a better test corpus, improving coverage to above 90%.

I then turned my attention to the SVG processing library. This was different to the bitmap libraries in that it required parsing a much lower density text format and performing operations on the resulting tree representation.

The test program for the SVG library needed some improvement but is very basic in operation. It takes the test SVG, parses it using libsvgtiny and then uses the parsed output to write out an imagemagick mvg file.

The libsvgtiny processing uses the NetSurf DOM library, which in turn uses an expat binding to parse the SVG XML text. Processing this with AFL required instrumenting not only the SVG library but also the DOM library. I did not initially understand this and my first run resulted in a "map coverage" warning indicating an issue. Helpfully the AFL docs do cover this so it was straightforward to rectify.

Once the test program was written and the environment set up, an AFL run was started and left to run. The next day I was somewhat alarmed to discover the fuzzer had made almost no progress and was running very slowly. I asked for help on the AFL mailing list and got a polite and helpful response; basically, I needed to RTFM.

I must thank the members of the AFL mailing list for being so helpful and tolerating someone who ought to know better asking  dumb questions.

After reading the fine manual I understood I needed to ensure all my test cases were as small as possible and further that the fuzzer needed a dictionary as a hint to the file format because the text file was of such low data density compared to binary formats.

Rabbit of Caerbannog. Death awaits you with pointy teeth
I crafted an SVG dictionary based on the XML one, ensured all the seed SVG files were as small as possible and tried again. The immediate result was thousands of crashes, nothing like being savaged by a rabbit to cause a surprise.

Not being in possession of the appropriate Holy Hand Grenade, I resorted instead to GDB and Electric Fence. Unlike the bitmap library crashes, memory bounds issues simply did not feature here. Instead they mainly centered around actual logic errors when constructing and traversing the data structures.

For example Daniel Silverstone fixed an interesting bug where the XML parser binding would try and go "above" the root node in the tree if the source closed more tags than it opened which resulted in wild pointers and NULL references.

I found and squashed several others including dealing with SVG which has no valid root element and division by zero errors when things like colour gradients have no points.

I find it interesting that the type and texture of the crashes completely changed between the SVG and binary formats. Perhaps it is just the nature of textual formats that causes this, although it might be due to the techniques used to parse the formats.

Once all the immediately reproducible crashes were dealt with I performed a longer run. I used my monster system as previously described and ran the fuzzer for a whole week.

Summary stats

Fuzzers alive : 10
Total run time : 68 days, 7 hours
Total execs : 9268 million
Cumulative speed : 15698 execs/sec
Pending paths : 0 faves, 2501 total
Pending per fuzzer : 0 faves, 250 total (on average)
Crashes found : 9 locally unique

After burning almost seventy days of processor time AFL found me another nine crashes and possibly more importantly a test corpus that generates over 90% coverage.

A useful tool that AFL provides is afl-cmin. This reduces the number of test files in a corpus to only those that are required to exercise all the code paths reached by the test set. In this case it reduced the number of files from 8242 to 2612.

afl-cmin -i queue_all/ -o queue_cmin -- test_decode_svg @@ 1.0 /dev/null
corpus minimization tool for afl-fuzz by

[+] OK, 1447 tuples recorded.
[*] Obtaining traces for input files in 'queue_all/'...
Processing file 8242/8242...
[*] Sorting trace sets (this may take a while)...
[+] Found 23812 unique tuples across 8242 files.
[*] Finding best candidates for each tuple...
Processing file 8242/8242...
[*] Sorting candidate list (be patient)...
[*] Processing candidates and writing output files...
Processing tuple 23812/23812...
[+] Narrowed down to 2612 files, saved in 'queue_cmin'.

Additionally the actual information within the test files can be minimised with the afl-tmin tool. This must be run on each file individually and can take a relatively long time. Fortunately with GNU parallel one can run many of these jobs simultaneously which merely required another three days of CPU time to process. The resulting test corpus weighs in at a svelte 15 Megabytes or so against the 25 Megabytes before minimisation.

The result is yet another NetSurf library significantly improved by the use of AFL both from finding and squashing crashing bugs and from having a greatly improved test corpus to allow future library changes with a high confidence there will not be any regressions.

23 October, 2016 09:27PM by Vincent Sanders (noreply@blogger.com)

hackergotchi for Jaldhar Vyas

Jaldhar Vyas

What I Did During My Summer Vacation

That's So Raven

If I could sum up the past year in one word, that word would be distraction. There have been so many strange, confusing or simply unforeseen things going on that I have had trouble focusing like never before.

For instance, on the opposite side of the street from me is one of Jersey City's old reservoirs. It's not used for drinking water anymore and the city eventually plans on merging it into the park on the other side. In the meantime it has become something of a wildlife refuge. Which is nice, except that one of the newly settled critters was a bird of prey -- the consensus is possibly some kind of hawk or raven. Starting your morning commute under the eyes of a harbinger of death is very goth, and I even learned to deal with the occasional piece of deconstructed rodent on my doorstep, but nighttime was a big problem. For contrary to popular belief, ravens do not quoth "nevermore" but "KRRAAAA". Very loudly. Just as soon as you have drifted off to sleep. Eventually my sleep-deprived neighbors and I appealed to the NJ division of environmental protection to get it removed, but by the time they were ready to swing into action the bird had left for somewhere more congenial like Transylvania or Newark.

Or here are some more complete wastes of time: I go to the doctor for my annual physical. The insurance company codes it as Adult Onset Diabetes by accident. One day I open the lid of my laptop, there's a "ping" sound, and a piece of the hinge flies off. Apparently that also severed the connection to the screen, and naturally the warranty had just expired, so I had to spend the next month tethered to an external monitor until I could afford to buy a new one. Mix in all the usual social, political, family and work drama and you can see that it has been a very trying time for me.


I have managed to get some Debian work done. On Dovecot, my principal package, I have gotten tremendous support from Apollon Oikonomopolous who I belatedly welcome as a member of the Dovecot maintainer team. He has been particularly helpful in fixing our systemd support and cleaning out a lot of the old and invalid bugs. We're in pretty good shape for the freeze. Upstream has released an RC of 2.2.26 and hopefully the final version will be out in the next couple of days so we can include it in Stretch. We can always use more help with the package so let me know if you're interested.


Most of the action has been going on without me but I've been lending support and sponsoring whenever I can. We have several new DDs and DMs but still no one north of the Vindhyas I'm afraid.

Debian Perl Group

gregoa did a ping of inactive maintainers and I regretfully had to admit to myself that I wasn't going to be of use anytime soon so I resigned. Perl remains my favorite language and I've actually been more involved in the meetings of my local Perlmongers group so hopefully I will be back again one day. And I still maintain the Perl modules I wrote myself.


May have gained a recruit.

*Strictly speaking it should be called Debian-People-Who-Dont-Think-Faults-in-One-Moral-Domain-Such-As-For-Example-Axe-Murdering-Should-Leak-Into-Another-Moral-Domain-Such-As-For-Example-Debian but come on, that's just silly.

23 October, 2016 05:01AM

October 22, 2016

Ingo Juergensmann

Automatically update TLSA records on new Letsencrypt Certs

I've been using DNSSEC for quite some time now and it is working quite well. When LetsEncrypt went public beta I jumped on the train and migrated many services to LE-based TLS. However, there was still one small problem with LE certs:

When there is a new cert, all of the old TLSA resource records are no longer valid and might cause problems for clients doing strict DNSSEC validation. It took a while until my pain was big enough to finally fix it with some scripts.

There are at least two scripts involved:

1) dnssec.sh
This script does all of my DNSSEC handling. You can just do "dnssec.sh enable-dnssec domain.tld" and everything is configured so that you only need to copy the appropriate keys into the web interface of your DNS registrar.

host:~/bin# dnssec.sh
No parameter given.
Usage: dnsec.sh MODE DOMAIN

MODE can be one of the following:
enable-dnssec : perform all steps to enable DNSSEC for your domain
edit-zone     : safely edit your zone after enabling DNSSEC
create-dnskey : create new dnskey only
load-dnskey   : loads new dnskeys and signs the zone with them
show-ds       : shows DS records of zone
zoneadd-ds    : adds DS records to the zone file
show-dnskey   : extract DNSKEY record that needs to uploaded to your registrar
update-tlsa   : update TLSA records with new TLSA hash, needs old and new TLSA hashes as additional parameters

For updating zone files just do "dnssec.sh edit-zone domain.tld" to add new records and such, and the script will take care of, e.g., increasing the serial of the zone file. I find this very convenient, so I often use this script for non-DNSSEC-enabled domains as well.

However, you can spot the command line option "update-tlsa" above. This option needs the old and the new TLSA hashes in addition to the domain.tld parameter, and it is normally invoked from the second script:

2) check_tlsa.sh
This is a quite simple Bash script that parses domains.txt from the letsencrypt.sh script, looks up the old TLSA hash in the zone files (structured in TLD/domain.tld directories), compares the old hash with the new one (obtained by invoking tlsagen.sh) and, if the hashes differ, calls dnssec.sh with the proper parameters:

#!/bin/bash
# Update TLSA records for every certificate whose hash has changed.
set -e
for i in `cat /etc/letsencrypt.sh/domains.txt | awk '{print $1}'` ; do
        domain=`echo $i | awk 'BEGIN {FS="."} ; {print $(NF-1)"."$NF}'`
        #echo -n "Domain: $domain"
        TLD=`echo $i | awk 'BEGIN {FS="."}; {print $NF}'`
        #echo ", TLD: $TLD"
        OLDTLSA=`grep -i "in.*tlsa" /etc/bind/${TLD}/${domain} | grep ${i} | head -n 1 | awk '{print $NF}'`
        if [ -n "${OLDTLSA}" ] ; then
                #echo "--> ${OLDTLSA}"
                # Usage: tlsagen.sh cert.pem host[:port] usage selector mtype
                NEWTLSA=`/path/to/tlsagen.sh $LEPATH/certs/${i}/fullchain.pem ${i} 3 1 1 | awk '{print $NF}'`
                #echo "==> $NEWTLSA"
                if [ "${OLDTLSA}" != "${NEWTLSA}" ] ; then
                        /path/to/dnssec.sh update-tlsa ${domain} ${OLDTLSA} ${NEWTLSA} > /dev/null
                        echo "TLSA RR update for ${i}"
                fi
        fi
done
So, quite simple and obviously a quick hack. For sure someone else can write a cleaner and more sophisticated implementation to do the same stuff, but at least it works for me™. Use it at your own risk and do whatever you want with these scripts (they are in the public domain).
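
If you would rather compute the "3 1 1" hash without shelling out to tlsagen.sh, the same digest can be produced in a few lines of Python. This is only a sketch, assuming the python3-cryptography package; the certificate path is just an example:

# TLSA "3 1 1": usage 3 (DANE-EE), selector 1 (SubjectPublicKeyInfo), matching type 1 (SHA-256).
from cryptography import x509
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes, serialization

with open("/etc/letsencrypt.sh/certs/example.org/cert.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read(), default_backend())

spki = cert.public_key().public_bytes(
    serialization.Encoding.DER,
    serialization.PublicFormat.SubjectPublicKeyInfo,
)
digest = hashes.Hash(hashes.SHA256(), default_backend())
digest.update(spki)
print(digest.finalize().hex())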

You can invoke check_tlsa.sh right after your crontab call for letsencrypt.sh. In a more sophisticated setup it should be fairly easy to invoke these scripts from letsencrypt.sh post hooks as well.
Please find the files attached to this page (remove the .txt extension after saving, of course).


check_tlsa.sh.txt812 bytes
dnssec.sh.txt3.88 KB

22 October, 2016 10:29PM by ij

Matthieu Caneill

Debugging 101

While teaching a class on concurrent programming this semester, I realized during the labs that most of the students couldn't properly debug their code. They are at the end of a 2-year curriculum and know many different programming languages and frameworks, but when it comes to tracking down a bug in their own code, they often lack the basics. Instead of debugging for them, I tried to give them general directions that they could apply to the next bugs. I will try here to summarize the very first basic things to know about debugging. Because, remember, writing software is 90% debugging and 10% introducing new bugs (that is not from me, but I could not find the original quote).

So here is my take at Debugging 101.

Use the right tools

Many good tools exist to assist you in writing correct software, and it would put you behind in terms of productivity not to use them. Editors which catch syntax errors while you write them, for example, will help you a lot. And there are many features out there in editors, compilers, debuggers, which will prevent you from introducing trivial bugs. Your editor should be your friend; explore its features and customization options, and find an efficient workflow with them, that you like and can improve over time. The best way to fix bugs is not to have them in the first place, obviously.

Test early, test often

I've seen students writing code for an hour before running make, which would then fail so hard that hundreds of lines of errors and warnings were printed. There are two main reasons why doing this is a bad idea:

  • You have to debug all the errors at once, and the complexity of solving many bugs, some dependent on others, is way higher than the complexity of solving a single bug. Moreover, it's discouraging.
  • Wrong assumptions you made at the beginning will make the following lines of code wrong. For example if you chose the wrong data structure for storing some information, you will have to fix all the code using that structure. It's less painful to realize earlier it was the wrong one to choose, and you have more chances of knowing that if you compile and execute often.

I recommend testing your code (compilation and execution) every few lines of code you write. When something breaks, chances are it will come from the last line(s) you wrote. Compiler errors will be shorter, and will point you to the same place in the code. Once you get more confident using a particular language or framework, you can write more lines at once without testing. That's a slow process, but it's ok. If you set up the right keybinding for compiling and executing from within your editor, it shouldn't be painful to test early and often.
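One low-effort way to get into this habit is to let a file watcher rebuild for you on every save. A small sketch, assuming the inotify-tools package is installed; src/ and ./run_tests are made-up names:

# rebuild and re-test whenever a source file is saved (requires inotify-tools)
while inotifywait -qq -r -e close_write src/ ; do
    make && ./run_tests
done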

Read the logs

Spot the places where your program/compiler/debugger writes text, and read it carefully. It can be your terminal (quite often), a file in your current directory, a file in /var/log/, a web page on a local server, anything. Learn where different software write logs on your system, and integrate reading them in your workflow. Often, it will be your only information about the bug. Often, it will tell you where the bug lies. Sometimes, it will even give you hints on how to fix it.

You may have to filter out a lot of garbage to find relevant information about your bug. Learn to spot keywords like error or warning. In long stacktraces, spot the lines concerning your files; more often than not, your code is to blame rather than deeper library code. grep the logs with relevant keywords. If you have the option, colorize the output. Use tail -f to follow a file getting updated. There are so many ways to grasp logs, so find what works best for you and never forget to use it!
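For instance, two common one-liners (the file names are only placeholders):

# follow a log and only show lines that look like trouble
tail -f /var/log/syslog | grep --line-buffered -iE 'error|warn|segfault'

# jump to the first error in a long build log, with three lines of context
grep -in -m1 -A3 'error' build.log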

Print foobar

That one doesn't concern compilation errors (unless it's a Makefile error, in which case the Makefile is your code anyway).

When the program logs and output have failed to tell you where an error occurred (oh hi Segmentation fault!), and before having to dive into a memory debugger or system trace tool, spot the portion of your program that causes the bug and add some print statements there. You can either print("foo") and print("bar"), just to know whether your program reaches a certain place in your code or not, or print(some_faulty_var) to get more insight into your program state. It will give you precious information.

stderr >> "foo" >> endl;
my_db.connect(); // is this broken?
stderr >> "bar" >> endl;

In the example above, you can be sure it is the connection to the database my_db that is broken if you get foo and not bar on your standard error.

(That is a hypothetical example. If you know something can break, such as a database connection, then you should always enclose it in a try/catch structure.)

Isolate and reproduce the bug

This point is linked to the previous one. You may or may not have isolated the line(s) causing the bug, but maybe the issue is not always raised. It can depend on many other things: the program or function parameters, the network status, the amount of memory available, the decisions of the OS scheduler, the user rights on the system or on some files, etc. More generally, any assumption you made about any external dependency can turn out to be wrong (even if it's right 99% of the time). Depending on the context, try to isolate the set of conditions that trigger the bug. It can be as simple as "when there is no internet connection", or as complicated as "when the CPU load of some external machine is too high, it's a leap year, and the input contains illegal utf-8 characters" (ok, that one is fucked up; but it surely happens!). You need to be able to reproduce the bug reliably, in order to be sure later that you have indeed fixed it.

Of course when the bug is triggered on every run, it can be frustrating that your program never works, but it will generally be easier to fix.


Read the documentation

Always read the documentation before reaching out for help. Be it man, a book, a website or a wiki, you will find precious information there to assist you in using a language or a specific library. It can be quite intimidating at first, but it's often organized the same way. You're likely to find a search tool, an API reference, a tutorial, and many examples. Compare your code against them. Check the FAQ; maybe your bug and its solution are already referenced there.

You'll rapidly find yourself getting used to the way documentation is organized, and you'll be more and more efficient at finding instantly what you need. Always keep the doc window open!

Google and Stack Overflow are your friends

Let's be honest: many of the bugs you'll encounter have been encountered before. Learn to write efficient queries on search engines, and use the knowledge you can find on question & answer forums like Stack Overflow. Read the answers and comments. Be wise though, and never blindly copy and paste code from there. That can be as bad as introducing security issues into your code, and you won't learn anything. Oh, and don't copy and paste anyway. You have to be sure you understand every single line, so better to write them by hand; it's also better for memorizing the issue.

Take notes

Once you have identified and solved a particular bug, I advise you to write about it. No need for shiny interfaces: keep a list of your bugs along with their solutions in one or many text files, organized by language or framework, that you can easily grep.

It can seem slightly cumbersome to do so, but it proved (at least to me) to be very valuable. I can often recall I have encountered some buggy situation in the past, but don't always remember the solution. Instead of losing all the debugging time again, I search in my bug/solution list first, and when it's a hit I'm more than happy I kept it.

Further reading on debugging

Remember this was only Debugging 101, that is, the very first steps on how to debug code on your own, instead of getting frustrated and helplessly staring at your screen without knowing where to begin. As you write more software, you'll get used to more efficient workflows, and you'll discover tools that are there to assist you in writing bug-free code and spotting complex bugs efficiently. Listed below are some of the tools or general ideas used to debug more complex software. They belong more in a software engineering course than a Debugging 101 blog post. But it's good to know as soon as possible that these exist, and if you read the manuals there's no reason you can't rock with them!

  • Loggers. To make the "foobar" debugging more efficient, some libraries are especially designed for the task of logging out information about a running program. They often have way more features than a simple print statement (at the price of being over-engineered for simple programs): severity levels (info, warning, error, fatal, etc), output in rotating files, and many more.

  • Version control. Following the evolution of a program in time, over multiple versions, contributors and forks, is a hard task. That's where version control comes into play: it allows you to keep the entire history of your program, and switch to any previous version. This way you can identify more easily when a bug was introduced (and by whom), along with the patch (a set of changes to a code base) that introduced it. Then you know where to apply your fix. Famous version control tools include Git, Subversion, and Mercurial.

  • Debuggers. Last but not least, it wouldn't make sense to talk about debugging without mentioning debuggers. They are tools to inspect the state of a program (for example the type and value of variables) while it is running. You can pause the program, and execute it line by line, while watching the state evolve. Sometimes you can also manually change the value of variables to see what happens. Even though some of them are hard to use, they are very valuable tools, totally worth diving into!
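To give a taste of that last point, here is a hypothetical gdb session on a small C program (the program and variable names are made up):

gcc -g -O0 -o myprog myprog.c    # build with debug symbols, no optimization
gdb -q ./myprog
(gdb) run                        # run until it crashes
(gdb) backtrace                  # where did it crash?
(gdb) frame 1                    # move up to the calling frame
(gdb) print some_faulty_var      # inspect the program state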

Don't hesitate to comment on this, and provide your debugging 101 tips! I'll be happy to update the article with valuable feedback.

Happy debugging!

22 October, 2016 10:00PM

hackergotchi for Iain R. Learmonth

Iain R. Learmonth

The Domain Name System

As I posted yesterday, we released PATHspider 1.0.0. What I didn’t talk about in that post was an event that occurred only a few hours before the release.

Everything was going fine, proofreading of the documentation was in progress, a quick git push with the documentation updates and… CI FAILED!?! Our CI doesn’t build the documentation, only tests the core code. I’m planning to release real soon and something has broken.

Starting to panic.

irl@orbiter# ./tests.sh
Ran 16 tests in 0.984s


This makes no sense. Maybe I forgot to add a dependency and it’s been broken for a while? I scrutinise the dependencies list and it all looks fine.

In fairness, probably the first thing I should have done is look at the build log in Jenkins, but I’ve never had a failure that I couldn’t reproduce locally before.

It was at this point that I realised there was something screwy going on. A sigh of relief as I realise that there’s not a catastrophic test failure but now it looks like maybe there’s a problem with the University research group network, which is arguably worse.

Being focussed on getting the release ready, I didn’t realise that the Internet was falling apart. Unknown to me, a massive DDoS attack against Dyn, a major DNS host, was in progress. After a few attempts to debug the problem, I hardcoded a line for github.com into /etc/hosts, still believing it to be a localised issue.

I’ve just removed this line as the problem seems to have resolved itself for now. There are two main points I’ve taken away from this:

  • CI failure doesn’t necessarily mean that your code is broken, it can also indicate that your CI infrastructure is broken.
  • Decentralised internetwork routing is pretty worthless when the centralised name system goes down.

This afternoon I read a post by [tj] on the 57North Planet, and this is where I learnt what had really happened. He mentions multicast DNS and Namecoin as distributed name system alternatives. I’d like to add some more to that list:

Only the first of these is really a distributed solution.

My idea with ICMP Domain Name Messages is that you send an ICMP message towards a webserver. Somewhere along the path, you’ll hit either a surveillance or censorship middlebox. These middleboxes could provide value by caching any DNS replies they see, so that an ICMP DNS request message is not forwarded further; instead, the middlebox generates a reply providing the answer to the query. If the middlebox cannot generate a reply, it can still forward the request to other surveillance and censorship boxes.

I think this would be a great secondary use for the NSA and GCHQ boxen on the Internet; it clearly fits within the scope of “defending national security”, as the Internet is kinda dead if DNS is down, plus it’d make it nice and easy to find the boxes with PATHspider.

22 October, 2016 06:15PM

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

RcppArmadillo 0.7.500.0.0

armadillo image

A few days ago, Conrad released Armadillo 7.500.0. The corresponding RcppArmadillo release 0.7.500.0.0 is now on CRAN (and will get into Debian shortly).

Armadillo is a powerful and expressive C++ template library for linear algebra, aiming towards a good balance between speed and ease of use, with a syntax deliberately close to Matlab. RcppArmadillo integrates this library with the R environment and language, and is widely used by (currently) 274 other packages on CRAN.

Changes in this release relative to the previous CRAN release are as follows:

Changes in RcppArmadillo version 0.7.500.0.0 (2016-10-20)

  • Upgraded to Armadillo release 7.500.0 (Coup d'Etat)

    • Expanded qz() to optionally specify ordering of the Schur form

    • Expanded each_slice() to support matrix multiplication

Courtesy of CRANberries, there is a diffstat report. More detailed information is on the RcppArmadillo page. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

22 October, 2016 03:43PM

hackergotchi for Christoph Egger

Christoph Egger

Running Debian on the ClearFog

Back in August, I was looking for a Homeserver replacement. During FrOSCon I was then reminded of the Turris Omnia project by NIC.cz. The basic SoC (Marvell Armada 38x) seemed to be nice and have decent mainline support (and, with the Turris, users interested in keeping it working). Only I don't want any WiFi and I wasn't sure the standard case would be all that useful. Fortunately, there's also a simple board available with the same SoC called ClearFog, and so I got one of these (the Base version). With shipping and the SSD (the only 2242 M.2 SSD with 250 GiB I could find, an ADATA SP600) it slightly exceeds the budget but well.

ClearFog with SSD

When installing the machine, the obvious goal was to use mainline FOSS components only if possible. Fortunately there's mainline kernel support for the device as well as mainline U-Boot. First attempts to boot from a micro SD card did not work out at all, both with mainline U-Boot and the vendor version though. Turns out the eMMC version of the board does not support any micro SD cards at all, a fact that is documented but others failed to notice as well.


As the board does not come with any loader on eMMC and booting directly from M.2 requires removing some resistors from the board, the easiest way is using UART for booting. The vendor wiki has some shell script wrapping an included C fragment to feed U-Boot to the device, but all that is really needed is U-Boot's kwboot utility. For some reason the SPL didn't properly detect UART booting on my device (wrong magic number), but patching the if (in arch-mvebu's spl.c) to always assume UART boot is an easy way around it.

The plan then was to boot a Debian armhf rootfs with a defconfig kernel from a USB stick, and install U-Boot and the rootfs to eMMC from within that system. Unfortunately U-Boot seems to be unable to talk to the USB3 port, so no kernel loading from there. One could probably make UART loading work, but switching between screen for the serial console and xmodem seemed somewhat fragile and I never got it working. However ethernet can be made to work, though you need to set eth1addr to eth3addr (or just the right one of these) in U-Boot, saveenv and reboot. After that TFTP works (but is somewhat slow).
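For reference, the dance in the U-Boot console looks roughly like the sketch below; which of the ethNaddr variables you actually need depends on the port you use, so treat this only as an illustration:

=> printenv ethaddr eth1addr eth2addr eth3addr   # see which MAC addresses are set where
=> setenv eth3addr ${eth1addr}                   # copy the value to the variable the active port reads
=> saveenv
=> reset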


There's one last step required to allow U-Boot and Linux to access the eMMC. The eMMC is wired to the same pins as the SD card would be. However the SD card has an additional indicator pin showing whether a card is present. You might be lucky inserting a dummy card into the slot, or go the clean route and remove the pin specification from the device tree.

--- a/arch/arm/dts/armada-388-clearfog.dts
+++ b/arch/arm/dts/armada-388-clearfog.dts
@@ -306,7 +307,6 @@

                        sdhci@d8000 {
                                bus-width = <4>;
-                               cd-gpios = <&gpio0 20 GPIO_ACTIVE_LOW>;
                                pinctrl-0 = <&clearfog_sdhci_pins

Next up is flashing U-Boot to eMMC. This seems to work with the vendor U-Boot but proves to be tricky with mainline. The fun part boils down to the fact that the boot firmware reads the first block from eMMC, but the second from SD card. If you write the mainline U-Boot, which was written and tested for SD card, to eMMC, the SPL will try to load the main U-Boot starting from its second sector from flash -- obviously resulting in garbage. This one took me several tries to figure out and made me read most of the SPL code for the device. The fix however is trivial (apart from the question on how to support all different variants from one codebase, which I'll leave to the U-Boot developers):

--- a/include/configs/clearfog.h
+++ b/include/configs/clearfog.h
@@ -143,8 +143,7 @@
 #define CONFIG_SYS_MMC_U_BOOT_OFFS             (160 << 10)
-                                                + 1)
 #define CONFIG_SYS_U_BOOT_MAX_SIZE_SECTORS     ((512 << 10) / 512) /* 512KiB */
 #define CONFIG_FIXED_SDHCI_ALIGNED_BUFFER      0x00180000      /* in SDRAM */


Now we have a system booting from eMMC with mainline U-Boot (which is a most welcome speedup compared to the UART and TFTP combination from the beginning). Next is fine-tuning Linux on the device: we want to install the armmp Debian kernel and have it work. As all the drivers are built as modules for that kernel, this also means initrd support. Funnily enough, U-Boot's bootz allows booting a plain vmlinux kernel but I couldn't get it to boot a plain initrd. Passing a uImage initrd and a normal kernel however works pretty well. Back when I first tried, there were some modules missing and ethernet didn't work with the PHY driver built as a module. In the meantime the PHY problem was fixed in the Debian kernel and almost all modules were already added. Ben then only added the USB3 module on my suggestion, and as a result unstable's armhf armmp kernel should work perfectly well on the device (you still need to patch the device tree similar to the patch above). Still missing is an updated flash-kernel to automatically generate the initrd uImage, which is work in progress; it got stalled until I fixed the U-Boot-on-eMMC problem, but now everything should be fine -- maybe we can also get Debian U-Boot builds for that board.
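Until flash-kernel does it automatically, wrapping the Debian initrd into a uImage by hand is a single mkimage call; a sketch, where the kernel version is an assumption and the compression flag may need adjusting for your initrd:

mkimage -A arm -O linux -T ramdisk -C gzip -n "Debian initrd" \
        -d /boot/initrd.img-4.7.0-1-armmp /boot/uInitrd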

Pro versus Base

The main difference so far between the Pro and the Base version of the ClearFog is the switch chip which is included on the Pro. The Base instead "just" has two gigabit ethernet ports and an SFP. Both Linux's and U-Boot's device trees are intended for the Pro version, which makes one of the ethernet ports unusable (it tries to find the switch behind the ethernet port which isn't there). To get both ports working (or the one you settled on earlier), there's a second patch to the device tree (my version might be sub-optimal but works), here for U-Boot -- the Linux kernel version is a trivial adaptation:

--- a/arch/arm/dts/armada-388-clearfog.dts
+++ b/arch/arm/dts/armada-388-clearfog.dts
@@ -89,13 +89,10 @@
        internal-regs {
            ethernet@30000 {
                mac-address = [00 50 43 02 02 02];
+              managed = "in-band-status";
+              phy = <&phy1>;
                phy-mode = "sgmii";
                status = "okay";
-              fixed-link {
-                  speed = <1000>;
-                  full-duplex;
-              };

            ethernet@34000 {
@@ -227,6 +224,10 @@
                pinctrl-0 = <&mdio_pins>;
                pinctrl-names = "default";

+              phy1: ethernet-phy@1 { /* Marvell 88E1512 */
+                   reg = <1>;
+              };
                phy_dedicated: ethernet-phy@0 {
                     * Annoyingly, the marvell phy driver
@@ -386,62 +386,6 @@
        tx-fault-gpio = <&expander0 13 GPIO_ACTIVE_HIGH>;

-  dsa@0 {
-      compatible = "marvell,dsa";
-      dsa,ethernet = <&eth1>;
-      dsa,mii-bus = <&mdio>;
-      pinctrl-0 = <&clearfog_dsa0_clk_pins &clearfog_dsa0_pins>;
-      pinctrl-names = "default";
-      #address-cells = <2>;
-      #size-cells = <0>;
-      switch@0 {
-          #address-cells = <1>;
-          #size-cells = <0>;
-          reg = <4 0>;
-          port@0 {
-              reg = <0>;
-              label = "lan1";
-          };
-          port@1 {
-              reg = <1>;
-              label = "lan2";
-          };
-          port@2 {
-              reg = <2>;
-              label = "lan3";
-          };
-          port@3 {
-              reg = <3>;
-              label = "lan4";
-          };
-          port@4 {
-              reg = <4>;
-              label = "lan5";
-          };
-          port@5 {
-              reg = <5>;
-              label = "cpu";
-          };
-          port@6 {
-              /* 88E1512 external phy */
-              reg = <6>;
-              label = "lan6";
-              fixed-link {
-                  speed = <1000>;
-                  full-duplex;
-              };
-          };
-      };
-  };
    gpio-keys {
        compatible = "gpio-keys";
        pinctrl-0 = <&rear_button_pins>;


Apart from the mess with eMMC this seems to be a pretty nice device. It's now happily running with a M.2 SSD providing enough storage for now and still has a mSATA/mPCIe plug left for future journeys. It seems to be drawing around 5.5 Watts with SSD and one Ethernet connected while mostly idle and can feed around 500 Mb/s from disk over an encrypted ethernet connection which is, I guess, not too bad. My plans now include helping to finish flash-kernel support, creating a nice case and probably get it deployed. I might bring it to FOSDEM first though.

Working on it was really quite some fun (apart from the frustrating parts, like finding the one-block offset ..) and people were really helpful. Big thanks here to Debian's ARM folks, Ben Hutchings the kernel maintainer, and U-Boot upstream (especially Tom Rini and Stefan Roese).

22 October, 2016 10:37AM

hackergotchi for Matthew Garrett

Matthew Garrett

Fixing the IoT isn't going to be easy

A large part of the internet became inaccessible today after a botnet made up of IP cameras and digital video recorders was used to DoS a major DNS provider. This highlighted a bunch of things including how maybe having all your DNS handled by a single provider is not the best of plans, but in the long run there's no real amount of diversification that can fix this - malicious actors have control of a sufficiently large number of hosts that they could easily take out multiple providers simultaneously.

To fix this properly we need to get rid of the compromised systems. The question is how. Many of these devices are sold by resellers who have no resources to handle any kind of recall. The manufacturer may not have any kind of legal presence in many of the countries where their products are sold. There's no way anybody can compel a recall, and even if they could it probably wouldn't help. If I've paid a contractor to install a security camera in my office, and if I get a notification that my camera is being used to take down Twitter, what do I do? Pay someone to come and take the camera down again, wait for a fixed one and pay to get that put up? That's probably not going to happen. As long as the device carries on working, many users are going to ignore any voluntary request.

We're left with more aggressive remedies. If ISPs threaten to cut off customers who host compromised devices, we might get somewhere. But, inevitably, a number of small businesses and unskilled users will get cut off. Probably a large number. The economic damage is still going to be significant. And it doesn't necessarily help that much - if the US were to compel ISPs to do this, but nobody else did, public outcry would be massive, the botnet would not be much smaller and the attacks would continue. Do we start cutting off countries that fail to police their internet?

Ok, so maybe we just chalk this one up as a loss and have everyone build out enough infrastructure that we're able to withstand attacks from this botnet and take steps to ensure that nobody is ever able to build a bigger one. To do that, we'd need to ensure that all IoT devices are secure, all the time. So, uh, how do we do that?

These devices had trivial vulnerabilities in the form of hardcoded passwords and open telnet. It wouldn't take terribly strong skills to identify this at import time and block a shipment, so the "obvious" answer is to set up forces in customs who do a security analysis of each device. We'll ignore the fact that this would be a pretty huge set of people to keep up with the sheer quantity of crap being developed and skip straight to the explanation for why this wouldn't work.

Yeah, sure, this vulnerability was obvious. But what about the product from a well-known vendor that included a debug app listening on a high numbered UDP port that accepted a packet of the form "BackdoorPacketCmdLine_Req" and then executed the rest of the payload as root? A portscan's not going to show that up[1]. Finding this kind of thing involves pulling the device apart, dumping the firmware and reverse engineering the binaries. It typically takes me about a day to do that. Amazon has over 30,000 listings that match "IP camera" right now, so you're going to need 99 more of me and a year just to examine the cameras. And that's assuming nobody ships any new ones.

Even that's insufficient. Ok, with luck we've identified all the cases where the vendor has left an explicit backdoor in the code[2]. But these devices are still running software that's going to be full of bugs and which is almost certainly still vulnerable to at least half a dozen buffer overflows[3]. Who's going to audit that? All it takes is one attacker to find one flaw in one popular device line, and that's another botnet built.

If we can't stop the vulnerabilities getting into people's homes in the first place, can we at least fix them afterwards? From an economic perspective, demanding that vendors ship security updates whenever a vulnerability is discovered no matter how old the device is is just not going to work. Many of these vendors are small enough that it'd be more cost effective for them to simply fold the company and reopen under a new name than it would be to put the engineering work into fixing a decade old codebase. And how does this actually help? So far the attackers building these networks haven't been terribly competent. The first thing a competent attacker would do would be to silently disable the firmware update mechanism.

We can't easily fix the already broken devices, we can't easily stop more broken devices from being shipped and we can't easily guarantee that we can fix future devices that end up broken. The only solution I see working at all is to require ISPs to cut people off, and that's going to involve a great deal of pain. The harsh reality is that this is almost certainly just the tip of the iceberg, and things are going to get much worse before they get any better.

Right. I'm off to portscan another smart socket.

[1] UDP connection refused messages are typically ratelimited to one per second, so it'll take almost a day to do a full UDP portscan, and even then you have no idea what the service actually does.

[2] It's worth noting that this is usually leftover test or debug code, not an overtly malicious act. Vendors should have processes in place to ensure that this isn't left in release builds, but ah well.

[3] My vacuum cleaner crashes if I send certain malformed HTTP requests to the local API endpoint, which isn't a good sign


22 October, 2016 05:14AM

Russell Coker

Another Broken Nexus 5

In late 2013 I bought a Nexus 5 for my wife [1]. It’s a good phone and I generally have no complaints about the way it works. In the middle of 2016 I had to make a warranty claim when the original Nexus 5 stopped working [2]. Google’s warranty support was ok, the call-back was good but unfortunately there was some confusion which delayed replacement.

Once the confusion about the IMEI was resolved the warranty replacement method was to bill my credit card for a replacement phone and reverse the charge if/when they got the original phone back and found it to have a defect covered by warranty. This policy meant that I got a new phone sooner as they didn’t need to get the old phone first. This is a huge benefit for defects that don’t make the phone unusable as you will never be without a phone. Also if the user determines that the breakage was their fault they can just refrain from sending in the old phone.

Today my wife’s latest Nexus 5 developed a problem. It turned itself off and went into a reboot loop when connected to the charger. Also one of the clips on the rear case had popped out and other clips popped out when I pushed it back in. It appears (without opening the phone) that the battery may have grown larger (which is a common symptom of battery related problems). The phone is slightly less than 3 years old, so if I had got the extended warranty then I would have got a replacement.

Now I’m about to buy a Nexus 6P (because the Pixel is ridiculously expensive) which is $700 including postage. Kogan offers me a 3 year warranty for an extra $108. Obviously in retrospect spending an extra $100 would have been a benefit for the Nexus 5. But the first question is whether new phone going to have a probability greater than 1/7 of failing due to something other than user error in years 2 and 3? For an extended warranty to provide any benefit the phone has to have a problem that doesn’t occur in the first year (or a problem in a replacement phone after the first phone was replaced). The phone also has to not be lost, stolen, or dropped in a pool by it’s owner. While my wife and I have a good record of not losing or breaking phones the probability of it happening isn’t zero.

The Nexus 5 that just died can be replaced for 2/3 of the original price. The value of the old Nexus 5 to me is less than 2/3 of the original price as buying a newer better phone is the option I want. The value of an old phone to me decreases faster than the replacement cost because I don’t want to buy an old phone.

For an extended warranty to be a good deal for me I think it would have to cost significantly less than 1/10 of the purchase price due to the low probability of failure in that time period and the decreasing value of a replacement outdated phone. So even though my last choice to skip an extended warranty ended up not paying out I expect that overall I will be financially ahead if I keep self-insuring, and I’m sure that I have already saved money by self-insuring all my previous devices.

22 October, 2016 04:56AM by etbe

October 21, 2016

hackergotchi for Iain R. Learmonth

Iain R. Learmonth

PATHspider 1.0.0 released!

In today’s Internet we see an increasing deployment of middleboxes. While middleboxes provide in-network functionality that is necessary to keep networks manageable and economically viable, any packet mangling — whether essential for the needed functionality or accidental as an unwanted side effect — makes it more and more difficult to deploy new protocols or extensions of existing protocols.

For the evolution of the protocol stack, it is important to know which network impairments exist and potentially need to be worked around. While classical network measurement tools are often focused on absolute performance values, PATHspider performs A/B testing between two different protocols or different protocol extensions to perform controlled experiments of protocol-dependent connectivity problems as well as differential treatment.

PATHspider is a framework for performing and analyzing these measurements, while the actual A/B test can be easily customized. Late on the 21st October, we released version 1.0.0 of PATHspider and it’s ready for “production” use (whatever that means for Internet research software).

Our first real release of PATHspider was version 0.9.0 just in time for the presentation of PATHspider at the 2016 Applied Networking Research Workshop co-located with IETF 96 in Berlin earlier this year. Since this release we have made a lot of changes and I’ll talk about some of the highlights here (in no particular order):

Switch from twisted.plugin to straight.plugin

While we anticipate that some plugins may wish to use some features of Twisted, we didn’t want to have Twisted as a core dependency for PATHspider. We found that straight.plugin was not just a drop-in replacement but it simplified the way in which 3rd-party plugins could be developed and it was worth the effort for that alone.

Library functions for the Observer

PATHspider has an embedded flow-meter (think something like NetFlow but highly customisable). We found that even with the small selection of plugins that we had we were duplicating code across plugins for these customisations of the flow-meter. In this release we now provide library functions for common needs such as identifying TCP 3-way handshake completions or identifying ICMP Unreachable messages for flows.

New plugin: DSCP

We’ve added a new plugin for this release to detect breakage when using DiffServ code points to achieve differentiated services within a network.

Plugins are now subcommands

Using the subparsers feature of argparse, all plugins including 3rd-party plugins will now appear as subcommands to the PATHspider command. This makes every plugin a first-class citizen and makes PATHspider truly generalised.

We have an added benefit from this that plugins can also ask for extra arguments that are specific to the needs of the plugin, for example the DSCP plugin allows the user to select which code point to send for the experimental test.

Test Suite

PATHspider now has a test suite! As the size of the PATHspider code base grows we need to be able to make changes and have confidence that we are not breaking code that another module relies on. We have so far only achieved 54% coverage of the codebase but we hope to improve this for the next release. We have focussed on the critical portions of data collection to ensure that all the results collected by PATHspider during experiments is valid.

DNS Resolver Utility

Back when PATHspider was known as ECNSpider, it had a utility for resolving IP addresses from the Alexa top 1 million list. This utility has now been fully integrated into PATHspider and appears as a plugin to allow for repeated experiments against the same IP addresses, even if the DNS resolver would have returned a different address.


Documentation

Documentation is definitely not my favourite activity, but it has to be done. PATHspider 1.0.0 now ships with documentation covering command-line usage, input/output formats and development of new plugins.

If you’d like to check out PATHspider, you can find the website at https://pathspider.net/.

Debian packages will be appearing shortly and will find their way into stable-backports within the next 2 weeks (hopefully).

Current development of PATHspider is supported by the European Union’s Horizon 2020 project MAMI. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 688421. The opinions expressed and arguments employed reflect only the authors’ view. The European Commission is not responsible for any use that may be made of that information.

21 October, 2016 11:46PM

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

anytime 0.0.4: New features and fixes

A brand-new release of anytime is now on CRAN following the three earlier releases since mid-September. anytime aims to convert anything in integer, numeric, character, factor, ordered, ... format to POSIXct (or Date) objects -- and does so without requiring a format string. See the anytime page for a few examples.

With release 0.0.4, we add two nice new features. First, NA, NaN and Inf are now simply skipped (similar to what the corresponding Base R functions do). Second, we now also accept large numeric values so that, e.g., anytime(as.numeric(Sys.time())) also works, effectively adding another input type. We also squashed an issue reported by the 'undefined behaviour' sanitizer, and widened the test for when we try to deploy the gettz package to get missing timezone information.

A quick example of the new features:

anydate(c(NA, NaN, Inf, as.numeric(as.POSIXct("2016-09-01 10:11:12"))))
[1] NA           NA           NA           "2016-09-01"

The NEWS file summarises the release:

Changes in anytime version 0.0.4 (2016-10-20)

  • Before converting via lexical_cast, assign to atomic type via template logic to avoid an UBSAN issue (PR #15 closing issue #14)

  • More robust initialization and timezone information gathering.

  • More robust processing of non-finite input also coping with non-finite values such as NA, NaN and Inf which all return NA

  • Allow numeric POSIXt representation on input, also creating proper POSIXct (or, if requested, Date)

Courtesy of CRANberries, there is a comparison to the previous release. More information is on the anytime page.

For questions or comments use the issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

21 October, 2016 02:25AM

October 20, 2016

hackergotchi for Kees Cook

Kees Cook


My prior post showed my research from earlier in the year at the 2016 Linux Security Summit on kernel security flaw lifetimes. Now that CVE-2016-5195 is public, here are updated graphs and statistics. Due to their rarity, the Critical bug average has now jumped from 3.3 years to 5.2 years. There aren’t many, but, as I mentioned, they still exist, whether you know about them or not. CVE-2016-5195 was sitting on everyone’s machine when I gave my LSS talk, and there are still other flaws on all our Linux machines right now. (And, I should note, this problem is not unique to Linux.) Dealing with knowing that there are always going to be bugs present requires proactive kernel self-protection (to minimize the effects of possible flaws) and vendors dedicated to updating their devices regularly and quickly (to keep the exposure window minimized once a flaw is widely known).

So, here are the graphs updated for the 668 CVEs known today:

  • Critical: 3 @ 5.2 years average
  • High: 44 @ 6.2 years average
  • Medium: 404 @ 5.3 years average
  • Low: 216 @ 5.5 years average

© 2016, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

20 October, 2016 11:02PM by kees

Héctor Orón Martínez

Build a Debian package against Debian 8.0 using Download On Demand (DoD) service

In the previous post the Open Build Service software architecture was overviewed. In this blog post, a tutorial on setting up a package build with OBS from Debian packages is presented.


  • Generate a test environment by creating Stretch/SID VM
  • Enable experimental repository
  • Install OBS server, api, worker and osc CLI packages
  • Ensure all OBS services are running
  • Create an OBS project for Download on Demand (DoD)
  • Create an OBS project linked to DoD
  • Adding a package to the project
  • Troubleshooting OBS

Generate a test environment by creating Stretch/SID VM

Really, use whatever suits you best, but please create an untrusted test environment for this one.

This tutorial assumes “$hostname” is “stretch”, running either the stretch or the sid suite.

Be aware that copying & pasting configuration files from this post might give you broken characters (e.g. typographic quotes such as “).

Debian Stretch weekly netinst CD

Enable experimental repository

# echo "deb http://httpredir.debian.org/debian experimental main" >> /etc/apt/sources.list.d/experimental.list
# apt-get update

Install and setup OBS server, api, worker and osc CLI packages

# apt-get install obs-server obs-api obs-worker osc

In the install process a MySQL database is needed, therefore if the MySQL server is not set up yet, a password needs to be provided.
When the OBS API database ‘obs-api’ is created, we need to pick a password for it; provide “opensuse”. The ‘obs-api’ package will configure the apache2 HTTPS webserver (creating a dummy certificate for “stretch”) to serve the OBS web UI.
Add “stretch” and “obs” aliases to the “localhost” entry in your /etc/hosts file.
Enable the worker by setting ENABLED=1 in /etc/default/obsworker (see the snippet after these steps).
Try to connect to the web UI at https://stretch/
Log into the OBS web UI (default login credentials: Admin/opensuse).
From the command line tool, try to list the projects in OBS:

 $ osc -A https://stretch ls

Accept the dummy certificate and provide credentials (defaults: Admin/opensuse).
If the install proceeded as expected, continue to the next step.
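For the worker step above, a minimal sketch of enabling and restarting it from the shell (the service name matches the journalctl call further down):

# switch ENABLED to 1 and restart the worker
sudo sed -i 's/^ENABLED=.*/ENABLED=1/' /etc/default/obsworker
sudo systemctl restart obsworker.service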

Ensure all OBS services are running

# backend services
obsrun     813  0.0  0.9 104960 20448 ?        Ss   08:33   0:03 /usr/bin/perl -w /usr/lib/obs/server/bs_dodup
obsrun     815  0.0  1.5 157512 31940 ?        Ss   08:33   0:07 /usr/bin/perl -w /usr/lib/obs/server/bs_repserver
obsrun    1295  0.0  1.6 157644 32960 ?        S    08:34   0:07  \_ /usr/bin/perl -w /usr/lib/obs/server/bs_repserver
obsrun     816  0.0  1.8 167972 38600 ?        Ss   08:33   0:08 /usr/bin/perl -w /usr/lib/obs/server/bs_srcserver
obsrun    1296  0.0  1.8 168100 38864 ?        S    08:34   0:09  \_ /usr/bin/perl -w /usr/lib/obs/server/bs_srcserver
memcache   817  0.0  0.6 346964 12872 ?        Ssl  08:33   0:11 /usr/bin/memcached -m 64 -p 11211 -u memcache -l
obsrun     818  0.1  0.5  78548 11884 ?        Ss   08:33   0:41 /usr/bin/perl -w /usr/lib/obs/server/bs_dispatch
obsserv+   819  0.0  0.3  77516  7196 ?        Ss   08:33   0:05 /usr/bin/perl -w /usr/lib/obs/server/bs_service
mysql      851  0.0  0.0   4284  1324 ?        Ss   08:33   0:00 /bin/sh /usr/bin/mysqld_safe
mysql     1239  0.2  6.3 1010744 130104 ?      Sl   08:33   1:31  \_ /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --log-error=/var/log/mysql/error.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306

# web services
root      1452  0.0  0.1 110020  3968 ?        Ss   08:34   0:01 /usr/sbin/apache2 -k start
root      1454  0.0  0.1 435992  3496 ?        Ssl  08:34   0:00  \_ Passenger watchdog
root      1460  0.3  0.2 651044  5188 ?        Sl   08:34   1:46  |   \_ Passenger core
nobody    1465  0.0  0.1 444572  3312 ?        Sl   08:34   0:00  |   \_ Passenger ust-router
www-data  1476  0.0  0.1 855892  2608 ?        Sl   08:34   0:09  \_ /usr/sbin/apache2 -k start
www-data  1477  0.0  0.1 856068  2880 ?        Sl   08:34   0:09  \_ /usr/sbin/apache2 -k start
www-data  1761  0.0  4.9 426868 102040 ?       Sl   08:34   0:29 delayed_job.0
www-data  1767  0.0  4.8 425624 99888 ?        Sl   08:34   0:30 delayed_job.1
www-data  1775  0.0  4.9 426516 101708 ?       Sl   08:34   0:28 delayed_job.2
nobody    1788  0.0  5.7 496092 117480 ?       Sl   08:34   0:03 Passenger RubyApp: /usr/share/obs/api
nobody    1796  0.0  4.9 488888 102176 ?       Sl   08:34   0:00 Passenger RubyApp: /usr/share/obs/api
www-data  1814  0.0  4.5 282576 92376 ?        Sl   08:34   0:22 delayed_job.1000
www-data  1829  0.0  4.4 282684 92228 ?        Sl   08:34   0:22 delayed_job.1010
www-data  1841  0.0  4.5 282932 92536 ?        Sl   08:34   0:22 delayed_job.1020
www-data  1855  0.0  4.9 427988 101492 ?       Sl   08:34   0:29 delayed_job.1030
www-data  1865  0.2  5.0 492500 102964 ?       Sl   08:34   1:09 clockworkd.clock
www-data  1899  0.0  0.0  87100  1400 ?        S    08:34   0:00 /usr/bin/searchd --pidfile --config /usr/share/obs/api/config/production.sphinx.conf
www-data  1900  0.1  0.4 161620  8276 ?        Sl   08:34   0:51  \_ /usr/bin/searchd --pidfile --config /usr/share/obs/api/config/production.sphinx.conf

# OBS worker
root      1604  0.0  0.0  28116  1492 ?        Ss   08:34   0:00 SCREEN -m -d -c /srv/obs/run/worker/boot/screenrc
root      1605  0.0  0.9  75424 18764 pts/0    Ss+  08:34   0:06  \_ /usr/bin/perl -w ./bs_worker --hardstatus --root /srv/obs/worker/root_1 --statedir /srv/obs/run/worker/1 --id stretch:1 --reposerver http://obs:5252 --jobs 1

Create an OBS project for Download on Demand (DoD)

Create a meta project file:

$ osc -A https://stretch:443 meta prj Debian:8 -e

<project name="Debian:8">
  <title>Debian 8 DoD</title>
  <description>Debian 8 DoD</description>
  <person userid="Admin" role="maintainer"/>
  <repository name="main">
    <download arch="x86_64" url="http://deb.debian.org/debian/jessie/main" repotype="deb"/>
  </repository>
</project>

Visit webUI to check project configuration

Create a meta project configuration file:

$ osc -A https://stretch:443 meta prjconf Debian:8 -e

Add the following file, as found at build.opensuse.org

Repotype: debian

# create initial user
Preinstall: base-passwd
Preinstall: user-setup

# required for preinstall images
Preinstall: perl

# preinstall essentials + dependencies
Preinstall: base-files base-passwd bash bsdutils coreutils dash debconf
Preinstall: debianutils diffutils dpkg e2fslibs e2fsprogs findutils gawk
Preinstall: gcc-4.9-base grep gzip hostname initscripts insserv libacl1
Preinstall: libattr1 libblkid1 libbz2-1.0 libc-bin libc6 libcomerr2 libdb5.3
Preinstall: libgcc1 liblzma5 libmount1 libncurses5 libpam-modules
Preinstall: libpcre3 libsmartcols1
Preinstall: libpam-modules-bin libpam-runtime libpam0g libreadline6
Preinstall: libselinux1 libsemanage-common libsemanage1 libsepol1 libsigsegv2
Preinstall: libslang2 libss2 libtinfo5 libustr-1.0-1 libuuid1 login lsb-base
Preinstall: mount multiarch-support ncurses-base ncurses-bin passwd perl-base
Preinstall: readline-common sed sensible-utils sysv-rc sysvinit sysvinit-utils
Preinstall: tar tzdata util-linux zlib1g

Runscripts: base-passwd user-setup base-files gawk

VMinstall: libdevmapper1.02.1

Order: user-setup:base-files

# Essential packages (this should also pull the dependencies)
Support: base-files base-passwd bash bsdutils coreutils dash debianutils
Support: diffutils dpkg e2fsprogs findutils grep gzip hostname libc-bin 
Support: login mount ncurses-base ncurses-bin perl-base sed sysvinit 
Support: sysvinit-utils tar util-linux

# Build-essentials
Required: build-essential
Prefer: build-essential:make

# build script needs fakeroot
Support: fakeroot
# lintian support would be nice, but breaks too much atm
#Support: lintian

# helper tools in the chroot
Support: less kmod net-tools procps psmisc strace vim

# everything below same as for Debian:6.0 (apart from the version macros ofc)

# circular dependendencies in openjdk stack
Order: openjdk-6-jre-lib:openjdk-6-jre-headless
Order: openjdk-6-jre-headless:ca-certificates-java

Keep: binutils cpp cracklib file findutils gawk gcc gcc-ada gcc-c++
Keep: gzip libada libstdc++ libunwind
Keep: libunwind-devel libzio make mktemp pam-devel pam-modules
Keep: patch perl rcs timezone

Prefer: cvs libesd0 libfam0 libfam-dev expect

Prefer: gawk locales default-jdk
Prefer: xorg-x11-libs libpng fam mozilla mozilla-nss xorg-x11-Mesa
Prefer: unixODBC libsoup glitz java-1_4_2-sun gnome-panel
Prefer: desktop-data-SuSE gnome2-SuSE mono-nunit gecko-sharp2
Prefer: apache2-prefork openmotif-libs ghostscript-mini gtk-sharp
Prefer: glib-sharp libzypp-zmd-backend mDNSResponder

Prefer: -libgcc-mainline -libstdc++-mainline -gcc-mainline-c++
Prefer: -libgcj-mainline -viewperf -compat -compat-openssl097g
Prefer: -zmd -OpenOffice_org -pam-laus -libgcc-tree-ssa -busybox-links
Prefer: -crossover-office -libgnutls11-dev

# alternative pkg-config implementation
Prefer: -pkgconf
Prefer: -openrc
Prefer: -file-rc

Conflict: ghostscript-library:ghostscript-mini

Ignore: sysvinit:initscripts

Ignore: aaa_base:aaa_skel,suse-release,logrotate,ash,mingetty,distribution-release
Ignore: gettext-devel:libgcj,libstdc++-devel
Ignore: pwdutils:openslp
Ignore: pam-modules:resmgr
Ignore: rpm:suse-build-key,build-key
Ignore: bind-utils:bind-libs
Ignore: alsa:dialog,pciutils
Ignore: portmap:syslogd
Ignore: fontconfig:freetype2
Ignore: fontconfig-devel:freetype2-devel
Ignore: xorg-x11-libs:freetype2
Ignore: xorg-x11:x11-tools,resmgr,xkeyboard-config,xorg-x11-Mesa,libusb,freetype2,libjpeg,libpng
Ignore: apache2:logrotate
Ignore: arts:alsa,audiofile,resmgr,libogg,libvorbis
Ignore: kdelibs3:alsa,arts,pcre,OpenEXR,aspell,cups-libs,mDNSResponder,krb5,libjasper
Ignore: kdelibs3-devel:libvorbis-devel
Ignore: kdebase3:kdebase3-ksysguardd,OpenEXR,dbus-1,dbus-1-qt,hal,powersave,openslp,libusb
Ignore: kdebase3-SuSE:release-notes
Ignore: jack:alsa,libsndfile
Ignore: libxml2-devel:readline-devel
Ignore: gnome-vfs2:gnome-mime-data,desktop-file-utils,cdparanoia,dbus-1,dbus-1-glib,krb5,hal,libsmbclient,fam,file_alteration
Ignore: libgda:file_alteration
Ignore: gnutls:lzo,libopencdk
Ignore: gnutls-devel:lzo-devel,libopencdk-devel
Ignore: pango:cairo,glitz,libpixman,libpng
Ignore: pango-devel:cairo-devel
Ignore: cairo-devel:libpixman-devel
Ignore: libgnomeprint:libgnomecups
Ignore: libgnomeprintui:libgnomecups
Ignore: orbit2:libidl
Ignore: orbit2-devel:libidl,libidl-devel,indent
Ignore: qt3:libmng
Ignore: qt-sql:qt_database_plugin
Ignore: gtk2:libpng,libtiff
Ignore: libgnomecanvas-devel:glib-devel
Ignore: libgnomeui:gnome-icon-theme,shared-mime-info
Ignore: scrollkeeper:docbook_4,sgml-skel
Ignore: gnome-desktop:libgnomesu,startup-notification
Ignore: python-devel:python-tk
Ignore: gnome-pilot:gnome-panel
Ignore: gnome-panel:control-center2
Ignore: gnome-menus:kdebase3
Ignore: gnome-main-menu:rug
Ignore: libbonoboui:gnome-desktop
Ignore: postfix:pcre
Ignore: docbook_4:iso_ent,sgml-skel,xmlcharent
Ignore: control-center2:nautilus,evolution-data-server,gnome-menus,gstreamer-plugins,gstreamer,metacity,mozilla-nspr,mozilla,libxklavier,gnome-desktop,startup-notification
Ignore: docbook-xsl-stylesheets:xmlcharent
Ignore: liby2util-devel:libstdc++-devel,openssl-devel
Ignore: yast2:yast2-ncurses,yast2-theme-SuSELinux,perl-Config-Crontab,yast2-xml,SuSEfirewall2
Ignore: yast2-core:netcat,hwinfo,wireless-tools,sysfsutils
Ignore: yast2-core-devel:libxcrypt-devel,hwinfo-devel,blocxx-devel,sysfsutils,libstdc++-devel
Ignore: yast2-packagemanager-devel:rpm-devel,curl-devel,openssl-devel
Ignore: yast2-devtools:perl-XML-Writer,libxslt,pkgconfig
Ignore: yast2-installation:yast2-update,yast2-mouse,yast2-country,yast2-bootloader,yast2-packager,yast2-network,yast2-online-update,yast2-users,release-notes,autoyast2-installation
Ignore: yast2-bootloader:bootloader-theme
Ignore: yast2-packager:yast2-x11
Ignore: yast2-x11:sax2-libsax-perl
Ignore: openslp-devel:openssl-devel
Ignore: java-1_4_2-sun:xorg-x11-libs
Ignore: java-1_4_2-sun-devel:xorg-x11-libs
Ignore: kernel-um:xorg-x11-libs
Ignore: tetex:xorg-x11-libs,expat,fontconfig,freetype2,libjpeg,libpng,ghostscript-x11,xaw3d,gd,dialog,ed
Ignore: yast2-country:yast2-trans-stats
Ignore: susehelp:susehelp_lang,suse_help_viewer
Ignore: mailx:smtp_daemon
Ignore: cron:smtp_daemon
Ignore: hotplug:syslog
Ignore: pcmcia:syslog
Ignore: avalon-logkit:servlet
Ignore: jython:servlet
Ignore: ispell:ispell_dictionary,ispell_english_dictionary
Ignore: aspell:aspel_dictionary,aspell_dictionary
Ignore: smartlink-softmodem:kernel,kernel-nongpl
Ignore: OpenOffice_org-de:myspell-german-dictionary
Ignore: mediawiki:php-session,php-gettext,php-zlib,php-mysql,mod_php_any
Ignore: squirrelmail:mod_php_any,php-session,php-gettext,php-iconv,php-mbstring,php-openssl

Ignore: simias:mono(log4net)
Ignore: zmd:mono(log4net)
Ignore: horde:mod_php_any,php-gettext,php-mcrypt,php-imap,php-pear-log,php-pear,php-session,php
Ignore: xerces-j2:xml-commons-apis,xml-commons-resolver
Ignore: xdg-menu:desktop-data
Ignore: nessus-libraries:nessus-core
Ignore: evolution:yelp
Ignore: mono-tools:mono(gconf-sharp),mono(glade-sharp),mono(gnome-sharp),mono(gtkhtml-sharp),mono(atk-sharp),mono(gdk-sharp),mono(glib-sharp),mono(gtk-sharp),mono(pango-sharp)
Ignore: gecko-sharp2:mono(glib-sharp),mono(gtk-sharp)
Ignore: vcdimager:libcdio.so.6,libcdio.so.6(CDIO_6),libiso9660.so.4,libiso9660.so.4(ISO9660_4)
Ignore: libcdio:libcddb.so.2
Ignore: gnome-libs:libgnomeui
Ignore: nautilus:gnome-themes
Ignore: gnome-panel:gnome-themes
Ignore: gnome-panel:tomboy

Substitute: utempter

%ifnarch s390 s390x ppc ia64
Substitute: java2-devel-packages java-1_4_2-sun-devel
 %ifnarch s390x
Substitute: java2-devel-packages java-1_4_2-ibm-devel
 %endif
 %ifarch s390x
Substitute: java2-devel-packages java-1_4_2-ibm-devel xorg-x11-libs-32bit
 %endif
%endif

Substitute: yast2-devel-packages docbook-xsl-stylesheets doxygen libxslt perl-XML-Writer popt-devel sgml-skel update-desktop-files yast2 yast2-devtools yast2-packagemanager-devel yast2-perl-bindings yast2-testsuite

# SUSE compat mappings
Substitute: gcc-c++ gcc
Substitute: libsigc++2-devel libsigc++-2.0-dev
Substitute: glibc-devel-32bit
Substitute: pkgconfig pkg-config

%ifarch %ix86
Substitute: kernel-binary-packages kernel-default kernel-smp kernel-bigsmp kernel-debug kernel-um kernel-xen kernel-kdump
%endif
%ifarch ia64
Substitute: kernel-binary-packages kernel-default kernel-debug
%endif
%ifarch x86_64
Substitute: kernel-binary-packages kernel-default kernel-smp kernel-xen kernel-kdump
%endif
%ifarch ppc
Substitute: kernel-binary-packages kernel-default kernel-kdump kernel-ppc64 kernel-iseries64
%endif
%ifarch ppc64
Substitute: kernel-binary-packages kernel-ppc64 kernel-iseries64
%endif
%ifarch s390
Substitute: kernel-binary-packages kernel-s390
%endif
%ifarch s390x
Substitute: kernel-binary-packages kernel-default
%endif

%define debian_version 800

Macros:
%debian_version 800

Visit webUI to check project configuration

Create an OBS project linked to DoD

$ osc -A https://stretch:443 meta prj test -e

<project name="test">
  <person userid="Admin" role="maintainer"/>
  <repository name="Debian_8.0">
    <path project="Debian:8" repository="main"/>
  </repository>
</project>

Visit webUI to check project configuration

Adding a package to the project

$ osc -A https://stretch:443 co test ; cd test
$ mkdir hello ; cd hello ; apt-get source -d hello ; cd - ; 
$ osc add hello 
$ osc ci -m "New import" hello

The package should go into the “dispatched” state, then into “blocked” while it downloads its build dependencies via the DoD link, and eventually it should start building. Please check the journal logs if something goes wrong or gets stuck.

Visit webUI to check hello package build state
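The build state can also be followed from the command line; a small sketch using the repository name from the project meta above:

$ osc -A https://stretch:443 results test hello
$ cd test/hello && osc -A https://stretch:443 buildlog Debian_8.0 x86_64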

Troubleshooting OBS

OBS logging to the journal

Check in the journal logs everything went fine:

$ sudo journalctl -u obsdispatcher.service -u obsdodup.service -u obsscheduler@x86_64.service -u obsworker.service -u obspublisher.service


Currently we are facing a few issues with the web UI.

And there are more issues that have not been reported, please do ‘reportbug obs-api‘.

20 October, 2016 07:58AM by zumbi

hackergotchi for Daniel Pocock

Daniel Pocock

Choosing smartcards, readers and hardware for the Outreachy project

One of the projects proposed for this round of Outreachy is the PGP / PKI Clean Room live image.

Interns, and anybody who decides to start using the project (it is already functional for command line users) need to decide about purchasing various pieces of hardware, including a smart card, a smart card reader and a suitably secure computer to run the clean room image. It may also be desirable to purchase some additional accessories, such as a hardware random number generator.

If you have any specific suggestions for hardware or can help arrange any donations of hardware for Outreachy interns, please come and join us in the pki-clean-room mailing list or consider adding ideas on the PGP / PKI clean room wiki.

Choice of smart card

For standard PGP use, the OpenPGP card provides a good choice.

For X.509 use cases, such as VPN access, there are a range of choices. I recently obtained one of the SmartCard HSM cards; Card Contact were kind enough to provide me with a free sample. An interesting feature of this card is Elliptic Curve (ECC) support. More potential cards are listed on the OpenSC page here.

Choice of card reader

The technical factors to consider are most easily explained with a table:

                                       On disk                            Smartcard reader without PIN-pad     Smartcard reader with PIN-pad
Software                               Free/open                          Mostly free/open                     Proprietary firmware in reader
Key extraction                         Possible                           Not generally possible               Not generally possible
Passphrase compromise attack vectors   Hardware or software keyloggers,   Hardware or software keyloggers,     Exploiting firmware bugs over USB
                                       phishing, user error               phishing, user error                 (only sophisticated attackers)
                                       (unsophisticated attackers)        (unsophisticated attackers)
Other factors                          No hardware                        Small, USB key form-factor           Largest form factor

Some are shortlisted on the GnuPG wiki and there has been recent discussion of that list on the GnuPG-users mailing list.
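Whichever reader you pick, it is worth checking that the system actually sees it before blaming the card; a quick sketch using common tools (the Debian package names are an assumption about your setup):

$ sudo apt-get install pcscd pcsc-tools opensc
$ pcsc_scan                    # should list the reader and react when a card is inserted
$ opensc-tool --list-readers
$ gpg --card-status            # for OpenPGP cards, prints the card details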

Choice of computer to run the clean room environment

There are a wide array of devices to choose from. Here are some principles that come to mind:

  • Prefer devices without any built-in wireless communications interfaces, or where those interfaces can be removed
  • Even better if there is no wired networking either
  • Particularly concerned users may also want to avoid devices with opaque micro-code/firmware
  • Small devices (laptops) that can be stored away easily in a locked cabinet or safe to prevent tampering
  • No hard disks required
  • Having built-in SD card readers or the ability to add them easily

SD cards and SD card readers

The SD cards are used to store the master private key, used to sign the certificates/keys on the smart cards. Multiple copies are kept.

It is a good idea to use SD cards from different vendors, preferably not manufactured in the same batch, to minimize the risk that they all fail at the same time.

For convenience, it would be desirable to use a multi-card reader, although the software experience will be much the same if lots of individual card readers or USB flash drives are used.

Other devices

One additional idea that comes to mind is a hardware random number generator (TRNG), such as the FST-01.

Can you help with ideas or donations?

If you have any specific suggestions for hardware or can help arrange any donations of hardware for Outreachy interns, please come and join us in the pki-clean-room mailing list or consider adding ideas on the PGP / PKI clean room wiki.

20 October, 2016 07:25AM by Daniel.Pocock

October 19, 2016

hackergotchi for Pau Garcia i Quiles

Pau Garcia i Quiles

FOSDEM Desktops DevRoom 2017 Call for Participation

FOSDEM is one of the largest (5,000+ hackers!) gatherings of Free Software contributors in the world and happens each February in Brussels (Belgium, Europe).

Once again, one of the tracks will be the Desktops DevRoom (formerly known as “CrossDesktop DevRoom”), which will host Desktop-related talks.

We are now inviting proposals for talks about Free/Libre/Open-source Software on the topics of Desktop development, Desktop applications and interoperability amongst Desktop Environments. This is a unique opportunity to show novel ideas and developments to a wide technical audience.

Topics accepted include, but are not limited to:

  • Open Desktops: Gnome, KDE, Unity, Enlightenment, XFCE, Razor, MATE, Cinnamon, ReactOS, CDE etc
  • Closed desktops: Windows, Mac OS X, MorphOS, etc (when talking about a FLOSS topic)
  • Software development for the desktop
  • Development tools
  • Applications that enhance desktops
  • General desktop matters
  • Cross-platform software development
  • Web
  • Thin clients, desktop virtualization, etc

Talks can be very specific, such as the advantages/disadvantages of distributing a desktop application with snap vs flatpak, or as general as using HTML5 technologies to develop native applications.

Topics that are of interest to the users and developers of all desktop environments are especially welcome. The FOSDEM 2016 schedule might give you some inspiration.


Please include the following information when submitting a proposal:

  • Your name
  • The title of your talk (please be descriptive, as it will be listed alongside around 400 others from other projects)
  • Short abstract of one or two paragraphs
  • Short bio (with photo)
  • Requested time: from 15 to 45 minutes. Normal duration is 30 minutes. Longer duration requests must be properly justified. You may be assigned LESS time than you request.

How to submit

All submissions are made in the Pentabarf event planning tool: https://penta.fosdem.org/submission/FOSDEM17

To submit your talk, click on “Create Event”, then make sure to select the “Desktops” devroom as the “Track”. Otherwise your talk will not even be considered for any devroom at all.

If you already have a Pentabarf account from a previous year, even if your talk was not accepted, please reuse it. Create an account if, and only if, you don’t have one from a previous year. If you have any issues with Pentabarf, please contact desktops-devroom@lists.fosdem.org.


The deadline for submissions is December 5th 2016.

FOSDEM will be held on the weekend of 4 & 5 February 2017 and the Desktops DevRoom will take place on Sunday, February 5th 2017.

We will contact every submitter with a “yes” or “no” before December 11th 2016.

Recording permission

The talks in the Desktops DevRoom will be audio and video recorded, and possibly streamed live too.

In the “Submission notes” field, please indicate that you agree that your presentation will be licensed under the CC-By-SA-4.0 or CC-By-4.0 license and that you agree to have your presentation recorded. For example:

“If my presentation is accepted for FOSDEM, I hereby agree to license all recordings, slides, and other associated materials under the Creative Commons Attribution Share-Alike 4.0 International License. Sincerely, <NAME>.”

If you want us to stop the recording in the Q & A part (should you have one), please tell us. We can do that but only for the Q & A part.

More information

The official communication channel for the Desktops DevRoom is its mailing list desktops-devroom@lists.fosdem.org.

Use this page to manage your subscription: https://lists.fosdem.org/listinfo/desktops-devroom


The Desktops DevRoom 2017 is managed by a team representing the most notable open desktops:

  • Pau Garcia i Quiles, KDE
  • Christophe Fergeau, Gnome
  • Michael Zanetti, Unity
  • Philippe Caseiro, Enlightenment
  • Jérome Leclanche, Razor

If you want to join the team, please contact desktops-devroom@lists.fosdem.org

19 October, 2016 11:41PM by pgquiles

Héctor Orón Martínez

Open Build Service in Debian needs YOU! ☞

“Open Build Service is a generic system to build and distribute packages from sources in an automatic, consistent and reproducible way.”


The openSUSE distributions’ build system is based on a generic framework named Open Build Service (OBS). I have been using these tools in my work environment, and I have to say, as a Debian developer, that it is a great tool. In this blog post I want to teach you the very basics of the tool and provide you with a tutorial to get, at least, a Debian package building.


Fig 1 – Open Build Service Architecture

The figure above shows the Open Build Service (from now on, OBS) software architecture. There are several parts which we should differentiate:

  • Web UI / API (obs-api)
  • Backend (obs-server)
  • Build daemon / worker (obs-worker)
  • CLI tool to manage API (osc)

Each one of the above packages can be installed on separate machines as a distributed architecture; it is very easy to split the system into several machines running the services. However, in the tutorial below everything is installed on one machine.
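
As a rough sketch (the package names are taken from the list above; adjust to your own setup), an all-in-one installation on a single machine could look like this:

$ sudo apt-get install obs-api obs-server obs-worker osc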


The backend is composed of several scripts written either in shell or Perl. There are several services running in the backend:

  • Source service
  • Repository service
  • Scheduler service
  • Dispatcher service
  • Warden service
  • Publisher service
  • Signer service
  • DoD service

The backend manages source packages (in any format, such as RPM, DEB, …) and schedules them for a build on a worker. Once the package is built it can be published in a repository for the wider audience, or kept unpublished and used by other builds.


The system can have several worker machines, which are in charge of performing the package builds. There are different options that can be configured (see /etc/default/obsworker), such as the enabling switch, the number of worker instances and the jobs per instance. This part of the system is written in shell and/or Perl.


The frontend allows you to get at most of the options OBS provides in a clickable way: setting up projects, uploading/branching/deleting packages, submitting review requests, etc. As an example, you can see a live instance running at https://build.opensuse.org/

The frontend parts are really a Ruby on Rails web application. We (mainly thanks to Andrew Lee, with help from the Ruby team) have tried to get it running nicely, however we have had lots of issues due to JavaScript or Rubygems malfunctioning. The current web UI is visible and provides some package status, however most actions do not work properly, or configurations cannot be applied as the editor does not save changes, and projects or packages in a project are not listed either. If you are a Ruby on Rails expert, or if you are able to help us out with some of the web UI issues we have in Debian, that would be really appreciated from our side.


osc is the management command line tool, written in Python, that interfaces with the OBS API to perform actions, edit configurations, do package reviews, etc.


Now that we have done a general overview of the system, let me introduce you to OBS with a practical tutorial.

TUTORIAL: Build a Debian package against Debian 8.0 using the Download On Demand (DoD) service.

19 October, 2016 07:13PM by zumbi

hackergotchi for Raphaël Hertzog

Raphaël Hertzog

Freexian’s report about Debian Long Term Support, September 2016

Like each month, here comes a report about the work of paid contributors to Debian LTS.

Individual reports

In September, about 152 work hours have been dispatched among 13 paid contributors. Their reports are available:

  • Balint Reczey did 15 hours (out of 12.25 hours allocated + 7.25 remaining, thus keeping 4.5 extra hours for October).
  • Ben Hutchings did 6 hours (out of 12.3 hours allocated + 1.45 remaining, he gave back 7h and thus keeps 9.75 extra hours for October).
  • Brian May did 12.25 hours.
  • Chris Lamb did 12.75 hours (out of 12.30 hours allocated + 0.45 hours remaining).
  • Emilio Pozuelo Monfort did 1 hour (out of 12.3 hours allocated + 2.95 remaining) and gave back the unused hours.
  • Guido Günther did 6 hours (out of 7h allocated, thus keeping 1 extra hour for October).
  • Hugo Lefeuvre did 12 hours.
  • Jonas Meurer did 8 hours (out of 9 hours allocated, thus keeping 1 extra hour for October).
  • Markus Koschany did 12.25 hours.
  • Ola Lundqvist did 11 hours (out of 12.25 hours assigned thus keeping 1.25 extra hours).
  • Raphaël Hertzog did 12.25 hours.
  • Roberto C. Sanchez did 14 hours (out of 12.25h allocated + 3.75h remaining, thus keeping 2 extra hours).
  • Thorsten Alteholz did 12.25 hours.

Evolution of the situation

The number of sponsored hours reached 172 hours per month thanks to maxcluster GmbH joining as silver sponsor and RHX Srl joining as bronze sponsor.

We only need a couple of supplementary sponsors now to reach our objective of funding the equivalent of a full time position.

The security tracker currently lists 39 packages with a known CVE and the dla-needed.txt file lists 34. It’s a small bump compared to last month, but almost all issues are assigned to someone.

Thanks to our sponsors

New sponsors are in bold.


19 October, 2016 10:29AM by Raphaël Hertzog

hackergotchi for Kees Cook

Kees Cook

Security bug lifetime

In several of my recent presentations, I’ve discussed the lifetime of security flaws in the Linux kernel. Jon Corbet did an analysis in 2010, and found that security bugs appeared to have roughly a 5 year lifetime. As in, the flaw gets introduced in a Linux release, and then goes unnoticed by upstream developers until another release 5 years later, on average. I updated this research for 2011 through 2016, and used the Ubuntu Security Team’s CVE Tracker to assist in the process. The Ubuntu kernel team already does the hard work of trying to identify when flaws were introduced in the kernel, so I didn’t have to re-do this for the 557 kernel CVEs since 2011.

As the README details, the raw CVE data is spread across the active/, retired/, and ignored/ directories. By scanning through the CVE files to find any that contain the line “Patches_linux:”, I can extract the details on when a flaw was introduced and when it was fixed. For example CVE-2016-0728 shows:

 break-fix: 3a50597de8635cd05133bd12c95681c82fe7b878 23567fd052a9abb6d67fe8e7a9ccdd9800a540f2

This means that CVE-2016-0728 is believed to have been introduced by commit 3a50597de8635cd05133bd12c95681c82fe7b878 and fixed by commit 23567fd052a9abb6d67fe8e7a9ccdd9800a540f2. If there are multiple lines, then there may be multiple SHAs identified as contributing to the flaw or the fix. And a “-” is just short-hand for the start of Linux git history.

Then for each SHA, I queried git to find its corresponding release, and made a mapping of release version to release date, wrote out the raw data, and rendered graphs. Each vertical line shows a given CVE from when it was introduced to when it was fixed. Red is “Critical”, orange is “High”, blue is “Medium”, and black is “Low”:
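
As a rough sketch of that step (not the exact scripts used for this analysis), git describe can map a commit SHA to the first tag that contains it; for example, with the introducing commit from the CVE-2016-0728 example above:

$ git describe --contains 3a50597de8635cd05133bd12c95681c82fe7b878   # run inside a Linux kernel checkout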

CVE lifetimes 2011-2016

And here it is zoomed in to just Critical and High:

Critical and High CVE lifetimes 2011-2016

The line in the middle is the date from which I started the CVE search (2011). The vertical axis is actually linear time, but it’s labeled with kernel releases (which are pretty regular). The numerical summary is:

  • Critical: 2 @ 3.3 years
  • High: 34 @ 6.4 years
  • Medium: 334 @ 5.2 years
  • Low: 186 @ 5.0 years

This comes out to roughly 5 years lifetime again, so not much has changed from Jon’s 2010 analysis.

While we’re getting better at fixing bugs, we’re also adding more bugs. And for many devices that have been built on a given kernel version, there haven’t been frequent (or sometimes any) security updates, so the bug lifetime for those devices is even longer. To really create a safe kernel, we need to get proactive about self-protection technologies. The systems using a Linux kernel are right now running with security flaws. Those flaws are just not known to the developers yet, but they’re likely known to attackers, as there have been prior boasts/gray-market advertisements for at least CVE-2010-3081 and CVE-2013-2888.

(Edit: see my updated graphs that include CVE-2016-5195.)

© 2016, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

19 October, 2016 04:46AM by kees

hackergotchi for Michal Čihař

Michal Čihař

Gammu 1.37.90

Yesterday Gammu 1.37.90 was released. This release brings quite a lot of changes and is meant for testing purposes. Hopefully a stable 1.38.0 will follow soon, provided I don't get negative feedback on the changes.

Besides code changes, there is one piece of news for Windows users: there is a Windows binary coming with the release. This was possible to automate thanks to AppVeyor, which provides a CI service where you can download built artifacts. Without this, I'd not be able to make this happen, as I don't have a single Windows computer :-).

Full list of changes:

  • Improved support for Huawei K3770.
  • API changes in some parameter types.
  • Fixed various Windows compilation issues.
  • Fixed several resource leaks.
  • Create outbox SMS atomically in FILES backend.
  • Removed getlocation command as we no longer fit into their usage policy.
  • Fixed call diverts on TP-LINK MA260.
  • Initial support for Oracle database.
  • Removed unused daemons, pbk and pbk_groups tables from the SMSD schema.
  • SMSD outbox entries now can have priority set in the database.
  • Added SIM IMSI to the SMSD status table.
  • Added CheckNetwork directive.
  • SMSD attempts to power on radio if disabled.
  • Fixed processing of AT unsolicited responses in some cases.
  • Fixed parsing USSD responses from some devices.

Would you like to see more features in Gammu? You can support further Gammu development at Bountysource salt or by direct donation.


19 October, 2016 04:00AM

Reproducible builds folks

Reproducible Builds: week 77 in Stretch cycle

What happened in the Reproducible Builds effort between Sunday October 9 and Saturday October 15 2016:

Media coverage

  • despinosa wrote a blog post on Vala and reproducibility
  • h01ger and lynxis gave a talk called "From Reproducible Debian builds to Reproducible OpenWrt, LEDE" (video, slides) at the OpenWrt Summit 2016 held in Berlin, together with ELCE, held by the Linux Foundation.
  • A discussion on debian-devel@ resulted in a nice quotable comment from Paul Wise: "(Reproducible) builds from source (with continuous rechecking) is the only way to have enough confidence that a Debian user has the freedoms promised to them by the Debian social contract."
  • Chris Lamb will present a talk at Software Freedom Kosovo on reproducible builds on Saturday 22nd October.

Documentation update

After discussions with HW42, Steven Chamberlain, Vagrant Cascadian, Daniel Shahaf, Christopher Berg, Daniel Kahn Gillmor and others, Ximin Luo has started writing up more concrete and detailed design plans for setting SOURCE_ROOT_DIR for reproducible debugging symbols, buildinfo security semantics and buildinfo security infrastructure.

Toolchain development and fixes

Dmitry Shachnev noted that our patch for #831779 has been temporarily rejected by docutils upstream; we are trying to persuade them again.

Tony Mancill uploaded javatools/0.59 to unstable containing original patch by Chris Lamb. This fixed an issue where documentation Recommends: substvars would not be reproducible.

Ximin Luo filed bug 77985 to GCC as a pre-requisite for future patches to make debugging symbols reproducible.

Packages reviewed and fixed, and bugs filed

The following updated packages have become reproducible - in our current test setup - after being fixed:

The following updated packages appear to be reproducible now, for reasons we were not able to figure out. (Relevant changelogs did not mention reproducible builds.)

  • aodh/3.0.0-2 by Thomas Goirand.
  • eog-plugins/3.16.5-1 by Michael Biebl.
  • flam3/3.0.1-5 by Daniele Adriana Goulart Lopes.
  • hyphy/2.2.7+dfsg-1 by Andreas Tille.
  • libbson/1.4.1-1 by A. Jesse Jiryu Davis.
  • libmongoc/1.4.1-1 by A. Jesse Jiryu Davis.
  • lxc/1:2.0.5-1 by Evgeni Golov.
  • spice-gtk/0.33-1 by Liang Guo.
  • spice-vdagent/0.17.0-1 by Liang Guo.
  • tnef/1.4.12-1 by Kevin Coyner.

Some uploads have addressed some reproducibility issues, but not all of them:

Some uploads have addressed nearly all reproducibility issues, except for build path issues:

Patches submitted that have not made their way to the archive yet:

Reviews of unreproducible packages

101 package reviews have been added, 49 have been updated and 4 have been removed in this week, adding to our knowledge about identified issues.

3 issue types have been updated:

Weekly QA work

During reproducibility testing, some FTBFS bugs have been detected and reported by:

  • Anders Kaseorg (1)
  • Chris Lamb (18)



  • h01ger has turned off the "Scheduled in testing+unstable+experimental" regular IRC notifications and turned them into emails to those running jenkins.d.n.
  • Re-add opi2a armhf node and 3 new builder jobs for a total of 60 build jobs for armhf. (h01ger and vagrant)
  • vagrant suggested adding a variation of init systems affecting the build, and h01ger added it to the TODO list.
  • Steven Chamberlain submitted a patch so that now all buildinfo files are collected (unsigned yet) at submit@buildinfo.kfreebsd.eu.
  • Holger enabled CPU type variation (Intel Haswell or AMD Opteron 62xx) for i386. Thanks to Profitbricks.com for their great and continued support!


  • Increase memory on the 2 build nodes from 12 to 16gb, thanks to profitbricks.com


We are running a poll to find a good time for an IRC meeting.

This week's edition was written by Ximin Luo, Holger Levsen & Chris Lamb and reviewed by a bunch of Reproducible Builds folks on IRC.

19 October, 2016 12:02AM

October 18, 2016

Enrico Zini

debtags and aptitude forget-new

I like to regularly go through the new packages section in aptitude to see what interesting new packages entered testing, but recently that joyful moment got less joyful for me because of a barrage of obscurely named packages.

I have just realised that aptitude forget-new supports search patterns, and that brought back the joy.

I put this in a script that I run before looking for new packages in aptitude:

aptitude forget-new '?tag(field::biology)
                   | ?tag(devel::lang:ruby)
                   | ?tag(devel::lang:perl)
                   | ?tag(role::shared-lib)
                   | ?tag(suite::openstack)
                   | ?tag(implemented-in::php)
                   | ~n^node-'

The actual content of the search pattern is purely a matter of taste.

I'm happy to see how debtags becomes quite useful here, to keep my own user experience manageable as the size of Debian keeps growing.

Update: pabs suggested to use apt post-invoke hooks. For example:

        $ cat /etc/apt/apt.conf.d/99forget-new
        APT::Update::Post-Invoke { "aptitude forget-new '~sdebug'"; };

18 October, 2016 08:25AM

hackergotchi for MJ Ray

MJ Ray

Rinse and repeat

Forgive me, reader, for I have sinned. It has been over a year since my last blog post. Life got busy. Paid work. Another round of challenges managing my chronic illness. Cycle campaigning. Fun bike rides. Friends. Family. Travels. Other social media to stroke. I’m still reading some of the planets where this blog post should appear and commenting on some, so I’ve not felt completely cut off, but I am surprised how many people don’t allow comments on their blogs any more (or make it too difficult for me with reCaptcha and the like).

The main motive for this post is to test some minor upgrades, though. Hi everyone. How’s it going with you? I’ll probably keep posting short updates in the future.

Go in peace to love and serve the web. 🙂

18 October, 2016 04:28AM by mjr

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

gettz 0.0.2

Release 0.0.2 of gettz is now on CRAN.

gettz provides a possible fallback in situations where Sys.timezone() fails to determine the system timezone. That can happen when e.g. the file /etc/localtime somehow is not a link into the corresponding file with zoneinfo data in, say, /usr/share/zoneinfo.

Windows is now no longer excluded, though it doesn't do anything useful yet. The main use of the package is still for Linux.

For questions or comments use the issue tracker off the GitHub repo.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

18 October, 2016 02:16AM

October 17, 2016

Russ Allbery

pgpcontrol 2.5

pgpcontrol is the collection of the original signing and verification scripts that David Lawrence wrote (in Perl) for verification of Usenet control messages. I took over maintenance of it, with a few other things, but haven't really done much with it. It would benefit a lot from an overhaul of both the documentation and the code, and turning it into a more normal Perl module and supporting scripts.

This release is none of those things. It's just pure housekeeping, picking up changes made by other people (mostly Julien ÉLIE) to the copies of the scripts in INN and making a few minor URL tweaks. But I figured I may as well, rather than distribute old versions of the scripts.

You can tell how little I've done with this stuff by noting that they don't even have a distribution page on my web site. The canonical distribution site is ftp.isc.org, although I'm not sure if that site will pick up the new release. (This relies on a chain of rsync commands that have been moved multiple times since the last time I pushed the release button, and I suspect that has broken.) I'll ping someone about possibly fixing that; in the meantime, you can find the files on archives.eyrie.org.

17 October, 2016 11:36PM

Arturo Borrero González

nftables in Debian Stretch


The next Debian stable release is codenamed Stretch, which I would expect to be released in less than a year.

The Netfilter Project has been developing nftables for years now, and the status of the framework has been improved to a good point: it’s ready for wide usage and adoption, even in high-demand production environments.

The last released version of nft was 0.6, and the Debian package was updated just a day after Netfilter announced it.

With the 0.6 version the software framework reached a good state of maturity, and I myself encourage users to migrate from iptables to nftables.

In case you don’t know about nftables yet, here is a quick reference:

  • it’s the tool/framework meant to replace iptables (also ip6tables, arptables and ebtables)
  • it integrates advanced structures which allow you to arrange your ruleset for optimal performance
  • the whole system is more configurable than in iptables
  • the syntax is much better than in iptables
  • several actions in a single rule
  • simplified IPv4/IPv6 dual stack
  • fewer kernel updates required
  • great support for incremental, dynamic and atomic ruleset updates

To run nftables in Debian Stretch you need several components:

  1. nft: the command line interface
  2. libnftnl: the nftables-netlink library
  3. linux kernel: at least 4.7 is recommended

A simple aptitude run will put your system ready to go, out of the box, with nftables:

root@debian:~# aptitude install nftables

Once installed, you can start using the nft command:

root@debian:~# nft list ruleset

A good starting point is to copy a simple workstation firewall configuration:

root@debian:~# cp /usr/share/doc/nftables/examples/syntax/workstation /etc/nftables.conf

And load it:

root@debian:~# nft -f /etc/nftables.conf

Your nftables ruleset is now firewalling your network:

root@debian:~# nft list ruleset
table inet filter {
        chain input {
                type filter hook input priority 0;
                iif lo accept
                ct state established,related accept
                ip6 nexthdr icmpv6 icmpv6 type { nd-neighbor-solicit,  nd-router-advert, nd-neighbor-advert } accept
                counter drop
        }
}

Several examples can be found at /usr/share/doc/nftables/examples/.

A simple systemd service is included to load your ruleset at boot time, which is disabled by default.
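
To load the ruleset at boot, that service just needs to be enabled; a minimal example (the unit shipped by the Debian package is called nftables.service):

root@debian:~# systemctl enable nftables.service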

If you are running Debian Jessie and want to give it a try, you can use nftables from jessie-backports.
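
A hedged example of how that installation might look, assuming jessie-backports is already enabled in your APT sources:

root@debian:~# apt-get install -t jessie-backports nftables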

If you want to migrate your ruleset from iptables to nftables, good news. There are some tools in place to help in the task of translating from iptables to nftables, but that doesn’t belong to this post :-)


The nano editor includes nft syntax highlighting. What are you waiting for to use nftables?

17 October, 2016 01:30PM

hackergotchi for Thomas Lange

Thomas Lange

FAI 5.2 is going to the cloud

The newest version of FAI, the Fully Automatic Installation tool set, now supports creating disk images for virtual machines or for your cloud environment.

The new command fai-diskimage uses the normal FAI process for building disk images in different formats. An image with a small set of packages can be created in less than 50 seconds, a Debian XFCE desktop in nearly two minutes, and a complete Ubuntu 16.04 desktop image in four minutes.
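
As an illustration only (the options and class names below are assumptions from memory, so check the fai-diskimage man page before use), building a small raw image might look like:

$ sudo fai-diskimage -u cloudhost -S 2G -c DEBIAN,FAIBASE cloudhost.raw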

New FAI installation images for CD and USB stick are also available.

Update: Add link to announcement


17 October, 2016 11:51AM

hackergotchi for Jaldhar Vyas

Jaldhar Vyas

Something Else Will Be Posted Soon Also.

Yikes today was Sharad Purnima which means there is about two weeks to go before Diwali and I haven't written anything here all year.

OK new challenge: write 7 substantive blog posts before Diwali. Can I manage to do it? Let's see...

17 October, 2016 06:07AM

Russell Coker

Improving Memory

I’ve just attended a lecture about improving memory, mostly about mnemonic techniques. I’m not against learning techniques to improve memory and I think it’s good to teach kids a variety of things many of which won’t be needed when they are younger as you never know which kids will need various skills. But I disagree with the assertion that we are losing valuable skills due to “digital amnesia”.

Nowadays we have programs to check spelling so we can avoid the effort of remembering to spell difficult words like mnemonic, calendar apps on our phones that link to addresses and phone numbers, and the ability to Google the world’s knowledge from the bathroom. So the question is, what do we need to remember?

For remembering phone numbers it seems that all we need is to remember numbers that we might call in the event of a mobile phone being lost or running out of battery charge. That would be a close friend or relative and maybe a taxi company (and 13CABS isn’t difficult to remember).

Remembering addresses (street numbers etc) doesn't seem very useful in any situation. Remembering the way to get to a place is useful and it seems to me that the way the navigation programs operate works against this. To remember a route you would want to travel the same way on multiple occasions and use a relatively simple route. The way that Google maps tends to give the more confusing routes (i.e. routes varying by the day and routes which take all shortcuts) works against this.

I think that spending time improving memory skills is useful, but it will either take time away from learning other skills that are more useful to most people nowadays or take time away from leisure activities. If improving memory skills is fun for you then it’s probably better than most hobbies (it’s cheap and provides some minor benefits in life).

When I was in primary school it was considered important to make kids memorise their “times tables”. I’m sure that memorising the multiplication of all numbers less than 13 is useful to some people, but I never felt a need to do it. When I was young I could multiply any pair of 2 digit numbers as quickly as most kids could remember the result. The big difference was that most kids needed a calculator to multiply any number by 13 which is a significant disadvantage.

What We Must Memorise

Nowadays the biggest memory issue is with passwords (the Correct Horse Battery Staple XKCD comic is worth reading [1]). Teaching mnemonic techniques for the purpose of memorising passwords would probably be a good idea – and would probably get more interest from the audience.

One interesting corner-case of passwords is ATM PIN numbers. The Wikipedia page about PIN numbers states that 4-12 digits can be used for PINs [2]. The 4 digit PIN was initially chosen because John Adrian Shepherd-Barron (who is credited with inventing the ATM) was convinced by his wife that 6 digits would be too difficult to memorise. The fact that hardly any banks outside Switzerland use more than 4 digits suggests that Mrs Shepherd-Barron had a point. The fact that this was decided in the 60’s proves that it’s not “digital amnesia”.

We also have to memorise how to use various supposedly user-friendly programs. If you observe an iPhone or Mac being used by someone who hasn’t used one before it becomes obvious that they really aren’t so user friendly and users need to memorise many operations. This is not a criticism of Apple, some tasks are inherently complex and require some complexity of the user interface. The limitations of the basic UI facilities become more obvious when there are operations like palm-swiping the screen for a screen-shot and a double-tap plus drag for a 1 finger zoom on Android.

What else do we need to memorise?

17 October, 2016 04:20AM by etbe

October 16, 2016

hackergotchi for Thomas Goirand

Thomas Goirand

Released OpenStack Newton, Moving OpenStack packages to upstream Gerrit CI/CD

OpenStack Newton is released, and uploaded to Sid

OpenStack Newton was released on Thursday the 6th of October. I was able to upload nearly all of it before the week-end, though there were still a few hiccups, as I forgot to upload python-fixtures 3.0.0 to unstable, and only realized it thanks to some bug reports. As this is a build time dependency, it didn’t disrupt Sid users too much, but 38 packages wouldn’t build without it. Thanks to Santiago Vila for pointing at the issue here.

As of writing, a lot of the Newton packages haven’t migrated to Testing yet. It’s been migrating in a very messy way. I’d love to improve this process, but I’m not sure how, other than filing RC bugs against 250 packages (which would be painful to do) so they would migrate at once. Suggestions welcome.

Bye bye Jenkins

For a few years, I was using Jenkins, together with a post-receive hook to build Debian Stable backports of OpenStack packages. Though nearly a year and a half ago, we had that project to build the packages within the OpenStack infrastructure, and use the CI/CD like OpenStack upstream was doing. This is done, and Jenkins is gone, as of OpenStack Newton.

Current status

As of August, almost all of the packages’ Git repositories were uploaded to OpenStack Gerrit, and the builds now happen in the OpenStack infrastructure. We’ve been able to build all of the OpenStack Newton Debian packages using this system. This non-official jessie backports repository has also been validated using Tempest.

Goodies from Gerrit and upstream CI/CD

It is very nice to have it built this way, so we will be able to maintain a full CI/CD in upstream infrastructure using Newton for the life of Stretch, which means we will have the tools to test security patches virtually forever. Another thing is that now, anyone can propose packaging patches without the need for an Alioth account, by sending a patch for review through Gerrit. It is our hope that this will increase the likelihood of external contributions, for example from 3rd party plugin vendors (networking driver vendors, for example), or from upstream contributors themselves. They are already used to Gerrit, and they all expected the packaging to work this way. They are all very much welcome.

The upstream infra: nodepool, zuul and friends

The OpenStack infrastructure has been described already on planet.debian.org by Ian Wienand, so I won’t describe it again; he did a better job than I ever would.

How it works

All source packages are stored in Gerrit with the “deb-” prefix. This is in order to avoid conflicts with upstream code, and to easily locate packaging repositories. For example, you’ll find the Nova packaging under https://git.openstack.org/cgit/openstack/deb-nova. Two Debian repositories are stored in the infrastructure AFS (Andrew File System, which means a copy of each repository exists on every cloud where we have compute resources): one for the actual deb-* builds, under “jessie-newton”, and one for the automatic backports, maintained in the deb-auto-backports Gerrit repository.

We’re using a “git tag” based workflow. Every Gerrit repository contains all of the upstream branches, plus a “debian/newton” branch, which contains the same content as a tag of upstream, plus the debian folder. The orig tarball is generated using “git archive”, then used by sbuild to produce binaries. To package a new upstream release, one simply needs to “git merge -X theirs FOO” (where FOO is the tag you want to merge), then edit debian/changelog so that the Debian package version matches the tag, then do “git commit -a --amend”, and simply “git review”. At this point, the OpenStack CI will build the package. If it builds correctly, then a core reviewer can approve the “merge commit”, the patch is merged, then the package is built and the binary package published on the OpenStack Debian package repository.
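
To make the workflow concrete, here is a hedged sketch of the commands described above (the tag and version strings are made up for illustration):

$ git checkout debian/newton
$ git merge -X theirs 14.0.1              # merge the upstream release tag
$ dch -v 14.0.1-1 "New upstream release"  # make debian/changelog match the tag
$ git commit -a --amend                   # fold the changelog edit into the merge commit
$ git review                              # send the change to Gerrit for CI and review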

Maintaining backports automatically

The automatic backports are maintained through a Gerrit repository called “deb-auto-backports” containing a “packages-list” file that simply lists the source packages we need to backport. On each new CR (change request) in Gerrit, thanks to some madison-lite and dpkg --compare-versions magic, the packages-list is used to compare what’s in the Debian archive and what we have in the jessie-newton-backports repository. If the version is lower in our repository, or if the package doesn’t exist, then a build is triggered. There is the possibility to backport from any Debian release (using the -d flag in the “packages-list” file), and we can even use jessie-backports to just rebuild the package. I also had to write a hack to just download from jessie-backports without rebuilding, because rebuilding the webkit2gtk package (needed by sphinx) was taking too many resources (though we’ll try to never use it, and rebuild packages when possible).

The nice thing with this system, is that we don’t need to care much about maintaining packages up-to-date: the script does that for us.

Upstream Debian repositories are NOT for production

The produced package repositories are there because we have interconnected build dependencies, needed to run unit tests at build time. It is the only reason why such Debian repositories exist. They are not for production use. If you wish to deploy OpenStack, we very much recommend using packages from distributions (like Debian or Ubuntu). Indeed, the infrastructure Debian repositories are updated multiple times daily. As a result, it is very likely that you will experience download failures (hash or file size mismatches and such). Also, the functional tests aren’t yet wired into the CI/CD in the OpenStack infra, and therefore we cannot yet guarantee that the packages are usable.

Improving the build infrastructure

There’s a bunch of things which we could do to improve the build process. Let me give a list of things we want to do.

  • Get sbuild pre-set-up in the Jessie VM images, so we can save 3 minutes per build. This means writing a diskimage-builder element for sbuild.
  • Have the infrastructure use a state-of-the-art Debian ftp-sync mirror, instead of the current reprepro mirroring which produces an unsigned repository that we can’t use for sbuild-createchroot. This will improve things a lot, as currently there are lots of build failures because of httpredir.debian.org mirror inconsistencies (and these are very frustrating losses of time).
  • For each packaging change, there are 3 builds: the check job, the gate job, and the POST job. This is a waste of time and resources, as we only need to build a package once. It will hopefully be possible to fix this when the OpenStack infra team deploys Zuul 3.

Generalizing to Debian

During Debconf 16, I had very interesting talks with the DSA (Debian System Administrator) about deploying such a CI/CD for the whole of the Debian archive, interfacing Gerrit with something like dgit and a build CI. I was told that I should provide a proof of concept first, which I very much agreed with. Such a PoC is there now, within OpenStack infra. I very much welcome any Debian contributor to try it, through a packaging patch. If you wish to do so, you should read how to contribute to OpenStack here: https://wiki.openstack.org/wiki/How_To_Contribute#If_you.27re_a_developer and then simply send your patch with “git review”.

This system, however, currently only fits the “git tag” based packaging workflow. We’d have to do a little bit more work to make it possible to use pristine-tar (basically, allow to push in the upstream and pristine-tar branches without any CI job connected to the push).

Dear DSA team, as we now have a nice PoC that is working well, on which the OpenStack PKG team is maintaining 100s of packages, shall we try to generalize it and provide such an infrastructure for every packaging team and DD?

16 October, 2016 09:28PM by Goirand Thomas

hackergotchi for Steinar H. Gunderson

Steinar H. Gunderson

backup.sh opensourced

It's been said that backup is a bit like flossing; everybody knows you should do it, but nobody does it.

If you want to start flossing, an immediate question is what kind of dental floss to get—and conversely, for backup, which backup software do you want to rely on? I had some criteria:

  • Automated full-system backup, not just user files.
  • Self-controlled, not cloud (the cloud economics don't really make sense for 10 TB+ of backup storage, especially when you factor in restore cost).
  • Does not require one file on the backup server for each file on the backed-up server (makes for infinitely long fscks, greatly increased risk of file system corruption, frequently gives performance problems in the backup host, and makes inter-file compression impossible).
  • Not written in Python (makes for glacial speeds).
  • Pull backups, not push (so a backed-up server cannot delete its own backups in event of a break-in).
  • Does not require any special preparation or lots of installation on each server.
  • Ideally, restore using UNIX standard tools only.

I looked at basically everything that existed in Debian and then some, and all of them failed. But Samfundet had its own script that's basically just a simple wrapper around tar and ssh, which has worked for 15+ years without a hitch (including several restores), so why not use it?

All the authors agreed to a GPLv2+ licensing, so now it's time for backup.sh to meet the world. It does about the simplest thing you can imagine: ssh to the server and use GNU tar to tar down every filesystem that has the “dump” bit set in fstab. Every 30 days, it does a full backup; otherwise, it does an incremental backup using GNU tar's incremental mode (which makes sure you will also get information about file deletes). It doesn't do inter-file diffs (so if you have huge files that change only a little bit every day, you'll get blowup), and you can't do single-file restores without basically scanning through all the files; tar isn't random-access. So it doesn't do much fancy, but it works, and it sends you a nice little email every day so you can know your backup went well. (There's also a less frequently used mode where the backed-up server encrypts the backup using GnuPG, so you don't even need to trust the backup server.) It really takes fifteen minutes to set up, so now there's no excuse. :-)
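
For illustration, here is a minimal sketch of the underlying technique (GNU tar's incremental mode pulled over ssh); this is not backup.sh itself, and the host name and paths are made up:

HOST=server.example.org
OUT=/backup/$HOST/home-$(date +%Y%m%d).tar.gz

# Run GNU tar on the backed-up host and stream the archive back over ssh.
# The --listed-incremental state file lives on the backed-up host and records
# what was already dumped, so the next run only picks up changes (and deletes).
ssh "root@$HOST" "tar --create --gzip --one-file-system \
    --listed-incremental=/var/lib/backup/home.snar -C / home" > "$OUT"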

Oh, and the only good dental floss is this one. :-)

16 October, 2016 01:43PM

Rémi Vanicat

Trying to install Debian on G752VM-GC006T

I'm trying to install Debian GNU/Linux on my new ASUS G752VM-GC006T.

So what I've discovered:

  • It's F2 to have the bios, and in the last bios section, you can directly boot on any device.
  • It boots from the netinst DVD
  • netinst can't see the SSD disk
  • the trackpad doesn't work
  • after a successful install, booting the fresh install failed. I had to use the recovery tools to install the nvidia non-free package to get Debian to boot successfully.
  • I mostly use sid on my computer (mostly to test problems, and report them). It was a bad idea: Debian stopped finding its own disk. Adding pci=nomsi to the kernel options fixes this (see the sketch below).
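
For reference, the usual way to make a kernel option like pci=nomsi permanent on Debian is through GRUB; a sketch (adjust /etc/default/grub to your own setup):

$ sudoedit /etc/default/grub    # add pci=nomsi to GRUB_CMDLINE_LINUX_DEFAULT
$ sudo update-grub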

So I have a working Linux. My problems are:

  • I still can't see the SSD disk from linux
  • I cannot easily dualboot:
    • linux can't see the SSD where windows is,
    • the Windows boot loader doesn't want to start Debian, because it doesn't want to,
    • at least, the bios can boot both of them, but there is no "pretty" menu
  • the trackpad is not working.
  • 0.5 TB feels small today...

And the question is: where to report those bugs.

First edit: rEFInd seems to find Windows and Debian, thanks to blackcat77

16 October, 2016 12:13PM

hackergotchi for Mirco Bauer

Mirco Bauer

Debian 8 on Dell XPS 15

It was time for a new work laptop so I got a Dell XPS 15 9950. I wasn't planning to write a blog post about how to install Debian 8 "Jessie" on the laptop, but since it wasn't just install-and-use, I will share what is needed to get the wifi and graphics card to work.

So first download the DVD-1 AMD64 image of Debian 8 from your favorite download mirror. The closest one for me is the Hong Kong mirror. You do not need to download the other DVDs, just the first one is sufficient. The netinstaller and CD images will not provide a good experience since they need a working network/internet connection. With the DVD image you can do a full default desktop install and most things will just work out-of-the-box.

Now you can do a regular install, no special procedure or anything will be needed. Depending on your desktop selection it will boot right into lovely GNOME3.

You will quickly notice that the wifi is not working out-of-the-box though. It is a Qualcomm Atheros QCA6174 and the Linux kernel version 3.16 shipped with Debian 8 does not support that wifi card. This card needs the ath10k_pci kernel module which is included in a newer Linux kernel package from the Debian backports archive. If you don't have the Dell docking station (neither do I), then there is no wired ethernet that you can use for getting a temporary Internet connection. So use a different computer with Internet access to manually download the following packages from the Debian backports archive and put them on a USB disk.

After that, connect the USB disk to the new Dell laptop and mount it using the GNOME3 file browser (nautilus). It will mount the USB disk to /media/$your_username/$volume_name. Become root using sudo or su. Then install all the downloaded packages from the USB disk like this:

cd /media/$your_username/$volume_name
dpkg -i linux-base_*.deb
dpkg -i linux-image-4.7.0-0.bpo.1-amd64_*.deb
dpkg -i firmware-atheros_*.deb
dpkg -i firmware-misc-nonfree_*.deb
dpkg -i xserver-xorg-video-intel_*.deb

That's it. If dpkg finished without error messages then you can reboot, and your wifi and graphics card should just work! After the reboot you can verify that the wifi card is recognized by running "/sbin/iwconfig" and seeing if wlan0 shows up.

Have fun with your Dell XPS and Debian!

PS: if this does not work for you, leave a comment or write to meebey at meebey . net

16 October, 2016 03:46AM

October 15, 2016

Thorsten Alteholz

DOPOM: libmatthew-java – Unix socket API and bindings for Java

While looking at the "action needed" paragraph of one of my packages, I saw that a dependency was orphaned and needed a new maintainer. So I decided to restart DOPOM (Debian Orphaned Package Of the Month), which I started in 2012 with ent as the first package.

This month I adopted libmatthew-java. Sure it was not a big deal as the QA-team already did a good job and kept the package in shape. But now there is one burden lifted from their shoulders.

According to the Work-Needing and Prospective Packages page, 956 packages are orphaned at the moment. If every Debian contributor grabs one of them, we could unwind the QA team (no, just kidding). So, similar to NEW, which was down to 0 this year, can we get rid of the WNPP as well? At least for a short time?

15 October, 2016 09:01PM by alteholz

hackergotchi for Daniel Silverstone

Daniel Silverstone

Gitano - Approaching Release - Access Control Changes

As mentioned previously I am working toward getting Gitano into Stretch. A colleague and friend of mine (Richard Maw) did a large pile of work on Lace to support what we are calling sub-defines. These let us simplify Gitano's ACL files, particularly for individual projects.

In this posting, I'd like to cover what has changed with the access control support in Gitano, so if you've never used it then some of this may make little sense. Later on, I'll be looking at some better user documentation in conjunction with another friend of mine (Lars Wirzenius) who has promised to help produce a basic administration manual before Stretch is totally frozen.


With a more modern lace (version 1.3 or later) there is a mechanism we are calling 'sub-defines'. Previously if you wanted to write a ruleset which said something like "Allow Steve to read my repository" you needed:

define is_steve user exact steve
allow "Steve can read my repo" is_steve op_read

And, as you'd expect, if you also wanted to grant read access to Jeff then you'd need yet another set of defines:

define is_jeff user exact jeff
define is_steve user exact steve
define readers anyof is_jeff is_steve
allow "Steve and Jeff can read my repo" readers op_read

This, while flexible (and still entirely acceptable), is wordy for small rulesets, and so we added sub-defines to create this syntax:

allow "Steve and Jeff can read my repo" op_read [anyof [user exact jeff] [user exact steve]]

Of course, this is generally neater for simpler rules, if you wanted to add another user then it might make sense to go for:

define readers anyof [user exact jeff] [user exact steve] [user exact susan]
allow "My friends can read my repo" op_read readers

The nice thing about this sub-define syntax is that it's basically usable anywhere you'd use the name of a previously defined thing, they're compiled in much the same way, and Richard worked hard to get good error messages out from them just in case.

No more auto_user_XXX and auto_group_YYY

As a result of the above being implemented, the support Gitano previously grew for automatically defining users and groups has been removed. The approach we took was pretty inflexible and risked compilation errors if a user was deleted or renamed, and so the sub-define approach is much much better.

If you currently use auto_user_XXX or auto_group_YYY in your rulesets then your upgrade path isn't bumpless but it should be fairly simple:

  1. Upgrade your version of lace to 1.3
  2. Replace any auto_user_FOO with [user exact FOO] and similarly for any auto_group_BAR to [group exact BAR].
  3. You can now upgrade Gitano safely.

No more 'basic' matches

Since Gitano first gained support for ACLs using Lace, we had a mechanism called 'simple match' for basic inputs such as groups, usernames, repo names, ref names, etc. Simple matches looked like user FOO or group !BAR. The match syntax grew more and more arcane as we added Lua pattern support (refs ~^refs/heads/${user}/). When we wanted to add proper PCRE regex support we added a syntax of the form: user pcre ^/.+?... where pcre could be any of: exact, prefix, suffix, pattern, or pcre. We had a complex set of rules for exactly what the sigils at the start of the match string might mean in what order, and it was getting unwieldy.

To simplify matters, none of the "backward compatibility" remains in Gitano. You instead MUST use the what how with match form. To make this slightly more natural to use, we have added a bunch of aliases: is for exact, starts and startswith for prefix, and ends and endswith for suffix. In addition, the kind of match can be prefixed with a ! to invert it, and for natural-looking rules not is an alias for !is.

This means that your rulesets MUST be updated to support the more explicit syntax before you update Gitano, or else nothing will compile. Fortunately this form has been supported for a long time, so you can do this in three steps.

  1. Update your gitano-admin.git global ruleset. For example, the old form of the defines used to contain define is_gitano_ref ref ~^refs/gitano/ which can trivially be replaced with: define is_gitano_ref prefix refs/gitano/
  2. Update any non-zero rulesets your projects might have.
  3. You can now safely update Gitano

If you want a reference for making those changes, you can look at the Gitano skeleton ruleset which can be found at https://git.gitano.org.uk/gitano.git/tree/skel/gitano-admin/rules/ or in /usr/share/gitano if Gitano is installed on your local system.

Next time, I'll likely talk about the deprecated commands which are no longer in Gitano, and how you'll need to adjust your automation to use the new commands.

15 October, 2016 03:11AM by Daniel Silverstone

October 14, 2016

hackergotchi for Michal Čihař

Michal Čihař

New free software projects on Hosted Weblate

Hosted Weblate also provides free hosting for free software projects. I'm quite slow in processing the hosting requests, but when I do, I process them in a batch and add several projects at once.

This time, the newly hosted projects include:


14 October, 2016 04:00PM

Mike Gabriel

[Arctica Project] Release of nx-libs (version


NX is a software suite which implements very efficient compression of the X11 protocol. This increases performance when using X applications over a network, especially a slow one.

NX (v3) has been originally developed by NoMachine and has been Free Software ever since. Since NoMachine obsoleted NX (v3) some time back in 2013/2014, the maintenance has been continued by a versatile group of developers. The work on NX (v3) is being continued under the project name "nx-libs".

Release Announcement

On Thursday, Oct 13th, version of nx-libs has been released [1].

This release brings a major backport of libNX_X11 to the status of libX11 1.3.4 (as provided by X.org). On top of that, all CVE fixes provided for libX11 by the Debian X11 Strike Force and the Debian LTS team got cherry-picked to libNX_X11, too. This big chunk of work has been performed by Ulrich Sibiller and there is more to come. We currently have a pull request pending review that backports more commits from libX11 (bumping the status of libNX_X11 to the state of libX11 1.6.4, which is the current HEAD on the X.org Git site).

Another big clean-up performed by Ulrich is the split-up of the XKB code which got symlinked between libNX_X11 and nx-X11/programs/Xserver. This brings in some code duplication but allows maintaining the nxagent Xserver code and the libNX_X11 code separately.

In the upstream ChangeLog you will find some more items around code clean-ups and .deb packaging, see the diff [2] on the ChangeLog file for details.

So for this release, a very special and massive thanks goes to Ulrich Sibiller!!! Well done!!!

Change Log

A list of recent changes (since can be obtained from here.

Known Issues

This version of nx-libs is known to segfault when LDFLAGS / CFLAGS have the -pie / -fPIE hardening flags set. This issue is currently under investigation.

Binary Builds

You can obtain binary builds of nx-libs for Debian (jessie, stretch, unstable) and Ubuntu (trusty, xenial) via these apt-URLs:

Our package server's archive key is: 0x98DE3101 (fingerprint: 7A49 CD37 EBAE 2501 B9B4 F7EA A868 0F55 98DE 3101). Use this command to make APT trust our package server:

 wget -qO - http://packages.arctica-project.org/archive.key | sudo apt-key add -

The nx-libs software project brings to you the binary packages nxproxy (client-side component) and nxagent (nx-X11 server, server-side component).

Ubuntu developers, please note: we have added nightly builds for Ubuntu latest to our build server. This has been Ubuntu 16.10 so far, but we will soon drop 16.10 support in nightly builds and add 17.04 support.


14 October, 2016 03:47PM by sunweaver

Antoine Beaupré

Managing good bug reports

Bug reporting is an art form that is too often neglected in software projects. Bug reports allow contributors to participate without deep technical knowledge and at the same time provide a crucial space for developers to be made aware of issues with their software that they could not have foreseen or found themselves, for lack of resources, variety or imagination.

Prior art

Unfortunately, there are rarely good guidelines for submitting bug reports. Historically, people have pointed towards How to report bugs effectively or How to ask questions the smart way. While those guides can be useful for motivated people and may seem attractive references for project managers, they suffer from serious issues:

  • they are written by technical people, for non-technical people
  • as a result, they have a deeply condescending attitude such as calling people "stupid" or various animal names like "mongoose"
  • they are also very technical themselves: one starts with a copyright notice and a changelog, the other uses magic words like "Core dumps" and $Id$
  • they are too long: sgtatham's is about 3600 words long, esr's is even longer at about 11800 words. Those texts will take about 20 to 60 minutes for an average reader to read, according to research

Individual projects have their own guides as well. Linux has the REPORTING_BUGS file which, at a much shorter 1200 words, can be read in under 5 minutes, provided that you can understand the topic at hand. Interestingly, that guide refers to both esr's and sgtatham's guidelines, which means, in the degenerate case where the user hasn't had the "privilege" of reading esr's prose already, they will have an extra hour and a half of reading to do to have honestly followed the guidelines before reporting the bug.

I often find good documentation in the Tails project. Their bug reporting guidelines are easily accessible and quick to read, although they still might be too technical. It could be argued that you need to get technical at some point to get that information out, of course.

In the Monkeysign project, I have started a bug reporting guide that doesn't yet address all those issues. I am considering writing a new guide, but I figured I would look at other people's work and get feedback before writing my own standard.

What's the point?

Why have those documents been written? Are people really expected to read them before seeking help? It seems to me unlikely that someone would:

  1. be motivated enough to do something about a broken part of their computer
  2. figure out they can do something about it
  3. read a fifteen-thousand-word novel about how to report a bug...
  4. just to finally write a 20-line bug report that has no warranty of support attached to it

And if I were a paying customer, I wouldn't want to be forced to waste my time reading that prose either: it's your job to help me fix your broken things, not the reverse. As someone doing consulting these days, I totally understand: it's not you, the user, it's us, the developers, who have a problem. We have been socialized through computers, and it makes us weird and obtuse, but that's no excuse, and we need to clean up our act.

Furthermore, it's surprising how often we get (and make!) bug reports that are difficult to use. The Monkeysign project is very "technical" and I expected the bug reports I would get to be well written, with ways to reproduce and so on, but it turned out that I received bug reports that were all over the place, didn't include any way to reproduce the issue, or were simply incomplete. Those three bug reports were filed by people that I know to be very technically capable: one is a fellow Debian developer, the second had filed a good bug report 5 days before, and the third one is a contributor who had sent good patches before.

In all three cases, they knew what they were doing. Those three people have probably read the guidelines mentioned above at some point in the past. They may have even read the Monkeysign bug reporting guidelines as well. I can only explain those bug reports by a lack of time: people thought the issue was obvious and that it would get fixed rapidly because, obviously, something is broken.

We need a better way.

The takeaway

What are those guides trying to tell us?

  1. ask questions in the right place
  2. search for similar questions and issues before reporting the bug
  3. try to make the developers reproduce the issues
  4. failing that, try to describe the issue as well as you can
  5. write clearly, be specific and verbose yet concise

There are obviously contradictions in there, like sgtatham telling us to be verbose while esr tells us, basically, not to be. There is definitely a tension there, and there are many, many more details about how good bug reports can be when done properly.

I tend towards the side of terseness in our descriptions: people who know how to be concise will be, and people who don't will most likely not learn by reading a 12,000-word novel that, in itself, didn't manage to be parsimonious.

But I am willing to allow for verbosity in bug reports: I prefer too many details instead of missing a key bit of information.

Issue trackers

Step 1 is our job: we should send people in the right place, and give them the right tools. Monkeysign used to manage bugs with bugs-everywhere and this turned out to be a terrible idea: you had to understand git and bugs-everywhere to file any bug reports. As a result, there were exactly zero bug reports filed by non-developers during the whole time BE was used, although some bugs were filed in the Debian Bugtracker.

So have a good bug tracker. A mailing list or email address is not a good bug tracker: you lose track of old issues, and it's hard for newcomers to search the archives. It does have the advantage of having a unified interface for the support forum and bug tracking, however.

Redmine, Gitlab, Github and others are all decent-enough bug trackers. The key point is that the issue tracker should be publicly available, and users should be able to register easily to file new issues. You should also be able to mass-edit tickets and users should be able to discover the tracker's features easily. I am sorry to say that the Debian BTS somewhat falls short on those two features.

Step 2 is a shared responsibility: there should be an easy way to search for issues, and we should help the user look for similar issues. Stackexchange sites do an excellent job at this, automatically searching for similar questions while you write your own and suggesting them in an attempt to weed out duplicates. Duplicates still happen, but they can then be clearly marked and linked with a distinct mechanism. Most bug trackers do not offer such high-level functionality, but should, so I feel the fault lies more on "our" end than on the user's end.

Reproducing the environment

Step 3 and 4 are more or less the user's responsibility. We can detail in our documentation how to clearly share the environment where we reproduced the bug, for example, but in the end, the user decides if they want to share that information or not.

In Monkeysign, I have finally implemented joeyh's suggestion of shipping the test suite with the program. I can now tell people to run the test suite in their environment to see if this is a regression that is specific to their environment - so a known bug, in a way - or a novel bug for which I can look at writing a new unit test. I also include way more information about the environment in the --version output, an idea I brought forward in the borg project to ease debugging. That way, people can just send the output of monkeysign --test and monkeysign --version, and I have a very good overview of what is happening on their end. Of course, Monkeysign also supports the usual --verbose and --debug flags that users should enable when reproducing issues.
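
To make that concrete, here is a minimal sketch of the idea (hypothetical tool name and file layout, not Monkeysign's actual code): a command-line program whose --version output bundles the environment details a developer usually has to ask for, and whose --test flag runs the test suite shipped alongside the program.

    #!/usr/bin/env python
    """Sketch: enrich --version with environment details, ship a --test flag."""

    import argparse
    import os
    import platform
    import sys
    import unittest


    def environment_report(name, version):
        """Gather the details a developer would otherwise have to ask for."""
        return "\n".join([
            "%s %s" % (name, version),
            "Python %s" % sys.version.split()[0],
            "Platform: %s" % platform.platform(),
        ])


    def run_bundled_tests():
        """Run the tests shipped next to the program (hypothetical tests/ dir)."""
        tests = os.path.join(os.path.dirname(os.path.abspath(__file__)), "tests")
        suite = unittest.defaultTestLoader.discover(tests)
        result = unittest.TextTestRunner(verbosity=2).run(suite)
        return 0 if result.wasSuccessful() else 1


    def main():
        parser = argparse.ArgumentParser(prog="mytool")
        parser.add_argument("--version", action="version",
                            version=environment_report("mytool", "0.1"))
        parser.add_argument("--test", action="store_true",
                            help="run the bundled test suite and exit")
        args = parser.parse_args()
        if args.test:
            sys.exit(run_bundled_tests())


    if __name__ == "__main__":
        main()

With something like this in place, "paste the output of --version and --test" becomes a one-line instruction in the bug reporting guidelines.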

Another idea is to report bugs directly from the application. We have all seen Firefox or other software ship automatic bug reporting tools, but somehow those seem unsatisfactory for a user: we get no feedback about where the report goes or whether it is followed up on. It is useful for larger projects to gather statistical data, but not so useful for users in the short term.

Monkeysign tries to handle exceptions in the code in a graceful way, but could do better. We use a small library to handle exceptions, but that library has since been improved to file bugs directly against the Github project. This assumes the user is logged into Github, but it is nice to pre-populate bug reports with the relevant information up front.
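
As an illustration of that last idea (a generic sketch, not the library Monkeysign actually uses, and with a hypothetical repository URL), an exception hook can print the traceback and then send the user to Github's new-issue form with the title and body pre-filled through query parameters:

    #!/usr/bin/env python
    """Sketch: open a pre-filled Github issue form on unhandled exceptions."""

    import sys
    import traceback
    import webbrowser

    try:                       # Python 3
        from urllib.parse import urlencode
    except ImportError:        # Python 2
        from urllib import urlencode

    # Hypothetical repository; substitute the real owner and project name.
    NEW_ISSUE_URL = "https://github.com/example/mytool/issues/new"


    def reporting_excepthook(exc_type, exc_value, exc_traceback):
        """Print the traceback, then point the user at a pre-filled issue form."""
        text = "".join(traceback.format_exception(exc_type, exc_value,
                                                  exc_traceback))
        sys.stderr.write(text)
        params = urlencode({
            "title": "Crash: %s" % exc_value,
            "body": "What I was doing:\n\n\nTraceback:\n\n%s" % text,
        })
        webbrowser.open("%s?%s" % (NEW_ISSUE_URL, params))


    sys.excepthook = reporting_excepthook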

Issue templates

In the meantime, to make sure people provide enough information, I have moved a lot of the bug reporting guidelines into a separate issue template. That issue template is now available through the issue creation form, although it is not enabled by default, which is a weird limitation of Gitlab. Issue templates are available in Gitlab and Github.
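
For reference, on both platforms a template is just a text file committed to the repository in a conventional location (.github/ISSUE_TEMPLATE.md on Github, .gitlab/issue_templates/Bug.md on Gitlab), and its contents pre-fill the description field when a user opens a new issue. The wording below is only a hypothetical example, not Monkeysign's actual template:

    ## Steps to reproduce

    ## Expected behaviour

    ## Actual behaviour

    ## Environment

    Please paste the output of `monkeysign --version` here.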

Issue templates somewhat force users into a straitjacket: there is already something there to structure their bug report. Those could be distinct form elements that had to be filled in, but I like the flexibility of the template, and the possibility for users to escape the formalism and just plead for help in their own way.

Issue guidelines

In the end, I opted for a short few paragraphs in the style of the Tails documentation, including a reference to sgtatham, as an optional future reference:

  • Before you report a new bug, review the existing issues in the online issue tracker and the Debian BTS for Monkeysign to make sure the bug has not already been reported elsewhere.

  • The first aim of a bug report is to tell the developers exactly how to reproduce the failure, so try to reproduce the issue yourself and describe how you did that.

  • If that is not possible, try to describe what went wrong in detail. Write down the error messages, especially if they have numbers.

  • Take the necessary time to write clearly and precisely. Say what you mean, and make sure it cannot be misinterpreted.

  • Include the output of monkeysign --test, monkeysign --version and monkeysign --debug in your bug reports. See the issue template for more details about what to include in bug reports.

If you wish to read more about issues regarding communication in bug reports, you can read How to Report Bugs Effectively, which takes around 20 to 30 minutes.

Unfortunately, short of rewriting sgtatham's guide, I do not feel there is much more we can do as a general guide. I find esr's guide to be too verbose and commanding, so sgtatham it will be for now.

The prose and literacy

In the end, there is a fundamental issue with reporting bugs: it assumes our users are literate and capable of writing amazing prose that we will enjoy reading as much as the latest J.K. Rowling novel (if you're into that kind of thing). It's just an unreasonable expectation: some of your users don't even speak the same language as you, let alone read or write it. This makes for challenging collaboration, to say the least. This is where automated reporting makes sense: it doesn't require the user's intervention, and the communication is mediated by machines, without human intervention and its pesky culture.

But we should, as maintainers, "be liberal in what we accept and conservative in what we send". Be tolerant, and help your users in fixing their issues. It's what you are there for, after all.

And in the end, we all fail the same way. In an attempt to improve the situation on bug reporting guides, I seem to have written a 2,000-word short story of my own that will have taken up a hopefully pleasant 10 minutes of your time at minimum. Hopefully I will have succeeded at being clear, specific, verbose and concise all at once, and I look forward to your feedback on how to improve our bug reporting culture.

14 October, 2016 03:11PM

Jonathan Dowland

Hi-Fi Furniture

[photo: sadly obsolete]

For the last four years or so, I've had my Hi-Fi and the vast majority of my vinyl collection stored in a self-contained, mildly-customized Ikea unit. Since moving house this has been in my dining room—which we have always referred to as the "play room", since we have a second dining room in which we actually dine.

The intention for the play room was for it to be the room within which all our future children would have their toys kept, in an attempt to keep the living room from being overrun with plastic. The time has thus come for my Hi-Fi to come out of there, so we've moved it to our living room. Unfortunately, there's not enough room in the living room for the Ikea unit: I need something narrower for the space available.

[image via IkeaHackers.net]

In the spirit of my original hack, I started looking at what others might have achieved with Ikea components. There are some great examples of open-style units built out of the (extremely cheap) Lack coffee tables, such as this ikeahackers article, but I'd prefer something more enclosed. One problem I've had with the Expedit unit has been my cat trying to scratch the records. I ended up putting framed records at the front to cover the spines of the records within. If I were keeping the unit, I'd look at fitting hinges (another ikeahackers article).

Aside from hacked Ikea stuff, there are a few companies offering traditional enclosed Hi-Fi cabinets. I'm going to struggle to fit both the equipment and a subset of records into these, so I might have to look at storing them separately. In some ways that makes life easier: the records could go into a 1x4 Ikea KALLAX unit, leaving the amp and deck to be housed somewhere else. Perhaps I could look at a bigger unit for under the TV.

My parents have a nice Hi-Fi unit that pretends to be a chest of drawers. I'm fairly sure my Dad custom-built it, as it has a hinged top to provide access to the turntable and I haven't seen anything like that on the market.

That brings me on to thinking about other AV things I'd like to achieve in the living room. I've always been interested in exploring surround sound, but my initial attempt in my prior flat did not go well, either because the room was not terribly suited acoustically, or because the Pioneer unit I bought was rubbish, or both. It seems that there aren't really AV receivers designed to satisfy people wanting to use them in both a Hi-Fi and a home cinema setting. I could stick to stereo and run the TV into my existing (or a new) amplifier, subject to some logistics around wiring. A previous house owner ran some phono cables under the hard-wood flooring from the TV alcove to the opposite side of the fireplace, which might give me some options.

There's also the world of wireless audio, Sonos etcetera. Realistically the majority of my music is digital nowadays, and it would be good to be able to listen to it conveniently in the house. I've heard good reports about the entry-level Sonos stuff, but those units seem to be mono, and even the more high-end ones with lots of drivers have very little separation. I did buy a Chromecast Audio on a whim recently, but I haven't looked at it much yet: perhaps that's part of the solution.

So, lots of stuff in the melting pot to figure out here!

14 October, 2016 02:23PM

Daniel Silverstone

Gitano - Approaching Release - Changes

Continuing on from the previous article, here is a (probably incomplete) list of the critical changes to Gitano which have been, or will be, worked on during the run toward a 1.0 release. Each of these will have a blog posting to discuss what the changes mean for current and future users. Sometimes I'll aggregate postings, sometimes I won't.

The following are some highlights from the past little while of development which has been undertaken by Richard and myself. Each item is, I feel, important enough to warrant commentary, even for those who already use Gitano.

  • Lace now supports a sub-define syntax: [foo bar] which makes for simpler rulesets.
  • Gitano no longer creates auto_user_XXX and auto_group_XXX Lace predicates.
  • Gitano no longer supports "basic" simple matches of the form user foo but instead requires a match kind such as group prefix bar-.
  • Gitano is gaining i18n/l10n support; though it will not be complete for version 1.0, the basics will be in place.
  • Gitano is gaining a much larger integration test suite using yarn.
  • Deprecated commands have now been removed from Gitano. (e.g. no more set-owner)
  • Gitano has gained PGP/GPG signature verification for commits and tags.

Any number of smaller things have been done which fall below some arbitrary barrier for telling you about. If you're aware of any of them and feel they are worthwhile telling the world about, then please prod me and I'll add an article to the series.

Finally, it's worth noting that the effort to get all this into Debian Stretch proceeds apace. Of the eight packages needed, at the time of posting one was already in and has been updated (luxio), three have been accepted into Debian already (supple, clod, lua-scrypt), two are in NEW (gall and lace), and that leaves the newest library (tongue) and then Gitano itself still to go. The Debian FTP team have been awesome in helping me with all this, so thanks go to them.

14 October, 2016 01:30PM by Daniel Silverstone