Reproducible Builds
Status Update

Holger Levsen

The incomplete team, with apologies to $YOU

akira • Alexis Bienvenüe • Alexander Couzens • Andrew Ayer • Asheesh Laroia • Bernhard M. Wiedemann • Boyuan Yang • Ceridwen • Chris Lamb • Chris West • Christoph Berg • Clint Adams • Dafydd Harries • Daniel Kahn Gillmor • Daniel Shahaf • Daniel Stender • David Suarez • Dhole • Drew Fisher • Emmanuel Bourg • Emanuel Bronshtein • Esa Peuha • Fabian Wolff • Guillem Jover • Hans-Christoph Steiner • Harlan Lieberman-Berg • Helmut Grohne • Holger Levsen • HW42 • Intrigeri • Jelmer Vernooij • josch • Juan Picca • Justin Cappos • Lunar • Maria Glukhova • Mathieu Bridon • Mattia Rizzolo • Nicolas Boulenguez • Niels Thykier • Niko Tyni • Paul Wise • Peter De Wachter • Philip Rinn • Reiner Herrmann • Robbie Harwood • Santiago Vila • Sascha Steinbiss • Satyam Zode • Scarlett Clark • Stefano Rivera • Stéphane Glondu • Steven Chamberlain • Tom Fitzhenry • Vagrant Cascadian • Valerie Young • Valentin Lorentz • Wookey • Ximin Luo

Who are you?

  • Who knows about Reproducible Builds?
  • Who doesn't?
  • Who contributed?
  • Whose name is missing on the previous slide?

about me

  • Debian user since 1995, contributing since 2001
  • Debian-Edu (Debian for Education), since 2003
  • DebConf organizer, founded the DebConf video team in 2005
  • Debian developer since 2007, holger@debian.org
  • Freelancer since 2004

more about Debian QA and me

  • since 2009 - today (Nov 2016) juggling with 648988 logs from 53158 packages in 28 suites with Andreas Beckmann
  • since 2012
  • since 2014
  • since 2015 funded by the Linux Foundation for working on

What is the goal of Reproducible Builds?

Prove that a binary came from its source code.

Why do we want to prove this?

  • We don't want to believe, we want to know…

The binary could have been:

  • ...compiled by a malicious actor.
  • ...compiled with a compromised compiler.

How do we achieve Reproducible Builds?

Two main areas of work in Debian:

  • 1. Compilation of binary program should be deterministic.
  • 2. Build environment of any binary program should be reproducible.
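
A minimal sketch of the first point, assuming a build step that packs files into a tar archive: the output becomes deterministic once member order, timestamps and ownership no longer depend on the builder. The helper name `deterministic_tar` and the constant `FIXED_MTIME` are made up for this illustration and are not part of any Debian tool.

```python
import io
import tarfile

FIXED_MTIME = 0  # assumed constant; any value all builders agree on works


def deterministic_tar(files):
    """Pack {name: bytes} into an uncompressed tar with normalised metadata."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):  # fixed member order, independent of dict order
            data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = FIXED_MTIME  # no build-time clock in the output
            info.uid = info.gid = 0   # no builder-specific ownership
            info.uname = info.gname = "root"
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Same content, different insertion order: the archives are compared byte for byte.
first = deterministic_tar({"b.txt": b"two", "a.txt": b"one"})
second = deterministic_tar({"a.txt": b"one", "b.txt": b"two"})
print(first == second)
```

The archive is left uncompressed on purpose, so that no compressor header can add a timestamp of its own.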

How far we've come..!

First rebuild in 2013: 24% of packages reproducible
June 2017: 94% of packages reproducible

How far we'll need to go..!

Sometime: 100% of packages reproducible
Sometime: tools to actually verify that in practice

6% is still a lot

303 unreproducible key packages in Buster: nss pypy libffado+ mongodb jython postfix libsaxon-java libxalan2-java libxerces2-java qt4-x11 qtwebkit boost1.62 fonts-freefont qemu fonts-dejavu gdb chromium-browser cwidget gradle gsasl groovy indent krunner dose3 automake1.11 vlc lynx unifont fuse shotwell xmlunit lua5.3 ceph+ dbus-c++ a2ps libvirt php7.0 evolution-data-server python2.7 rrdtool dia p7zip zephyr+ jruby-openssl matplotlib closure-compiler libgpg-error sane-backends mesa bsh clamav festival syslinux xserver-xorg-input-evdev icedove apache2 xen tk8.6 libreoffice autogen courier-authlib ipxe jack-audio-connection-kit gcc-6-cross gcc-6-cross-ports gcc-6 postgresql-9.6 libtheora zeitgeist gnuplot lame r-base heimdal openjdk-8 wine gcc-defaults python3.5 bind9+ emacs25 ghc libident poppler firefox-esr ant subversion flex geoip python-numpy x11proto-core ibus nsis rustc openbios doxygen gcc-7 ecj guile-2.0 iproute2+ directfb+ gcc-7-cross-ports gcc-7-cross octave sqlite3 unbound xorg-server+ automake-1.15 bash+ grub2 autoconf cjk dbus-python ffmpegthumbnailer graphviz mono cppunit pycairo

136 FTBFS (fail to build from source), 4618 fine

Check the progress


Technical & other security benefits

Predictable OpenID secret

# Build.PL
# /usr/share/perl5/GBrowse/
 'OpenIDConsumerSecret' => '639098210478536',
 'cgibin' => '/usr/lib/cgi-bin/gbrowse',
 'conf' => '/etc/gbrowse',
  • Every installation shares the same secret!

#833885 (gbrowse)
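
One possible remedy, sketched under assumptions: generate the secret when the package is installed rather than when it is built, so that nothing secret ends up in the shipped binary package. The function name `generate_install_secret` is hypothetical, and this is not necessarily how #833885 was resolved.

```python
import secrets


def generate_install_secret(num_bytes=16):
    """Return a fresh random hex secret; meant to run at install time, not build time."""
    return secrets.token_hex(num_bytes)


# Two "installations" each get their own secret instead of a shared constant.
secret_a = generate_install_secret()
secret_b = generate_install_secret()
print(secret_a != secret_b)
```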

Random chars in manpages

-This manual page documents the usageoof WikipediaFS.
+This manual page documents the usage of WikipediaFS.
memcpy(&buf[1], &buf[2], strlen(buf)-1);
memcpy(3): The memory areas must not overlap
  • " n\\011" → "\111" → maps to capital "I"
- memcpy(&buf[1], &buf[2], strlen(buf)-1);
+ memmove(&buf[1], &buf[2], strlen(buf)-1);
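
Why the overlap matters can be sketched in Python. The function `copy_back_to_front` models one hypothetical copy order only; real memcpy implementations differ, and the behaviour for overlapping areas is simply undefined. `move` models the memmove guarantee by snapshotting the source first. Both names and the sample buffer are assumptions for this illustration.

```python
def copy_back_to_front(buf, dst, src, n):
    """Hypothetical memcpy: copies n bytes in place, last byte first."""
    for i in range(n - 1, -1, -1):
        buf[dst + i] = buf[src + i]


def move(buf, dst, src, n):
    """memmove-like: snapshot the source first, so overlap cannot corrupt it."""
    buf[dst:dst + n] = bytes(buf[src:src + n])


original = b"ab011\x00"   # C string "ab011" with its NUL terminator
n = len(original) - 2     # strlen(buf) - 1, as in the call above

overlapping = bytearray(original)
copy_back_to_front(overlapping, 1, 2, n)

safe = bytearray(original)
move(safe, 1, 2, n)

print(bytes(overlapping))
print(bytes(safe))
```

With this particular copy order the overlapping copy destroys the string, while the memmove-like version shifts it left by one byte as intended.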

Fails to build 0.46% of the time

x = f(u('abc'), 16)
y = f(u('abc'), 16)
self.assertEqual(sorted(set(x)), [u('a'), u('b'), u('c')])
AssertionError: Lists differ: [u'a', u'b'] != [u'a', u'b', u'c']
  • (3C2)*(2/3)^16 - (3C1)*(1/3)^16 ≈ 0.46%

#844233 (python-passlib)
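
The 0.46% figure can be checked with a short calculation, assuming that `f` returns 16 characters, each drawn independently and uniformly from the three-letter alphabet (the slide does not say so explicitly). The function names below are made up for this sketch; the brute-force variant only cross-checks the inclusion-exclusion formula on a small case.

```python
from fractions import Fraction
from itertools import product
from math import comb


def p_missing_char(alphabet_size, length):
    """Probability that a uniform random string omits at least one character."""
    n = alphabet_size
    # Inclusion-exclusion over k = number of characters excluded from the string.
    return sum(
        (-1) ** (k + 1) * comb(n, k) * Fraction(n - k, n) ** length
        for k in range(1, n)
    )


def p_missing_char_brute_force(alphabet, length):
    """Same probability by enumerating every string (small cases only)."""
    strings = list(product(alphabet, repeat=length))
    missing = sum(1 for s in strings if set(s) != set(alphabet))
    return Fraction(missing, len(strings))


assert p_missing_char(3, 4) == p_missing_char_brute_force("abc", 4)
print(f"{float(p_missing_char(3, 16)):.2%}")
```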

Recent updates

Reproducible Builds Summit

December 2016, Berlin

Who Attended?

  • Software Freedom Conservancy
  • Bazel

A build is reproducible if given the same source code, build environment and build instructions, any party can recreate bit-by-bit identical copies of all specified artifacts.
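
In practice "bit-by-bit identical" is usually checked by comparing cryptographic hashes of the artifacts. A minimal sketch, assuming two independent builds have each left their artifacts in a directory; the names `sha256_of` and `artifacts_identical` and the sample file are invented for this illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def artifacts_identical(dir_a, dir_b):
    """True if both directories hold the same file names with identical bytes."""
    def digests(root):
        root = Path(root)
        return {
            p.relative_to(root).as_posix(): sha256_of(p)
            for p in root.rglob("*") if p.is_file()
        }
    return digests(dir_a) == digests(dir_b)


with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    Path(a, "hello.bin").write_bytes(b"\x7fELF...")
    Path(b, "hello.bin").write_bytes(b"\x7fELF...")
    same = artifacts_identical(a, b)
    Path(b, "hello.bin").write_bytes(b"\x7fELF..!")  # a single differing byte
    different = artifacts_identical(a, b)

print(same, different)
```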

Reproducible Builds Summit 2016

Other work

  • .buildinfo files for RPMs
  • Cross Distro Collaboration expansion

  • increased amd64+i386 resources
  • added arm64
  • expanded armhf, up to 29 boards!
  • some arm64 boards building armhf, with issues
  • some more projects, more collaboration

openSUSE on board

  • Bernhard submitting a lot of patches upstream, pick some examples

Examples of more active distros/projects

  • Guix, F-Droid, LEDE, coreboot, but also in-toto, Tails, Heads & MirageOS…

Updates on build path

(Only needed because Debian cares about doing things properly.)

  • GCC -fdebug-prefix-map, DW_AT_producer, etc
  • golang -trimpath: golang/go#16860
  • rustc --remap-path-prefix: rust-lang/rust#41555, #34902
  • GCC patch fixes 1800 packages but hasn't been accepted upstream yet
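
The idea behind these options is the same everywhere: an absolute build directory recorded in the output is rewritten to a fixed prefix, so that builds from different directories agree. The sketch below models only that idea and does not reproduce the exact matching rules of GCC, Go or rustc; `remap_path` and the example paths are hypothetical.

```python
def remap_path(path, old_prefix, new_prefix):
    """Rewrite old_prefix at the start of path to new_prefix, if present."""
    if path.startswith(old_prefix):
        return new_prefix + path[len(old_prefix):]
    return path


# Two builders with different build directories record the same path after remapping.
build_a = remap_path(
    "/home/alice/build/hello-1.0/src/main.c", "/home/alice/build/hello-1.0", "."
)
build_b = remap_path("/tmp/tmp.x1/hello-1.0/src/main.c", "/tmp/tmp.x1/hello-1.0", ".")
print(build_a, build_b)
```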

Reproducibility tools

reprotest - overview and updates

  • What: run commands under varying build environments and check their output for reproducibility. Features:
    • Running inside virtual containers (e.g. ...)
    • Presets for convenience, currently only for Debian packages
  • Reduce the differences from autopkgtest, with the aim of deduplicating code (Ximin)
  • Make it distro-independent, starting with Arch Linux (Santiago)
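
The core idea can be sketched in a few lines: run the same command under different environments and compare the output. This is an illustration only, not how reprotest is implemented; the functions and the `VARIATIONS` list are assumptions, and only environment variables are varied here.

```python
import hashlib
import os
import subprocess
import sys


def output_digest(command, extra_env):
    """Run command with extra environment variables and hash its stdout."""
    env = dict(os.environ, **extra_env)
    result = subprocess.run(command, env=env, capture_output=True, check=True)
    return hashlib.sha256(result.stdout).hexdigest()


def is_reproducible(command, variations):
    """True if the command's output is identical under every variation."""
    return len({output_digest(command, v) for v in variations}) == 1


VARIATIONS = [  # assumed examples of what a checker might vary
    {"TZ": "UTC", "LANG": "C"},
    {"TZ": "Asia/Tokyo", "LANG": "C.UTF-8"},
]

stable = [sys.executable, "-c", "print('hello')"]
leaky = [sys.executable, "-c", "import os; print(os.environ['TZ'])"]

print(is_reproducible(stable, VARIATIONS), is_reproducible(leaky, VARIATIONS))
```

The second command leaks the time zone setting into its output, which is exactly the kind of environment dependency such a tool is meant to surface.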

diffoscope - overview

  • What exactly makes two files different?
  • Recursively unpacks archives, decompresses PDF files, disassembles binaries etc
  • Converts various file formats to human-readable form
  • Reports differences in the form of plain text, HTML, RST, JSON or Markdown
  • Try it online:
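
To give a feel for "unpack, then compare", here is a toy sketch that looks inside two zip archives and reports a unified diff per member. It is not diffoscope's implementation and handles only one container format with text members; `make_zip`, `diff_zips` and the sample contents are made up for the demo.

```python
import difflib
import io
import zipfile


def make_zip(files):
    """Build an in-memory zip from {name: text}; helper for the demo only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, text in files.items():
            zf.writestr(name, text)
    return buf.getvalue()


def diff_zips(zip_a, zip_b):
    """Return unified-diff lines for members that differ between two zips."""
    with zipfile.ZipFile(io.BytesIO(zip_a)) as a, zipfile.ZipFile(io.BytesIO(zip_b)) as b:
        names_a, names_b = set(a.namelist()), set(b.namelist())
        report = []
        for name in sorted(names_a | names_b):
            old = a.read(name).decode().splitlines() if name in names_a else []
            new = b.read(name).decode().splitlines() if name in names_b else []
            report += difflib.unified_diff(old, new, f"a/{name}", f"b/{name}", lineterm="")
        return report


first = make_zip({"version.txt": "built 2017-06-01\n", "README": "hello\n"})
second = make_zip({"version.txt": "built 2017-06-02\n", "README": "hello\n"})
print("\n".join(diff_zips(first, second)))
```

Only the member that actually differs shows up in the report, which is what makes the output easier to read than a raw binary comparison of the two archives.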

diffoscope - updates 1/2

  • Now works better with huge diffs (like GCC)
    • possible to control how detailed the output gets
    • reuse previously generated output saved in JSON format
  • Tens of speed optimisations (via Tails)
    • from 3 hours → 8 minutes
  • Progress bar displayed when diffoscope runs in terminal
  • --exclude, --max-container-depth and other ways to control behaviour

diffoscope - updates 2/2

  • Better logging and debugging utilities
  • New formats supported for comparison: APK, OGG, .dtb, R object files (.rds, .rdb), PGP files, .docx, .odt, ...
  • New output formats: RST, Markdown, JSON
  • Visual comparison of images (JPEG, ICO, PNG, GIF)

Future directions

Distributing Debian .buildinfo files

  • Publish buildinfo files in the official archive, requires some dak ("Debian Archive Kit") patches.
  • Then, run rebuilds against actual Debian binaries, and encourage third parties to do the same.
  • Steven Chamberlain began to work in this area with (the topic of his "Fun with .buildinfo" talk at DebConf17)
  • Put them all in one Git repository?

Debian buster

  • We had not been testing against actual archive binaries, hope to fix this soon.
  • Recently, the required packages were NMUed (by Ximin); all of them are now reproducible except GCC.
  • Next, begin our wider NMU campaign, for packages with long-pending patches.
  • Eventually aim for build-essential and key-packages.

Debian Policy

  • "Packages should be reproducible" (#844431).
    • we'll need to define reproducibility
    • reproducible in a fairly controlled way / sane environment - not everywhere
    • define requirements / exceptions: same build environment + same options + same path
    • mention .buildinfo files and missing processes+tools
  • Should we have this in policy now? Do we agree that Debian is ready for this, as a "should" that still needs work, with non-compliance being a normal bug for now?

Debian Policy


Packages should build reproducibly, which for the purposes of this
document [#]_ means that given

- a version of a source package unpacked at a given path;
- a set of versions of installed build dependencies;
- a set of environment variable values;
- a build architecture; and
- a host architecture,

repeatedly building the source package for the build architecture on
any machine of the host architecture with those versions of the build
dependencies installed and exactly those environment variable values
set will produce bit-for-bit identical binary packages.

It is recommended that packages produce bit-for-bit identical binaries
even if most environment variables and build paths are varied.  It is
intended for this stricter standard to replace the above when it is
easier for packages to meet it.

.. [#]
   This is Debian's precisification of the `
   definition `_.
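
The wording above can be read as: two builds are only comparable when all five listed inputs match, and a differing output is then a bug. A minimal sketch of that reading, with a hypothetical `BuildInputs` record and made-up package versions and digests:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildInputs:
    """The inputs named in the proposed policy text (field names are assumptions)."""
    source_version: str
    build_path: str
    build_depends: frozenset
    environment: frozenset
    build_arch: str
    host_arch: str


def violates_policy(build_a, build_b):
    """Each build is (BuildInputs, output_sha256); same inputs but different output is a bug."""
    (inputs_a, digest_a), (inputs_b, digest_b) = build_a, build_b
    return inputs_a == inputs_b and digest_a != digest_b


inputs = BuildInputs(
    source_version="hello 1.0-1",
    build_path="/build/hello-1.0",
    build_depends=frozenset({"gcc-6 6.3.0-18"}),
    environment=frozenset({("LANG", "C")}),
    build_arch="amd64",
    host_arch="amd64",
)
print(violates_policy((inputs, "aaaa"), (inputs, "bbbb")))
```

Builds whose inputs differ, for example in the build path, fall outside the "should" and are covered only by the stricter recommendation.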

User interfaces

  • UI/workflow for APT to notify users about unreproducible packages (#863622).
  • sbuild, pbuilder

  • Parse JSON results from other test setups and put them in our DB
  • Store our non-Debian test results in the DB
  • Generate web pages for other distros & projects and also create statistics and graphs

How can I help?

  • Join our lovely team!
  • Check your packages on
    • or$srcpkg
  • Merge patches & push them upstream
  • Fix toolchain issues (Java, TeX, dvips, graphviz, etc.)


B8BF 5413 7B09 D35C F026 FE9D 091A B856 069A AA1C

Thanks to our sponsors: