Archive for the ‘gentoo’ Category

finally … the online publication of my diploma thesis (DT) is here: it can be found at [1], including the source at [2].

i hope that the terminology introduced in this DT (chap 9) will be used. this is of course also true for the concepts engineered in chap 7.

candies can be found here:
 - chap 4: components used in a package manager
 - chap 4.6: different integration levels of a package manager
 - chap 5.13: ways to replicate a system
 - chap 5.15 ff
 - chap 6.2: evopedia deployment summary
 - chap 7 (you might want to read this a few times, it is quite complex)
 - chap 9: here i introduce some new terminology in package management
           (probably a _must read_)

see also the README in [2] for further information.


[1] https://github.com/qknight/Multi-PlatformSoftwarePackageManagement/blob/master/Multi-PlatformSoftwarePackageManagement.pdf

[2] https://github.com/qknight/Multi-PlatformSoftwarePackageManagement


Read Full Post »

source: http://libvirt.org/motivation

i have a gentoo system inside virtualbox, but i wanted to run some ‘long term tests’, so i decided to migrate it to a libvirt machine running ‘fedora core 15 beta’.

problems converting the image

first i tried to migrate the ‘Gentoo 64 (portage).vdi’ directly to a libvirt image, using [2]. but whatever i tried, the resulting image was never bootable, so i decided to use ssh to copy all the files instead.

  1. boot both virtual machines using the ‘grml64-medium_2010.12.iso‘.
  2. assign the ip addresses
    while i was using vboxnet0 in a host-only networking scheme on the virtualbox side, i used a bridge on the other machine, which involved lots of manual configuration: disabling networkmanager (on fedora core, remember?), removing the eth0 configuration (which happens to be called em1) and adding a new configuration for the bridge br0 (using eth0).
  3. finally i could ping from the virtualbox image to the libvirt guest system
  4. i used rsync over ssh: ‘rsync -av /mnt/gentoo -e ssh …‘
    Note: both local gentoo systems were mounted into /mnt/gentoo
  5. but libvirt used an ide host controller (which was very slow)
    therefore i manually removed the ide controller and replaced it with a VirtIO disk using ‘qcow2’ as storage format and ‘Virtio’ as bus.
  6. after all the copying i installed grub (grub-1.99rc1) but the original system had a grub1 config!
    the conversion was not simple!

The grub pitfall

virtualbox image using grub1:

cat /boot/grub/menu.lst

default 0
timeout 30

title Gentoo Linux 2.6.24-r7
root (hd0,0)
kernel /boot/kernel-genkernel-x86_64-2.6.36-gentoo-r5  root=/dev/ram0 real_root=/dev/sda1
initrd /boot/initramfs-genkernel-x86_64-2.6.36-gentoo-r5

in comparison: ‘libvirt guest’ using grub2

cat /boot/grub/grub.cfg

set default=0
set timeout=30

menuentry "Gentoo Linux 2.6.36-gentoo-r5" {
        insmod part_msdos
        insmod ext2
        set root=(hd0,msdos1)
        linux /boot/kernel-genkernel-x86_64-2.6.36-gentoo-r5 root=/dev/ram0 real_root=/dev/vda1
        initrd /boot/initramfs-genkernel-x86_64-2.6.36-gentoo-r5
}

Note: the differences: grub2 needs the insmod lines, uses ‘set root=(hd0,msdos1)’ instead of ‘root (hd0,0)’, and the real_root device changes from /dev/sda1 to /dev/vda1.

Note: take care of the different config filename as well (menu.lst vs. grub.cfg)!

anyway: in the grml shell you can install grub into /dev/vda using:

grub-install --root-directory=/mnt/gentoo /dev/vda

the kernel configuration pitfall

a libvirt guest must be aware of /dev/vda (virtio), but my genkernel kernel was not. i also lacked ext4 support. so it is a good idea to include this in the kernel directly (i had it included as modules but that did not work well).

cat /etc/kernels/kernel-config-x86_64-2.6.36-gentoo-r5 | grep -i virt | grep -v "^#"


just use ‘genkernel’ to build the new kernel (and don’t forget the ext4 support as i did).
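for reference, these are the options i believe need to be built in (=y rather than =m); the option names are taken from the mainline kernel config, double-check them against your kernel version:

```
CONFIG_VIRTIO=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_BLK=y
CONFIG_VIRTIO_NET=y
CONFIG_EXT4_FS=y
```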

fedora core network problems

i basically used [3] to make it work. the benefit is that em1 is no longer used directly; the system now accesses the internet through br0.

PRO: the libvirt guests get their own ‘mac address’ and are thus separated from seeing each other’s traffic.

fedora core yum problems

i also tried to install virtualbox, following the instructions found on virtualbox.org, but soon i hit the problem that the virtualbox kernel modules won’t build without ‘kernel-devel’. after installing the kernel-devel package using ‘yum install kernel-devel’ there was a version mismatch between the running kernel and the kernel-devel headers.


libvirt and the ‘virtual machine manager’ are very nice:

  • i like that it is so easy to start a virtual machine when the host machine boots.
  • i also like the ‘virtual machine manager’ as it shows cpu/disk io/network io nicely
    (but that is not limited to libvirt virtualizations).
  • fedora core 15 beta was running quite nicely (except that it crashed while i was writing this article)
    so i can at least say: it ran for 6 hours straight without a crash ;P


[1] http://libvirt.org/

[2] http://blog.loxal.net/2009/04/how-to-convert-vdi-to-vmdk-converting.html

[3] http://www.howtoforge.com/virtualization-with-kvm-on-a-fedora-11-server

Read Full Post »

shinken 0.6 on gentoo

gentoo linux logo (copied from commons.wikipedia.org)



recently i had problems with the deployment of an experimental ‘shinken‘ [1], a monitoring tool, on gentoo. most of the installation was ‘straightforward’, but i had a problem with livestatus in combination with ‘thruk‘ [2].

Note: i basically used the shinken guide [3] for fast and easy testing of shinken

  • gentoo linux in virtualbox
  • shinken 0.6
  • thruk-1.0.3 (using the perl webserver; NOT using apache2)


this list contains packages which are not all essential (like the apache and mysql stuff), but it’s basically all the packages i’ve installed to make it work, so just pick what you need:


problem with livestatus

i missed a line in the README of shinken and therefore the livestatus service on port 50000 was not enabled. but there was no ‘good’ error message mentioning what actually went wrong.

thruk reported (in the webinterface):

No Backend available

shinken reported (/var/lib/shinken/brokerd.log):

[broker-1] Warning : the module type livestatus for Livestatus was not found in modules!

But that was fixed after i installed:

  • dev-db/sqlite-3.7.5
  • dev-python/pysqlite-2.6.3

and restarted shinken.


look into:

  • /etc/shinken/  (config files)
  • /var/lib/shinken (log files)

optionally one can restart a single module of the shinken service with debugging:

/etc/init.d/shinken-broker -d restart



[1] http://www.shinken-monitoring.org/

[2] http://www.thruk.org/

[3] http://www.shinken-monitoring.org/wiki/shinken_10min_start

Read Full Post »

a new workstation III

nvidia and the proprietary driver

gentoo linux logo (copied from commons.wikipedia.org)


well, how to say it politely: proprietary driver, go to hell! lately i’ve been using =x11-drivers/nvidia-drivers-260.19.29 with:

  • NX (No eXecute bit)
  • VT (virtualization bit)

enabled/disabled in various combinations, but the laptop was crashing all over the place, including:

  • rendering a webpage (sometimes)
  • playing flash videos (youtube)
  • rendering a pdf (using kile)
  • resume cycle (after pm-suspend)

i probably had 2-6 crashes per day, at least. but on the bright side i did not have much file loss because i had migrated from xfs to ext3!

i’ve had issues with this since i bought the laptop!

tracking down the problem:

i’ve been experimenting with ‘/usr/src/linux/Documentation/networking/netconsole.txt‘, which is a very important tool when tracking down kernel related issues. before KMS + nouveau + gallium3d, a kernel crash on linux would not ‘bluescreen’ with the old driver architecture; KMS makes this possible now.

back then i was using the nvidia.ko driver, so i created a little setup to track down the problem:

on the remote machine

setup here is pretty easy, inside a screen console, issue this command:

nc -l -u -p 6666

using screen makes it easy to retrieve the error log later for saving it to a file.

on the laptop

start this script on every new boot as root (<laptop-ip>, <remote-ip> and <remote-mac> are placeholders for your own addresses; the mac address part may be omitted):

ip a del <laptop-ip>/24 dev eth0
ip a add <laptop-ip>/24 brd + dev eth0

rmmod netconsole
sleep 1
dmesg -n 8

modprobe netconsole netconsole=4444@<laptop-ip>/eth0,6666@<remote-ip>/<remote-mac>

the log

right after you start the script on the laptop this (or similar) should appear:

netconsole: network logging started
netconsole: local port 4444
netconsole: local IP
netconsole: interface ‘eth0’
netconsole: remote port 6666
netconsole: remote IP

the outcome

the problem for me was that i wasn’t able to record any nvidia related bug, as it was probably a hardlock. i did not try to use the ‘sysrq key’ but it would have been a good idea. the logging works for some ‘suspend/resume issues’.

Note: use an ethernet device, it does not work with wireless lan devices!

the solution to the nvidia.ko problem

since yesterday i’m using =x11-drivers/xf86-video-nouveau-0.0.16_pre20101130 for 2d (3d does not work yet). a friend with the same laptop reported that on his debian machine 3d support is working as well. so far i only had one crash after 7x suspend/resume cycling. this is very good, as it makes working with that computer possible now. as for 3d: running glxgears/glxinfo from =x11-apps/mesa-progs-8.0.1 only produces a black window which seems to do nothing. at least 3d does not ‘segfault’ the application using it anymore, as it did yesterday before i did ‘emerge -uDN world’.

the first ‘blue screen’ in linux

‘blue screens’ on linux are actually ‘black screens’ and it is finally working, which is a very good thing. still i did not understand why my laptop was crashing after the 7th resume cycle. i have to learn how to interpret such a trace ;-). at least i got a backtrace, in contrast to the netconsole approach (using the nvidia.ko driver).

nouveau driver status

last time i tried to install the nouveau driver on my gentoo based laptop i had many troubles. this time i only emerged a few packages such as x11-drivers/xf86-video-nouveau and it was working after i blacklisted the nvidia module and adapted my xorg.conf to use nouveau (by simply deleting it).

2d performance is really good and power saving seems to be implemented now, as the fan gets very silent. not quite as silent as with the nvidia.ko driver, but much much better than the last time i tried nouveau.

there are no 2d drawing artefacts anymore and scrolling in the browser is very fast and feels good.

i’m using a setup where both NX and VT are enabled and working. all my virtual machines using virtualbox are running.

suspend/resume does not work 100% well as the screen brightness is setup to the minimum level after resume and it can not be changed so far.


i think that x11-drivers/nvidia-drivers finally bites the dust, for several reasons:

  • nvidia.ko seems to be unmaintained for the 01:00.0 VGA compatible controller: nVidia Corporation G96M [Quadro FX 770M] (rev a1) card at least
  • proprietary driver quality was never very good for the G96M Quadro FX 770M on the hp 8530w, as i was blogging quite a lot about significant issues
  • even though it was never very good, support seems to have degraded lately
  • KMS+nouveau+gallium3d is the way to go, what works so far looks promising

thanks to all developers of nouveau for making this possible! i owe you one!!

Edit: some updates:

  • there are some 2d drawing issues where icons look like defective frames of an incomplete mpeg video download
  • some 2d drawing issues where some regions are drawn wrong

i guess both issues are related to caching being done wrongly.

Read Full Post »


‘binary deployment‘ seems to be a good and fast solution nowadays (i’m talking about open source here). but what proof do i have that the source code was not modified before being compiled and signed (say by downstream::debian)?

Note: you can replace debian with any other distribution doing ‘binary deployment’ (it is just an example).

how is binary deployment actually done

this is very much distribution dependent. in general this workflow is used:

  1. download upstream source
  2. arrange a build environment
  3. apply ‘downstream’ patches
  4. install into DESTDIR/PREFIX and create an image from that
  5. finally distribute that image

(1) can be secured by signatures using cryptographic hashes and a sig file. (2) is complicated, as a pure build environment CAN NOT be guaranteed by most distributions; a notable exception is nix, where the build chain and all packages are pure (pure means that no mutual effects between two or more installed components occur). (3) as downstream patches are usually very small, they could be checked manually for security related issues.
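a minimal sketch of how step (1) is typically secured; the file names here are made up for illustration, and a real distribution would additionally check a gpg signature on the hash file (e.g. ‘gpg --verify foo-1.0.tar.gz.sig foo-1.0.tar.gz’):

```shell
cd /tmp
# stand-in for the downloaded upstream tarball
printf 'upstream source' > foo-1.0.tar.gz
# upstream (or downstream) publishes a hash file next to the tarball
sha256sum foo-1.0.tar.gz > foo-1.0.tar.gz.sha256
# the user verifies the download against it
sha256sum -c foo-1.0.tar.gz.sha256   # prints: foo-1.0.tar.gz: OK
```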

security problems using binary deployment

downstream could simply add another ‘evil’ patch in step (3), and once the package got created, the source patch could be removed to hide the modification. this has happened already, see [2]. if a user wants to prevent such situations, there is a limited set of options. he could:

  • choose to only do  ‘source deployment’ (like in gentoo)
  • setup his own build environment (debian) which would transform the ‘binary deployment’ into ‘source deployment’
  • use tools like SELinux and AppArmor (but these tools work best on programs you can’t check, such as skype, or on open source tools where you assume ‘poor programming practice’ in regards to security)

.. another option

i’ve been playing with nix lately, and as nix is a ‘purely functional package manager’, step (2) effects are minimized as components don’t interfere. as a result: if you clone the original build chain, you can expect the same outcome from the same input. so i experimented with two components:

  • vim
  • apache-httpd

the results are very promising as:

  • both projects have a 1:1 file mapping after reinstallation (that means reinstalling would result in the same files being created for each project)
  • only the binaries had differences, that is: both tools contain a timestamp which is of course different
  • DSO (dynamic shared objects) as modules/mod_cgi.so were not timestamped contrary to my expectation

Edit: it turns out that there was already some research on this topic, see [3] page 30. I quote it and highlight some passages:

To ascertain how well these measures work in preventing impurities in NixOS, we performed two builds of the Nixpkgs collection on two different NixOS machines. This consisted of building 485 non-fetchurl derivations. The output consisted of 165927 files and directories. Of these, there was only one file name that differed between the two builds, namely in mono-1.1.4: a directory gac/IBM.Data.DB2/1.0.3008.37160__7c307b91aa13d208 versus 1.0.3008.40191__7c307b91aa13d208. The differing number is likely derived from the system time. We then compared the contents of each file. There were differences in 5059 files, or 3.4% of all regular files. We inspected the nature of the differences: almost all were caused by timestamps being encoded in files, such as in Unix object file archives or compiled Python code. 1048 compiled Emacs Lisp files differed because the hostname of the build machines were stored in the output. Filtering out these and other file types that are known to contain timestamps, we were left with 644 files, or 0.4%. However, most of these differences (mostly in executables and libraries) are likely to be due to timestamps as well (such as a build process inserting the build time in a C string). This hypothesis is strongly supported by the fact that of those, only 42 (or 0.03%) had different file sizes. None of these content differences have ever caused an observable difference in behaviour.

how did i do the checks

i used a prefix installation of nix on gentoo. i set the store path to something like ‘~/mynix/store’ so that every program needs to be recompiled (a nix limitation/feature). afterwards i did:

nix-env -i apache-httpd

ls store | grep apache-httpd

cp -R store/gyp2arhqcglbq6iq1hndclljs7v9n30k-apache-httpd-2.2.17/ apache1

nix-env -e apache-httpd

nix-env --delete-generations old

nix-store --delete store/gyp2arhqcglbq6iq1hndclljs7v9n30k-apache-httpd-2.2.17/

and then did it all again, but copied to apache2/ instead. next: start comparing.
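the comparison itself can be as simple as a recursive diff; the files below are stand-ins for the two copied store paths, in the real check you would run ‘diff -rq apache1 apache2’:

```shell
# stand-ins for the two copies of the build output
mkdir -p /tmp/apache1/modules /tmp/apache2/modules
printf 'identical dso' > /tmp/apache1/modules/mod_cgi.so
printf 'identical dso' > /tmp/apache2/modules/mod_cgi.so
printf 'built 10:00' > /tmp/apache1/httpd
printf 'built 11:00' > /tmp/apache2/httpd
# -r recurses, -q prints only the names of differing files;
# diff exits non-zero when differences are found, hence the || true
diff -rq /tmp/apache1 /tmp/apache2 || true
# prints: Files /tmp/apache1/httpd and /tmp/apache2/httpd differ
```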

possible solution to the timestamp problem

as it seems that the timestamps are the only problem, here are some thoughts on how to overcome this:

  • write a compare utility which ignores timestamps (of course one has to find such regions first)
  • always freeze the clock when compiling by setting it to a fixed time: this could be done by using LD_PRELOAD to map an indirection layer over the libc functions used for time/date things. remapping library calls is nothing new (‘trickle is a portable lightweight userspace bandwidth shaper’ [1] uses it).
    NOTE: this might have unknown side effects and needs to be evaluated, as a fixed time will interfere with:

    1. a build environment measuring build-time using the time command
    2. resetting the clock might result in ‘clock skew detected’ messages and stop the build, therefore all files need to be ‘touched’ in order to make that work
  • adding a PACKAGE_MANAGER_BUILD_TIME variable to the build environment. this implies one would either have to alter the buildchain (gcc timestamps) or one would have to patch upstream’s source, depending on where that timestamp is applied. but the effect would be that the same timestamp is used, resulting in a 1:1 match


  • i would really love to experiment further on this topic but i don’t have the time right now to do so. i hope that someone else might take over.
  • i also could imagine a ‘chain of trust’ using gpg signatures. this way we could have several automated build systems monitoring the sanity of the builds.
  • i also think that the ‘possible solutions’ might be of limited use for distributions like debian (i think debian has some kind of build purity but i can’t find the docs right now) and alike.


[1] http://monkey.org/~marius/pages/?page=trickle

[2] https://www.redhat.com/archives/fedora-announce-list/2008-August/msg00012.html

[3] http://www.st.ewi.tudelft.nl/~dolstra/pubs/nixos-jfp-final.pdf

Read Full Post »


this posting is about how to set up a nix prefix installation on gentoo linux. if you do not have permission to install software on your server, you can install a package manager in your home directory.

prefix distros:

  • [1] gentoo prefix (using portage)
  • [2] nix prefix (using nixpkgs)
  • source deployment (done manually)

gentoo prefix

+ pros:

  • contains many packages
  • great documentation
  • works in prefix on: linux|mac|cygwin/interix
  • security related tools available
  • Xorg stuff as qt programs will work

– cons:

  • time consuming installation
  • complicated
  • linux prefix setup uses the sun solaris guide, which is …. _strange_ at first

nix prefix

+ pros:

  • binary deployment (when not altering: --with-store-dir OR --localstatedir)
    this is only possible if root assists the installation
  • assisted binary deployment (when using self-made channel & a build robot as hydra)
    i have not tested this but it should be possible
  • it is very easy to experiment with several different versions of a single program
  • Xorg stuff as qt programs will work

– cons:

  • because you need to change the store path, it is mainly source deployment at first
  • no security tools
  • compared to other linux distros a very small subset of packages available (as in ebuilds)

nix prefix – setup

download the software from [2], then follow this guide:

tar xf nix-0.16.tar

cd nix-0.16


./configure --prefix=~/nix --localstatedir=~/nix/state --with-store-dir=~/nix/store


make install

NOTE: --localstatedir is not visible when doing ./configure --help!

nix prefix – how to use

next you have to add it to your PATH, do:


export PATH=~/.nix-profile/bin:$PATH

NOTE: you have to do this every time you want to use your prefixed nix.

this will alter your path so that programs you install using ‘nix-env’ are used, as in:

nix-env -i wget

this should download about 10-40 software components such as gcc, binutils, libraries and finally wget. afterwards do:

which wget

which should report: ~/.nix-profile/bin/wget


[1] http://www.gentoo.org/proj/en/gentoo-alt/prefix/

[2] http://nixos.org/nix/

Read Full Post »


gentoo linux logo (copied from commons.wikipedia.org)


this is the essence of the recent findings when doing server updates (on my two gentoo boxes).

in general this is about: ‘emerge -uDN world‘ and ‘emerge --depclean‘.

system 1:

  • i had serious problems with a failed glibcxx/gcc update.
    emerging qt-core failed with: /var/tmp/portage/x11-libs/qt-core-4.6.3/work/qt-everywhere-opensource-src-4.6.3/bin/qmake: /lib/libstdc++.so.6: version `GLIBCXX_3.4.11′ not found (required by /var/tmp/portage/x11-libs/qt-core-4.6.3/work/qt-everywhere-opensource-src-4.6.3/bin/qmake)
    FIX: the solution to this problem was quite adventurous: i applied a temporary hack from [1], using libssp_simple.so and ld.so.preload. i was then able to recompile system (‘emerge -e system’) and afterwards i could remove the preloaded library again.

system 2:

this update went pretty well compared to system 1 but it also failed horribly:

  • after the ‘emerge -uDN world‘ update the system wasn’t able to start the /etc/init.d/net.eth0 service on reboot.
    this is because i used etc-update improperly and then /etc/modules.autoload.d/kernel-2.6 did not include the one kernel module i needed to be loaded.
    FIX: to avoid further module issues i decided to switch to genkernel using ‘make oldconfig’
  • as a result of using genkernel there were no /dev/hda or /dev/sda device nodes.
    i was able to add them manually using mknod, but after the reboot they were gone.
    FIX: see [2], the kernel configuration setting CONFIG_SYSFS_DEPRECATED_V2=y needs to be disabled (=n); after a genkernel recompile & reboot it worked!


two updates made two systems fail. that is why i hate updating in general. this isn’t a gentoo specific issue but a more general issue with the nature of updates. i always do my security updates, but from time to time it is a good thing to do complete system updates, because services seem to degrade when they leave the ‘time window’* they were designed for.

*the time window of a software (as i define it) is a consequence of upstream/downstream using certain tools to build software. as the development cycle continues, upstream uses more recent libraries/software, and more recent components (dependencies) are pulled into the system. as a consequence: it is a good thing to use old programs with old libraries and recent programs with recent libraries. most often a mixture of both, old and new, leads to service degradation.


[1] http://bugs.gentoo.org/125988

[2] http://forums.gentoo.org/viewtopic-t-832584.html

Read Full Post »
