Installing Arch Linux with System Encryption Enabled

I still use Arch Linux because it is pretty much the only rolling-release distro that offers binary (pre-compiled) packages by default *and* only installs bare minimum packages required to get a working Linux system (no window manager, no desktop environment). I was re-installing Arch Linux on my desktop last week to set up system-wide encryption and decided to record all of the commands I used to get the bare minimum working. I am not aware of any source out there that actually tells you all of the console commands you need to run when you first install Arch Linux in plaintext format so I decided to write this post.

The system I am setting up is simple: it has only 1 hard drive. This drive has 1 boot partition 100MB big, and the rest is a single LVM partition that will eventually contain the / (aka “root”), /home, and swap partitions. This single LVM partition will be encrypted, so that when the system boots, it would require a password to unlock the /, /home, and swap partitions to actually load up the Kernel and userland.

If you are wondering why the /home partition is mounted on /mnt/home in some of the commands, this is because the Arch Linux live CD environment will load up a temporary Linux system onto your RAM, and we use the /mnt directory (again all files/folders are temporarily in your RAM) to mount the hard drive partitions. You can always use a live CD anywhere to load up a barebone Linux system to your RAM and mount the hard drives on your computer for inspection, and in such a case you will again use the /mnt directory to mount your hard drives here.

If you don’t know how to use vi (tsk, tsk), then just substitute that command with “nano” or something. Here are the commands, with comments, in plaintext format:

    # Put in the Arch Linux live CD and run the following commands...

    # Set up 2 partitions; the first one is the boot 100MB partition (be sure
    # to toggle the BOOT flag on it), and the second will be the LVM partition.
    # Be sure to select 8E for the LVM partition! You can use
    # /dev/disk/by-uuid/XXXX for more fine-grained control; use "blkid -o list"
    # to find out which drives are given which UUIDs.

    cfdisk /dev/sda

    # Now set up encryption on the LVM partition (the 2nd partition, so it's
    # /dev/sda2). It is recommended to use aes-xts-plain64 if the partition is
    # larger than 2 TB, I think.

    cryptsetup -c aes-xts-plain -y -s 512 luksFormat /dev/sda2

    # Decrypt the encrypted partition with LUKS. We need to decrypt it first
    # before setting up LVM volumes and filesystems on it. This will create
    # an entry under /dev/mapper/luks.

    cryptsetup luksOpen /dev/sda2 luks

    # Set up LVM. We call our volume group "vg0". To be honest I am not sure if
    # the "--contiguous y" option for swap matters because we are encrypting it
    # (doesn't block encryption encrypt data randomly on the disk?), but that's
    # what I found online and that's what I'll use here. The "+100%FREE" option
    # is a very handy option that tells LVM to fill up all remaining free space,
    # in this case, for the "home" logical volume.
    # Read more about LVM on your favorite online wiki.

    pvcreate /dev/mapper/luks
    vgcreate vg0 /dev/mapper/luks
    lvcreate --size 40G vg0 --name root
    lvcreate --size 2G --contiguous y vg0 --name swap
    lvcreate -l +100%FREE vg0 --name home

    # Initialize filesystems. Btrfs is still experimental, so I choose ext4
    # here. The vg0-root, vg0-home, and vg0-swap volumes should
    # appear under /dev/mapper as you create them with the lvcreate command.

    mkfs.ext4 /dev/mapper/vg0-root
    mkfs.ext4 /dev/mapper/vg0-home
    mkswap /dev/mapper/vg0-swap

    # Mount the logical volumes. Why? Because we need to install Arch Linux onto
    # it! You probably don't need to mount the "home" volume but it couldn't
    # hurt. Besides, it's nice to test that all volumes can be mounted OK anyway.

    mount /dev/mapper/vg0-root /mnt # /mnt is our system's "/" root   directory
    mkdir /mnt/home
    mount /dev/mapper/vg0-home /mnt/home
    swapon /dev/mapper/vg0-swap

    # Set up package download mirrors. The kernel mirror is generally a good one
    # if you live in the US. Google "arch linux mirror status" to find the
    # fastest mirrors.

    vi /etc/pacman.d/mirrorlist

    # Download and install the core minimum packages to get a working Linux
    # terminal (console) environment. Ah, simplicity!

    pacstrap -i /mnt base base-devel

    # Generate and inspect the filesystem tables needed when the system first
    # starts. The file should contain four "partitions" (I put them in quotes
    # because remember, we are using only 2 real partitions; 1st one is /boot and
    # the 2nd one is our LVM containing /, /home, and swap).

    genfstab -U -p /mnt >> /mnt/etc/fstab
    vi /mnt/etc/fstab

    # Make the system pretend that /mnt is /. This is called "chrooting". We need
    # to do this because some of the commands like locale-gen, systemctl, etc.
    # read configuration files from /etc, for example, and right now our /etc is
    # the /etc used by the Arch Linux live CD, not our newly-installed system's
    # root directory, /mnt.

    arch-chroot /mnt /bin/bash

    # Choose and generate locale; programs supporting internationalization (aka
    # "i18n") will use the language you pick to generate the proper text used in
    # menus, error messages, etc. Just choose "en_US.UTF-8 UTF-8" if you live in
    # the US.

    vi /etc/locale.gen
    locale-gen
    echo LANG=en_US.UTF-8 > /etc/locale.conf

    # Choose default font used by the virtual console (the terminal screen you
    # will see when you first boot up your new Arch Linux system).

    setfont Lat2-Terminus16
    echo FONT=Lat2-Terminus16 > /etc/vconsole.conf

    # Choose timezone. Here, we tell hwclock that we want the hardware clock on
    # the motherboard to be set to UTC, which will then be interpreted as UTC
    # time by the kernel and re-adjusted to the timezone you chose when you
    # actually use the system. This is nice because UTC is the "master" time the
    # world uses anyway, so it makes sense to tell your motherboard to use it.

    ln -s /usr/share/zoneinfo/America/Los_Angeles /etc/localtime
    hwclock --systohc --utc

    # Choose your computer's name. Keep it simple, without spaces. I use "k0".

    echo MYHOSTNAME > /etc/hostname

    # Enable DHCPCD... aka "acquire dynamic ip address from your router/switch".
    # Be sure to choose the right eth0 or eth1 (eth2, if you have 3 ethernet
    # ports) port. If you choose the wrong "eth" number, just go to
    # /etc/systemd/system/multi-user.target.wants/ and rename the symlink; e.g.,
    # rename /etc/systemd/system/multi-user.target.wants/dhcpcd@eth0.service to
    # /etc/systemd/system/multi-user.target.wants/dhcpcd@eth1.service.

    systemctl enable dhcpcd@eth0.service

    # Disable unused package repos. You will want to disable the [testing] repo
    # unless you want to test unstable packages and engage in the package
    # debugging/development process. For Haskellers, be sure to add the
    # [haskell-core] repo (see
    # https://wiki.archlinux.org/index.php/Haskell_Package_Guidelines).

    vi /etc/pacman.conf

    # Set up the root user ("administrator" for you Windows people) password.

    passwd

    # Install zsh, because you will like it better any other shell out there.

    pacman -S zsh

    # Add your regular user, and set up your password. This is your normal
    # username. We use the "-s /bin/zsh" option to tell useradd that we want to
    # log in with zsh.

    # useradd -m -g users -G wheel,storage,power -s /bin/zsh MYUSERNAME
    # passwd MYUSERNAME

    # VERY IMPORTANT: Add in the filesystem type that /root partition (LVM) is
    # using (for this tutorial, it is "ext4") into the MODULES variable and also
    # add 'encrypt' and 'lvm2' into HOOKS.

    vi /etc/mkinitcpio.conf

    # Re-generate the linux image to take into account the LVM and encrypt
    # flags we added into /etc/mkinitcpio.conf. I say "re-generate" because this
    # is our second time doing this (the first time was when we installed the
    # linux package with the pacstrap command above).

    mkinitcpio -p linux

    # Install boot loader. I like SYSLINUX because of the saner/easier-looking
    # boot configuration file, compared to GRUB 2. Be sure to add in
	#     "APPEND cryptdevice=/dev/disk/by-uuid/xxxxxxxxxx:luks root=/dev/mapper/vg0-root resume=/dev/mapper/vg0-swap ro"
    # under the "LABEL arch" entries, so that SYSLINUX tells the kernel to look
    # for the encrypted LVM partition. Otherwise, your system won't boot! Don't
    # forget to actually look up the UUID of your LVM partition (/dev/sda2 in
    # this tutorial). You don't need to tell the kernel about your vg0-home
    # volume because that will get mounted by /etc/fstab when it is read later in
    # the boot process.

    pacman -S syslinux
    syslinux-install_update -i -a -m
    vi /boot/syslinux/syslinux.cfg

    # Exit chroot environment.

    exit

    # Unmount volumes, and reboot. Remove your Arch Linux live CD when your
    # computer powers on again.

    umount /mnt/{boot,home}
    swapoff
    reboot

PROTIP: Use the ALT+F2, ALT+F3, etc. shortcuts to log into and switch between different TTY screens; this is very useful when you are unsure about some of the commands used above and want to look at their manpages for more detailed explanations.

The process of installing Arch Linux is long, but arguably extremely clean and clutter-free. *EVERY* command counts, and matters. And you also get system-level encryption (everything except the /boot partition is encrypted). I would go ahead and install gvim, emacs, xmonad, and all the other little programs I use daily, but that can wait until the next reboot. And because it’s Linux, you don’t have to worry about restarting the whole system when you install new programs, so you can actually install 99% of everything you need within one session.

How to Encrypt Your USB Flash Drives

Did you know that your modern Linux kernel comes with a built-in encryption framework? I am talking about dm-crypt (device-mapper crypt) and the user-friendly layer on top of it, called LUKS (Linux Unified Key Setup). I just encrypted all of my USB flash drives two weeks ago using the dm-crypt + LUKS method and I am very happy with the results.

The process itself is very simple.

# 1. Find the correct device.

lsblk

# 2. Wipe the device with random data. I prefer to target the disk by its UUID
# because using the /dev/sdX convention is not very reliable (the letters can
# change between boots/hotmounts). NOTE: You might be interested in
# http://frandom.sourceforge.net/ if your device is over 16 GiB or so, because
# using /dev/urandom can be very slow. If using Arch Linux, you can get it from
# the AUR: https://aur.archlinux.org/packages/frandom/.

dd if=/dev/urandom of=/dev/disk/by-uuid/XXX bs=4096

# 3. Create the partition on the device.

cfdisk /dev/disk/by-uuid/XXX

# 4. Encrypt the partition and make it LUKS-compatible. See the manpage for
# cryptsetup(8).
#   -c: cipher type to use
#   -y: LUKS will ask you to input the passphrase; using -y will ask you twice
#       and complain if the two do not match.
#   -s: Key size in bits; the larget the merrier, but limited by the cipher/mode used.

cryptsetup -c aes-xts-plain -y -s 512 luksFormat /dev/disk/by-uuid/XXX

# 5. Open the partition with LUKS.

cryptsetup luksOpen /dev/disk/by-uuid/XXX mycrypteddev

# The partition is now available from /dev/mapper/mycrypteddev as a "regular"
# partition, since LUKS is now handling all block device encryption between the
# user and the device.

# 6. Set up a filesystem on the partition.

mkfs.ext4 /dev/mapper/mycrypteddev

# 7. Close the partition with LUKS.

cryptsetup luksClose /dev/mapper/mycrypteddev

# Encryption setup complete! Now every time you want to access the partition,
# you must first open it with LUKS and then mount it. Then when you're done, do
# the reverse: unmount and close it with LUKS.

# To mount and open with LUKS:

cryptsetup luksOpen /dev/disk/by-uuid/XXX mycrypteddev
mount -t ext4 /dev/mapper/mycrypteddev /mnt/mount_point

# To unmount and close with LUKS:

umount /mnt/mount_point
cryptsetup luksClose mycrypteddev

The mount/open and unmount/close steps necessary for using the device is laborious. That’s why you should write a bash script to run them. I’ve written the following bash script called cmount.sh to access my 3 USB drives this way:

#!/bin/zsh
# LICENSE: PUBLIC DOMAIN
# mount/unmount encrypted flash drives

mp=$3
uuid=""

case $2 in
    "0")
        uuid="11e102cd-dea1-46a8-ae9b-b3f74b536e64" # my red USB drive
        ;;
    "1")
        uuid="cf169437-b937-4a39-86cb-7ca82bd9fe94" # my green one
        ;;
    "2")
        uuid="57a0b7d5-d2a6-47e0-a0e3-adf69501d0cd" # my blue one
        ;;
    *)
        ;;
esac

if [[ $uuid == "" ]]; then
    echo "No predefined device specified."
    exit 0
fi

case $1 in
    "m")
        echo "Authorizing encrypted partition /dev/mapper/$mp..."
        sudo cryptsetup luksOpen /dev/disk/by-uuid/$uuid $mp
        echo -n "Mounting partition on /mnt/$mp..."
        sudo mount -t ext4 /dev/mapper/$mp /mnt/$mp && echo "done."
        ;;
    "u")
        echo -n "Unmounting /mnt/$mp..."
        sudo umount /mnt/$mp && echo "done."
        echo -n "Closing encrypted partition /dev/mapper/$mp..."
        sudo cryptsetup luksClose $mp && echo "done."
        ;;
    *)
        ;;
esac

To mount the green USB to /mnt/ef0 (“ef0” is just an arbitrary folder name):

./cmount.sh m 1 ef0

Then to unmount:

./cmount.sh u 1 ef0

Simple, eh? Go forth and encrypt all of your USB drives, so that when they get lost, they can’t be read by curious strangers. You can use the above steps to create and encrypt multiple partitions in the same device, or to only encrypt one partition while leaving other partitions unencrypted (i.e., steps 4 through 7 are partition-specific). The choice is yours. I prefer partition-level (aka “block device”) encryption over file/folder encryption because I don’t have to mentally think every time “hey, do I want to encrypt this?” for every file/folder I create.

If you want to look into encrypting your hard drives and swap partitions, read through this disk encryption page, and particularly, this section. There are many “levels” of encryption, and you should consider the many options available to you.

Generating Random Numbers without Modulo Bias

TLDR: If (RAND_MAX + 1) % n != 0, you have modulo bias. Fix it!

Let’s say you need a random number from 0 to 2 (0, 1 or 2). If you do this:

/* LISTING 1 */
int x;
int n = 3;
x = rand() % n;
return x;

your code is broken, thanks to modulo bias. You will always get modulo bias whenever RAND_MAX + 1 of your rand() function is not evenly divisible by your modulus n.

For example, let’s say rand()‘s RAND_MAX is 4, so there are 5 different possible values (0, 1, 2, 3, 4). Now rand() will return each value 20% of the time, assuming that rand() has very good distribution from 0 to RAND_MAX (all modern rand() functions should have very good distribution). But since we take the modulo by 3, we get 0 40% of the time (0 and 3), 1 40% of the time (1 and 4), and 2 20% of the time (2 by itself). Visually, it is as follows:

The following code is correct:

/* LISTING 2 */
int x;
int n = 3;
int rand_limit;
int rand_excess;
rand_excess = (RAND_MAX + 1) % n;
rand_limit = RAND_MAX - rand_excess;
while (x = rand() > rand_limit) {};
return x % n;

So, if RAND_MAX is 4 as in the strongvious example, rand_excess is (4 + 1) % 3, or 5 % 3, or 2. Then rand_limit becomes 4 – 2, or 2. The while statement then throws out the values 3 and 4 (i.e., x is only allowed to be 0, 1, or 2). Then, we return the modulo expression x % n. The expression x % n is redundant here only because RAND_MAX is very low for the sake of our example.

The only problem with the above code is if RAND_MAX is already the maximum value allowed (e.g., if rand() returns an unsigned 64-bit integer within the range 0 through 2^64-1). Then, (RAND_MAX + 1) would wrap back around to 0, and rand_excess would be 0. To avoid this, you can use the alternate expression

/* LISTING 3 */
...
rand_excess = (RAND_MAX % n) + 1;
...

Alas there still remains a remote problem with this workaround: there is a remote chance that rand_excess will be equal to n, needlessly reducing the size of rand_limit. For example, say RAND_MAX is 8 and n is 3. Then (8 % 3) + 1 is (2 + 1), or 3. Now rand_limit is 8 – 3, or 5. But RAND_MAX of 8 is already valid because there are 9 possible values 0 through 8, and there is no modulo bias to begin with (since 9 % 3 = 0)! Luckily, this scenario should be pretty rare given that rand() is a very large number.

But remember: whenever you generate a random number, make sure to remove modulo bias! Use LISTING 2, and maybe LISTING 3 if you want. Lastly, if you know that n will not change, and you also know the size of RAND_MAX, it might be wise to simplify LISTING 2 mathematically (using constants, not expressions) and save some CPU cycles.

Why I Buy Moleskine Journals

I keep a diary and also a programming journal. Because I want to keep all of my ideas down for future reference, I prefer to use nice journals for them (no spiral notebooks or ones with perforated pages). For these purposes, I turn to the Moleskine brand of journals — more specifically, their Folio line of A4-sized notebooks.

Yes, the Moleskine line of journals are quite overpriced. Yes, they are not made in Europe as the name implies (they are made in China). However, Moleskine is quite possibly the only company in the world that makes journals that fit the following criteria:

  • Sewn binding (pages can lie flat for sane writing)
  • A4 or larger size
  • Blank/lined/graph pages
  • Hardcover
  • Regular paper (not sketch paper)

I think some of Clairefontaine’s journals come close, but all of their A4-sized journals are lined, I believe. And plus, as great as their paper quality is (it really is fantastic to write on), it is way too bright for my taste.

By the way, if you are a programmer I strongly recommend that you keep your notes in a physical notebook. If you are learning a new programming language or starting to sketch the design goals of a new programming project, keeping a notebook for it is quite possibly one of the best decisions you will make.

Oh, by the way, you should purchase a good pen to go along with your journal(s). Don’t write with horrible, cheap pens that are difficult to write with.

Random Haskell Ephiphany: Using the liftM function

I’ve been tinkering with some monadic code and it just dawned on me that the often-used liftM function is just a shortcut (less typing) for a common use scenario which arises whenever you need to use a pure function along with an impure one.

For example, consider the following pseudocode:

randoms :: IO [Int]

Calling randoms generates a list of random Int values. Suppose we want just the first value from this list, so we naively write the following:

main :: IO ()
main = do
    rs <- randoms
    let r = head rs
    ...

But then you realize that you will only use just a single random Int, and that the variable rs is useless. So, you remember what return in Haskell means and then write this instead:

main :: IO ()
main = do
    r <- return . head =<< randoms
    ...

But this looks a bit awkward. This is where liftM comes in:

main :: IO ()
main = do
    r <- liftM head $ randoms
    ...

The code is now much simpler and cleaner. What’s not to like?

UPDATE July 15, 2012: I just realized that there is even a shorter solution, using the (<$>) function from the very useful Control.Applicative module:

main :: IO ()
main = do
    r <- head <$> randoms
    ...

Emacs: Vimlike Tab/Window Navigation

If you use Vim, you know about tabs and split windows. I’ve managed to set up Emacs to behave in almost the same way (the only difference is that there can be only 9 “tabs” in Emacs (a limitation of elscreen), unlike in Vim where the number of tabs is arbitrary).

Emacs already has a very capable window-splitting mechanism, so you can emulate Vim’s split window functionality with vanilla Emacs. For Vim’s tabs, however, You need the elscreen package. This page has a version of elscreen that works with the latest Emacs 24.1 release.

The biggest problem with navigating around Emacs’s elscreen + split windows was the problem of deleting windows. In Vim, if you use “:q”, the current window gets killed. If it’s the only window in the tab, the tab gets killed. Lastly, if there are no tabs and only 1 window, “:q” will result in exiting Vim.

The function below was hacked together (I do not know any elisp) by reading parts of the GNU Emacs manual and googling around. It emulates the same functionality as “:q” in Vim as far as split windows/elscreen is concerned:

.emacs:
; Either close the current elscreen, or if only one screen, use the ":q" Evil
; command; this simulates the ":q" behavior of Vim when used with tabs.
(defun vimlike-quit ()
  "Vimlike ':q' behavior: close current window if there are split windows;
otherwise, close current tab (elscreen)."
  (interactive)
  (let ((one-elscreen (elscreen-one-screen-p))
        (one-window (one-window-p))
        )
    (cond
     ; if current tab has split windows in it, close the current live window
     ((not one-window)
      (delete-window) ; delete the current window
      (balance-windows) ; balance remaining windows
      nil)
     ; if there are multiple elscreens (tabs), close the current elscreen
     ((not one-elscreen)
      (elscreen-kill)
      nil)
     ; if there is only one elscreen, just try to quit (calling elscreen-kill
     ; will not work, because elscreen-kill fails if there is only one
     ; elscreen)
     (one-elscreen
      (evil-quit)
      nil)
     )))

The function below is used to emulate Vim’s “:tabe” behavior, which is what I use all the time for creating a new “scratch” or temporary buffer for pasting random junk into.

.emacs:
; A function that behaves like Vim's ':tabe' commnad for creating a new tab and
; buffer (the name "[No Name]" is also taken from Vim).
(defun vimlike-:tabe ()
  "Vimlike ':tabe' behavior for creating a new tab and buffer."
  (interactive)
  (let ((buffer (generate-new-buffer "[No Name]")))
      ; create new tab
      (elscreen-create)
      ; set window's buffer to the newly-created buffer
      (set-window-buffer (selected-window) buffer)
      ; set state to normal state
      (with-current-buffer buffer
        (evil-normal-state))
    )
  )

Below are some code snippets to show you how I set up my .emacs to mimic my .vimrc, as far as tabs/split windows go:

Tab movement:

.vimrc:
nnoremap <C-l> :tabnext<cr>
nnoremap <C-h> :tabprevious<cr>
inoremap <C-l> <esc>:tabnext<cr>
inoremap <C-h> <esc>:tabprevious<cr>
----------------------------------------
.emacs:
(define-key evil-normal-state-map (kbd "C-l") 'elscreen-next)
(define-key evil-normal-state-map (kbd "C-h") 'elscreen-previous)
(define-key evil-insert-state-map (kbd "C-l") 'elscreen-next)
(define-key evil-insert-state-map (kbd "C-h") 'elscreen-previous)

Window movement:

.vimrc:
nmap <Tab> <C-w><C-w>
----------------------------------------
.emacs:
(define-key evil-normal-state-map [tab] 'other-window)

Buffer movement (change a window’s contents):

.vimrc:
nmap <S-h> :bn<cr>
nmap <S-l> :bp<cr>
----------------------------------------
.emacs:
(define-key evil-normal-state-map "H" 'evil-next-buffer)
(define-key evil-normal-state-map "L" 'evil-prev-buffer)

Create a new blank tab (for writing a new file):

.vimrc:
nmap <localleader>n :tabe<cr>
----------------------------------------
.emacs:
(define-key evil-normal-state-map ",n" 'vimlike-:tabe)

Horizontally/Vertically split current window (I sort of have the names backwards, but I prefer it this way because I tend to split windows so that they are top and bottom, and “,h” (the mapping I use to achieve this effect) is much faster than “,v”)

.vimrc:
nmap <localleader>h :sp<cr>
nmap <localleader>v :vsp<cr>
----------------------------------------
.emacs:
(define-key evil-normal-state-map ",h" (lambda () (interactive) (split-window-vertically) (balance-windows)))
(define-key evil-normal-state-map ",v" (lambda () (interactive) (split-window-horizontally) (balance-windows)))

I use <localleader> in my .vimrc to avoid conflicts with any external plugins:

.vimrc:
let maplocalleader = ","

Since the comma key is already mapped in vanilla Vim to act as “go to next character that was searched with the ‘f’, ‘F’, ‘t’, or ‘T’ key”, I salvage this lost functionality using the ‘K’ key for it. ‘K’ in vanilla Vim is also already mapped to an existing function, but it’s a useless one to people who live in terminals like myself (vanilla ‘K’ in normal mode simply looks up the word under the cursor in the manpages… pretty useless IMHO). The setup below achieves the aforementioned goals:

.vimrc:
" Change K from being mapped to interactive man pages to being used as the
" vanilla comma ',' key's functionality (intra-line backwards search repeat for
" any t, T, f, F searches).
nnoremap K ,
vnoremap K ,

In case you are curious, the ‘;’ key is the forwards-search counterpart to the vanilla ‘,’ key.

Here is the .emacs equivalent to the above code snippet:

.emacs:
; Change K from being mapped to interactive man pages to being used as the
; vanilla comma ',' key's functionality (intra-line backwards search repeat for
; any t, T, f, F searches).
(define-key evil-normal-state-map "K" 'evil-repeat-find-char-reverse)

And finally, I use Emacs with the excellent Evil plugin, which makes Emacs much more usable/sane for Vimmers like me. This is why all of the Emacs configuration code snippets in this post have “evil” in there (no pun intended ;)).

I encourage you to try out my keymappings — I find them very usable… I am especially proud of mapping the TAB key to moving between windows inside a tab; it makes jumping around windows so much faster.

Skype on Linux: Startup Crash Fix

I’ve been using Skype recently and it kept crashing randomly a few seconds after startup. I managed to salvage the error message, which states as follows:

skype: pcm.c:2811: snd_pcm_areas_copy: Assertion `src_areas' failed.

Apparently, I’m not the first to notice this problem. I learned from that link that Microsoft is in no hurry to fix this longstanding, horrible bug that is a simple 1-liner fix. The fix is as follows:

mkdir ~/.Skype/Logs

Shame on the current Skype developers.

Update May 31, 2012: Apparently, Skype automatically creates huge logfiles in the ~/.Skype/Logs directory, and there is no way to turn off this useless feature (the logfiles are written in a proprietary Skype-only format, and is only designed to help Skype fix bugs if anything goes wrong; no personal data (chat history, etc.) is written in those logs according to online sources). I was bitten by this problem earlier this evening when I realized that my home partition (a meager 9GB) filled up with a 2GB Skype logfile, which prevented me from downloading a file online. The solution is simple: make the logfile a symlink into /dev/null — this way, you don’t have to periodically delete the logfiles — they are automatically “deleted” as they are being written! I knew I could take advantage of /dev/null someday and this is the perfect time for it! EDIT June 1, 2012: Actually, Skype did crash once mysteriously because of this, so just make a symlink to some partition with more space available, and then run a script upon boot up/power off to delete all files in that directory. Sorry about the bad advice!

There you go — happy Skyping!

Update June 3, 2012: OK, so apparently there is a way to create a “blackhole” directory, much in the same spirit of the special /dev/null file. This stackexchange page addresses the same concern, and led me to nullfs, which is a FUSE program that allows you to mount /dev/null-like filesystems from userspace (as regular programs, without root privileges). The code for these programs are just a few hundred lines long and compile very quickly. There are three flavors (nul1fs, nullfs, and nulnfs) and I’ve tested out nullfs to great success. You can create a “blackhole” like directory where files can be created into, but which does not take up any disk space. I just ran Skype again right now and the output of “lsof | grep skype” looks very normal (the log files in ~/.Skype/Logs are indeed created in Logs). So here’s how to do it: compile nullfs with

g++ -o nullfs -lfuse nullfs

and then create a regular directory anywhere with

mkdir blackhole

and then mount the nullfs filesystem on that directory with

path-to-nullfs-binary path-to-blackhole-directory 

and you’re good to go. For our use case, the blackhole directory would just be ~/.Skype/Logs. The nullfs binary backgrounds itself so you don’t need to worry about backgrounding/disowning the process. You can put the one-liner nullfs invocation into your startup script of choice, and you’re all set! I haven’t tried out the nul1fs or nulnfs flavors, because I didn’t need to. Hooray for FUSE!

The KISS PRNG (2011 version) in C and Haskell

Did you know that Dr. George Marsaglia (1924-2011), creator of the famed Diehard battery of randomness tests, devised a super-simple PRNG algorithm, just a month prior to his passing? Called the KISS PRNG (or Super-KISS, as there have been multiple KISS versions released previously by Marsaglia), this algorithm boasts a period in excess of 10^40million (10^40,000,000) — an astoundingly large number, many orders of magnitude larger than the famed Mersenne Twister‘s period (only 2^19,937 − 1, or only about 4.3 x 10^6,001 according to gcalctool). Plus, it’s so simple codewise, and also very fast.

Here’s the C implementation (adapted from Marsaglia’s own code):

/* AUTHOR: Shinobu (zuttobenkyou.wordpress.com) */
/* LICENSE: PUBLIC DOMAIN */
#include <stdint.h>

typedef uint64_t u64;

#define QSIZE 0x200000
#define CNG (cng = 6906969069ULL * cng + 13579)
#define XS (xs ^= (xs << 13), xs ^= (xs >> 17), xs ^= (xs << 43))
#define KISS (B64MWC() + CNG + XS)

static u64 QARY[QSIZE];
static int j;
static u64 carry;
static u64 xs;
static u64 cng;

void randk_reset(void)
{
	j = QSIZE - 1;
	carry = 0;
	xs = 362436069362436069ULL;
	cng = 123456789987654321ULL; /* use this as the seed */
}

u64 B64MWC(void)
{
	u64 t, x;
	j = (j + 1) & (QSIZE - 1);
	x = QARY[j];
	t = (x << 28) + carry;
	carry = (x >> 36) - (t < x);
	return (QARY[j] = t - x);
}

/* Initialize PRNG with default seed */
void randk_seed(void)
{
	u64 i;
	/* Seed QARY[] with CNG+XS: */
	for (i = 0; i < QSIZE; i++)
		QARY[i] = CNG + XS;
}

void randk_seed_manual(u64 seed)
{
	cng ^= seed;
	xs ^= cng;
	randk_seed();
}

void randk_warmup(int rounds)
{
	int i;
	/* Run through several rounds to warm up the state */
	for (i = 0; i < rounds; i++)
		randk();
}

/* Generate a pseudorandom 64-bit unsigned integer. */
u64 randk(void)
{
	return KISS;
}

Simple, eh? This algorithm is actually 3 PRNG’s in one: the 64-bit Multiply-With-Carry PRNG (B64MWC()), the XOR-Shift PRNG (XS), and the simple Linear Congruential PRNG (CNG). The exorbitant period comes from the fact that this algorithm relies on three different states of the three PRNGs to generate a random number.

Now, where does Haskell come into the picture? Well, I ported the code to Haskell because I wanted a simple PRNG that was of higher quality than the default System.Random RNG. Plus, if you look into the actual source of System.Random, here is an unnerving bit of code:

stdSplit            :: StdGen -> (StdGen, StdGen)
stdSplit std@(StdGen s1 s2)
                     = (left, right)
                       where
                        -- no statistical foundation for this!
                        left    = StdGen new_s1 t2
                        right   = StdGen t1 new_s2

                        new_s1 | s1 == 2147483562 = 1
                               | otherwise        = s1 + 1

                        new_s2 | s2 == 1          = 2147483398
                               | otherwise        = s2 - 1

                        StdGen t1 t2 = snd (next std)

See, the RandomGen type class requires the definition of next, split, and genRange functions (see this page). The split function’s purpose is to take one PRNG state, and give two distinct PRNG states, so that you can get multiple unique PRNG’s to work with (this comes up in functional programming in real practice — I speak from experience). The thing is, the statistical robustness of the split function for the StdGen PRNG that comes with Haskell, as can be seen in the source listing, is a bit… annoying/worrying.

Well, when I saw this, I thought: “Hey, why not use KISS? It has 3 PRNGs built into one, so when implementing split, it could just change the state of one of the PRNGs, and you’d get a *completely* different PRNG!” And so that’s exactly what I did:

-- AUTHOR: Shinobu (zuttobenkyou.wordpress.com)
-- LICENSE: PUBLIC DOMAIN
{-# LANGUAGE RecordWildCards #-}

module KISS where

import Data.Array.Unboxed
import Data.Bits
import Data.List
import Data.Word
import System.Random (RandomGen(..))

type U64 = Word64

-- | This is the last KISS-type RNG (2011) that Dr. George Marsaglia (1924-2011)
-- released to the internet before his death. The only difference with this
-- version is that the kissMWCArraySize is 0xfff (4096), instead of 0x200000
-- (2,097,152), for purposes of speed. The period of the original was
-- approximated by Marsaglia as 10^40million, which is practically infinity for
-- most, if not all, needs for everyday programs. The reduced state size for the
-- MWC algorithm from 0x200000 to 0xfff should shorten the period, but it should
-- still be excellent for general usage; because KISS combines not only MWC but
-- also CNG (Congruential) and XSHF (XOR-shift) generators, the period should
-- still be very large.
--
-- TODO: Determine period of this KISS rng.
data KISSRNG = KISSRNG
	{ kissMWCArray :: UArray Int U64
	, kissMWCArraySize :: U64
	, kissMWCIndex :: U64
	, kissMWCCarry :: U64
	, kissCNG :: U64
	, kissXSHF :: U64
	}

kissStateSize :: U64
kissStateSize = 0xfff

roundCNG :: U64 -> U64
roundCNG cng = 6906969069 * cng + 13579

roundXSHF :: U64 -> U64
roundXSHF = round3 . round2 . round1
	where
	round1 b = xor b (unsafeShiftL b 13)
 	round2 b = xor b (unsafeShiftR b 17)
	round3 b = xor b (unsafeShiftL b 43)

roundB64MWC :: KISSRNG -> KISSRNG
roundB64MWC kiss@KISSRNG{..} = kiss
	{ kissMWCArray = array'
	, kissMWCIndex = index'
	, kissMWCCarry = carry'
	}
	where
	index' = (kissMWCIndex + 1) .&. (kissMWCArraySize - 1)
	x = kissMWCArray ! (fromIntegral index')
	t = unsafeShiftL x 28 + kissMWCCarry
	carry' = unsafeShiftR x 36 - (if t < x then 1 else 0)
	array' = kissMWCArray // [(fromIntegral index', t - x)]

makeKISSRNG :: U64 -> KISSRNG
makeKISSRNG seed = KISSRNG
	{ kissMWCArray = array (0, (fromIntegral kissStateSize - 1)) $ zip [0..] kissArray
	, kissMWCArraySize = kissStateSize
	, kissMWCIndex = kissStateSize - 1
	, kissMWCCarry = 0
	, kissCNG = cngWarmed
	, kissXSHF = xshfWarmed
	}
	where
	-- seed the MWC array with the Congruential and XOR-Shift RNG's
	(kissArray, cngWarmed, xshfWarmed) = foldl' step ([], seedCNG, seedXSHF) [0..(kissStateSize - 1)]
	step (ary, cng, xshf) _ = ((cng' + xshf'):ary, cng', xshf')
		where
		cng' = roundCNG cng
		xshf' = roundXSHF xshf
	-- default Congruential RNG seed
	seedCNG = 123456789987654321
	-- default XOR-Shift RNG seed
	seedXSHF = 362436069362436069

randKISS :: KISSRNG -> (U64, KISSRNG)
randKISS k = (kissMWC + cng + xshf, k' {kissCNG = cng, kissXSHF = xshf})
	where
	k' = roundB64MWC k
	kissMWC = (kissMWCArray k') ! (fromIntegral $ kissMWCIndex k')
	cng = roundCNG $ kissCNG k
	xshf = roundXSHF $ kissXSHF k

instance Show KISSRNG where
	show KISSRNG{..} = "kissMWC: [kissMWC..]"
		++ "\nkissMWCIdx: " ++ show kissMWCIndex ++ "\n"
		++ "kissMWCArray!!Idx: " ++ show (kissMWCArray ! (fromIntegral kissMWCIndex)) ++ "\n"
		++ "kissMWCArray!!Head: " ++ show (kissMWCArray ! 0) ++ "\n"
		++ "kissMWCArray!!Last: " ++ show (kissMWCArray ! (fromIntegral $ kissStateSize - 1)) ++ "\n"
		++ "kissMWCCarry: " ++ show kissMWCCarry ++ "\n"
		++ "kissCNG: " ++ show kissCNG ++ "\n"
		++ "kissXSHF: " ++ show kissXSHF ++ "\n"

instance RandomGen KISSRNG where
	next rng = (fromIntegral n, rng')
		where
		(n, rng') = randKISS rng
	split rng = (rng1, rng2)
		where
		rng1 = warmupXSHF rng
		rng2 = warmupMWC rng
	genRange _ = (0, 0xffffffffffffffff)

warmupRNG :: RandomGen g => g -> Int -> g
warmupRNG g rounds = foldl' warmup g [1..rounds]
	where
	warmup g' _ = snd $ next g'

warmupMWC :: KISSRNG -> KISSRNG
warmupMWC rng = roundB64MWC rng

warmupXSHF :: KISSRNG -> KISSRNG
warmupXSHF rng = rng { kissXSHF = roundXSHF $ kissXSHF rng}

In the implementation of split, you can clearly see that we simply warm up one of PRNGs (move on to the next state in the period) to get a new PRNG. Again, since KISS depends on all three PRNGs, simply changing the state of one of the PRNGs will give you a completely different PRNG.

Oh, the only weakness of the Haskell version is that its QSIZE is only 0xfff, not 0x200000 as in the original, for performance reasons. I certainly hope someone could improve the performance of the code and release it on Hackage (my code is hereby released into the PUBLIC DOMAIN, so do what you like with it), restoring the state size to 0x200000 as in Marsaglia’s original. Also, I’m not sure how large the period is, but judging by how the XOR-Shift PRNG has a large period on its own, it should still be very, very large with a 0xfff state size for the MWC algorithm.

I would sincerely appreciate it if someone more familiar with combinatorics/compsci could tell me what the size of the period is with a 0xfff state size for the MWC.

I was also pleasantly surprised by the very good quality of KISS. I used my code to write some random bits into a file, and used the ent program to judge the entroy of it. Here are the results:

Entropy = 7.999829 bits per byte.

Optimum compression would reduce the size
of this 1048576 byte file by 0 percent.

Chi square distribution for 1048576 samples is 248.29, and randomly
would exceed this value 60.65 percent of the times.

Arithmetic mean value of data bytes is 127.5231 (127.5 = random).
Monte Carlo value for Pi is 3.141895835 (error 0.01 percent).
Serial correlation coefficient is 0.001437 (totally uncorrelated = 0.0).

The results show that the KISS RNG has excellent quality random numbers. These figures make it as good (randomness-wise) as, e.g., the one based on AES encryption (AES in counter mode), which has also been analyzed with ent, as stated on the github page:

Using ent, a randomness property maker on one 1Mb sample.

cprng-AES:

Entropy = 7.999837 bits per byte.
Optimum compression would reduce the size of this 1048576 byte file by 0 percent.
Chi square distribution for 1048576 samples is 237.02.
Arithmetic mean value of data bytes is 127.3422 (127.5 = random).
Monte Carlo value for Pi is 3.143589568 (error 0.06 percent).

The rather ugly code I used to generate this file (and believe me, it took forever to generate a 1MiB file because the code is horribly unoptimized…) is below:

-- AUTHOR: Shinobu (zuttobenkyou.wordpress.com)
-- LICENSE: PUBLIC DOMAIN
{-# LANGUAGE RecordWildCards #-}

module Main where

import Data.Array.Unboxed
import Data.Bits
import Data.List
import Data.Word
import System.Random (RandomGen(..))
import qualified Data.ByteString as BS

type U64 = Word64

data KISSRNG = KISSRNG
	{ kissMWCArray :: UArray Int U64
	, kissMWCArraySize :: U64
	, kissMWCIndex :: U64
	, kissMWCCarry :: U64
	, kissCNG :: U64
	, kissXSHF :: U64
	}

kissStateSize :: U64
kissStateSize = 0xfff

roundCNG :: U64 -> U64
roundCNG cng = 6906969069 * cng + 13579

roundXSHF :: U64 -> U64
roundXSHF = round3 . round2 . round1
	where
	round1 b = xor b (unsafeShiftL b 13)
 	round2 b = xor b (unsafeShiftR b 17)
	round3 b = xor b (unsafeShiftL b 43)

roundB64MWC :: KISSRNG -> KISSRNG
roundB64MWC kiss@KISSRNG{..} = kiss
	{ kissMWCArray = array'
	, kissMWCIndex = index'
	, kissMWCCarry = carry'
	}
	where
	index' = (kissMWCIndex + 1) .&. (kissMWCArraySize - 1)
	x = kissMWCArray ! (fromIntegral index')
	t = unsafeShiftL x 28 + kissMWCCarry
	carry' = unsafeShiftR x 36 - (if t < x then 1 else 0)
	array' = kissMWCArray // [(fromIntegral index', t - x)]

makeKISSRNG :: U64 -> KISSRNG
makeKISSRNG seed = KISSRNG
	{ kissMWCArray = array (0, (fromIntegral kissStateSize - 1)) $ zip [0..] kissArray
	, kissMWCArraySize = kissStateSize
	, kissMWCIndex = kissStateSize - 1
	, kissMWCCarry = 0
	, kissCNG = xor cngWarmed seed
	, kissXSHF = xor xshfWarmed seed
	}
	where
	-- seed the MWC array with the Congruential and XOR-Shift RNG's
	(kissArray, cngWarmed, xshfWarmed) = foldl' step ([], seedCNG, seedXSHF) [0..(kissStateSize - 1)]
	step (ary, cng, xshf) _ = ((cng' + xshf'):ary, cng', xshf')
		where
		cng' = roundCNG cng
		xshf' = roundXSHF xshf
	-- default Congruential RNG seed
	seedCNG = 123456789987654321
	-- default XOR-Shift RNG seed
	seedXSHF = 362436069362436069

randKISS :: KISSRNG -> (U64, KISSRNG)
randKISS k = (kissMWC + cng + xshf, k' {kissCNG = cng, kissXSHF = xshf})
	where
	k' = roundB64MWC k
	kissMWC = (kissMWCArray k') ! (fromIntegral $ kissMWCIndex k')
	cng = roundCNG $ kissCNG k
	xshf = roundXSHF $ kissXSHF k

instance Show KISSRNG where
	show KISSRNG{..} = "kissMWC: [kissMWC..]"
		++ "\nkissMWCIdx: " ++ show kissMWCIndex ++ "\n"
		++ "kissMWCArray!!Idx: " ++ show (kissMWCArray ! (fromIntegral kissMWCIndex)) ++ "\n"
		++ "kissMWCArray!!Head: " ++ show (kissMWCArray ! 0) ++ "\n"
		++ "kissMWCArray!!Last: " ++ show (kissMWCArray ! (fromIntegral $ kissStateSize - 1)) ++ "\n"
		++ "kissMWCCarry: " ++ show kissMWCCarry ++ "\n"
		++ "kissCNG: " ++ show kissCNG ++ "\n"
		++ "kissXSHF: " ++ show kissXSHF ++ "\n"

instance RandomGen KISSRNG where
	next rng = (fromIntegral n, rng')
		where
		(n, rng') = randKISS rng
	split rng = (rng1, rng2)
		where
		rng1 = warmupXSHF rng
		rng2 = warmupMWC rng
	genRange _ = (0, 0xffffffffffffffff)

warmupRNG :: RandomGen g => g -> Int -> g
warmupRNG g rounds = foldl' warmup g [1..rounds]
	where
	warmup g' _ = snd $ next g'

warmupMWC :: KISSRNG -> KISSRNG
warmupMWC rng = roundB64MWC rng

warmupXSHF :: KISSRNG -> KISSRNG
warmupXSHF rng = rng { kissXSHF = roundXSHF $ kissXSHF rng}

main :: IO ()
main = do
	let
		rng = makeKISSRNG 0
		(bytes1MiB, _) = genBytesKISS 0x100000 rng
	BS.writeFile "data" bytes1MiB

genBytesKISS :: U64 -> KISSRNG -> (BS.ByteString, KISSRNG)
genBytesKISS len kissrng = foldl' step (BS.empty, kissrng) [1..(div len 8)] -- divide by 8, b/c e.g., to generate 8 bytes, we only need 1 U64
	where
	step (bs, rng) _ = (foldl' BS.snoc bs $ octets u64, rng')
		where
		(u64, rng') = randKISS rng

smallerChunks :: [U64] -> [Word8]
smallerChunks = concatMap octets

-- | Get a number and split it up into 8 8-bit parts (64 bits total).
octets :: (Bits a, Integral a) => a -> [Word8]
octets w = map (\n -> fromIntegral $ shiftR w n) . reverse $ take 8 [0,8..]

Here are the C and Haskell standalone versions that prove that the Haskell port behaves in the same way as the C version, given the right starting seed values and state size (both 0xfff for the MWC PRNG):

C standalone version (compile with gcc -o ckiss kiss.c):

/* AUTHOR: Shinobu (zuttobenkyou.wordpress.com) */
/* LICENSE: PUBLIC DOMAIN */
#include <stdio.h>
#include <inttypes.h>
#include <stdint.h>

typedef uint64_t u64;

#define QSIZE 0xfff
#define CNG (cng = 6906969069ULL * cng + 13579)
#define XS (xs ^= (xs << 13), xs ^= (xs >> 17), xs ^= (xs << 43))
#define KISS (B64MWC() + CNG + XS)

static u64 QARY[QSIZE];
static int j;
static u64 carry;
static u64 xs;
static u64 cng;

u64 B64MWC(void)
{
	u64 t, x;
	j = (j + 1) & (QSIZE - 1);
	x = QARY[j];
	t = (x << 28) + carry;
	carry = (x >> 36) - (t < x);
	return (QARY[j] = t - x);
}

/* Initialize PRNG with default seed */
void randk_seed(void)
{
	u64 i;
	j = QSIZE - 1;
	carry = 0;
	xs = 362436069362436069ULL;
	cng = 123456789987654321ULL;
	/* Seed QARY[] with CNG+XS: */
	for (i = 0; i < QSIZE; i++)
		QARY[i] = CNG + XS;
}

/* Generate a pseudorandom 64-bit unsigned integer. */
u64 randk(void)
{
	return KISS;
}

int main(void)
{
	randk_seed();
	printf("randk_seed called!\n");
	printf("KISS idx: %"PRIu64"\n", j);
	printf("qary[idx] is: %"PRIu64"\n", QARY[j]);
	printf("qary[0] is: %"PRIu64"\n", QARY[0]);
	printf("qary[QSIZE - 1] is: %"PRIu64"\n", QARY[QSIZE - 1]);
	printf("carry: %"PRIu64"\n", carry);
	printf("cng: %"PRIu64"\n", cng);
	printf("xs: %"PRIu64"\n", xs);
	u64 x = KISS;
	printf("\nKISS called! KISS num is: %"PRIu64"\n", x);
	printf("\nKISS idx: %"PRIu64"\n", j);
	printf("qary[idx] is: %"PRIu64"\n", QARY[j]);
	printf("qary[0] is: %"PRIu64"\n", QARY[0]);
	printf("qary[QSIZE - 1] is: %"PRIu64"\n", QARY[QSIZE - 1]);
	printf("carry: %"PRIu64"\n", carry);
	printf("cng: %"PRIu64"\n", cng);
	printf("xs: %"PRIu64"\n", xs);

	printf("x + 18334599312639636657 is: %"PRIu64"\n", x + 18334599312639636657ULL);
}

Haskell standalone version (run with runhaskell kiss.hs):

-- AUTHOR: Shinobu (zuttobenkyou.wordpress.com)
-- LICENSE: PUBLIC DOMAIN
{-# LANGUAGE RecordWildCards #-}

module KISS where

import Data.Array.Unboxed
import Data.Bits
import Data.List
import Data.Word
import System.Random (RandomGen(..))

type U64 = Word64

data KISSRNG = KISSRNG
	{ kissMWCArray :: UArray Int U64
	, kissMWCArraySize :: U64
	, kissMWCIndex :: U64
	, kissMWCCarry :: U64
	, kissCNG :: U64
	, kissXSHF :: U64
	}

kissStateSize :: U64
kissStateSize = 0xfff

roundCNG :: U64 -> U64
roundCNG cng = 6906969069 * cng + 13579

roundXSHF :: U64 -> U64
roundXSHF = round3 . round2 . round1
	where
	round1 b = xor b (unsafeShiftL b 13)
 	round2 b = xor b (unsafeShiftR b 17)
	round3 b = xor b (unsafeShiftL b 43)

roundB64MWC :: KISSRNG -> KISSRNG
roundB64MWC kiss@KISSRNG{..} = kiss
	{ kissMWCArray = array'
	, kissMWCIndex = index'
	, kissMWCCarry = carry'
	}
	where
	index' = (kissMWCIndex + 1) .&. (kissMWCArraySize - 1)
	x = kissMWCArray ! (fromIntegral index')
	t = unsafeShiftL x 28 + kissMWCCarry
	carry' = unsafeShiftR x 36 - (if t < x then 1 else 0)
	array' = kissMWCArray // [(fromIntegral index', t - x)]

makeKISSRNG :: U64 -> KISSRNG
makeKISSRNG seed = KISSRNG
	{ kissMWCArray = array (0, (fromIntegral kissStateSize - 1)) $ zip [0..] kissArray
	, kissMWCArraySize = kissStateSize
	, kissMWCIndex = kissStateSize - 1
	, kissMWCCarry = 0
	, kissCNG = cngWarmed
	, kissXSHF = xshfWarmed
	}
	where
	-- seed the MWC array with the Congruential and XOR-Shift RNG's
	(kissArray, cngWarmed, xshfWarmed) = foldl' step ([], seedCNG, seedXSHF) [0..(kissStateSize - 1)]
	step (ary, cng, xshf) _ = ((cng' + xshf'):ary, cng', xshf')
		where
		cng' = roundCNG cng
		xshf' = roundXSHF xshf
	-- default Congruential RNG seed
	seedCNG = 123456789987654321
	-- default XOR-Shift RNG seed
	seedXSHF = 362436069362436069

randKISS :: KISSRNG -> (U64, KISSRNG)
randKISS k = (kissMWC + cng + xshf, k' {kissCNG = cng, kissXSHF = xshf})
	where
	k' = roundB64MWC k
	kissMWC = (kissMWCArray k') ! (fromIntegral $ kissMWCIndex k')
	cng = roundCNG $ kissCNG k
	xshf = roundXSHF $ kissXSHF k

instance Show KISSRNG where
	show KISSRNG{..} = "kissMWC: [kissMWC..]"
		++ "\nkissMWCIdx: " ++ show kissMWCIndex ++ "\n"
		++ "kissMWCArray!!Idx: " ++ show (kissMWCArray ! (fromIntegral kissMWCIndex)) ++ "\n"
		++ "kissMWCArray!!Head: " ++ show (kissMWCArray ! 0) ++ "\n"
		++ "kissMWCArray!!Last: " ++ show (kissMWCArray ! (fromIntegral $ kissStateSize - 1)) ++ "\n"
		++ "kissMWCCarry: " ++ show kissMWCCarry ++ "\n"
		++ "kissCNG: " ++ show kissCNG ++ "\n"
		++ "kissXSHF: " ++ show kissXSHF ++ "\n"

instance RandomGen KISSRNG where
	next rng = (fromIntegral n, rng')
		where
		(n, rng') = randKISS rng
	split rng = (rng1, rng2)
		where
		rng1 = warmupXSHF rng
		rng2 = warmupMWC rng
	genRange _ = (0, 0xffffffffffffffff)

warmupRNG :: RandomGen g => g -> Int -> g
warmupRNG g rounds = foldl' warmup g [1..rounds]
	where
	warmup g' _ = snd $ next g'

warmupMWC :: KISSRNG -> KISSRNG
warmupMWC rng = roundB64MWC rng

warmupXSHF :: KISSRNG -> KISSRNG
warmupXSHF rng = rng { kissXSHF = roundXSHF $ kissXSHF rng}

main :: IO ()
main = do
	let rng = makeKISSRNG 0
	putStrLn $ show rng

EDIT May 3, 2012: Sorry about the somewhat redundant-looking code listings, but I’m too lazy/busy to clean them up.
EDIT May 4, 2012: Alternate period number for the Mersenne Twister, for easier comparison of period size with KISS.

A Better Way to Prevent Cheating for Online Games

Forrest Smith has written an article about the problem of rampant cheating (http://altdevblogaday.com/2012/04/02/extravagant-cheating-via-direct-x/) on PC games. I’ve wanted to address this topic because it online cheating has affected my gaming experience in the past (although not any longer because programming has taken up most, if not all, of my gaming time). Cheating on PC games can be seen as a classic Black Hat vs. White Hat battle that never ends. Forrest mentions data analysis as a tool to measure the likelihood of a player using one or more cheats to gain an unfair advantage (which should be noticeable in terms of raw game data). I’d like to add a few things.

I agree that data analysis is probably the best way to detect cheaters. This is because it is possible to detect players who “toggle” subtle cheats that are difficult to notice (e.g., realistic aimbot, or wallhack) — these players use these cheats for 2 minutes then turn them off, and then on for another 2 minutes, then off again, and so on. While it would be difficult for humans to detect such players, computers, with enough data points, could easily see consistent, significant fluctuations in game data and raise a red flag for further analysis by a human.

Though data analysis is excellent, I think that the amount of computation needed (and also the amount of raw data needed) to automatically (and intelligently) analyze every player, in real time, is beyond the current state of technology, or at least economic feasibility. Even if trusted supercomputers could crunch player data, no game company is going to throw in thousands, if not millions, of dollars just to check for cheaters — especially if the game is over two or three years old.

The best way to prevent cheating is to use a network of trust. This is the magic phrase that comes up over and over again for those who are engaged in real security work. Public key cryptography is an excellent example. You could easily have a server that uses the current state-of-the-art public-key cryptography to ensure that, for example, only players who share at least 2 other players in their trusted keyring can log in to play. The effect of this is that you are simulating real human behavior — people tend to avoid taking risks (in this case, the possibility of a ruined gaming experience because of cheating) with complete strangers. You could even enforce draconian measures, such as, “if a trusted person has been confirmed as a cheater, all other associated players will have their trust level reduced.” Such a rule would severely dampen the spread of new cheats, as the banned player will be prevented from playing on trusted servers, and all of his real-life friends who added him to their keyring will have their “trustworthiness” level reduced. In short, real-life peer pressure would dramatically affect the game.

Sadly, the only problem with the network of trust is that it prevents millions of players from playing against each other immediately, because, let’s face it, the vast majority of online players one encounters are complete strangers. Still, I think that this method is a viable option; I am 99.999999% sure that there are thousands, if not tens of thousands, of players out there who would be willing to sacrifice maxed-out (full player) servers in order to enjoy hours of trusted gaming that guarantees a system that no one in the world, not even world governments, is able to hack.

UPDATE April 21, 2012: Fix grammar/typos.