Debian R Archive

For those of you were were in the various Debian infrastructure channels, you might have noticed that I was playing around with wanna-build, dak, sbuild, and buildd and friends. [Thanks to everyone who answered questions, btw.] Over the past week, I've been building most of CRAN, Bioconductor, and omegahat for unstable, amd64. I plan to build the same set of packages for i386, and will start a build shortly for stable as well. This effort builds on top of Charles Blundell and Dirk Eddelbuettel's cran2deb, which does most of the heavy lifting.

If you're like me, and use lots of different R packages, or already use some of the R packages available on the previous build, you can simply point your sources.list to the [http://debian-r.debian.net] archive, load the appropriate GPG key, and away you go. I have a bit more information available here and I will try to keep that page updated as I build other architectures and build out for stable.

Posted
Debbugs forcemerge support

One of the earliest features I wrote for the Debian bug tracking system (Debbugs) after joining the team was support for forcibly merging bugs. Originally, merging two bugs required that the bugs be in exactly the same state before merging them; forcemerge removed this requirement.

Unfortunately, the way I originally implemented this was shortsighted, and merely forced the merge partners to have the same values as the merge master. This meant that owners, blocking bugs, and many other things were silently changed, which meant that people weren't notified of changes, and bugs could end up in an inconsistent state.

A while ago, I decided to fix this by calculating the changes required to actually merge the bugs, making those changes, and then merging the bugs normally; thus, doing everything that a maintainer would normally have done for them. This necessitated abstracting out the entire control apparatus into the Debbugs::Control module.

Now that it's complete, you can do the following:

> forcemerge 1 2
Bug #1 [foo] new title
Bug #2 {Done: foo@bugs.something} [foo] foo
Unset bug forwarded-to-address
Severity set to 'wishlist' from 'grave'
3 was blocked by: 2
3 was not blocking any bugs.
Removed blocking bug(s) of 3: 2
2 was blocked by: 4
2 was not blocking any bugs.
Removed blocking bug(s) of 2: 4
Bug reopened
Removed annotation that bug was owned by bar@baz.com.

Removed indication that 2 affects bleargh
Removed tag(s) unreproducible and moreinfo.
Merged 1 2
> thanks
Stopping processing here.

and bug 2 now is merged with 1 and matches the state of 1. [The above is the control output from the appropriate bit of the 06_mail_handling.t test.]

This change also means that I'll be able to finally write support for control@ operations at submit@ time. Also, all of the bug modifications that happen at submit@ or nnn@ time (setting title, found, etc.) will be implemented as calls to Debbugs::Control so we can eventually keep a postgresql database updated in addition to the flatfile database.

Using Term::Progressbar

I've been working for a while on analyzing a fairly large dataset for my Lupus genetics project. One of the major annoyances with analyzing large datasets is not knowing when a particular part of the analysis is going to finish, and whether I should go back and rewrite part of the code to be faster, or just wait for it to finish. In R, I've been using txtProgressBar to handle this, but I hadn't bothered to find a similar module for perl until now.

Luckily, Term::ProgressBar exists, and is pretty easy to use:

my $pos = $sfh->tell();
$sfh->seek(0,SEEK_END);
my $p = Term::ProgressBar->new({count => $sfh->tell,
                                remove => 1,
                                ETA => 'linear'});
$sfh->seek($pos,SEEK_SET);
while (<$sfh>) {
    ...; # yada yada yada
    $p->update($sfh->tell());       
}

producing useful output, which told me that my SQLite database creation routine would take about 2 days to finish instead of the 7 years that the slightly less optimal version wanted.

Posted
Introducing Sweavealike

I use R a lot. It's one of the primary tools I use in my day job as a scientist analyzing large datasets. If you use LaTeX with R (as I often do), you probably use Sweave to interleave R output and figures with your text describing those figures using the noweb method of literate programming.

Sweavealike is a plugin for IkiWiki that tries to do some of the useful things for IkiWiki that sweave does for R and LaTeX.

You use it like the following:

[[!sweavealike  echo=1 code="""
a <- 1
a <- a + 10
print(a)
"""]]

which produces this result when run:

> a <- 1
> a <- a + 10
> print(a)
[1] 11

You can also generate figures with it:

[[!sweavealike  fig=1 echo=1 results="hide" code="""
plot(1:10,(1:10)^2,xlab="x",ylab=expression(x^2),main="Example Figure")
"""]]

> plot(1:10,(1:10)^2,xlab="x",ylab=expression(x^2),main="Example Figure")

The plugin itself uses the neat Statistics::R perl module to handle all of the heavy lifting. I personally plan on using this plugin to help write some more entries in my learning R series of posts that I'm beginning to work on. Hopefully I'll find and fix most of the bugs as I embark on that process so anyone else who uses the plugin won't, but feel free to e-mail me if something isn't working as it should.

Finally, you shouldn't run this plugin on a publicly editable IkiWiki instance, because that would be a trivial local user exploit as R can run arbitrary code, read and write to arbitrary files, exhaust all memory, etc.