Coding
Submitted by gwolf on Wed, 11/25/2009 - 14:26
My Debian QA page shows what I consider to be a huge number of packages - I am currently uploader for 207 packages. Why so many? There are many factors - the main one is group maintenance (I'm directly responsible for only 19; of course, this does not mean I should disregard the rest of them), the second one is regularity. By far, most of my source packages (177) match lib.*perl, followed by lib.*ruby with 20.
Anyway - a strong factor that allows the pkg-perl group to be successful in maintaining 1411 packages is the regularity of the task: packaging Perl modules is usually as easy as running dh-make-perl on them (of course, not taking away the merit of packaging the few strange corner cases…).
In Ruby-land, the landscape is quite different. The developer community is quite anchored in agile worldviews, which go beyond coding practices and all the way over to confronting the way most Free Software projects distribute their work. I have previously ranted, er, presented informed and opinionated blog posts on this topic - Ruby culture dictates distribution via Ruby Gems, which are for many reasons not Debian-friendly. Besides Gems, most projects have adopted Git for development tracking and are hosted on GitHub - that's why I came up with Githubredir, which basically presents a uscan-friendly listing of tags for a given project.
But if you develop in Git, you might want to split a project into its constituent parts for easier organization, without meaning that each subproject should be an independent project by itself, right? After all, that's what Git submodules are for. That's what happened with a great PDF generating library for Ruby, Prawn. Thing is, the three parts of the main project are required for the project to be built.
Anyway, that was a great reason to move the package over to the new dpkg 3.0 (quilt) source format. And, yes, it is a straightforward move! If you have not yet done so, take a look at Raphael Hertzog's explanation+FAQ wiki page. It just works, and makes many things way easier.
There are still some wrinkles in my packaging, like where I'm getting the orig tarballs from — As the submodules are not presently tagged in any way, I was only able to download a snapshot of their respective current master branches. This is suboptimal, I know, but I have talked to the upstream author, and he confirms that for the next major version (which should not be long in coming) the tags will be synchronized, and things will be even cleaner.
PS- I love Hpricot. To get the numbers for my QA page, I just had to get three dirty but useful arrays:
    require 'hpricot'
    require 'open-uri'

    url = 'http://qa.debian.org/developer.php?login=gwolf&comaint=yes'
    doc = Hpricot(open(url))

    tot  = ((doc / 'table')[1] / 'tr').map { |row| (row % 'td' % 'a').inner_html rescue nil }.select { |i| i }
    team = ((doc / 'table')[1] / 'tr').map { |row| (row % 'td' % 'span.uploader' % 'a').inner_html rescue nil }.select { |i| i }
    mine = ((doc / 'table')[1] / 'tr').map { |row| (row % 'td' % 'span:not(.uploader)' % 'a').inner_html rescue nil }.select { |i| i }
And work from the three very simple lists there — i.e. tot.select {|pkg| pkg =~ /lib.*perl/}.size gives me 177.
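The same kind of tally can be tried without fetching the QA page at all. What follows is only a minimal sketch, assuming a small made-up list of package names; the sample list and the `count_matching` helper are illustrative and not part of the original script.

```ruby
# Hypothetical sample; the real lists come from scraping the QA page.
SAMPLE_PACKAGES = %w[
  libdbi-perl libwww-perl libprawn-ruby libhpricot-ruby cherokee
].freeze

# Count how many package names match each named pattern.
def count_matching(packages, patterns)
  patterns.transform_values do |regex|
    packages.count { |pkg| pkg.match?(regex) }
  end
end

counts = count_matching(SAMPLE_PACKAGES, { perl: /lib.*perl/, ruby: /lib.*ruby/ })
counts.each { |name, n| puts "#{name}: #{n}" }
```

Passing the patterns as a hash keeps the per-language counts in one pass over the names, instead of one `select` per pattern.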
Submitted by gwolf on Mon, 11/16/2009 - 10:40
Sigh…
I am starting to fill out my annual report for my real-life work. You know, that chore you must do every year where you score little bullets next to each completed project and talk well about yourself. For my workplace, fortunately, I do not have to lie and convince people I am worth rehiring - as this year I achieved definitividad as a Técnico Académico Asociado C de Tiempo Completo at my University, I can say for sure I have long-term job security. UNAM is the best place for me to work, and I am most grateful, even if I do want to advance in the future: I would strongly like at some point to start working in a real academic position, as my job is mostly operative, limited to keeping things running smoothly in our network and servers. I work in a social sciences (Economics) research institute, and even though I have taken on an interesting project that is viewed from the social sciences, and I do expect to finish it with a very interesting product in the near future, my interest lies in computing as a science.
Anyway, back on track… This is the time of year to start evaluating many things, many factors, from many different sides. And yes, for me that involves measuring how I am faring in my involvement in the projects I most care about - specifically, Debian, but also several other Free Software projects, even if my involvement in them is mostly organizational.
I am once again going through a tough period in my personal life, and the impact it carries is obviously deep. However, I am not fond of finding excuses for my underachievement or underperformance. And that's what I feel now. Even more when I see posts such as Zack's and Tim's status updates, and when I see that we continue to be on a history-high streak of RC bugs.
Regarding the several teams I am (at least formally) involved with in Debian, I have been away from the pkg-perl group for far too long... It is still the first group I identify myself with - both on a personal level, as I consider them good friends and great people to work with, and because I do feel the responsibility to share the load with them, as maintaining >1300 packages (even if they are so highly regular) is just not an easy task. But for over a year, my involvement has been basically zero. I have been a bit more active on pkg-ruby-extras, maybe paradoxically, as it is a smaller team with fewer packages (as I know it is much less probable for somebody to keep my packages in adequate shape if I don't do it)... and also because I am working more with Ruby than Perl nowadays. And finally, about Cherokee, I decided during DebConf9 to redo the packaging to fully use DH7 instead of our old-style quasimanual style. I have had several bursts of activity, and am almost-almost-ready to do the first newstyle upload... But so far, I have been unable to do so.
Of course, keyring-maint: With Jonathan's help, I have come to terms with most of the processes. Both Jonathan and I have been swamped lately, but at least I think I am finally helping speed up the process instead of holding it down. We do, yes, have several pending updates - but are working our way up the queue, and I hope not to leave people waiting for too long. And yes, we have discussed several ways of documenting and automating several of the tasks we currently sustain, and that should come soon.
I have also been leaving maybe a bit too much responsibility aside on EDUSOL, for which today we are entering the second week of activity, and I'm very sorry to see our server is just too overloaded to even answer me - and even lacking admin powers myself, I should have worked earlier on setting up redundancy in a more automatic way (as we have an off-site backup we can promote to live and redirect to, but I am unable to do this... given that I am the techie person on board/the only "professional" sysadmin).
This year I also –quietly– finished the bulk of the Comas rewrite. What? Comas? Still alive? Yes, and you can expect me to show it off to more people soon, and get it used for more conferences. I will talk more about it (and its motivation, and its current status) later on - but basically, the only two things that the mod_perl-based system most of you got to know (mainly at CONSOL 2004-2008 or at Debconf 5 and 6, although I know of several other conferences which used it) shares with the current incarnation are… the (most) basic database structure and the name. The project underwent a full rewrite, and is now a far more flexible, far easier to install, Ruby-on-Rails based application. And most important, it no longer involves your name being Gunnar Wolf as a prerequisite for successfully setting it up ;-)
Regarding DebConf, I have promoted a Central American MiniDebConf, and we are right on track for holding it in late March in Panamá City. Everybody's invited, and we will have (surprise, surprise!) the very professional involvement of Mr. Anto Recio as local team, as it seems he didn't have enough with last year's DebConf9 and wants to suffer further. What am I lacking here? Motivation. I have been quite pessimistic, possibly turning some people away, even though we have a good first sampling of interested people's profiles and expectations. If you want to get involved, tomorrow (Tuesday 17-nov) we will have a meeting at Freenode's #sl-centroamerica, 17:00 GMT-6. Please note we do need involvement from the Central American communities, it is more than just a motivational issue. Last meeting it seemed Anto and I were the only people pushing the MiniDebConf - and frankly, that would be a basis for not even holding it. We need motivation from the very people involved in it!
Anyway… You can see I have (and it seems to be a constant in my life) a series of contradictions going on. However, the exercise of putting it all into writing helps me understand better where I am standing. When I started writing this post I felt much heavier, much more at a loss… Right now I feel I want to refocus my energy on the same projects and teams I have been involved with, yes, but feel it at least more plausible. Hope so.
Submitted by gwolf on Thu, 10/01/2009 - 18:04
Humm... Has anybody else seen a pattern like this?
I am getting a flurry of root login attempts at my main server at the University since yesterday 7:30AM (GMT-5). Now, from the machines I run in the 132.248.0.0/16 network (UNAM), only two listen to the world with ssh at port 22 — And yes, it is a very large network, but I am only getting this pattern on one of them (they are on different subnets, quite far apart). They are all attempting to log in as root, with a frequency that varies wildly, but is consistently over three times a minute right now. This is a sample of what I get in my logs:
[update] Logs omitted from the blog post, as they are too wide and break the display for most users. You can download the log file instead.
Anyway… This comes from all over the world, and all the attempts are made as root (no attempts from unprivileged users). Of course, I have PermitRootLogin set to no in /etc/ssh/sshd_config, but… I want to understand this as much as possible.
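One way to put a number on the "over three times a minute" frequency is to bucket the failed attempts by minute. This is a minimal sketch, assuming syslog-style timestamps at the start of each line and an `rhost=` field on every failed attempt; the sample lines (with documentation-range addresses) and the `attempts_per_minute` helper are made up for illustration and are not taken from the real log.

```ruby
# Made-up lines in the style of sshd/PAM failures; not the real log.
SAMPLE_LOG = <<~LOG
  Oct  1 07:30:01 myhost sshd[4242]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=192.0.2.10  user=root
  Oct  1 07:30:19 myhost sshd[4243]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=198.51.100.7  user=root
  Oct  1 07:30:40 myhost sshd[4244]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=192.0.2.10  user=root
  Oct  1 07:31:02 myhost sshd[4245]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=203.0.113.5  user=root
LOG

# Count failed attempts per minute, keyed by "Mon DD HH:MM".
def attempts_per_minute(log_text)
  per_minute = Hash.new(0)
  log_text.each_line do |line|
    next unless line.include?('rhost=')
    minute = line[/\A\w{3}\s+\d+\s+\d{2}:\d{2}/] # month, day, hour:minute
    per_minute[minute] += 1 if minute
  end
  per_minute
end

rates = attempts_per_minute(SAMPLE_LOG)
rates.each { |minute, count| puts "#{minute}  #{count}" }
```

Lines without a recognizable timestamp are simply skipped, so a truncated log does not abort the count.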
Initially it struck me that most of the attempts appeared to come from Europe (quite atypical for the usual botnet distribution), so I passed my logs through:
    #!/usr/bin/perl
    use Geo::IP;
    use IO::File;
    use strict;
    my ($geoip, $fh, %by_ip, %by_ctry);

    $fh = IO::File->new('/tmp/sshd_log');
    $geoip = Geo::IP->new(GEOIP_STANDARD);

    # Count the attempts made from each remote host
    while (my $lin = <$fh>) {
        next unless $lin =~ /rhost=(\S+)/;
        $by_ip{$1}++;
    }

    print " Incidence by IP:\n", "Num Ctry IP\n", ('=' x 60), "\n";
    for my $ip (sort { $by_ip{$a} <=> $by_ip{$b} } keys %by_ip) {
        # rhost can be either an IP address or a host name
        my $ctry = ($ip =~ /^[\d\.]+$/) ?
            $geoip->country_code_by_addr($ip) :
            $geoip->country_code_by_name($ip);
        $by_ctry{$ctry}++;
        printf "%3d %3s %s\n", $by_ip{$ip}, $ctry, $ip;
    }

    print " Incidence by country:\n", "Num Country\n", "============\n";
    # The listing is cut off at this sort; the loop around it is a reconstruction
    for my $ctry (sort { $by_ctry{$b} <=> $by_ctry{$a} } keys %by_ctry) {
        printf "%3d %s\n", $by_ctry{$ctry}, $ctry;
    }
The top countries (where the number of attempts ≥ 5) are:
    104 CN
     78 US
     58 BR
     49 DE
     43 PL
     20 ES
     20 IN
     19 RU
     17 CO
     17 UA
     16 IT
     13 AR
     12 ZA
     10 CA
     10 CH
      8 GB
      8 AT
      8 JP
      8 FR
      7 KR
      7 HK
      7 PE
      7 ID
      6 PT
      5 CZ
      5 AU
      5 BE
      5 SE
      5 RO
      5 MX
I am attaching to this post the relevant log (filtering out all the information I could regarding legitimate users) as well as the full output. In case somebody has seen this kind of wormish botnetish behaviour lately… please comment.
[Update] I have tried getting some data regarding the attacking machines, running a simple nmap -O -vv against a random sample (five machines, I hope I am not being too aggressive in anybody's eyes). They all seem to be running some flavor of Linux (according to the OS fingerprinting), but the list of open ports varies wildly - I have seen the following:
    Not shown: 979 closed ports
    PORT      STATE    SERVICE
    21/tcp    open     ftp
    22/tcp    open     ssh
    23/tcp    open     telnet
    111/tcp   open     rpcbind
    135/tcp   filtered msrpc
    139/tcp   filtered netbios-ssn
    445/tcp   filtered microsoft-ds
    593/tcp   filtered http-rpc-epmap
    992/tcp   open     telnets
    1025/tcp  filtered NFS-or-IIS
    1080/tcp  filtered socks
    1433/tcp  filtered ms-sql-s
    1434/tcp  filtered ms-sql-m
    2049/tcp  open     nfs
    4242/tcp  filtered unknown
    4444/tcp  filtered krb524
    6346/tcp  filtered gnutella
    6881/tcp  filtered bittorrent-tracker
    8888/tcp  filtered sun-answerbook
    10000/tcp open     snet-sensor-mgmt
    45100/tcp filtered unknown
    Device type: general purpose|WAP|PBX
    Running (JUST GUESSING) : Linux 2.6.X|2.4.X (96%), (…)

    Not shown: 993 filtered ports
    PORT     STATE  SERVICE
    22/tcp   open   ssh
    25/tcp   open   smtp
    80/tcp   open   http
    443/tcp  open   https
    444/tcp  open   snpp
    3389/tcp open   ms-term-serv
    4125/tcp closed rww
    Device type: general purpose|phone|WAP|router
    Running (JUST GUESSING) : Linux 2.6.X (91%), (…)

    Not shown: 994 filtered ports
    PORT    STATE  SERVICE
    22/tcp  open   ssh
    25/tcp  closed smtp
    53/tcp  closed domain
    80/tcp  open   http
    113/tcp closed auth
    443/tcp closed https
    Device type: general purpose
    Running (JUST GUESSING) : Linux 2.6.X (90%)
    OS fingerprint not ideal because: Didn't receive UDP response. Please try again with -sSU
    Aggressive OS guesses: Linux 2.6.15 - 2.6.26 (90%), Linux 2.6.23 (89%), (…)

    Not shown: 982 closed ports
    PORT      STATE    SERVICE
    21/tcp    open     ftp
    22/tcp    open     ssh
    37/tcp    open     time
    80/tcp    open     http
    113/tcp   open     auth
    135/tcp   filtered msrpc
    139/tcp   filtered netbios-ssn
    445/tcp   filtered microsoft-ds
    1025/tcp  filtered NFS-or-IIS
    1080/tcp  filtered socks
    1433/tcp  filtered ms-sql-s
    1434/tcp  filtered ms-sql-m
    4242/tcp  filtered unknown
    4444/tcp  filtered krb524
    6346/tcp  filtered gnutella
    6881/tcp  filtered bittorrent-tracker
    8888/tcp  filtered sun-answerbook
    45100/tcp filtered unknown
    Device type: general purpose|WAP|broadband router
    Running (JUST GUESSING) : Linux 2.6.X|2.4.X (95%), (…)

    Not shown: 994 filtered ports
    PORT     STATE SERVICE
    22/tcp   open  ssh
    25/tcp   open  smtp
    53/tcp   open  domain
    80/tcp   open  http
    110/tcp  open  pop3
    3389/tcp open  ms-term-serv
    Warning: OSScan results may be unreliable because we could not find at least 1 open and 1 closed port
    Device type: firewall|general purpose
    Running: Linux 2.6.X
    OS details: Smoothwall firewall (Linux 2.6.16.53), Linux 2.6.13 - 2.6.24, Linux 2.6.16
Of course, it strikes me that several among said machines seem to be Linuxes, but (appear to) run Microsoft services. Oh, and they also have P2P clients.
Submitted by gwolf on Mon, 08/31/2009 - 19:05
Today I had a nice and productive day, code-wise. Maybe that's a side effect of being unable to waste my time following E-mail?
Anyway, checking my code with git citool previous to today's git commit, I came across this method. I didn't even pay attention to it while writing. But it did make me laugh in semi-awe thinking about the great implications it might have. The method signature:
def days_for(who, what, how)
The code itself? Naah, too pedestrian, too simplistic. It will ruin the sight. It just looks so beautifully universal!
Ok, I am compelled to share, even if it spoils it and renders it into a completely regular, even stupid method.
    def days_for(who, what, how)
      return ' —No dates set— ' unless who.has_validity_period?
      days_to = who.send(how)
      return 'Past' if days_to <= 0
      '%d days (%s)' % [days_to, who.send(what)]
    end
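The trick that makes the signature read so universally is `send`, which calls whatever method name it is handed. Here is a small sketch of how the method might be exercised; the `Conference` struct and its field names are hypothetical stand-ins for the real model, and the "no dates" message is simplified.

```ruby
# Hypothetical stand-in for the model the method receives.
Conference = Struct.new(:reg_opens_on, :days_to_registration) do
  def has_validity_period?
    !reg_opens_on.nil?
  end
end

# Same logic as the method above, with a plain "no dates" message.
def days_for(who, what, how)
  return 'No dates set' unless who.has_validity_period?
  days_to = who.send(how) # "how" names the method that counts the days
  return 'Past' if days_to <= 0
  '%d days (%s)' % [days_to, who.send(what)] # "what" names the date to show
end

upcoming = Conference.new('2009-10-01', 31)
puts days_for(upcoming, :reg_opens_on, :days_to_registration)
```

Since both `what` and `how` are just method names, the same helper works for any pair of "date" and "days until date" methods an object offers.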
Submitted by gwolf on Mon, 06/08/2009 - 18:25
Having recently become a Unicode (ab)user, in great part due to Kragen's .XCompose, I again took up the mission to convince people that resistance is futile, you will be assimilated into the multilingual world of UTF8...
...And given the recent thread in debian-devel regarding how a globbing or similar functionality should be implemented (specifically, given Giacomo's message pointing out that our beloved «/» directory separator is subject to the locale rules)...
I cannot help but send you to this old piece of MSDN beauty: When is a backslash not a backslash?
In short: If you are surprised because in East Asia they use the local currency to separate directories... Don't be. Blame the 8 bits of extended, non-standard ASCII codepages.
Submitted by gwolf on Tue, 02/10/2009 - 14:51
I have spent a couple of days working on dh-make-drupal. Yes, you guessed right: An idea based on the wonderful dh-make-perl, but applied to the Drupal Content Management System.
Drupal's greatest strengths, IMHO, are:
- Drupal offers a huge number of modules and themes
- Drupal has an amazingly sane configuration handling, where -contrary to what usually happens in PHP-land and, in general, among webapps- you set up the code only at a single place, with only the site-specific configuration (usually a single file) handling all of the differences
Yup, even though I am quite fond of its flexibility and power, I fell for Drupal in no small part because of its sysadmin-friendliness.
Now, I hate having non-Debian-packaged files spilled over my /usr/share partition. Drupal modules want to be installed in /usr/share/drupal5/modules/module_name (or s/5/6/ for Drupal6, to which I have not yet migrated). For that reason, over the last year I have been growing my personal apt repository of Drupal stuff. Yes, it is still on, and I don't plan on taking it off. You can access it by adding deb http://www.iiec.unam.mx/apt/ etch drupal to your /etc/apt/sources.list. However, you can now also do the process locally. Do you fancy the wonderful Biblio module? Or the very nice Abarre theme? Great!
    0 gwolf@mosca『4』/tmp$ ~/code/dh-make-drupal/dh-make-drupal --drupal 5 biblio
    0 gwolf@mosca『5』/tmp$ cd drupal5-mod-biblio-1.16/
    0 gwolf@mosca『6』/tmp/drupal5-mod-biblio-1.16$ debuild -us -uc >& /dev/null
    0 gwolf@mosca『7』/tmp/drupal5-mod-biblio-1.16$ cd ..
    0 gwolf@mosca『8』/tmp$ su
    Password:
    0 root@mosca[1]/tmp# dpkg -i drupal5-mod-biblio_1.16-1_all.deb
    Selecting previously deselected package drupal5-mod-biblio.
    (Reading database ... 275110 files and directories currently installed.)
    Unpacking drupal5-mod-biblio (from drupal5-mod-biblio_1.16-1_all.deb) ...
    Setting up drupal5-mod-biblio (1.16-1) ...
Yay!
Yes, still many more things to come (i.e. including the debuild call and whatnot), but... Enjoy!
BTW, this piece of software owes a couple of beers to Why the lucky stiff, author of Hpricot. You are insane (but we are all well aware of that). You deserve to go to the webscraping heaven. Yes, besides the programming-languages-teaching-cartoon heaven. You find out how to split the time between them.
[Update]: Of course, ITP bug #514786 has been filed, and I will soon be uploading this into Debian.
Submitted by gwolf on Wed, 01/14/2009 - 17:41
I am stunned that more people have not been bitten by this. Or at least, the Intarweb has not heard about it. Censorship perhaps? I haven't researched more into the causes, but anyway...
I was pushing a project I have had lingering for some time from Rails 2.0.x to 2.1.x (yes, 2.2 is already out there, but 2.1 is the version that will ship with Lenny) - The changes should not be too invasive, as it is a minor release, but there are some quite noticeable changes.
Anyway... What was the problem? Take this very simple migration:
    class CreatePeople < ActiveRecord::Migration
      def self.up
        create_table :people do |t|
          t.column :login, :string, :null => false
          t.column :passwd, :string, :null => false
          t.column :firstname, :string, :null => false
          t.column :famname, :string, :null => false
          t.column :email, :string
          t.column :pw_salt, :string
          t.column :created_at, :timestamp
          t.column :last_login_at, :timestamp
        end
      end

      def self.down
        drop_table :people
      end
    end
The problem is that PostgreSQL refuses to create a hyperdimensional string field. I offer this here to you, line-wrapped by me for your convenience.
    PGError: ERROR: syntax error at or near "("
    LINE 1: ...serial PRIMARY KEY, "login" character varying(255)(255) NOT ...
            ^
    : CREATE TABLE "people" ("id" serial PRIMARY KEY,
      "login" character varying(255)(255) NOT NULL,
      "passwd" character varying(255)(255)(255) NOT NULL,
      "firstname" character varying(255)(255)(255)(255) NOT NULL,
      "famname" character varying(255)(255)(255)(255)(255) NOT NULL,
      "email" character varying(255)(255)(255)(255)(255)(255) DEFAULT NULL NULL,
      "pw_salt" character varying(255)(255)(255)(255)(255)(255)(255) DEFAULT NULL NULL,
      "created_at" timestamp DEFAULT NULL NULL,
      "last_login_at" timestamp DEFAULT NULL NULL)
Beautiful. Now I can store strings not only as character vectors, but as planes, cubes, hypercubes, and any other hyperdimensional construct! Are we approaching quantum computers?
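The pattern in the error, one more (255) for every string column, is what appending in place to a shared string would produce. The following is a minimal sketch of that general mechanism only; it is an assumption about the kind of bug involved and not the actual Rails code, and every name in it (`fresh_types`, `buggy_type_to_sql`, `safe_type_to_sql`) is made up.

```ruby
# Illustration only: NOT the actual Rails code.
# A table of native type names, as an adapter might keep one.
def fresh_types
  { string: { name: String.new('character varying'), limit: 255 } }
end

# Buggy: "<<" appends to the shared String stored in the table,
# so every call leaves the type name one "(255)" longer.
def buggy_type_to_sql(types, type)
  native = types.fetch(type)
  native[:name] << "(#{native[:limit]})"
end

# Safer: interpolation builds a new String and leaves the table alone.
def safe_type_to_sql(types, type)
  native = types.fetch(type)
  "#{native[:name]}(#{native[:limit]})"
end

shared = fresh_types
3.times { puts buggy_type_to_sql(shared, :string) }

untouched = fresh_types
3.times { puts safe_type_to_sql(untouched, :string) }
```

The buggy variant prints a longer type on each call, while the safe one keeps printing the same thing, which is the difference between the CREATE TABLE above and a working one.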
What is really striking is that... I found only one occurrence on the net of this bug - one and a half years ago, in Ola Bini's blog. No stunned users looking for the culprit, no further reports... Strange.
Still, the bug was fixed in Rails 2.2 about half a year ago, although not in revisions of earlier versions. I will request the patch to be applied to earlier versions as well. Sigh.
Submitted by gwolf on Sat, 12/13/2008 - 19:35
Ok, so a handy meme is loose: Handy Git tips. We even had a crazy anatidae requesting that we post to the Git wiki whatever we publish in this regard on our personal blogs.
Following Damog's post, I will also put my .bashrc snippet:
    parse_git_branch() {
        branch=`git branch 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/\1/'`
        if [ ! -z "$branch" ]
        then
            if ! git status|grep 'nothing to commit .working directory clean' 2>&1 > /dev/null
            then
                branch="${branch}*"
                mod=`git ls-files -m --exclude-standard|wc -l`
                new=`git ls-files -o --exclude-standard|wc -l`
                del=`git ls-files -d --exclude-standard|wc -l`
                if [ $mod != 0 ]; then branch="${branch}${mod}M"; fi
                if [ $new != 0 ]; then branch="${branch}${new}N"; fi
                if [ $del != 0 ]; then branch="${branch}${del}D"; fi
            fi
        fi
        echo $branch
    }
This gives me the following information on my shell prompt:
- The git branch where we are standing
- If it has any uncommitted changes, a * is displayed next to it
- If there are changes not checked in to the index, M (modified), N (new) or D (deleted) is displayed, together with the number of files in said condition.
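The format described in the list above can be seen in isolation by feeding the counts in by hand. This is a sketch in Ruby rather than shell, and the `prompt_branch` helper and its keyword names are hypothetical; it mirrors only the string-building part of the function, not the calls to git.

```ruby
# Hypothetical helper: counts are passed in rather than read from git,
# so only the formatting rules of the prompt are exercised.
def prompt_branch(branch, clean:, modified: 0, added: 0, deleted: 0)
  return '' if branch.nil? || branch.empty? # not inside a git tree
  return branch if clean                    # nothing to commit

  suffix = +'*'                             # uncommitted changes exist
  suffix << "#{modified}M" unless modified.zero?
  suffix << "#{added}N" unless added.zero?
  suffix << "#{deleted}D" unless deleted.zero?
  branch + suffix
end

puts prompt_branch('master', clean: false, modified: 2, added: 1)
puts prompt_branch('master', clean: true)
```

A tree whose only changes are already staged has all three counts at zero, so it shows just the branch name followed by `*`.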

Sometimes, entering a very large git tree takes a second or two... But once it has run once, it goes on quite smoothly.
Of course, I still have this also in .bashrc - but its functionality pales in comparison:
    get_svn_revision() {
        if [ -d .svn ]
        then
            svn info | grep ^Revision | cut -f 2 -d ' '
        fi
    }
I am sure it can be expanded, of course - but why? :)
Submitted by gwolf on Wed, 12/10/2008 - 14:03
There is quite a bit of software whose upstream authors decide that, as they are already using Git for development, the main distribution channel should be GitHub - This allows, yes, for quite a bit of flexibility, which many authors have taken advantage of.
So, I just registered and set up http://githubredir.debian.net/ to make it easier for packagers to take advantage of it.
Specifically, what does this redirector do? Given that GitHub allows for downloading any given commit as a .zip or as a .tar.gz, it suddenly becomes enough to git tag with a version number, and GitHub magically makes that version available for download. Which is sweet!
Sometimes it is a bit problematic, though, to follow their format. GitHub gives a listing of the tags for each particular project, and each of those tags has a download page, with both archiving formats.
I won't go into too much detail here - Thing is, going over several pages becomes painful for Debian's uscan, widely used for several of our QA processes. There are other implemented redirectors, such as the one used for SourceForge.
This redirector is mainly meant to be consumed by Debian's uscan. Anybody who finds this system useful can freely use it, although you might be better served by the rich, official GitHub.com interface.
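What makes a page "uscan-friendly" is that every version shows up as a plain link on a single page, so one regular expression can pick them all up. The sketch below only illustrates that idea: the `tag_listing` helper, the example.org base URL and the file naming are placeholder assumptions, not the URL scheme that GitHub or githubredir actually use.

```ruby
require 'erb'

# Render one flat HTML list with a download link per tag.
# The URL layout is a made-up placeholder.
def tag_listing(user, project, tags, base: 'https://example.org/download')
  items = tags.map do |tag|
    href = "#{base}/#{user}/#{project}/#{project}-#{tag}.tar.gz"
    "<li><a href=\"#{ERB::Util.html_escape(href)}\">#{ERB::Util.html_escape(tag)}</a></li>"
  end
  "<ul>\n#{items.join("\n")}\n</ul>"
end

html = tag_listing('someuser', 'someproject', %w[0.1 0.2 1.0])
puts html

# A single pattern over the single page recovers every version,
# much as a watch file pattern would.
versions = html.scan(/href="[^"]*-([\d.]+)\.tar\.gz"/).flatten
puts versions.join(', ')
```

With the tags spread over one download page each, the same scan would need to follow a link per version first, which is the pain the redirector is meant to remove.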
Anyway - Enough repeating what I said on the http://githubredir.debian.net/ base page. Find it useful? Go ahead and use it!
Submitted by gwolf on Mon, 12/08/2008 - 23:57
Thanks to some unexplained comments on some oldish entries on my blog, I found -with a couple of days of delay- Rubigem is from Mars, Apt-get is from Venus, in Pelle's weblog. And no, I have not yet read the huge amount of comments generated from it... Still, I replied with the following text - And I am leaving this blog post in place to remind me to further extend my opinions later on.
Wow... Quite a bit of comments. And yes, given that the author wrote a (very well phrased and balanced) post, I feel obliged to reply. But given that he referred to me first, I'll just skip the chatter for later - I'm tired this time of day ;-)
Pelle, I agree with you - This problem is because we are from two very different mindsets. I have already said so - http://www.gwolf.org/soft/debian+rails is a witness to that point.
But I do not think the divide is between sysadmins and developers. I am a developer who grew from the sysadmin stance, but that's not, AFAICT, that common a case in Debian.
Thing is, in a distribution, we try to cater for common users. I have a couple of Rails apps under development that I expect to be able to package for Debian, and I think can be very useful for the general public.
Now, how is the user experience when you install a desktop application, in whatever language/framework it is written? You don't care what the platform is - you care that it integrates nicely with your environment. Yes, the webapp arena is a bit more difficult - but we have achieved quite a bit of advance in that way. Feel like using a PHP webapp? Just install it, and it's there. A Python webapp? Same thing. A Perl webapp? As long as you don't do some black magic (and that's one of the main factors that motivated me away from mod_perl), the same: Just ask apt-get to install it and you are set.
But... What about installing a Rails application? From a package manager? For a user who does not really care about what design philosophy you followed, who might not even know what a MVC pattern is?
Thing is, distributions aim at _users_. And yes, I have gradually adopted a user's point of view. I very seldom install anything not available as a .deb - and if I do, I try to keep it clean enough so I can package it for my personal use later on.
Anyway... I will post a copy of this message in my blog (http://gwolf.org/), partly as a reminder to come back here and read the rest of the buzz. And to go to the other post referenced here. And, of course, I invite other people involved in Ruby and Debian to continue sharing this - I am sure I am not the only person (or, in more fairness, that Debian's pkg-ruby-extras team is not the only team) interested in bridging this huge divide and getting to a point where we can interact better - And I am sure that among the Rubyists many people will also value having their code usable by non-developers as well.
Submitted by gwolf on Wed, 07/23/2008 - 13:50
Several weeks ago, the people in charge of maintaining the Windows machines in my institute were desperate because of a series of virus outbreaks - especially, as expected, in the public lab, but the whole network smelled virulent. After seeing their desperation, I asked Rolman to help me come up with a solution. He suggested I try replacing the local installations on the Windows workstations with a server running several virtual machines, all regenerated from a clean image every day, and exporting rdesktop sessions. He suggested using Xen for this, as it is so far the virtualization/paravirtualization solution best offered and supported by most Linux distributions (including, of course, RedHat, towards which he is biased, and Debian, towards which I am... more than biased, even bent). So far, no hassle, right?
Of course, I could just stay clear of this mess, as everything related to Windows is off my hands... But in October, we will be renewing ~150 antivirus licences. I want to save that money by giving a better solution, even if part of that money gets translated to a big server.
Get the hardware
But problems soon arose. The first issue was hardware. Xen can act in its paravirtualization mode on basically any x86 machine - but it requires a patched guest kernel. That means I can paravirtualize several different free OSs on just about any computer I lay my hands on here, but Windows requires full (or hardware-assisted) virtualization. And, of course, only one of the over 300 computers we have (around 100 of which are recent enough for me to expect to be usable as a proof-of-concept for this) has a CPU with VT extensions - And I'm not going to decommission my firewall to become a test server! ;-)
When software gets confused for hardware
So, I requested an Intel® Core™2 Quad Q9300 CPU, which I could just drop in any box with a fitting motherboard. But, of course, I'm not the only person requiring computer-related stuff. So, after pestering the people in charge of buying stuff on a daily basis for three weeks, the head of acquisitions came smiling to my office with a little box in his hands.
But no, it was not my Core 2 Quad CPU.
It was a box containing... Microsoft Visio. Yes, they spent their effort looking for the wrong computer-related thingy :-/ And meanwhile, Debconf 8 is getting nearer and nearer. Why does that matter? Because I have a deadline: By October, I want the institute to decide not to buy 150 antivirus licenses! Debconf will take some time off that target from me.
Anyway... The university vacations started on July 5. The first week of vacations I went to sweat my ass off at Monterrey, by Monday 14 I came back to my office, and that same day I finally got the box, together with two 2GB DIMMs.
Experiences with a nice looking potential disaster
Anyway, by Tuesday I got the CPU running, and a regular Debian install in place. A very nice workhorse: 5GB RAM, quad core CPU at 2.5GHz, 6MB cache (which seems to be split in two 3MB banks, each for two cores - but that's pure speculation from me). I installed Lenny (Debian testing), which is very soon going to freeze and by the time this becomes a production server will be very close to being a stable release, and I wanted to take advantage of the newest Xen administration tools. Of course, the installation was for AMD64 - Because 64 bitness is a terrible thing to waste.
But I started playing with Xen - And all kinds of disasters struck. First, although there is a Xen-enabled 2.6.25 Linux kernel, it is -686 only (i.e. no 64 bit support). Ok, install a second system on a second partition. Oh, but this kernel is only domU-able (that is, it will correctly run in a Xen paravirtualized host), but not dom0-able (it cannot act as a root domain). Grmbl.
So, get Etch's 2.6.18 AMD64 Xen-enabled kernel, and hope for the best. After all, up to this point, I was basically aware of many of the facts I mentioned (i.e. up to this point I did reinstall once, but not three times)... And I hoped the kernel team would have good news regarding a forward-port of the Xen dom0 patches to 2.6.25 - because losing dom0 support was IMO a big regression.
But quite on time, this revealing thread came up on the debian-devel mailing list. In short: Xen is a disaster. The Xen developers have done their work quite far away from the kernel developers, and the last decent synchronization that was made was in 2.6.18, over two years ago. Not surprisingly, enterprise-editions of other Linux distributions also ship that kernel version. There are some forward-patches, but current support in Xen is... Lacking, to say the least. From my POV, Xen's future in the Linux kernel looks bleakish.
Now, on the lightweight side...
Xen is also a bit too complicated - Of course, its role is complicated as well, and it has a great deal of tunability. But I decided to keep a clean Lenny AMD64 install, and give KVM, the Kernel-based Virtual Machine, a go. My first gripe? What a bad choice of name. Google searches for KVM give completely unrelated answers, as the name is already well known for something else, even in the same context, even in the same community.
KVM takes a much, much simpler approach to virtualization (both para- and full-): We don't need no stinkin' hypervisors. The kernel can just do that task. And then, kvm becomes just another almost-regular process. How nice!
In fact, KVM borrows so very much from qemu that it even refers to qemu's manpage for everything but two command-line switches.
Qemu is a completely different project, which gets to a very similar place but from the other extreme - Qemu started off as Bochs, a very slow but very useful multi-architecture emulator. Qemu started adding all kinds of optimizations, and it is nearly useful (e.g. I use it on my desktop whenever I need a W2K machine).
Instead of a heavyweight framework... KVM is just a modprobe away - Just ask Linux to modprobe kvm, and kvm -hda /path/to/your/hd/image gets you a working machine.
Anyway - I was immediately happy with KVM. It took me a week to get a whole "lab" of 15 virtual computers (256MB RAM works surprisingly well for a regular XP install!) configured to start at boot time off a single master image over qcow images.
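As a rough illustration of that setup, here is a minimal Ruby sketch that only builds (and prints) the commands involved: one copy-on-write qcow2 image per machine, backed by the master image, and one kvm invocation per machine. It is not the script I actually use. The paths, the lab01..lab15 names and the helper names are assumptions made up for the example. Newer qemu-img versions may also want the backing image's format stated explicitly.

```ruby
# Sketch only: paths, machine names and memory size are assumptions.
MASTER_IMAGE = '/srv/kvm/master.img' # hypothetical master image
OVERLAY_DIR  = '/srv/kvm/overlays'   # hypothetical directory for per-VM images

# Command that creates a copy-on-write qcow2 image backed by the master image.
def overlay_command(name)
  ['qemu-img', 'create', '-f', 'qcow2', '-b', MASTER_IMAGE,
   "#{OVERLAY_DIR}/#{name}.qcow2"]
end

# Command that boots one guest from its overlay with the given RAM (in MB).
def boot_command(name, mem_mb = 256)
  ['kvm', '-hda', "#{OVERLAY_DIR}/#{name}.qcow2", '-m', mem_mb.to_s]
end

# Names for a lab of `count` machines: lab01, lab02, ...
def lab_names(count = 15)
  (1..count).map { |i| format('lab%02d', i) }
end

# Print the commands instead of running them, so nothing is touched.
if __FILE__ == $PROGRAM_NAME
  lab_names.each do |name|
    puts overlay_command(name).join(' ')
    puts boot_command(name).join(' ')
  end
end
```

Keeping the commands as arrays means they could later be handed to system(*cmd) without any shell quoting worries.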
KVM's shortcomings
Xen has already been a long time in the enterprise, and has a nice suite of administrative tools. While Xen depends on having a configuration file for each host, KVM expects the settings to be passed at the command line. To get a bird's-eye view of the system, Xen has a load of utilities - KVM does not. And although RedHat's virt-manager is said to support KVM and qemu virtualization (besides its native Xen, of course), it falls short of what I need (e.g. it relies on a configuration file... which lacks the expressivity to specify a snapshot-based HD image).
To my surprise, KVM has attained many of Xen's most amazing capabilities, such as live migration. And although it's easier to just use fully virtualized devices (e.g. to use an emulation of the RTL8139 network card), as they require no drivers extraneous to the operating system, performance can be greatly enhanced by using the VirtIO devices. KVM is quickly evolving, and I predict it will largely take over Xen's place (and, of course, that of VMware and others).
Where I am now
So... Well, it seems that those of us who adopt KVM and want to get it into production now will have some work to do building the tools to gracefully manage and report on it. I won't be touching my setup much until after Debconf, but so far I've done some work over Freddie Cash's kvmctl script. I'm sending him some patches to make his script (IMHO) more reliable and easier to automate (if you are interested, you can get my current version of the script as well). And... Starting September, I expect to start working on a control interface able to cover my other needs (such as distributing configuration to the terminals-to-be, or centrally managing the configurations).
Submitted by gwolf on Thu, 07/17/2008 - 15:42
Last week (July 7-13) was basically hell on Earth, for me and for the group that somehow got the name Cabras locas, of which I have been part since I joined the National Pedagogical University, where I worked full-time 2003-2005.
It was, yes, the first of my officially three weeks of Summer holiday at IIEc-UNAM, so no problems here. So, why hell on Earth? Because we were in charge of basically anything related to information flow, retrieval and manipulation at the 11th International Congress on Mathematical Education, in Monterrey.
What we thought would basically be one or two days of hard work followed by six days of relaxed vacations (we had even planned to have an internal seminar, showing off the shiny stuff each of us is working on) became... A mind-boggling eight day experience where we worked over 12 hours a day on being human replacements for Google, SQL engines, full-text parsers, report generators, printer watchdogs, and in general lines, just a bunch of unhappy firemen, ready to be called upon for whatever task was necessary.
We did have, of course, several calm periods every now and then. We even had to learn how to look busy while doing something completely unrelated (that would explain, for example, a couple of low-hanging bugs I fixed for Debian, or some dozens of lines of code I could get off my head).
But my advice for whoever reads this: Don't trust people with long database-handling experience. Especially when they insist that hand-capturing a thousand records is preferable (i.e. less error-prone) to parsing three separate databases and discarding duplicates. And, of course, especially when this person is your boss, which is enough of an argument to have it his way.
Submitted by gwolf on Thu, 06/05/2008 - 16:59
I think I should follow up on Victor's lament. Yes, we have a Rails application which works fine most of the time... But quite often, it throws out a segmentation fault I have just been unable to pinpoint. It might be related to rmagick, the only non-pure-Ruby component I am using (and I'm tempted to try minimagick instead, even if I prefer in-memory operations to on-disk ones, piping an image and slurping it again).
Victor came up with an easy script to check the server. But I had changed the setup to reduce the impact of each crash: I was running a single Mongrel instance, which meant that whenever it died the whole system became inaccessible to everybody, so I replaced it with a mongrel_cluster of five processes, plus pound as an easy-to-use balancer, which looks quite nice. With that change, the very simplistic and to-the-point script no longer worked.
Anyway... Ruby rocks ;-) I'm sharing this with you mostly because I am sure some readers will find more than one useful construct, not because it is precisely beautiful code. And besides, we should work on fixing the cause, not the consequence, of the bug! :)
#!/usr/bin/ruby
require 'yaml'

confdir = '/etc/mongrel-cluster/sites-enabled'
restart_cmd = '/etc/init.d/mongrel-cluster restart'
needs_restart = false

(Dir.open(confdir).entries - ['.', '..']).each do |site|
  conf = YAML.load_file "#{confdir}/#{site}"
  pid_location = [conf['cwd'], conf['pid_file']].join('/').gsub(/\.pid$/, '*.pid')
  pid_files = Dir.glob(pid_location)
  pid_files.each do |pidf|
    pid = File.read(pidf)
    begin
      Process.getpgid(pid.to_i)
    rescue Errno::ESRCH
      warn "Process #{pid} (cluster #{site}) is dead!"
      File.unlink pidf
      needs_restart = true
    end
  end
end

system(restart_cmd) if needs_restart
Works out of the box for any Debian-packaged mongrel-cluster. Sadly, mongrel-cluster does not provide a way to restart individual servers - Of course, I could (should, even) work it out to build the specific command-line... but at least, it works for now.
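The liveness test at the heart of the script above can be pulled out into a couple of small helpers, which would be a first step towards restarting only the dead servers. This is just a sketch: the names process_alive? and stale_pid_files are mine, not part of mongrel-cluster, and I chose to treat an empty or unparsable pid file as stale.

```ruby
# Returns true if a process with the given pid currently exists.
# Process.getpgid raises Errno::ESRCH when there is no such process.
def process_alive?(pid)
  Process.getpgid(pid)
  true
rescue Errno::ESRCH
  false
end

# Given a list of pid file paths, returns those whose recorded process
# is gone. A file without a usable pid (to_i gives 0) counts as stale.
def stale_pid_files(pid_files)
  pid_files.select do |path|
    pid = File.read(path).to_i
    pid <= 0 || !process_alive?(pid)
  end
end
```

With these, something like stale_pid_files(Dir.glob(pid_location)) gives the list of servers that would need to be started again, instead of a single yes/no answer for the whole cluster.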
Uh-oh... Does that mean it's permanent?
Submitted by gwolf on Tue, 05/06/2008 - 11:30
I usually don't like "me too" comments... But this is something that really disappoints me about my otherwise-favorite development framework. I must echo Matt Palmer's comment on Luke Kanies' entry:
Ruby. Has. A. Distribution. Problem.
Nice, good read. Sadly, many Rails pushers see distributability as something very minor, something that should not worry Rails developers right now, as there is too much other serious work to be done - Better UTF8, a clearer language, better performance... And besides, any programmer can live well with gems. (yes, that's all taken from a rant I had with a very convinced person)
My gripe is that... Rails is no longer a small, fringe project. Rails is an enterprise-grade development framework, with thousands of deployed production systems. And if they don't start to act responsibly, if the Rails developers keep pushing said problems as low-priority, the culture of Rails' users (that is, of the developers who build on it) will become rigid - and will constitute a serious harm to Rails' future.
Distributability and packageability are not only for OS distributors. It is not only we Debian zealots who care about software being easily packageable. By using Ruby Gems, you dramatically increase entropy and harm your systems' security.
Read Luke's text for more details. It is quite worth the time.
Submitted by gwolf on Sun, 04/13/2008 - 11:52
This is the third plugin I have been working on for Rails - It's the one that still needs most work to be fully usable (for what I originally envisioned it), but is almost there - I have not touched it (nor the other two I recently presented here, acts_as_catalog and Real FK) because real life has demanded my time on various other fronts.
Anyway, this is IMHO the most interesting of the three plugins I've written so far.
So, following what I did with the other plugins: You can go to acts_as_magic_model's main page, or jump straight to its SVN repo. And the basic description follows:
Rails plugin to allow for extensible models, where inter-model relations are inferred from the database's structure, and not necessarily from explicit declarations in the models.
This means, when one of your models acts_as_magic_model, this plugin will try to infer all the basic relations (has_many, belongs_to and has_and_belongs_to_many) its base table has with other tables in the database, adapting the model correspondingly.
Using the plugin
In order to use this plugin, you only have to declare it in your model. As an example, say you have the proverbial project-management system, where you have projects (each of which can be of several different types) and people, and a HABTM relation between people and projects. So, you have the following tables (expressed as migrations here):
create_table :people, :force => true do |t|
  t.column :name, :string
end

create_table :project_types, :force => true do |t|
  t.column :name, :string
end

create_table :projects, :force => true do |t|
  t.column :name, :string
  t.column :project_type_id, :integer
end

create_table :people_projects, :force => true, :id => false do |t|
  t.column :person_id, :integer
  t.column :project_id, :integer
end
Using acts_as_magic_model, you can skip declaring the relations in your models, like this: (of course, each of the classes still has to be declared in its own file)
class Person < ActiveRecord::Base
  acts_as_magic_model
end

class Project < ActiveRecord::Base
  acts_as_magic_model
end

class ProjectType < ActiveRecord::Base
  acts_as_magic_model
end
And the gaps will be filled in for you - that is, you can go on and ask for Project.find(:first).people, ProjectType.find(:first).projects.size, Person.find(:first).projects.map {|pr| pr.project_type}, and whatever you fancy.
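To give an idea of the kind of inference involved, here is a small standalone sketch in plain Ruby. It is not the plugin's actual code and it does not touch ActiveRecord at all: the schema is handed in as a hash of table names and column names (the four tables from the example above), and pluralize is a toy stand-in that only knows the names used here. The assumed rules are the usual Rails conventions: a something_id column points to the somethings table, and a table with no id and exactly two such columns is a join table.

```ruby
# Illustration only; NOT the plugin's implementation.
# Schema as a plain hash: table name => column names.
SCHEMA = {
  'people'          => %w[id name],
  'project_types'   => %w[id name],
  'projects'        => %w[id name project_type_id],
  'people_projects' => %w[person_id project_id]
}.freeze

# Toy pluralizer covering only the names in this example (assumption).
IRREGULAR = { 'person' => 'people' }.freeze
def pluralize(word)
  IRREGULAR.fetch(word) { "#{word}s" }
end

# Tables referenced by the *_id columns, keeping only tables that exist.
def referenced_tables(columns, schema)
  columns.grep(/_id\z/)
         .map { |col| pluralize(col.sub(/_id\z/, '')) }
         .select { |table| schema.key?(table) }
end

# Returns table name => list of [relation, other table] pairs.
# Relations are expressed between table names, not model names.
def infer_relations(schema)
  relations = Hash.new { |hash, table| hash[table] = [] }
  schema.each do |table, columns|
    refs = referenced_tables(columns, schema)
    # A join table: no id column, and nothing but two foreign keys.
    if !columns.include?('id') && columns.size == 2 && refs.size == 2
      first, second = refs
      relations[first]  << [:has_and_belongs_to_many, second]
      relations[second] << [:has_and_belongs_to_many, first]
    else
      refs.each do |target|
        relations[table]  << [:belongs_to, target]
        relations[target] << [:has_many, table]
      end
    end
  end
  relations
end
```

Calling infer_relations(SCHEMA) reports, for instance, that projects belongs to project_types and has a HABTM relation with people, which is what lets a call such as Project.find(:first).people work without any explicit declaration.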
In future versions of the plugin, I expect to get rid even of the need to explicitly declare the classes.
Of course, you should be warned: Initializing a model using this plugin is quite database-intensive. It will clutter your logs, and it might be unbearable if you use it in development mode. Still, its convenience is very often worth it.
Its main uses should be when prototyping and when designing a system that needs to be flexible to the data models themselves - If you don't expect your system's data structures to be highly malleable, you should probably use acts_as_magic_model only as a prototyping aid.