Coding

10 PRINT CHR$(205.5+RND(1)); : GOTO 10 (also known as #10print)

Submitted by gwolf on Tue, 11/25/2014 - 00:41

The line of BASIC code that appears as the subject for this post is the title of a book I just finished reading, and enjoyed thoroughly. The book is available online for download under a CC-BY-NC-SA 3.0 license, so you can take a good look at it before (or instead of) buying it. It is among the books I will enjoy having on my shelf, though; the printed edition is of very good quality.

And what is this book about? Well, of course, it analyzes that very simple line of code, as it ran on the Commodore 64 thirty years ago.

And the analysis is made from every possible angle: What do mazes mean in culture? What have they meant in cultures through history? What about regularity in art (mainly 20th century art)? How would this code look (or how would it be adapted) on contemporary non-C64 computers? And in other languages more popular today? What does randomness mean? And what does random() mean? What is BASIC, and how did it come to the C64? What is the C64, and where did it come from? And several other beautiful chapters.
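Speaking of how the one-liner might be adapted to contemporary computers: here is a quick sketch of my own (not from the book) of a port to a modern Unicode terminal. PETSCII characters 205 and 206 are the two diagonals, so CHR$(205.5+RND(1)) simply picks one of them at random; U+2571 and U+2572 play that role here.

```ruby
# 10 PRINT CHR$(205.5+RND(1)); : GOTO 10 — a modern-terminal sketch.
# Each cell of the maze is one of the two diagonals, chosen at random.
def ten_print(cells)
  Array.new(cells) { ["╱", "╲"].sample }.join
end

puts ten_print(320)   # eight 40-column C64 rows' worth of maze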

The book was collaboratively written by ten different authors, in a Wiki-like fashion. And... Well, what else is there to say? I enjoyed so much reading through long chapters of my childhood, of what attracted me to computers, of my cultural traits and values... I really hope that, in due time, I can be a part of such a beautiful project!


Language designers, API designers, PHP and utter fails

Submitted by gwolf on Wed, 09/07/2011 - 13:20

After many years of successfully dodging any serious programming in PHP, I had to get my feet wet with it for my RL job: I was requested to develop a simple but non-trivial module for our Institute's Drupal-based webpage.

It basically meant two and a half weeks devoted to head-scratching. I had read John van Dyk's very good Pro Drupal Development book, and knew it would be an important resource were I to face writing a module or working on a theme beyond the most basic stuff… So I checked it out of the library, and basically started writing something similar to my good and trusty Perl code. After all, PHP seems quite similar to Perl, although it forces you to write more for no gain (i.e. requiring an array() declaration whenever you want to store more than one value together) and lacks important and useful constructs (there is no sane way to prepare a SQL statement for multiple executions with different parameters — yes, there are DB access methods that do provide it, but Drupal 6 does not use them).

Anyway, book in hand, I started understanding Drupal's notions while implementing (which is way different from just reading the book, right?). I cannot say I like them, but it's… ahem… doable.

Now, I hit a problem twice. I chose to ignore it the first time, as it was a corner case I'd look into later on, but I had to devote hours of my attention to it later. When designing the menu (which for Drupal means not only the facility that prepares/displays links to the bits of functionality, but also the access control layer and the action dispatcher — a huge yay! for responsibility separation!), I had only two access levels: public and administrator. So, this seemed like a good fit:

    $public = array('first/action', 'second/action',
                    'third/action', 'something/else');

    foreach (array_keys($items) as $item) {
        if (array_search($item, $public)) {
            $items[$item]['access callback'] = TRUE;
        } else {
            $items[$item]['access arguments'] = array('diradmin');
        }
    }

But... No matter what I did, the first element in $public refused to be publicly visible.

It was not until after a severe amount of head-scratching that I came across this jewel in the PHP online manual:

Warning

This function may return Boolean FALSE, but may also return a non-Boolean value which evaluates to FALSE, such as 0 or "". Please read the section on Booleans for more information. Use the === operator for testing the return value of this function.

GRAH. Using a sane language for some time had made me forget about the problems of true/false sharing space with other meaningful values. So, yes, when checking for inclusion of a value in an array this way, the return value of array_search() should be compared to FALSE with type-strict identity (that's what === means) — or better yet, replaced with a function that exclusively returns a boolean (such as in_array()).
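For contrast, a quick sketch of why this footgun doesn't bite in the "saner" languages I was missing: Ruby's Array#index also returns 0 for the first element, just like array_search(), but 0 is truthy in Ruby, so even the naive check does the right thing.

```ruby
public_paths = ['first/action', 'second/action', 'third/action']

# index returns 0 for the first element, exactly like PHP's array_search()...
idx = public_paths.index('first/action')    # => 0
# ...but Ruby treats 0 as truthy, so the naive membership test still works:
puts 'public!' if idx

# The explicit membership check, Ruby's analogue of PHP's in_array():
puts 'public!' if public_paths.include?('first/action')
```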

Anyway… While arrays (which in PHP are any kind of list, be it keyed as a hash or consecutive as a traditional array) are such a usual construct in any language, please do take a look at PHP's array-handling API. Too long. Too complex. Too many corner cases.

I cannot but wonder what keeps PHP as a popular language. It hurts.


Starting work on KindleClip

Submitted by gwolf on Fri, 03/18/2011 - 19:09

Quite probably, the best thing I got for myself during the last year was my Kindle. I just love it! It has changed the way I interact with knowledge, and has saved me from hours of boredom.

But it has also taught me the value of scribbling along the book pages and of underlining passages. Yes, I hold a deep regard for my regular (paper) books, and I never scribble on them, not even on textbooks. In any case, I can scribble on a post-it or something like that.

Still, when you underline or comment on a passage on the Kindle, what can you do with your annotation? Well, not much. Annotations (called clippings in Kindle-speak) are stored in an easily accessible My Clippings.txt text file, very easy to parse and work with.
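How easy to parse? A minimal sketch of mine (assuming the usual layout: entries separated by a line of ten '=' signs, with the book title on the first line, the clipping metadata on the second, and the clipping text below):

```ruby
# Parse a My Clippings.txt-style file into an array of hashes.
# Assumes entries are delimited by lines of '==========' — the layout
# I have seen on my own device; other firmware versions may differ.
def parse_clippings(text)
  text.split(/^==========\r?\n?/).reject { |e| e.strip.empty? }.map do |entry|
    title, meta, *body = entry.split(/\r?\n/).reject(&:empty?)
    { book: title, meta: meta, text: body.join("\n") }
  end
end
```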

So, I devoted yesterday evening to coming up with a first prototype of an app that I think can be very useful if you use clippings extensively: it displays each clipping separately with its base information, and allows you to filter on the specific book each clipping relates to, as well as on the clipping type.

So, if it interests you, clone it away from github!

git://github.com/gwolf/kindleclip.git

Written in Ruby, with Gtk (via Glade). No further libraries are (currently?) needed. The code is far from beautiful, but it is a first stab towards functionality.

Any comments welcome!


Debian by its numbers, as seen by keyring-maint

Submitted by gwolf on Fri, 07/02/2010 - 14:24

At keyring-maint, we got a request from our DPL, querying for the evolution of the number of keys per keyring – this can be almost-mapped to the number of Debian Developers, Debian Maintainers, and retired and deleted accounts over time, since the keyrings are maintained under version control.

Stefano insisted this was more out of curiosity than anything else, but given that the task seemed easy enough, I came up with the following dirty thingy. I'm sure there are better ways than cruising through the whole Bazaar history, but anyway - in case you want to play, you can clone an almost-up-to-date copy of the tree: bzr clone http://bzr.debian.org/keyring/debian-keyring/

    #!/usr/bin/perl
    use strict;
    my ($lastrev, @keyrings, %revs, $fh);
    open $fh, '>', 'growth_stats.txt' or die $!;

    @keyrings = sort qw(debian-keyring-gpg debian-keyring-pgp
                        debian-maintainers-gpg
                        emeritus-keyring-gpg emeritus-keyring-pgp
                        removed-keys-gpg removed-keys-pgp);

    system('bzr unbind'); # Huge speed difference :-P
    $lastrev = `bzr revno`;

    for my $entry (split /^---+$/m, `bzr log`) {
        my ($rev, $stamp);
        for my $line (split(/\n/, $entry)) {
            next unless $line =~ /^(revno|timestamp): (.*)/;
            $rev = $2 if $1 eq 'revno';
            $stamp = $2 if $1 eq 'timestamp';
        }
        $revs{$rev} = { stamp => $stamp };
    }

    spew('Revision', 'Date', @keyrings);
    system('bzr bind');

    for my $rev (sort {$a<=>$b} keys %revs) {
        system("bzr update -r $rev");
        spew($rev, $revs{$rev}{stamp},
             map { my @keys = <$_/*>; scalar(@keys) } @keyrings);
    }

    sub spew {
        print $fh join('|', @_), "\n";
    }

And as a result... Yes, I fired up OpenOffice instead of graphing from within Perl, which might even have been less painful ;-) I had intended to leave the raw data (also attached here) as a graphing exercise for the [rl]eader... But anyway, the result is here (click to view it in full resolution; I don't want to mess up your reading experience with a >1000px wide image):

A couple of notes:

  • The number of Debian Developers is close to the sum of debian-keyring-pgp and debian-keyring-gpg
  • After a long time pestering developers (and you can see how far down the tail we were!), as of today, debian-keyring-pgp will cease to exist. That means no more old, v3, vulnerable keys. Yay! All the credit goes to Jonathan. The last few DDs using it are still migrating, but we will hopefully get them soon.
  • To be fair... No, the correct number is not the sum. Some people had more than one key (i.e. when we had ~200 keys in debian-keyring-pgp). The trend is stabilizing.
  • Of course, the {removed-keys,emeritus-keyring}-{pgp,gpg} will continue to grow. Most removed keys are a result of tons of people migrating over from 1024D to stronger 4096R keys
  • You can easily see the points where we have removed inactive developers (i.e. the main WAT lack-of-response, as seen at about ¾ of the graph)
  • keyring-maint has handled the Debian Maintainers keyring since late 2009. There is a noticeable increase (~10% in six months), although I expected to see that line grow more. I think it's safe to say the rate of influx of DMs is similar to the rate of influx of DDs - of course, many DMs become DDs, so the amount of new blood might be (almost) the sum of the two positive slopes?

Anyway, have fun with this. Graphics are always fun!


Source format 3.0 (quilt) for teh win!

Submitted by gwolf on Wed, 11/25/2009 - 14:26

My Debian QA page shows what I consider to be a huge number of packages — I am currently uploader for 207. Why so many? There are many factors. The main one is group maintenance (I'm directly responsible for only 19; of course, this should not mean I disregard the rest), the second one is regularity. By far, most of my source packages (177) match lib.*perl, followed by lib.*ruby with 20.

Anyway — A strong factor that allows the pkg-perl group to successfully maintain 1411 packages is the regularity of the task: packaging Perl modules is usually as easy as running dh-make-perl on them (which, of course, does not take away the merit of packaging the few strange corner cases…).

In Ruby-land, the landscape is quite different. The developer community is quite anchored in agile worldviews, which go beyond coding practices, all the way to confronting the way most Free Software projects distribute their work. I have previously ranted (that is, presented informed and opinionated blog posts) on this topic — Ruby culture dictates distribution via Ruby Gems, which are for many reasons not Debian-friendly. Besides Gems, most projects have adopted Git for development tracking and are hosted on Github — that's why I came up with Githubredir, which basically presents an uscan-friendly listing of tags for a given project.

But if you develop in Git, you might want to split a project into its constituent parts for easier organization, without meaning that each subproject should become an independent project by itself, right? After all, that's what Git submodules are for. That's what happened with a great PDF-generating library for Ruby, Prawn. Thing is, all three parts of the main project are required for the project to be built.

Anyway, that was a great reason to move the package over to the new dpkg 3.0 (quilt) source format. And yes, it is a straightforward move! If you have not yet done so, take a look at Raphaël Hertzog's explanation+FAQ wiki page. It just works, and makes many things way easier.
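How straightforward? As I understand the mechanics, the switch itself amounts to one tiny file:

```shell
# Declare the new source format; dpkg-source picks it up on the next build.
mkdir -p debian/source
echo "3.0 (quilt)" > debian/source/format
```

From then on, changes to upstream files live as individual patches under debian/patches/ instead of being lumped into a single monolithic .diff.gz.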

There are still some wrinkles in my packaging, like where I'm getting the orig tarballs from — As the submodules are not presently tagged in any way, I was only able to download a snapshot of their respective current master branches. This is suboptimal, I know, but I have talked to the upstream author, and he confirms that for the next major version (which should not be long in coming) the tags will be synchronized, and things will be even cleaner.

PS- I love Hpricot. To get the numbers for my QA page, I just had to get three dirty but useful arrays:

    require 'hpricot'
    require 'open-uri'
    url = 'http://qa.debian.org/developer.php?login=gwolf&comaint=yes'
    doc = Hpricot(open(url))
    tot = ((doc / 'table')[1] / 'tr').map { |row|
      (row % 'td' % 'a').inner_html rescue nil }.select { |i| i }
    team = ((doc / 'table')[1] / 'tr').map { |row|
      (row % 'td' % 'span.uploader' % 'a').inner_html rescue nil }.select { |i| i }
    mine = ((doc / 'table')[1] / 'tr').map { |row|
      (row % 'td' % 'span:not(.uploader)' % 'a').inner_html rescue nil }.select { |i| i }

And work from the three very simple lists there — i.e. tot.select {|pkg| pkg =~ /lib.*perl/}.size gives me 177.


Personal assessment about myself: Being slow everywhere…

Submitted by gwolf on Mon, 11/16/2009 - 10:40

Sigh…

I am starting to fill up my annual report for my real-life work. You know, that chore you must do every year where you score little bullets next to each completed project and talk well about yourself. For my workplace, fortunately, I do not have to lie to convince people I am worth rehiring: as this year I achieved tenure (definitividad) as a Técnico Académico Asociado C de Tiempo Completo at my University, I can say for sure I have long-term job security. UNAM is the best place for me to work, and I am most grateful — even if I do want to advance in the future, and would strongly like at some point to move into a real academic position. My job is mostly operative, limited to keeping things running smoothly in our network and servers. I work at a social sciences (Economics) research institute, and even though I have taken on an interesting project approached from the social sciences (which I do expect to finish as a very interesting product in the near future), my interest lies in computing as a science.

Anyway, back on track… This is the time of year to start evaluating many things, many factors, from many different sides. And yes, for me that involves measuring how am I faring in my involvement in the projects I most care about — Specifically, Debian, but also several other Free Software projects, even if my involvement in them is mostly organizational.

I am once again going through a tough period in my personal life, and the impact it carries is obviously deep. However, I am not fond of finding excuses for my underachievement or underperformance. And that's what I feel now — even more when I see posts such as Zack's and Tim's status updates, and when I see that we continue to be on an all-time-high streak of RC bugs.

Regarding the several teams I am (at least formally) involved with in Debian, I have been away from the pkg-perl group for far too long. It is still the first group I identify myself with, both on a personal level (I consider them good friends and great people to work with) and because I feel the responsibility to share the load with them, as maintaining >1300 packages (even if they are so highly regular) is just not an easy task. But for over a year, my involvement has been basically zero. I have been a bit more active on pkg-ruby-extras — maybe paradoxically, as it is a smaller team with fewer packages (so I know it is much less probable that somebody will keep my packages in adequate shape if I don't do it myself)... and also because I am working more with Ruby than Perl nowadays. And finally, about Cherokee: I decided during DebConf9 to redo the packaging to fully use DH7 instead of our old quasi-manual style. I have had several bursts of activity, and am almost-almost-ready to do the first new-style upload... but so far I have been unable to do so.

Of course, keyring-maint: with Jonathan's help, I have come to terms with most of the processes. Both Jonathan and I have been swamped lately, but at least I think I am finally helping speed up the process instead of holding it back. We do, yes, have several pending updates - but we are working our way up the queue, and I hope not to leave people waiting for too long. And yes, we have discussed several ways of documenting and automating several of the tasks we currently handle, and that should come soon.

I have also been leaving maybe a bit too much responsibility aside on EDUSOL, which today enters its second week of activity, and I'm very sorry to see our server is just too overloaded to even answer to me. Even lacking admin powers myself, I should have worked earlier on setting up redundancy in a more automatic way (we have an off-site backup we can promote to live and redirect to, but I am unable to do this... and I am supposedly the techie person on board, the only "professional" sysadmin).

This year I also –quietly– finished the bulk of the Comas rewrite. What? Comas? Still alive? Yes, and you can expect me to show it off to more people soon, and to get it used for more conferences. I will talk more about it (and its motivation, and its current status) later on. But basically, the only two things the current incarnation shares with the mod_perl-based system most of you got to know (mainly at CONSOL 2004-2008 or at DebConf 5 and 6, although I know of several other conferences which used it) are the (most) basic database structure and the name. The project underwent a full rewrite, and is now a far more flexible, far easier to install, Ruby on Rails-based application. And most importantly, it no longer requires your name being Gunnar Wolf as a prerequisite for successfully setting it up ;-)

Regarding DebConf, I have promoted a Central American MiniDebConf, and we are right on track for holding it in late March in Panamá City. Everybody's invited, and we will have (surprise, surprise!) the very professional involvement of Mr. Anto Recio as local team, as it seems he didn't have enough with last year's DebConf9 and wants to suffer further. What am I lacking here? Motivation. I have been quite pessimistic, possibly turning some people away, even though we have a good first sampling of interested people's profiles and expectations. If you want to get involved, tomorrow (Tuesday 17-nov) we will have a meeting at Freenode's #sl-centroamerica, 17:00 GMT-6. Please note we do need involvement from the Central American communities, it is more than just a motivational issue. Last meeting it seemed Anto and I were the only people pushing the MiniDebConf - and frankly, that would be a basis for not even holding it. We need motivation from the very people involved in it!

Anyway… You can see I have (and it seems to be a constant in my life) a series of contradictions going on. However, the exercise of putting it all into writing helps me understand better where I stand. When I started writing this post I felt much heavier, much more at a loss… Right now I feel I want to refocus my energy on the same projects and teams I have been involved with, yes, but it at least feels more plausible. I hope so.


Strange scanning on my server?

Submitted by gwolf on Thu, 10/01/2009 - 18:04

Humm... Has anybody else seen a pattern like this?

I am getting a flurry of root login attempts at my main server at the University since yesterday 7:30AM (GMT-5). Now, of the machines I run in the 132.248.0.0/16 network (UNAM), only two listen to the world with SSH on port 22 — and yes, it is a very large network, but I am only seeing this pattern on one of them (they are on different subnets, quite far apart). The attackers are all attempting to log in as root, with a frequency that varies wildly, but is consistently over three times a minute right now. This is a sample of what I get in my logs:

[update] Logs omitted from blog post, as it is too wide and breaks displays for most users. You can download the log file instead.

Anyway… This comes from all over the world, and all the attempts are made as root (no attempts from unprivileged users). Of course, I have PermitRootLogin set to no in /etc/ssh/sshd_config, but… I want to understand this as much as possible.
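For reference, the one line in sshd_config that keeps these attempts harmless, whatever their intent:

```
# /etc/ssh/sshd_config — refuse direct root logins, password or otherwise
PermitRootLogin no
```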

Initially it struck me that most of the attempts appeared to come from Europe (quite atypical for the usual botnet distribution), so I passed my logs through:

    #!/usr/bin/perl
    use Geo::IP;
    use IO::File;
    use strict;
    my ($geoip, $fh, %by_ip, %by_ctry);

    $fh = IO::File->new('/tmp/sshd_log');
    $geoip = Geo::IP->new(GEOIP_STANDARD);
    while (my $lin = <$fh>) { next unless $lin =~ /rhost=(\S+)/; $by_ip{$1}++ }

    print " Incidence by IP:\n", "Num Ctry IP\n", ('=' x 60), "\n";

    for my $ip (sort { $by_ip{$a} <=> $by_ip{$b} } keys %by_ip) {
        my $ctry = ($ip =~ /^[\d\.]+$/) ?
            $geoip->country_code_by_addr($ip) :
            $geoip->country_code_by_name($ip);

        $by_ctry{$ctry}++;
        printf "%3d %3s %s\n", $by_ip{$ip}, $ctry, $ip;
    }

    print " Incidence by country:\n", "Num Country\n", "============\n";
    map { printf "%3d %s\n", $by_ctry{$_}, $_ }
        sort { $by_ctry{$b} <=> $by_ctry{$a} }
        keys(%by_ctry);

The top countries (where the number of attempts ≥ 5) are:

    104 CN
     78 US
     58 BR
     49 DE
     43 PL
     20 ES
     20 IN
     19 RU
     17 CO
     17 UA
     16 IT
     13 AR
     12 ZA
     10 CA
     10 CH
      8 GB
      8 AT
      8 JP
      8 FR
      7 KR
      7 HK
      7 PE
      7 ID
      6 PT
      5 CZ
      5 AU
      5 BE
      5 SE
      5 RO
      5 MX

I am attaching to this post the relevant log (filtering out all the information I could regarding legitimate users) as well as the full output. In case somebody has seen this kind of wormish botnetish behaviour lately… please comment.

[Update] I have tried getting some data regarding the attacking machines, running a simple nmap -O -vv against a random sample (five machines; I hope I am not being too aggressive in anybody's eyes). They all seem to be running some flavor of Linux (according to the OS fingerprinting), but the list of open ports varies wildly — I have seen the following:

    Not shown: 979 closed ports
    PORT      STATE    SERVICE
    21/tcp    open     ftp
    22/tcp    open     ssh
    23/tcp    open     telnet
    111/tcp   open     rpcbind
    135/tcp   filtered msrpc
    139/tcp   filtered netbios-ssn
    445/tcp   filtered microsoft-ds
    593/tcp   filtered http-rpc-epmap
    992/tcp   open     telnets
    1025/tcp  filtered NFS-or-IIS
    1080/tcp  filtered socks
    1433/tcp  filtered ms-sql-s
    1434/tcp  filtered ms-sql-m
    2049/tcp  open     nfs
    4242/tcp  filtered unknown
    4444/tcp  filtered krb524
    6346/tcp  filtered gnutella
    6881/tcp  filtered bittorrent-tracker
    8888/tcp  filtered sun-answerbook
    10000/tcp open     snet-sensor-mgmt
    45100/tcp filtered unknown
    Device type: general purpose|WAP|PBX
    Running (JUST GUESSING) : Linux 2.6.X|2.4.X (96%), ()

    Not shown: 993 filtered ports
    PORT      STATE    SERVICE
    22/tcp    open     ssh
    25/tcp    open     smtp
    80/tcp    open     http
    443/tcp   open     https
    444/tcp   open     snpp
    3389/tcp  open     ms-term-serv
    4125/tcp  closed   rww
    Device type: general purpose|phone|WAP|router
    Running (JUST GUESSING) : Linux 2.6.X (91%), ()

    Not shown: 994 filtered ports
    PORT      STATE    SERVICE
    22/tcp    open     ssh
    25/tcp    closed   smtp
    53/tcp    closed   domain
    80/tcp    open     http
    113/tcp   closed   auth
    443/tcp   closed   https
    Device type: general purpose
    Running (JUST GUESSING) : Linux 2.6.X (90%)
    OS fingerprint not ideal because: Didn't receive UDP response. Please try again with -sSU
    Aggressive OS guesses: Linux 2.6.15 - 2.6.26 (90%), Linux 2.6.23 (89%), (…)

    Not shown: 982 closed ports
    PORT      STATE    SERVICE
    21/tcp    open     ftp
    22/tcp    open     ssh
    37/tcp    open     time
    80/tcp    open     http
    113/tcp   open     auth
    135/tcp   filtered msrpc
    139/tcp   filtered netbios-ssn
    445/tcp   filtered microsoft-ds
    1025/tcp  filtered NFS-or-IIS
    1080/tcp  filtered socks
    1433/tcp  filtered ms-sql-s
    1434/tcp  filtered ms-sql-m
    4242/tcp  filtered unknown
    4444/tcp  filtered krb524
    6346/tcp  filtered gnutella
    6881/tcp  filtered bittorrent-tracker
    8888/tcp  filtered sun-answerbook
    45100/tcp filtered unknown
    Device type: general purpose|WAP|broadband router
    Running (JUST GUESSING) : Linux 2.6.X|2.4.X (95%), (…)

    Not shown: 994 filtered ports
    PORT      STATE    SERVICE
    22/tcp    open     ssh
    25/tcp    open     smtp
    53/tcp    open     domain
    80/tcp    open     http
    110/tcp   open     pop3
    3389/tcp  open     ms-term-serv
    Warning: OSScan results may be unreliable because we could not find at least 1 open and 1 closed port
    Device type: firewall|general purpose
    Running: Linux 2.6.X
    OS details: Smoothwall firewall (Linux 2.6.16.53), Linux 2.6.13 - 2.6.24, Linux 2.6.16
Of course, it strikes me that several among said machines seem to be Linuxes, but (appear to) run Microsoft services. Oh, and they also have P2P clients.


Method for locating yourself in space and in time

Submitted by gwolf on Mon, 08/31/2009 - 19:05

Today I had a nice and productive day, code-wise. Maybe that's a side effect of being unable to waste my time following e-mail?
Anyway, while checking my code with git citool prior to today's git commit, I came across this method. I didn't even pay attention to it while writing it. But it did make me laugh in semi-awe, thinking about the great implications it might have. The method signature:

def days_for(who, what, how)

The code itself? Naah, too pedestrian, too simplistic. It would ruin the sight. It just looks so beautifully universal!

Ok, I am compelled to share, even if it spoils it and renders it into a completely regular, even stupid method.

    def days_for(who, what, how)
      return ' —No dates set— ' unless who.has_validity_period?
      days_to = who.send(how)
      return 'Past' if days_to <= 0
      '%d days (%s)' % [days_to, who.send(what)]
    end
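And in case the spoiler needs one more spoiler, here is a hypothetical caller — the Deadline class, its attributes and its days_left method are all made up for illustration, standing in for whatever object 'who' really is:

```ruby
require 'date'

# Hypothetical stand-in for the duck-typed 'who': anything answering
# has_validity_period? plus whatever symbols get passed as what/how.
Deadline = Struct.new(:name, :ends_on) do
  def has_validity_period?
    !ends_on.nil?
  end

  def days_left
    (ends_on - Date.today).to_i
  end
end

def days_for(who, what, how)
  return ' —No dates set— ' unless who.has_validity_period?
  days_to = who.send(how)
  return 'Past' if days_to <= 0
  '%d days (%s)' % [days_to, who.send(what)]
end

cfp = Deadline.new('CfP closes', Date.today + 10)
puts days_for(cfp, :name, :days_left)   # => "10 days (CfP closes)"
```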


Monetizing the value of a directory hierarchy

Submitted by gwolf on Mon, 06/08/2009 - 18:25

Having recently become a Unicode (ab)user, in great part due to Kragen's .XCompose, I have again taken up the mission of convincing people that resistance is futile, and you will be assimilated into the multilingual world of UTF-8...

...And given the recent thread in debian-devel regarding how a globbing or similar functionality should be implemented (specifically, given Giacomo's message pointing out that our beloved «/» directory separator is subject to the locale rules)...

I cannot help but send you to this old piece of MSDN beauty: When is a backslash not a backslash?

In short: if you are surprised that in East Asia the local currency symbol is used to separate directories... don't be. Blame the 8 bits of extended, non-standard ASCII codepages.


dh-make-drupal - Easily debianize Drupal modules, themes

Submitted by gwolf on Tue, 02/10/2009 - 14:51

I have spent a couple of days working on dh-make-drupal. Yes, you guessed right: an idea based on the wonderful dh-make-perl, but applied to the Drupal Content Management System.
Drupal's greatest strengths, IMHO, are:

  • Drupal offers a huge number of modules and themes
  • Drupal has amazingly sane configuration handling, where (contrary to what usually happens in PHP-land and, in general, among webapps) you set up the code in only a single place, with just the site-specific configuration (usually a single file) handling all of the differences

Yup, even though I am quite fond of its flexibility and power, I fell for Drupal in no small part because of its sysadmin-friendliness.

Now, I hate having non-Debian-packaged files spilled over my /usr/share partition. Drupal modules want to be installed in /usr/share/drupal5/modules/module_name (or s/5/6/ for Drupal 6, to which I have not yet migrated). For that reason, over the last year I have been growing my personal apt repository of Drupal stuff. Yes, it is still on, and I don't plan on taking it down. You can access it by adding deb http://www.iiec.unam.mx/apt/ etch drupal to your /etc/apt/sources.list. However, you can now also do the process locally. Do you fancy the wonderful Biblio module? Or the very nice Abarre theme? Great!

    0 gwolf@mosca『4/tmp$ ~/code/dh-make-drupal/dh-make-drupal --drupal 5 biblio
    0 gwolf@mosca『5/tmp$ cd drupal5-mod-biblio-1.16/
    0 gwolf@mosca『6/tmp/drupal5-mod-biblio-1.16$ debuild -us -uc >& /dev/null
    0 gwolf@mosca『7/tmp/drupal5-mod-biblio-1.16$ cd ..
    0 gwolf@mosca『8/tmp$ su
    Password:
    0 root@mosca[1]/tmp# dpkg -i drupal5-mod-biblio_1.16-1_all.deb
    Selecting previously deselected package drupal5-mod-biblio.
    (Reading database ... 275110 files and directories currently installed.)
    Unpacking drupal5-mod-biblio (from drupal5-mod-biblio_1.16-1_all.deb) ...
    Setting up drupal5-mod-biblio (1.16-1) ...

Yay!

Yes, there are still many more things to come (e.g. including the debuild call, and whatnot), but... Enjoy!

BTW, this piece of software owes a couple of beers to Why the lucky stiff, author of Hpricot. You are insane (but we are all well aware of that). You deserve to go to the webscraping heaven. Yes, besides the programming-languages-teaching-cartoon heaven. You find out how to split the time between them.

[Update]: Of course, ITP bug #514786 has been filed, and I will soon be uploading this into Debian.


Hyperdimensional strings

Submitted by gwolf on Wed, 01/14/2009 - 17:41

I am stunned that no more people have been bitten by this. Or at least, that the Intarweb has not heard about it. Censorship, perhaps? I haven't researched the causes any further, but anyway...
I was pushing a project I have had lingering for some time from Rails 2.0.x to 2.1.x (yes, 2.2 is already out there, but 2.1 is the version that will ship with Lenny). The changes should not be too invasive, as it is a minor release, but there are some quite noticeable differences.
Anyway... What was the problem? Take this very simple migration:

    class CreatePeople < ActiveRecord::Migration
      def self.up
        create_table :people do |t|
          t.column :login,     :string, :null => false
          t.column :passwd,    :string, :null => false
          t.column :firstname, :string, :null => false
          t.column :famname,   :string, :null => false
          t.column :email,     :string

          t.column :pw_salt,       :string
          t.column :created_at,    :timestamp
          t.column :last_login_at, :timestamp
        end
      end

      def self.down
        drop_table :people
      end
    end

The problem is that PostgreSQL refuses to create a hyperdimensional string field. I offer this here to you, line-wrapped by me for your convenience.
    PGError: ERROR: syntax error at or near "("
    LINE 1: ...serial PRIMARY KEY, "login" character varying(255)(255) NOT ...
                                                                 ^
    : CREATE TABLE "people" ("id" serial PRIMARY KEY,
    "login" character varying(255)(255) NOT NULL,
    "passwd" character varying(255)(255)(255) NOT NULL,
    "firstname" character varying(255)(255)(255)(255) NOT NULL,
    "famname" character varying(255)(255)(255)(255)(255) NOT NULL,
    "email" character varying(255)(255)(255)(255)(255)(255) DEFAULT NULL NULL,
    "pw_salt" character varying(255)(255)(255)(255)(255)(255)(255) DEFAULT NULL NULL,
    "created_at" timestamp DEFAULT NULL NULL, "last_login_at" timestamp DEFAULT NULL NULL)

Beautiful. Now I can store strings not only as character vectors, but as planes, cubes, hypercubes, and any other hyperdimensional construct! Are we approaching quantum computers?
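The way the "(255)" stacks up, one more per column, smells like a memoized type-name string being mutated in place once per column. I have not dug into the Rails 2.1 internals to confirm that, but this hypothetical sketch (not the actual Rails code) reproduces the symptom exactly:

```ruby
# Hypothetical reconstruction: a shared native-type string that a
# column-to-SQL helper appends to (<<) instead of copying (+).
NATIVE_TYPES = { string: 'character varying' }

def column_sql(name, type, limit = 255)
  sql = NATIVE_TYPES[type]   # the very same String object on every call...
  sql << "(#{limit})"        # ...mutated in place, so the limits pile up
  %("#{name}" #{sql})
end

puts column_sql('login',  :string)   # "login" character varying(255)
puts column_sql('passwd', :string)   # "passwd" character varying(255)(255)
```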
What is really striking is that... I found only one occurrence of this bug on the net - one and a half years ago, in Ola Bini's blog. No stunned users looking for the culprit, no further reports... Strange.
Still, the bug was fixed in Rails 2.2 about half a year ago, although not in revisions of earlier versions. I will request the patch to be applied to earlier versions as well. Sigh.

( categories: )

My git tips...

Submitted by gwolf on Sat, 12/13/2008 - 19:35

Ok, so a handy meme is loose: handy Git tips. We even had a crazy anatidae requesting that whatever we post on our personal blogs in this regard also be sent to the Git wiki.
Following Damog's post, I will also put my .bashrc snippet:

  parse_git_branch() {
    branch=`git branch 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/\1/'`
    if [ -n "$branch" ]
    then
      if ! git status | grep 'nothing to commit .working directory clean' > /dev/null 2>&1
      then
        branch="${branch}*"
        mod=`git ls-files -m --exclude-standard | wc -l`
        new=`git ls-files -o --exclude-standard | wc -l`
        del=`git ls-files -d --exclude-standard | wc -l`
        if [ $mod != 0 ]; then branch="${branch}${mod}M"; fi
        if [ $new != 0 ]; then branch="${branch}${new}N"; fi
        if [ $del != 0 ]; then branch="${branch}${del}D"; fi
      fi
    fi
    echo "$branch"
  }

This gives me the following information on my shell prompt:

  • The git branch where we are standing
  • If it has any uncommitted changes, a * is displayed next to it
  • If there are changes not checked in to the index, M (modified), N (new) or D (deleted) is displayed, together with the number of files in said condition.
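To actually get this into the prompt, the function has to be invoked from PS1. A minimal sketch of the wiring — the prompt layout and the trimmed-down stand-in for parse_git_branch are mine, not part of the original snippet:

```shell
# Trimmed stand-in for the fuller parse_git_branch above; only
# the prompt wiring is the point here.
parse_git_branch() {
  git branch 2>/dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/\1/'
}
# Single quotes matter: the command substitution must be
# re-evaluated each time the prompt is drawn, not just once
# when .bashrc is sourced.
PS1='\u@\h:\w ($(parse_git_branch))\$ '
```

Sourcing this from .bashrc makes the branch name (and the dirty-state markers, with the full function) show up right after the working directory.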

Sometimes, entering a very large git tree takes a second or two... But once it has run once, it goes on quite smoothly.
Of course, I still have this also in .bashrc - but its functionality pales in comparison:
  get_svn_revision() {
    if [ -d .svn ]
    then
      svn info | grep ^Revision | cut -f 2 -d ' '
    fi
  }

I am sure it can be expanded, of course - but why? :)
( categories: )

githubredir.debian.net - Delivering .tar.gz from Github tags

Submitted by gwolf on Wed, 12/10/2008 - 14:03

There is quite a bit of software whose upstream authors decide that, as they are already using Git for development, the main distribution channel should be GitHub - This allows, yes, for quite a bit of flexibility, which many authors have taken advantage of.

So, I just registered and set up http://githubredir.debian.net/ to make it easier for packagers to take advantage of it.

Specifically, what does this redirector do? Given that GitHub allows any given commit to be downloaded as a .zip or a .tar.gz, it suddenly becomes enough to git tag a version number, and GitHub magically makes that version available for download. Which is sweet!
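As an illustration of that workflow — user and project names are made up, and the /tarball/<tag> URL pattern is how I recall GitHub's interface of the time working, so take it as an assumption:

```shell
# Tag a release locally, push the tag, and GitHub serves it as
# an archive. Names below are illustrative; the URL pattern is
# an assumption from memory, not verified documentation.
tag="v1.0.2"
tarball="http://github.com/someuser/someproject/tarball/$tag"
# The real workflow would then be:
#   git tag $tag && git push --tags
echo "$tarball"
```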

Sometimes it is a bit problematic, though, to follow their format. GitHub gives a listing of the tags for each particular project, and each of those tags has a download page, with both archiving formats.

I won't go into too much detail here - Thing is, going over several pages becomes painful for Debian's uscan, widely used in several of our QA processes. There are other implemented redirectors, such as the one used for SourceForge.

This redirector is mainly meant to be consumed by Debian's uscan. Anybody who finds this system useful can freely use it, although you might be better served by the rich, official GitHub.com interface.
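For instance, a debian/watch file pointing at the redirector might look like the sketch below — the user/project path and the version regexp are my own illustration of the layout, not copied from the service:

```shell
# Write a hypothetical debian/watch file; the path layout under
# githubredir.debian.net is an assumption for illustration only.
mkdir -p debian
cat > debian/watch <<'EOF'
version=3
http://githubredir.debian.net/github/someuser/someproject /(.*)\.tar\.gz
EOF
```

With something like this in place, a plain `uscan` run can walk the redirector's single listing page instead of GitHub's per-tag pages.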

Anyway - Enough repeating what I said on the http://githubredir.debian.net/ base page. Find it useful? Go ahead and use it!

( categories: )

Apt-get and gems: Different planets, right. But it must not be the war of the worlds!

Submitted by gwolf on Mon, 12/08/2008 - 23:57

Thanks to some unexplained comments on some oldish entries of my blog, I found - with a couple of days of delay - Rubygems is from Mars, Apt-get is from Venus, on Pelle's weblog. And no, I have not yet read the huge amount of comments it generated... Still, I replied with the following text - And I am leaving this blog post in place to remind me to further extend my opinions later on.
Wow... Quite a few comments. And yes, given that the author wrote a (very well phrased and balanced) post, I feel obliged to reply. But given that he referred to me first, I'll just skip the chatter for later - I'm tired at this time of day ;-)
Pelle, I agree with you - This problem is because we are from two very different mindsets. I have already said so - http://www.gwolf.org/soft/debian+rails is a witness to that point.
But I do not think the divide is between sysadmins and developers. I am a developer who grew from the sysadmin stance, but AFAICT that is not that much the case in Debian.
Thing is, in a distribution, we try to cater for common users. I have a couple of Rails apps under development that I expect to be able to package for Debian, and I think can be very useful for the general public.
Now, how is the user experience when you install a desktop application, in whatever language/framework it is written? You don't care what the platform is - you care that it integrates nicely with your environment. Yes, the webapp arena is a bit more difficult - but we have made quite a bit of progress there. Feel like using a PHP webapp? Just install it, and it's there. A Python webapp? Same thing. A Perl webapp? As long as you don't do some black magic (and that's one of the main factors that drove me away from mod_perl), the same: Just ask apt-get to install it and you are set.
But... What about installing a Rails application? From a package manager? For a user who does not really care about what design philosophy you followed, who might not even know what a MVC pattern is?
Thing is, distributions aim at _users_. And yes, I have gradually adopted a user's point of view. I very seldom install anything not available as a .deb - and if I do, I try to keep it clean enough so I can package it for my personal use later on.
Anyway... I will post a copy of this message on my blog (http://gwolf.org/), partly as a reminder to come back here and read the rest of the buzz, and to go to the other post referenced here. And, of course, I invite other people involved in Ruby and Debian to continue sharing on this - I am sure I am not the only person (or, in fairness, that Debian's pkg-ruby-extras team is not the only team) interested in bridging this huge divide and getting to a point where we can interact better - And I am sure that among the Rubyists, many people will also value having their code usable by non-developers as well.

( categories: )

Virtually having fun

Submitted by gwolf on Wed, 07/23/2008 - 13:50

Several weeks ago, the people in charge of maintaining the Windows machines in my institute were desperate because of a series of virus outbreaks - Especially, as expected, in the public lab - but the whole network smelled virulent. After seeing their desperation, I asked Rolman to help me come up with a solution. He suggested replacing the local Windows installations with a server hosting several virtual machines, all regenerated from a clean image every day and exporting rdesktop sessions. He suggested using Xen for this, as it is the virtualization/paravirtualization solution currently best supported by most Linux distributions (including, of course, RedHat, towards which he is biased, and Debian, towards which I am... more than biased, even bent). So far, no hassle, right?
Of course, I could just stay clear of this mess, as everything related to Windows is off my hands... But in October, we will be renewing ~150 antivirus licences. I want to save that money by giving a better solution, even if part of that money gets translated to a big server.
Get the hardware
But problems soon arose. The first issue was hardware. Xen can act in its paravirtualization mode on basically any x86 machine - but it requires a patched guest kernel. That means I can paravirtualize several different free OSs on just about any computer I lay my hands on here, but Windows requires full- or hardware-assisted virtualization. And, of course, only one of the over 300 computers we have (around 100 of which are recent enough for me to expect to be usable as a proof of concept for this) has a CPU with VT extensions - And I'm not going to decommission my firewall to become a test server! ;-)
When software gets confused for hardware
So, I requested an Intel® Core™2 Quad Q9300 CPU, which I could just drop into any box with a fitting motherboard. But, of course, I'm not the only person requiring computer-related stuff. So, after pestering the people in charge of purchases on a daily basis for three weeks, the head of acquisitions came smiling into my office with a little box in his hands.
But no, it was not my Core 2 Quad CPU.
It was a box containing... Microsoft Visio. Yes, they spent their effort looking for the wrong computer-related thingy :-/ And meanwhile, Debconf 8 is getting nearer and nearer. Why does that matter? Because I have a deadline: By October, I want the institute to decide not to buy 150 antivirus licenses! Debconf will take some time off that target from me.
Anyway... The university vacations started on July 5. The first week of vacations I went to sweat my ass off at Monterrey, by Monday 14 I came back to my office, and that same day I finally got the box, together with two 2GB DIMMs.
Experiences with a nice looking potential disaster
Anyway, by Tuesday I got the CPU running, and a regular Debian install in place. A very nice workhorse: 5GB RAM, quad core CPU at 2.5GHz, 6MB cache (which seems to be split in two 3MB banks, each for two cores - but that's pure speculation from me). I installed Lenny (Debian testing), which is very soon going to freeze and by the time this becomes a production server will be very close to being a stable release, and I wanted to take advantage of the newest Xen administration tools. Of course, the installation was for AMD64 - Because 64 bitness is a terrible thing to waste.
But I started playing with Xen - And all kinds of disasters struck. First, although there is a Xen-enabled 2.6.25 Linux kernel, it is -686 only (i.e. no 64 bit support). Ok, install a second system on a second partition. Oh, but this kernel is only domU-able (that is, it will correctly run as a Xen paravirtualized guest), but not dom0-able (it cannot act as the root domain). Grmbl.
So, get Etch's 2.6.18 AMD64 Xen-enabled kernel, and hope for the best. After all, up to this point, I was basically aware of many of the facts I mentioned (i.e. up to this point I did reinstall once, but not three times)... And I hoped the kernel team would have good news regarding a forward-port of the Xen dom0 patches to 2.6.25 - because losing dom0 support was IMO a big regression.
But quite on time, this revealing thread came up on the debian-devel mailing list. In short: Xen is a disaster. The Xen developers have done their work quite far away from the kernel developers, and the last decent synchronization that was made was in 2.6.18, over two years ago. Not surprisingly, enterprise-editions of other Linux distributions also ship that kernel version. There are some forward-patches, but current support in Xen is... Lacking, to say the least. From my POV, Xen's future in the Linux kernel looks bleakish.
Now, on the lightweight side...
Xen is also a bit too complicated - Of course, its role is complicated as well, and it offers a great deal of tunability. But I decided to keep a clean Lenny AMD64 install, and give KVM, the Kernel Virtual Machine, a go. My first gripe? What a bad choice of name: Google searches for KVM give completely unrelated answers, as the name was already well known (for keyboard/video/mouse switches), even in the same context, even in the same community.
KVM takes a much, much simpler approach to virtualization (both para- and full-): We don't need no stinkin' hypervisors. The kernel can just do that task. And then, kvm becomes just another almost-regular process. How nice!
In fact, KVM borrows so very much from qemu that it even refers to qemu's manpage for everything but two command-line switches.
Qemu is a completely different project, which gets to a very similar place from the other extreme - Qemu started off close to Bochs, a very slow but very useful multi-architecture emulator. Qemu added all kinds of optimizations, and it is by now genuinely usable (i.e. I use it on my desktop whenever I need a W2K machine).
Instead of a heavyweight framework... KVM is just a modprobe away - Just ask Linux to modprobe kvm, and kvm -hda /path/to/your/hd/image gets you a working machine.
Anyway - I was immediately happy with KVM. It took me a week to get a whole "lab" of 15 virtual computers (256MB RAM works surprisingly well for a regular XP install!) configured to start at boot time off a single master image, using qcow images.
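The per-workstation setup boils down to something like the sketch below: one read-only master image, and a throwaway copy-on-write qcow overlay per virtual workstation. File names and options are illustrative, not the exact commands I run:

```shell
# Each lab VM boots from its own qcow2 overlay, backed by the
# same clean master image; recreating the overlay resets the
# machine to a pristine state. Names here are illustrative.
# This only defines the helper - it does not start any VM.
create_lab_vm() {
  n="$1"
  # copy-on-write overlay backed by the pristine XP master image
  qemu-img create -f qcow2 -b master-xp.img "lab$n.qcow2"
  # 256MB RAM is plenty for a plain XP guest
  kvm -hda "lab$n.qcow2" -m 256 &
}
```

Running the helper from a boot script, once per workstation, gives the daily-regenerated lab described above.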
KVM's shortcomings
Xen has already been in the enterprise for a long time, and has a nice suite of administrative tools. While Xen depends on having a configuration file for each host, KVM expects everything to be passed on the command line. To get a bird's-eye view of the system, Xen has a load of utilities - KVM does not. And although RedHat's virt-manager is said to support KVM and qemu virtualization (besides its native Xen, of course), it falls short of what I need (i.e. it relies on a configuration file... which lacks the expressivity to specify a snapshot-based HD image).
To my surprise, KVM has attained much of Xen's most amazing capabilities, such as the live migration. And although it's easier to just use fully virtualized devices (i.e. to use an emulation of the RTL8139 network card), as they require no drivers extraneous to the operating system, performance can be greatly enhanced by using the VirtIO devices. KVM is quickly evolving, and I predict it will largely overtake Xen's (and of course, vmware and others) places.
Where I am now
So... Well, those of us who adopt KVM and want to get it into production now will have some work building the tools to gracefully manage and report on it, it seems. I won't be touching my setup much until after Debconf, but so far I've done some work over Freddie Cash's kvmctl script. I'm submitting him some patches to make his script (IMHO) more reliable and automatable (if you are interested, you can get my current version of the script as well). And... Starting September, I expect to start working on a control interface able to cover my other needs (such as distributing configuration to the terminals-to-be, or centrally managing the configurations).

Syndicate content