Coding


This blog has not seen activity for a full month

Submitted by gwolf on Sun, 10/21/2007 - 23:37

That does not mean I'm dead yet, of course - It just means I've been too tied up with real life. Yes, I do feel close to MIA in Debian, as in most other projects I work on out of personal interest... It's that bureaucratic point in the year where we have to get all of the paperwork sorted out in order to enjoy another year working at the University. And, of course, it's a huge PITA - but all in all, it's worth it.

That point by itself is not the only reason I was drawn away from my usual posting, mind you - My blog was broken for some time because of a b0rken table in MySQL... It has been at least three weeks since I noticed, but I only got around to fixing it today. And BTW, how did I end up fixing it? Because I'm working on a migration tool. Migration? What? Why?

Back in 2004, when I started this blog, I chose JAWS as the software to run it on. It was started by a group of good friends of mine, and they keep developing it. It is a very nice piece of work... But I've had several problems with it (and yes, I have not bothered to submit bug reports - shame on me. Anyway, I think they are related to some mistakes I made while upgrading), and I do miss some bits of functionality... So, yes, I'm migrating over to Drupal. After all, I'll be setting up several sub-sites with Drupal at my real-life work, so I'd better get familiar with it... And, of course, migrating a three-year-old blog is not easy (and even less so the four sites I have lying around), so today I've been working on a nice migration script covering the components I use (blog, comments and photos). I expect it to be ready for prime time after one more session (which I don't know when will happen).

Anyway, Jaws guys: You do rock. Thanks a lot for all the fish!


BLOBs in MySQL: Binary Laughable OBjects?!

Submitted by gwolf on Thu, 09/13/2007 - 08:00
Bah, MySQL keeps insisting on being a fun way to waste your time.
One of my clients hosts its systems at Dreamhost, a quite nice hosting company, which has... Well, a couple of strange details :-/
Anyway, people who work with databases know that a BLOB (or its equivalent) is the right datatype for storing images or, in general, files, right? After all, BLOB is just an acronym for Binary Large OBject. And if somebody has a BLOB field in the DB and searches on it, most RDBMS-bound programmers will at the very least chuckle at the design flaw - BLOBs are just to be stored and retrieved, that's it.
Well, the system I wrote for this client uses a BLOB field - Ok, to be clear: I just declared it as :binary in the corresponding migration. That should do the trick, and should allow me to keep my system RDBMS-independent, right? So I can still work on my development machine using nice and trusty PostgreSQL.
Anyway... My client said some documents were being corrupted. And that is Not Nice™... Documents were being truncated at just under 64KB. The bug was consistent across the three instances of my system at Dreamhost, but could not be reproduced on my machines - It didn't take too long to figure out it had to be the DB (and that's partly because I don't really trust MySQL :) ).
Turns out that, in MySQL speak, there are four types of blobs: TINYBLOB, BLOB, MEDIUMBLOB and LONGBLOB. For goodness' sake, WHY?! The stupidest one is TINYBLOB: a Binary Large OBject with an 8-bit length field (255 bytes at most). Regular BLOBs get a 16-bit length (65,535 bytes - Aha! That's your just-under-64KB limit!). MEDIUMBLOBs get 24 bits (about 16 million bytes), and LONGBLOBs 32 bits (about 4,000 million). And, of course, the plain BLOB datatype is not large at all.
Bah.
Anyway... At least this is already a known, open issue for the Rails people, and I do expect them to change the migration equivalences to something saner.
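In the meantime, it can be worked around from a migration. A minimal sketch, assuming a made-up attachments table with a body column (not my actual schema), and assuming the MySQL adapter in use honours :limit for binary columns:

    class WidenAttachmentBody < ActiveRecord::Migration
      def self.up
        # Ask the adapter for a bigger column: with a :limit above 64KB, MySQL
        # adapters that honour it pick MEDIUMBLOB or LONGBLOB instead of BLOB
        change_column :attachments, :body, :binary, :limit => 16.megabytes
        # Or, giving up RDBMS independence, force it by hand:
        # execute "ALTER TABLE attachments MODIFY body LONGBLOB"
      end

      def self.down
        change_column :attachments, :body, :binary
      end
    end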
Bah again.

Database abstraction, ORMs and RDBMS-agnostic coding

Submitted by gwolf on Wed, 09/05/2007 - 13:31
I came across John Wang's posting titled Database Abstraction - code vs infrastructure. In it, he talks about the problems many people face when migrating applications tied to a specific RDBMS to another one. He recommends:
One solution of modern programming is to move database abstraction from the code to the infrastructure using a ORM (Object-Relational Mapper) or Data Mapper. A ORM and Data Mapper abstracts the database for you so you no longer have to do tie db abstraction to each app. Not only does it let you code once for multiple databases it lets your users migrate their data from one database to another. This blog runs Typo which is based on Ruby on Rails and ActiveRecord. I've been contemplating migrating Typo from MySQL to PostgreSQL and I've been told that it would be as simple as exporting the data with YAML, updating the database.yml file and importing the data.
Umh... Thing is, you often want to use your DB as way more than a data store - At least, when using Postgres, I do. There are many things that can be done inside the DB, such as guarding the programmer against his own mistakes with sanity checks that go well beyond referential integrity: constraints and triggers, and of course user-defined functions and views. Most of the time you can get by without several of these tools - but when they would come in handy, you sorely miss them.
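Just to make the point concrete, a minimal sketch (table, column and constraint names are all invented) of pushing a Postgres check constraint and a view through a Rails migration with execute():

    class AddArticleSanityChecks < ActiveRecord::Migration
      def self.up
        # A check constraint: refuse articles dated in the future, no matter
        # how buggy the application code gets
        execute "ALTER TABLE articles ADD CONSTRAINT articles_not_from_the_future
                 CHECK (published_on <= CURRENT_DATE)"
        # A view, so reports don't have to repeat the same condition everywhere
        execute "CREATE VIEW published_articles AS
                 SELECT * FROM articles WHERE published_on IS NOT NULL"
      end

      def self.down
        execute "DROP VIEW published_articles"
        execute "ALTER TABLE articles DROP CONSTRAINT articles_not_from_the_future"
      end
    end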
On the other hand, John mentions (right after the previous paragraph):
ActiveRecord is a data mapper and isn't as flexible as a full blown ORM but it gets the job done for the most part. For a full-blown ORM, I think of Perl's DBIx::Class which provides a full OO interface to the RDBMS allowing you to code just once for multiple DBs without limiting you when you want to use some esoteric database-specific SQL
Currently, I'm among the many that have jumped on the Rails bandwagon - And so far, I really like it. But yes, ActiveRecord -although great for some tasks- falls well short of what I'd like for many of the things I've grown used to doing.
I might be seen as a backwards guy for this, but until I started with Rails, I preferred not to use full ORMs (not that I tried too hard anyway - Probably starting a project with DBIx::Class would be all it takes for me to become a convert, from what I've read), but to write the SQL myself - Only, of course, keeping well in mind that I should separate front-end from back-end (or, using other jargon to say the same, cleanly separating what should be in a Controller from what should be in a Model - Of course, no one in their right mind would put SQL in a template or a View! There is a small sketch of what I mean at the end of this post). It's not as hard as many people seem to think, although mixing the syntaxes of two different languages (and even more so if you, as I do, like to keep your code under 80 columns) is sometimes dirt-ugly. Still, on to my last bit of rant:
There are PHP frameworks out there like Symfony and Cake but do any of them have stand-alone ORMs? If so, could Drupal move to something like that and solve their maintainership problems once and for all? Drupal is part of the Go PHP5 effort so there should be no issue using PHP 5 OO. Something to think about for the Drupal folks if a PHP ORM is available
Umh... Even if there are some great ORMs in PHP... Choosing an ORM is something you do when you start a project. I doubt the Drupal people can now just decide to move over to using an ORM, as they potentially have hundreds or thousands of points where they interact with the DB. And although for most of them moving over to an ORM should even be automatable, every ORM has its own quirks which make it damn hard to craft certain kinds of queries.
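Coming back to the sketch I promised a couple of paragraphs above, about keeping hand-written SQL behind the Model: the class and query below are invented, but they show the separation I mean - the SQL lives in the model, and controllers and views never see it.

    class Article < ActiveRecord::Base
      # Articles that received comments during the last week. A controller just
      # calls Article.recently_commented and stays completely SQL-free.
      def self.recently_commented
        find_by_sql(["SELECT DISTINCT a.*
                        FROM articles a
                        JOIN comments c ON c.article_id = a.id
                       WHERE c.created_at > ?", 1.week.ago])
      end
    end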

At YAPC::EU 2007!

Submitted by gwolf on Wed, 08/29/2007 - 05:45
So here I am, sitting at a talk at YAPC::EU 2007, in beautiful Vienna. It is too early to start thinking about a status report on what I've done and talked about, but I can say in advance that I'm very positively impressed - I expected my talk on the integration between CPAN and Debian that the Debian pkg-perl group is carrying out (presentation, full article) to be marginally interesting to a couple of people. Turns out, as YAPC drew closer, I heard from several people interested in attending it. Ok, maybe it was just out of courtesy? Let's not get excited over nothing...
Well, yesterday was the first day of YAPC - Man, after getting used to Debconf, three days of an interesting conference is just way too short! YAPC guys, for next time... More time, please? :-}
I knew I would be meeting pkg-perl members Zamoxles and Jeremiah - Very nice to finally meet you guys! And a very welcome surprise (although we have barely been able to talk) was finding another fellow DD, Daniel Ruoso, one of the agitators that started the pkg-perl group. He is not active (in the group) anymore, but he is still one of The Patriarchs ;-)
Ok, on with the talk: Contrary to what I expected, the talk room was quite full. I am used to longer talk slots (20 minutes is just about enough to spell my name, damnit! I presented 24 slides, and skipped only two of them - But made it just in time for the very strict Austrian staff to prepare but not wave the __END__ signal ;-) Would they show a die 'argh'; afterwards?), so I had to keep the limit in mind from the very beginning.
Of course, there was no time for questions as part of the session. However, I've since then been approached by several people and discussed several aspects of our work and ideas for the future. I'll post more about this after YAPC is over...
My warmest kudos to the very orangey, mohawk-wearing orga team. Not only did they come up with a great conference while remaining invariably good-humoured and nice (hey, I should learn from them!), they even had the presence of mind to go out with us and have a good time last night!
Anyway, it shows the conference's topic is Social Perl. Nice social geeks are no longer a novelty to me, but still: The Perl community is warm, welcoming and in general, very nice. I'm quite happy to have made it here!

Aggressive bug reports leading to a good answer (or so I hope)

Submitted by gwolf on Fri, 08/17/2007 - 16:47
Joey complains about users filing aggressive or otherwise inappropriate bug reports. Well, in this case I must say I bit the bullet.
Yesterday, somebody complained about the maintenance status of libapache2-mod-perl2. Usually, I would have sadly nodded and said that yes, it has been like that since the API change between mod_perl for Apache 1.x and mod_perl2 for Apache 2.x (which was a long time ago - with perfect timing so as not to be accepted during the last weeks of Sarge's hard freeze, forcing many of us to maintain two or even three versions of our code depending on the API used - But that's a different story I won't go into now). But this time, the bug report was sent with Severity: critical.
I replied to this report, intending just to lower its severity and get the attention of the maintainers by sending it to a couple of Debian lists - But soon afterwards, it became obvious that I was probably the person in Debian most interested in this package - At least, me and some other pkg-perl fellows who -as foolishly as myself- volunteered to step forward when needed.
Well, in short: I am now the proud owner of one of the packages most vital to many of the systems I've written for my work. It's also more complex than most of the other packages I maintain. Starting from an undermaintained build process, full of warnings, and the resulting package set, I managed to make it linda- and lintian-clean, and... Well, now we have to dig through its open bugs. Its build system is far from orthodox from a Perl point of view, and that took me most of the day, but hey - TIMTOWTDI, right? :)

It still gets on my nerves...

Submitted by gwolf on Fri, 07/06/2007 - 21:46
I've gradually become happier and happier with Ruby on Rails. I've mostly gotten past the constant "WTF?WTF?WTF?" phase that's so frustrating when you try to understand the magic behind the scenes (after all, I like understanding what happens inside a framework - the bang and the bling are not what lured me into Rails).
So I've tried to become a more idiomatic, more complete Rails programmer. In the last project I started, I decided to -gasp- stop declaring my schema structure in SQL, and use migrations instead. All fine until I reached migration 008... The first one involving a table modification (adding an admin column for users) instead of a creation. Well, that one went pretty fine, but 009 was also a table modification (adding an order column for tsgs). And I updated my server - with both migrations at the same time.
Bad news: Tsg has_many :users, and every User validates_associated :tsg, as it should. And as migration 8 modifies (and saves, of course) a User, this validation was triggered... So, migration 8 died complaining about an undefined method `order' for #<Tsg:0x2b2d2dd8d0d0>. WTF?WTF?WTF? all over again. It took me quite a while to figure out what was going on.
What's the moral of the story? That a sleepy programmer can and will mess up. That batching up several migrations does not guarantee success, because the model classes are already at the newest version of the code while the schema is only halfway there. Crap. :-/
Or maybe it would just be safer if I abstained from directly instantiating models in migrations? Umh... Does not feel right :-/
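For the record, the two usual workarounds I know of - sketched below with the names from this story, but this is not my actual migration:

    class AddOrderToTsgs < ActiveRecord::Migration
      # Workaround 1: a bare model defined inside the migration, so none of the
      # application's validations or associations get in the way while migrating
      class Tsg < ActiveRecord::Base; end

      def self.up
        add_column :tsgs, :order, :integer, :default => 0
        # Workaround 2: make ActiveRecord forget the column list it cached before
        # this migration ran, so the freshly added column is visible right away
        Tsg.reset_column_information
        Tsg.find(:all).each { |t| t.update_attribute(:order, t.id) }
      end

      def self.down
        remove_column :tsgs, :order
      end
    end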

When bad system design leads to pain...

Submitted by gwolf on Thu, 05/17/2007 - 10:22
A long time ago, I wrote the system that still manages the Cuerpo Académico Historia del Presente group at the Universidad Pedagógica Nacional. Yes, I'm happy with a good portion of that project, which took me over a year of work... But I must admit to a fair deal of shame as well.
Of course, the shame comes from not properly understanding the domain data and the volume of information my system would be working with - and coming up with a stupid way to implement searches. I won't go into too much detail because, even if you had access to the full search facility in the system (no, it's not available to the general public), I would not like a swarm of curious people to bring last week's events back... Anyway, the group works by entering tens or hundreds of articles into the system daily, and running some interesting search sessions every couple of months.
I knew the performance problem was caused by an inefficient search mechanism (specifically, category exclusion is the prime killer). I knew the loadavg jumped through the roof, and memory usage did so as well... But it was not until some weeks ago, when we installed the mighty Munin on the machines at UPN, that we got this jewel - Thanks, Victor, for putting the graphics somewhere they can be shown! ;-)
So... How much does memory usage increase during searches?

Whoa. The system has 640MB of real RAM, plus 1GB of swap. Don't ask me how the hell it reports it was using ~2GB of swap - but still... And how is our load average?

Have you ever seen a (single-CPU, Pentium 4 1.7GHz) Linux system with a loadavg of 80?! For those who don't know, loadavg tells you, roughly, how many jobs are waiting to be scheduled on the CPU. A value of 1 means that all of the CPU's time during the measured timeframe was used (and, on single-core systems, that's the optimal usage level). On this machine, things start getting uncomfortable at 6 or 7. I had never before seen values even half this large.
Sigh... Well, in my defense, I must say I've been warning them about this problem for over two years. My contract with them ended long ago - I've repeatedly recommended that they hire somebody to fix it. So far, they have not.

tbm's Release Management talk

Submitted by gwolf on Wed, 04/25/2007 - 19:09
Thanks to Romain Francoise, I found and watched Martin Michlmayr's Release Management in Large Free Software Projects talk from Google Video's Open Source Speaker series. Martin: Thanks a lot, great talk. I've been following your presentations lately, as I've given some talks on this topic myself - Quality Assurance on Free Software Projects (Spanish only). However hard I try to remain faithful to the subject, I end up giving a talk on what Free Software is and how its processes are naturally more prone to yielding better quality than proprietary projects.
Anyhow, with this post I want to do basically three things:
  • To applaud Martin for writing about and presenting the topic. I've been referring to your work for a long time, and although my presentations are not formal papers and thus carry no citation value (in fact, they often don't have the references as anything more than comments in the source files, if at all), I should thank you directly and publicly.
  • To draw some more attention to the topics presented. Your point is very interesting and important to take into account, specifically for Debian: I'm not 100% with you that time-based releases are the way to go, but the point you make when discussing GCC is (IMHO) an important one: A project can shift to time-based releases even if it does not release on time, and its general internal processes can become quite a bit cleaner. I do think we witnessed this with Etch in Debian - The process was much more believable and peaceful (even with all the release-related mudfights) than with Sarge. I think it partly stems from what you point out: We aimed to release in December, and we failed to precisely meet the target - but having the date helped set the release goals to something believable, and kept Sid's instability from drifting too far away. During Sarge's testing cycle, for a long time, few of us even cared to use testing at all, as it was... Something completely undefined.
  • To get some insight on a few points. Martin, in several of the projects you analyzed there is a tendency to settle on a six-month release cycle. Many other projects also follow that cycle - There is one time-based project you didn't mention that is, to me, one of the inspirations for such a model, and has proven it works for over a decade: OpenBSD. Yes, it lacks in many areas, and definitely not everybody can aspire to become an OBSD developer, but they have made high-quality six-month releases for as long as they have existed. However, speaking specifically of Debian: Would you like our project to follow such a schedule? For many reasons (including one you mentioned, the difficulty of supporting several concurrent versions, or oldstable versions being left without support too early) I think we should aim for an 18-month cycle, 12 at the very least. Either I wasn't paying attention when you went over that part or you didn't mention it - but why do you think such a short cycle is good for so many projects? What would you like Debian to do? (Of course, the person answering this might be Martin, or somebody else entirely ;-) )

Won a book for YAPC::Europe!

Submitted by gwolf on Thu, 04/05/2007 - 15:26
Wow!
I got a mail from the YAPC::Europe organizers telling me I won a book for registering early and sending in a submission (as I have previously told you). I was even more surprised to find out I am one out of two lucky winners! So my new book is Es lebe der Zentralfriedhof.
No news yet on whether my talk (Integrating Perl in a wider distribution: The Debian pkg-perl group) will be accepted... But this kind of incentive does push me towards attending even if it is not accepted - Of course, it depends on the University sending me there. But anyway, I'm a step closer to Vienna. Anybody want to join me over there?
Oh, and by the way: On my previous posting on this topic I linked to my conference proposal URL. Little did I know that this URL is private, accessible only to the Academic Committee and me. Yes, different from what I'm used to... but that's the way it works there.

Reworking .deb - Does debian/rules really need to be a makefile?

Submitted by gwolf on Tue, 03/13/2007 - 11:17
Eddy Petrisor wrote quite an interesting text about the shortcomings of the .deb packaging format, especially comparing it to Gentoo's ebuilds. And, basically, it all comes down to this phrase:
Why this is not possible for deb right now? Simple, we have it as a rule in the policy that the debian/rules file is a makefile. So even if one would implement a class-like model for deb packages, you'd still have the debian/rules file as a make file
Now... Is debian/rules really expected to be a makefile, or is it just customary for it to be so? Look at the very top of your rules files - you will see they (almost) always start with #!/usr/bin/make -f - That means, of course, that you can mostly forget they are makefiles. While packaging/debugging, I often run -say- fakeroot debian/rules build && fakeroot debian/rules clean. That, my friend, is closer to the invocation you would use for a shell script than for a makefile. I don't know whether we have tools that rely on rules being called via make (and that should be easy to correct if needed), but at first glance I don't see a problem with creating packages based on something other than a makefile.

Recently, I've been tempted by CDBS. I still don't fully understand its flow, and it's still mostly a dark-magic beast for me, but at least I am comfortable using it for my everyday packaging work (hey, pkg-perl group, I'll be bugging you again with my weird ideas soon ;-) ), and it surely has the advantages you quote in your message: It takes part of the complexity away. Of course, it introduces some extra bits. By using CDBS, the packaging entry barrier is considerably lowered - but truly understanding how to properly maintain a package becomes somewhat more difficult. Or is it just me clinging to the comfort of having learned my way around writing debian/rules?

More on the unkillable XML-for-configuration rant

Submitted by gwolf on Mon, 01/15/2007 - 20:55
In short, Erich says that XML, plus the right editor, Just Works(tm). Well, yes. But when you are over a slow link, or when you are desperate over a b0rken system, you just don't have Eclipse at hand to edit a config file. Of course, you could use the half-existing XML support you mention for vim (I have not tested it, so I cannot vouch for it), but it is still a PITA if your /usr is not working fine or if your termcap is too dumb to manage.

Yes, such situations are ever less frequent, but anyway... I won't start ranting on how YAML is the right tool for every situation where XML is used - It's clearly not. XML is, after all, a standard. Some configuration can be done in XML, say, if you have one of those Java frameworks (I've only suffered^Whated^Wgot despaired^W^Wset up JBoss), but still... Configuration files, at least the important ones, should be editable with a lightweight, easy and available tool like nvi, pico, or even cat|sed.

Oh, and about YAML's site being valid YAML: Of course, it only looks like it. But cut and paste it - It works for me :) Of course, it is not meant to replace or work on top of HTML. I would never dream of using YAML as a web-services language or anything of that sort. There are better tools for that. But please, leave config files hand-editable. With common, light and hard-to-break editors.
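To make the point concrete, a tiny sketch (the file name and keys are invented): this is the kind of config file I mean, and the single line of Ruby it takes to read it.

    require 'yaml'

    # config/app.yml is plain, hand-editable text; say it contains:
    #   database:
    #     host: localhost
    #     name: blog
    #   cache_dir: /var/cache/myapp
    config = YAML.load_file('config/app.yml')
    puts config['database']['host']   # => "localhost"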

SmbGate: Almost entirely not frustrating

Submitted by gwolf on Thu, 12/07/2006 - 16:11
I've been working for a bit over a week on writing SmbGate, a simple and quite braindead Web app giving my users web access (read-only, for now at least) to their home shares on a Samba server from outside the Institute, which will be basically closed for over a month due to vacations and the move to a new building. It went quite smoothly. Even using a quite ugly API (Filesys::SmbClient - It works, but in an ugly fashion), getting the basic app to work took me only two days, and I've been beautifying bits of it for around a week. I even got around to writing a user manual, which -to my surprise and astonishment- has been followed by the users. Wow, I'm productive! I even think this can be useful to other people, so I'll put the code online soon - As soon as I get the workplace-specific things weeded out into a configuration file.

Of course, everything has its ups and downs. Yesterday, I found a bug. Today, a user reported the bug to me. And, of course, it seems to depend on MSIE's weirdness. I really, really hate my users experiencing browser incompatibilities - That's why I installed W2k under qemu (which, when used with the non-free but downloadable-at-no-fee kqemu kernel module, is perfectly speed-comparable with the completely non-free VMware - Go try qemu now!). I thoroughly tested the system from the guest W2k system against my development machine (which is, incidentally, the same physical box), and it worked perfectly. Of course, locally, I didn't care about setting it up in an SSL-protected area. For my users, of course, access to their files is SSL-protected. I tested the production system from Linux, using Firefox. Works like a charm.

So, why am I bitching? Because browsing the directories works correctly from MSIE, but downloading the files doesn't (it says, in Spanish so mangled I don't really understand the error message, that the file does not exist or the site is unavailable). Of course, debugging an HTTP request inside an SSL session is not feasible. I installed an instance of this system on my regular unencrypted HTTP server - But, surprise surprise, there it works fine under MSIE. Exactly the same URL, only with the https replaced by http.

So... I am almost entirely not frustrated. I have hit a bug which does not like being debugged. Joy, joy. But, I promise, victory will be mine.

Of broken promises and fixed websites

Submitted by gwolf on Thu, 11/16/2006 - 16:38
Over five years ago, I wrote a very simple web-based system. This system, however, did some basic client-side (Javascript, of course) validation before sending the data off to the server. And it worked nicely. On Linux, of course. The system went live. It worked correctly less than 1/10th of the time. Yes, somewhat strangely, quite close to the ratio of Netscape to MSIE users. Yes, a Javascript coding bug. The embarrassment made me swear never to get close to Javascript again.

Of course, since we live in a world where idle loops get optimized and where infinite loops have an ETA, this had to change at some point. Earlier this week, I decided to unfuck a web layout that worked (again) correctly in Mozilla and KHTML, but horribly in MSIE. I didn't care before, because this layout was used on a production system at work whose users were only two colleagues and myself - Only now I'm about to put a public module up. I re-did the site layout and CSS (I cannot believe Dreamweaver code is that ugly!)... The only problem was, I now know, quite a common one: I needed equal-height CSS-made columns. And although I had come up with several pseudo-solutions, they all appeared pseudo-b0rked in one or more pseudo-browsers. The only way I found to get it working was to free myself from prejudice and go back to Javascript.

BTW, the Javascript X library looks quite handy - but at over 50k, it's not something I'm terribly happy about including in a website. What's next? Am I going to fall for coding over-AJAXy sites? I hope to maintain at least partial sanity.

FUSE vs. GnomeVFS?

Submitted by gwolf on Tue, 08/15/2006 - 09:11
Aigars: I agree with Womble's comment on your blog. Maybe GnomeVFS is just way too much? Maybe it could be substituted by an on-demand FUSE-based mounter and unmounter? It seems to me it'd be saner to get all the relevant file manager GUIs (or plain UIs, maybe even some overpowered shells) to interpret a URL request as just a call to such a script plus a local filesystem operation: mount via FUSE in a protected, per-user area, and then just unmount after a given inactivity timeout. Yes, I know GnomeVFS is able to do all that and more. But as is always the case with Gnome and me: I doubt that most of the time you need all of that "all that". And probably there are saner ways to implement it than via yet-another-layer-for-yet-another-already-solved-thing.
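Just to make the idea a bit more tangible, a very rough sketch - the share syntax, paths and timeout are all invented, sshfs stands in for whatever FUSE helper matches the protocol, and a real implementation would need locking and better error handling:

    require 'fileutils'

    MOUNT_ROOT   = File.expand_path("~/.netmounts")
    IDLE_TIMEOUT = 15 * 60  # seconds

    # Mount user@host:/export/foo under ~/.netmounts/ if it is not mounted yet,
    # and return the local path a file manager could simply browse into.
    def mount_on_demand(remote)
      target = File.join(MOUNT_ROOT, remote.gsub(/[^\w.-]/, '_'))
      FileUtils.mkdir_p(target)
      unless system("mountpoint", "-q", target)
        system("sshfs", remote, target) or raise "mount of #{remote} failed"
      end
      target
    end

    # Unmount anything that has not been touched for a while; a cron job or a
    # timer in the file manager could call this periodically.
    def unmount_idle
      Dir.glob(File.join(MOUNT_ROOT, "*")).each do |dir|
        next unless system("mountpoint", "-q", dir)
        system("fusermount", "-u", dir) if Time.now - File.stat(dir).atime > IDLE_TIMEOUT
      end
    end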

1st Debianmexico bug squishing party!

Submitted by gwolf on Sun, 08/13/2006 - 21:13
We had a very nice day of work, following Rodrigo's BSP invitation on the debianmexico list. The day started at 10 AM, when Rodrigo and I arrived at Nul-unu. After a short while spent setting up cables and coffee, we sat down and started working our way through the BTS. Soon afterwards the rest of the crew arrived; we were seven people. We spent the day not only squishing bugs, but also giving informal talks and one-to-one lectures on how Debian's processes work and on how the BTS (and BSPs) work.

The bug list is not impressive, and they were not very difficult bugs to fix, but it's a very good step towards establishing a working, strong Debian community in Mexico. We closed bugs #379589, #382715, #374663, #380872, #382399, #382322, #382096, #335765, #368745, #368207, #382039 and #381130. There is a document in preparation, as Ángel felt the need for a simple text documenting how to fix bugs, and there is even a video, which Toño promised to have ready soon, for some definition of soon.

...Nice chat, nice company, entertaining Sunday. And a nice way to wish Debian a happy birthday, even if it is a couple of days in advance.