Security in Perl scripts

Jesus Alejandro Juárez Robles
alex@campus.iztacala.unam.mx
http://www.openbsd.org.mx/~alex/

Gunnar Eyal Wolf Iszaevich
gwolf@campus.iztacala.unam.mx
http://www.gwolf.cx/

FES Iztacala - Universidad Nacional Autónoma de México

1. Introduction

1.1 What is Perl?

Perl is an all-purpose, easily extensible programming language that has become quite popular in recent years. Both the language's specification and its implementation are free, which has led to Perl being used not only as a standalone language, but as a glue language and an embedded language as well.

Perl has important characteristics that help us create safe code, but it also has many subtleties that can easily work against us if we are unaware of them.

1.2 Good things about Perl

Regular Perl programmers speak wonders about many aspects of this language: we like how it can be written, with flexible syntactic rules modeled after natural languages. We like the richness of the language. We like a great many things about it. However, which of them are relevant security-wise? Mainly:

1.2.1 Untyped variables

In Perl, we do not have to worry about declaring variables as having a specific type or magnitude. Perl will automatically take care of using the right size and type for our data, and converting it if necessary to any other type.

The same goes for arrays and hashes: when we declare an array, we do not need to state how many elements it will have, and we do not need to worry about the data type of each of them. Hashes and arrays are of variable length, so Perl will not make us spend memory on yet-unused locations. Finally, while multidimensional arrays are not really natively supported in Perl, we can seamlessly and painlessly emulate them using references.
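A short sketch may make these properties concrete. The variable names below are only illustrative; the code assumes nothing beyond core Perl.

```perl
use strict;
use warnings;

# An array grows on demand; its elements may hold values of any type.
my @mixed = (42, 'text', 3.14);
push @mixed, 'one more';          # no size was ever declared

# A scalar is converted as the context requires.
my $sum = '10' + 5;               # the string is used as a number

# A "two-dimensional array" is an array of array references.
my @grid = ([1, 2, 3], [4, 5, 6]);
$grid[2][0] = 7;                  # the third row springs into existence
my $cell = $grid[1][2];           # second row, third column

print scalar(@mixed), " elements, sum $sum, cell $cell, rows ", scalar(@grid), "\n";
```

Note that the third row of @grid was never allocated explicitly: assigning to one of its elements was enough.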

1.2.2 Automatic memory management

Many languages require the programmer to manually manage dynamically allocated memory. In Perl, memory is automatically assigned when it is needed, and the language's own garbage collection mechanisms will reclaim any space which is no longer referenced (and therefore no longer used). This means we don't have to worry about the feared buffer overflows, array sizes, the complexity of data structures and other details. This not only makes development time much shorter, but also helps avoid human mistakes, which are so common in tedious, repetitive processes.

1.2.3 High extensibility

Perl is a tremendously extensible language. There are modules already written that allow us to do practically everything, and, without binding us to programming paradigms which are not always practical (as Java does, making everything into an object, which sometimes is quite awkward and artificial), it favors the reuse of known and trusted code.

One of Perl's main strengths is the CPAN (Comprehensive Perl Archive Network), a very large repository of Perl modules covering practically every area of development. It is a very good idea to periodically check the CPAN during a project's development cycle, as perhaps we will find the answers to many problems we will run into. This will surely save us long hours of development and debugging.

1.2.4 Quick compilation

Although many people see this as a drawback instead of a good thing, I think this is one of the most important factors to Perl's success.

Although many people seem to think that Perl is an interpreted language, this is not true: Perl is a true compiled language. The compilation, admittedly, takes place in memory every time the program starts. We can (using the B and O modules, and the perlcc program) compile in different ways, producing binary files, although this is not really faster than compiling our program at runtime, thanks to the highly optimized Perl compiler.

Quick compilation saves a lot of time when programming, and even more while debugging. I have participated in projects reaching tens of thousands of lines (plus the included modules), and compilation time at startup was only a couple of seconds. We were able to test even our smallest changes without long delays. I cannot imagine how much we would have been slowed down had we done this project in a language such as C or Java.

1.3 Bad things about Perl

Even though I like talking about Perl (and I like even more using it), not all is perfect in this great language. There are many dangerous points which we must keep in mind. Some of them are:

1.3.1 A lot of simplicity and power --- Maybe too much

Perl allows us to easily interact with the operating system, and the programmer may forget to check the results of these actions, or to correctly validate their arguments. This can lead to very severe security problems.
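As a minimal sketch of what checking the result of an action can look like, the hypothetical helper first_line below reports a failed open instead of silently reading from an unopened filehandle. The function name and the path are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first line of a file, or an error
# message when the file cannot be opened or closed.
sub first_line {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or return (undef, "cannot open $path: $!");
    my $line = <$fh>;
    close($fh)
        or return (undef, "cannot close $path: $!");
    return ($line, undef);
}

my ($line, $error) = first_line('/nonexistent/example.txt');
my $report = defined $error ? "Error: $error\n" : "Read: $line";
print $report;
```

The same habit applies to system(), unlink(), chdir() and every other builtin that talks to the operating system: each one reports failure through its return value, and ignoring it means continuing with a false assumption.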

1.3.2 Non-prototyped functions

Leaving functions untyped is a great quality of Perl: just as variables are untyped, a single function can behave differently according to its arguments, to its context and, basically, to whatever the programmer sees fit. However, although this gives great flexibility, it has two important drawbacks:

Note that there is a way to use prototyped functions in modern versions of Perl (see Prototypes in perldoc perlsub), although its syntax is somewhat awkward and it is not widely used. Quoting from this document:

Some folks would prefer full alphanumeric prototypes. Alphanumerics have been intentionally left out of prototypes for the express purpose of someday in the future adding named, formal parameters. The current mechanism's main goal is to let module writers provide better diagnostics for module users. Larry feels the notation quite understandable to Perl programmers, and that it will not intrude greatly upon the meat of the module, nor make it harder to read. The line noise is visually encapsulated into a small pill that's easy to swallow.

1.3.3 Objects are essentially just a patch

Starting with Perl 5 we can truly do object-oriented programming in Perl. It is, however, easy to notice that this implementation is essentially a patch, although quite a nice one, and not a clean implementation of objects. Object-oriented programming in Perl usually involves many steps that we can skip in other languages (or not even think about), and it becomes the kind of repetitive, tedious task that Perl usually helps us avoid.

1.4 Types of Perl

Being faithful to the TIMTOWTDI philosophy (There Is More Than One Way To Do It, one of Perl's main mottos), there are many ways of using Perl. The most common ones are:

1.4.1 Traditional Perl

The Perl binary is called together with the program, in order to compile and execute it. This can be done using the shebang syntax, giving Perl the script name as an argument, giving Perl the whole script as an argument after a -e, sending the whole script via STDIN to Perl, and probably in many other ways.

Perl's compiler is highly efficient. However, compilation time does cost computer resources. If a program is to be executed frequently, compilation time can become very important.

Practically all Perl modules have been written with traditional Perl in mind, and will probably work better with it.

1.4.2 Perl as a module or part of another application

Perl was originally conceived as a glue language, made to work together with other applications and languages, and has therefore been turned into modules for different applications and into an embedded language. Probably the best example of this is mod_perl, which embeds Perl in the popular Apache web server.

mod_perl is an answer to the problem stated above. A copy of Perl will reside in the Apache server process. Apache, of course, will have a substantially bigger memory footprint than if it did not include Perl, and takes some extra milliseconds to initialize. However, if a program is requested repeatedly (e.g., a frequently used CGI script), its compiled binary image will be ready in memory, and execution times will be much lower.

There are many ways of using mod_perl. Probably the most frequently used one (although it cannot use important capabilities of mod_perl) is through the Apache::Registry module, which allows us to run, practically unmodified, CGIs written in the traditional style.

On the other hand, we have Mason, ePerl and other similar alternatives, allowing us -in the purest PHP and ASP style- to embed our code in HTML.

The best way, however, to take full advantage of mod_perl is to write Apache modules directly. This gives us the power to take part in every stage the server goes through, starting with Apache's initialization and configuration, and continuing with each stage of the request handling cycle. The whole Apache API will be at our disposal if we use mod_perl. There are many things that can only be done in Perl or in C, and most of you will agree that it is usually more convenient to do them in Perl.

1.5 Languages comparable with Perl

It is not my intention to start a holy war on programming languages, but I do think that one of the best ways to define the factors that make a programming language unique is to compare it to others. Of course, comparing to each programming language would be tedious and ridiculous. I will compare it, then, with the ones it usually competes with.

1.5.1 Shell

Perl is a wonderful language that can help us with our system administration tasks. In this area, of course, the logical alternative would be using the different Unix shells (Bourne, Korn, C-Shell and their derivatives). These languages qualify perfectly as general-purpose programming languages, and are powerful enough to carry out almost any simple task. However, when we reach intermediate sized problems, we will quickly realize that it is impossible to code them cleanly: variables and functions are global, we do not have even the most basic data structures (and it is quite complex to implement them), and it is very difficult to maintain a modular programming style.

1.5.2 PHP

From its very beginning, PHP was meant to help easily design interactive Web pages (Perl became a top choice for interactive Web pages practically since they were first conceived), although there are already ways to use PHP for administration scripts and many other roles. PHP clearly exhibits its heritage: since its beginning it was meant to improve on many aspects of Perl, allowing Web designers to easily do basic programming.

PHP's syntax is very similar to Perl's. Perl users will always find something lacking in PHP. PHP will, unlike Perl, almost always appear scattered among HTML code - Perl users often frown at this practice, as we find it to be dirty.

PHP is a purely interpreted language, while Perl compiles the whole program into memory. This often makes PHP debugging slower and more complicated.

Typically, executing a PHP program is quicker than calling a CGI script, as PHP's interpreter is embedded in the Web server, and does not need to be loaded each time the program is invoked. This trend is clearly reversed if we use mod_perl with Apache, which besides boosting execution speed gives us much greater functionality and control over the Web server.

Although PHP is quickly maturing into a serious programming language, its design shows its much humbler beginnings. Many parts of the language do not have a coherent syntax, and it is not hard to tell which parts of the language were designed by different people.

1.5.3 Python

Python is probably the language with which Perl most directly "competes". Python shares many of Perl's characteristics, but it does have a fundamental design difference: although one of Perl's most often quoted philosophical mottos is TIMTOWTDI (There Is More Than One Way To Do It), Guido van Rossum (Python's creator) prefers to say that there should be only one way to do it.

Python emphasizes code legibility, using much clearer and cleaner constructs, and often avoiding the creation of unmaintainable code. Many Perl fans complain about Python's lack of naturalness: Perl was not designed by a computer scientist, but rather by a linguist, Larry Wall, who modeled his creation to be as close as possible to a natural language.

1.5.4 Java

Java is a poorly understood concept, created at Sun Microsystems. Java's goal is to have a universally portable system, compiling source code to an intermediate form called bytecode, which is later compiled at runtime in the destination computer using a Java Virtual Machine (JVM).

Java was conceived to allow for the execution of remote, unchecked code (often inside a Web browser) as safely as possible. It includes a sandbox, which prevents an untrusted program downloaded from the network from carrying out certain actions which can jeopardize security.

In Java, everything we do must strictly follow the object oriented programming paradigm. Although it is true that OOP can be very useful for certain projects, making it mandatory for any application makes Java programming slow and tedious.

Java is unfortunately too slow for medium or large applications, and the different versions of the language in widespread use make the promise of "write once, run everywhere" largely a myth, as they are not completely compatible with each other.

1.5.5 Javascript

This language's name is quite misleading - Javascript has nothing to do with Java. Javascript is a specific purpose language, which runs as part of a Web browser. Javascript is strictly for client-side programming, and we typically use it to validate form input before sending it to the server to be processed, to allow for dynamic HTML pages, and similar tasks.

Javascript unfortunately also suffers from having a large amount of not completely compatible, widespread implementations. Even though the language's specification is quite clear and -as Javascript fans say- is quite elegant, using it is often very problematic. Many Web programmers prefer not using Javascript's advantages in order not to be victims of its incompatibilities.

It is very common to find CGI scripts in different languages generating Javascript code to be run at client side.

2 Avoiding insecure programming practices: the strict pragma

Pragmas are directives targeted at the Perl compiler, requesting it to act in a specific way for a certain part of the code, to modify what it accepts as valid code, or to change how it carries out specific operations. Throughout this talk we will cover several different pragmas, but I do think that strict is not only among the most useful and important pragmas available in Perl: it is also the one that best illustrates what pragmas are.

An important note about pragmas: If our program spans various files, being divided in modules or libraries, activating a pragma in one of them does not automatically activate it in the other ones. It is a very good practice to start all of our modules or libraries activating the same pragmas, so that the compiler will show a consistent behavior in all of the code.

2.1 Introduction to strict

strict's purpose is to keep us from falling into insecure programming practices, which, although very useful sometimes, often jeopardize our code. Perl usually accepts certain ways of using variables, subroutines or references that are not really secure, and strict will not allow us to use them, aborting at compile time if it detects we are doing something wrong.

Yes, it is often awkward having the compiler complain and not letting us program as we like, but it is fundamental to use strict in any project we will use more than just a couple of times, as bad practices will surely come back at us and bite us.

2.1.1 Activating/deactivating strict

To ask the compiler to turn on the strict checks from the current point in the program on (up to the end of the enclosing block or file), we tell it:

use strict;

and if we want to deactivate this behavior from a certain point on, we ask the compiler:

no strict;

We can specify that we want to activate or deactivate just part of strict's functionality this way:

use strict 'vars';
use strict 'subs';
use strict 'refs';

If we don't specify a mode, it implies we are referring to all of them.

For further documentation on this pragma, read perldoc strict.
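Because the pragma is lexically scoped, one of its modes can be relaxed for a few lines only. The following sketch illustrates this with strict 'vars'; the variable names are made up for the example, and the forbidden statement is compiled from a string only so that the compile-time failure can be trapped and displayed.

```perl
use strict;
use warnings;

my $inside = do {
    no strict 'vars';            # relaxed only until the closing brace
    $legacy_total = 3;           # undeclared package variable: accepted here
    $legacy_total + 1;
};

# Back under full strict, the same kind of statement does not compile.
my $compiled = eval q{ $another_total = 3; 1 };
my $reason   = $compiled ? '' : $@;

print "inside the relaxed block: $inside\n";
print "outside it: ", ($compiled ? "compiled\n" : "refused: $reason");
```

Keeping the relaxed region as small as possible means the rest of the file still benefits from every check.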

2.2 strict 'vars': Variable scoping

This mode will make Perl generate a compile time error if we try to use a variable which has not been declared with use vars, my or our, or called with its full name (including its namespace, i.e. $main::var instead of $var). Note that local does not declare a variable, so using local on an otherwise undeclared variable is disallowed too.

To better understand this, let's briefly take a look at the possible variable scopes in Perl.

It is important to note that strict 'vars' will not complain if we use Perl's special global variables ($|, $_, $^W, etc.). They can be used independently of the strict 'vars' setting.

2.2.1 Global scope

2.2.2 Global scope in a package - our

The following expressions are equivalent:

$myPackage::variable = 1;

and

package myPackage;
our $variable = 1;

Additionally, they are both valid syntaxes when working under strict 'vars'.
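The equivalence can be checked with a small sketch. The bump function is a hypothetical addition, used only to show that the short name inside the package and the full name outside it refer to the same variable.

```perl
use strict;
use warnings;

package myPackage;
our $variable = 1;            # declares $myPackage::variable

sub bump { return ++$variable }

package main;

# From another package we use the full name, which strict 'vars' accepts.
$myPackage::variable = 5;
my $after = myPackage::bump();

print "variable is now $after\n";
```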

2.2.3 Local scope

2.2.4 Lexical variables with my

2.2.5 Why should we avoid using global variables?

2.2.6 Behavior of the global scope when using strict

If we try to use globally scoped variables without declaring them or giving their full name, Perl will generate the following errors:

use strict;
$var = 1234;

Global symbol "$var" requires explicit package name at programa.pl line 2.
# We get this error immediately, as it is generated at compile time.
Execution of programa.pl aborted due to compilation errors.
# This error is generated at execution time.

2.3 strict 'subs': Non-explicit subroutines

strict 'subs' will require every bareword to be a valid function call, to be enclosed within curly braces or to be at the left hand side of a => operator. This means that the following forms are permitted:

The following will cause a compile time error:
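As a hedged illustration of the rule stated in this section (not a reproduction of the original examples), the sketch below exercises the three permitted forms and one forbidden one. The names greeting, colour and undeclared_word are invented for the example, and the forbidden statement is compiled from a string so that the error can be trapped.

```perl
use strict;
use warnings;

sub greeting { return 'hello' }

# Permitted under strict 'subs':
my %h     = (colour => 'red');     # bareword on the left of =>
my $value = $h{colour};            # bareword inside curly braces
my $text  = greeting;              # bareword naming an already declared sub

# Not permitted: a bareword that is none of the above.
my $compiled = eval q{ my $x = undeclared_word; 1 };
my $message  = $compiled ? "compiled\n" : $@;

print "value=$value text=$text\n";
print "compile result: $message";
```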

2.4 strict 'refs' --- Symbolic references

A quite obscure characteristic of Perl 5 is the use of symbolic references. This is quite a nice and fun concept, but dangerous enough that, as soon as it was introduced, strict 'refs' was added to strict to prevent the abuse of this kind of reference.

2.4.1 What are symbolic references?

Unlike real references, which we will often find in Perl, symbolic references do not point to the memory address of a variable, but only to a variable's name. The best way of explaining this strange concept is through an example, taken directly from perldoc perlref:

$name = 'foo';
$$name = 1; # Stores 1 in $foo 
${$name} = 2; # Stores 2 in $foo 
${$name x 2} = 3; # Stores 3 in $foofoo
$name->[0] = 4; # Stores 4 in $foo[0] 
@$name = (); # Empties @foo
&$name(); # Calls the &foo() function
$pack = "THAT"; 
${"${pack}::$name"} = 5; # Stores 5 in $THAT::foo without the need for an eval!
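When the goal is simply to reach a value by a name known only at run time, a hash usually does the job without touching the symbol table. The sketch below is an assumption about how such a replacement might look (the %setting hash and the lookup function are hypothetical); it also shows strict 'refs' refusing the symbolic form.

```perl
use strict;
use warnings;

# Values that may legitimately be selected by name.
my %setting = (foo => 1, bar => 2);

sub lookup {
    my ($name) = @_;
    # Only names present as hash keys are reachable; no other variable
    # in the program can be read or overwritten through this function.
    return exists $setting{$name} ? $setting{$name} : undef;
}

# With strict 'refs' active, the symbolic form dies at run time.
my $name    = 'foo';
my $allowed = eval { my $value = $$name; 1 };
my $error   = $allowed ? '' : $@;

print "lookup: ", lookup($name), "\n";
print "symbolic reference: ", ($allowed ? "allowed\n" : "refused: $error");
```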

2.4.2 And what is so bad about symbolic references?

3 Reporting warnings

Perl has the ability to warn programmers via the standard error (STDERR) that they might be doing something wrong, inviting them to check the code and verify that they did not make an error that can lead to an important failure under certain circumstances.

Perl has been able to report warnings for a long time, although this behavior changed considerably with the introduction of Perl version 5.6.0.

3.1 Reporting warnings with Perl < 5.6.0

With Perl versions prior to 5.6.0, the behavior of the warnings report is defined by a switch specified at runtime, or by a special global variable, according to the following rules:

3.2 Using lexical warnings with Perl >= 5.6.0

Under Perl 5.6, this behavior still works, but a new, more flexible and powerful behavior has been introduced: Defining the warnings as a pragma. Please refer to perldoc perllexwarn for further information.

Please remember that, unlike the switches mentioned above, pragmas apply only to the file where they are declared.

Activating warnings report as a pragma allows us to activate or deactivate individually the following categories of warnings (taken from perldoc perllexwarn): chmod, closure, exiting, glob, io (which is further subdivided in closed, exec, newline, pipe and unopened), misc, numeric, once, overflow, pack, portable, recursion, redefine, regexp, severe (subdivided in debugging, inplace, internal and malloc), signal, substr, syntax (subdivided in ambiguous, bareword, deprecated, digit, parenthesis, precedence, printf, prototype, qw, reserved and semicolon), taint, umask, uninitialized, unpack, untie, utf8, void and y2k.

Each of these categories is activated or deactivated individually. For example, if we want to activate warnings on symbols used only once and on unopened filehandles being used, and we want to ignore the recursion and uninitialized values warnings, we can write:

use warnings qw(once unopened);
no warnings qw(recursion uninitialized);

Additionally, we can raise the level of these categories to force them to become fatal errors. For example, if we want Perl to die instead of just warn when we are redefining a function (that is, when we define two functions with the same name), we can use:

use warnings FATAL => qw(redefine);
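Fatal warnings combine well with eval, which traps the resulting exception. The sketch below makes the numeric category fatal inside a single function; to_number is a hypothetical helper written for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: convert text to a number, or return undef when
# the text is not numeric. The warning is fatal only inside this sub.
sub to_number {
    my ($text) = @_;
    use warnings FATAL => qw(numeric);
    my $number = eval { $text + 0 };
    return $number;               # undef if the conversion died
}

for my $input ('42', '4x2') {
    my $n = to_number($input);
    print "$input -> ", (defined $n ? $n : 'rejected'), "\n";
}
```

Without the FATAL setting, the second conversion would only print a warning and quietly continue with the value 4.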

If we define warnings report behavior using pragmas, it will have precedence over the behavior of the $^W variable or the -w switch. However, if we invoke Perl with the -W or -X switches, they will have precedence.

4 Handling tainted data

Most programs are not limited to internal data processing or data generation: they take some data set as their input and generate another data set as their output. This is natural, but can become quite problematic and dangerous, especially when this data can affect the execution of our program.

For more detailed information on this subject, please check the official documentation: perldoc perlsec.

4.1 What is tainted data?

Perl has a mode in which it refuses any external use of tainted data that has not been cleaned first; that is, it will complain whenever we try to do something that has an effect outside our program's execution space with data that has not been validated.

Data is considered to be tainted if:

And Perl will refuse to use tainted data for:

Refusing to use it means that if we try to do one of these restricted operations with tainted data, Perl will raise a runtime exception, which will kill the running process (unless it is trapped with eval).

4.2 Activating tainted data reports

Perl will follow the above described behavior if:

Once we enter tainted mode, it will not be possible to leave it - The whole execution of our program will be carried out with taint checks.

4.2.1 Detecting tainted data

To detect tainted data, we can use this function (taken from perldoc perlsec):

sub is_tainted {
    return ! eval {
        join('', @_), kill 0;
        1;
    };
}

We can also use the Taint module (available at CPAN) this way:

use Taint;

warn "Tainted data" if tainted($var1, @var2, $var3, %var4);

This lets us avoid possible fatal exceptions.

Certain values will always be tainted, as they come from the outside world. For example, the execution path ($ENV{PATH}) is received from the invoking process, and Perl cannot trust it to be safe. Any external program we execute from within Perl (with system(), exec(), qx(), backticks, etc.) will require $ENV{PATH} to be untainted (cleaned) before it is allowed to run.

Data is untainted by matching it against a regular expression and taking the captured text:

$data = <STDIN>; # $data contains tainted data, as it comes from the user

if ( $data =~ /^(\w+)$/ ) { # Accepts only word characters (letters, digits and underscore)
    $clean_data = $1; # $1 contains the text which matched the regular expression
} else {
    die "I did not expect this condition: $data";
}
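The same idea can be packaged into small, specific validators. The sketch below is only an illustration: clean_username and its rule for valid names are assumptions made for the example, while the environment cleanup follows the advice given in perldoc perlsec (the directories should be adjusted to the system at hand).

```perl
use strict;
use warnings;

# Give external commands a known, trusted environment.
$ENV{PATH} = '/bin:/usr/bin';
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

# Hypothetical validator: a user name is 1 to 32 lowercase letters,
# digits or underscores, and starts with a letter. The returned value
# comes from a capture group, so it is untainted under taint mode.
sub clean_username {
    my ($input) = @_;
    return undef unless defined $input;
    chomp $input;
    return $input =~ /\A([a-z][a-z0-9_]{0,31})\z/ ? $1 : undef;
}

for my $candidate ("alex\n", 'alex; rm -rf /') {
    my $clean = clean_username($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected\n";
}
```

The pattern describes exactly what is acceptable, rather than trying to list what is dangerous.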

4.2.3 A warning about unconsciously untainting

After what we just said, this might sound tempting:

$data = <STDIN>;
$data =~ /^(.+)$/;
$data = $1;

However, accepting any data without checking it defeats the purpose of taint checks. It is not impossible for us to want to do something like this, but it is very important to double- and triple-check whether this is our only way out.

Unconsciously untainting data can be very harmful. Many computer security schemes of all kinds fail because they make users believe they are completely safe, even if the schemes only raise the overall system security a little. Taint checks are a tool for programmers to ensure they are not forgetting to check something, and circumventing this mechanism would be completely pointless.

5 Suggestions to remember when using functions

There are many subtleties concerning functions and their use. Among the most important ones:

5.1 Invoking functions

If we are not using strict, Perl will allow us to invoke functions directly, giving only the function's name without explicitly stating that we are doing a function call. Although this often seems elegant and clean, it can confuse people reading our code, and the name can clash with a reserved word in a future Perl version. Functions should preferably be called as function() or function($arg1,$arg2), indicating an argument list even if it is empty, making it clear we are talking about a function call.

Many people like adding the prefix & to the function calls. This is a syntax inherited from old Perl versions and not needed anymore. If you choose to use this syntax, it is also very important to explicitly give the argument list even if it is empty (&function()). If instead we call it as &function, this function will inherit the argument list (@_) for the current function call.
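The difference between the two forms is easy to observe. In this sketch, count_args, explicit_call and implicit_call are names invented for the demonstration.

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

sub explicit_call { return &count_args() }   # empty list: the callee sees no arguments
sub implicit_call { return &count_args }     # the callee sees the caller's @_

my $explicit = explicit_call('a', 'b', 'c');
my $implicit = implicit_call('a', 'b', 'c');

print "explicit: $explicit, implicit: $implicit\n";   # prints "explicit: 0, implicit: 3"
```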

5.2 Receiving parameters

It is very common for the first line of a function to be:

my ($var1, $var2, $var3) = @_;

or

my $var1 = shift; my $var2 = shift; my $var3 = shift;

This syntax is completely correct. However, if we just assume the function was correctly called, we can end up with undefined parameters or ignored arguments. It is very advisable to check each of the received parameters, looking for undefined values and incorrect data types (e.g., a variable containing text where we should have a numeric value, or a scalar value where we expect a reference). Having incorrect data can cause erratic and hard to trace behavior.
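A possible shape for such checks is sketched below. The scale function, its error messages and the convention of returning a (result, error) pair are assumptions made for the example; looks_like_number comes from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical function: multiply every element of a list by a factor.
# Returns (result, undef) on success and (undef, message) on failure.
sub scale {
    my ($factor, $values) = @_;

    return (undef, 'expected exactly two arguments') unless @_ == 2;
    return (undef, 'factor must be a number')
        unless defined $factor && looks_like_number($factor);
    return (undef, 'values must be an array reference')
        unless ref($values) eq 'ARRAY';

    return ([ map { $_ * $factor } @$values ], undef);
}

my ($result, $error) = scale(2, [1, 2, 3]);
my $report = defined $error ? "Error: $error\n" : "Scaled: @$result\n";
print $report;
```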

5.2.1 Note on the reason of having @_

Many people ask why Perl uses a default array (@_), something not done by any other modern programming language. A simple example will better illustrate this:

In C, we declare a function like this:
int funct (int var1, char* var2);

In Python:
def func (var1, var2):

In PHP:
function func ($var1, $var2) { (...) }

And the list goes on for practically any other language. Why do we have to juggle around in Perl as we discussed in the previous section, manually assigning variables from a default array called @_?

Believe it or not, this is done by design. In Perl's first incarnations lexical variables did not exist. If we were to declare a function together with its variable names, this would have an impact on the global namespace. Rather than polluting namespaces, the Perl architects decided to give arguments one standard and well defined name: the default array, or @_. In fact, if we are writing a short function and it does not make our code illegible (and even more so if the function will be called often), it is advisable to use the default array directly inside the function. This way, we will not have to worry about manually creating and assigning variables, saving some microseconds. Yes, some microseconds may be very little in most cases, but it can become quite noticeable in some circumstances.

5.3 Handling internal variables

We have talked this over a couple of times already. It is very important to make all of our variables (or at least, as many as possible) belong to the lexical scope (with my). This will avoid data in our function affecting other functions, and it will optimize memory usage.

5.4 Being careful when using references - ref

When we use references to pass arguments to or return values from a function, it is very important to check that they actually are references, and that they refer to the right data type. It is invariably better to return from the function with an error message and allow the invoking program or function to handle this condition than to raise a run-time error that aborts the execution or, maybe even worse, to handle incorrect data.

To detect these errors in time, we can resort to the ref builtin function this way:

$var                      ref($var)
\\\$something             'REF'
\$data                    'SCALAR'
\@arr or [1,2,3]          'ARRAY'
\%hash or {a=>1,b=>2}     'HASH'
\&func                    'CODE'
\*other                   'GLOB'
$non_ref                  '' (empty string, not undef)
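The values in this table can drive the decision of how to treat an argument. In the sketch below, describe is a hypothetical function that handles the types it knows and reports the others instead of dying.

```perl
use strict;
use warnings;

# Hypothetical function: describe its argument according to ref().
sub describe {
    my ($thing) = @_;
    my $type = ref $thing;

    return 'not a reference' if $type eq '';
    return 'array with ' . scalar(@$thing) . ' elements'
        if $type eq 'ARRAY';
    return 'hash with keys: ' . join(', ', sort keys %$thing)
        if $type eq 'HASH';
    return 'code returning ' . $thing->()
        if $type eq 'CODE';
    return "unsupported reference type $type";
}

print describe($_), "\n" for 5, [1, 2, 3], { b => 1, a => 2 }, sub { 42 }, \'text';
```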

5.4.1 GLOB references

A special reference type is a GLOB reference - It is not a reference to a scalar, an array, a hash or code, but rather a reference to everything.

Namespaces don't only manage the data types just mentioned. They also manage filehandles, which do not have an identifying prefix, and are not easy to pass between functions. If we want, for example, to pass a filehandle as an argument, the traditional way is to do it via a GLOB: func(*STDOUT). This makes func's first argument an object of IO::Handle type.

Yes, back in their day GLOBs were a needed alternative, and even an elegant way of doing something the language needed. However, nowadays the best use of GLOBs is to confuse fellow programmers. If you want to pass filehandles between functions, it's often better to use the native object oriented implementation of open, through IO::Handle (or one of its subclasses, such as IO::File or IO::Socket):

use IO::File;

if (need_open()) {

    my $handle = new IO::File;

    if ($handle->open('

Here, $handle has all the attributes of a lexical variable: We can pass it to the use_file function as a parameter, it automatically disappears when leaving the block where it is declared (cleanly closing it, of course, thanks to the object's destructor), etc.
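A complete, self-contained sketch of this pattern follows. It is not the original example: count_lines is a hypothetical consumer, and a temporary file created with the core File::Temp module stands in for real data.

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw(tempfile);

# Hypothetical consumer: counts the lines available on any handle object.
sub count_lines {
    my ($handle) = @_;
    my $count = 0;
    $count++ while defined $handle->getline;
    return $count;
}

# Prepare a temporary file with two lines of sample data.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "first\nsecond\n";
close($tmp) or die "cannot close $path: $!";

my $lines;
{
    my $handle = IO::File->new($path, 'r')
        or die "cannot open $path: $!";
    $lines = count_lines($handle);
}   # $handle goes out of scope here, which closes the file

print "lines: $lines\n";
```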

5.5 Returning results

When we leave a function, either after a successful or an unsuccessful operation, we should return a result, as simple and coherent as possible.

Every function should have a consistent way of expressing success or failure, and the function call should handle this result accordingly.

When we successfully complete a function, we should find the simplest way of returning the processed data - For example, sometimes it will be much simpler to process multiple results of a function call if they are all grouped in a single reference to an array, or better yet, to a logical structure reflecting their relationships, instead of sending them back as a collection of unrelated scalar values.
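One way of keeping results simple and coherent is to always return the same kind of structure, whether the operation succeeded or not. The sketch below is just one possible convention; parse_pair and the keys ok, key, value and error are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical function: split "key=value" text into its parts.
# It always returns a hash reference with an 'ok' flag.
sub parse_pair {
    my ($text) = @_;
    return { ok => 0, error => 'no input given' } unless defined $text;
    if ($text =~ /\A(\w+)=(.*)\z/s) {
        return { ok => 1, key => $1, value => $2 };
    }
    return { ok => 0, error => "not a key=value pair: $text" };
}

my $result = parse_pair('color=red');
my $report = $result->{ok}
    ? "$result->{key} is $result->{value}\n"
    : "Error: $result->{error}\n";
print $report;
```

The caller tests a single field and then knows which of the other fields are meaningful.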

6 Suggestions to remember when using objects

Perl 5 introduced many important advantages over Perl 4. Maybe the most important of them is the support for object oriented programming. However, as we will soon see, Perl's OOP implementation is neither complete nor clean, if we follow the traditional, formal criteria for defining objects. Using objects correctly as they are implemented, however, can be very convenient, once the particularities are taken care of.

To work optimally with objects, please refer to perldoc perlobj, perldoc perltoot and perldoc perlbot.

6.1 Living with a patchy object implementation

Perl's OOP syntax often forces the programmer to write more than what he would in other languages, and for non-expert eyes, the style is not very clear, often even confusing. For example, we refer to an object's attribute by using $obj->{attr}, but we refer to an object's method with $obj->meth(). A common error is to try $obj->attr, which Perl will translate to a method call. In case the attr method exists, it will be called and its return value will be given to the caller, probably leading to unexpected behavior. If the method does not exist, this will lead to a runtime error, aborting our program execution.

In traditional object oriented languages we are used to having private and public attributes and methods. Perl invites us not to think that way. Perl invites us to make everything public. To paraphrase Larry Wall: I will let you enter my house, and I trust you not to steal my belongings because I trust you, and not because I have a security surveillance system at home. Yes, there are ways of implementing private methods and attributes, but they look more like black magic and contortionism than plain method calls.

6.2 Checking for required and optional attributes

When we create an object, it is important to check that we have all the required parameters, and that we were not invoked with a parameter we don't know how to handle. To do so, I suggest including something like this in your constructor methods:

my @needed   = qw(color size type);
my @optional = qw(texture temperature);
my %valid    = map { $_ => 1 } (@needed, @optional);

# Every required attribute must be present
foreach my $attr (@needed) {
    die "I need $attr to create the object!" unless defined $self->{$attr};
}

# Nothing outside the required and optional sets may be present
if (my @unknown = grep { !$valid{$_} } sort keys %$self) {
    die "Unknown elements at object initialization: @unknown";
}
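
To see this kind of check in context, here is a minimal sketch of a complete constructor together with three calls to it. The class name Shape and the extra attribute wheels are hypothetical; the required and optional attribute names are the ones used above. The failing calls are wrapped in eval so the example keeps running:

```perl
use strict;
use warnings;

{
    package Shape;    # hypothetical class name

    my @needed   = qw(color size type);
    my @optional = qw(texture temperature);

    sub new {
        my ($class, %args) = @_;
        my $self  = { %args };
        my %valid = map { $_ => 1 } (@needed, @optional);

        foreach my $attr (@needed) {
            die "I need $attr to create the object!\n"
                unless defined $self->{$attr};
        }
        if (my @unknown = grep { !$valid{$_} } sort keys %$self) {
            die "Unknown elements at object initialization: @unknown\n";
        }
        return bless $self, $class;
    }
}

# Required attributes plus one optional attribute: accepted.
my $ok = Shape->new(color => 'red', size => 2, type => 'cube', texture => 'rough');
print "Created a $ok->{color} $ok->{type}\n";

# A required attribute is missing.
eval { Shape->new(color => 'red', size => 2) };
my $missing = $@;
print "Rejected: $missing";

# An attribute nobody knows how to handle.
eval { Shape->new(color => 'red', size => 2, type => 'cube', wheels => 4) };
my $unknown = $@;
print "Rejected: $unknown";
```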

7 Suggestions to remember when using modules and libraries

A large project can be managed much better if it is built using modules, splitting a large program into many separate, function-specific files. This has the additional advantage of making it much easier to reuse code in other projects.

Besides modularizing our own code, in Perl we will frequently use modules built by other people: a large collection of modules, contributed voluntarily by Perl developers worldwide, is organized in CPAN.

In Perl we often talk about modules and libraries. They are very similar, but with one subtle yet important difference: A library is only a collection of functions, whereas a module is encapsulated in its own namespace, and is often designed with object orientation in mind. Of course, as we will soon see, we can call a module as if it were a library or the other way around.

7.1 Packages and namespaces

It is a very common and advisable practice to use separate namespaces when we work with modules. This keeps the global functions and variables of one of our modules from colliding with others of the same name defined in the main program or in any other module.

The default namespace is main, and by default, all symbols (functions, variables and filehandles) in Perl are prepended with their namespace's name behind the scenes. The real names of $var, &func and FILE are $main::var, &main::func and main::FILE. If we specify a null namespace (i.e. $::var), Perl will translate it to an explicit reference to $main::var.

With the package command, we can change the namespace where we are working, as shown here:

$var = 'value';
$Other::var = 250;
print $var; # 'value'

package Other;
print $var; # 250
print $main::var; # will always be 'value'
print $Other::var; # will always be 250
print $::var; # will always be 'value'

Lexical variables (those declared with my) do not belong to any namespace, and are not affected by packages.
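
A small sketch can make this visible. The package name Other and the variable names are hypothetical; the last two lines look the names up in main's symbol table (the %main:: hash):

```perl
use strict;
use warnings;

$main::var = 'package variable';    # lives in the main namespace
my $lex    = 'lexical variable';    # lives in no namespace at all

{
    package Other;    # hypothetical package name

    # Switching packages does not hide the lexical: its visibility follows
    # the enclosing scope (here, the whole file), not the current package.
    print "Inside Other: $lex\n";

    # The package variable, on the other hand, needs its full name here.
    print "Inside Other: $main::var\n";
}

# The symbol table of main has an entry for 'var' but none for 'lex'.
print "main has var: ", (exists $main::{var} ? "yes" : "no"), "\n";
print "main has lex: ", (exists $main::{lex} ? "yes" : "no"), "\n";
```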

7.2 Methods for the inclusion of modules and libraries

There are three methods for including code files: do, require and use. Files included with the first two are libraries, and files included with use are modules.

7.2.1 do

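Its effect is to evaluate the file's contents at the point in execution where it is called, much as if the file's text had been included there. Quoting from the definition of do in perldoc perlfunc, do 'file.pl' is equivalent (although more efficient) to doing scalar eval `cat file.pl`. One difference to keep in mind is that code evaluated with do cannot see the lexical variables of the enclosing scope. Each time execution reaches a do, the file is read and evaluated again, so it should not be used inside loops.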

The value returned from a file's inclusion is that of the last expression evaluated in the file. If the file could not be read, the return value will be undef, and the special $! variable will hold the error message. If the file was successfully read but could not be compiled, the return value will be undef and the error will be received in the special $@ variable.

do is normally used for reading configuration files. Adapting an example from perldoc perlfunc:

my $file = '/usr/local/etc/myconf';
my $return;
unless ($return = do $file) {
    warn "couldn't parse $file: $@" if $@;
    warn "couldn't do $file: $!" unless defined $return;
    warn "couldn't run $file" unless $return;
}
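
As a self-contained sketch of this use, the following writes a small configuration file to a temporary location and reads it back with do. The file's contents and the keys host and port are hypothetical. Since the file is executed as Perl code, a real configuration file read this way should only be writable by trusted users:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small, hypothetical configuration file. The last expression it
# evaluates is a hash reference, which becomes the return value of do.
my ($fh, $file) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print $fh <<'END_CONFIG';
+{
    host => 'localhost',
    port => 8080,
};
END_CONFIG
close $fh or die "Cannot close $file: $!";

my $config = do $file;
unless (defined $config) {
    die "couldn't parse $file: $@" if $@;    # read, but failed to compile
    die "couldn't do $file: $!";             # could not be read at all
}

print "Connecting to $config->{host}:$config->{port}\n";
```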

7.2.2 require

require reaches a bit beyond what do does. The main differences are:

7.2.3 use

This inclusion method was born with Perl 5, and was designed for use with modules and objects. When we include a file with use, in addition to require's behavior: