Security in Perl scripts

Jesus Alejandro Juárez Robles
alex@campus.iztacala.unam.mx
http://www.openbsd.org.mx/~alex/

Gunnar Eyal Wolf Iszaevich
gwolf@campus.iztacala.unam.mx
http://www.gwolf.cx/

FES Iztacala - Universidad Nacional Autónoma de México

1. Introduction

1.1 What is Perl?

Perl is an all-purpose, easily extensible programming language that has become quite popular in recent years. Both the language's specification and its implementation are free, which has led to Perl being used not only as a standalone language, but also as a glue language and an embedded language.

Perl has important characteristics that help us create safe code, but it also has many subtleties that can easily work against us if we are unaware of them.

1.2 Good things about Perl

Regular Perl programmers speak wonders about many aspects of this language: we like how it can be written, with flexible syntactic rules modeled after natural languages. We like the richness of the language. We like a great many things about it. However, which of them are relevant security-wise? Mainly:

1.2.1 Untyped variables

In Perl, we do not have to worry about declaring variables as having a specific type or magnitude. Perl will automatically take care of using the right size and type for our data, and converting it if necessary to any other type.

The same goes for arrays and hashes: when we declare an array, we do not need to state how many elements it will have, nor worry about the data type of each of them. Hashes and arrays are of variable length, so Perl will not make us spend memory on yet-unused locations. Finally, while multidimensional arrays are not natively supported in Perl, we can painlessly emulate them using references.
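As a small illustration of the points above, the following sketch grows an array without ever declaring its size and emulates a two-dimensional table with array references. The variable names are made up for this example.

```perl
use strict;
use warnings;

# Arrays grow on demand and may mix data types.
my @mixed = (1, 'two', 3.0);
push @mixed, 'four';                    # no size was ever declared

# A "two-dimensional array" is an array of array references.
my @table = (
    [1, 2, 3],
    [4, 5, 6],
);
$table[2][0] = 7;                       # the third row springs into existence

print scalar(@mixed), " elements\n";    # 4 elements
print "cell = $table[1][2]\n";          # cell = 6
print scalar(@table), " rows\n";        # 3 rows
```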

1.2.2 Automatic memory management

Many languages require the programmer to manually manage dynamically allocated memory. In Perl, memory is automatically assigned when it is needed, and the language's own garbage collection mechanisms will reclaim any space which is no longer referenced (and therefore no longer used). This means we don't have to worry about the feared buffer overflows, array sizes, the complexity of data structures and other details. This not only makes development time much shorter, but also helps avoid human mistakes, which are so common in tedious, repetitive processes.

1.2.3 High extensibility

Perl is a tremendously extensible language. There are modules already written that allow us to do practically everything, and Perl favors the reuse of known and trusted code without binding us to programming paradigms that are not always practical (as Java does by making everything into an object, which is sometimes quite awkward and artificial).

One of Perl's main strengths is the CPAN (Comprehensive Perl Archive Network), a very large repository of Perl modules covering practically every area of development. It is a very good idea to periodically check the CPAN during a project's development cycle, as perhaps we will find the answers to many problems we will run into. This will surely save us long hours of development and debugging.

1.2.4 Quick compilation

Although many people see this as a drawback instead of a good thing, I think this is one of the most important factors to Perl's success.

Although many people seem to think that Perl is an interpreted language, this is not true: Perl is a truly compiled language. The compilation does, however, take place in memory each time the program is run. We can (using the B and O modules, and the perlcc program) compile in different ways, producing binary files, although this is not really faster than compiling our program at runtime, thanks to the highly optimized Perl compiler.

Quick compilation saves a lot of time when programming, and even more while debugging. I have participated in projects of several tens of thousands of lines (plus the included modules), and compilation time at startup was only a couple of seconds. We were able to test even our smallest changes without long delays. I cannot imagine how much we would have been slowed down had we done this project in a language such as C or Java.

1.3 Bad things about Perl

Even though I like talking about Perl (and I like even more using it), not all is perfect in this great language. There are many dangerous points which we must keep in mind. Some of them are:

1.3.1 A lot of simplicity and power --- Maybe too much

Perl allows us to easily interact with the operating system, and the programmer may forget to check the results of these actions, or to correctly validate their arguments. This can lead to very severe security problems.
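A minimal sketch of what checking the results of our actions can look like. The helper name first_line and the file path are hypothetical; the point is only that every call which talks to the operating system has its return value inspected.

```perl
use strict;
use warnings;

# Hypothetical helper: read the first line of a file, checking every step.
# Returns (line, undef) on success or (undef, message) on failure.
sub first_line {
    my ($path) = @_;
    # Three-argument open keeps odd characters in $path from being
    # interpreted as an open mode or a pipe.
    open(my $fh, '<', $path)
        or return (undef, "cannot open $path: $!");
    my $line = <$fh>;
    close($fh)
        or return (undef, "cannot close $path: $!");
    return ($line, undef);
}

my ($line, $err) = first_line('/nonexistent/example.txt');
print defined $err ? "Error: $err\n" : "Read: $line";
```

Returning the error message instead of dying leaves the decision about how to react to the caller.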

1.3.2 Non-prototyped functions

Not typing functions is a great quality in Perl. As variables are untyped, a single function can behave differently according to its arguments, to its context, and basically to whatever the programmer sees fit. However, although this gives great flexibility, it has two important drawbacks:

The programmer must provide the intelligence to ensure the function was invoked with the right parameters in the right format
It makes the code harder to read and maintain

Note that modern versions of Perl do offer prototyped functions (see Prototypes in perldoc perlsub), although the syntax is somewhat awkward and is not widely used. Quoting from this document:

Some folks would prefer full alphanumeric prototypes. Alphanumerics have been intentionally left out of prototypes for the express purpose of someday in the future adding named, formal parameters. The current mechanism's main goal is to let module writers provide better diagnostics for module users. Larry feels the notation quite understandable to Perl programmers, and that it will not intrude greatly upon the meat of the module, nor make it harder to read. The line noise is visually encapsulated into a small pill that's easy to swallow.
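To give an idea of what the prototype notation looks like, here is a small sketch. The function add is made up for this example; the ($$) prototype states that it takes exactly two scalar arguments.

```perl
use strict;
use warnings;

# The ($$) prototype declares that add() takes exactly two scalars.
# A prototype only affects calls compiled after the declaration.
sub add ($$) {
    my ($x, $y) = @_;
    return $x + $y;
}

print add(2, 3), "\n";    # 5

# A call with the wrong number of arguments is rejected at compile time.
# Compiling it inside a string eval lets us look at the error without dying.
my $ok  = eval q{ add(1, 2, 3); 1 };
my $err = $@;
print "Compile error: $err" unless $ok;
```

Note that prototypes are ignored when the function is called with the & prefix or as a method.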

1.3.3 Objects are essentially just a patch

Starting with Perl 5 we can truly do object-oriented programming in Perl. It is, however, easy to notice that this implementation is essentially a patch, although quite a nice one, rather than a clean implementation of objects. Object-oriented programming in Perl usually involves many steps that we can skip (or not even think about) in other languages, and these become the kind of repetitive, tedious task that Perl usually helps us avoid.
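The manual steps mentioned above can be seen in even the smallest class. The following is only a sketch; the class name Counter and its methods are hypothetical.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class name

# The constructor is written by hand: build a hash, then bless it.
sub new {
    my ($class, %args) = @_;
    my $self = { count => $args{start} || 0 };
    return bless $self, $class;
}

# Every method unpacks its own invocant from @_.
sub increment {
    my ($self) = @_;
    return ++$self->{count};
}

package main;

my $c = Counter->new(start => 5);
$c->increment;
print $c->increment, "\n";    # 7
```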

1.4 Types of Perl

Being faithful to the TIMTOWTDI philosophy (There Is More Than One Way To Do It, one of Perl's main mottos), there are many ways of using Perl. The most common ones are:

1.4.1 Traditional Perl

The Perl binary is called together with the program, in order to compile and execute it. This can be done using the shebang syntax, giving Perl the script name as an argument, giving Perl the whole script as an argument after a -e, sending the whole script via STDIN to Perl, and probably in many other ways.

Perl's compiler is highly efficient. However, compilation time does cost computer resources. If a program is to be executed frequently, compilation time can become very important.

Practically all Perl modules have been written with traditional Perl in mind, and will probably work better with it.

1.4.2 Perl as a module or part of another application

Perl was originally conceived as a glue language, made to work together with other applications and languages, and has therefore been turned into modules for different applications and into an embedded language. Probably the best example of this is mod_perl, which embeds Perl in the popular Apache web server.

mod_perl is an answer to the problem stated above. A copy of Perl will reside in the Apache server process. Apache, of course, will have a substantially bigger memory footprint than if it did not include Perl, and takes some extra milliseconds to initialize it. However, if a program is requested repeatedly (e.g., a frequently used CGI script), its compiled image will be ready in memory, and execution times will be much lower.

There are many ways of using mod_perl. Probably the most frequently used one (although it cannot use important capabilities of mod_perl) is through the Apache::Registry module, which allows us to run, practically unmodified, CGIs written in the traditional style.

On the other hand, we have Mason, ePerl and other similar alternatives, allowing us, in the purest PHP and ASP style, to embed our code in HTML.

The best way, however, to take full advantage of mod_perl is to write Apache modules directly. This gives us the power to take part in each stage the server goes through, starting with Apache's initialization and configuration, and continuing with each stage of the request handling cycle. The whole Apache API will be at our disposal if we use mod_perl. There are many things that can only be done in Perl or in C, and most of you will agree that it is usually more convenient to do them in Perl.

1.5 Languages comparable with Perl

It is not my intention to start a holy war on programming languages, but I do think that one of the best ways to define the factors that make a programming language unique is to compare it to others. Of course, comparing to each programming language would be tedious and ridiculous. I will compare it, then, with the ones it usually competes with.

1.5.1 Shell

Perl is a wonderful language that can help us with our system administration tasks. In this area, of course, the logical alternative would be using the different Unix shells (Bourne, Korn, C-Shell and their derivatives). These languages qualify perfectly as general-purpose programming languages, and are powerful enough to carry out almost any simple task. However, when we reach intermediate-sized problems, we will quickly find it impossible to code them cleanly: variables and functions are global, we do not have even the most basic data structures (and it is quite complex to implement them), and it is very difficult to maintain a modular programming style.

1.5.2 PHP

From its very beginning, PHP was meant to help easily design interactive Web pages (Perl became a top choice for interactive Web pages practically since they were first conceived), although there are already ways to use PHP for administration scripts and many other roles. PHP clearly exhibits its heritage: from the start it was meant to improve on many aspects of Perl, allowing Web designers to easily do basic programming.

PHP's syntax is very similar to Perl's. Perl users will always find something lacking in PHP. PHP will, unlike Perl, almost always appear scattered among HTML code - Perl users often frown at this practice, as we find it to be dirty.

PHP is a purely interpreted language, while Perl compiles the whole program into memory. This often makes PHP debugging slower and more complicated.

Typically, executing a PHP program is quicker than calling a CGI script, as PHP's interpreter is embedded in the Web server and does not need to be loaded each time the program is invoked. This trend is clearly reversed if we use mod_perl with Apache, which besides boosting execution speed gives us much greater functionality and control over the Web server.

Although PHP is quickly maturing into a serious programming language, its design shows its much humbler beginnings. Many parts of the language do not have a coherent syntax, and it is not hard to tell which parts of the language were designed by different people.

1.5.3 Python

Python is probably the language with which Perl most directly "competes". Python shares many of Perl's characteristics, but it has a fundamental design difference: while one of Perl's most often quoted philosophical mottos is TIMTOWTDI (There Is More Than One Way To Do It), Guido van Rossum (Python's creator) prefers to say that there should be only one way to do it.

Python emphasizes code readability, using much clearer and cleaner constructs, and often avoiding the creation of unmaintainable code. Many Perl fans complain about Python's lack of naturalness: Perl was designed not by a computer scientist but by a linguist, Larry Wall, who modeled his creation to be as close as possible to a natural language.

1.5.4 Java

Java is a poorly understood concept, created at Sun Microsystems. Java's goal is to have a universally portable system, compiling source code to an intermediate form called bytecode, which is later compiled at runtime in the destination computer using a Java Virtual Machine (JVM).

Java was conceived to allow for execution of remote, unchecked code (often inside a Web browser) as safely as possible. It includes a sandbox, which prevents an untrusted program downloaded from the network from carrying out certain actions which could jeopardize security.

In Java, everything we do must strictly follow the object oriented programming paradigm. Although it is true that OOP can be very useful for certain projects, making it mandatory for any application makes Java programming slow and tedious.

Java is unfortunately too slow for medium or large applications, and the different versions of the language in widespread use make the promise of "write once, run everywhere" largely a myth, as they are not completely compatible with each other.

1.5.5 Javascript

This language's name is quite misleading - Javascript has nothing to do with Java. Javascript is a specific purpose language, which runs as part of a Web browser. Javascript is strictly for client-side programming, and we typically use it to validate form input before sending it to the server to be processed, to allow for dynamic HTML pages, and similar tasks.

Javascript unfortunately also suffers from having a large amount of not completely compatible, widespread implementations. Even though the language's specification is quite clear and -as Javascript fans say- is quite elegant, using it is often very problematic. Many Web programmers prefer not using Javascript's advantages in order not to be victims of its incompatibilities.

It is very common to find CGI scripts in different languages generating Javascript code to be run at client side.

2 Avoiding insecure programming practices: the `strict` pragma

Pragmas are directives targeted at the Perl compiler, requesting it to act in a specific way for a certain part of the code, to modify what it accepts as valid code, or to change how it will carry out specific operations. Throughout this talk we will cover several different pragmas, but I do think that strict is not only among the most useful and important pragmas available in Perl; it is also the one that best illustrates what pragmas are.

An important note about pragmas: If our program spans various files, being divided in modules or libraries, activating a pragma in one of them does not automatically activate it in the other ones. It is a very good practice to start all of our modules or libraries activating the same pragmas, so that the compiler will show a consistent behavior in all of the code.

2.1 Introduction to `strict`

strict's purpose is to keep us from falling into insecure programming practices which, although sometimes very useful, often jeopardize our code. Perl usually accepts certain ways of using variables, subroutines or references that are not really secure. strict will not allow us to use them, aborting as soon as it detects we are doing something wrong (at compile time for most checks).

Yes, it is often awkward having the compiler complain and not letting us program as we like, but it is fundamental to use strict in any project we will use more than just a couple of times, as bad practices will surely come back at us and bite us.

2.1.1 Activating/deactivating `strict`

To ask the compiler to turn on the strict checks from the current point in the program on, we tell it to

use strict;

and if we want to deactivate this behavior from a certain point on, we ask the compiler:

no strict;

We can specify that we want to activate or deactivate just part of strict's functionality this way:

use strict 'vars';
use strict 'subs';
use strict 'refs';

If we don't specify a mode, it implies we are referring to all of them.

For further documentation on this pragma, read perldoc strict.
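Because pragmas are lexically scoped, a no strict placed inside a block only relaxes the checks until the end of that block. The following sketch illustrates this with the 'refs' mode; the function names hello and call_by_name are hypothetical.

```perl
use strict;
use warnings;

sub hello { return "hello, $_[0]" }

# Hypothetical helper: call a function in package main given its name.
sub call_by_name {
    my ($name, @args) = @_;
    no strict 'refs';               # relaxed only until the end of this block
    return &{"main::$name"}(@args);
}

print call_by_name('hello', 'world'), "\n";   # hello, world

# Outside that block all strict checks still apply, so the same trick fails.
# ('refs' violations are detected at run time, so a block eval can trap them.)
my $ok  = eval { my $name = 'hello'; &$name('again'); 1 };
my $err = $@;
print "Still strict here: $err" unless $ok;
```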

2.2 `strict 'vars'`: Variable scoping

This mode will make Perl generate a compile-time error if we try to use a variable which has not been declared with use vars, my or our, or referred to by its full name (including its namespace, e.g. $main::var instead of $var). Note that local does not declare a variable, so using local on an undeclared variable is rejected too.

To better understand this, let's briefly take a look at the possible variable scopes in Perl.

Note that strict vars does not complain about Perl's special global variables ($|, $_, $^W, etc.), which can be used regardless of the strict 'vars' setting.

2.2.1 Global scope

Default scope in Perl
Global variables are available anywhere in the program
If we want to use global variables while using strict, we can predeclare them this way:
```
use vars qw($var1 @var2 %var3);
```
The global scope is equivalent to explicitly specifying the main package. This means that instead of predeclaring the variables as shown above, we can refer to them as $main::var1, @main::var2 and %main::var3. We suggest, however, for the sake of clarity, predeclaring the variables instead of using the full names.

2.2.2 Global scope in a package - `our`

This scope type is available only from Perl 5.6.0 on.
Variables declared with our act as global variables inside a package. The package they belong to is the one named by the most recent package statement.

The following expressions are equivalent:

$myPackage::variable = 1;

and

package myPackage;
our $variable = 1;

Additionally, they are both valid syntaxes when working under strict vars.
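A short sketch of this equivalence, reusing the myPackage name from the text. It runs under strict, which shows that both spellings are accepted.

```perl
use strict;
use warnings;

package myPackage;
our $variable = 1;                  # declares $myPackage::variable

package main;
print $myPackage::variable, "\n";   # 1
$myPackage::variable++;             # the full name reaches the same variable

{
    package myPackage;
    print $variable, "\n";          # 2
}
```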

2.2.3 Local scope

The variables will be defined inside the current block and in any block called from it:

sub func {
    local $var = 1;
    func2(); # Here $var is defined, so func2 prints 1
    print $var; # We are in the block where we defined $var, so this prints 1.
}
func2(); # Here $var is undefined, so func2 will print nothing
sub func2 {
    print $var;
}

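local does not declare a variable, so under strict vars it is only accepted for variables that have already been declared (with our or use vars) or that are written with their full name. In fact, local is often regarded as a leftover from Perl 4: prefer my wherever you can, and keep local for temporarily overriding special variables such as $/ or $_.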

2.2.4 Lexical variables with `my`

Variables declared with my will only exist inside the block they were created in, and they will be destroyed (and their memory space will be reclaimed) as soon as this block is abandoned.

sub func {
    my $var = 1;
    func2(); # func2 cannot reach $var, which is undefined outside this block
    print $var; # We are in the block where we defined $var, so this prints 1.
}
func2(); # Here $var does not exist anymore, so func2 will print nothing
sub func2 {
    print $var;
}

This is the recommended scope for almost anything we do in our programs. Try to stick to it :)
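The difference between the two scopes can be sketched in a form that runs under strict. The variable is declared with our so that local has something to work on; all names are made up for this example.

```perl
use strict;
use warnings;

our $var = 'global';

sub show { return $var }            # always reads the package variable

sub with_local {
    local $var = 'localized';       # saved now, restored when the block exits
    return show();                  # show() sees the temporary value
}

sub with_my {
    my $var = 'lexical';            # a brand-new variable, invisible to show()
    return show();
}

print with_local(), "\n";   # localized
print with_my(), "\n";      # global
print show(), "\n";         # global
```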

2.2.5 Why should we avoid using global variables?

Different modules or functions interfering with each other
When we program, most of us are quite predictable. How many times have you used the variables $i, $tmp or $num? How many other functions we may use, written by third parties, will have similar variable names? If we use global variables, we will continuously have to take care not to interfere with variables used in other parts of our programs.
Code gets harder to maintain
Using global variables means we have to document their use globally, not only to avoid the above mentioned problems, but also to allow for future extensibility. Using variables with a more limited scope allows us to document their purpose at the beginning of the function or block they exist in.
Better memory administration
Perl's garbage collector is quite a good one: it reclaims the space used by our data as soon as our data is no longer needed. To know whether our data will ever be used again, Perl checks whether the data is still referenced by a variable. If we use global variables, the garbage collector will always find the storage space referenced, as these variables are never automatically destroyed. When using lexically scoped variables, as soon as we leave the block where they were created, the space used by them will be freed and reclaimed by the garbage collector.

2.2.6 Behavior of the global scope when using `strict`

If we try to use globally scoped variables without declaring them or giving their full name, Perl will generate the following errors:

use strict;
$var = 1234;
Global symbol "$var" requires explicit package name at programa.pl line 2.
# We get this error immediately, as it is generated at compile time.
Execution of programa.pl aborted due to compilation errors.
# Perl prints this once compilation has failed; the program is never executed.
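The same compile-time error can be observed from inside a running program by compiling a snippet with a string eval. This is only a sketch to show the behavior; the variable names are hypothetical.

```perl
use strict;
use warnings;

# Compile a snippet that uses an undeclared variable under strict.
my $ok  = eval q{ use strict; $undeclared = 1234; 1 };
my $err = $@;
print "Rejected: $err" unless $ok;

# The same assignment is accepted once the variable is declared.
my $value = eval q{ use strict; my $declared = 1234; $declared };
print "Accepted: $value\n";
```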

2.3 `strict 'subs'`: Non-explicit subroutines

strict 'subs' requires every bareword to be a valid function call, to be enclosed within curly braces, or to be on the left-hand side of a => operator. This means that the following forms are permitted:

$var{key}
(key1 => 'value1', key2 => 'value2')
function (as long as it is a valid function name and the function has already been declared)

The following will cause a compile-time error:

$var = value; (unless value is the name of an already declared function)
(key1 => value1, key2 => value2) (unless value1 and value2 are names of already declared functions)
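A brief sketch of both cases. The permitted forms run normally, while the forbidden one is compiled inside a string eval so that we can print the error instead of aborting.

```perl
use strict;
use warnings;

my %var = (key => 'value');     # a bareword before => is fine
print $var{key}, "\n";          # a bareword inside {} is fine

# A bareword used as a plain value is rejected at compile time.
my $ok  = eval q{ use strict 'subs'; my $v = value; 1 };
my $err = $@;
print "Rejected: $err" unless $ok;
```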

2.4 `strict 'refs'`: Symbolic references

A quite obscure characteristic of Perl 5 is the use of symbolic references. This is quite a nice and fun concept, but dangerous enough that, as soon as it was introduced, strict 'refs' was added to strict to prevent the abuse of this kind of reference.

2.4.1 What are symbolic references?

Unlike real references, which we will often find in Perl, symbolic references do not point to the memory address of a variable, but only to a variable's name. The best way of explaining this strange concept is through an example, taken directly from perldoc perlref:

$name = 'foo';
$$name = 1; # Stores 1 in $foo 
${$name} = 2; # Stores 2 in $foo 
${$name x 2} = 3; # Stores 3 in $foofoo
$name->[0] = 4; # Stores 4 in $foo[0] 
@$name = (); # Empties @foo
&$name(); # Calls the &foo() function
$pack = "THAT"; 
${"${pack}::$name"} = 5; # Stores 5 in $THAT::foo without the need for an eval!

2.4.2 And what is so bad about symbolic references?

They tend to make our code harder to understand. To prove it, just take a look at the above example (and remember it was taken straight off the manual!)
Their behavior might seem unpredictable. Symbolic references can only see package (global) variables, because lexical variables do not appear in the symbol table. For example, if we have a global $val holding 10, a lexically scoped $val holding 20, and a variable $ref holding the string 'val', then $$ref used inside the lexical block still refers to the global $val and yields 10, even though the lexical $val is the one in scope.
This behavior, in case it is truly required, can be achieved in a much clearer way through the use of eval or, more commonly, by storing the values in a hash keyed by name.
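The following sketch shows the surprising case described above, followed by the hash-based alternative. The names $ref, $seen and %setting are made up for this example.

```perl
use strict;
use warnings;

our $val = 10;          # package variable, visible in the symbol table
my  $ref = 'val';       # just a string holding a variable name

my $seen;
{
    my $val = 20;       # lexical variable, invisible to symbolic references
    no strict 'refs';
    $seen = $$ref;      # looks up $main::val, not the lexical $val
}
print "$seen\n";        # 10

# A hash keyed by name does the same job without symbolic references.
my %setting = (color => 'red');
my $name    = 'color';
print $setting{$name}, "\n";    # red
```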

3 Reporting `warnings`

Perl can warn programmers, via the standard error stream (STDERR), that they might be doing something wrong, inviting them to check the code and verify that they did not make an error that could lead to an important failure under certain circumstances.

Perl has been able to report warnings for a long time, although this behavior changed considerably with the introduction of Perl version 5.6.0.

3.1 Reporting warnings with Perl < 5.6.0

With Perl versions prior to 5.6.0, the behavior of the warnings report is defined by a switch specified at runtime, or by a special global variable, according to the following rules:

By default, the special global variable $^W has a false value.
If we invoke Perl with the -w switch, $^W is set to a true value. We can do this by calling our program as perl -w program.pl, or by making the first line of our program #!/usr/bin/perl -w.
If we invoke Perl with the -W switch, warnings are always reported, ignoring the value of $^W.
If we invoke Perl with the -X switch, warnings are never reported, ignoring the value of $^W.
Given these rules, warnings will be reported whenever $^W has a true value.
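As $^W is an ordinary (if special) variable, it can also be changed at run time. The sketch below, which assumes the program is run without the -W or -X switches and deliberately avoids the warnings pragma, counts the warnings produced by the same function with $^W off and on. The function add_one is hypothetical.

```perl
# No "use warnings" here on purpose: this sketch shows the older $^W mechanism.
use strict;

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };   # collect, do not print

sub add_one {
    my $x;              # deliberately left undefined
    return $x + 1;      # "Use of uninitialized value" when $^W is true
}

{
    local $^W = 0;      # warnings off inside this block
    add_one();
}
my $quiet = @caught;

{
    local $^W = 1;      # warnings on inside this block
    add_one();
}
my $noisy = @caught - $quiet;

print "warnings with \$^W off: $quiet, on: $noisy\n";
```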

3.2 Using lexical `warnings` with Perl >= 5.6.0

Under Perl 5.6, this behavior still works, but a new, more flexible and powerful mechanism has been introduced: defining the warnings as a pragma. Please refer to perldoc perllexwarn for further information.

Please remember that, unlike the switches mentioned above, pragmas apply only to the file where they are declared.

Activating warnings report as a pragma allows us to activate or deactivate individually the following categories of warnings (taken from perldoc perllexwarn): chmod, closure, exiting, glob, io (which is further subdivided in closed, exec, newline, pipe and unopened), misc, numeric, once, overflow, pack, portable, recursion, redefine, regexp, severe (subdivided in debugging, inplace, internal and malloc), signal, substr, syntax (subdivided in ambiguous, bareword, deprecated, digit, parenthesis, precedence, printf, prototype, qw, reserved and semicolon), taint, umask, uninitialized, unpack, untie, utf8, void and y2k.

Each of these categories is activated or deactivated individually. For example, if we want to activate warnings on symbols used only once and on unopened filehandles being used, and we want to ignore the recursion and uninitialized value warnings, we can use:

use warnings qw(once unopened);
no warnings qw(recursion uninitialized);

Additionally, we can raise the level of these categories to make them fatal errors. For example, if we want Perl to die instead of just warn when we redefine a function (that is, when we define two functions with the same name), we can use:

use warnings FATAL => qw(redefine);

If we define warnings report behavior using pragmas, it will have precedence over the behavior of the $^W variable or the -w switch. However, if we invoke Perl with the -W or -X switches, they will have precedence.
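A sketch of how lexical warnings can be tuned per block. The two functions are hypothetical; one promotes the uninitialized category to a fatal error and the other silences it.

```perl
use strict;
use warnings;

sub strict_sum {
    # Promote one category to a fatal error, only inside this sub.
    use warnings FATAL => qw(uninitialized);
    my ($x, $y) = @_;
    return $x + $y;
}

sub lenient_sum {
    # Silence the same category here; undef is treated as 0.
    no warnings qw(uninitialized);
    my ($x, $y) = @_;
    return $x + $y;
}

my $result = eval { strict_sum(1, undef) };
my $err    = $@;
print "Died: $err" unless defined $result;
print lenient_sum(1, undef), "\n";   # 1
```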

4 Handling tainted data

Most programs are not limited to internal data processing or data generation: they take some data set as their input and generate another data set as their output. This is natural, but can become quite problematic and dangerous, especially when this data can affect the execution of our program.

For more detailed information on this subject, please check the official documentation: perldoc perlsec.

4.1 What is tainted data?

Perl has a mode in which it refuses to let tainted data be used externally without being cleaned first. In other words, it will complain whenever we try to do something that has an effect outside our program's execution space using data that has not been validated first.

Data is considered to be tainted if it comes from outside the program, for example from:

Direct interaction with the user
Environment variables
Parameters received from a Web form

And Perl will refuse to use tainted data in:

Any command that invokes a shell process
Any command which modifies files or directories
Any command that interacts with the process table

Refusing to use it means that if we try to do one of these restricted operations with tainted data, Perl will raise a runtime exception, which will kill the running process (unless it is trapped with eval).

4.2 Activating tainted data reports

Perl will follow the behavior described above if any of the following holds:

We specify the -T switch when we run the program, as in perl -T program.pl
The first line of our program is #!/usr/bin/perl -T
The program is running with either the SUID or SGID bits on

Once we enter tainted mode, it will not be possible to leave it - The whole execution of our program will be carried out with taint checks.
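A program can find out whether it is running with taint checks through the read-only special variable ${^TAINT} (available in Perl 5.8 and later). The helper taint_mode_name below is a hypothetical name used for this sketch.

```perl
use strict;
use warnings;

# ${^TAINT} is 1 under -T, -1 when only taint warnings are enabled (-t),
# and 0 when taint checks are off.
sub taint_mode_name {
    my ($flag) = @_;
    return $flag > 0 ? 'enabled'
         : $flag < 0 ? 'warnings only'
         :             'disabled';
}

print "Taint checks: ", taint_mode_name(${^TAINT}), "\n";
```

A program that must never run unprotected could refuse to continue when the value is not 1.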

4.2.1 Detecting tainted data

To detect tainted data, we can use this function (taken from perldoc perlsec):

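sub is_tainted {
    # Under taint checks, using tainted data in kill raises an exception,
    # so the eval fails exactly when one of the arguments is tainted.
    return ! eval {
        join('', @_), kill 0;
        1;
    };
}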

We can also use the Taint module (available at CPAN) this way:

use Taint;

warn "Tainted data" if tainted($var1, @var2, $var3, %var4);

This way we can detect tainted data before it triggers a fatal exception.

Certain values will always be tainted, as they come from the outside world. For example, the execution path ($ENV{PATH}) is received from the invoking process, and Perl cannot trust it to be safe. Any external program we execute from within Perl (with system(), exec(), qx(), backticks, etc.) will require $ENV{PATH} to be untainted (cleaned) before it is allowed to run.

To untaint a value, we match it against a regular expression and keep only the captured text:

$data = <STDIN>; # $data contains tainted data, as it comes from the user

if ( $data =~ /^(\w+)$/ ) { # Accepts only word characters (letters, digits and underscore)
    $clean_data = $1; # $1 contains the text which matched the regular expression
} else {
    die "I did not expect this condition: $data";
}
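The same idea can be wrapped in a small function with a deliberately narrow pattern. This is only a sketch: the name untaint_filename and the set of accepted characters are assumptions made for this example, and a real program should choose the pattern that fits its own data. The last two lines follow the environment clean-up suggested in perldoc perlsec.

```perl
use strict;
use warnings;

# Hypothetical helper: return a clean copy of $input if it looks like a
# simple file name (no slashes, no leading dot), or undef otherwise.
sub untaint_filename {
    my ($input) = @_;
    return unless defined $input;
    if ($input =~ /\A([A-Za-z0-9_][A-Za-z0-9_.-]*)\z/) {
        return $1;      # captured text is considered untainted
    }
    return;
}

# Give external commands a known, minimal environment.
$ENV{PATH} = '/bin:/usr/bin';
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

for my $candidate ('report-2024.txt', '../etc/passwd') {
    my $clean = untaint_filename($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected: $candidate\n";
}
```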

4.2.3 A warning about carelessly untainting

After what we just said, this might sound tempting:

$data = <STDIN>;
$data =~ /^(.+)$/;
$data = $1;

However, accepting any data without checking it defeats the purpose of taint checks. It is not impossible for us to want to do something like this, but it is very important to double- and triple-check whether this is really our only way out.

Carelessly untainting data can be very harmful. Many computer security schemes of all kinds fail because they make users believe they are completely safe, even if the schemes only help raise the overall system security a little. Taint checks are a tool that helps programmers make sure they are not forgetting to check something, and circumventing this mechanism would be completely pointless.

5 Suggestions to remember when using functions

There are many subtleties concerning functions and their use. Among the most important ones:

5.1 Invoking functions

If we are not using strict, Perl will allow us to invoke functions directly, giving only the function's name without explicitly stating that we are doing a function call. Although this often seems elegant and clean, it can confuse people reading our code, and the name can clash with a reserved word in a future Perl version. Functions should preferably be called as function() or function($arg1,$arg2), giving an argument list even if it is empty, to make it clear that we are talking about a function call.

Many people like adding the prefix & to function calls. This is a syntax inherited from old Perl versions and is not needed anymore. If you choose to use this syntax, it is also very important to explicitly give the argument list even if it is empty (&function()). If we call it as &function instead, the function will inherit the argument list (@_) of the function we are calling it from.
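The difference between the three call styles can be seen in a short sketch. The functions count_args and caller_sub are made up for this example.

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

sub caller_sub {
    my @results;
    push @results, count_args();    # explicit empty list: 0 arguments
    push @results, &count_args();   # explicit empty list: 0 arguments
    push @results, &count_args;     # no list at all: reuses caller_sub's @_
    return @results;
}

print join(', ', caller_sub('a', 'b', 'c')), "\n";   # 0, 0, 3
```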

5.2 Receiving parameters

It is very common for the first line of a function to be:
my ($var1, $var2, $var3) = @_;
or
my $var1 = shift; my $var2 = shift; my $var3 = shift;

This syntax is completely correct. However, if we just assume the function was correctly called, we can end up with undefined parameters or ignored arguments. It is very advisable to check each of the received parameters, looking for undefined values or incorrect data types (e.g., a variable containing text where we should have a numeric value, or a scalar value where we expect a reference). Having incorrect data can cause erratic and hard to trace behavior.
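A minimal sketch of such checks. The function repeat_text and its error messages are hypothetical; it returns a (result, error) pair so that the caller decides how to handle bad input.

```perl
use strict;
use warnings;

# Hypothetical function: repeat $text $times times.
# Returns (result, undef) on success or (undef, message) on bad input.
sub repeat_text {
    my ($text, $times, @extra) = @_;
    return (undef, 'too many arguments')          if @extra;
    return (undef, 'text is required')            unless defined $text;
    return (undef, 'text must be a plain scalar') if ref $text;
    return (undef, 'times must be a non-negative integer')
        unless defined $times && $times =~ /\A\d+\z/;
    return ($text x $times, undef);
}

my ($out, $err) = repeat_text('ab', 3);
print "$out\n";                         # ababab

($out, $err) = repeat_text('ab', 'many');
print "Error: $err\n";                  # Error: times must be a non-negative integer
```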

5.2.1 Note on the reason of having `@_`

Many people ask why Perl uses a default array (@_), something not done by any other modern programming language. A simple example will better illustrate this:

In C, we declare a function like this:
int funct (int var1, char* var2);

In Python:
def func (var1, var2):

In PHP:
function func ($var1, $var2) { (...) }

And the list goes on for practically any other language. Why do we have to juggle around in Perl as we discussed on the previous section, manually assigning variables from a default array called @_?

Believe it or not, this is done by design. In Perl's first incarnations lexical variables did not exist. If we were to declare a function together with its variable names, this would affect the global namespace. Rather than polluting namespaces, the Perl architects decided to give arguments one standard and well-defined name: the default array, or @_. In fact, if we are writing a short function and it will not make our code unreadable (and even more so if the function will be called often), it is advisable to use the default array directly inside the function. This way, we do not have to worry about manually creating and assigning variables, saving some microseconds. Yes, some microseconds may be very little in most cases, but it can become quite noticeable in some circumstances.
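A sketch of short functions that work on @_ directly. Keep in mind that the elements of @_ are aliases to the caller's variables, so assigning to them changes the caller's data. The function names are made up for this example.

```perl
use strict;
use warnings;

# Short function working on @_ directly, without copying the arguments.
sub max2 { return $_[0] > $_[1] ? $_[0] : $_[1] }

# Assigning to the elements of @_ modifies the caller's variables.
# Handy when intended, but easy to do by accident.
sub double_in_place { $_ *= 2 for @_; return }

print max2(3, 8), "\n";     # 8

my @nums = (1, 2, 3);
double_in_place(@nums);
print "@nums\n";            # 2 4 6
```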

5.3 Handling internal variables

We have gone over this a couple of times already. It is very important to make all of our variables (or at least as many as possible) lexically scoped, declaring them with my. This keeps the data in our function from affecting other functions, and it will optimize memory usage.
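
The following sketch shows the kind of interference a lexical variable avoids. The names $total, careless_sum and careful_sum are hypothetical:

```perl
use strict;
use warnings;

our $total = 100;    # package (global) variable used elsewhere in the program

# Hypothetical function that does not declare its working variable:
# it silently overwrites the global $total.
sub careless_sum {
    $total = 0;
    $total += $_ for @_;
    return $total;
}

# Same logic with a lexical variable: the global is left alone.
sub careful_sum {
    my $total = 0;
    $total += $_ for @_;
    return $total;
}

my $careful_result  = careful_sum(1, 2, 3);
my $after_careful   = $total;    # the global still holds its old value
my $careless_result = careless_sum(1, 2, 3);
my $after_careless  = $total;    # the global has been overwritten
print "$after_careful $after_careless\n";
```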

5.4 Being careful when using references - `ref`

When we use references to pass arguments to a function or to return values from it, it is very important to check that they actually are references, and that they refer to the right data type. It is invariably better to return from the function with an error message and let the invoking program or function handle the condition than to raise a run-time error that aborts execution or, maybe even worse, to go on handling incorrect data.

To detect these errors in time, we can resort to the ref builtin function, which behaves this way:

`$var`	`ref($var)`
`\\\$something`	`'REF'`
`\$data`	`'SCALAR'`
`\@arr` or `[1,2,3]`	`'ARRAY'`
`\%hash` or `{a=>1,b=>2}`	`'HASH'`
`\&func`	`'CODE'`
`\*other`	`'GLOB'`
`$non_ref`	`''` (empty string, not `undef`)
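
A minimal sketch of such a check, assuming a hypothetical function sum_list that expects a single array reference:

```perl
use strict;
use warnings;

# Hypothetical function: returns (sum, undef) on success
# and (undef, error message) on failure.
sub sum_list {
    my ($list) = @_;
    unless (ref($list) eq 'ARRAY') {
        my $got = ref($list) || 'a non-reference';
        return (undef, "expected an ARRAY reference, got $got");
    }
    my $sum = 0;
    $sum += $_ for @$list;
    return ($sum, undef);
}

my ($sum)                = sum_list([1, 2, 3]);
my (undef, $hash_error)  = sum_list({ a => 1 });
my (undef, $plain_error) = sum_list(42);
print "$sum\n$hash_error\n$plain_error\n";
```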

5.4.1 `GLOB` references

A special reference type is the GLOB reference. It is not a reference to a scalar, an array, a hash or code, but rather a reference to a typeglob: the symbol table entry that holds everything sharing a given name.

Namespaces don't only manage the data types just mentioned. They also manage filehandles, which do not have an identifying prefix and are not easy to pass between functions. If we want, for example, to pass a filehandle as an argument, the traditional way is to do it via a glob: func(*STDOUT), or func(\*STDOUT) for a true GLOB reference. func can then use its first argument wherever a filehandle is expected.

Yes, back in their day GLOBs were a necessary alternative, and even an elegant way of doing something the language needed. Nowadays, however, the best use of GLOBs is to confuse fellow programmers. If you want to pass filehandles between functions, it is often better to use the native object-oriented implementation of open, through IO::Handle (or one of its subclasses, such as IO::File or IO::Socket):

use IO::File;

if (need_open()) {
    my $handle = new IO::File;
    if ($handle->open('

Here, $handle has all the attributes of a lexical variable: We can pass it to the use_file function as a parameter, it automatically disappears when leaving the block where it is declared (cleanly closing it, of course, thanks to the object's destructor), etc.
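
A self-contained sketch of the same idea. The file used here is a temporary file created only for the illustration, and the body of use_file is a hypothetical stand-in; neither is taken from the original example.

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw(tempfile);

# Hypothetical stand-in for use_file: counts the lines readable from a handle.
sub use_file {
    my ($handle) = @_;
    my $lines = 0;
    $lines++ while defined $handle->getline;
    return $lines;
}

# Create a small temporary file to read from (illustration only).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "first\nsecond\n";
close $tmp_fh or die "cannot close $tmp_name: $!";

my $line_count;
{
    my $handle = IO::File->new;
    if ($handle->open($tmp_name, 'r')) {
        $line_count = use_file($handle);   # passed like any other scalar
    } else {
        die "cannot open $tmp_name: $!";
    }
}   # $handle goes out of scope here and the file is closed

print "$line_count\n";
```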

5.5 Returning results

When we leave a function, either after a successful or an unsuccessful operation, we should return a result, as simple and coherent as possible.

Every function should have a consistent way of expressing success or failure, and the function call should handle this result accordingly.

When we successfully complete a function, we should find the simplest way of returning the processed data - For example, sometimes it will be much simpler to process multiple results of a function call if they are all grouped in a single reference to an array, or better yet, to a logical structure reflecting their relationships, instead of sending them back as a collection of unrelated scalar values.
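
One possible convention is sketched below: return a hash reference describing the result on success, and undef on failure. The function parse_account and its colon-separated input format are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical function: parses "login:uid:home" into a structure.
# Returns a hash reference on success, undef on failure.
sub parse_account {
    my ($line) = @_;
    return unless defined $line;
    my ($login, $uid, $home) = split /:/, $line;
    return unless defined $home && $uid =~ /^\d+$/;
    return { login => $login, uid => $uid, home => $home };
}

my $account = parse_account('alice:1000:/home/alice');
my $broken  = parse_account('not an account line');

if ($account) {
    print "$account->{login} has uid $account->{uid}\n";
}
print "second line rejected\n" unless defined $broken;
```

The caller tests a single value for success and then reads related fields by name, rather than keeping track of the position of several unrelated scalars.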

6 Suggestions to remember when using objects

Perl 5 introduced many important advantages over Perl 4. Maybe the most important of them is the support for object oriented programming. However, as we will soon see, Perl's OOP implementation is neither complete nor clean, if we follow the traditional, formal criteria for defining objects. Using objects correctly as they are implemented, however, can be very convenient, once the particularities are taken care of.

To work optimally with objects, please refer to perldoc perlobj, perldoc perltoot and perldoc perlbot.

6.1 Living with a patchy object implementation

Perl's OOP syntax often forces programmers to write more than they would in other languages, and to non-expert eyes the style is not very clear, often even confusing. For example, we refer to an object's attribute by using $obj->{attr}, but we refer to an object's method with $obj->meth(). A common error is to try $obj->attr, which Perl will translate to a method call. If the attr method exists, it will be called and its return value will be given to the caller, probably leading to unexpected behavior. If the method does not exist, this will lead to a runtime error, aborting our program's execution.
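
The difference can be seen in a small sketch. The Box class, its attributes and its size method are hypothetical:

```perl
use strict;
use warnings;

{
    package Box;    # hypothetical class with two attributes and one method
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub size { my ($self) = @_; return $self->{size} }
}

my $box = Box->new(size => 3, color => 'red');

my $color = $box->{color};    # attribute access: a hash lookup
my $size  = $box->size;       # method call: runs Box::size

# $box->color is a method call too, and Box has no such method,
# so it dies at run time. eval lets us look at the error here.
my $oops  = eval { $box->color };
my $error = $@;
print "$color $size\n$error";
```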

In traditional object oriented languages we are used to having private and public attributes and methods. Perl invites us not to think that way. Perl invites us to make everything public. To paraphrase Larry Wall: "I will let you enter my house, and I trust you not to steal my belongings because I trust you, and not because I have a security surveillance system at home." Yes, there are ways of implementing private methods and attributes, but they look more like black magic and contortionism than like ordinary method calls.

6.2 Checking for required and optional attributes

When we create an object, it is important to check that we have all the required parameters, and that we were not invoked with a parameter we don't know how to handle. To do so, we suggest including something like this in your constructor methods (here $self is assumed to already hold the attributes we were given):

my @needed   = qw(color size type);
my @optional = qw(texture temperature);

# Start by treating every attribute we received as unrecognized
my %temp = map { $_ => 1 } keys %$self;

# Required attributes must be present; tick them off as we find them
foreach (@needed) {
    if (exists $temp{$_}) {
        delete $temp{$_};
    } else {
        die "I need $_ to create the object!";
    }
}

# Optional attributes are allowed, but we do not insist on them
delete @temp{@optional};

# Anything still left over is an attribute we don't know how to handle
if (my @tmp2 = sort keys %temp) {
    die "Unknown elements at object initialization: @tmp2";
}

7 Suggestions to remember when using modules and libraries

A large project can be managed much better if it is built using modules, splitting a large program into many separate and function-specific files. This has the additional advantage of making code reuse in other projects much easier.

Besides modularizing our code, in Perl we will frequently use modules built by other people, as we have a large collection of modules, voluntary contributions from various Perl developers worldwide, organized in the CPAN.

In Perl we often talk about modules and libraries. They are very similar, but with one subtle yet important difference: A library is only a collection of functions, whereas a module is encapsulated in its own namespace, and is often designed with object orientation in mind. Of course, as we will soon see, we can call a module as if it were a library or the other way around.

7.1 Packages and namespaces

It is a very common and advisable practice to use separate namespaces when we work with modules. This keeps global functions and variables in one of our modules from colliding with others defined with the same name in the main program or in any other module.

The default namespace is main, and by default, all symbols (functions, variables and filehandles) in Perl are prefixed with their namespace's name behind the scenes - The real name for $var, &func or FILE is $main::var, &main::func and main::FILE. If we specify a null namespace (e.g. $::var), Perl will translate it to an explicit reference to $main::var.

With the package command, we can change the namespace where we are working, as shown here:

$var = 'value';
$Other::var = 250;
print $var; # 'value'

package Other;
print $var; # 250
print $main::var; # will always be 'value'
print $Other::var; # will always be 250
print $::var; # will always be 'value'

Lexical variables (those declared with my) do not belong to any namespace, and are not affected by packages.
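
The following sketch runs under strict and contrasts a package variable with a lexical one. The variable names are hypothetical; the Other package is wrapped in a block so that its our declaration does not outlive the package.

```perl
use strict;
use warnings;

my  $lexical = 'same everywhere';   # belongs to no namespace
our $var     = 'value';             # this is $main::var

my ($seen_var, $seen_lexical);
{
    package Other;
    our $var = 250;                 # this is $Other::var
    $seen_var     = $var;           # the package variable of Other
    $seen_lexical = $lexical;       # the lexical is still visible here
}

print "$seen_var $seen_lexical\n";
print "$main::var $Other::var\n";
```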

7.2 Methods for the inclusion of modules and libraries

There are three methods for including code files: do, require and use. Files included with the first two are libraries, and files included with use are modules.

7.2.1 do

Its effect is to evaluate the file's contents at the exact point in execution where it is called. Quoting from the definition of do in perldoc perlfunc, do 'file.pl' is equivalent to (although more efficient than) scalar eval `cat file.pl`. Each time Perl reaches a do, the file is evaluated again, so it should not be used inside loops.

The value returned from a file's inclusion is that of the last expression evaluated in the file. If the file could not be read, the return value will be undef, and the special $! variable will hold the error message. If the file was successfully read but could not be compiled, the return value will be undef and the error will be received in the special $@ variable.

do is normally used for reading configuration files. Adapted from perldoc perlfunc:

my $file = '/usr/local/etc/myconf';
my $return;
unless ($return = do $file) {
    warn "couldn't parse $file: $@" if $@;
    warn "couldn't do $file: $!" unless defined $return;
    warn "couldn't run $file" unless $return;
}
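
A runnable sketch of the same pattern. It writes a throwaway configuration file first; the file's contents and the nonexistent path used at the end are assumptions made for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a throwaway configuration file (hypothetical contents).
# Its last expression, a hash reference, becomes do's return value.
my ($fh, $file) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} "+{ name => 'demo', retries => 3 };\n";
close $fh or die "cannot close $file: $!";

my $config = do $file;
unless ($config) {
    die "couldn't parse $file: $@" if $@;
    die "couldn't do $file: $!"    unless defined $config;
    die "couldn't run $file";
}

# A missing file is not fatal with do: it simply returns undef.
# The path below is assumed not to exist.
my $missing = do '/nonexistent/dir/config.pl';

print "$config->{name} $config->{retries}\n";
```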

7.2.2 require

require reaches a bit beyond what do does. The main differences are:

- The specified file must exist. If it does not exist, a runtime error is generated, and Perl aborts execution.
- The last evaluated expression in the file must be true - If it is not, the execution of the program will terminate with the message library did not return a true value. Because of this, it is very common to have 1; as the last statement of a library file.
- The file will be evaluated and executed only once. When we include a file with require, Perl first checks whether its name already exists in the %INC hash, and if it does, skips the inclusion. When Perl finishes the inclusion, it adds the library's name to %INC.
- If the argument we give to require is a bareword, Perl will look for the source file in every directory listed in the @INC array, assuming a .pm extension and substituting / for the :: characters. If we give require a string or a variable as its argument, Perl still looks in @INC, but these substitutions will not be carried out.
- If the argument we give to require is numeric, the execution will be aborted if we are running under a Perl version lower than the one specified.
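
These rules can be observed with a small sketch. DemoLib is a hypothetical module written to a temporary directory only for this illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

require 5.006;    # numeric argument: abort on an older Perl

# Create a tiny hypothetical module that counts how often it is evaluated.
my $dir = tempdir(CLEANUP => 1);
open my $out, '>', "$dir/DemoLib.pm" or die "cannot write module: $!";
print {$out} <<'END_MODULE';
package DemoLib;
our $loads;
$loads++;                      # runs every time the file is evaluated
sub loads { return $loads }
1;                             # require insists on a true value
END_MODULE
close $out or die "cannot close module: $!";

unshift @INC, $dir;            # make the directory searchable

require DemoLib;               # bareword: looks for DemoLib.pm in @INC
require DemoLib;               # skipped: DemoLib.pm is already in %INC
require 'DemoLib.pm';          # string form: same %INC key, skipped too

my $loads = DemoLib::loads();
my $path  = $INC{'DemoLib.pm'};
print "evaluated $loads time(s) from $path\n";
```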


7.2.3 use

This inclusion method was born with Perl 5, and was designed for use with modules and objects. When we include a file with use, in addition to require's behavior:

- The inclusion is done at compile time, not at execution time. The code included with use will be available from the very beginning, even if the use statement appears in the last line of the program.
- use's argument must be a bareword.
- After the module's name, you can specify a list of functions to be imported into the invoker's namespace. You can specifically request not to modify the invoker's namespace by providing an empty list.
- Given use's syntactic flexibility, it is also used for activating pragmas.
- If a module has the unimport function, we can free our namespace from this module's functionality using no module (list).

For further details, refer to use's section in perldoc perlfunc, as well as to the module-specific documentation in perldoc perlmod.
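
A brief sketch of the import list, using the core List::Util module. The two packages and their total functions are hypothetical. As perldoc perlfunc describes, use Module LIST behaves like BEGIN { require Module; Module->import(LIST) }, and an empty list skips the import call altogether.

```perl
use strict;
use warnings;

{
    package WithImport;
    use List::Util qw(sum);        # sum is imported into WithImport
    sub total { return sum(@_) }
}

{
    package WithoutImport;
    use List::Util ();             # empty list: nothing is imported
    sub total { return List::Util::sum(@_) }   # fully qualified call
}

my $imported_total  = WithImport::total(1, 2, 3);
my $qualified_total = WithoutImport::total(1, 2, 3);
my $has_sum_with    = defined &WithImport::sum    ? 1 : 0;
my $has_sum_without = defined &WithoutImport::sum ? 1 : 0;
print "$imported_total $qualified_total $has_sum_with $has_sum_without\n";
```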

7.3 Modules, pragmas, namespaces and the compiler's behavior

If we activate a pragma in any of our modules or libraries (or in the central program), the behavior this pragma defines will not be inherited by any other file we use. Check the documentation for each pragma to see whether its effects are lexical (they apply only within the block where they appear) or cover the whole file.
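
As an illustration of a lexically scoped pragma, the sketch below turns off strict 'refs' inside one block only. The variable names are hypothetical.

```perl
use strict;
use warnings;

our $greeting = 'hello';
my  $name     = 'greeting';

my $inside;
{
    no strict 'refs';                 # relaxed only until the end of this block
    $inside = ${"main::$name"};       # symbolic reference, allowed here
}

# Outside the block strict 'refs' is in force again, so the same
# expression dies at run time. eval lets us capture the error.
my $outside = eval { ${"main::$name"} };
my $error   = $@;
print "$inside\n$error";
```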

8 Restricted compartments: The Safe module

If we don't trust the code of a certain program, module or library, we might want to add security restrictions to it when it is run. The Safe module was written for this purpose, and is part of Perl's default distribution. In this talk we will only skim over it. For further details, refer to perldoc Safe.

Safe's basic syntax is very simple - For creating a new restricted compartment, we use:

use Safe;
$compart = new Safe;

8.1 Restricted namespace

We can define a namespace to which this code will be restricted, and it will be denied interaction with any symbol located outside it. All data interchange between the restricted compartment and the rest of the program must be done using the default variables ($_, @_ and %_), as well as the symbols explicitly declared when we create the restricted compartment.

The default namespace is Safe::Root0 for the first compartment, Safe::Root1 for the second, etc. To ask a compartment which namespace it is running in, we can use the root method.

We can specify the namespace we want the compartment to use by creating it this way:

my $comp = new Safe 'hidden';
print $comp->root;

This will cause the compartment's namespace to be named hidden.
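
A small sketch of this isolation, using Safe's documented share and reval methods. The variables $secret and $greeting are hypothetical.

```perl
use strict;
use warnings;
use Safe;

our $secret   = 'outside the compartment';   # never shared
our $greeting = 'hello';                     # shared explicitly below

my $comp = Safe->new('hidden');
my $root = $comp->root;
$comp->share('$greeting');

# Inside the compartment the root namespace looks like main::,
# so $secret there is an unrelated, undefined variable.
my $sees_secret   = $comp->reval('defined $secret ? 1 : 0');
my $sees_greeting = $comp->reval('$greeting');

print "$root $sees_secret $sees_greeting\n";
```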

8.2 Opcode masks

Another way to control code execution is to limit the operations it will be allowed to carry out. If we use a restricted compartment, we can specify which opcodes will be valid in its code, aborting execution in case an invalid opcode is requested. The default opcode mask is :default, which allows only a conservative set of opcodes: operations that reach outside the compartment, such as opening files or running external commands, are not included.

This topic goes beyond the reach of this talk. For more information, please refer to the manual page on opcodes (perldoc Opcode).
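
A minimal sketch of the mask at work, using Safe's reval and deny methods and the :base_loop tag described in perldoc Opcode. The evaluated snippets are made up for the illustration, and the external command is never actually run because the code is rejected before execution.

```perl
use strict;
use warnings;
use Safe;

my $comp = Safe->new;

# Plain arithmetic is permitted by the default mask.
my $sum = $comp->reval('2 + 3');

# system is not part of :default, so this code is rejected.
my $blocked      = $comp->reval('system("true")');
my $system_error = $@;

# Tighten the mask further: forbid every looping construct.
$comp->deny(':base_loop');
my $loop       = $comp->reval('my $n = 0; $n++ for 1 .. 3; $n');
my $loop_error = $@;

print "$sum\n$system_error$loop_error";
```

Denying loops also shows the trade-off mentioned in the next section: each restriction removes a risk, but it makes the compartment less useful as well.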

8.3 What will Safe not protect us from

Like any other security tool, Safe is a very useful helper, but it is far from perfect, and it is very important to remember its limitations. perldoc Safe mentions the following cases where Safe will not be of great help. Some of them, of course, can be avoided by prohibiting the corresponding opcodes, but it is very difficult to find every case. Some of them simply cannot be restricted without seriously crippling Perl itself for the compartment, maybe to the point of making it useless.

- Resource exhaustion, as simple as a while (1) {}.
- Code that spies on the system and sends information somewhere else.
- Generating signals that affect the program's execution, taking advantage of poorly made signal handlers.
- Operations affecting the whole process, such as chdir.
- The verification is done at compile time, so an eval "(...)" will not suffer from the opcode restrictions.

Security in Perl scripts

1. Introduction

1.1 What is Perl?

1.2 Good things about Perl

1.2.1 Non-tipified variables

1.2.2 Automatic memory management

1.2.3 High extensibility

1.2.4 Quick compilation

1.3 Bad things about Perl

1.3.1 A lot of simplicity and power --- Maybe too much

1.3.2 Non-prototyped functions

1.3.3 Objects are essentially just a patch

1.4 Types of Perl

1.4.1 Traditional Perl

1.4.2 Perl as a module or part of another application

1.5 Languages comparable with Perl

1.5.1 Shell

1.5.2 PHP

1.5.3 Python

1.5.4 Java

1.5.5 Javascript

2 Avoiding insecure programing practices: the strict pragma

2.1 Introduction to strict

2.1.1 Activating/deactivating strict

2.2 strict 'vars': Variable's scoping

2.2.1 Global scope

2.2.2 Global scope in a package - our

2.2.3 Local scope

2.2.4 Lexical variables with my

2.2.5 Why should we avoid using global variables?

2.2.6 Behavior of the global scope when using strict

2.3 strict 'subs': Non-explicit subroutines

2.4 strict 'refs' --- Symbolic references

2.4.1 What are symbolic references?

2.4.2 And what is so bad about symbolic references?

3 Reporting warnings

3.1 Reporting warnings with Perl < 5.6.0

3.2 Using lexical warnings with Perl >= 5.6.0

4 Handling tainted data

4.1 What is tainted data?

4.2 Activating tainted data reports

4.2.1 Detecting tainted data

4.2.3 A warning about unconsciously untainting

5 Suggestions to remember when using functions

5.1 Invoking functions

5.2 Receiving parameters

5.2.1 Note on the reason of having @_

5.3 Handling internal variables

5.4 Being careful when using references - ref

5.4.1 GLOB references

5.5 Returning results

6 Suggestions to remember when using objects

6.1 Living with a patchy object implementation

6.2 Checking for required and optional attributes

7 Suggestions to remember when using modules and libraries

7.1 Packages and namespaces

7.2 Methods for the inclusion of modules and libraries

7.2.1 do

7.2.2 require

7.2.3 use

7.3 Modules, pragmas, namespaces and the compiler's behavior

8 Restricted compartments: The Safe module

8.1 Restricted namespace

8.2 Opcode masks

8.3 What will Safe not protect us from