Frequently Asked Questions on CGI programming
------------------------------
Subject: Table of Contents
==========================
0. Preamble
0.1. Changes
0.2. Notice and Disclaimer
0.3. Where to get this document
0.4. How to contribute to this document?
0.5. Can I email the author my questions?
0.6. What's up with posting to comp.infosystems.www.authoring.cgi?
0.7. Credits
1. Basic Questions
1.1. What is CGI?
1.2. Is it a script or a program?
1.3. When do I need to use CGI?
1.4. Should I use CGI or JAVA?
1.5. Should I use CGI or SSI?
1.6. Should I use CGI or an API?
1.7. What do I absolutely need to know?
1.8. Does CGI create new security risks?
1.9. Do I need to be on Unix?
1.10. Do I have to use Perl?
1.11. Do I have to put it in cgi-bin?
1.12. Do I have to call it *.cgi? *.pl?
1.13. What is CGIWrap, and how does it affect my program?
1.14. How do I decode the data in my Form?
2. HTTP Headers and NPH Scripts
2.1. What is HTTP (HyperText Transfer Protocol)?
2.2. What HTTP request headers can I use?
2.3. What Environment variables are available to my application?
2.4. What HTTP response headers do I need to know about?
2.5. What is NPH?
2.6. Must/should/can I write nph scripts?
2.7. Do I have to call it nph-*
2.8. What is the difference between GET and POST?
3. Techniques: "How do I..."
3.1. Can I get information about who is visiting?
3.2. Can I get the email of visitors?
3.3. "But I saw some.kool.site display my email address..."
3.4. Can I verify the email addresses people enter in my Form?
3.5. Can I get browser details and return different pages?
3.6. Can I trace where a user has come from/is going to?
3.7. Can I launch a long process and return a page before it's finished?
3.8. Can I launch a long process which the user interacts with?
3.9. Can I password-protect my pages?
3.10. Can I do HTTP authentication using CGI?
3.11. Can I identify users/sessions without password protection?
3.12. Can I redirect users to another page?
3.13. Can I run a CGI script without returning a new page to the browser?
3.14. Can I write output to a different Netscape frame?
3.15. Can I write output to several frames at once?
3.16. Can I use a CGI script to generate both text and inline images?
3.17. How can I use Caches to make CGI scripts faster and more Net-friendly?
3.18. How can I avoid users hitting "submit" twice?
3.19. How can I stop my CGI script reading and writing files as "nobody"?
4. Applications: Is there an existing script to ...
4.1. Where to look for programs, scripts, and other resources?
4.2. Where to look for free scripts for my application?
4.3. Discussion group/bulletin board
4.4. CSCW/Groupware
4.5. Database
4.6. Is there a non-setuid script to allow users to change password?
5. Troubleshooting a CGI application
5.1. Are there some interactive debugging tools and services available?
5.2. I'm having trouble with my headers. What can I do?
5.3. Why do I get Error 500 ("the script misbehaved", or "Internal Server Error")
5.4. I tried to use (Content-Type|Location|whatever), but it appears in my Browser?
6. Further Reading
6.1. Other FAQs/collections (including online book)
6.2. Reference Pages
INDEX
-------------------------------------------------------------
Subject: SECTION 0 - PREAMBLE
NOTE: the Reply-to address in this FAQ is an autoresponder. If you
want to write to me, you'll have to set the "To:" line by hand:
mailto:nick@webthing.com
NOTE: the numbering in this document is automatically generated by my
posting software, and will change between postings if new questions are
added (as _may_ happen when I see - or someone contributes - a FAQ I've
previously overlooked :-)
------------------------------
Subject: 0.1 Changes
Last Modified: June 4th 1997:
* Updated "where to get this document"
* Added two new "why doesn't it work" Q&A's
* Added question on URLencoding to basics
* Added question on user "nobody" (and how to get round it)
* Expanded question on resources for debugging.
* Added existing script question on allowing users to change password
* Added reference to CGI authentication question
* Added more indexing keywords
------------------------------
Subject: 0.2 Notice and Disclaimer
Copyright 1996-7 Nick Kew.
You are free to copy or distribute this document in whole or in part
for any purpose and on any medium you choose, provided:
You DON'T do so for profit.
You DO include this notice and disclaimer in full.
Disclaimer: This information is offered in good faith and in the hope
that it may be of use, but is not guaranteed to be correct, up to date
or suitable for any particular purpose. The author accepts no liability
in respect of this information or its use.
------------------------------
Subject: 0.3 Where to get this document
The homes of this document on the Web are now
* the WebThing Virtual Office, at http://www.webthing.com/:
URL http://www.webthing.com/page.cgi/cgifaq
* the Web Design Group, at http://htmlhelp.com/
URL http://htmlhelp.com/faq/cgifaq.html
NOTE - If you want to mirror the FAQ on your WWW site, the best document
to use is the HTML version from my autoresponder (see below). If you're
putting it on a publicly-visible server, please make sure you keep it
up-to-date (if you let me know you have it, I can automate the updates).
Other known sources are:
(1) USENET: posted to newsgroups (TEXT)
news:comp.infosystems.www.authoring.cgi
news:comp.answers
news:news.answers
(2) RTFM and mirror sites (TEXT)
ftp://rtfm.mit.edu/pub/usenet/news.answers/www/cgi-faq
(3) RTFM WWW mirror sites, including (Partial HTML)
Europe - http://www.cs.ruu.nl/cgi-bin/faqwais
America - http://www.cis.ohio-state.edu/hypertext/faq/usenet/
(4) By EMAIL from my autoresponder (HTML or TEXT)
Send blank email to
mailto:nick+cgi_text@webthing.com
or
mailto:nick+cgi_html@webthing.com
(depending on which version you want)
**** NOTE CHANGE FROM PREVIOUS AUTORESPONDER SETUP! ****
(5) By EMAIL from the FAQserver at RTFM (TEXT)
Send email to mailto:mail-server@rtfm.mit.edu with
send usenet/news.answers/www/cgi-faq
in the body of your message
------------------------------
Subject: 0.4 How to contribute to this document?
The WebThing software permits collaborative authoring using your web
browser. When you are reading any entry in this InterFAQ, you can add a
new entry which will then appear as another "more on" subject.
http://www3.pair.com/webthing/
(note: the version at this site is no longer listed in the previous question)
In order to maintain the quality of the FAQ, and avoid inappropriate
'commercial' entries, write permission is limited using an Access Control
List. If you have a contribution to make, send me an email including your
WebThing userid (i.e. what you entered in the registration form) and I'll
add you to the list.
InterFAQ readers - If your browser isn't showing a "new entry" button, then
either you aren't logged in or you're not on the access control list.
Note that this InterFAQ is limited to questions-and-answers appropriate to
periodic Usenet posting. Other types of contribution can be added
elsewhere in the WebCentre. For example
* If you have a relevant website and want to link to it, enter it in the
appropriate collection (e.g. "scripts" or "misc"). You can then
also include a description of your site, and have it indexed.
* If you want to post a question or comment on something in this
document, you can post it as a followup to the "flat" version of the
FAQ (library document in the "FAQS" collection).
If you don't want to use the InterFAQ you can always mail me
( mailto:nick@webthing.com )
------------------------------
Subject: 0.5 Can I email the author my questions?
I already get more email than I can possibly answer personally, so
in general the answer is no - I'm NOT a free advice centre.
The possible exception is when something already in the FAQ needs
clarifying: don't expect a personal reply, but I *might* add
something to the answer in question, so check the next posting (or three).
The newsgroup is the appropriate place for free advice. But remember:
bad questions usually get bad answers, so think carefully before posting.
------------------------------
Subject: 0.6 What's up with posting to comp.infosystems.www.authoring.cgi?
This is now a moderated newsgroup. The moderator is a bot run by
Thomas Boutell ( mailto:boutell@boutell.com ). The charter for
moderation is as follows:
This newsgroup is self-moderated. Your first posting will not appear
until you have read and responded to an automatic welcome mailing, at
which point your posting will appear with no further delay. Provision
will also be made to automatically approve first postings that contain
a header requesting this. Subsequent postings are approved
automatically.
If posting normally doesn't work - as could be the case if your
newsfeed has trouble with moderated groups - you can post articles
by emailing them to:
mailto:authoring-cgi@boutell.com
Provided the return address in your mail is correct, you will then
receive precise instructions for having your post(s) automatically approved.
Alternative means of posting are detailed in the WWW FAQ, posted
regularly by Thomas Boutell.
------------------------------
Subject: 0.7 Credits
This FAQ was written by Nick Kew, and has been considerably improved
with the help of comments and criticisms, newsgroup posts and
miscellaneous suggestions from Nathan Neulinger, Maurice L. Marvin,
Matthew Healy and Alan J. Flavell.
-------------------------------------------------------------
Subject: SECTION 1 - BASIC QUESTIONS
This section aims to deal with basic questions, addressing the role and
nature of CGI, and its place in Web programming. Questions/answers which
just don't appear to 'fit' under any other section may also be included
here.
------------------------------
Subject: 1.1 What is CGI?
[ from the CGI reference http://hoohoo.ncsa.uiuc.edu/cgi/overview.html ]
The Common Gateway Interface, or CGI, is a standard for external
gateway programs to interface with information servers such as HTTP servers.
A plain HTML document that the Web daemon retrieves is static,
which means it exists in a constant state: a text file that doesn't change.
A CGI program, on the other hand, is executed in real-time, so that it
can output dynamic information.
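To make the contrast with a static file concrete, here is a minimal
sketch of a CGI program. Python is used purely for illustration, and
the function name build_response is hypothetical. It assumes only what
the CGI specification provides: the server sets environment variables
such as REQUEST_METHOD, runs the program, and relays whatever it writes
to standard output.

```python
import html
import os
import time


def build_response(environ):
    """Build a complete CGI response: header, blank line, then HTML body."""
    # REQUEST_METHOD is one of the environment variables defined by the CGI spec.
    method = environ.get("REQUEST_METHOD", "(not set)")
    now = time.strftime("%Y-%m-%d %H:%M:%S")
    body = (
        "<html><body>\n"
        "<p>Page generated at {}</p>\n"
        "<p>Request method: {}</p>\n"
        "</body></html>\n"
    ).format(html.escape(now), html.escape(method))
    # The blank line after the header tells the server where the body starts.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # The server runs the program and relays whatever it writes to stdout.
    print(build_response(os.environ), end="")
```

Each request runs the program afresh, so the timestamp in the page
changes every time, which a static HTML file cannot do.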
------------------------------
Subject: 1.2 Is it a script or a program?
The distinction is semantic. Traditionally, compiled executables
(binaries) are called programs, and interpreted programs are usually
called scripts. In the context of CGI, the distinction has become
even more blurred than before. The words are often used interchangeably
(including in this document). Current usage favours the word "scripts"
for CGI programs.
------------------------------
Subject: 1.3 When do I need to use CGI?
There are innumerable caveats to this answer, but basically any
Webpage containing a form will require a CGI script or program
to process the form inputs.
------------------------------
Subject: 1.4 Should I use CGI or JAVA?
[This answer to a non-question is an attempt to reduce the noise level of
the recurrent "CGI vs JAVA" threads.]
CGI and JAVA are fundamentally different, and for most applications
are NOT interchangeable. Nor are the two mutually exclusive: you could
in principle write a CGI program in JAVA, although it is hard to
think of an instance where this would be the best choice.
CGI is a mechanism for running programs on a WWW server.
Typical applications include accessing a database, submitting
an order, or posting messages to a bulletin board.
JAVA enables programs to run on the Client machine, and is
suited to such tasks as detailed manipulation of an image.
Alternatives to JAVA may include the X windows client/server
protocol, use of browser plugins and helper applications, and
other clientside languages such as SafeTCL and perl/penguin.
In certain instances the two may be combined in a single application:
for example a JAVA applet to define a region of interest from a
geographical map, together with a CGI script to process a query
for the area defined.
------------------------------
Subject: 1.5 Should I use CGI or SSI?
CGI and SSI (Server-Side Includes) are often interchangeable, and it may
be no more than a matter of personal preference. Here are a few
guidelines:
1) CGI is a common standard agreed and supported by all major HTTPDs.
SSI is NOT a common standard, but an innovation of NCSA's HTTPD
which has been widely adopted in later servers. CGI has the
greatest portability, if this is an issue.
2) If your requirement is sufficiently simple that it can be done
by SSI without invoking an exec, then SSI will probably be
more efficient. A typical application would be to include
sitewide 'house styles', such as toolbars, netscapeised
tags or embedded CSS stylesheets.
3) For more complex applications - like processing a form -
where you need to exec (run) a program in any case, CGI
is usually the best choice.
------------------------------
Subject: 1.6 Should I use CGI or an API?
APIs are proprietary programming interfaces supported by particular
platforms. By using an API, you lose all portability. If you know
your application will only ever run on one platform (OS and HTTPD),
and it has a suitable API, go ahead and use it. Otherwise stick to CGI.
------------------------------
Subject: 1.7 What do I absolutely need to know?
If you're already a programmer, CGI is extremely straightforward, and just
three resources should get you up to speed in the time it takes to read them:
1) Installation notes for your HTTPD. Is it configured to run CGI
scripts, and if so how does it identify that a URL should be executed?
(Check your manuals, READMEs, ISP webpages/FAQS, and if you still can't
find it ask your server administrator).
2) The CGI specification at NCSA tells you all you need to know
to get your programs running as CGI applications.
http://hoohoo.ncsa.uiuc.edu/cgi/interface.html
3) WWW Security FAQ. This is not required to 'get it working', but
is essential reading if you want to KEEP it working!
http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html
If you're NOT already a programmer, you'll have to learn. If you would
find it hard to write, say, a 'grep' or 'cat' utility to run from the
commandline, then you will probably have a hard time with CGI. Make
sure your programs work from the commandline BEFORE trying them with CGI,
so that at least one possible source of errors has been dealt with.
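One way to test from the commandline is sketched below: a small Python
helper that runs a program with a few CGI-style environment variables
set, then checks that its output begins with a plausible header block.
The names run_like_cgi and has_cgi_header are hypothetical, and only a
GET request with a query string is simulated. A real server sets many
more variables.

```python
import os
import subprocess
import sys


def run_like_cgi(argv, query_string=""):
    """Run a command with a minimal CGI-style GET environment; return its results."""
    env = dict(os.environ)
    env.update({
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": "GET",
        "QUERY_STRING": query_string,
    })
    result = subprocess.run(argv, env=env, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


def has_cgi_header(output):
    """Check for a header block with Content-Type or Location before a blank line."""
    head, sep, _ = output.partition("\n\n")
    names = [line.split(":", 1)[0].strip().lower()
             for line in head.splitlines() if ":" in line]
    return bool(sep) and ("content-type" in names or "location" in names)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python thisfile.py ./myscript.cgi "name=value"
    query = sys.argv[2] if len(sys.argv) > 2 else ""
    code, out, err = run_like_cgi([sys.argv[1]], query)
    print("exit status:", code)
    print("header looks OK:", has_cgi_header(out))
    sys.stderr.write(err)
```

A non-zero exit status, anything on standard error, or a missing header
block are all worth fixing before the script goes anywhere near a server.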
------------------------------
Subject: 1.8 Does CGI create new security risks?
Yes. Period.
There is a lot you can do to minimise these. The most important thing
to do is read and understand Lincoln Stein's excellent WWW security
FAQ, at http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html .
------------------------------
Subject: 1.9 Do I need to be on Unix?
No, but it helps. The Web, along with the Internet itself, C, Perl,
and almost every other Good Thing in the last 20 years of computing,
originated in Unix. At the time of writing, this is still the
most mature and best-supported platform for Web applications.
------------------------------
Subject: 1.10 Do I have to use Perl?
No - you can use any programming language you please. Perl is simply
today's most popular choice for CGI applications. Some other widely-
used languages are C, C++, TCL, BASIC and - for simple tasks -
even shell scripts.
Reasons for choosing Perl include its powerful text manipulation
capabilities (in particular regular expressions) and the fantastic
WWW support modules available.
------------------------------
Subject: 1.11 Do I have to put it in cgi-bin?
see next question
------------------------------
Subject: 1.12 Do I have to call it *.cgi? *.pl?
Maybe. It depends on your server installation.
These types of filenames are commonly used conventions - no more.
It is up to the server administrator whether or not CGI scripts are
enabled, and (if so) what conventions tell the server to run or
to print them.
If you are running your own server, read the manual.
If you're on an ISP's or other rented webspace, check their webpages for
information or FAQs. As a last resort, ask the server administrator.
------------------------------
Subject: 1.13 What is CGIWrap, and how does it affect my program?
[ quoted from http://www.umr.edu/~cgiwrap/intro.html ]
> CGIWrap is a gateway program that allows general users to use CGI scripts
> and HTML forms without compromising the security of the http server.
> Scripts are run with the permissions of the user who owns the script. In
> addition, several security checks are performed on the script, which will not
> be executed if any checks fail.
>
> CGIWrap is used via a URL in an HTML document. As distributed, cgiwrap
> is configured to run user scripts which are located in the
> ~/public_html/cgi-bin/ directory.
See http://www.umr.edu/~cgiwrap/
------------------------------
Subject: 1.14 How do I decode the data in my Form?
The normal format for data in HTTP requests is URLencoded. All Form data
is encoded in a string, of the form
param1=value1&param2=value2&...paramn=valuen
Many non-alphanumeric characters are "escaped" in the encoding:
the character whose hexadecimal number is "XY" will be represented by
the character string "%XY".
Decoding this string is a fundamental function of every CGI library.
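As an illustration of what such a library does, here is a minimal
sketch in Python. The function name decode_form is hypothetical; in
practice you would use your library's own routine (Python's standard
one, urllib.parse.parse_qs, is shown for comparison). Note that in
URLencoded form data a "+" also stands for a space.

```python
from urllib.parse import parse_qs, unquote_plus


def decode_form(encoded):
    """Decode 'a=1&b=2' style form data into a dict of name -> list of values."""
    fields = {}
    for pair in encoded.split("&"):
        if not pair:
            continue  # ignore empty segments such as a trailing "&"
        name, _, value = pair.partition("=")
        # unquote_plus turns "+" into a space and "%XY" into the character XY.
        fields.setdefault(unquote_plus(name), []).append(unquote_plus(value))
    return fields


if __name__ == "__main__":
    sample = "name=A+User&email=user%40example.com"
    print(decode_form(sample))
    # The standard library routine does the same job.
    print(parse_qs(sample, keep_blank_values=True))
```

Values are kept as lists because the same parameter name may appear
more than once in a single request (for example, from checkboxes).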
Another format is "multipart/form-data", also known as "file upload".
You will get this from the HTML markup