Приглашаем посетить
Database Search Engine
BASIC INFORMATION
Name: Selena Sol's Database Search Engine
Version: 5.02
Last Modified: 01-23-96
DESCRIPTION
This database search script allows users to search a flatfile UNIX
database by keyword, number or date range.
COPYRIGHT INFORMATION
This application was written by Selena Sol (selena@eff.org,
http://www.eff.org/~erict) and Gunther Birznieks
(birzniek@hlsun.redcross.org)having been inspired by countless other Perl
authors. Feel free to copy, cite, reference, sample, borrow, resell or
plagiarize the contents. However, if you don't mind, please let me know
where it goes so that I can at least watch and take part in the
development of the memes. Information wants to be free, support public
domain freware. Donations are appreciated and will be spent on further
upgrades and other public domain scripts.
Finally, PLEASE SEND WORKING URL's to selena@eff.org. I maintain a list
of implementations around the world.
SUPPORT
This script comes with no gaurentees or warranties. Don't expect the
scripts to be perfect. They have evolved continually over the last two
years and will continue to do so.
Bug reports are greatly appreciated but installation support is
extremely discouraged. I have attempted to include as much information as
I could think of in this README and in the Customization and Installation
FAQ available at http://www.eff.org/~erict/Scripts/. And there is
always lively discussion in Selena Sol's Discussion Forum
(http://www.sidestreets.com/info/sssa/). Please try ALL available sources
of information BEFORE you email me. But if you must, make sure to include
the following bits of information (I may not respond to your email if you
do not answer ALL of the following questions):
1. What type of Web server are you running?
2. What type of Operating System is the Web server running on?
3. What is the "exact" error message from the Web?
4. What is the "exact" error message in your web server's error log?
5. What is the "exact" error message you receive when running the script
from the command line.
6. Are you running this script on an ISP? If so, what is the email
address of the Sysadmin there?
7. Are you using a virtual server setup? If so, what is the root path set
in your Web server's environment?
8. In which directory is the Perl interpreter located?
9. In which directory is sendmail located (if you are using a script which
demands use of sendmail)
Again, I MAY NOT ANSWER YOUR QUESTION unless you have answered all nine of
these questions.
BASIC INSTALLATION (DOWNLOADING THE SCRIPT)
It is recommended that you point your Web browser to "Selena Sol's Script
Archive" to get the latest version of this script. The Script Archive is
located at the following URL:
http://www.eff.org/~erict/Scripts/
From the "Script Archive" frontpage follow the hyperlinks to the detailed
page dedicated to this script. Then click on the hyperlink "Download the
scripts as a single tar file".
BASIC INSTALLATION (UNARCHIVING THE APPLICATION)
Tar is a UNIX command that allows you to create a single archive file
containing many files. Such archiving allows you to maintain directory
relationships and facilitates transferring complex programs with many
separate but integrated parts which must have their relationships
preserved. Tar has a motley of options which allow you to do archiving and
unarchiving in many ways. However, for the purpose of untarring this
application, the commands will be fairly simple.
Once you have downloaded the TAR file transfer it to an executable
directory on your web server and "untar" it. On UNIX systems, you may
type the following at the command line (in the same directory as the TAR
file itself):
tar xvfp web_store.tar
Tar will go through the archive file and separate out each individual
directory and file, expanding them into their appropriate places
underneath the current directory. The "xvfp" letters in the tar command
above are parameters that tell the program to extract the files and
directories out of the ".tar" file. Specifically, "x" tells tar to extract
the files. "v" tells tar to output information about the status of its
extraction while it is performing the work, "f" informs tar to use the
".tar" filename as the source of the files to be extracted, and "p" notes
that the original permissions hould be maintained. The reason the
"f" parameter has to be used is that tar, by default, archives files and
directories to a tape drive. Tar is actually short for "[T]ape [AR]chive".
(Note: If you are using a non-UNIX Operating System, you may
download a TAR/UNTAR program by pointing your Web browser
to http://www.shareware.com I suggest using "untar" as your
keyword when you search their inventory).
BASIC INSTALLATION (SETTING PERMISSIONS)
Untarring the files is only one part of the equation of installing the
application and getting it to actually run. Frequently, the Web
server needs to be given special permission to run your scripts and have
the scripts perform their job with the appropriate "rights".
The cardinal rule for setting up Web server software is that the server
should be given only minimal capabilities. This definitely rules out the
Web server running as the ROOT user (Super user on UNIX). More often than
not, it means the Web server is run as a user that has no rights to do
anything significant -- the user "nobody". By default, "nobody" usually
does not have permission to read any files in directories that you create.
However, when you download scripts, you need to make it so that the
scripts can be read and executed by the Web server software. In other
words, "nobody" has to be able to get to the files.
The magic command is
chmod [permission digit] filename
Thus, the following command gives everyone rights to read and execute, but
noone rights to write to the file hello.cgi.
chmod 555 hello.cgi
The actual value of the digit determines the permissions granted to that
area. Permissions consist of three numbers -- 4 for read, 2 for write, and
1 for execute access. By adding these numbers together, you form the
permissions that make up one digit. For example, 4 + 2 + 1 = 7 which
grants read, write, and execute permissions. 4 + 1 = 5 which only grants
read and execute permissions. Thus, 755 grants 7 (read, write, execute) to
the owner of the file, and 5 (read and execute) to the group the file is
in and the world. Below is a chart which will can be used as a quick
reference
PERMISSION COMMAND
U G W
rwx rwx rwx chmod 777 filename
rwx rwx r-x chmod 775 filename
rwx r-x r-x chmod 755 filename
rw- rw- r-- chmod 664 filename
rw- r-- r-- chmod 644 filename
U = User
G = Group
W = World
r = Readable
w = writable
x = executable
- = no permission
[WARNING] You may be tempted to simply use chmod 777 on all the files and
directories since that assures the Web server can do anything with the
files. However, it is strongly advised that you do not leave the files in
this state. It is considered a big security risk to leave your scripts
open to changes by the Web server instead of being read-only. Anyone on
the server could use another rogue CGI script to write over your scripts
and make them do something completely different. There is still a risk
involved in making the messages directory writable, but at least if
someone is going to be messing with your area, they will only destroy a
bit of data and not your main programs. It is "OK" to set the scripts to
777 if you are troubleshooting a problem and want to rule out permissions
entirely, but do not leave the scripts like this. On another security
note, if you are really concerned with the security of your data such as,
please do not use a shared server where other people can write CGI scripts
using the same Web server configuration. It is much better to use your
own server software or purchase space on a "virtual server" which may be
shared, but is set up in such a way that each user's scripts are shielded
from each other.
Note: Not setting your permissions correctly is the
NUMBER 1 reason why installations fail. Take time to
get this right.
The actual permissions required for the subdirectories and files used by
this application are listed in the next section.
BASIC INSTALLATION (FILES, DIRECTORIES, AND PERMISSIONS)
The TAR file will then expand into a root directory called
Database_search. Database_search will contain several sub-directories
and several files. The diagram below depicts the directory structure as
well as the permissions which must be applied to the files and
subdirectories used by the application.
Database_search Root Directory (drwxr-xr-x)
|____Data_files (drwxr-xr-x)
| |____address_book.data (-rw-r--r--)
|____Library Subdirectory (drwxr-xr-x)
| |____auth-extra-html.pl (-rw-r--r--)
| |____auth-extra-lib.pl (-rw-r--r--)
| |____auth-lib-fail-html.pl (-rw-r--r--)
| |____auth-lib.pl (-rw-r--r--)
| |____auth-server-lib.pl (-rw-r--r--)
| |____auth_fail_html.pl (-rw-r--r--)
| |____cgi-lib.pl (-rw-r--r--)
| |____cgi-lib.sol (-rw-r--r--)
| |____db-lib.pl (-rw-r--r--)
| |____mail-lib.pl (-rw-r--r--)
|____Setup_files (drwxr-xr-x)
| |____address_book.setup (-rw-r--r--)
|____Users (drwxrwxrwx)
| |____default.users (-rw-rw-rw-)
|____db_search.cgi (-rwxr-xr-x)
Database_search is the root directory of this application. It must have
its permissions set to be readable and executable by the web server.
Data_files is a subdirectory containing all of the data files
which db_search is supposed to seacrh. The
directory itself must be readable and executable and all the files
within it must be readble.
By default, we have included one sample data file called
address_book.data
Every data file is simply a pipe (|) delimited database using a
newline to represent a new database row. For example, the first
line might read:
1234|Bic Ball Point Pen|1.45|1
As you can see, every database field is separated by the pipe
symbol (which means that the pipe symbol may not appear in your
data!). In this example, field one is the item number, field two
is the item name, field three is the price, and field four is the
unique database id number.
Every database row MUST have a unique database id number, so if
you want to add database rows manually, you must make sure to add
this final field carefully.
Comment lines are acceptable within the data file, but they must
be specified using the comment tag "COMMENT:" flush against the
left margin. The first line of video.data uses a comment line to
describe the database fields as follows:
COMMENT: Category|Item #|Name|Description|Size|Price|URL|Link...
Library is a subdirectory containing various supporting files used by the
application. The directory must be redable and executable by the
web server and each library file within must be readable.
cgi-lib.pl is used to read and parse form input.
cgi-lib.sol is used to lock the data file when it is being
manipulated.
db-lib.pl is used to search through the datafile according to the
criteria entered by the user.
all libraries beginning with "auth" are used for authentication
mail-lib.pl is used with authentication to mail new registrants.
Setup_files is a subdirectory containing the setup files which describe
server specific variables and database specific variables. The
subdirectory must be readable and executable by the web server and
the setup files within the directory must be readable by the web
server. By default, we have included one sample setup file called
address_book.setup. Below is a short description of the variables
and their meanings.
$data_file_path is the location of the data file which will be
searched
$max_rows_returned is the maximum number of rows which will be
returned at one time. If the users search criteria turn
up more than that they will not be displayed unless you
define a "see next hits" button in your HTML code.
$current_century is the current century.
%db defines the the structure of your datafile. It is an
associative array which matches up the
variable names which you will use to access data files
with the actual numerical index of that field. Remember
that arrays start counting at zero.
$index_for_email is the numberical index of email in case you want
to make it a hyperlink.
$index_of_field_to_be_sorted_by defines which field we should sort
the database by default. This index corresponds to %db and can be
overridden by a form variable called sort_by.
@db_display_fields are the headers of the fields that you wish to
display to the user when they receive the search results.
You do not have to display all the fields you defined in
%db, but you must have one element in @db_display_fields
for every element in @db_index_for_display
@db_index_for_display is the index into %db of the fields that you
want displayed when the user get a search results page.
This corresponds to @db_display_fields
@db_query_criteria defines how each database field will be
searched. The specifics of how to use @db_query_criteria
are discussed in db-lib.pl
The array contains pipe-delimited fields inside each
list item. The fields are the
1. form variable name
2. index into the database that this criteria applies to
3. operator for comparison
Possible values: >,<,>=,<=,=,!= (not equal)
The operator is compared the following way:
form_variable OPERATOR database_field_value
That is, (1) above is the left hand side of the
operator and (2) above is the right hand side of
the operator.
4. data type of the field (This is cool because it
determines how the operator in (3) gets
applied to the data.
The data type can be:
date
number
string
If the data type is a date, then the operator
for comparison is done after the form value
and the fields being compared in the database
are converted to DATES.
If the data type is a number, then the operator
for comparison is done based off of numerical
if operators (>,<,==, etc.)
If the data type is a string, then the operator
for comparison is done based off of string
if operators (gt, lt, eq, ne, etc.) with ONE
EXCEPTION.
If the datatype is a STRING *AND* the operator
is =, then the search that is done becomes a
more flexible search.
1. All the words in the form variable are split
apart and searched as seperate keywords in the
text of the fields.
2. By default, the search on string = string
is a pattern match search and is not case
sensitive.
3. If you want this special string,= combination
searching to be case sensitive and to match on
whole words only, you MUST set up two new form
variables: case_sensitive and exact_match
If exact_match is on (checkbox) then the
combination of string,= in the query criteria
array will match on WHOLE WORDS only.
If case_sensitive is on (checkbox) then the
combination of string,= in the query criteria
array must have matching case values (upper/lower).
There are three main cases that you generally want to set
up a query_criteria array for:
Case 1: General keyword search through the database.
(a) Set the first field equal to your keywords form variable
(eg keywords). You will need something like the follwoign in your
HTML page.
<INPUT TYPE = "text" NAME = "keywords" SIZE = "40"
MAXLENGTH = "40">
(b) Set the 2nd field equal to field numbers of the
database file you want to search. Since you are going to
do a keyword search through the whole database, you want
to comma-seperate these (eg 1,2,3,4,6 where database filed #5 is
not searched)
(c) operator is set to =, data type is set to string so
that the keyword search is done using pattern matching
and case insensitive.
@sc_db_query_criteria would be equal to
("keywords|1,2,3,4,6|=|string)
Case 2: Just want to do a search on a field like lname
and include that search term within URLs in a frontpage such
as web_store.cgi?lname=Smith.
(a) Set the form variable equal to the above (lname)
(b) set the 2nd field equal to the field in the database
corresponding to a lnamename (eg 1).
(c) set operator =, data type string to do a keyword
search that is case insensitive.
@sc_db_query_criteria would be equal to
("lname|1|=|string")
Case 3: You want to make a front page form where several
fields in the database are being searched.
You want to allow the user to search on an age
range, plus a keyword search on lname field.
Here is the setup:
@sc_db_query_criteria would be equal to
("age_low|9|<=|number",
"age_high|9|>=|number",
"lname|1|=|string");
Notice that we set up TWO form variables for allowing
the age range searching. This is because we allow the
user to enter the low range and the high range of the age
they want to search for (database field #9).
Note, also, that the "age_low|2|<=" means that
the database row returns a match if and only if the
value of "age_low" form field is <=- the value of
field #9 in the database row (remember counting starts at 0).
Then, we set the lname form variable to be a keyword
(=,string) search on database field #1.
The form itself would have these fields as HTML:
Lowest Age To Search For:
<INPUT TYPE=TEXT NAME=age_low VALUE="">
Highest Age To Search For:
<INPUT TYPE=TEXT NAME=age_high VALUE="">
Enter Last Name:
<INPUT TYPE=TEXT NAME=lname VALUE="">
SPECIAL NOTE: Only criteria that is entered on the form
is queried against. If the user leaves one or more fields blank
(or you neglect to place them on the form), then those fields
never get queried.
Of the criteria that IS entered on the form, all the criteria
must be satisfied for that row before the row will be considered
safe to display to the user.
The same goes for the string,= (special case for keywords).
When the string is split into keywords separated by whitespace,
all the keywords must be found in the database field before
the program will consider it a valid match.
This logic is there because we want to provide the capability
of letting the user narrow down the query as they enter more
data into the form.
$auth_lib = location of the authentication library files. By
default, they are all located in the Library subdirectory and are
all prefixed with "auth" so that they are easily distinguishable
as a package.
$auth_server = are you using server based authentication with
access.conf type stuff? If so, set this to on in order to take
advantage of the excellent power of server-based authentication.
Typically though, you won't have server based authentication opn
because most people will not have access to those services on
their ISP. The benefit of using server based authentication is
that you can absolutely validate the user name against the server
authentication database.
$auth_cgi = If you are not using server based authentication, then
you must be using cgi based authentication with your own flatfile
user database. The distribution copy assumes this, so the
variable is set to "on".
$auth_user_file is the flatfile user database of validated admins.
By default, this file should be empty. When you are ready to add
yourself as an admin, you will do so and the script will
automatically add you as a validated admin to this file. You
should then expect to see something like the following line in
that file:
encrypted password|username|admin|fname|lname|you@yourdomain.com
$auth_alt_user_file is an extra security option. If alt_user_file
is defined, when a user registers, their information will be
stored in the alternate user file until the system admin (you)
copies them over. Normally, you will just let them register into
the main file however.
$auth_default_group - Since security for this type of script mainly
focusses on one user, I'd just leave this as admin and not worry
about it...I don't anticipate much need for groups and users. If
you did decide to incorporate different levels of security into
the script, you would have to write your own logic into the main
script anyway.
$auth_add_register gives the script the ability to add new users
"directly" to the database. $auth_allow_register determines
whether or not the authentication frontscreen shows a registration
button, or just a logon button.
We recommend that you set both of these OFF unless you are adding
validated users...If not, someone could theoretically, make a
mirror form somewhere else and add themselves to the
database...these options disable the add function entirely until
the time when you actually want to do it.
Thus, by default, we have set both these variables to "off".
Therefore, the first time you try to run the administrative
functions you will hit a deadend. There will not be any validated
admins in the user file AND there will be no way for you to
register yourself as an admin.
What you'll have to do is change these two variables in the setup
file to "on". Then, you will call up the script from the web
with a URL something like the following:
http://www.you.com/cgi/Mailing_list/mailing_list.cgi?action=admin
At that point, you should now get the login screen, but this time you
should see an extra button, "Register for an Account". You
should click this button. Then, you will get the login screen.
Fill out the information and submit the data to the script. The
script will then encrypt your password, add you as an admin and
tell you that you may now log in as an administrator.
At this point it is time for you to go back into the setuyp file
and change both the variables back to "no" so that noone else can
add themselves as an admin.
$auth_email_register determines if this script should mail the admin a
note when someone registers...I would leave this on so that you
have a double check that no one uninvited can use the admin
interface.
$auth_admin_from_address and $auth_admin_email_address help
mail-lib.pl send email notification on new registrations. If you
are having trouble with the mailing options, make sure you go into
mail-lib.pl and set the path to sendmail correctly. Also, if you
are using aliases, you need to remove the -t option.
$auth_session_length determines the number of days session files stay
around for. Session files are used to keep track of the
admin information while they are performing a mass mailing. The
script needs to be able to keep track of the admin from screen to
screen to make sure that she is validated. This info is kept in a
session file in the Session_files directroy. However, these files
are only needed "while" the admin is performing administratove
functions...which might only be 5 minutes. Thus, the scriopt
automatically deltes these files when they are no longer needed at
some time interval defined by you.
$auth_session_dir is the location of the session file directory.
$auth_register_message is the mesage that you will receive after
you have registered.
$auth_allow_search determines whether or not the search button will
appear on the registration screen...since you will have direct
access to the user file, you don't need this option at all.
Typically, it is used so that new registrants can search the user
file for usernames so that they will not pick one already in use.
$auth_generate_password if set to "on", directs the script
to generate a password for you. Leave it off if you want to
generate your own.
$auth_check_duplicates checks to make sure no one has already
used your username that you submit when creating a new validated
user. It is best to leave this "on", but there shouldn't be
so many valid users that you'd forget.
$auth_password_message is the message that is sent to the user
with their password if $auth_generate_password is set to on.
@auth_extra_fields and @auth_extra_desc are arrays used to keep track
of information in the user file besides username and password.
Leave this the way it is unless you really need to collect data on
admins.
output_html_query_form is responsible for printing out the query
form which users can submit their search criteria on. It
is not essential of course that you have the script produce the HTML
frontend search form. You can write your own HTML form provided
you make sure to hardcode the essential information that the
script will need to proces the request. For example, you must
make sure that the FORM tag points to db_search.cgi and that you
include the setup file information as a hidden variable. for
example:
<INPUT TYPE = "hidden" NAME = "setup_file"
VALUE = "address_book.setup">
Also, you must make sure that you include a submit button with the NAME
parameter set to "submit_search". For example:
<INPUT TYPE = "submit" NAME = "submit_search"
VALUE = "Submit These Search Parameters">
The VALUE parameter can be anything of course.
Further, you must include input fields so that the user can submit
search criteria. These fields must have NAME parameters which
correspond to %db and @sc_db_query_criteria.
Also, you may use the special form variable "sort_by" to give the
user the ability to choose which field they want to sort the returned
databse rows by. For example, the following SELECT list gives the
user several options.
<TH>Sort by which field</TH>
<TD><SELECT NAME = "sort_by">
<OPTION VALUE = "0">First Name
<OPTION SELECTED VALUE = "1">Last Name
<OPTION VALUE = "2">Email
<OPTION VALUE = "9">Age
</SELECT></TD>
</TR>
Notice that the VALUE parameter corresponds to the index of that
field in %db
Finally, be aware that all of these subroutines take advantage of
the qq method of printing such that the print delimiter is
changed to a tilde (~) instead of a quoite ("). This is done so
that it is easier to display HTML tags which use quotes
themselves. Otherwise we would have to backslash all quote marks
the same way we backslash at signs (@). Of course, all tildes
must be escaped with backslashes aasd a result, but tildes are
less common in HTML code.
search_results_header displays the header HTML of the search
results page. Make sure that if you want users to be able
to use a "see more hits" button, that you create a self
referential form tag pointing back to db_search.cgi.
If you are providing sort options, you must include a hidden form
field for sort_by so that you pass that information on to every
following screen created by this script.
Typically, you just need to include the following lines along
with your template HTML:
<FORM METHOD = "post" ACTION = "db_search.cgi">
<INPUT TYPE = "hidden" NAME = "session_file"
VALUE = "$session_file">
<INPUT TYPE = "hidden" NAME = "sort_by"
VALUE = "$index_of_field_to_be_sorted_by">
search_results_body defines the format of how matched databse rows
should be displayed to the user on the search results page. This
is a little bit more complicated than simply writing HTML code
within a print qq block. This routine basically is responsible
for printing out all the database rows so you need to carefully
define the format of rows. In the example setup file with this
distribution, we have commented our own setup so you can get the
idea of how you could achieve our own format.
search_results_footer displays the HTML footer for the search
results page. Note that if you want to provide a "See more hits"
button, you must include both the hits_seen and the setup_file
variables as hidden form fields.
no_hits_message displays the message the user receives if their
search turns up no hits.
Users is a subdirecotry containing user files with validated
users. The directory must be read, write and executable by the
web server and all user files must be readable and writable.
db_search.cgi is the main script for the application. It must be redable
and executable by the web server.
RUNNING THE SCRIPT
To run the script, simply point your browser to the main script with a URL
like the following:
http://www.foobar.com/cgi-bin/Database_seacrh/db_search.cgi?setup_file=address_book.setup
Don't forget that you need to pass the setup_file variable whenever you
call the script