Working with script variables, including command line arguments



Last revision August 6, 2004

To write general-purpose scripts that can be used with different files or under different circumstances, you need variables. A variable in a shell script, as in any programming language, is like a variable in an algebraic expression: it is simply a name that stands for a value that can change. In shell scripts, you can create variables and set their values in several ways:

  • Command line arguments typed after the script name when you run it.
  • Simple set statements within the script itself, for example, to initialize a value that may change later, or to gather all the values that you might want to change into a single list of parameters at the top of the script.
  • set statements that follow a test of some kind: if the test gives one result, the variable is set to one value; if it gives another result, the variable is set to a different value.
  • Arithmetic computations that modify the value of a variable.
  • A command substitution that runs another command within the script and captures its output to be the value of a variable.

You can use these variables as parts of other commands that you run from within the shell script: as the list of options, or the names of files to be affected, etc.
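The five methods listed above can be sketched in one short script. This is only an illustrative sketch: the variable names (infile, count, status_msg, today) are made up for the example, and the @ command and backquote syntax it uses are standard csh features that are not otherwise covered in this section.

```csh
#!/bin/csh -f
# Hypothetical script: all variable names here are illustrative only.

# 1. Command line argument: the first word typed after the script name.
set infile = "$1"

# 2. Simple set statement that initializes a value.
set count = 0

# 3. set statements that depend on the result of a test (does the file exist?).
if (-e "$infile") then
    set status_msg = "found"
else
    set status_msg = "missing"
endif

# 4. Arithmetic: the @ command evaluates an expression and stores the result.
@ count = $count + 1

# 5. Command substitution: backquotes capture the output of the date command.
set today = `date +%Y-%m-%d`

echo "$infile is $status_msg; count is $count; run on $today"
```

The last line shows the variables being used as parts of another command, in this case echo.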

Specific C-shell implementations may have limits on the size of variables. The csh program on pangea, which runs Tru64 Unix v4.0g, apparently has no limit on the total size of a variable (total number of bytes of data you can assign to a variable), but does limit individual words within the variable's data to be no more than 1024 bytes each. Here, a word is defined as a string of contiguous characters that includes no blank or tab characters (no "white space"). That is, the contents of the variable are broken into words wherever a blank or tab is found.

Command line arguments to scripts

When you start a script from your interactive login shell, you can provide arguments to that script on the command line. These are automatically turned into variables that can be used inside the script.

If the command line contains filename wildcard characters, variable substitution references, or command substitution references, those are expanded or substituted first. Then the command line string is broken into separate arguments at blanks, except that a quoted string can contain embedded blanks.

You refer to these arguments as separate variables within the script itself by using the dollar sign (variable substitution operator) followed by an integer number, for example,
      cp $1 $2

This statement inside a shell script would run the cp program with the first "argument" to the shell script (first word on the command line that started the shell script) passed as the name of the file to copy via $1, and the second argument to the shell script passed as the name of the new copy via $2.

The entire list of command line arguments can be referenced as one string with the syntax
     $*
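As a rough sketch of how a script might check its arguments before using them, the example below assumes a hypothetical script that expects exactly two arguments. It relies on the fact that csh also keeps the arguments in a variable named argv, so $#argv gives the number of arguments.

```csh
#!/bin/csh -f
# Hypothetical script; intended usage: scriptname source destination

# $#argv is the number of command line arguments.
if ($#argv != 2) then
    echo "Usage: $0 source destination"
    exit 1
endif

# $* is the whole argument list as one string.
echo "All arguments: $*"

# Quoting $1 and $2 keeps a file name with embedded blanks as one word.
cp "$1" "$2"
```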

Specific C-shell implementations may impose limits on the number or size of arguments that can be passed to a shell script. The csh program on pangea uses a memory area of 38,912 bytes in length to store the expanded list of arguments (after wildcard filename matching or variable or command substitution is done) that can be passed to a shell script, or indeed, to any program that is started by the shell. Environment variables are also stored in this same memory area, so if you have many environment variables, you reduce the total length of an argument list that you can use. You can see how many total bytes are used by your environment variables with this command:
      printenv | wc -c
A typical pangea user will use 500 to 1000 bytes for environment variables, thus reducing the maximum size of the complete argument list for a shell script or other command by that amount.

Making and setting your own variables in a script

In addition to the command line arguments, the shell maintains a table of other user-created or special purpose variables in memory. Each variable has a name and a value.

  • Names can be up to 20 letters or digits and must start with a letter (the underscore counts as a letter). Case matters: name and Name are different variables.
  • Values are strings of characters or digits of arbitrary length without any intrinsic "type". They are treated as character strings or numeric values, depending upon how they are used.

It is also possible to treat any variable as an array of words and access each word separately (see detailed documentation on the C-shell).

Certain variable names are reserved by the shell for special uses, such as path or term.

You can create any number of variables.

Use the set command to create/assign variables, for example:

    set name=single_word
    set name=(word list)
    set name="string with embedded blanks"

A set command with no value creates the variable with a null value. Such a variable acts as a flag that is "on": testing for its existence gives "true" in a logical expression. For example:
     set optionflag

The unset command removes a variable completely from memory, for example:
     unset name
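A short sketch that puts these forms together, written as lines in a script. The variable names are hypothetical, and the comments describe what each echo is expected to print.

```csh
set fruit = apple                      # single word
set colors = (red green blue)          # word list
set title = "annual report for 2004"   # one string with embedded blanks
set verbose                            # flag variable with a null value

echo $fruit        # apple
echo $colors       # red green blue
echo $title        # annual report for 2004
echo $?verbose     # 1, because verbose exists

unset verbose
echo $?verbose     # 0, because verbose no longer exists
```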

Using variables in the script

"Variable substitution" is the process of replacing a reference to the name of a variable with its actual value. This is how we use variables.

The dollar sign ($) is the basic substitution operator when it is used as the prefix for a variable name. Any time you use the dollar sign as the first character of a word in a shell command, the shell expects the rest of the word to be the name of a variable. If you want the dollar sign to be interpreted as just a simple dollar sign, precede it with the backslash (\) "escape" character. Here are the basic formats for variable substitution:

     $?name
This tests whether the name variable exists. If the variable does exist, the shell substitutes the value 1 (one, true); if not, the value 0 (zero, false). Use this form if you are just using the variable as a flag. The result can be used in an if statement to conditionally execute some commands.
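For instance, a script might turn a flag on only under some condition and test it later. In this sketch the flag name debug, and the convention that the word "debug" as the first argument turns it on, are assumptions made up for the example.

```csh
#!/bin/csh -f
# Hypothetical convention: run the script with "debug" as its first argument.
if ("$1" == "debug") set debug

# $?debug is 1 if the variable exists and 0 if it does not.
if ($?debug) then
    echo "debug flag is set"
else
    echo "debug flag is not set"
endif
```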

     $name
This form causes the entire word list value of name to be substituted for the reference. If name is not defined (was never set), you get an error.

     $#name
This substitutes the number of words contained within the name variable. If the variable has a null value (that is, simply set as a "flag" variable), it substitutes zero. If the variable has never been set, you get an error.

     $name[n]
This substitutes the "nth" word (blank separated value) from the name variable. The square brackets are required to enclose the value n that specifies which word is wanted, and must follow the variable name with no intervening spaces. This is a way to treat a variable containing a multi-word value as an array of separate words. If you specify a word index value n that is greater than the actual number of words in the variable, you get an error.
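The sketch below shows word indexing on a hypothetical variable named days. Words are numbered starting at 1, and csh also accepts a range of indexes inside the brackets. Comparing the index with $#days first is one way to avoid the out-of-range error.

```csh
set days = (Mon Tue Wed Thu Fri)

echo $#days        # number of words: 5
echo $days[1]      # first word: Mon
echo $days[5]      # last word: Fri
echo $days[2-3]    # a range of words: Tue Wed

# Guard against an index that is out of range before using it.
set n = 7
if ($n <= $#days) then
    echo $days[$n]
else
    echo "days has only $#days words"
endif
```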

Examples:

     set a = ($b)
Sets new variable a equal to the word list in existing variable b.

     echo $b
Echoes (prints) the value of existing variable b to the standard output (terminal).
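To see why the parentheses in the first example matter, compare them with double quotes. In this sketch (variable names are illustrative), the parentheses keep the words separate, while the double quotes join them into a single word.

```csh
set b = (one two three)

set a = ($b)       # a is a list of three separate words
echo $#a           # 3

set c = "$b"       # c is one word containing embedded blanks
echo $#c           # 1

echo $a[2]         # two
```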
