MAST667-010 Coastal Oceanography: From Physics to Fish

Spring 2001

Instructor Andreas Münchow

Workshop #2 (Mar. 9, 2001)

The purpose of this workshop is to familiarize you with a low-level programming environment like DOS or UNIX where programs and files are written, stored, and run. All tasks could also be performed in more complex environments such as Windows 2000 or similar; however, the required software overhead can be excessive and prone to error. This and subsequent exercises will expose you to both UNIX and C-like programming. You will learn a robust scripting language (AWK or NAWK) that is ideally suited for simple data and text string manipulations.

The specific task today is to automate data preparation and to merge sealevel data from stations along Delaware Bay. Hence each line in your final file all.dat should represent h (x,t=const) while each column should represent h (x=const,t). In a future homework (Monday) you will be asked to interpret the data physically for the effects of resonance, friction, and the earth’s rotation. You shall learn here that you can call programs from within programs, that you can pass arguments/variables from file to file, and that you can adapt a set of scripts quickly to run on a large number of data files with little additional effort.

1. Open a DOS (Disk-Operating-System) window and make a subdirectory MAST667 on the D-drive that contains subdirectories called "software" and "data", respectively.

2. Copy software files from your floppy disk to the hard-drive; specify the location (path) of the software that you will use subsequently from the data or other directories:

>path=d:\mast667\software

3. Open a Netscape window, download sealevel data along the Delaware Estuary from Lewes, Reedy Point, Philadelphia, and Cape May for March 1-8, 2001. Save the data as ASCII text files in your data directory. Plot the graph of each time series from within (view graph).

4. Write a fully automated script, consisting of ".awk" programs and ".bat" control files, that strips both header and empty lines from each data file. Your prior "getdata.bat" is a good start; think hard about the "pattern matching" ability of awk. Recall that each sealevel station has a unique station identifier as its first column.

HINT: You can pass arguments from the .bat file to the awk script, for example,

>awk -f getdata.awk ID=9999 input >output1

passes the value 9999 as a variable called ID into the awk script where it could be used, e.g.,

--- { if ($1==ID)

{ print

} }
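A minimal sketch of this pattern, run from a UNIX shell. The file layout (station ID, date, time, sealevel), the sample values, and the file name sample.txt are all made up for illustration; your downloaded files may look different. Note that the variable name inside the awk program must match the name given on the command line, including case. On DOS the program text would normally live in a .awk file as above, which sidesteps quoting differences between shells.

```shell
# Work in a scratch directory so no real files are touched.
tmp=$(mktemp -d)

# Hypothetical raw file: two header lines, an empty line, three data lines.
cat > "$tmp/sample.txt" <<'EOF'
Station Date Time WL
------- ---- ---- --

9999 20010301 00:00 1.25
9999 20010301 00:06 1.31
8888 20010301 00:00 0.90
EOF

# ID is assigned on the command line and is set before sample.txt is read.
# Only lines whose first field equals ID are printed, so header and
# empty lines are dropped without any extra code.
filtered=$(awk '$1 == ID { print }' ID=9999 "$tmp/sample.txt")
echo "$filtered"
```

Only the two lines that start with 9999 survive; the header lines, the empty line, and the other station are skipped.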

5. You can automate this further by passing the variable from the DOS command into the getdata.bat file, that is, you call your program as

getdata 9999

where the argument 9999 is passed into getdata.bat as a text string and is available there as the variable %1:

>awk -f getdata.awk ID=%1 input >output1

HINT: You could also use the %1 text string for other purposes such as input/output filenames!
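The same idea can be sketched on a UNIX shell, where "$1" plays the role of %1 in the DOS batch file. The function name getdata, the sample data, and the output naming scheme (station ID plus ".out") are assumptions made only for this illustration.

```shell
# Scratch directory with a hypothetical input file holding two stations.
tmp=$(mktemp -d)
cat > "$tmp/input" <<'EOF'
9999 00:00 1.25
9999 00:06 1.31
8888 00:00 0.90
EOF

# getdata stands in for getdata.bat: its first argument is used both as
# the awk variable ID and as part of the output file name.
getdata() {
    awk '$1 == ID' ID="$1" "$tmp/input" > "$tmp/$1.out"
}

getdata 9999
getdata 8888
ls "$tmp"
```

One call per station is then enough to produce one output file per station.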

6. Make sure that each clean data file contains the number of data points N as well as N lines of data. Here N should be 1924, as the time step between two readings is constant at 6 minutes. Write this number automatically as the first line of a new "clean" data file along with its station identifier.

HINT: This task can be done either from within a single awk script, by assembling a data array, counting its elements as you go, and then writing everything to file in the END block, OR much more briefly with two separate commands such as

>awk 'END{print NR}' output1 >clean

>awk '{print}' output1 >>clean

where the redirection ">>clean" indicates that you are appending to the file "clean."
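For comparison, here is a sketch of the single-script variant that stores the lines in an array and writes everything from the END block. The three sample lines are hypothetical, and the choice to print the station identifier before N on the first line is only one possible layout.

```shell
# Scratch directory with a hypothetical, already filtered file.
tmp=$(mktemp -d)
printf '9999 00:00 1.25\n9999 00:06 1.31\n9999 00:12 1.40\n' > "$tmp/output1"

# Store every line, remember the station ID, then write the header line
# (ID and line count) followed by the stored lines.
awk '{ line[NR] = $0; id = $1 }
     END { print id, NR
           for (i = 1; i <= NR; i++) print line[i] }' "$tmp/output1" > "$tmp/clean"

cat "$tmp/clean"
```

The two-command version is shorter, but the array version writes the file in one pass and makes it easy to put the identifier on the same line as N.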

7. Compute the mean sealevel for each station and write this mean to a file called "remove.bat",

--- { ... } # calculate the sum of sealevel values

--- END{ ... # calculate the average from the sum of sealevel values

--- print "awk -f remove.awk avg=" avg " output1 >output2"

--- }

In effect you are using a program to generate a control file automatically that you run subsequently with the added variable "avg" representing the mean sealevel, that is, remove.bat should look like

>awk -f remove.awk avg=999 output1 >output2
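One way this program-writes-program step might look, run from a UNIX shell. It assumes, purely for illustration, that the sealevel value is the last field on each line ($NF); check your own files before relying on that. The sample values are made up so that the mean is easy to verify by hand.

```shell
# Scratch directory with hypothetical data whose mean is exactly 2.
tmp=$(mktemp -d)
printf '9999 00:00 1.0\n9999 00:06 2.0\n9999 00:12 3.0\n' > "$tmp/output1"

# Sum the last field of every line; in the END block compute the mean
# and print the command line that remove.bat should contain.
awk '{ sum += $NF }
     END { if (NR > 0) {
               avg = sum / NR
               print "awk -f remove.awk avg=" avg " output1 >output2"
           } }' "$tmp/output1" > "$tmp/remove.bat"

cat "$tmp/remove.bat"
```

The NR > 0 guard simply avoids a division by zero if the input file happens to be empty.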

8. Write a program "remove.awk" that subtracts variable "avg" from each value of sealevel.

HINT: This script should not contain more than a single line of code.
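A possible one-line form, shown inline on a UNIX shell rather than in a separate remove.awk file. As a labeled assumption, the sealevel value is taken to be the last field ($NF); the sample data and the value avg=2 are hypothetical.

```shell
# Scratch directory with hypothetical data whose mean is 2.
tmp=$(mktemp -d)
printf '9999 00:00 1.0\n9999 00:06 2.0\n9999 00:12 3.0\n' > "$tmp/output1"

# Subtract avg from the last field of each line and print the result.
awk '{ $NF = $NF - avg; print }' avg=2 "$tmp/output1" > "$tmp/output2"

cat "$tmp/output2"
```

After the subtraction the values in the last column sum to zero, which is a quick check that the right mean was removed.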

9. Run the newly generated remove.bat from within your getdata.bat file by including the line

>call remove.bat

10. Create one clean file for each of the four stations and, finally, string them together with the "paste" command, that is,

>paste Cape-May Lewes Reedy Philli >all.dat

where Cape-May, Lewes, Reedy, and Philli represent the clean files for each station.
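A small sketch of what paste does, using four hypothetical two-line files that hold one sealevel value per line (the numbers are invented). Each output line then holds the values of all stations at one time, and each column holds one station's time series.

```shell
# Scratch directory with four tiny hypothetical "clean" files.
tmp=$(mktemp -d)
printf '0.1\n0.2\n' > "$tmp/Cape-May"
printf '1.1\n1.2\n' > "$tmp/Lewes"
printf '2.1\n2.2\n' > "$tmp/Reedy"
printf '3.1\n3.2\n' > "$tmp/Philli"

# paste joins corresponding lines of its input files, separated by tabs.
( cd "$tmp" && paste Cape-May Lewes Reedy Philli > all.dat )

cat "$tmp/all.dat"

# A single station's time series is one column, e.g. column 2 for Lewes.
lewes=$(awk '{ print $2 }' "$tmp/all.dat")
```

Because paste matches files line by line, all four clean files need the same number of lines in the same time order for the rows of all.dat to be meaningful.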