The plots below are prepared in the following manner.
Scan historical database. The collected CSV-format egg data summaries (which are available in the http://noosphere.princeton.edu/data/eggsummary/ directory as basketdata-YYYY-MM-DD.csv.gz files) are read. Each record in these files holds the results from every egg reporting during one individual second of the day which the file represents.
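As a rough illustration of the scanning step, the Python sketch below walks a directory of daily summary files in date order and yields their rows. It is not the Project's own program: the function name `read_summary_records` and the `directory` argument are hypothetical, it assumes the files have already been copied to a local directory, and because the record layout is not described here it returns each row as an uninterpreted list of strings.

```python
import csv
import glob
import gzip
import os


def read_summary_records(directory):
    """Yield (file name, row) for every CSV row in the daily summary files.

    Files are visited in name order, which is also date order because the
    names follow the basketdata-YYYY-MM-DD.csv.gz pattern.
    """
    pattern = os.path.join(directory, "basketdata-*.csv.gz")
    for path in sorted(glob.glob(pattern)):
        # "rt" decompresses and decodes to text so csv.reader can parse it.
        with gzip.open(path, "rt", newline="") as handle:
            for row in csv.reader(handle):
                yield os.path.basename(path), row
```

Yielding rows one at a time keeps memory use flat, which matters when every second of every day is a separate record.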
Transform date and time. Time in the historical database is given in Universal (Greenwich Mean) time, so a record represents (to the precision of time alignment across the network) a simultaneous one-second interval in which all eggs took the samples present in the record. To explore dependence on other measures of time (for example, local mean solar time, which correlates with the day-night cycle at the egg site), the Universal time in the record must be transformed, for each egg's data in the record, to the desired time. For example, to analyse data by local time, Universal time would be adjusted by the time zone offset (or, for local mean solar time, the longitude difference) between the egg site and the prime meridian. An auxiliary database of egg site locations is used to perform this transformation. The transformation may place the sample in a different day, but since we're interested only in diurnal correlations here, only the time of day is relevant.
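The longitude adjustment can be sketched in a few lines of Python. This is an illustration only, not the code in the Project's report program: the function name and its arguments are hypothetical, and the longitude is assumed to be given in degrees, positive east of the prime meridian. Since the Earth turns 360 degrees in 24 hours, each degree of longitude corresponds to 240 seconds of local mean solar time.

```python
SECONDS_PER_DAY = 86400
SECONDS_PER_DEGREE = SECONDS_PER_DAY / 360.0  # 240 seconds of time per degree


def to_local_mean_solar_seconds(utc_seconds_of_day, longitude_deg_east):
    """Convert a Universal time of day (seconds) to local mean solar time.

    longitude_deg_east is assumed positive east of the prime meridian and
    negative west of it.
    """
    offset = longitude_deg_east * SECONDS_PER_DEGREE
    # Wrap into [0, 86400): the sample may land in a different day, but
    # only the time of day matters for a diurnal analysis.
    return (utc_seconds_of_day + offset) % SECONDS_PER_DAY
```

For a site at 75 degrees west, 01:00 Universal time (3600 seconds) becomes 72000 seconds, that is 20:00 local mean solar time on the previous day.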
Bin by time. The transformed time for each egg's result in the record is "binned" by assigning it to a 15 minute interval of the day (the program which produces the reports can use any binning interval; the plots below use the default of 15 minutes). Per-bin counters of the number of trials and the number of one bits are updated with the data from each egg reporting in the record.
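A minimal Python sketch of the binning step follows. The names `make_bins` and `add_to_bin` are hypothetical, and the sketch makes two simplifying assumptions: the binning interval divides the day evenly, and the caller supplies the trial and one-bit increments for each egg's result, since exactly how the report program counts them is not spelled out here.

```python
SECONDS_PER_DAY = 86400
DEFAULT_BIN_SECONDS = 15 * 60  # the 15 minute default used for the plots


def make_bins(bin_seconds=DEFAULT_BIN_SECONDS):
    """Create one pair of zeroed counters per bin of the day."""
    if SECONDS_PER_DAY % bin_seconds != 0:
        # Simplifying assumption of this sketch, not a rule from the report.
        raise ValueError("bin interval must divide the day evenly")
    count = SECONDS_PER_DAY // bin_seconds
    return [{"trials": 0, "ones": 0} for _ in range(count)]


def add_to_bin(bins, time_of_day_seconds, trials, ones,
               bin_seconds=DEFAULT_BIN_SECONDS):
    """Add one egg's result to the bin containing the transformed time."""
    index = int(time_of_day_seconds % SECONDS_PER_DAY) // bin_seconds
    bins[index]["trials"] += trials
    bins[index]["ones"] += ones
    return index
```

With the default interval there are 96 bins, and a transformed time of 900 seconds falls in bin 1, the interval from 00:15 to 00:30.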
Store and plot aggregate results. After all data in the database have been examined, the aggregate data for each bin is written to a CSV file in the following format:
Field | Contents |
---|---|
1 | Bin start time in seconds |
2 | One bits in trials in this bin |
3 | Trials in this bin |
4 | Mean for this bin |
The "Mean for this bin" field is simply the number of one bits divided by the number of trials, then multiplied by the standard 200 bits per trial; it is provided for convenience only and may be recomputed from the contents of fields 2 and 3. The time in field 1 is the transformed time selected for the individual report. Links below the charts allow you to download the data from which they were prepared.
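The four-field layout and the mean calculation can be illustrated with the short Python sketch below. It is not the Project's program: the names `bin_mean` and `write_summary` are hypothetical, and printing the mean to six decimal places and reporting zero for a bin with no trials are assumptions of this sketch.

```python
import csv

BITS_PER_TRIAL = 200  # the standard number of bits per trial


def bin_mean(ones, trials):
    """Field 4: one bits divided by trials, scaled to 200 bits per trial."""
    if trials == 0:
        return 0.0  # assumption: report zero for a bin with no data
    return ones / trials * BITS_PER_TRIAL


def write_summary(bins, out, bin_seconds=900):
    """Write one CSV row per bin: start time, one bits, trials, mean.

    bins is a sequence of (ones, trials) pairs ordered by time of day.
    """
    writer = csv.writer(out)
    for index, (ones, trials) in enumerate(bins):
        mean = bin_mean(ones, trials)
        writer.writerow([index * bin_seconds, ones, trials, f"{mean:.6f}"])
```

For example, a first bin holding 503 one bits in 1000 trials is written as `0,503,1000,100.600000`.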
This CSV summary file is then read by a small program which generates input for gnuplot to create the charts in PPM format, which are then translated to GIF with utilities from the PBMplus toolkit.
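The Python sketch below gives an idea of what generating gnuplot input from the summary file might look like. It is a guess at the general shape, not the Project's plotting program: the function name, the file names passed to it, the axis labels, and the choice to plot hours on the horizontal axis are all assumptions. It relies on gnuplot's `pbm` terminal in colour mode, which writes PPM output, and on `set datafile separator` to read comma-separated fields.

```python
def gnuplot_script(csv_path, ppm_path, title):
    """Return gnuplot commands that chart field 4 (mean) against field 1."""
    commands = [
        "set terminal pbm color",        # colour PBM output, i.e. PPM
        f'set output "{ppm_path}"',
        'set datafile separator ","',    # the summary file is CSV
        f'set title "{title}"',
        'set xlabel "Time of day (hours)"',
        'set ylabel "Mean"',
        "set xrange [0:24]",
        # Field 1 is the bin start time in seconds; convert it to hours.
        f'plot "{csv_path}" using ($1/3600):4 with lines notitle',
    ]
    return "\n".join(commands) + "\n"
```

The returned text could be saved to a file and passed to gnuplot; the resulting PPM file would then go through a PBMplus conversion to GIF as a separate step.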
Database starts: | 1998-08-05 |
---|---|
Report ends: | 1998-11-01 |
Total trials: | 7,115,859,000 |
Total ones: | 3,557,960,649 |
Aggregate mean: | 100.000875 |
Binning by local sidereal time aggregates data based on the apparent position of celestial objects beyond the solar system at each individual egg site. For example, at about 17:42 local sidereal time the centre of the Milky Way galaxy will transit the meridian above an egg site.
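For readers who want to experiment with sidereal binning, the Python sketch below converts a Universal time to approximate local mean sidereal time. It is not taken from the Project's report program, whose formula is not given here: the function name and arguments are hypothetical, it uses the widely published linear approximation for Greenwich mean sidereal time relative to the J2000.0 epoch, it treats UTC as if it were UT1, and it ignores the small nutation correction that separates apparent from mean sidereal time.

```python
UNIX_EPOCH_JD = 2440587.5  # Julian date of 1970-01-01 00:00:00 UTC
J2000_JD = 2451545.0       # Julian date of 2000-01-01 12:00:00


def local_sidereal_seconds(unix_time, longitude_deg_east):
    """Approximate local mean sidereal time of day, in seconds.

    unix_time is seconds since 1970-01-01 00:00:00 UTC; longitude is
    assumed positive east of the prime meridian.
    """
    days_since_j2000 = unix_time / 86400.0 + UNIX_EPOCH_JD - J2000_JD
    # Linear approximation to Greenwich mean sidereal time, in hours.
    gmst_hours = 18.697374558 + 24.06570982441908 * days_since_j2000
    # 15 degrees of longitude correspond to one hour of sidereal time.
    lst_hours = (gmst_hours + longitude_deg_east / 15.0) % 24.0
    return lst_hours * 3600.0
```

The result is a time of day in seconds, so it can be binned in 15 minute intervals in the same way as the solar times.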
TransformTime subroutine in longwave.pl to perform whatever transformation you wish, and run the report over the collected CSV databases on the Project server.