Sea cucumber trawl logbooks

This guide orients scientists and other data users to California’s commercial sea cucumber trawl logbook data. These data are confidential and are provided only through authorized data sharing agreements. The guide aims to help users anticipate how the data are formatted and how they may need to be “cleaned” (i.e., processed) before analysis. I also highlight tricks and caveats gleaned from my experience using the data. Ultimately, I hope this guide clarifies what data are available and how they can be processed to maximize their utility.

Overview

The California Department of Fish and Wildlife (CDFW) has been compiling catch statistics on California’s fisheries since 1916. CDFW has required commercial sea cucumber trawl vessels to submit logbooks since 1982. These logbook data record start and end haul locations, time, depth, and duration of trawl tows, total catch by species market category, and gear used. Additional information is available here.

This guide provides an overview of California’s commercial sea cucumber trawl logbook data based on a data request processed in January 2024 for all sea cucumber trawl logbooks from 1982-2022. In this guide, I review the attributes of the logbook data, the steps required to clean the raw data, and some visualizations of non-confidential summaries of the data.

Data attributes

The sea cucumber trawl logbook data includes the following columns:

| Column | Description |
| --- | --- |
| LogSerialNum | Logbook id |
| FisherNum | Fisher id (e.g., L12345) |
| VesselNum | Vessel id (e.g., 12345) |
| VesselName | Vessel name |
| DepartureDate | Date of departure |
| LandingDate | Date of landing |
| PortCode | Port code |
| PortDesc | Port name |
| NetTypeDesc | Net type (single- or double-rigged) |
| HeadropeLength | Head rope length (ft) |
| OldYear | Year |
| DetailDate | Date of tow |
| DragNumber | Tow number |
| BlockNumber | Block id |
| SetTime | Time at start of tow (HH:MM:SS AM/PM) |
| SetDepth | Depth (fathoms) at start of tow |
| UpTime | Time at end of tow (HH:MM:SS AM/PM) |
| UpDepth | Depth (fathoms) at end of tow |
| TotalTime | Duration (minutes) |
| SpeciesCode | Species id |
| MarketCatDesc | Species name |
| TotalPounds | Catch (lbs) |
| DetailComments | Comments |
| SetLatDeg | Latitude degrees at start of tow |
| SetLatDec | Latitude minutes at start of tow |
| SetLongDeg | Longitude degrees at start of tow |
| SetLongDec | Longitude minutes at start of tow |
| UpLatDeg | Latitude degrees at end of tow |
| UpLatDec | Latitude minutes at end of tow |
| UpLongDeg | Longitude degrees at end of tow |
| UpLongDec | Longitude minutes at end of tow |
| SetLoranCx | LORAN-C x value at start of tow |
| SetLoranCy | LORAN-C y value at start of tow |
| SetLoranCw | LORAN-C w value at start of tow |
| UpLoranCx | LORAN-C x value at end of tow |
| UpLoranCy | LORAN-C y value at end of tow |
| UpLoranCw | LORAN-C w value at end of tow |
| SetLoranAmin | LORAN-A min at start of tow |
| SetLoranAmax | LORAN-A max at start of tow |
| UpLoranAmin | LORAN-A min at end of tow |
| UpLoranAmax | LORAN-A max at end of tow |

Attribute completeness

The figure below illustrates how consistently each attribute was filled out.

Data completeness through time

The following figure shows data completeness through time. The biggest challenge is that species and catch are not reported in the logbook data until 2010. I think this is a database error as opposed to the logbooks never having asked for this information pre-2010.

Number of logbooks over time

Attribute information and issues

Logbook id

The logbook id (“LogSerialNum”) is a unique identifier for each logbook page. Note that a trip could potentially span multiple logbook pages.

Vessel id and name

The vessel id (“VesselNum”) is a 5-digit unique identifier assigned to each vessel by CDFW. The vessel name (“VesselName”) is the name of the vessel. Missing values could be filled through examination of another dataset or by querying this resource.

Fisher id

The fisher id (“FisherNum”) is a 6-character unique identifier assigned to each captain by CDFW. It consists of the letter “L” followed by five digits (e.g., “L12345”). In this dataset, many ids are missing the leading “L”.
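The missing leading “L” can be restored with a small normalization step. The following is a minimal sketch, assuming ids arrive as strings and that a valid id is always “L” plus exactly five digits; the function name `normalize_fisher_id` is hypothetical and not part of any existing package.

```python
import re
from typing import Optional


def normalize_fisher_id(raw: Optional[str]) -> Optional[str]:
    """Return the id as 'L' + five digits, or None if it does not fit that pattern."""
    if raw is None:
        return None
    text = str(raw).strip().upper()
    # Accept an optional leading "L" followed by exactly five digits.
    match = re.fullmatch(r"L?(\d{5})", text)
    if match is None:
        # Anything else (blank, too short, stray characters) is left for manual review.
        return None
    return "L" + match.group(1)
```

Ids that do not fit the pattern are returned as `None` rather than guessed at, so they can be inspected by hand.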

Departure, landing, and tow dates

These columns describe the date of the tow (“DetailDate”) as well as the date of departure (“DepartureDate”) and landing (“LandingDate”). The date of departure and landing were not frequently reported in the early years of the logbooks.

Port code and name

The port of landing is identified using a 3-digit code (“PortCode”). I provide a key to relate these codes to their names and complexes here. The name of the port is provided in the “PortDesc” column. This information is missing from a large number of entries.

Head rope length (ft)

The head rope length (“HeadropeLength”) is provided in feet. Head rope lengths shorter than 40 ft or longer than 100 ft may be erroneous.

The head rope is the line at the top of the trawl mouth (see diagram below).

Net type

The net type (“NetTypeDesc”) is either single-rigged or double-rigged. The diagram below shows several types of trawl rigs (McHugh et al. 2017).

Comments

The comments (“Comments”) column includes a lot of useful information that would ideally be parsed into the database. For example, it includes lots of catch information that is missing from the database. I have not attempted this.

Tow number

The tow number (“DragNumber”) column describes the order of the tows.

Start time, end time, and tow duration

The start (“SetTime”) and end (“UpTime”) times are in 12-hour format (“HH:MM:SS AM/PM”), but I converted them to numeric times for analysis. The duration (“TotalTime”) column describes the duration of the fishing effort in minutes. The reported duration perfectly matches the duration derived from the reported start and end times.

Based on the shape of the distribution, I suspect that some of the zero (midnight) start and end times are “unknowns” rather than true midnight values. This may be diagnosable, but I haven’t explored it yet.
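The time conversion and duration check can be sketched as follows. This is an illustration only, assuming times are strings in the “HH:MM:SS AM/PM” format and that a tow whose end time precedes its start time crossed midnight; the function names are hypothetical.

```python
from datetime import datetime


def parse_clock(text: str) -> float:
    """Convert 'HH:MM:SS AM/PM' into minutes after midnight."""
    t = datetime.strptime(text.strip(), "%I:%M:%S %p")
    return t.hour * 60 + t.minute + t.second / 60


def tow_minutes(set_time: str, up_time: str) -> float:
    """Duration in minutes derived from the start and end clock times."""
    start = parse_clock(set_time)
    end = parse_clock(up_time)
    # Assumption: if the end time is earlier than the start time,
    # the tow ran past midnight, so wrap around a 24-hour day.
    return (end - start) % (24 * 60)


def is_midnight(text: str) -> bool:
    """Flag exact-midnight times, which may be placeholders for unknown values."""
    return parse_clock(text) == 0
```

Comparing `tow_minutes(SetTime, UpTime)` against “TotalTime” reproduces the consistency check described above, and `is_midnight` gives a simple way to count the suspicious zero times.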

Start and end depth

The depths at the start (“SetDepth”) and end (“UpDepth”) of the tow are reported in fathoms. Depths deeper than 250 fathoms might be typos.
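A minimal sketch of converting depths to meters and flagging suspicious values is shown below. The 250-fathom threshold comes from the text above; treating negative depths as suspect is my own assumption, and the function name is hypothetical.

```python
from typing import Optional, Tuple

FATHOM_TO_M = 1.8288          # 1 fathom = 6 ft = 1.8288 m
MAX_PLAUSIBLE_FATHOMS = 250   # threshold taken from the text above


def depth_check(fathoms: Optional[float]) -> Tuple[Optional[float], bool]:
    """Return (depth in meters, suspect flag) for a depth reported in fathoms."""
    if fathoms is None:
        return None, False
    meters = fathoms * FATHOM_TO_M
    # Assumption: negative depths are also treated as likely errors.
    suspect = fathoms > MAX_PLAUSIBLE_FATHOMS or fathoms < 0
    return meters, suspect
```

Flagging rather than deleting keeps the decision about possible typos with the analyst.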

Set and up location

The coordinates for the start and end of a tow are sometimes reported. They are reported mostly in GPS coordinates but some are reported in LORAN-C coordinates.

The GPS coordinates split the degrees and minutes portions of the coordinate and need to be merged. The longitudes also need to be multiplied by -1. Many of the coordinates are invalid or unrealistic. The unrealistic ones could be filtered by seeing if they fall within the reported block id.
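Merging the degree and minute columns can be sketched as follows. This assumes the “Dec” columns hold decimal minutes (0 to 60) and that longitudes are stored as positive values for the western hemisphere. The bounding box used to reject unrealistic points is a rough, hypothetical stand-in for the block-based check and should be adjusted as needed.

```python
from typing import Optional, Tuple


def to_decimal_degrees(degrees: Optional[float], minutes: Optional[float],
                       west: bool = False) -> Optional[float]:
    """Merge degrees and decimal minutes; negate the result for western longitudes."""
    if degrees is None or minutes is None:
        return None
    if not (0 <= minutes < 60):
        return None  # invalid minutes value
    value = abs(degrees) + minutes / 60
    return -value if west else value


def tow_start(lat_deg, lat_min, long_deg, long_min) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) in decimal degrees, or None if implausible."""
    lat = to_decimal_degrees(lat_deg, lat_min)
    lon = to_decimal_degrees(long_deg, long_min, west=True)
    if lat is None or lon is None:
        return None
    # Hypothetical bounding box around California waters (an assumption).
    if not (32.0 <= lat <= 42.5 and -126.0 <= lon <= -114.0):
        return None
    return lat, lon
```

For example, `tow_start(34, 30, 119, 45)` yields `(34.5, -119.75)`. The same functions apply to the “Up” columns for the end of the tow.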

I’ve written an R function to use this NOAA web tool to convert the LORAN-C coordinates to GPS coordinates. The R function is the ‘?loran_to_gps’ function in the wcfish R package.

Interestingly, many years have near 100% complete GPS coordinate data while other years are just missing these data entirely. It’s not totally clear to me why this is.

Species id and name

There are 53 species documented in the logbook catch.

The following five species are dominant: giant red sea cucumber, ridgeback prawn, unspecified sole, unspecified rock crab, and California lizardfish.

CDFW suggests that warty sea cucumber and unspecified sea cucumber are both likely to actually be giant red sea cucumber:

“Giant red sea cucumber (Apostichopus californicus), also known as California sea cucumber, is the primary target of the sea cucumber trawl fishery in California. Warty sea cucumber (Apostichopus parvimensis) is occasionally reported on trawl logs and is included in this data set. However, trawl data that includes warty sea cucumber is likely erroneous due to the shallower distribution of warty sea cucumber relative to where the trawl fishery operates. Unspecified sea cucumber that are reported on trawl logs are presumably mis-identified giant red sea cucumber as well.”
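If you choose to follow this suggestion, the recoding can be sketched as below. The species labels are hypothetical spellings; check them against the actual “MarketCatDesc” values in your extract. It is probably wise to store the result in a new column so the reported value is preserved.

```python
from typing import Optional

# Hypothetical labels; confirm the exact spellings used in the data.
TARGET = "Giant red sea cucumber"
LIKELY_MISIDENTIFIED = {"warty sea cucumber", "unspecified sea cucumber"}


def recode_species(market_cat_desc: Optional[str]) -> Optional[str]:
    """Map likely mis-identified sea cucumber categories to the target species."""
    if market_cat_desc is None:
        return None
    if market_cat_desc.strip().lower() in LIKELY_MISIDENTIFIED:
        return TARGET
    # All other categories are returned unchanged.
    return market_cat_desc
```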

Also note that species and catch are not recorded prior to 2010.

“The recording of a standardized species code and species name was not implemented until February 2010 as indicated in the data fields, “SpeciesCode” and “MarketCatDesc.” The species associated with trawl deployment prior to February 2010 reported in these data fields are completely missing; however, notes provided in the data fields “Comments” and “DetailComments” may be used as an alternative for some records to determine the target species. Although using these comment fields to determine species is challenging since there are many inconsistencies in how species are reported (i.e., cucumber, cuke, CQ, species code 755, etc.).”
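A first pass at flagging comments that mention sea cucumber could look like the sketch below. It only uses the variants named in the quote (cucumber, cuke, CQ, and species code 755) and will miss other spellings. A bare “755” could also be a weight rather than a species code, so matches should be reviewed rather than trusted.

```python
import re
from typing import Optional

# Variants listed in the CDFW note; extend as other spellings are found.
CUCUMBER_PATTERN = re.compile(
    r"\bcucumbers?\b|\bcukes?\b|\bcq\b|\b755\b",
    re.IGNORECASE,
)


def mentions_cucumber(comment: Optional[str]) -> bool:
    """Return True if a free-text comment appears to mention sea cucumber."""
    if not comment:
        return False
    return CUCUMBER_PATTERN.search(comment) is not None
```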

Catch (lbs)

The catch (“TotalPounds”) is recorded in pounds (lbs). The condition of the catch, i.e., whether it is cut (eviscerated) or whole, is not recorded. Also, fishermen will sometimes report the total catch for the day rather than the catch for a specific tow. In these cases, CDFW evenly divides the catch among tows and notes this in the “Comments” field.
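Because some per-tow values are an even split of a daily total, one cautious option is to aggregate catch back to the vessel-day level before analysis. The sketch below assumes each record is a dictionary keyed by the column names in the table above. Grouping by vessel, tow date, and species is my own choice for illustration, not a CDFW procedure.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def daily_catch(rows: Iterable[dict]) -> Dict[Tuple[str, str, str], float]:
    """Sum TotalPounds by (VesselNum, DetailDate, MarketCatDesc)."""
    totals: Dict[Tuple[str, str, str], float] = defaultdict(float)
    for row in rows:
        pounds = row.get("TotalPounds")
        if pounds is None:
            continue  # e.g., records before 2010 with no catch reported
        key = (row["VesselNum"], row["DetailDate"], row["MarketCatDesc"])
        totals[key] += pounds
    return dict(totals)
```

Daily totals are unaffected by how a day's catch was allocated among tows, which makes them a safer unit when the split is artificial.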

As noted above, catch is not reported before 2010.