California dive fishery logbooks

The purpose of this guide is to orient scientists and other data users to California’s commercial dive fishery logbook data. These data are confidential and are only provided through authorized data sharing agreements. This guide seeks to help scientists and other data users anticipate how these data are formatted and how they may have to “clean” (a.k.a., process) the data before they can be used in analysis. I also seek to highlight special tricks and caveats gleaned from my experience using the data. Ultimately, I hope this guide will help clarify what data are available and how these data can be processed to maximize their utility.

Overview

Since the 1992-1993 fishing season, commercial dive fishermen have been required to complete and submit logbooks detailing daily dive activities. The logbooks record the date and location of fishing, the length of dives, and the amount of catch.

An example logbook is available here. Additional information is available here.

This guide provides an overview of California’s commercial dive logbook data based on a data request processed in August 2022 for all dive logbooks from 2000-2020 (only sea cucumber data were shared; the urchin data were not ready to be shared). In this guide, I review the attributes of the dive logbook data, the steps required to clean the raw data, and some visualizations of non-confidential summaries of the logbook data.

Data attributes

The dive logbook data include the following columns:

Column                Description
LogSerialNum          Logbook id
LogGroupYear          Year
LogGroupMonth         Month
LogDateString         Date
Vessel                Vessel id and name (ID – NAME)
PermiteeID            Fisher id
PermitteeName         Fisher name (FIRST – LAST)
DiveNumber            Dive number
CDFWBlock             Block id
Position              Location
LatitudeDec           Latitude (°N)
LongitudeDec          Longitude (°W)
Landmark              Landmark
MinDepthFeet          Minimum depth (ft)
MaxDepthFeet          Maximum depth (ft)
DiverHours            Hours of diving
PortCode              Port id
DealerID              Dealer id
DealerName            Dealer name
Remarks               Comments
DiveSpeciesPounds     Species code and catch (ID | CATCH)
SpeciesCode           Species id
PoundsHarvested       Catch (lbs)
LandingReceiptNumber  Landing receipt id

Attribute information and issues

Data completeness

The receipt id, dive number, lat/long, and remarks are not commonly reported. The species, catch, and dealer are also not always reported. The remaining attributes are often reported.

Logbook id

A single logbook can document fishing trips on different days and/or at different locations.

I created a unique id for each dive by combining the following: logbook id, date, vessel id, fisher id, port id, block id, minimum depth, and maximum depth.
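One way to build such an id is sketched below in plain Python. This is only an illustration of the idea, not the code used to prepare the data: the separator, the "NA" placeholder, the example values, and the "VesselID" field (a hypothetical name for the id split out of the raw "Vessel" column) are all assumptions, and the column names may need adjusting to match the file you receive.

```python
# Fields combined into the dive id, in the order listed in the text.
# "VesselID" is a hypothetical column holding the id split out of "Vessel".
DIVE_ID_FIELDS = [
    "LogSerialNum", "LogDateString", "VesselID", "PermitteeID",
    "PortCode", "CDFWBlock", "MinDepthFeet", "MaxDepthFeet",
]


def make_dive_id(row, sep="|"):
    """Join the identifying fields of one logbook row into a single string id."""
    parts = []
    for field in DIVE_ID_FIELDS:
        value = row.get(field)
        # Use "NA" for missing values so every id has the same number of parts.
        parts.append("NA" if value is None or value == "" else str(value).strip())
    return sep.join(parts)


# Made-up example row (not real logbook values).
example_row = {
    "LogSerialNum": "D0001", "LogDateString": "2015-06-01", "VesselID": "12345",
    "PermitteeID": "L00001", "PortCode": 123, "CDFWBlock": 654,
    "MinDepthFeet": 30, "MaxDepthFeet": "",
}
print(make_dive_id(example_row))
```

A separator that cannot occur inside the fields (here "|") keeps the id unambiguous when dates contain hyphens.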

Vessel id and name

The “Vessel” column provides both the vessel id and name, separated by a dash (e.g., “12345 – FISHING BOAT”). I split this information into two columns.

A few small edits were necessary to harmonize the vessel ids and names throughout the data (e.g., converting to upper case, filling in a missing vessel name, etc.).

In the end, vessel id is a unique identifier, but vessel name is not (AGAPE and PENGUIN are both associated with two vessels).
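The split and the uniqueness check described above could look roughly like the following sketch. It assumes the delimiter is the first hyphen or en dash in the string and that upper-casing is the only harmonization needed; the function names and example values are made up for illustration.

```python
import re
from collections import defaultdict


def split_vessel(raw):
    """Split "ID – NAME" into (vessel_id, vessel_name), upper-casing the name."""
    # Split on the first hyphen or en dash, ignoring surrounding spaces.
    parts = re.split(r"\s*[-–]\s*", raw.strip(), maxsplit=1)
    vessel_id = parts[0]
    vessel_name = parts[1].upper() if len(parts) > 1 and parts[1] else None
    return vessel_id, vessel_name


def names_with_multiple_ids(raw_values):
    """Return vessel names that are linked to more than one vessel id."""
    ids_by_name = defaultdict(set)
    for raw in raw_values:
        vessel_id, vessel_name = split_vessel(raw)
        if vessel_name is not None:
            ids_by_name[vessel_name].add(vessel_id)
    return {name: sorted(ids) for name, ids in ids_by_name.items() if len(ids) > 1}


# Made-up example values.
vessels = ["12345 – FISHING BOAT", "12345 – Fishing Boat",
           "67890 – FISHING BOAT", "11111 –"]
print(names_with_multiple_ids(vessels))
```

Names that come back from `names_with_multiple_ids` are the ones that cannot serve as unique identifiers on their own.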

Fisher id and name

In general, the “PermitteeID” column provides the fisher license numbers, which are all supposed to begin with the letter “L”. However, many are missing the “L”, so harmonizing the data requires adding an “L” to some ids.
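A minimal sketch of this fix, assuming the only problems are a missing "L", stray whitespace, and inconsistent case (the function name is hypothetical):

```python
def harmonize_fisher_id(raw):
    """Return the license number with a leading "L", or None if it is missing."""
    if raw is None:
        return None
    value = str(raw).strip().upper()
    if not value:
        return None
    # Add the "L" prefix only when it is not already there.
    return value if value.startswith("L") else "L" + value


print(harmonize_fisher_id("12345"))
print(harmonize_fisher_id("L12345"))
```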

The “PermitteeName” column generally provides the fisher name in the following syntax: “FIRST – LAST”. However, in some cases, only the first initial is provided, even though the full name is provided in other rows. I harmonized the fisher names to use full names in title case (e.g., “First M. Last”).
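One possible way to automate part of this is to treat the longest name reported for each fisher id as the full name. The sketch below rests on that assumption, which will not hold in every case, so the result still deserves a manual check. The function names and example records are made up.

```python
import re


def tidy_name(raw):
    """Convert "FIRST – LAST" to "First Last" in title case."""
    # Require spaces around the dash so hyphenated surnames stay intact.
    parts = re.split(r"\s+[-–]\s+", raw.strip(), maxsplit=1)
    return " ".join(p.strip() for p in parts if p.strip()).title()


def canonical_names(records):
    """Map each fisher id to its longest reported name (assumed to be the full name)."""
    best = {}
    for fisher_id, raw_name in records:
        name = tidy_name(raw_name)
        if len(name) > len(best.get(fisher_id, "")):
            best[fisher_id] = name
    return best


# Made-up (fisher id, name) pairs.
records = [
    ("L00001", "J – DOE"),
    ("L00001", "JANE M. – DOE"),
    ("L00002", "JOHN – SMITH"),
]
print(canonical_names(records))
```

Note that `str.title()` is a blunt tool: it turns "MCDONALD" into "Mcdonald", so unusual capitalizations need to be corrected by hand.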

Port id, name, and complex

The “PortCode” column provides the numeric id of the port, which can be linked to the port name and complex using the key provided here. All of the port codes are valid.
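Linking the codes to the key, and confirming that every code has a match, might be sketched as follows. The key contents here are placeholders rather than real port codes or names, and the output field names "PortName" and "PortComplex" are assumptions.

```python
# Placeholder excerpt of a port key: code -> (port name, port complex).
# These codes and names are invented for illustration only.
PORT_KEY = {
    101: ("Port A", "Complex 1"),
    102: ("Port B", "Complex 1"),
}


def add_port_info(rows, port_key):
    """Attach port name and complex to each row; collect codes missing from the key."""
    unmatched = set()
    out = []
    for row in rows:
        code = int(row["PortCode"])  # codes may arrive as strings
        name, port_complex = port_key.get(code, (None, None))
        if code not in port_key:
            unmatched.add(code)
        out.append({**row, "PortName": name, "PortComplex": port_complex})
    return out, unmatched


rows = [{"PortCode": "101"}, {"PortCode": 999}]
linked, unmatched = add_port_info(rows, PORT_KEY)
print(unmatched)
```

An empty `unmatched` set corresponds to the situation described above, where all port codes are valid.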

Species id and name

The “SpeciesCode” column provides the numeric id of each species, which can be linked to a common and scientific name using the key provided here. There are 5 species in these data:

Code  Common name             Scientific name
683   Keyhole limpet          Megathura crenulata
731   Kellet’s whelk          Kelletia kelletii
747   Top snail               Megastraea undosa
754   Giant red sea cucumber  Apostichopus californicus
757   Warty sea cucumber      Parastichopus parvimensis

Block id

Every block id is valid! This is unprecedented.

Coordinates

The GPS coordinates often fall on land or at unrealistic offshore locations.

GPS coordinates have not been provided consistently over time. See the figure below for the percent of logbooks submitted each year that include GPS coordinates.
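A summary of this kind can be computed along the following lines. The sketch assumes that a logbook "includes GPS coordinates" when at least one of its rows has both a latitude and a longitude; that definition is my assumption and may differ from the one behind the figure. The example rows are made up.

```python
from collections import defaultdict


def has_gps(row):
    """True when both coordinate fields hold a value."""
    return all(row.get(k) not in (None, "") for k in ("LatitudeDec", "LongitudeDec"))


def percent_logbooks_with_gps(rows):
    """Percent of distinct logbooks per year with at least one row with coordinates."""
    logbooks = defaultdict(dict)  # year -> {logbook id: any row has GPS}
    for row in rows:
        year, log_id = row["LogGroupYear"], row["LogSerialNum"]
        logbooks[year][log_id] = logbooks[year].get(log_id, False) or has_gps(row)
    return {
        year: 100 * sum(flags.values()) / len(flags)
        for year, flags in sorted(logbooks.items())
    }


# Made-up example rows.
gps_rows = [
    {"LogGroupYear": 2010, "LogSerialNum": "A", "LatitudeDec": 34.0, "LongitudeDec": 119.5},
    {"LogGroupYear": 2010, "LogSerialNum": "A", "LatitudeDec": None, "LongitudeDec": None},
    {"LogGroupYear": 2010, "LogSerialNum": "B", "LatitudeDec": "", "LongitudeDec": ""},
    {"LogGroupYear": 2011, "LogSerialNum": "C", "LatitudeDec": 33.9, "LongitudeDec": 118.6},
]
print(percent_logbooks_with_gps(gps_rows))
```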

Depth (ft)

The logbooks report the minimum and maximum depth of the dives. A few values may be too deep to be realistic.

Effort (hours of diving)

Fishers are asked to report the length of dives in hours. Most dives are between 2 and 4 hours long, but many longer dives are reported and may be questionable.
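Rather than deleting questionable values, it can be safer to flag them and decide later how to treat them. The sketch below does this for any numeric column. The 8-hour threshold in the example is purely illustrative and is not a rule from the data provider; the function name and the "Flag" suffix are also assumptions.

```python
def flag_above(rows, field, threshold):
    """Return copies of rows with a boolean "<field>Flag" set when value > threshold."""
    flagged = []
    for row in rows:
        value = row.get(field)
        # Missing values are left unflagged rather than treated as outliers.
        is_high = value not in (None, "") and float(value) > threshold
        flagged.append({**row, field + "Flag": is_high})
    return flagged


# Made-up dives; 8 hours is an arbitrary example threshold.
dives = [{"DiverHours": 3}, {"DiverHours": 14.5}, {"DiverHours": None}]
for dive in flag_above(dives, "DiverHours", threshold=8):
    print(dive)
```

The same function can be pointed at "MaxDepthFeet" with a depth threshold of your choosing.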

Catch

Fishers are asked to report their catch in pounds. The magnitude of catch varies by species but is generally between 10 and 500 pounds. It is unclear whether large values are realistic.
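Because catch magnitudes differ by species, one way to judge whether a large value is unusual is to compare it against the upper percentiles for the same species. The following sketch computes a few such percentiles with the standard library; the choice of percentiles and the example numbers are illustrative assumptions only.

```python
from collections import defaultdict
from statistics import quantiles


def catch_percentiles(rows):
    """Summarize catch (lbs) per species code: median, 95th, 99th percentile, max."""
    by_species = defaultdict(list)
    for row in rows:
        pounds = row.get("PoundsHarvested")
        if pounds not in (None, ""):
            by_species[row["SpeciesCode"]].append(float(pounds))
    summary = {}
    for code, values in by_species.items():
        if len(values) < 2:
            continue  # quantiles() needs at least two data points
        cuts = quantiles(values, n=100, method="inclusive")  # 99 cut points
        summary[code] = {
            "median": cuts[49], "p95": cuts[94], "p99": cuts[98], "max": max(values),
        }
    return summary


# Made-up catches for one species code.
catches = [{"SpeciesCode": 754, "PoundsHarvested": p} for p in (10, 20, 30, 40, 5000)]
print(catch_percentiles(catches))
```

Values far above a species' 99th percentile are candidates for closer inspection, not automatic removal.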