The purpose of this guide is to orient scientists and other data users to California’s commercial dive fishery logbook data. These data are confidential and are only provided through authorized data sharing agreements. This guide seeks to help scientists and other data users anticipate how these data are formatted and how they may have to “clean” (a.k.a., process) the data before they can be used in analysis. I also seek to highlight special tricks and caveats gleaned from my experience using the data. Ultimately, I hope this guide will help clarify what data are available and how these data can be processed to maximize their utility.
Overview
Since the 1992-1993 fishing season, commercial dive fishermen have been required to complete and submit logbooks detailing daily dive activities. The logbooks record the date and location of fishing, the length of dives, and the amount of catch.
An example logbook is available here. Additional information is available here.
This guide provides an overview of California’s commercial dive logbook data based on a data request processed in August 2022 for all dive logbooks from 2000-2020 (only sea cucumber data were shared; the urchin data were not ready to be shared). In this guide, I review the attributes of the dive logbook data, the steps required to clean the raw data, and some visualizations of non-confidential summaries of the logbook data.
Data attributes
The dive logbook data include the following columns:
Column | Description |
LogSerialNum | Logbook id |
LogGroupYear | Year |
LogGroupMonth | Month |
LogDateString | Date |
Vessel | Vessel id and name (ID – NAME) |
PermiteeID | Fisher id |
PermitteeName | Fisher name (FIRST – LAST) |
DiveNumber | Dive number |
CDFWBlock | Block id |
Position | Location |
LatitudeDec | Latitude (°N) |
LongitudeDec | Longitude (°W) |
Landmark | Landmark |
MinDepthFeet | Minimum depth (ft) |
MaxDepthFeet | Maximum depth (ft) |
DiverHours | Hours of diving |
PortCode | Port id |
DealerID | Dealer id |
DealerName | Dealer name |
Remarks | Comments |
DiveSpeciesPounds | Species code and catch (ID | CATCH) |
SpeciesCode | Species id |
PoundsHarvested | Catch (lbs) |
LandingReceiptNumber | Landing receipt id |
Attribute information and issues
Data completeness
The receipt id, dive number, lat/long, and remarks are not commonly reported. The species, catch, and dealer are also not always reported. The remaining attributes are often reported.
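Completeness can be quantified as the percent of rows with a non-missing value in each column. The following is a minimal sketch, assuming the data have been read as a list of dictionaries (for example, with Python's `csv.DictReader`) and that missing values appear as `None` or blank strings. The function names and example rows are hypothetical and invented for illustration.

```python
def is_missing(value):
    """Treat None and blank strings as missing (an assumption about the export)."""
    return value is None or str(value).strip() == ""


def percent_reported(rows, columns):
    """Return {column: percent of rows with a non-missing value}."""
    total = len(rows)
    result = {}
    for col in columns:
        reported = sum(1 for row in rows if not is_missing(row.get(col)))
        result[col] = 100.0 * reported / total if total else float("nan")
    return result


# Invented rows for illustration only; not real logbook records
example_rows = [
    {"LogSerialNum": "A1", "DiveNumber": "", "Remarks": None, "DiverHours": "3"},
    {"LogSerialNum": "A2", "DiveNumber": "1", "Remarks": "", "DiverHours": "2.5"},
]
completeness = percent_reported(
    example_rows, ["LogSerialNum", "DiveNumber", "Remarks", "DiverHours"]
)
```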
Logbook id
A single logbook can document fishing trips on different days and/or at different locations.
I created a unique id for each dive by combining the following: logbook id, date, vessel id, fisher id, port id, block id, minimum depth, maximum depth.
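One way to build such an id is to concatenate the listed fields into a single string. The sketch below is illustrative only: the column names follow the attribute table above, `vessel_id` is a hypothetical column holding the id portion of the “Vessel” column, and the underscore separator and `"NA"` placeholder for missing values are assumptions. The example row is invented.

```python
# Fields combined into the dive id, in the order listed in the text.
# "vessel_id" is a hypothetical derived column, not part of the raw data.
ID_FIELDS = [
    "LogSerialNum", "LogDateString", "vessel_id", "PermiteeID",
    "PortCode", "CDFWBlock", "MinDepthFeet", "MaxDepthFeet",
]


def make_dive_id(row, fields=ID_FIELDS, sep="_"):
    """Concatenate the identifying fields of one row into a dive id string."""
    parts = []
    for field in fields:
        value = row.get(field)
        # Use "NA" for missing values so every id has the same number of parts
        if value is None or str(value).strip() == "":
            parts.append("NA")
        else:
            parts.append(str(value).strip())
    return sep.join(parts)


# Invented row for illustration only
example_row = {
    "LogSerialNum": "100234", "LogDateString": "2015-06-01", "vessel_id": "12345",
    "PermiteeID": "L01234", "PortCode": "745", "CDFWBlock": "708",
    "MinDepthFeet": "30", "MaxDepthFeet": "",
}
dive_id = make_dive_id(example_row)
```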
Vessel id and name
The “Vessel” column provides both the vessel id and name, separated by a hyphen (e.g., “12345 – FISHING BOAT”). I split this information into two columns.
A few small edits were necessary to harmonize the vessel ids and names throughout the data (e.g., converting to upper case, filling in a missing vessel name, etc.).
In the end, vessel id is a unique identifier, but vessel name is not (AGAPE and PENGUIN are both associated with two vessels).
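The split and the uniqueness check can be sketched as follows. This is a minimal illustration, not the exact procedure used for the guide: it assumes the separator is the first hyphen or en dash in the string, converts both parts to upper case, and uses hypothetical function names and invented example values.

```python
import re
from collections import defaultdict


def split_vessel(vessel):
    """Split 'ID - NAME' into (vessel_id, vessel_name); either part may be None."""
    if vessel is None or vessel.strip() == "":
        return None, None
    # Split on the first hyphen or en dash only, so hyphenated names stay intact
    parts = re.split(r"\s*[-–]\s*", vessel.strip(), maxsplit=1)
    vessel_id = parts[0].upper() or None
    vessel_name = parts[1].upper() if len(parts) > 1 and parts[1] else None
    return vessel_id, vessel_name


def names_with_multiple_ids(pairs):
    """Return {vessel_name: sorted ids} for names linked to more than one id."""
    ids_by_name = defaultdict(set)
    for vessel_id, vessel_name in pairs:
        if vessel_id and vessel_name:
            ids_by_name[vessel_name].add(vessel_id)
    return {name: sorted(ids) for name, ids in ids_by_name.items() if len(ids) > 1}


# Invented values for illustration only
pairs = [split_vessel(v) for v in
         ["12345 – Fishing Boat", "67890 - FISHING BOAT", "11111 - SEA-HAWK", "22222"]]
shared_names = names_with_multiple_ids(pairs)
```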
Fisher id and name
In general, the “PermitteeID” column provides the fisher license numbers, which are all supposed to begin with the letter “L”. However, many are missing the “L”, so harmonizing the data requires adding an “L” to some ids.
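The license-number fix described above can be sketched as a small helper. This is a hedged example: the function name is hypothetical, and trimming whitespace and upper-casing the id are additional assumptions beyond adding the missing “L”.

```python
def harmonize_license(permittee_id):
    """Return the license number with a leading 'L', or None if missing."""
    if permittee_id is None:
        return None
    cleaned = str(permittee_id).strip().upper()
    if cleaned == "":
        return None
    # Add the prefix only when it is absent
    return cleaned if cleaned.startswith("L") else "L" + cleaned
```

For example, `harmonize_license("01234")` and `harmonize_license(" l01234 ")` both return `"L01234"`.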
The “PermitteeName” column generally provides the fisher name in the following syntax: “FIRST – LAST”. However, in some cases, only the first initial is provided, even though the full name is provided in other rows. I harmonized the fisher names to use full names in title case (e.g., “First M. Last”).
Port id, name, and complex
The “PortCode” column provides the numeric id of the port, which can be linked to the port name and complex using the key provided here. All of the port codes are valid.
Species id and name
The “SpeciesCode” column provides the numeric id of each species, which can be linked to a common and scientific name using the key provided here. There are 5 species in these data:
Code | Common name | Scientific name |
683 | Keyhole limpet | Megathura crenulata |
731 | Kellet’s whelk | Kelletia kelletii |
747 | Top snail | Megastraea undosa |
754 | Giant red sea cucumber | Apostichopus californicus |
757 | Warty sea cucumber | Parastichopus parvimensis |
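For quick lookups, the five codes in the table above can be stored in a dictionary. The sketch below is illustrative; the function name is hypothetical, and codes that are missing or not in the table return `(None, None)` rather than raising an error.

```python
# Species key transcribed from the table above
SPECIES_KEY = {
    683: ("Keyhole limpet", "Megathura crenulata"),
    731: ("Kellet’s whelk", "Kelletia kelletii"),
    747: ("Top snail", "Megastraea undosa"),
    754: ("Giant red sea cucumber", "Apostichopus californicus"),
    757: ("Warty sea cucumber", "Parastichopus parvimensis"),
}


def species_names(code):
    """Return (common name, scientific name) for a species code, or (None, None)."""
    try:
        return SPECIES_KEY.get(int(str(code).strip()), (None, None))
    except ValueError:
        # Non-numeric or missing codes cannot be matched
        return (None, None)
```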
Block id
Every block id is valid! This is unprecedented.
Coordinates
The GPS coordinates often fall on land or at unrealistic offshore locations.
GPS coordinates have not been provided consistently over time. See the figure below for the percent of logbooks submitted each year that include GPS coordinates.
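The percent of logbooks with GPS coordinates in each year can be computed along the following lines. This is a minimal sketch under stated assumptions: a logbook is identified by `LogSerialNum`, the year is taken from `LogGroupYear`, and a logbook counts as including coordinates if at least one of its rows reports both latitude and longitude. The function names and example rows are hypothetical.

```python
from collections import defaultdict


def has_value(value):
    """True when a field is neither None nor a blank string."""
    return value is not None and str(value).strip() != ""


def percent_with_gps_by_year(rows):
    """Return {year: percent of distinct logbooks with any reported coordinates}."""
    gps_by_year = defaultdict(dict)  # year -> {logbook id: has coordinates?}
    for row in rows:
        year, log_id = row["LogGroupYear"], row["LogSerialNum"]
        reported = has_value(row.get("LatitudeDec")) and has_value(row.get("LongitudeDec"))
        # A logbook counts once per year; any row with coordinates is enough
        gps_by_year[year][log_id] = gps_by_year[year].get(log_id, False) or reported
    return {
        year: 100.0 * sum(flags.values()) / len(flags)
        for year, flags in sorted(gps_by_year.items())
    }


# Invented rows for illustration only
example_rows = [
    {"LogGroupYear": 2005, "LogSerialNum": "A", "LatitudeDec": "", "LongitudeDec": ""},
    {"LogGroupYear": 2005, "LogSerialNum": "B", "LatitudeDec": "34.01", "LongitudeDec": "119.5"},
    {"LogGroupYear": 2005, "LogSerialNum": "B", "LatitudeDec": "", "LongitudeDec": ""},
    {"LogGroupYear": 2015, "LogSerialNum": "C", "LatitudeDec": "33.9", "LongitudeDec": "119.8"},
]
gps_percent = percent_with_gps_by_year(example_rows)
```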
Depth (ft)
The logbooks report the minimum and maximum depth of the dives. A few values may be too deep to be realistic.
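Suspect depths can be flagged rather than deleted so that they remain available for review. The sketch below is hypothetical: the 150 ft default is a placeholder threshold, not a value taken from the data or from regulations, and the check that the minimum depth does not exceed the maximum depth is an extra consistency test not described above.

```python
def flag_depths(min_ft, max_ft, max_plausible_ft=150.0):
    """Return a list of flags describing potential problems with a dive's depths."""
    flags = []
    # Consistency check: the minimum depth should not exceed the maximum depth
    if min_ft is not None and max_ft is not None and min_ft > max_ft:
        flags.append("min_exceeds_max")
    # Plausibility check against a user-chosen threshold (placeholder default)
    if any(d is not None and d > max_plausible_ft for d in (min_ft, max_ft)):
        flags.append("deeper_than_threshold")
    return flags
```

An empty list means that neither check was triggered; missing depths (`None`) are skipped rather than flagged.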
Effort (hours of diving)
Fishers are asked to report the length of dives in hours. Most dives are between 2 and 4 hours long, but many longer dives are reported and may be questionable.
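A simple summary of reported dive hours can show how many dives fall in the typical range and how many are unusually long. In this sketch, the 2 to 4 hour range comes from the text above, while the 8 hour cutoff for a "long" dive is an arbitrary placeholder and the function name is hypothetical.

```python
def summarize_hours(hours, typical=(2.0, 4.0), long_cutoff=8.0):
    """Summarize dive hours: count, percent in the typical range, percent above the cutoff."""
    values = [h for h in hours if h is not None]
    n = len(values)
    if n == 0:
        return {"n": 0, "pct_typical": None, "pct_long": None}
    in_typical = sum(1 for h in values if typical[0] <= h <= typical[1])
    long_dives = sum(1 for h in values if h > long_cutoff)
    return {
        "n": n,
        "pct_typical": 100.0 * in_typical / n,
        "pct_long": 100.0 * long_dives / n,
    }


# Invented values for illustration only
hours_summary = summarize_hours([2.0, 3.0, 4.0, 10.0, None])
```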
Catch
Fishers are asked to report their catch in pounds. The magnitude of catch varies by species but is generally between 10 and 500 pounds. It is unclear whether larger values are realistic.
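Because catch magnitudes differ by species, large values are easier to judge against a per-species summary than against a single global threshold. The following is a minimal sketch, assuming rows are dictionaries with the `SpeciesCode` and `PoundsHarvested` columns from the attribute table; the function name and example values are invented for illustration.

```python
from collections import defaultdict
from statistics import median


def catch_summary_by_species(rows):
    """Return {species code: {"n", "median", "max"}} for reported catches in pounds."""
    catches = defaultdict(list)
    for row in rows:
        pounds = row.get("PoundsHarvested")
        # Skip rows where catch was not reported
        if pounds is None or str(pounds).strip() == "":
            continue
        catches[str(row.get("SpeciesCode"))].append(float(pounds))
    return {
        code: {"n": len(values), "median": median(values), "max": max(values)}
        for code, values in sorted(catches.items())
    }


# Invented rows for illustration only
example_rows = [
    {"SpeciesCode": "754", "PoundsHarvested": "100"},
    {"SpeciesCode": "754", "PoundsHarvested": "300"},
    {"SpeciesCode": "754", "PoundsHarvested": "200"},
    {"SpeciesCode": "757", "PoundsHarvested": "50"},
    {"SpeciesCode": "757", "PoundsHarvested": "2000"},
    {"SpeciesCode": "757", "PoundsHarvested": ""},
]
catch_summary = catch_summary_by_species(example_rows)
```

Comparing each record with its species' median and maximum gives a starting point for deciding which large values deserve a closer look.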