The purpose of this guide is to orient scientists and other data users to California’s commercial dive fishery logbook data. These data are confidential and are only provided through authorized data sharing agreements. This guide seeks to help scientists and other data users anticipate how these data are formatted and how they may have to “clean” (a.k.a., process) the data before they can be used in analysis. I also seek to highlight special tricks and caveats gleaned from my experience using the data. Ultimately, I hope this guide will help clarify what data are available and how these data can be processed to maximize their utility.
Overview
Since the 1992-1993 fishing season, commercial dive fishermen have been required to complete and submit logbooks detailing daily dive activities. Logbooks were voluntarily submitted before this time period. The logbooks record the date and location of fishing, the length of dives, and the amount of catch.
An example logbook is available here. Additional information is available here.
This guide provides an overview of California’s commercial dive logbook data based on a data request processed in January 2024 for all dive logbooks from 1980-2023. In this guide, I review the attributes of the dive logbook data, the steps required to clean the raw data, and some visualizations of non-confidential summaries of the logbook data.
The data shown here are missing much of the effort for sea urchins. Logbooks are initially sent to the Fort Bragg, Santa Barbara, or Seal Beach office. They are then date-stamped, and (1) sea urchin logs are sent to San Luis Obispo and (2) sea cucumber logs are sent to Santa Barbara for entry into the MLS system. Entering the data requires a full-time seasonal staff member, who is not always available and who has other duties besides entering dive logs. As a result, CDFW is behind on logbook data entry. As of March 22, 2024, CDFW still has a few thousand sea urchin dive logs from 2010-2023 to enter. The sea cucumber fishery logbooks are in better shape because the fishery is much smaller, which makes it easier to keep up with data entry.
Data attributes
The dive fishery logbook data include the following columns:
Column | Description |
LogSerialNum | Logbook id |
LogGroupYear | Year |
LogGroupMonth | Month |
LogDateString | Date |
Vessel | Vessel id and name (ID – NAME) |
PermiteeID | Fisher id |
PermitteeName | Fisher name (FIRST – LAST) |
DiveNumber | Dive number for the trip |
CDFWBlock | Block id |
Position | Location |
LatitudeDec | Latitude (°N) |
LongitudeDec | Longitude (°W) |
Landmark | Landmark |
MinDepthFeet | Minimum depth (ft) |
MaxDepthFeet | Maximum depth (ft) |
DiverHours | Hours of diving |
PortCode | Port id |
DealerID | Dealer id |
DealerName | Dealer name |
Remarks | Comments |
DiveSpeciesPounds | Species code and catch (ID | CATCH) |
SpeciesCode | Species id |
PoundsHarvested | Catch (lbs) |
LandingReceiptNumber | Landing receipt id |
Attribute completeness
The receipt id, dive number, lat/long, and remarks are not commonly reported. The species, catch, and dealer are also not always reported. The remaining attributes are often reported.
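As a rough way to quantify completeness, the percent of non-missing values can be computed per column. The sketch below is illustrative only: it assumes the logbook rows have already been read into dictionaries keyed by column name (for example with `csv.DictReader`), and it assumes that empty strings and the token "NA" mark missing values. The example rows and block ids are invented.

```python
# Assumption: these tokens mark a missing value in the raw export.
MISSING_TOKENS = {"", "NA"}


def is_missing(value):
    """Return True if a cell is None or one of the assumed missing tokens."""
    return value is None or str(value).strip() in MISSING_TOKENS


def completeness(rows, columns):
    """Return the percent of rows with a non-missing value for each column."""
    n = len(rows)
    result = {}
    for col in columns:
        filled = sum(1 for row in rows if not is_missing(row.get(col)))
        result[col] = 100.0 * filled / n if n else 0.0
    return result


# Invented example rows (not real logbook records).
rows = [
    {"CDFWBlock": "708", "LatitudeDec": "", "Remarks": None},
    {"CDFWBlock": "709", "LatitudeDec": "34.01", "Remarks": ""},
    {"CDFWBlock": "", "LatitudeDec": "", "Remarks": ""},
    {"CDFWBlock": "710", "LatitudeDec": "NA", "Remarks": "calm"},
]
pct = completeness(rows, ["CDFWBlock", "LatitudeDec", "Remarks"])
for col, value in pct.items():
    print(f"{col}: {value:.0f}% reported")
```

Treating a zero as a reported value (rather than missing) is a deliberate choice here; whether zeros in depth or catch columns are real values is a separate question to check against the raw data.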
Number of logbooks
The number of logbook entries over time. Since the 1992-1993 fishing season, commercial dive fishermen have been required to complete and submit logbooks detailing daily dive activities. Logbooks were voluntarily submitted before this time period.
As of March 22, 2024, CDFW still has a few thousand sea urchin dive logs from 2010-2023 to enter. The sea cucumber fishery logbooks are in better shape because the fishery is much smaller, which makes it easier to keep up with data entry.
Still, the warty sea cucumber data do not appear to really start until 2003.
The following plot shows the catch documented in the landing receipts (bars) compared to the logbooks (lines) to demonstrate when the logbooks may not capture the majority of effort.
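The same comparison can be expressed numerically as the ratio of logbook catch to landing receipt catch in each year. The sketch below assumes annual catch totals (in pounds) have already been summed for one species from each source. The function name and the example totals are hypothetical. Years where the ratio falls below 0.5 are years in which the logbooks capture less than half of the landed catch.

```python
def logbook_coverage(receipt_lbs, logbook_lbs):
    """Return logbook catch as a fraction of landing receipt catch, by year.

    Years with no landed catch are skipped to avoid dividing by zero.
    """
    coverage = {}
    for year, landed in sorted(receipt_lbs.items()):
        if landed > 0:
            coverage[year] = logbook_lbs.get(year, 0.0) / landed
    return coverage


# Invented annual totals (lbs) for illustration only.
receipt_lbs = {2005: 1000.0, 2015: 1000.0, 2020: 0.0}
logbook_lbs = {2005: 900.0, 2015: 300.0}

coverage = logbook_coverage(receipt_lbs, logbook_lbs)
low_years = [year for year, frac in coverage.items() if frac < 0.5]
print(coverage)
print("Years where logbooks capture under half of landings:", low_years)
```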
Attribute information and issues
Logbook id
A single logbook can document fishing trips on different days and/or at different locations.
Vessel id and name
The “Vessel” column provides both the vessel id and name, separated by a dash (e.g., “12345 – FISHING BOAT”). I separated this information into two columns.
A few small edits were necessary to harmonize the vessel ids and names throughout the data (e.g., converting to upper case, filling in a missing vessel name, etc.).
In the end, vessel id is a unique identifier, but vessel name is not (AGAPE and PENGUIN are both associated with two vessels).
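A minimal sketch of this kind of split is shown below. It is not the exact procedure used for this guide. It assumes the separator is either a hyphen or an en dash, splits only on the first separator so that hyphenated vessel names stay intact, and converts names to upper case. The helper names and the example ids are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: id and name are separated by a hyphen or an en dash.
_SEP = re.compile(r"\s*[-\u2013]\s*")


def split_vessel(value):
    """Split 'ID - NAME' into (vessel_id, vessel_name), upper-casing the name."""
    if value is None or not value.strip():
        return None, None
    parts = _SEP.split(value.strip(), maxsplit=1)
    vessel_id = parts[0].strip() or None
    has_name = len(parts) > 1 and parts[1].strip()
    vessel_name = parts[1].strip().upper() if has_name else None
    return vessel_id, vessel_name


def names_with_multiple_ids(pairs):
    """Return vessel names that are associated with more than one vessel id."""
    ids_by_name = defaultdict(set)
    for vessel_id, vessel_name in pairs:
        if vessel_id and vessel_name:
            ids_by_name[vessel_name].add(vessel_id)
    return {n: sorted(ids) for n, ids in ids_by_name.items() if len(ids) > 1}


# Invented values for illustration only.
raw = ["12345 \u2013 FISHING BOAT", "12345 - fishing boat", "67890 - Fishing Boat", "11111 -"]
pairs = [split_vessel(v) for v in raw]
print(pairs)
print(names_with_multiple_ids(pairs))
```

A check like `names_with_multiple_ids` is one way to confirm that the vessel id, rather than the vessel name, should be used as the unique key.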
Fisher id and name
In general, the “PermitteeID” column provides the fisher license numbers, which are all supposed to begin with the letter “L”. However, many are missing the “L”, so harmonizing the data requires adding an “L” to some ids.
The “PermitteeName” column generally provides the fisher name in the following syntax: “FIRST – LAST”. However, in some cases, only the first initial is provided, even though the full name is provided in other rows. I harmonized the fisher names to use full names in title case (e.g., “First M. Last”).
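The two harmonization steps above might look something like the following sketch. It assumes ids are stored as text and that the name separator is a hyphen or en dash surrounded by spaces. The function names are hypothetical. Replacing a first initial with the full name requires comparing rows for the same fisher id and is not attempted here.

```python
import re

# Assumption: first and last names are separated by a spaced hyphen or en dash.
_NAME_SEP = re.compile(r"\s+[-\u2013]\s+")


def clean_permittee_id(value):
    """Upper-case a fisher license id and add the leading 'L' if it is missing."""
    if value is None:
        return None
    text = str(value).strip().upper()
    if not text:
        return None
    return text if text.startswith("L") else "L" + text


def clean_permittee_name(value):
    """Convert 'FIRST - LAST' to 'First Last' in title case."""
    if value is None or not value.strip():
        return None
    parts = [p.strip() for p in _NAME_SEP.split(value.strip(), maxsplit=1)]
    return " ".join(p.title() for p in parts if p)


# Invented values for illustration only.
print(clean_permittee_id("12345"))
print(clean_permittee_id("l12345"))
print(clean_permittee_name("JANE \u2013 DOE"))
```

Note that `str.title()` lower-cases everything after the first letter of each word, so names such as "McDonald" would become "Mcdonald" and may need manual correction.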
Port id, name, and complex
The “PortCode” column provides the numeric id of the port, which can be linked to the port name and complex using the key provided here. All of the port codes are valid.
Species id and name
The “SpeciesCode” column provides the numeric id of each species, which can be linked to a common and scientific name using the key provided here. There are 8 species in these data:
Code | Common name | Scientific name |
683 | Keyhole limpet | Megathura crenulata |
731 | Kellet’s whelk | Kelletia kelletii |
747 | Top snail | Megastraea undosa |
752 | Red sea urchin | Mesocentrotus franciscanus |
753 | Purple sea urchin | Strongylocentrotus purpuratus |
754 | Giant red sea cucumber | Apostichopus californicus |
756 | White sea urchin | Lytechinus anamesus |
757 | Warty sea cucumber | Parastichopus parvimensis |
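For quick lookups, the table above can be encoded directly as a dictionary. The sketch below uses only the eight codes listed in the table. The function name is hypothetical, and codes that are missing or not in the table return `(None, None)` rather than raising an error.

```python
# Species codes, common names, and scientific names from the table above.
SPECIES = {
    683: ("Keyhole limpet", "Megathura crenulata"),
    731: ("Kellet’s whelk", "Kelletia kelletii"),
    747: ("Top snail", "Megastraea undosa"),
    752: ("Red sea urchin", "Mesocentrotus franciscanus"),
    753: ("Purple sea urchin", "Strongylocentrotus purpuratus"),
    754: ("Giant red sea cucumber", "Apostichopus californicus"),
    756: ("White sea urchin", "Lytechinus anamesus"),
    757: ("Warty sea cucumber", "Parastichopus parvimensis"),
}


def species_names(code):
    """Return (common_name, scientific_name) for a species code."""
    try:
        key = int(str(code).strip())
    except (TypeError, ValueError):
        # Missing or non-numeric codes cannot be matched.
        return (None, None)
    return SPECIES.get(key, (None, None))


print(species_names("752"))
print(species_names(999))
```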
Block id
Coordinates
The GPS coordinates often fall on land or at unrealistic offshore locations.
GPS coordinates have not been provided consistently over time. See the figure below for the percent of logbooks submitted each year that include GPS coordinates. I identified reliable coordinates as coordinates falling within the reported block. Coordinates cannot be verified for logbooks that do not report a block.
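The reliability check described above can be sketched as follows. This is a simplified illustration, not the method used for this guide: it approximates each block with a bounding box, whereas real block boundaries are polygons and would normally be checked with a spatial library. The `block_bounds` lookup, the block id "B1", and its coordinates are all invented. The sketch also assumes longitude is reported in degrees west, so it is converted to a negative value before comparison.

```python
def coords_status(lat_n, lon_w, block_id, block_bounds):
    """Classify coordinates as 'missing', 'unverifiable', 'reliable', or 'unreliable'.

    block_bounds maps block id -> (lat_min, lat_max, lon_min, lon_max),
    with longitudes in negative degrees east (hypothetical lookup).
    """
    if lat_n is None or lon_w is None:
        return "missing"
    bounds = block_bounds.get(block_id)
    if bounds is None:
        # No block reported, or the block is not in the lookup.
        return "unverifiable"
    # Force a western-hemisphere sign whether the input is 119.9 or -119.9.
    lon_e = -abs(lon_w)
    lat_min, lat_max, lon_min, lon_max = bounds
    inside = lat_min <= lat_n <= lat_max and lon_min <= lon_e <= lon_max
    return "reliable" if inside else "unreliable"


# Invented block id and bounding box for illustration only.
block_bounds = {"B1": (34.0, 34.1667, -120.0, -119.8333)}

print(coords_status(34.05, 119.9, "B1", block_bounds))
print(coords_status(36.00, 119.9, "B1", block_bounds))
print(coords_status(34.05, 119.9, None, block_bounds))
```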
Depth (ft)
The logbooks report the minimum and maximum depth of the dives. A few values may be too deep to be realistic.
Effort (hours of diving)
Fishers are asked to report the length of dives in hours. Most dives are between 2 and 4 hours long, but many longer dives are reported and may be questionable.
Catch
Fishers are asked to report their catch in pounds. The magnitude of catch varies by species but is generally between 10 and 500 pounds. It is unclear whether larger values are realistic.
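One cautious way to handle questionable depth, effort, and catch values is to flag them rather than delete them, so that each analysis can decide how to treat them. The sketch below assumes the numeric columns have already been converted to numbers. The thresholds are placeholders chosen only for illustration and are not recommendations from this guide. The check that minimum depth does not exceed maximum depth is an extra sanity check added here.

```python
def flag_record(rec, max_depth_ft, max_hours, max_pounds):
    """Return a list of flags for values that exceed user-supplied thresholds."""
    flags = []
    lo, hi = rec.get("MinDepthFeet"), rec.get("MaxDepthFeet")
    hours, pounds = rec.get("DiverHours"), rec.get("PoundsHarvested")
    if hi is not None and hi > max_depth_ft:
        flags.append("depth_too_deep")
    if lo is not None and hi is not None and lo > hi:
        flags.append("min_depth_exceeds_max")
    if hours is not None and hours > max_hours:
        flags.append("long_dive")
    if pounds is not None and pounds > max_pounds:
        flags.append("large_catch")
    return flags


# Invented record and placeholder thresholds for illustration only.
rec = {"MinDepthFeet": 20, "MaxDepthFeet": 400, "DiverHours": 3, "PoundsHarvested": 250}
flags = flag_record(rec, max_depth_ft=150, max_hours=8, max_pounds=5000)
print(flags)
```

Because catch magnitudes vary by species, a species-specific `max_pounds` is likely more appropriate than a single value for all records.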