California gillnet logbooks

The purpose of this guide is to orient scientists and other data users to California’s commercial gillnet logbook data. These data are confidential and are only provided through authorized data sharing agreements. The guide is meant to help users anticipate how these data are formatted and how they may need to be “cleaned” (i.e., processed) before they can be used in analysis. I also highlight special tricks and caveats gleaned from my experience using the data. Ultimately, I hope this guide will clarify what data are available and how they can be processed to maximize their utility.

Overview

The California Department of Fish and Wildlife (CDFW) has been compiling catch statistics on California’s fisheries since 1916. CDFW has required commercial gillnet vessels to submit logbooks since 1982. Additional information is available here.


This guide provides an overview of California’s commercial gillnet logbook data based on a data request processed in January 2023 for all gillnet logbooks from 1982-2022. In this guide, I review the attributes of the gillnet logbook data, the steps required to clean the raw data, and some visualizations of non-confidential summaries of the gillnet data.

Note that the data I received do not include the skipper name, the landing receipt id, or the crew members’ names or license numbers, even though all of these fields appear on the logbook form.

Data attributes

The gillnet logbook data includes the following columns:

Column                       Description
SN                           Logbook id
VESSEL_NAME                  Vessel name
BOATNO                       Coast Guard boat number
VESSEL_ID                    Vessel id (e.g., 12345)
PERMIT                       Permit id
FISHING_DATE                 Date of fishing
Year                         Year of fishing
TARSPC                       Target species – better version
Final Target Species         Target species
DRIFT_SET                    Net type (drift, set)
Final Net Type (Set, Drift)  Net type (drift, set) – better version
FG_BLOCKS                    Block id
DEPTHS                       Depths (fathoms)
NET_LENGTH                   Net length (fathoms)
MESH_SIZE                    Mesh size (in)
BOUY_LINE_DEPTH              Buoy line depth (ft)
HOURS_NET_SOAKED             Soak time (hrs)
COMMON_NAME                  Common name, format 1 (e.g., California halibut)
FinalMLDS_Common_Name        Common name, format 2 (e.g., Halibut, California)
MLDS_Species_Code            Species code
STATUS                       Status (kept, lost, released)
NUM_CATCH                    Number of fish caught
WEIGHTS                      Weight of fish caught (lbs)
PREDATOR                     Predators

Attribute completeness

The figure below illustrates how consistently each attribute was filled out.

Number of logbooks by year

The figure below illustrates the number of logbook entries over time.

Attribute information and issues

Logbook id

The logbook id (“SerialNumber”) is a unique identifier for each logbook page. Note that a single trip can span multiple logbook pages and a single page can include multiple trips, so the logbook id should not be treated as a trip identifier.

Vessel ids and name

The vessel id (“VesselID”) is a 5-digit unique identifier assigned to each vessel by CDFW. The vessel name (“VesselName”) is the name of the vessel. The vessel name is missing for many vessels, but the vessel id column is nearly complete. The missing names could be filled in by matching vessel ids against another dataset or by querying this resource.

Permit id

The permit id (“PERMIT”) is a 5-digit unique identifier for the permit used for the fishing trip.

Block id

The block id (“FG_BLOCKS”) indicates the statistical reporting block where the majority of fishing occurred. This is the only spatial information recorded in the gillnet logbooks.

Net type

The net type, i.e., whether it is a set or drift gillnet, is reported in two columns: “DRIFT_SET” and “Final Net Type (Set, Drift)”. The second column is clean (all “drift” or “set”) but less complete. The first column is nearly complete but includes a number of invalid codes. I derive a final net type by using the second column where it is available and falling back on the first column otherwise.

The following codes in the first column can be mapped to a valid net type:

  • S/s = Set
  • D/d = Drift
  • 67 = Set (large-mesh set gillnet gear code)
  • 68 = Set (small-mesh set gillnet gear code)

The following codes are unknown: 1, 2, 3, 5, H, N, Q, W, X
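The net type logic described above can be sketched in Python as follows. This is a minimal illustration, not the code I used: the function name `final_net_type` and the `DRIFT_SET_CODES` dictionary are hypothetical, and the sketch assumes both columns arrive as strings (or as missing values).

```python
from typing import Optional

# Codes from DRIFT_SET that can be mapped to a net type (see the list above).
DRIFT_SET_CODES = {"S": "set", "D": "drift", "67": "set", "68": "set"}


def final_net_type(drift_set: Optional[str], final_type: Optional[str]) -> Optional[str]:
    """Prefer the clean 'Final Net Type (Set, Drift)' value, else decode DRIFT_SET."""
    # Use the clean column whenever it is populated.
    if final_type is not None and final_type.strip():
        return final_type.strip().lower()
    if drift_set is None:
        return None
    # Unknown codes (1, 2, 3, 5, H, N, Q, W, X) are not in the dictionary
    # and therefore come back as None (missing).
    return DRIFT_SET_CODES.get(drift_set.strip().upper())
```

For example, `final_net_type("s", None)` yields `"set"`, while `final_net_type("X", None)` yields `None` because the code cannot be interpreted.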

Depth and soak time

The depth (“DEPTHS”) is reported in fathoms, and some entries list multiple depths.

The soak time (“HOURS_NET_SOAKED”) is reported in hours, and some entries list multiple soak times.

There are unrealistic outliers in both values. Values of zero should be re-coded as NAs.
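One way to handle entries with multiple values and zeros is sketched below in Python. The function name `parse_values` is hypothetical, and the sketch assumes that multiple values are separated by non-numeric characters such as commas, slashes, or hyphens; the actual separators in the raw data may differ and should be checked.

```python
import re
from typing import List, Optional


def parse_values(raw: Optional[str]) -> List[float]:
    """Extract the numbers from a free-text field, dropping zeros as missing."""
    if raw is None:
        return []
    # Match decimals such as "12.5" or ".5" first, then plain integers.
    tokens = re.findall(r"\d*\.\d+|\d+", str(raw))
    numbers = [float(tok) for tok in tokens]
    # Zeros are treated as missing values rather than true measurements.
    return [n for n in numbers if n > 0]
```

An empty list plays the role of NA here. How to summarize an entry with several values (e.g., minimum, maximum, or mean) depends on the analysis, so the sketch leaves that choice to the user.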

Mesh size, net length, buoy line depth

The mesh size (“MESH_SIZE”) is reported in inches, and some entries list multiple values.

The net length (“NET_LENGTH”) is reported in fathoms, and some entries list multiple values.

The buoy line depth (“BOUY_LINE_DEPTH”) is reported in feet, and some entries list multiple values.

There are unrealistic outliers in all three values. Values of zero should be re-coded as NAs.
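A simple range check is one way to re-code zeros and screen out unrealistic values. The Python sketch below is illustrative only: the function name `recode_implausible` is hypothetical, and the bounds in the example are placeholders that I have not validated against the fishery; users should choose limits appropriate to their analysis.

```python
from typing import Optional


def recode_implausible(value: Optional[float], lower: float, upper: float) -> Optional[float]:
    """Return None for missing, zero, or out-of-range values; otherwise the value."""
    if value is None or value == 0:
        return None
    if value < lower or value > upper:
        return None
    return value


# Illustrative only: these mesh size bounds (inches) are placeholders,
# not validated limits.
mesh_sizes = [8.5, 0, 850.0, None]
cleaned = [recode_implausible(v, lower=1.0, upper=30.0) for v in mesh_sizes]
```

Flagging suspect values in a separate column, rather than overwriting them, is an alternative that keeps the raw entries available for later review.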

Target species

The logbook asks fishers to specify the target species and to use the following official codes for common target species:

  • B – Barracuda
  • H – Halibut
  • C – White croaker
  • W – White seabass
  • S – Shark/swordfish
  • X – Soupfin shark

I assume the following assignments for two other codes:

  • T – Thresher (I no longer recall what led me to this assignment)
  • YELTL – Yellowtail

The following codes remain unmatched: 1, 4, 8, D, E, F, J, L, M, N, O, P, R, SW, Z
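The code assignments above can be applied with a simple lookup, sketched here in Python. The names `TARGET_CODES` and `decode_target` are hypothetical, and the entries for T and YELTL are my assumptions rather than official codes. Unmatched codes are returned as missing.

```python
from typing import Optional

TARGET_CODES = {
    # Official codes printed on the logbook
    "B": "Barracuda",
    "H": "Halibut",
    "C": "White croaker",
    "W": "White seabass",
    "S": "Shark/swordfish",
    "X": "Soupfin shark",
    # Assumed assignments (not official codes)
    "T": "Thresher",
    "YELTL": "Yellowtail",
}


def decode_target(code: Optional[str]) -> Optional[str]:
    """Translate a target species code; unmatched codes return None."""
    if code is None:
        return None
    return TARGET_CODES.get(str(code).strip().upper())
```

For example, `decode_target("h")` yields `"Halibut"`, while `decode_target("SW")` yields `None` because that code remains unmatched.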

Species id and name

The species information is very poorly formatted and is spread across three columns:

  • COMMON_NAME = Common name in format: “California halibut”
  • FinalMLDS_Common_Name = Common name in format: “Halibut, California”
  • MLDS_Species_Code = CDFW species code

However, the formatting is not fully consistent within any of these columns.

In general, I used the information in the two common name columns to fill in missing species codes and then attached consistently formatted common names based on the species codes. This requires extensive and careful formatting, and I encourage users to consult my code.

I was unable to identify a species code for “harbor seal” or “unspecified sea urchin”. I was unable to identify a species for common names listed as: Sb, X, S, Grass Back, Verde, or Grass Bass. If anyone knows what species those are, please let me know.
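The general approach to species names and codes can be sketched in Python as follows. This is a simplified illustration rather than my actual code: the function names are hypothetical, the name-to-code lookup must be supplied by the user (for example, built from a CDFW species key), and the code value shown in the usage note is a placeholder, not a real CDFW species code.

```python
from typing import Dict, Optional


def reorder_common_name(name: str) -> str:
    """Convert 'Halibut, California' to 'California halibut'; leave other names as-is."""
    parts = [p.strip() for p in name.split(",")]
    # Only reorder names made of exactly two non-empty, comma-separated parts.
    if len(parts) != 2 or not all(parts):
        return name.strip()
    noun, modifier = parts
    return f"{modifier[0].upper()}{modifier[1:]} {noun.lower()}"


def fill_species_code(
    code: Optional[str],
    common_name: Optional[str],
    mlds_common_name: Optional[str],
    name_to_code: Dict[str, str],
) -> Optional[str]:
    """Keep an existing code; otherwise look one up from either common name column."""
    if code not in (None, ""):
        return code
    for name in (common_name, mlds_common_name):
        if name:
            key = reorder_common_name(name).lower()
            if key in name_to_code:
                return name_to_code[key]
    return None
```

With a lookup such as `{"california halibut": "PLACEHOLDER"}`, a record with a missing code and the name “Halibut, California” would be assigned `"PLACEHOLDER"`. Real data will need additional handling for misspellings and for names that do not follow either format.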

Catch (number, weight, status)

The catch is reported as the number (“NUM_CATCH”) and pounds (“WEIGHTS”) of fish caught. The status (“STATUS”) of the catch, i.e., whether it was kept, released, or lost, is also listed. I was unable to interpret the following status codes: 1, 2, 3, 4, 5.

Predators

This column (“PREDATOR”) is supposed to list the number and species of fish lost to a type of predator. However, in practice, it appears to only list the type of predator present. It includes some erroneous entries such as: “used as bait”, “NMFS”, and “horne”.