California squid vessel logbooks

The purpose of this guide is to orient scientists and other data users to California’s commercial squid logbook data. These data are confidential and are only provided through authorized data sharing agreements. This guide seeks to help scientists and other data users anticipate how these data are formatted and how they may have to “clean” (a.k.a., process) the data before they can be used in analysis. I also seek to highlight special tricks and caveats gleaned from my experience using the data. Ultimately, I hope this guide will help clarify what data are available and how these data can be processed to maximize their utility.

Overview

The California Department of Fish and Wildlife (CDFW) has been compiling catch statistics on California’s fisheries since 1916. CDFW has required commercial squid vessels to submit logbooks since 2000. These logbook data are used to monitor fishing locations, environmental conditions, fishing effort, catch amounts, the use of catch, and fleet characterization and capacity. Additional information is available here.

An example logbook is shown below.

This guide provides an overview of California’s squid logbook data based on a data request processed in January 2024 for all squid logbooks from 2000-2022. In this guide, I review the attributes of the squid logbook data, the steps required to clean the raw data, and some visualizations of non-confidential summaries of the squid data.

Data attributes

The squid logbook data include the following columns:

Column              Description
LogSerialNumber     Logbook id
VesselID            Vessel id (e.g., 12345)
VesselName          Vessel name
CaptainID           Captain id (e.g., L12345)
CaptainName         Captain name
VesselPermitNumber  Permit number (e.g., SVT123)
Comments            Comments
LogDateString       Date
SetNumber           Set number
StartTime           Start time
EndTime             End time
ElapsedTime         Elapsed time
SetPosition         GPS position
SetLatitude         Latitude (°N)
SetLongitude        Longitude (°W)
Temperature         Temperature (°F)
BottomDepth         Bottom depth (fathoms)
CatchEstimate       Catch estimate (short tons)
LtdByMarketOrder    Limited by market order? (y/n)
LightBrailSetUpon   Name of light boat set upon
ByCatch             Bycatch (species – lbs)
LandingReceipts     Landing receipts

Attribute completeness

The figure below illustrates how consistently each attribute was filled out.

Attribute information and issues

Logbook id

The logbook id (“LogSerialNumber”) is a unique identifier for each logbook page. Note that a trip could potentially span multiple logbook pages.

Vessel id and name

The vessel id (“VesselID”) is a 5-digit unique identifier assigned to each vessel by CDFW. The vessel name (“VesselName”) is the name of the vessel. The vessel id is nearly complete, but the vessel name is missing more often. These missing values could be filled through examination of another dataset or by querying this resource.

Captain id and name

The captain id (“CaptainID”) is a unique identifier assigned to each captain by CDFW, consisting of the letter “L” followed by five digits (e.g., “L12345”). The captain name (“CaptainName”) is consistently listed as the first initial followed by the last name (e.g., “J SMITH”). The captain name is missing more often than the captain id.

Vessel permit

The vessel permit number (“VesselPermitNumber”) is the permit id associated with the vessel and uses the format “SVT123”. It is often not provided. The prefix indicates the permit type:

  • SVT = Squid Vessel Transferable
  • SVN = Squid Vessel Non-Transferable
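
A minimal sketch, in Python, of how a permit number could be split into its prefix and number. It assumes every valid permit is three letters followed by digits, as in the “SVT123” example; the function name `parse_permit` and the decision to return `None` for blank or malformed values are illustrative choices, not part of the CDFW data.

```python
import re

# Permit prefixes described above; any other prefix is reported as unrecognized.
PERMIT_TYPES = {
    "SVT": "Squid Vessel Transferable",
    "SVN": "Squid Vessel Non-Transferable",
}

# Assumed pattern: three letters followed by one or more digits (e.g., "SVT123").
PERMIT_PATTERN = re.compile(r"^([A-Z]{3})([0-9]+)$")


def parse_permit(value):
    """Return (prefix, permit type, number as text), or None if blank or malformed."""
    if value is None:
        return None
    match = PERMIT_PATTERN.match(str(value).strip().upper())
    if match is None:
        return None
    prefix, number = match.groups()
    # The number is kept as text so that any leading zeros are preserved.
    return prefix, PERMIT_TYPES.get(prefix, "Unrecognized"), number
```

Returning `None` rather than raising an error keeps the many missing permit numbers from interrupting a batch clean-up.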

Comments

The comments (“Comments”) column describes anecdotal information such as additional bycatch information, equipment problems, interference from other boats, weather-related problems, day set activity, etc. It also includes CDFW staff comments during data entry.

Date

The date (“LogDateString”) column describes the date of fishing.

Set number

The set number (“SetNumber”) column describes the numerical order of sets.

Start time, end time, elapsed time

The start (“StartTime”) and end (“EndTime”) times are in 24-hour format. Many values have only 1, 2, or 3 digits rather than 4. I expect the missing zeros belong on the left side (e.g., “930” for 09:30), but they could belong on the right side. The duration (“ElapsedTime”) column describes the duration of the fishing effort in minutes. I have yet to test the extent to which the reported duration matches a duration derived from the reported start and end times.
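
One way to run that comparison is sketched below in Python. It rests on two untested assumptions: that dropped zeros belong on the left, and that a set whose end time is earlier on the clock than its start time crossed midnight. The function names `parse_hhmm` and `derived_elapsed` are hypothetical.

```python
import re


def parse_hhmm(value):
    """Convert a 1-4 digit 24-hour time to minutes after midnight.

    Assumes dropped zeros were on the left (e.g., "30" means 00:30).
    Returns None for blanks, non-digit text, or impossible times.
    """
    if value is None:
        return None
    text = str(value).strip()
    if re.fullmatch(r"[0-9]{1,4}", text) is None:
        return None
    text = text.zfill(4)  # left-pad with zeros to HHMM
    hours, minutes = int(text[:2]), int(text[2:])
    if hours > 23 or minutes > 59:
        return None
    return hours * 60 + minutes


def derived_elapsed(start, end):
    """Minutes from start to end, wrapping past midnight when end < start."""
    start_min, end_min = parse_hhmm(start), parse_hhmm(end)
    if start_min is None or end_min is None:
        return None
    return (end_min - start_min) % (24 * 60)
```

Values that come back as `None` after left-padding (for example “75”, which would be 00:75) are candidates for the case where the zeros were dropped on the right instead. The derived duration can then be compared with “ElapsedTime” row by row.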

GPS position, latitude, longitude

The GPS position (“SetPosition”) is reported in the format “XX°XX.XXX’ XXX°XX.XXX’” (degrees and decimal minutes, latitude first). A few missing longitude (“SetLongitude”) and latitude (“SetLatitude”) values can be derived from this field. The value “0” frequently occurs in the longitudes/latitudes and should be treated as unknown (“NA”).
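
A minimal Python sketch of both steps follows. It assumes the position string holds latitude first and longitude second, each as whole degrees plus decimal minutes. The function names and the sample coordinate in the usage note are made up for illustration.

```python
import re

# Whole degrees, a degree sign, decimal minutes, then a minute mark
# (straight apostrophe, curly apostrophe, or prime).
COORD_PATTERN = re.compile(r"(\d{1,3})\s*°\s*(\d{1,2}(?:\.\d+)?)\s*['’′]")


def parse_set_position(position):
    """Return (latitude in °N, longitude in °W) as decimal degrees, or None."""
    if not position:
        return None
    parts = COORD_PATTERN.findall(str(position))
    if len(parts) != 2:
        return None
    (lat_deg, lat_min), (lon_deg, lon_min) = parts
    latitude = int(lat_deg) + float(lat_min) / 60
    longitude = int(lon_deg) + float(lon_min) / 60
    return latitude, longitude


def zero_to_missing(value):
    """Treat a reported coordinate of 0 (or a blank) as unknown."""
    if value is None or str(value).strip() == "":
        return None
    number = float(value)
    return None if number == 0 else number
```

For example, `parse_set_position("34°00.500’ 119°30.000’")` yields roughly 34.0083 °N and 119.5 °W. Because the longitude is returned in degrees west, it would need to be negated before plotting with tools that expect negative values for the western hemisphere.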

The coordinates often fall on land and other unlikely places. They could be improved by looking at the coordinates provided by the light boats for the same fishing trip or by figuring out whether they fall within the block id reported in the landing receipts.

Depth and temperature

These columns indicate the bottom depth (“BottomDepth”, in fathoms) and surface temperature (“Temperature”, in °F) of the water where the brail was set.

Catch estimate and order limit

The catch estimate (“CatchEstimate”) column provides an estimate of the catch in short tons. A short ton is equivalent to 2000 pounds (lbs).
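
Because bycatch is reported in pounds while the catch estimate is in short tons, a unit conversion is usually needed before the two can be compared. The Python sketch below shows the arithmetic; the function names are hypothetical, and the metric-ton conversion is included only as a convenience.

```python
LBS_PER_SHORT_TON = 2000
KG_PER_LB = 0.45359237  # exact, by definition of the international pound


def short_tons_to_lbs(short_tons):
    """Convert short tons to pounds."""
    return short_tons * LBS_PER_SHORT_TON


def short_tons_to_metric_tons(short_tons):
    """Convert short tons to metric tons (1 metric ton = 1000 kg)."""
    return short_tons * LBS_PER_SHORT_TON * KG_PER_LB / 1000
```

For example, a catch estimate of 2.5 short tons corresponds to 5000 lbs.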

This column (“LtdByMarketOrder”) indicates whether the catch was limited by a market order and can be “yes” or “no”. It is often not provided.

Id of light boat set upon

This column (“LightBrailSetUpon”) provides a comma-separated list of the vessel ids of the light boats that the brail was set upon. An example entry is: “12345, 23456, 34567”. If no light boat was used, this field is empty.
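
A minimal Python sketch for splitting such a list into individual ids. The function name `split_id_list` is hypothetical, and the ids are kept as text rather than converted to numbers. The same approach should apply to the “LandingReceipts” column, which uses the same comma-separated layout.

```python
def split_id_list(value):
    """Split a comma-separated id string into a list of trimmed ids.

    Blank or missing values return an empty list, which corresponds to
    no light boat being reported.
    """
    if value is None:
        return []
    return [item.strip() for item in str(value).split(",") if item.strip()]
```

Applied to the example above, this returns the three ids as separate list items, which makes it straightforward to count light boats per set or to reshape the data to one row per light boat.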

Bycatch

This column (“ByCatch”) provides a comma-separated list of bycatch species and amounts using the following syntax: “SPECIES CODE – SPECIES NAME : pounds of bycatch”. An example entry is: “51 – Mackerel, Pacific : 152, 55 – Mackerel, Jack : 152, 100 – Sardine, Pacific : 50”. Note that species names can themselves contain commas, so entries cannot be separated by splitting on commas alone. I have not parsed these data yet.
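
One possible parsing approach is sketched below in Python. Splitting on commas would break species names such as “Mackerel, Pacific”, so the sketch instead matches each “code, dash, name, colon, pounds” group with a regular expression. It assumes every entry follows the syntax above exactly and has not been run against the real data; the function name `parse_bycatch` is hypothetical.

```python
import re

# code, a dash (hyphen, en dash, or em dash), species name, colon, pounds.
# The name is matched lazily so that it stops at the first " : number".
BYCATCH_PATTERN = re.compile(
    r"(\d+)\s*[\u2013\u2014-]\s*(.+?)\s*:\s*(\d+(?:\.\d+)?)"
)


def parse_bycatch(value):
    """Return a list of (species code, species name, pounds) tuples."""
    if not value:
        return []
    return [
        (int(code), name.strip(), float(pounds))
        for code, name, pounds in BYCATCH_PATTERN.findall(str(value))
    ]
```

Applied to the example entry, this yields three records: codes 51, 55, and 100 with 152, 152, and 50 lbs respectively. Entries that deviate from the syntax (for example, a missing weight) would be skipped or merged with a neighbor, so comparing the number of parsed records against the number of colons in the raw string is a reasonable sanity check.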

Landing receipts

This column (“LandingReceipts”) provides a comma-separated list of the landing receipt ids associated with the catch. An example entry is: “W123456, W234567, W345678”.