The purpose of this guide is to orient scientists and other data users to California’s commercial squid light boat logbook data. These data are confidential and are only provided through authorized data sharing agreements. This guide seeks to help data users anticipate how these data are formatted and how they may have to “clean” (i.e., process) the data before using them in analysis. I also highlight tricks and caveats gleaned from my experience using the data. Ultimately, I hope this guide will clarify what data are available and how they can be processed to maximize their utility.
Overview
The California Department of Fish and Wildlife (CDFW) has been compiling catch statistics on California’s fisheries since 1916. CDFW has required commercial squid light boat vessels to submit logbooks since 2000. These logbook data are used to monitor fishing locations, environmental conditions, fishing effort, catch amounts, the use of catch, and fleet characterization and capacity. Additional information is available here.
An example logbook is shown below.
This guide provides an overview of California’s squid light boat logbook data based on a data request processed in January 2024 for all squid light boat logbooks from 2000-2022. In this guide, I review the attributes of the squid light boat logbook data, the steps required to clean the raw data, and some visualizations of non-confidential summaries of the squid data.
Data attributes
The squid light boat logbook data include the following columns:
Column | Description |
SerialNumber | Logbook id |
VesselID | Vessel id (e.g., 12345) |
VesselName | Vessel name |
CaptainID | Captain id (e.g., L12345) |
CaptainName | Captain name |
PermitNumber | Permit number (e.g., LBT123) |
Comments | Comments |
LogDateString | Date |
Location | GPS position |
GeneralLocation | Location code (e.g., AI-AR) |
LocationDescription | Location description (e.g., ANACAPA ISLAND-ARCH ROCK) |
Lat_DD | Latitude (°N) |
Long_DD | Longitude (°W) |
HoursSearching | Hours spent searching |
HoursLighting | Hours spent lighting |
Seiner | Seiner id |
EstTonnageRemaining | Estimated tonnage remaining |
BirdsPresent | Birds present (y/n)? |
MammalsPresent | Mammals present (y/n)? |
StartTime | Start time |
EndTime | End time |
ElapsedTime | Elapsed time |
BottomDepth | Bottom depth (fathoms) |
AmountSold | Amount sold (short tons) |
LandingReceipt | Landing receipt |
AmtForLiveBait | Amount for live bait (lbs) |
ByCatch | Bycatch (species – lbs) |
Attribute completeness
The figure below illustrates how consistently each attribute was filled out.
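Completeness of this kind can be computed as the share of records with a non-missing value in each column. The following is a minimal Python sketch, not the code used to produce the figure. It assumes the logbook has been read into a list of dictionaries (one per record) and that empty strings and "NA" mark missing values; both are assumptions, and the function names are hypothetical.

```python
# Markers treated as "missing" (an assumption; adjust to the delivered file).
MISSING_MARKERS = {"", "NA"}


def is_filled(value):
    """Return True if a cell holds something other than a missing marker."""
    if value is None:
        return False
    return str(value).strip() not in MISSING_MARKERS


def completeness(rows, columns):
    """Return the proportion of rows with a filled value, per column."""
    total = len(rows)
    if total == 0:
        return {col: None for col in columns}
    return {
        col: sum(is_filled(row.get(col)) for row in rows) / total
        for col in columns
    }


# Made-up records for illustration only (not real logbook data).
toy_rows = [
    {"VesselID": "12345", "VesselName": ""},
    {"VesselID": "12345", "VesselName": "EXAMPLE VESSEL"},
]
shares = completeness(toy_rows, ["VesselID", "VesselName"])
```

Note that this counts a reported “0” as filled, which may overstate completeness for the coordinate columns.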
Attribute information and issues
Logbook id
The logbook id (“SerialNumber”) is a unique identifier for each logbook page. A trip could span multiple logbook pages.
Vessel id and name
The vessel id (“VesselID”) is a 5-digit unique identifier assigned to each vessel by CDFW. The vessel name (“VesselName”) is the name of the vessel. The vessel id is nearly complete, but the vessel name is missing more often. These missing values could be filled through examination of another dataset or by querying this resource.
Captain id and name
The captain id (“CaptainID”) is a 6-character unique identifier assigned to each captain by CDFW. It consists of the letter “L” followed by five digits (e.g., “L12345”). The captain name (“CaptainName”) is presented in several formats. The captain name is missing more often than the captain id.
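The id formats described above lend themselves to a simple format check. The sketch below is one way to flag entries that do not match; the function names are hypothetical, and the patterns assume that every vessel id has exactly five digits and every captain id is “L” plus five digits, as in the examples.

```python
import re

# Patterns inferred from the examples in this guide (assumptions).
VESSEL_ID_RE = re.compile(r"\d{5}")
CAPTAIN_ID_RE = re.compile(r"L\d{5}")


def is_valid_vessel_id(value):
    """True if the value looks like a 5-digit vessel id (e.g., 12345)."""
    if value is None:
        return False
    return bool(VESSEL_ID_RE.fullmatch(str(value).strip()))


def is_valid_captain_id(value):
    """True if the value looks like 'L' followed by five digits."""
    if value is None:
        return False
    return bool(CAPTAIN_ID_RE.fullmatch(str(value).strip()))
```

If the ids were read in as numbers rather than text, any leading zeros would be lost and valid ids would fail this check, so reading these columns as text is the safer choice.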
Vessel permit
The vessel permit number (“PermitNumber”) is the permit id associated with the vessel. It consists of a three-letter permit type followed by a number (e.g., “SVT123”). It is not often provided. The following permit types appear in the data:
- ABT = unknown (possibly a typo for SBT)
- LBN = Light Boat Non-Transferable
- LBT = Light Boat Transferable
- LVT = unknown (possibly a typo for LBT)
- SBT = Squid Brail Transferable
- SVN = Squid Vessel Non-Transferable
- SVT = Squid Vessel Transferable
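The permit types above can be looked up from the first three letters of the permit number. This is a minimal sketch with a hypothetical function name. It deliberately labels the two unexplained prefixes (ABT, LVT) as “Unknown” instead of silently correcting them, since whether they are typos is not confirmed.

```python
# Permit types documented in this guide. ABT and LVT are left out on purpose
# because their meaning is not confirmed.
PERMIT_TYPES = {
    "LBN": "Light Boat Non-Transferable",
    "LBT": "Light Boat Transferable",
    "SBT": "Squid Brail Transferable",
    "SVN": "Squid Vessel Non-Transferable",
    "SVT": "Squid Vessel Transferable",
}


def permit_type(permit_number):
    """Return the permit type for a permit number such as 'LBT123'.

    Returns None for missing values and 'Unknown' for unrecognized prefixes.
    """
    if permit_number is None or not str(permit_number).strip():
        return None
    prefix = str(permit_number).strip().upper()[:3]
    return PERMIT_TYPES.get(prefix, "Unknown")
```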
Comments
The comments (“Comments”) column describes anecdotal information such as additional bycatch information, equipment problems, interference from other boats, weather-related problems, day set activity, etc. It also includes CDFW staff comments during data entry.
Date
The date (“LogDateString”) column describes the date of fishing.
Location code and description
The location code (“GeneralLocation”) provides a code describing the location of fishing:
- AI = Anacapa Island
- CA = Santa Catalina Island
- CL = San Clemente Island
- CO = Coastal
- CR = Santa Cruz Island
- MN = Monterey Bay
- PT = Point Conception
- SB = Santa Barbara Island
- SF = San Francisco
- SN = San Nicolas Island
- SR = Santa Rosa Island
A few codes used a “_” instead of a “-” and needed to be corrected. A few of the location code entries actually report block ids, which I moved to a separate column.
The location description column spells out the name of the location associated with the code. CDFW can provide a key that provides the block id associated with each location.
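The two location code fixes described above (replacing “_” with “-” and moving block ids to their own column) could be sketched as follows. The function names are hypothetical. The sketch assumes that a block id is an entry made up only of digits and that the letters before the hyphen identify the general area; both are assumptions to check against the delivered data.

```python
import re

# Assumption: a block id entered in the location code field is all digits.
BLOCK_ID_RE = re.compile(r"\d+")

AREAS = {
    "AI": "Anacapa Island", "CA": "Santa Catalina Island",
    "CL": "San Clemente Island", "CO": "Coastal",
    "CR": "Santa Cruz Island", "MN": "Monterey Bay",
    "PT": "Point Conception", "SB": "Santa Barbara Island",
    "SF": "San Francisco", "SN": "San Nicolas Island",
    "SR": "Santa Rosa Island",
}


def clean_location_code(raw):
    """Split a raw GeneralLocation entry into (location_code, block_id)."""
    if raw is None or not str(raw).strip():
        return (None, None)
    value = str(raw).strip()
    if BLOCK_ID_RE.fullmatch(value):
        return (None, value)  # a block id, not a location code
    return (value.replace("_", "-"), None)


def area_name(location_code):
    """Look up the general area from the part of the code before the hyphen."""
    if not location_code:
        return None
    return AREAS.get(location_code.split("-")[0])
```

For example, `clean_location_code("AI_AR")` returns `("AI-AR", None)`, and `area_name("AI-AR")` returns `"Anacapa Island"`.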
GPS position, latitude (°N), longitude (°W)
The GPS position (“Location”) is reported in the format “XX°XX.XXX’ XXX°XX.XXX’” (degrees and decimal minutes, latitude first). A few missing latitude (“Lat_DD”) and longitude (“Long_DD”) values can be derived from this field. The value “0” frequently occurs in the latitudes/longitudes and should be treated as unknown (“NA”). A few of the GPS position entries actually report block ids, which I moved to a separate column.
The coordinates often fall on land and other unlikely places. They could be improved by looking at the coordinates provided by the seiners for the same fishing trip or by figuring out whether they fall within the block id reported in the landing receipts.
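The GPS position format and the handling of zeros described above can be sketched as follows. This is an illustration under stated assumptions, not the cleaning code used for this guide: the function names are hypothetical, the pattern assumes degrees and decimal minutes separated by whitespace, and longitude is returned as a positive number of degrees west (negate it before plotting with most mapping tools).

```python
import re

# Degrees + decimal minutes for latitude, whitespace, then the same for
# longitude. The minute mark may be a straight, curly, or prime character.
GPS_RE = re.compile(
    r"(\d{1,2})\s*°\s*(\d{1,2}(?:\.\d+)?)\s*['’′]?\s+"
    r"(\d{1,3})\s*°\s*(\d{1,2}(?:\.\d+)?)\s*['’′]?"
)


def parse_gps(position):
    """Return (lat, lon_west) in decimal degrees, or (None, None)."""
    if not position:
        return (None, None)
    match = GPS_RE.fullmatch(str(position).strip())
    if match is None:
        return (None, None)  # e.g., a block id entered in this field
    lat = int(match.group(1)) + float(match.group(2)) / 60
    lon = int(match.group(3)) + float(match.group(4)) / 60
    return (lat, lon)


def zero_to_none(value):
    """Treat 0, blanks, and non-numeric entries as unknown."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if number == 0 else number


def fill_coordinate(reported, derived):
    """Prefer the reported coordinate; fall back to the derived one."""
    cleaned = zero_to_none(reported)
    return cleaned if cleaned is not None else derived
```

A record with a reported latitude of 0 and a parseable GPS position would then take its latitude from `parse_gps`.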
Hours spent searching and lighting
The “HoursSearching” column describes the number of hours spent searching (metering) for squid. It also includes the time spent holding locations for day sets. It should include one value per date or one value for each location if operating in multiple locations on the same date. It represents the total across seiners if the vessel provided light for more than one seiner.
The “HoursLighting” column describes the number of hours spent attempting to attract squid with lights. It should include one value per date or one value for each location if operating in multiple locations on the same date. It represents the total across seiners if the vessel provided light for more than one seiner.
Seiner id
This column (“Seiner”) provides a comma-separated list of the seiner vessels for which the light boat provided light. Each entry uses the following syntax: “VESSEL ID – VESSEL NAME : Estimated short tons”.
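One way to split such an entry into one record per seiner is with a regular expression. The sketch below uses a hypothetical function name and assumes that the vessel id is numeric, that the tonnage is always present, and that the separator is an en dash or a hyphen. The vessel names in the usage line are made up.

```python
import re

# One "ID – NAME : TONS" item, ending at a comma or the end of the string.
SEINER_RE = re.compile(
    r"(\d+)\s*[–-]\s*(.+?)\s*:\s*(\d+(?:\.\d+)?)\s*(?:,|$)"
)


def parse_seiners(entry):
    """Return a list of (vessel_id, vessel_name, short_tons) tuples."""
    if not entry:
        return []
    return [
        (match.group(1), match.group(2), float(match.group(3)))
        for match in SEINER_RE.finditer(str(entry))
    ]


# Made-up entry for illustration only.
seiners = parse_seiners("12345 – EXAMPLE ONE : 20, 23456 – EXAMPLE TWO : 15.5")
```

Entries that omit the tonnage or use another layout would be skipped by this pattern, so it is worth comparing the number of parsed items with the number of commas in the raw entry.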
Birds and/or mammals present (y/n)?
These columns indicate whether birds (“BirdsPresent”) or mammals (“MammalsPresent”) were present: “yes” or “no”.
Start time, end time, elapsed time
The start (“StartTime”) and end (“EndTime”) times describe the beginning and end of a set when brailing. The elapsed time (“ElapsedTime”) describes the duration of brailing in minutes. See the figure above for a visualization of the distribution of reported durations.
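The reported elapsed time can be checked against the start and end times. The sketch below uses hypothetical function names and assumes that times are recorded as 24-hour “HH:MM” strings, which this guide does not confirm. It also assumes that an end time earlier than the start time means the set ran past midnight.

```python
MINUTES_PER_DAY = 24 * 60


def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes after midnight."""
    hours, minutes = str(hhmm).strip().split(":")
    return int(hours) * 60 + int(minutes)


def elapsed_minutes(start, end):
    """Minutes from start to end, wrapping past midnight if needed."""
    return (to_minutes(end) - to_minutes(start)) % MINUTES_PER_DAY


def matches_reported(start, end, reported, tolerance=1):
    """True if the reported ElapsedTime agrees within the tolerance (minutes)."""
    return abs(elapsed_minutes(start, end) - float(reported)) <= tolerance
```

For example, `elapsed_minutes("22:30", "01:15")` gives 165 minutes.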
Bottom depth (fathoms)
This column (“BottomDepth”) describes the bottom depth in fathoms if fishing by brail.
Biomass remaining, amount sold, amount for live bait
The estimated tonnage remaining (“EstTonnageRemaining”) describes the amount of squid left in the surrounding area in short tons. It should include one value per date or one value for each location if operating in multiple locations on the same date. It represents the total across seiners if the vessel provided light for more than one seiner.
The amount sold (“AmountSold”) represents the amount of a vessel’s brail-caught squid sold to market in short tons.
The amount for live bait (“AmtForLiveBait”) represents the vessel’s live bait catch amount in pounds.
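Because the amount sold is in short tons and the live bait amount is in pounds, the two must be converted to a common unit before they are compared or combined (1 short ton = 2,000 lbs). The sketch below uses hypothetical function names. Summing the two columns and treating a missing value as zero are my assumptions, not rules documented for these data.

```python
LBS_PER_SHORT_TON = 2000


def lbs_to_short_tons(pounds):
    """Convert pounds to short tons."""
    return pounds / LBS_PER_SHORT_TON


def short_tons_to_lbs(tons):
    """Convert short tons to pounds."""
    return tons * LBS_PER_SHORT_TON


def brail_catch_short_tons(amount_sold_tons, live_bait_lbs):
    """Combine AmountSold (short tons) and AmtForLiveBait (lbs) in short tons.

    Returns None when both values are missing; a single missing value is
    treated as zero (an assumption).
    """
    if amount_sold_tons is None and live_bait_lbs is None:
        return None
    sold = amount_sold_tons or 0.0
    bait = lbs_to_short_tons(live_bait_lbs or 0.0)
    return sold + bait
```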
Landing receipts
This column (“LandingReceipt”) provides a comma-separated list of the landing receipt ids associated with the catch. An example entry is: “W123456, W234567, W345678”.
Bycatch (species – lbs)
This column (“ByCatch”) provides a comma-separated list of bycatch species and amounts using the following syntax: “SPECIES CODE – SPECIES NAME : pounds of bycatch”. An example entry is: “51 – Mackerel, Pacific : 152, 55 – Mackerel, Jack : 152, 100 – Sardine, Pacific : 50”. Note that species names themselves contain commas, so entries cannot be split on commas alone. I have not parsed these data yet.
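As a starting point for parsing, the sketch below applies a regular expression to the example entry above. Splitting on commas alone would break species names such as “Mackerel, Pacific”, so the pattern instead matches each “code – name : pounds” item as a whole. The function name is hypothetical, and the pattern assumes numeric species codes and amounts and an en dash or hyphen as the separator; real entries may deviate from this.

```python
import re

# One "CODE – NAME : POUNDS" item. The name is matched lazily so that commas
# inside species names (e.g., "Mackerel, Pacific") are kept.
BYCATCH_RE = re.compile(
    r"(\d+)\s*[–-]\s*(.+?)\s*:\s*(\d+(?:\.\d+)?)\s*(?:,|$)"
)


def parse_bycatch(entry):
    """Return a list of (species_code, species_name, pounds) tuples."""
    if not entry:
        return []
    return [
        (match.group(1), match.group(2), float(match.group(3)))
        for match in BYCATCH_RE.finditer(str(entry))
    ]


example = (
    "51 – Mackerel, Pacific : 152, 55 – Mackerel, Jack : 152, "
    "100 – Sardine, Pacific : 50"
)
bycatch = parse_bycatch(example)
```

Here `bycatch` holds three tuples, one per species, with the species name kept intact.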