The purpose of this guide is to orient scientists and other data users to California’s commercial shrimp/prawn trawl logbook data. These data are confidential and are only provided through authorized data sharing agreements. This guide seeks to help scientists and other data users anticipate how these data are formatted and how they may have to “clean” (i.e., process) the data before using them in analysis. I also highlight special tricks and caveats gleaned from my experience using the data. Ultimately, I hope this guide will help clarify what data are available and how these data can be processed to maximize their utility.
Overview
The California Department of Fish and Wildlife (CDFW) has been compiling catch statistics on California’s fisheries since 1916. CDFW has required commercial shrimp/prawn trawl vessels to submit logbooks since 1982. These logbook data record start and end haul locations, time, depth, and duration of trawl tows, total catch by species market category, gear used, and information about the vessel and crew. Additional information is available here.
This guide provides an overview of California’s commercial shrimp/prawn trawl logbook data based on a data request processed in January 2024 for all shrimp/prawn logbooks from 1982 to 2023. In this guide, I review the attributes of the shrimp/prawn trawl logbook data, the steps required to clean the raw data, and some visualizations of non-confidential summaries of the shrimp/prawn data.
Data attributes
The shrimp/prawn trawl logbook data include the following columns:
Column | Description |
logserialnum | Logbook id |
departuredate | Departure date |
landingdate | Landing date |
VesselNum | Vessel id (e.g., 12345) |
VesselName | Vessel name |
PortCode | Port code |
PortDesc | Port name |
NetTypeDesc | Net type (single- or double-rigged) |
tblLog_Comments | Comments #1 |
oldyear | Year |
detaildate | Date of tow |
dragnumber | Tow number |
blocknumber | Block id |
settime | Time at start of tow (HH:MM) |
setdepth | Depth at start of tow (fathoms) |
updepth | Depth at end of tow (fathoms) |
uptime | Time at end of tow (HH:MM) |
totaltime | Duration of tow (minutes) |
setlatdeg | Latitude degrees at start of tow |
setlatdec | Latitude minutes at start of tow |
setlngdeg | Longitude degrees at start of tow |
setlngdec | Longitude minutes at start of tow |
uplatdeg | Latitude degrees at end of tow |
uplatdec | Latitude minutes at end of tow |
uplngdeg | Longitude degrees at end of tow |
uplngdec | Longitude minutes at end of tow |
setlorancx | LORAN-C x-value at start of tow |
setlorancy | LORAN-C y-value at start of tow |
setlorancw | LORAN-C w-value at start of tow |
uplorancx | LORAN-C x-value at end of tow |
uplorancy | LORAN-C y-value at end of tow |
uplorancw | LORAN-C w-value at end of tow |
setloranamin | LORAN-A min-value at start of tow |
setloranamax | LORAN-A max-value at start of tow |
uploranamin | LORAN-A min-value at end of tow |
uploranamax | LORAN-A max-value at end of tow |
SpeciesCode | Species code |
MarketCatDesc | Species name |
totalpounds | Catch (lbs) |
tblLogDetail_Comments | Comments #2 |
Attribute completeness
The figure below illustrates how consistently each attribute was filled out.
Attribute completeness over time
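A completeness summary like the one described above can be computed as the fraction of non-missing values in each column for each year. The following is a minimal sketch rather than the code behind the figure. It assumes the logbook rows have been read into a list of dictionaries keyed by the column names in the table above, that blank strings and None both count as missing, and that the “oldyear” column is used to group rows by year. The function name and the example rows are hypothetical.

```python
from collections import defaultdict


def completeness_by_year(rows, year_key="oldyear"):
    """Return {year: {column: fraction of rows with a non-missing value}}."""
    filled = defaultdict(lambda: defaultdict(int))  # non-missing counts
    totals = defaultdict(int)                       # number of rows per year
    for row in rows:
        year = row.get(year_key)
        totals[year] += 1
        for col, value in row.items():
            # Assumption: None and blank strings both mean "not filled out".
            if value is not None and str(value).strip() != "":
                filled[year][col] += 1
    columns = {col for row in rows for col in row}
    return {
        year: {col: filled[year][col] / totals[year] for col in columns}
        for year in totals
    }


# Made-up example rows (not real logbook records).
rows = [
    {"oldyear": 1990, "settime": "06:30", "setdepth": 40},
    {"oldyear": 1990, "settime": "", "setdepth": 55},
    {"oldyear": 1991, "settime": "07:15", "setdepth": None},
]
result = completeness_by_year(rows)
```

In this example, half of the 1990 rows have a set time, so result[1990]["settime"] is 0.5.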
Number of logbooks by year
There are no logbooks for 2008, 2011, or 2012, and a few other years have very few logbooks. I have not yet determined why.
Logged catch vs. landing receipts
Attribute information and issues
Logbook id
The logbook id (“LogSerialNum”) is a unique identifier for each logbook page. Note that a trip could potentially span multiple logbook pages or that a page could include multiple trips.
Vessel id and name
The vessel id (“VesselNum”) is a unique five-digit identifier assigned to each vessel by CDFW. The vessel name (“VesselName”) is the name of the vessel. The vessel name is missing for many vessels, but the vessel id column is nearly complete. The missing names could be filled in by consulting another dataset or by querying this resource.
Departure, return, and tow dates
These columns describe the date of the tow (“DetailDate”) as well as the date of departure (“DepartureDate”) and return (“LandingDate”). The date of the tow is most commonly reported. The year for a few logbooks missing dates can be recovered from the “OldYear” column.
Port code and name
The port of landing is identified using a three-digit code (“PortCode”). I provide a key to relate these codes to their names and complexes here. The name of the port is provided in the “PortDesc” column.
Net type
The net type (“NetTypeDesc”) is either (a) single-rigged or (b) double-rigged. The diagram below shows several types of trawl rigs (McHugh et al. 2017).
Comments
There are two comments columns: “tblLog_Comments” and “tblLogDetail_Comments”. They include a lot of useful information that would ideally be parsed into the database.
Tow number
The tow number (“DragNumber”) column describes the order of the tows. A few implausibly large tow numbers are actually the time at which the tow was conducted and need to be corrected.
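One way to find these mis-entered tow numbers is to flag any value above a plausible number of tows and check whether it reads as a clock time. This is a minimal sketch under my own assumptions, not a documented CDFW correction: the threshold of 30 tows is hypothetical, and I assume the times were entered as integers in HHMM form (e.g., 1430 for 14:30). Both function names are illustrative.

```python
MAX_PLAUSIBLE_TOWS = 30  # hypothetical threshold; tune it against the data


def as_hhmm(tow_number):
    """Interpret an integer such as 1430 as "14:30"; None if not a valid time."""
    hours, minutes = divmod(int(tow_number), 100)
    if 0 <= hours <= 23 and 0 <= minutes <= 59:
        return f"{hours:02d}:{minutes:02d}"
    return None


def check_tow_number(tow_number):
    """Return (clean_tow_number, suspected_time).

    Plausible tow numbers pass through unchanged. Flagged values are
    replaced with None, and the suspected time is returned when the value
    parses as HHMM (otherwise it is also None).
    """
    if tow_number is None or tow_number <= MAX_PLAUSIBLE_TOWS:
        return tow_number, None
    return None, as_hhmm(tow_number)
```

Flagged rows still need a corrected tow number. One option is to renumber the tows within a logbook and date by their set times, but that should be checked against the original pages.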
Block id
Effort is reported by statistical reporting block (“BlockNumber”). CDFW provides the following guidance about the location of the different shrimp/prawn fisheries:
- The spot prawn trawl fishery was closed in 2003.
- The ridgeback prawn fishery occurs in southern California (south of Pt. Conception) between the ports of Santa Barbara to San Pedro or Terminal Island, and occasionally off Oceanside.
- The pink shrimp fishery is primarily from Morro Bay/Port San Luis (north of Pt. Conception) to the California/Oregon border or Crescent City.
- There may be pink shrimp caught south of Pt. Conception, but they are caught incidentally while fishing for one of the main target trawl species in southern California (California halibut, giant red sea cucumber, or ridgeback prawn).
Set time, up time, duration
The start (“SetTime”) and end (“UpTime”) times are in 24-hour format: “HH:MM”. The duration (“TotalTime”) column describes the duration of the fishing effort in minutes. The reported duration matches the duration derived from the reported start and end times, so the derived duration can also be used to fill missing values.
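Deriving the duration from the start and end times can be sketched as follows. This is a minimal illustration, not the cleaning code used for this guide. It assumes the times are “HH:MM” strings and that an end time earlier than the start time means the tow crossed midnight. The function names are hypothetical.

```python
def parse_hhmm(text):
    """Convert "HH:MM" to minutes after midnight; None if unparseable."""
    try:
        hours, minutes = text.strip().split(":")
        hours, minutes = int(hours), int(minutes)
    except (AttributeError, ValueError):
        return None  # missing value or not in HH:MM form
    if not (0 <= hours <= 23 and 0 <= minutes <= 59):
        return None
    return hours * 60 + minutes


def derive_duration(set_time, up_time):
    """Duration in minutes between the start and end of a tow."""
    start, end = parse_hhmm(set_time), parse_hhmm(up_time)
    if start is None or end is None:
        return None
    # Assumption: an end time before the start time means the tow
    # crossed midnight, so the difference wraps around 24 hours.
    return (end - start) % (24 * 60)


def fill_duration(total_time, set_time, up_time):
    """Keep the reported duration; use the derived one only when missing."""
    if total_time is not None:
        return total_time
    return derive_duration(set_time, up_time)
```

For example, derive_duration("06:30", "08:00") gives 90 minutes, and a tow from 23:30 to 00:45 gives 75 minutes under the midnight assumption.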
Set and up depth
The depths at the start and end of the tow are reported in fathoms.
Set and up location
The coordinates for the start and end of a tow are only sometimes reported. Most are reported as GPS coordinates, but some are reported as LORAN-C coordinates.
The GPS coordinates are stored with the degrees and minutes in separate columns, which need to be merged into decimal degrees (degrees + minutes/60). The longitudes are recorded as positive values and need to be multiplied by -1 to place them in the western hemisphere. Many of the coordinates are invalid or unrealistic. The unrealistic ones could be filtered by checking whether they fall within the reported block.
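The merge of the degree and minute columns can be sketched as follows. This is a minimal illustration under stated assumptions: the minutes columns hold decimal minutes between 0 and 60, and the function names are hypothetical. The bounding box is a coarse stand-in of my own choosing for waters off California, not an official extent and not a substitute for checking coordinates against the reported block.

```python
def to_decimal_degrees(degrees, minutes):
    """Merge degrees and decimal minutes; None if missing or invalid."""
    if degrees is None or minutes is None:
        return None
    degrees, minutes = float(degrees), float(minutes)
    if not 0 <= minutes < 60:
        return None  # minutes outside [0, 60) cannot be valid
    return degrees + minutes / 60.0


def tow_position(lat_deg, lat_min, lng_deg, lng_min):
    """Return (latitude, longitude) in decimal degrees, or None."""
    lat = to_decimal_degrees(lat_deg, lat_min)
    lng = to_decimal_degrees(lng_deg, lng_min)
    if lat is None or lng is None:
        return None
    # Longitudes are logged as positive values, so multiply by -1.
    return lat, -lng


def is_plausible(lat, lng, lat_range=(32.0, 42.5), lng_range=(-126.0, -117.0)):
    """Coarse check against a hypothetical bounding box off California."""
    return (lat_range[0] <= lat <= lat_range[1]
            and lng_range[0] <= lng <= lng_range[1])
```

For example, tow_position(34, 30, 120, 15) returns (34.5, -120.25), which passes the coarse check. A longitude that has not been negated fails it.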
I’ve written an R function that uses this NOAA web tool to convert the LORAN-C coordinates to GPS coordinates. It is the ‘loran_to_gps’ function in the wcfish R package (see ‘?loran_to_gps’).
Interestingly, many years have nearly 100% complete GPS coordinate data, while other years are missing these data entirely. It’s not clear to me why this is.
Species id and name
There are 48 species documented in the catch. The following species are dominant: ridgeback prawn, spot prawn, and Pacific pink shrimp.
Catch (lbs)
The catch (“TotalPounds”) is recorded in pounds (lbs).
CDFW notes: Some fishermen report a total poundage for a complete day of fishing effort and some report poundage on a per trawl tow basis. In these cases, we attempt to divide the daily total poundage by the number of tows on that day to provide an estimate of the poundage on a per tow basis. We indicate when this process is used in the “DetailComments” field.
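The per-tow allocation that CDFW describes amounts to dividing the daily total evenly among the tows made that day. A minimal sketch of that arithmetic follows. The function name is hypothetical, and this is not CDFW’s code.

```python
def split_daily_catch(daily_pounds, n_tows):
    """Spread a daily total evenly across the tows made that day."""
    if n_tows <= 0:
        raise ValueError("n_tows must be a positive integer")
    per_tow = daily_pounds / n_tows
    return [per_tow] * n_tows
```

For example, a daily total of 900 lbs over 3 tows becomes 300 lbs per tow. The estimated values sum to the reported daily total, but they hide any real variation among tows, so tows flagged this way may warrant separate treatment in tow-level analyses.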