We recommend working with this notebook on Google Colab.

Downloading Tide Data from NOAA CO-OPS API

Author of this document: Timothy Divoll

The purpose of this notebook is to demonstrate how to download data from NOAA’s CO-OPS Data API. In this example, data are parsed into a dataframe and also written to CSV files, as needed for use in the Assessing Accuracy of the Tide Predictions of the Ocean State Ocean Model notebook.

If needed, the dataframes can also be saved in other common formats for export.
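
As a minimal sketch of saving to other formats, the block below writes a small synthetic dataframe to JSON and to a pickle file, then reads both back. The sample values, the temporary folder, and the file names are assumptions for illustration only; the column names mirror those used later in this notebook.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Tiny synthetic frame (made-up values) with the column names used later on
sample = pd.DataFrame({
    "date_time": pd.to_datetime(["2022-06-16 00:00", "2022-06-16 01:00"]),
    "Verified (m)": [0.353, 0.238],
    "Predicted (m)": [0.408, 0.311],
})

out_dir = Path(tempfile.mkdtemp())  # throwaway folder for the demonstration

# JSON: portable text format; ISO dates keep the timestamps readable
json_path = out_dir / "sample.json"
sample.to_json(json_path, orient="records", date_format="iso")

# Pickle: preserves dtypes exactly, but can only be read from Python
pickle_path = out_dir / "sample.pkl"
sample.to_pickle(pickle_path)

# Read both files back to confirm the round trip
restored = pd.read_pickle(pickle_path)
records = json.loads(json_path.read_text())
print(restored.dtypes)
print(records[0])
```

CSV remains the format the companion notebook expects; the alternatives here are only useful when the data are consumed elsewhere.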

First, we need to install the noaa_coops Python wrapper (GClunies/noaa_coops)

%pip install noaa_coops -q # this package sends requests to the NOAA CO-OPS Data API
Note: you may need to restart the kernel to use updated packages.
# import dependencies and ignore warnings in code outputs
import noaa_coops
import pandas as pd
from datetime import datetime
import warnings
warnings.filterwarnings('ignore')
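
Later cells assume specific column names in the frames returned by the wrapper (for example water_level and predicted_wl), so it may help to record which package versions are installed when results differ from those shown here. The helper name installed_version below is hypothetical; it only uses the standard library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report the versions of the two packages this notebook depends on
print("noaa_coops:", installed_version("noaa_coops"))
print("pandas:", installed_version("pandas"))
```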

Direct data download - example

First, make a list of all the stations to pull data for:

station_list = ['8461490', '8510560', '8447930', '8449130', '8452660', '8454049', '8447386', '8452944', '8454000']

The next cell extracts data directly from the NOAA CO-OPS API rather than downloading it from the web page. It shows a single example station; a later code block loops through the station list to pull data for each station.

# send a request to the CO-OPS API for Data Retrieval
# https://api.tidesandcurrents.noaa.gov/api/prod/
# MLLW = mean lower low water

# New London, CT example
new_london = noaa_coops.Station(station_list[0]) #use a different index for a different station
new_london_verified = new_london.get_data(
    begin_date = "20120601",
    end_date = "20220616",
    product = "water_level",
    datum = "MLLW",
    units = "metric",
    time_zone = "gmt",
    interval = "h")
new_london_predicted = new_london.get_data(
    begin_date = "20120601",
    end_date = "20220616",
    product = "predictions",
    datum = "MLLW",
    units = "metric",
    time_zone = "gmt",
    interval = "h")

# merge verified and predicted, then rename columns to match `readcsv` function
new_london_df = (
    pd.merge(new_london_verified, new_london_predicted, on="date_time")
    .drop(columns=["sigma", "flags", "QC"])
    .rename(columns={"water_level": "Verified (m)", "predicted_wl": "Predicted (m)"})
    .reset_index()
)
new_london_df["Date"] = new_london_df["date_time"].dt.strftime("%m/%d/%Y")
new_london_df["Time (GMT)"] = new_london_df["date_time"].dt.strftime("%H:%M")

# see the next example to save data to Google Drive or to a local folder
new_london_df.head()
date_time Verified (m) Predicted (m) Date Time (GMT)
0 2004-01-01 00:00:00 0.353 0.408 01/01/2004 00:00
1 2004-01-01 01:00:00 0.238 0.311 01/01/2004 01:00
2 2004-01-01 02:00:00 0.119 0.194 01/01/2004 02:00
3 2004-01-01 03:00:00 0.101 0.104 01/01/2004 03:00
4 2004-01-01 04:00:00 0.118 0.098 01/01/2004 04:00
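
For readers curious about what a wrapper request looks like at the HTTP level, the sketch below builds a query URL from the same parameters passed to get_data above. It makes no network call. The datagetter endpoint path, the format parameter, and the application value are assumptions based on the public CO-OPS API documentation, not something taken from this notebook, and the helper name build_coops_url is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed endpoint under the API root referenced in the cell above
BASE_URL = "https://api.tidesandcurrents.noaa.gov/api/prod/datagetter"


def build_coops_url(station, begin_date, end_date, product):
    """Assemble a query URL; parameter names mirror the get_data arguments."""
    params = {
        "station": station,
        "begin_date": begin_date,
        "end_date": end_date,
        "product": product,
        "datum": "MLLW",
        "units": "metric",
        "time_zone": "gmt",
        "interval": "h",
        "format": "json",                         # assumed response format
        "application": "tide_notebook_example",   # hypothetical identifier
    }
    return BASE_URL + "?" + urlencode(params)


url = build_coops_url("8461490", "20220601", "20220616", "predictions")
print(url)

# Parse the URL again to show that each parameter is encoded as expected
query = parse_qs(urlparse(url).query)
print(query["station"], query["product"])
```

The wrapper is still the more convenient route, since it returns a dataframe directly instead of a raw response.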

Set up a destination to save results

Execute the commands below directly in this notebook to connect to your Google Drive (via Colab) or set a path locally.

NOTE #1: If you are working in Google Colab, the next block will mount a results folder in Google Drive; otherwise, the results folder will be created in the current local directory.

NOTE #2: The following command will open a pop-up window requesting access to your Google Drive file system

try:
    from google.colab import drive
    drive.mount('/content/gdrive/', force_remount=True)
    %mkdir -p ./gdrive/MyDrive/noaa_coops_tide_data/
    results_path = "./gdrive/MyDrive/noaa_coops_tide_data/"
except ModuleNotFoundError:
    import os
    results_dir = "noaa_coops_tide_data"
    parent_dir = "./"
    results_path = os.path.join(parent_dir, results_dir)
    os.makedirs(results_path, exist_ok=True)  # does not fail if the folder already exists

The following code chunk pulls data for all stations in the list. It exports each station’s data to a CSV in the format expected by readcsv in the Assessing Accuracy of the Tide Predictions of the Ocean State Ocean Model notebook.

# Loop through each station and retrieve data
# Use the station list defined in the prior code block
# Only one year of data is extracted for this example, but the date ranges can be changed

results = pd.DataFrame()
for i in station_list:
    station_data = noaa_coops.Station(i)
    station_data_verified = station_data.get_data(
        begin_date = "20210616",
        end_date = "20220616",
        product = "water_level",
        datum = "MLLW",
        units = "metric",
        time_zone = "gmt",
        interval = "h")
    station_data_predicted = station_data.get_data(
        begin_date = "20210616",
        end_date = "20220616",
        product = "predictions",
        datum = "MLLW",
        units = "metric",
        time_zone = "gmt",
        interval = "h")
    results_df = (
        pd.merge(station_data_verified, station_data_predicted, on="date_time")
        .drop(columns=["sigma", "flags", "QC"])
        .rename(columns={"water_level": "Verified (m)", "predicted_wl": "Predicted (m)"})
        .reset_index()
    )
    results_df["Station ID"] = station_data.name
    results_df["Date"] = results_df["date_time"].dt.strftime("%m/%d/%Y")
    results_df["Time (GMT)"] = results_df["date_time"].dt.strftime("%H:%M")
    results_df.to_csv(f'{results_path}/{station_data.name}_tide_data.csv')
    results = pd.concat([results, results_df])  # DataFrame.append was removed in pandas 2.0
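
The merge and formatting steps inside the loop can be tried offline on synthetic data. The sketch below uses made-up values and assumes, for simplicity, that date_time is an ordinary column in both frames; the column names follow the loop above. It shows that the default inner merge keeps only timestamps present in both products, so a prediction without a matching verified reading is dropped.

```python
import pandas as pd

# Synthetic "verified" readings: three hourly values (made-up numbers)
verified = pd.DataFrame({
    "date_time": pd.to_datetime(
        ["2022-06-16 00:00", "2022-06-16 01:00", "2022-06-16 02:00"]
    ),
    "water_level": [0.35, 0.24, 0.12],
    "sigma": [0.0, 0.0, 0.0],
    "flags": ["0,0,0,0"] * 3,
    "QC": ["v"] * 3,
})

# Synthetic "predicted" values: four hourly values, one more than verified
predicted = pd.DataFrame({
    "date_time": pd.to_datetime(
        ["2022-06-16 00:00", "2022-06-16 01:00",
         "2022-06-16 02:00", "2022-06-16 03:00"]
    ),
    "predicted_wl": [0.41, 0.31, 0.19, 0.10],
})

# Inner merge on the timestamp, then drop QC columns and rename for the CSV
merged = (
    pd.merge(verified, predicted, on="date_time")
    .drop(columns=["sigma", "flags", "QC"])
    .rename(columns={"water_level": "Verified (m)", "predicted_wl": "Predicted (m)"})
)

# Split the timestamp into the separate Date and Time (GMT) text columns
merged["Date"] = merged["date_time"].dt.strftime("%m/%d/%Y")
merged["Time (GMT)"] = merged["date_time"].dt.strftime("%H:%M")
print(merged)
```

If gaps in the verified record should be kept as missing values instead of being dropped, passing how="outer" to pd.merge is one option.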
    

View the full dataframe containing data from all stations. Note that the head and tail of the results dataframe display data from different stations.

results
date_time Verified (m) Predicted (m) Station ID Date Time (GMT)
0 2021-01-01 00:00:00 0.223 0.241 New London 01/01/2021 00:00
1 2021-01-01 01:00:00 0.410 0.431 New London 01/01/2021 01:00
2 2021-01-01 02:00:00 0.558 0.564 New London 01/01/2021 02:00
3 2021-01-01 03:00:00 0.647 0.637 New London 01/01/2021 03:00
4 2021-01-01 04:00:00 0.676 0.639 New London 01/01/2021 04:00
... ... ... ... ... ... ...
13171 2022-06-16 19:00:00 0.084 -0.006 Providence 06/16/2022 19:00
13172 2022-06-16 20:00:00 0.137 0.004 Providence 06/16/2022 20:00
13173 2022-06-16 21:00:00 0.303 0.182 Providence 06/16/2022 21:00
13174 2022-06-16 22:00:00 0.523 0.373 Providence 06/16/2022 22:00
13175 2022-06-16 23:00:00 0.678 0.588 Providence 06/16/2022 23:00

118584 rows × 6 columns
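
The output above ends at index 13175 even though the frame has 118584 rows (13176 rows for each of the 9 stations), because each station's rows keep their own index starting at 0. The sketch below uses small synthetic frames to show that behavior, how ignore_index=True renumbers the rows, and how to count rows per station as a quick completeness check. The helper make_station_frame and its values are hypothetical.

```python
import pandas as pd


def make_station_frame(name, n_rows):
    """Build a small synthetic frame standing in for one station's data."""
    return pd.DataFrame({
        "Station ID": [name] * n_rows,
        "Verified (m)": [0.1 * i for i in range(n_rows)],
    })


frames = [make_station_frame("New London", 3), make_station_frame("Providence", 2)]

# Plain concatenation keeps each frame's own index, so labels repeat
stacked = pd.concat(frames)
print(list(stacked.index))

# ignore_index=True gives one continuous index across all stations
renumbered = pd.concat(frames, ignore_index=True)
print(list(renumbered.index))

# Row count per station: a quick check that every station returned data
rows_per_station = renumbered.groupby("Station ID").size()
print(rows_per_station)
```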