{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "a6f1bbe4", "metadata": {}, "source": [ "___We recommend working with this notebook on Google Colab___\n", "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ridatadiscoverycenter/riddc-jbook/blob/main/riddc/notebooks/fox-kemper/noaa_coops_download.ipynb)" ] }, { "cell_type": "markdown", "id": "4cf680d0", "metadata": {}, "source": [ "# Downloading Tide Data from NOAA CO-OPS API" ] }, { "attachments": {}, "cell_type": "markdown", "id": "c7e9278a", "metadata": {}, "source": [ "Author of this document: [Timothy Divoll](https://github.com/tdivoll)" ] }, { "cell_type": "markdown", "id": "3258b417", "metadata": {}, "source": [ "This notebook demonstrates how to download data from NOAA's CO-OPS Data API. In this example, the data are parsed into a dataframe and also written to CSV files for use in the `Assessing Accuracy of the Tide Predictions of the Ocean State Ocean Model` notebook." ] }, { "cell_type": "markdown", "id": "c9d2d0d8", "metadata": {}, "source": [ "If needed, the dataframes can also be saved and exported in other common formats." ] }, { "cell_type": "markdown", "id": "0afff31d", "metadata": {}, "source": [ "First, install the `noaa_coops` Python wrapper (https://github.com/GClunies/noaa_coops)." ] }, { "cell_type": "code", "execution_count": 9, "id": "bf1ba2aa", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[33mDEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n", "\u001b[0m\u001b[33mDEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. 
If you are using a Homebrew or Linuxbrew Python, please see discussion at https://github.com/Homebrew/homebrew-core/issues/76621\u001b[0m\u001b[33m\n", "\u001b[0m\n", "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip available: \u001b[0m\u001b[31;49m22.3.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m23.0\u001b[0m\n", "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49m/opt/homebrew/opt/python@3.9/bin/python3.9 -m pip install --upgrade pip\u001b[0m\n", "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "%pip install noaa_coops -q # this package sends requests to the NOAA CO-OPS Data API" ] }, { "cell_type": "code", "execution_count": 10, "id": "9cc6c5ba", "metadata": {}, "outputs": [], "source": [ "# import dependencies and ignore warnings in code outputs\n", "import noaa_coops\n", "import pandas as pd\n", "from datetime import datetime\n", "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "id": "1d8b2129", "metadata": {}, "source": [ "### Direct data download - example\n", "First, make a list of all the stations to pull data for." ] }, { "cell_type": "code", "execution_count": 21, "id": "0eca4516", "metadata": {}, "outputs": [], "source": [ "station_list = ['8461490', '8510560', '8447930', '8449130', '8452660', '8454049', '8447386', '8452944', '8454000']" ] }, { "cell_type": "markdown", "id": "6cdc9483", "metadata": {}, "source": [ "The next cell extracts the data directly from the NOAA CO-OPS API rather than downloading it from the webpage. It pulls data for a single example station; a later cell loops through the station list to pull data for each station." 
] }, { "cell_type": "code", "execution_count": 22, "id": "7d052486", "metadata": { "scrolled": false }, "outputs": [], "source": [ "# send a request to the CO-OPS API for Data Retrieval\n", "# https://api.tidesandcurrents.noaa.gov/api/prod/\n", "# MLLW = mean lower low water\n", "\n", "# New London, CT example\n", "new_london = noaa_coops.Station(station_list[0]) #use a different index for a different station\n", "new_london_verified = new_london.get_data(\n", " begin_date = \"20120601\",\n", " end_date = \"20220616\",\n", " product = \"water_level\",\n", " datum = \"MLLW\",\n", " units = \"metric\",\n", " time_zone = \"gmt\",\n", " interval = \"h\")\n", "new_london_predicted = new_london.get_data(\n", " begin_date = \"20120601\",\n", " end_date = \"20220616\",\n", " product = \"predictions\",\n", " datum = \"MLLW\",\n", " units = \"metric\",\n", " time_zone = \"gmt\",\n", " interval = \"h\")\n", "\n", "# merge verified and predicted, then rename columns to match `readcsv` function\n", "new_london_df = pd.merge(new_london_verified, new_london_predicted, on=\"date_time\").drop(columns = [\"sigma\", \"flags\", \"QC\"]).rename(columns={\"water_level\": \"Verified (m)\", \"predicted_wl\": \"Predicted (m)\"}).reset_index()\n", "new_london_df[\"Date\"] = new_london_df[\"date_time\"].dt.strftime(\"%m/%d/%Y\")\n", "new_london_df[\"Time (GMT)\"] = new_london_df[\"date_time\"].dt.strftime(\"%H:%M\")\n", "\n", "# see the next example to save data to Google Drive or to a local folder" ] }, { "cell_type": "code", "execution_count": 20, "id": "c6509306", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>date_time</th><th>Verified (m)</th><th>Predicted (m)</th><th>Date</th><th>Time (GMT)</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>2004-01-01 00:00:00</td><td>0.353</td><td>0.408</td><td>01/01/2004</td><td>00:00</td></tr>\n",
"    <tr><th>1</th><td>2004-01-01 01:00:00</td><td>0.238</td><td>0.311</td><td>01/01/2004</td><td>01:00</td></tr>\n",
"    <tr><th>2</th><td>2004-01-01 02:00:00</td><td>0.119</td><td>0.194</td><td>01/01/2004</td><td>02:00</td></tr>\n",
"    <tr><th>3</th><td>2004-01-01 03:00:00</td><td>0.101</td><td>0.104</td><td>01/01/2004</td><td>03:00</td></tr>\n",
"    <tr><th>4</th><td>2004-01-01 04:00:00</td><td>0.118</td><td>0.098</td><td>01/01/2004</td><td>04:00</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
\n", "
" ], "text/plain": [ " date_time Verified (m) Predicted (m) Date Time (GMT)\n", "0 2004-01-01 00:00:00 0.353 0.408 01/01/2004 00:00\n", "1 2004-01-01 01:00:00 0.238 0.311 01/01/2004 01:00\n", "2 2004-01-01 02:00:00 0.119 0.194 01/01/2004 02:00\n", "3 2004-01-01 03:00:00 0.101 0.104 01/01/2004 03:00\n", "4 2004-01-01 04:00:00 0.118 0.098 01/01/2004 04:00" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "new_london_df.head()" ] }, { "cell_type": "markdown", "id": "554b6ee0", "metadata": {}, "source": [ "## Set up a destination to save results\n", "\n", "Execute the commands below directly in this notebook to connect to your Google Drive (via Colab) or set a path locally.\n", "\n", "**NOTE #1: If you are working in Google Colab, the next block will mount a results folder in Google Drive; otherwise, the results folder will be created in the current local directory**\n", "\n", "**NOTE #2: In Colab, the following command will open a pop-up window requesting access to your Google Drive file system**" ] }, { "cell_type": "code", "execution_count": 68, "id": "c00d8c72", "metadata": { "scrolled": true }, "outputs": [], "source": [ "try:\n", "    # in Colab: mount Google Drive and create the results folder there\n", "    from google.colab import drive\n", "    drive.mount('/content/gdrive/', force_remount=True)\n", "    %mkdir -p ./gdrive/MyDrive/noaa_coops_tide_data/\n", "    results_path = \"./gdrive/MyDrive/noaa_coops_tide_data/\"\n", "except ModuleNotFoundError:\n", "    # not in Colab: create the results folder in the current local directory\n", "    import os\n", "    results_dir = \"noaa_coops_tide_data\"\n", "    parent_dir = \"./\"\n", "    results_path = os.path.join(parent_dir, results_dir)\n", "    # exist_ok=True lets this cell be re-run without raising FileExistsError\n", "    os.makedirs(results_path, exist_ok=True)" ] }, { "cell_type": "markdown", "id": "b18e28f2", "metadata": {}, "source": [ "The following code chunk pulls data for all stations in the list. It exports each station's data to a CSV in the format expected by `readcsv` in the `Assessing Accuracy of the Tide Predictions of the Ocean State Ocean Model` notebook." 
] }, { "cell_type": "code", "execution_count": 71, "id": "f0b0e379", "metadata": {}, "outputs": [], "source": [ "# Loop through each station and retrieve data\n", "# Use the station list defined earlier in this notebook\n", "# Only one year of data is extracted for this example, but the date ranges can be changed\n", "\n", "frames = []  # one dataframe per station\n", "for i in station_list:\n", "    station_data = noaa_coops.Station(i)\n", "    station_data_verified = station_data.get_data(\n", "        begin_date = \"20210616\",\n", "        end_date = \"20220616\",\n", "        product = \"water_level\",\n", "        datum = \"MLLW\",\n", "        units = \"metric\",\n", "        time_zone = \"gmt\",\n", "        interval = \"h\")\n", "    station_data_predicted = station_data.get_data(\n", "        begin_date = \"20210616\",\n", "        end_date = \"20220616\",\n", "        product = \"predictions\",\n", "        datum = \"MLLW\",\n", "        units = \"metric\",\n", "        time_zone = \"gmt\",\n", "        interval = \"h\")\n", "    # merge verified and predicted, then rename columns to match `readcsv`\n", "    results_df = (\n", "        pd.merge(station_data_verified, station_data_predicted, on=\"date_time\")\n", "        .drop(columns = [\"sigma\", \"flags\", \"QC\"])\n", "        .rename(columns={\"water_level\": \"Verified (m)\", \"predicted_wl\": \"Predicted (m)\"})\n", "        .reset_index()\n", "    )\n", "    results_df[\"Station ID\"] = station_data.name\n", "    results_df[\"Date\"] = results_df[\"date_time\"].dt.strftime(\"%m/%d/%Y\")\n", "    results_df[\"Time (GMT)\"] = results_df[\"date_time\"].dt.strftime(\"%H:%M\")\n", "    results_df.to_csv(f'{results_path}/{station_data.name}_tide_data.csv')\n", "    frames.append(results_df)\n", "\n", "# DataFrame.append was removed in pandas 2.0, so combine the per-station frames with pd.concat\n", "results = pd.concat(frames)" ] }, { "cell_type": "markdown", "id": "3e02bd41", "metadata": {}, "source": [ "View the full dataframe containing data from all stations. Note that the head and tail of the `results` dataframe display data from different stations." ] }, { "cell_type": "code", "execution_count": 72, "id": "8b7cc7de", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>date_time</th><th>Verified (m)</th><th>Predicted (m)</th><th>Station ID</th><th>Date</th><th>Time (GMT)</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>2021-01-01 00:00:00</td><td>0.223</td><td>0.241</td><td>New London</td><td>01/01/2021</td><td>00:00</td></tr>\n",
"    <tr><th>1</th><td>2021-01-01 01:00:00</td><td>0.410</td><td>0.431</td><td>New London</td><td>01/01/2021</td><td>01:00</td></tr>\n",
"    <tr><th>2</th><td>2021-01-01 02:00:00</td><td>0.558</td><td>0.564</td><td>New London</td><td>01/01/2021</td><td>02:00</td></tr>\n",
"    <tr><th>3</th><td>2021-01-01 03:00:00</td><td>0.647</td><td>0.637</td><td>New London</td><td>01/01/2021</td><td>03:00</td></tr>\n",
"    <tr><th>4</th><td>2021-01-01 04:00:00</td><td>0.676</td><td>0.639</td><td>New London</td><td>01/01/2021</td><td>04:00</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>13171</th><td>2022-06-16 19:00:00</td><td>0.084</td><td>-0.006</td><td>Providence</td><td>06/16/2022</td><td>19:00</td></tr>\n",
"    <tr><th>13172</th><td>2022-06-16 20:00:00</td><td>0.137</td><td>0.004</td><td>Providence</td><td>06/16/2022</td><td>20:00</td></tr>\n",
"    <tr><th>13173</th><td>2022-06-16 21:00:00</td><td>0.303</td><td>0.182</td><td>Providence</td><td>06/16/2022</td><td>21:00</td></tr>\n",
"    <tr><th>13174</th><td>2022-06-16 22:00:00</td><td>0.523</td><td>0.373</td><td>Providence</td><td>06/16/2022</td><td>22:00</td></tr>\n",
"    <tr><th>13175</th><td>2022-06-16 23:00:00</td><td>0.678</td><td>0.588</td><td>Providence</td><td>06/16/2022</td><td>23:00</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>118584 rows × 6 columns</p>\n",
"</div>
" ], "text/plain": [ " date_time Verified (m) Predicted (m) Station ID \\\n", "0 2021-01-01 00:00:00 0.223 0.241 New London \n", "1 2021-01-01 01:00:00 0.410 0.431 New London \n", "2 2021-01-01 02:00:00 0.558 0.564 New London \n", "3 2021-01-01 03:00:00 0.647 0.637 New London \n", "4 2021-01-01 04:00:00 0.676 0.639 New London \n", "... ... ... ... ... \n", "13171 2022-06-16 19:00:00 0.084 -0.006 Providence \n", "13172 2022-06-16 20:00:00 0.137 0.004 Providence \n", "13173 2022-06-16 21:00:00 0.303 0.182 Providence \n", "13174 2022-06-16 22:00:00 0.523 0.373 Providence \n", "13175 2022-06-16 23:00:00 0.678 0.588 Providence \n", "\n", " Date Time (GMT) \n", "0 01/01/2021 00:00 \n", "1 01/01/2021 01:00 \n", "2 01/01/2021 02:00 \n", "3 01/01/2021 03:00 \n", "4 01/01/2021 04:00 \n", "... ... ... \n", "13171 06/16/2022 19:00 \n", "13172 06/16/2022 20:00 \n", "13173 06/16/2022 21:00 \n", "13174 06/16/2022 22:00 \n", "13175 06/16/2022 23:00 \n", "\n", "[118584 rows x 6 columns]" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "results" ] }, { "cell_type": "code", "execution_count": null, "id": "bb810a94", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.7 64-bit ('3.9.7')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" }, "vscode": { "interpreter": { "hash": "b240976dc37aaf1a529ebe7133fc70f8114476d5871f1696eba5bdf4fd2ca117" } } }, "nbformat": 4, "nbformat_minor": 5 }