| Title: | Convenient Access to NYS Open Data API Endpoints |
|---|---|
| Description: | Provides helper functions to access datasets from the NYS Open Data platform <https://data.ny.gov/>. Functions return results as tidy tibbles and support optional filtering, sorting, and row limits via the Socrata API. |
| Authors: | Christian Martinez [aut, cre] (GitHub: martinezc1, ORCID: <https://orcid.org/0009-0005-6026-6454>) |
| Maintainer: | Christian Martinez <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.1 |
| Built: | 2026-06-09 05:51:42 UTC |
| Source: | https://github.com/martinezc1/nysopendata |
Downloads any NYS Open Data dataset given its Socrata JSON endpoint.

    nys_any_dataset(
      json_link,
      limit = 10000,
      timeout_sec = 30,
      clean_names = TRUE,
      coerce_types = TRUE
    )


| Argument | Description |
|---|---|
| json_link | A Socrata dataset JSON endpoint URL (e.g., "https://data.ny.gov/resource/28gk-bu58.json"). |
| limit | Number of rows to retrieve (default = 10,000). |
| timeout_sec | Request timeout in seconds (default = 30). |
| clean_names | Logical; if TRUE, convert column names to snake_case (default = TRUE). |
| coerce_types | Logical; if TRUE, attempt light type coercion (default = TRUE). |

A tibble containing the requested dataset.

    # Examples that hit the live NYS Open Data API are guarded so CRAN checks
    # do not fail when the network is unavailable or slow.
    if (interactive() && curl::has_internet()) {
      endpoint <- "https://data.ny.gov/resource/28gk-bu58.json"
      out <- try(nys_any_dataset(endpoint, limit = 3), silent = TRUE)
      if (!inherits(out, "try-error")) {
        head(out)
      }
    }

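The example endpoint above embeds the dataset identifier "28gk-bu58" in its path. As a minimal sketch, assuming every NYS Open Data endpoint follows that same `resource/<uid>.json` pattern, a small helper can build the `json_link` argument from a UID. The name `endpoint_from_uid()` is hypothetical and is not part of this package.

```r
# Hypothetical helper (not exported by nysopendata): builds a Socrata JSON
# endpoint URL from a dataset UID, assuming the URL pattern shown in the
# example endpoint "https://data.ny.gov/resource/28gk-bu58.json".
endpoint_from_uid <- function(uid) {
  # Socrata dataset identifiers use a "four-by-four" form such as "28gk-bu58".
  if (!grepl("^[a-z0-9]{4}-[a-z0-9]{4}$", uid)) {
    stop("`uid` does not look like a Socrata four-by-four identifier: ", uid)
  }
  sprintf("https://data.ny.gov/resource/%s.json", uid)
}

endpoint <- endpoint_from_uid("28gk-bu58")
print(endpoint)
```

The resulting string could then be passed as `json_link`, for example `nys_any_dataset(endpoint, limit = 3)` inside the same network guard used in the examples.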
Retrieves the current Open NY catalog and returns datasets available for use with 'nys_pull_dataset()'.

    nys_list_datasets()

Keys are generated from dataset titles using 'janitor::make_clean_names()'.
A tibble of available datasets, including generated 'key', dataset 'uid', and dataset 'title'.

    if (interactive() && curl::has_internet()) {
      nys_list_datasets()
    }

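To illustrate how keys relate to titles, the sketch below runs 'janitor::make_clean_names()' on a few made-up titles. The titles are hypothetical and are not taken from the live catalog; the package may also apply steps beyond this single call, so treat the result as an approximation of the 'key' column rather than its exact contents.

```r
# Hypothetical dataset titles, used only to show the title-to-key transformation.
titles <- c(
  "Lottery Mega Millions Winning Numbers",
  "Child Care Regulated Programs"
)

# Guarded so the sketch still runs when janitor is not installed.
if (requireNamespace("janitor", quietly = TRUE)) {
  keys <- janitor::make_clean_names(titles)
  print(data.frame(title = titles, key = keys))
}
```

Because a key is recomputed from whatever title the catalog currently reports, a renamed dataset would produce a different key, while its UID would stay the same.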
Uses a dataset 'key' or 'uid' from 'nys_list_datasets()' to pull data from NYS Open Data.

    nys_pull_dataset(
      dataset,
      limit = 10000,
      filters = list(),
      date = NULL,
      from = NULL,
      to = NULL,
      date_field = NULL,
      where = NULL,
      order = NULL,
      timeout_sec = 30,
      clean_names = TRUE,
      coerce_types = TRUE
    )


| Argument | Description |
|---|---|
| dataset | A dataset key or UID from 'nys_list_datasets()'. |
| limit | Number of rows to retrieve (default = 10,000). |
| filters | Optional named list of filters. Vector values are translated to IN() conditions. |
| date | Optional single date (matches all times that day) using 'date_field'. |
| from | Optional start date (inclusive) using 'date_field'. |
| to | Optional end date (exclusive) using 'date_field'. |
| date_field | Optional date/datetime column to use with 'date', 'from', or 'to'. Must be supplied whenever 'date', 'from', or 'to' is used. |
| where | Optional raw SoQL WHERE clause. If 'date', 'from', or 'to' is provided, the resulting conditions are AND-ed with this clause. |
| order | Optional SoQL ORDER BY clause. |
| timeout_sec | Request timeout in seconds (default = 30). |
| clean_names | Logical; if TRUE, convert column names to snake_case (default = TRUE). |
| coerce_types | Logical; if TRUE, attempt light type coercion (default = TRUE). |

Dataset keys are generated from dataset titles using 'janitor::make_clean_names()'. Because keys are derived from live catalog metadata, dataset UIDs are the more stable option.
A tibble containing the rows returned for the requested dataset.

    if (interactive() && curl::has_internet()) {
      # Pull by key
      nys_pull_dataset("311_service_requests", limit = 3)

      # Pull by UID
      nys_pull_dataset("28gk-bu58", limit = 3)

      # Filters
      nys_pull_dataset("28gk-bu58", limit = 3, filters = list(award_name = "MBA"))
    }

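The argument descriptions for 'filters', 'date', 'from', 'to', and 'where' imply a particular way of combining conditions. The sketch below is one plausible reading of those rules in base R, not the package's actual implementation: `soql_where_sketch()` is a hypothetical helper, and the `award_date` column is an assumed field name used only for illustration.

```r
# Hypothetical helper (not part of nysopendata): assembles a SoQL WHERE string
# following the documented semantics of nys_pull_dataset()'s arguments.
soql_where_sketch <- function(filters = list(), date = NULL, from = NULL,
                              to = NULL, date_field = NULL, where = NULL) {
  # Quote a value as a SoQL string literal, doubling embedded single quotes.
  quote_value <- function(x) {
    paste0("'", gsub("'", "''", as.character(x), fixed = TRUE), "'")
  }
  clauses <- character()

  # Scalar filters become equality tests; vectors become IN() conditions.
  for (field in names(filters)) {
    values <- quote_value(filters[[field]])
    clause <- if (length(values) == 1) {
      paste0(field, " = ", values)
    } else {
      paste0(field, " IN(", paste(values, collapse = ", "), ")")
    }
    clauses <- c(clauses, clause)
  }

  if (!is.null(date) || !is.null(from) || !is.null(to)) {
    if (is.null(date_field)) {
      stop("`date_field` must be supplied with `date`, `from`, or `to`.")
    }
    # A single date is treated here as the half-open range [date, date + 1 day),
    # which overrides any `from`/`to` values in this sketch.
    if (!is.null(date)) {
      from <- as.Date(date)
      to <- as.Date(date) + 1
    }
    if (!is.null(from)) {  # inclusive lower bound
      clauses <- c(clauses, paste0(date_field, " >= ", quote_value(format(as.Date(from)))))
    }
    if (!is.null(to)) {    # exclusive upper bound
      clauses <- c(clauses, paste0(date_field, " < ", quote_value(format(as.Date(to)))))
    }
  }

  # A raw WHERE clause is AND-ed with everything else.
  if (!is.null(where)) clauses <- c(clauses, paste0("(", where, ")"))
  paste(clauses, collapse = " AND ")
}

print(soql_where_sketch(
  filters = list(award_name = c("MBA", "BA")),
  from = "2024-01-01", to = "2024-02-01",
  date_field = "award_date"  # assumed column name
))
```

Treating 'to' as exclusive means consecutive ranges such as January and February can be requested without double-counting rows that fall exactly on the boundary.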