| Title: | Convenient Access to Chicago Open Data API Endpoints |
|---|---|
| Description: | Provides simple, reproducible access to datasets from the Chicago Open Data portal <https://data.cityofchicago.org/>. Functions return results as tidy tibbles and support optional filtering, sorting, and row limits via the Socrata API. |
| Authors: | Christian Martinez [aut, cre] (GitHub: martinezc1, ORCID: <https://orcid.org/0009-0005-6026-6454>) |
| Maintainer: | Christian Martinez <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-17 07:59:56 UTC |
| Source: | https://github.com/martinezc1/chiopendata |
Downloads any CHI Open Data dataset given its Socrata JSON endpoint.
chi_any_dataset(
  json_link,
  limit = 10000,
  timeout_sec = 30,
  clean_names = TRUE,
  coerce_types = TRUE
)
| Argument | Description |
|---|---|
| json_link | A Socrata dataset JSON endpoint URL (e.g., "https://data.cityofchicago.org/resource/fuz6-n5nj.json"). |
| limit | Number of rows to retrieve (default = 10,000). |
| timeout_sec | Request timeout in seconds (default = 30). |
| clean_names | Logical; if TRUE, convert column names to snake_case (default = TRUE). |
| coerce_types | Logical; if TRUE, attempt light type coercion (default = TRUE). |
A tibble containing the requested dataset.
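As a rough illustration of what a row limit means at the Socrata level, the sketch below appends the standard SoQL `$limit` parameter to an endpoint URL. The helper `build_limit_url()` is hypothetical and is not part of this package; how `chi_any_dataset()` actually builds its request is not documented here, so treat this only as an assumption about the general shape of the query.

```r
# Hypothetical helper (not part of the package): append a SoQL $limit
# parameter to a Socrata JSON endpoint URL.
build_limit_url <- function(json_link, limit = 10000) {
  stopifnot(is.character(json_link), length(json_link) == 1)
  stopifnot(is.numeric(limit), length(limit) == 1, limit > 0, limit == round(limit))
  # format() with scientific = FALSE avoids values such as 1e+05 in the URL
  paste0(json_link, "?$limit=", format(limit, scientific = FALSE, trim = TRUE))
}

url <- build_limit_url(
  "https://data.cityofchicago.org/resource/fuz6-n5nj.json",
  limit = 3
)
print(url)
```

Pasting the printed URL into a browser should return at most three records as JSON, which can be a quick way to check an endpoint before calling the package function.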
# Examples that hit the live chi Open Data API are guarded so CRAN checks
# do not fail when the network is unavailable or slow.
if (interactive() && curl::has_internet()) {
  endpoint <- "https://data.cityofchicago.org/resource/fuz6-n5nj.json"
  out <- try(chi_any_dataset(endpoint, limit = 3), silent = TRUE)
  if (!inherits(out, "try-error")) {
    head(out)
  }
}
Retrieves the current Chicago Open Data catalog and returns the datasets available for use with 'chi_pull_dataset()'.
chi_list_datasets()
Keys are generated from dataset names using 'janitor::make_clean_names()'.
A tibble of available datasets, including generated 'key', dataset 'uid', and dataset 'name'.
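To make the key generation concrete, the sketch below runs 'janitor::make_clean_names()' on a single dataset title. The title string is an assumption about how the dataset is named in the live catalog, and the package may apply further steps (for example, de-duplication across the whole catalog) that are not shown here.

```r
# Assumes the janitor package is installed.
library(janitor)

# Assumed catalog title; the live catalog may word it differently.
name <- "Current Employee Names, Salaries, and Position Titles"

# make_clean_names() lower-cases the string and replaces spaces and
# punctuation so the result is a snake_case identifier.
key <- make_clean_names(name)
print(key)
```

Because the key depends on the title text, a renamed dataset would produce a different key, while its 'uid' would stay the same.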
if (interactive() && curl::has_internet()) {
  chi_list_datasets()
}
Uses a dataset 'key' or 'uid' from 'chi_list_datasets()' to pull data from the Chicago Open Data portal.
chi_pull_dataset(
  dataset,
  limit = 10000,
  filters = list(),
  date = NULL,
  from = NULL,
  to = NULL,
  date_field = NULL,
  where = NULL,
  order = NULL,
  timeout_sec = 30,
  clean_names = TRUE,
  coerce_types = TRUE
)
| Argument | Description |
|---|---|
| dataset | A dataset key or ID from 'chi_list_datasets()'. |
| limit | Number of rows to retrieve (default = 10,000). |
| filters | Optional named list of filters. Supports vectors (translated to IN()). |
| date | Optional single date (matches all times that day) using 'date_field'. |
| from | Optional start date (inclusive) using 'date_field'. |
| to | Optional end date (exclusive) using 'date_field'. |
| date_field | Optional date/datetime column to use with 'date', 'from', or 'to'. Must be supplied when 'date', 'from', or 'to' are used. |
| where | Optional raw SoQL WHERE clause. If 'date', 'from', or 'to' are provided, their conditions are AND-ed with this. |
| order | Optional SoQL ORDER BY clause. |
| timeout_sec | Request timeout in seconds (default = 30). |
| clean_names | Logical; if TRUE, convert column names to snake_case (default = TRUE). |
| coerce_types | Logical; if TRUE, attempt light type coercion (default = TRUE). |
Dataset keys are generated from dataset titles using 'janitor::make_clean_names()'. Because keys are derived from live catalog metadata, dataset IDs are the more stable option.
A tibble.
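The way 'filters', 'date', 'from', 'to', and 'where' combine can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's internal code: `build_where()` and `sq()` are hypothetical helpers, `created_date` and `amount` are made-up column names, and doubling single quotes is assumed to be an acceptable way to escape SoQL string literals.

```r
# Hypothetical helper: wrap values in single quotes, doubling embedded quotes.
sq <- function(x) paste0("'", gsub("'", "''", as.character(x), fixed = TRUE), "'")

# Hypothetical helper (not the package's internals): combine filters and a
# half-open date range [from, to) into a single SoQL WHERE string.
build_where <- function(filters = list(), date = NULL, from = NULL, to = NULL,
                        date_field = NULL, where = NULL) {
  # A single date is treated as the range [date, date + 1 day).
  if (!is.null(date)) {
    from <- as.Date(date)
    to <- from + 1
  }
  if ((!is.null(from) || !is.null(to)) && is.null(date_field)) {
    stop("'date_field' must be supplied when 'date', 'from', or 'to' is used")
  }
  clauses <- character(0)
  for (col in names(filters)) {
    vals <- filters[[col]]
    clause <- if (length(vals) > 1) {
      # Vectors become an IN (...) list.
      paste0(col, " IN (", paste(sq(vals), collapse = ", "), ")")
    } else {
      paste0(col, " = ", sq(vals))
    }
    clauses <- c(clauses, clause)
  }
  if (!is.null(from)) clauses <- c(clauses, paste0(date_field, " >= ", sq(from)))
  if (!is.null(to)) clauses <- c(clauses, paste0(date_field, " < ", sq(to)))
  # A raw WHERE clause is parenthesised and AND-ed with everything else.
  if (!is.null(where)) clauses <- c(clauses, paste0("(", where, ")"))
  paste(clauses, collapse = " AND ")
}

w1 <- build_where(filters = list(salary_or_hourly = c("HOURLY", "SALARY")))
w2 <- build_where(date = "2024-03-05", date_field = "created_date",
                  where = "amount > 10")
print(w1)
print(w2)
```

Using an inclusive lower bound and an exclusive upper bound means consecutive date ranges never overlap, which is why a single 'date' can be expressed as one full day without worrying about the time component.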
if (interactive() && curl::has_internet()) {
  # Pull by key
  chi_pull_dataset("current_employee_names_salaries_and_position_titles", limit = 3)
  # Pull by ID
  chi_pull_dataset("xzkq-xp2w", limit = 3)
  # Filters
  chi_pull_dataset("xzkq-xp2w", limit = 3, filters = list(salary_or_hourly = "HOURLY"))
}