| Title: | Convenient Access to Los Angeles Open Data API Endpoints |
|---|---|
| Description: | Provides simple, reproducible access to datasets from the Los Angeles Open Data portal <https://data.lacity.org/>. Functions return results as tidy tibbles and support optional filtering, sorting, and row limits via the Socrata API. |
| Authors: | Christian Martinez [aut, cre] (GitHub: martinezc1, ORCID: <https://orcid.org/0009-0005-6026-6454>) |
| Maintainer: | Christian Martinez <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-17 07:58:22 UTC |
| Source: | https://github.com/martinezc1/laopendata |
Downloads any Los Angeles Open Data dataset given its Socrata JSON endpoint.

    la_any_dataset(
      json_link,
      limit = 10000,
      timeout_sec = 30,
      clean_names = TRUE,
      coerce_types = TRUE
    )


| Argument | Description |
|---|---|
| json_link | A Socrata dataset JSON endpoint URL (e.g., "https://data.lacity.org/resource/6rrh-rzua.json"). |
| limit | Number of rows to retrieve (default = 10,000). |
| timeout_sec | Request timeout in seconds (default = 30). |
| clean_names | Logical; if TRUE, convert column names to snake_case (default = TRUE). |
| coerce_types | Logical; if TRUE, attempt light type coercion (default = TRUE). |

A tibble containing the requested dataset.

    # Examples that hit the live Los Angeles Open Data API are guarded so CRAN checks
    # do not fail when the network is unavailable or slow.
    if (interactive() && curl::has_internet()) {
      endpoint <- "https://data.lacity.org/resource/6rrh-rzua.json"
      out <- try(la_any_dataset(endpoint, limit = 3), silent = TRUE)
      if (!inherits(out, "try-error")) {
        head(out)
      }
    }

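To make the 'limit' argument concrete, the sketch below builds the kind of raw request URL that a row limit corresponds to on a Socrata endpoint, using the SoQL '$limit' query parameter. The helper 'build_limit_url()' is hypothetical and is not part of laopendata; how the package actually assembles its requests is not shown in this manual, so treat this as an illustration under that assumption.

```r
# Hypothetical helper (not part of laopendata): append a SoQL `$limit`
# parameter to a Socrata JSON endpoint.
build_limit_url <- function(json_link, limit = 10000) {
  stopifnot(is.character(json_link), length(json_link) == 1)
  stopifnot(is.numeric(limit), length(limit) == 1,
            limit >= 0, limit == floor(limit))
  # format() avoids scientific notation such as "1e+05" in the query string
  limit_chr <- format(limit, scientific = FALSE, trim = TRUE)
  paste0(json_link, "?$limit=", limit_chr)
}

url <- build_limit_url("https://data.lacity.org/resource/6rrh-rzua.json", limit = 3)
print(url)
```

Formatting the limit explicitly matters for large values: pasting 100000 directly into a string would produce "1e+05", which is not a valid row count in a query string.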
Retrieves the current Los Angeles Open Data catalog and returns datasets available for use with 'la_pull_dataset()'.

    la_list_datasets()

Keys are generated from dataset names using 'janitor::make_clean_names()'.
A tibble of available datasets, including generated 'key', dataset 'id', and dataset 'name'.

    if (interactive() && curl::has_internet()) {
      la_list_datasets()
    }

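As a rough illustration of how a 'key' relates to a dataset name, the sketch below runs 'janitor::make_clean_names()' on a single title. It assumes the janitor package is installed. The title string is an illustrative guess chosen to match the key used in the 'la_pull_dataset()' examples; it is not taken from the live catalog.

```r
# Assumes the janitor package is installed.
# The title below is illustrative, not read from the live catalog.
dataset_name <- "Current Employee Names, Salaries, and Position Titles"

# Punctuation and spaces become underscores and letters are lower-cased.
key <- janitor::make_clean_names(dataset_name)
print(key)
```

Because the key is computed from the title, renaming a dataset in the catalog changes its key, while the dataset 'id' stays the same.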
Uses a dataset 'key' or 'id' from 'la_list_datasets()' to pull data from Los Angeles Open Data.

    la_pull_dataset(
      dataset,
      limit = 10000,
      filters = list(),
      date = NULL,
      from = NULL,
      to = NULL,
      date_field = NULL,
      where = NULL,
      order = NULL,
      timeout_sec = 30,
      clean_names = TRUE,
      coerce_types = TRUE
    )


| Argument | Description |
|---|---|
| dataset | A dataset key or ID from 'la_list_datasets()'. |
| limit | Number of rows to retrieve (default = 10,000). |
| filters | Optional named list of filters. A vector value is translated to a SoQL IN() condition. |
| date | Optional single date (matches all times that day) using 'date_field'. |
| from | Optional start date (inclusive) using 'date_field'. |
| to | Optional end date (exclusive) using 'date_field'. |
| date_field | Optional date/datetime column to use with 'date', 'from', or 'to'. Must be supplied when 'date', 'from', or 'to' is used. |
| where | Optional raw SoQL WHERE clause. If 'date', 'from', or 'to' are provided, their conditions are AND-ed with this. |
| order | Optional SoQL ORDER BY clause. |
| timeout_sec | Request timeout in seconds (default = 30). |
| clean_names | Logical; if TRUE, convert column names to snake_case (default = TRUE). |
| coerce_types | Logical; if TRUE, attempt light type coercion (default = TRUE). |

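The arguments above describe how 'filters', 'date', 'from', 'to', and 'where' combine into a single SoQL condition. The sketch below shows one way that combination could work. 'soql_quote()' and 'build_where()' are hypothetical helpers written for illustration, not laopendata internals, and the column names 'created_date' and 'annual_salary' and the filter value "SALARY" are made up. The choice to let 'date' override 'from' and 'to' is also an assumption.

```r
# Hypothetical helpers; laopendata's internal implementation may differ.
soql_quote <- function(x) {
  # SoQL string literals use single quotes; embedded quotes are doubled
  paste0("'", gsub("'", "''", as.character(x), fixed = TRUE), "'")
}

build_where <- function(filters = list(), date = NULL, from = NULL, to = NULL,
                        date_field = NULL, where = NULL) {
  uses_dates <- !is.null(date) || !is.null(from) || !is.null(to)
  if (uses_dates && is.null(date_field)) {
    stop("'date_field' must be supplied when 'date', 'from', or 'to' is used.")
  }
  clauses <- character(0)

  # Scalars become equality tests; vectors become IN() lists
  for (field in names(filters)) {
    values <- filters[[field]]
    clause <- if (length(values) == 1) {
      paste0(field, " = ", soql_quote(values))
    } else {
      paste0(field, " IN(", paste(soql_quote(values), collapse = ", "), ")")
    }
    clauses <- c(clauses, clause)
  }

  # Assumption: a single `date` is treated as [date, date + 1 day)
  # and takes precedence over `from` / `to`
  if (!is.null(date)) {
    from <- as.Date(date)
    to <- as.Date(date) + 1
  }
  if (!is.null(from)) {  # inclusive lower bound
    clauses <- c(clauses, paste0(date_field, " >= ", soql_quote(format(as.Date(from)))))
  }
  if (!is.null(to)) {    # exclusive upper bound
    clauses <- c(clauses, paste0(date_field, " < ", soql_quote(format(as.Date(to)))))
  }

  # A raw WHERE clause is parenthesised and AND-ed with everything else
  if (!is.null(where)) clauses <- c(clauses, paste0("(", where, ")"))

  if (length(clauses) == 0) return(NULL)
  paste(clauses, collapse = " AND ")
}

clause <- build_where(
  filters = list(salary_or_hourly = c("HOURLY", "SALARY")),
  date = "2024-01-15",
  date_field = "created_date",       # hypothetical column
  where = "annual_salary > 50000"    # hypothetical column
)
print(clause)
```

Treating 'to' as exclusive means consecutive ranges such as January and February can be requested without double-counting rows that fall exactly on the boundary.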
Dataset keys are generated from dataset titles using 'janitor::make_clean_names()'. Because keys are derived from live catalog metadata, dataset IDs are the more stable option.
A tibble.

    if (interactive() && curl::has_internet()) {
      # Pull by key
      la_pull_dataset("current_employee_names_salaries_and_position_titles", limit = 3)

      # Pull by ID
      la_pull_dataset("xzkq-xp2w", limit = 3)

      # Filters
      la_pull_dataset("xzkq-xp2w", limit = 3, filters = list(salary_or_hourly = "HOURLY"))
    }