---
title: "Look up information"
author: "Martin Westgate & Dax Kellie"
date: '2024-11-19'
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Look up information}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

galah supports two functions to look up information: `show_all()` and `search_all()`. The first argument to both functions is the type of information that you wish to look up. For example, to see which fields are available to filter a query by, use:

``` r
show_all(fields)
```

```
## # A tibble: 695 × 3
##    id                  description               type
##    <chr>               <chr>                     <chr>
##  1 _nest_parent_       <NA>                      fields
##  2 _nest_path_         <NA>                      fields
##  3 _root_              <NA>                      fields
##  4 abcdTypeStatus      <NA>                      fields
##  5 acceptedNameUsage   Accepted name             fields
##  6 acceptedNameUsageID Accepted name             fields
##  7 accessRights        Access rights             fields
##  8 annotationsDoi      <NA>                      fields
##  9 annotationsUid      Referenced by publication fields
## 10 assertionUserId     Assertions by user        fields
## # ℹ 685 more rows
```

And to search for a specific field:

``` r
search_all(fields, "australian states")
```

```
## # A tibble: 2 × 3
##   id     description                            type
##   <chr>  <chr>                                  <chr>
## 1 cl2013 ASGS Australian States and Territories fields
## 2 cl22   Australian States and Territories      fields
```

Here is a list of information types that can be used with `show_all()` and `search_all()`:

<table class="table lightable-paper" style='width: auto !important; margin-left: auto; margin-right: auto; font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'>
 <thead>
  <tr>
   <th style="text-align:left;"> Information type </th>
   <th style="text-align:left;"> Description </th>
   <th style="text-align:left;"> Sub-functions </th>
  </tr>
 </thead>
<tbody>
  <tr grouplength="3"><td colspan="3" style="background-color: #fdebf2; color: #343a40;"><strong>Configuration</strong></td></tr>
<tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> atlases </td>
   <td style="text-align:left;"> Show what living atlases are available </td>
   <td style="text-align:left;font-family: monospace;"> show_all_atlases(), search_atlases() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> apis </td>
   <td style="text-align:left;"> Show what APIs & functions are available for each atlas </td>
   <td style="text-align:left;font-family: monospace;"> show_all_apis(), search_apis() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> reasons </td>
   <td style="text-align:left;"> Show what values are acceptable as 'download reasons' for a specified atlas </td>
   <td style="text-align:left;font-family: monospace;"> show_all_reasons(), search_reasons() </td>
  </tr>
  <tr grouplength="3"><td colspan="3" style="background-color: #fdebf2; color: #343a40;"><strong>Taxonomy</strong></td></tr>
<tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> taxa </td>
   <td style="text-align:left;"> Search for one or more taxonomic names </td>
   <td style="text-align:left;font-family: monospace;"> search_taxa() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> identifiers </td>
   <td style="text-align:left;"> Take a universal identifier and return taxonomic information </td>
   <td style="text-align:left;font-family: monospace;"> search_identifiers() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> ranks </td>
   <td style="text-align:left;"> Show valid taxonomic ranks (e.g. Kingdom, Class, Order, etc.) </td>
   <td style="text-align:left;font-family: monospace;"> show_all_ranks(), search_ranks() </td>
  </tr>
  <tr grouplength="2"><td colspan="3" style="background-color: #fdebf2; color: #343a40;"><strong>Filters</strong></td></tr>
<tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> fields </td>
   <td style="text-align:left;"> Show fields that are stored in an atlas </td>
   <td style="text-align:left;font-family: monospace;"> show_all_fields(), search_fields() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> assertions </td>
   <td style="text-align:left;"> Show results of data quality checks run by each atlas </td>
   <td style="text-align:left;font-family: monospace;"> show_all_assertions(), search_assertions() </td>
  </tr>
  <tr grouplength="2"><td colspan="3" style="background-color: #fdebf2; color: #343a40;"><strong>Group filters</strong></td></tr>
<tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> profiles </td>
   <td style="text-align:left;"> Show what data quality profiles are available </td>
   <td style="text-align:left;font-family: monospace;"> show_all_profiles(), search_profiles() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> lists </td>
   <td style="text-align:left;"> Show what species lists are available </td>
   <td style="text-align:left;font-family: monospace;"> show_all_lists(), search_lists() </td>
  </tr>
  <tr grouplength="3"><td colspan="3" style="background-color: #fdebf2; color: #343a40;"><strong>Data providers</strong></td></tr>
<tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> providers </td>
   <td style="text-align:left;"> Show which institutions have provided data </td>
   <td style="text-align:left;font-family: monospace;"> show_all_providers(), search_providers() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> collections </td>
   <td style="text-align:left;"> Show the specific collections within those institutions </td>
   <td style="text-align:left;font-family: monospace;"> show_all_collections(), search_collections() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> datasets </td>
   <td style="text-align:left;"> Show all the data groupings within those collections </td>
   <td style="text-align:left;font-family: monospace;"> show_all_datasets(), search_datasets() </td>
  </tr>
</tbody>
</table>

# `show_all_` subfunctions

While `show_all()` is useful in a variety of cases, you can still call the underlying subfunctions if you prefer. Functions with the prefix `show_all_` do what the name suggests: they show all the possible values of the category specified.

``` r
show_all_atlases()
```

```
## # A tibble: 10 × 4
##    region         institution                                                             acronym url
##    <chr>          <chr>                                                                   <chr>   <chr>
##  1 Australia      Atlas of Living Australia                                               ALA     https://www.ala.org.au
##  2 Austria        Biodiversitäts-Atlas Österreich                                         BAO     https://biodiversityat…
##  3 Brazil         Sistemas de Informações sobre a Biodiversidade Brasileira               SiBBr   https://sibbr.gov.br
##  4 France         Portail français d'accès aux données d'observation sur les espèces      OpenObs https://openobs.mnhn.fr
##  5 Global         Global Biodiversity Information Facility                                GBIF    https://gbif.org
##  6 Guatemala      Sistema Nacional de Información sobre Diversidad Biológica de Guatemala SNIBgt  https://snib.conap.gob…
##  7 Portugal       GBIF Portugal                                                           GBIF.pt https://www.gbif.pt
##  8 Spain          GBIF Spain                                                              GBIF.es https://gbif.es
##  9 Sweden         Swedish Biodiversity Data Infrastructure                                SBDI    https://biodiversityda…
## 10 United Kingdom National Biodiversity Network                                           NBN     https://nbn.org.uk
```

``` r
show_all_reasons()
```

```
## # A tibble: 13 × 2
##       id name
##    <int> <chr>
##  1     1 biosecurity management/planning
##  2    11 citizen science
##  3     5 collection management
##  4     0 conservation management/planning
##  5     7 ecological research
##  6     3 education
##  7     2 environmental assessment
##  8    12 restoration/remediation
##  9     4 scientific research
## 10     8 systematic research/taxonomy
## 11    13 species modelling
## 12     6 other
## 13    10 testing
```

# `search_` subfunctions

You can also call subfunctions that use the `search_` prefix to look up information. `search_` subfunctions differ from `show_all_` subfunctions in that they require a query to work, and they are useful for finding detailed information that can't be summarised across the whole atlas.

`search_taxa()` is an especially useful function in galah. It lets you search for a single taxon or multiple taxa by name.

``` r
search_taxa("reptilia")
```

```
## # A tibble: 1 × 9
##   search_term scientific_name taxon_concept_id                               rank  match_type kingdom phylum class issues
##   <chr>       <chr>           <chr>                                          <chr> <chr>      <chr>   <chr>  <chr> <chr>
## 1 reptilia    REPTILIA        https://biodiversity.org.au/afd/taxa/682e1228… class exactMatch Animal… Chord… Rept… noIss…
```

``` r
search_taxa("reptilia", "aves", "mammalia", "pisces")
```

```
## # A tibble: 1 × 9
##   search_term scientific_name taxon_concept_id                               rank  match_type kingdom phylum class issues
##   <chr>       <chr>           <chr>                                          <chr> <chr>      <chr>   <chr>  <chr> <chr>
## 1 reptilia    REPTILIA        https://biodiversity.org.au/afd/taxa/682e1228… class exactMatch Animal… Chord… Rept… noIss…
```

`search_identifiers()` is the partner function to `search_taxa()`. If we already know a taxonomic identifier, we can look up which taxon the identifier belongs to.
``` r
search_identifiers("urn:lsid:biodiversity.org.au:afd.taxon:682e1228-5b3c-45ff-833b-550efd40c399")
```

```
## # A tibble: 1 × 15
##   search_term     success scientific_name taxon_concept_id rank  rank_id   lft   rgt match_type kingdom kingdom_id phylum
##   <chr>           <lgl>   <chr>           <chr>            <chr>   <int> <int> <int> <chr>      <chr>   <chr>      <chr>
## 1 urn:lsid:biodi… TRUE    REPTILIA        https://biodive… class    3000 33626 36658 taxonIdMa… Animal… https://b… Chord…
## # ℹ 3 more variables: phylum_id <chr>, class <chr>, class_id <chr>
```

# `show_values()` & `search_values()`

Once a desired field is found, you can use `show_values()` to understand the information contained within that field. For example, we can show the values contained in the field `basisOfRecord`.

``` r
search_all(fields, "basisOfRecord") |>
  show_values()
```

```
## ! Search returned 2 matched fields.
## • Showing values for 'basisOfRecord'.
```

```
## # A tibble: 9 × 1
##   basisOfRecord
##   <chr>
## 1 HUMAN_OBSERVATION
## 2 PRESERVED_SPECIMEN
## 3 MACHINE_OBSERVATION
## 4 OCCURRENCE
## 5 OBSERVATION
## 6 MATERIAL_SAMPLE
## 7 LIVING_SPECIMEN
## 8 MATERIAL_CITATION
## 9 FOSSIL_SPECIMEN
```

Use this information to pass meaningful queries to `galah_filter()`.

``` r
galah_call() |>
  galah_filter(basisOfRecord == "LIVING_SPECIMEN") |>
  atlas_counts()
```

```
## # A tibble: 1 × 1
##    count
##    <int>
## 1 126135
```

This works for other types of query, such as data profiles:

``` r
search_all(profiles, "ALA") |>
  show_values() |>
  head()
```

```
## • Showing values for 'ALA'.
```

```
## # A tibble: 6 × 5
##      id enabled description                                                                            filter displayOrder
##   <int> <lgl>   <chr>                                                                                  <chr>         <int>
## 1    94 TRUE    "Exclude all records where spatial validity is \"false\""                              "-spa…            1
## 2    96 TRUE    "Exclude all records with an assertion that the scientific name provided does not ma…" "-ass…            1
## 3    97 TRUE    "Exclude all records with an assertion that the scientific name provided is not stru…" "-ass…            2
## 4    98 TRUE    "Exclude all records with an assertion that the name and classification supplied can…" "-ass…            3
## 5    99 TRUE    "Exclude all records with an assertion that kingdom provided doesn't match a known k…" "-ass…            4
## 6   100 TRUE    "Exclude all records with an assertion that the scientific name provided in the reco…" "-ass…            5
```
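
The heading of this section also names `search_values()`, which narrows a field's values to those matching a query instead of listing them all. The sketch below assumes that `search_values()` accepts the result of a field search followed by a search string, mirroring the `show_values()` pipeline used for `basisOfRecord`. Check `?search_values` for the exact arguments in your installed version of galah. It needs an internet connection, and no output is shown because results depend on the atlas you query.

``` r
library(galah)

# Assumption: search_values() takes a field search result as its first
# argument and a query string as its second.
# Keep only the basisOfRecord values that match "specimen".
specimen_values <- search_all(fields, "basisOfRecord") |>
  search_values("specimen")

specimen_values
```

Any value returned this way can be passed to `galah_filter()` in the same manner as `LIVING_SPECIMEN` in the earlier example.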