---
title: "Look up information"
author: "Martin Westgate & Dax Kellie"
date: '2024-11-19'
output:
  rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Look up information}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---


galah supports two functions to look up information: `show_all()` and 
`search_all()`. The first argument to both functions is a type of information that you 
wish to look up; for example to see what fields are available to filter a query by, use:


``` r
show_all(fields)
```

```
## # A tibble: 695 × 3
##    id                  description               type  
##    <chr>               <chr>                     <chr> 
##  1 _nest_parent_       <NA>                      fields
##  2 _nest_path_         <NA>                      fields
##  3 _root_              <NA>                      fields
##  4 abcdTypeStatus      <NA>                      fields
##  5 acceptedNameUsage   Accepted name             fields
##  6 acceptedNameUsageID Accepted name             fields
##  7 accessRights        Access rights             fields
##  8 annotationsDoi      <NA>                      fields
##  9 annotationsUid      Referenced by publication fields
## 10 assertionUserId     Assertions by user        fields
## # ℹ 685 more rows
```

And to search for a specific field:


``` r
search_all(fields, "australian states")
```

```
## # A tibble: 2 × 3
##   id     description                            type  
##   <chr>  <chr>                                  <chr> 
## 1 cl2013 ASGS Australian States and Territories fields
## 2 cl22   Australian States and Territories      fields
```

Here is a list of information types that can be used with `show_all()` and 
`search_all()`:

<table class="table lightable-paper" style='width: auto !important; margin-left: auto; margin-right: auto; font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'>
 <thead>
  <tr>
   <th style="text-align:left;"> Information type </th>
   <th style="text-align:left;"> Description </th>
   <th style="text-align:left;"> Sub-functions </th>
  </tr>
 </thead>
<tbody>
  <tr grouplength="3"><td colspan="3" style="background-color: #fdebf2; color: #343a40;"><strong>Configuration</strong></td></tr>
<tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> atlases </td>
   <td style="text-align:left;"> Show what living atlases are available </td>
   <td style="text-align:left;font-family: monospace;"> show_all_atlases(), search_atlases() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> apis </td>
   <td style="text-align:left;"> Show what APIs &amp; functions are available for each atlas </td>
   <td style="text-align:left;font-family: monospace;"> show_all_apis(), search_apis() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> reasons </td>
   <td style="text-align:left;"> Show what values are acceptable as 'download reasons' for a specified atlas </td>
   <td style="text-align:left;font-family: monospace;"> show_all_reasons(), search_reasons() </td>
  </tr>
  <tr grouplength="3"><td colspan="3" style="background-color: #fdebf2; color: #343a40;"><strong>Taxonomy</strong></td></tr>
<tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> taxa </td>
   <td style="text-align:left;"> Search for one or more taxonomic names </td>
   <td style="text-align:left;font-family: monospace;"> search_taxa() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> identifiers </td>
   <td style="text-align:left;"> Take a universal identifier and return taxonomic information </td>
   <td style="text-align:left;font-family: monospace;"> search_identifiers() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> ranks </td>
   <td style="text-align:left;"> Show valid taxonomic ranks (e.g. Kingdom, Class, Order, etc.) </td>
   <td style="text-align:left;font-family: monospace;"> show_all_ranks(), search_ranks()) </td>
  </tr>
  <tr grouplength="2"><td colspan="3" style="background-color: #fdebf2; color: #343a40;"><strong>Filters</strong></td></tr>
<tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> fields </td>
   <td style="text-align:left;"> Show fields that are stored in an atlas </td>
   <td style="text-align:left;font-family: monospace;"> show_all_fields(), search_fields() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> assertions </td>
   <td style="text-align:left;"> Show results of data quality checks run by each atlas </td>
   <td style="text-align:left;font-family: monospace;"> show_all_assertions(), search_assertions() </td>
  </tr>
  <tr grouplength="2"><td colspan="3" style="background-color: #fdebf2; color: #343a40;"><strong>Group filters</strong></td></tr>
<tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> profiles </td>
   <td style="text-align:left;"> Show what data quality profiles are available </td>
   <td style="text-align:left;font-family: monospace;"> show_all_profiles(), search_profiles() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> lists </td>
   <td style="text-align:left;"> Show what species lists are available </td>
   <td style="text-align:left;font-family: monospace;"> show_lists(), search_lists() </td>
  </tr>
  <tr grouplength="3"><td colspan="3" style="background-color: #fdebf2; color: #343a40;"><strong>Data providers</strong></td></tr>
<tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> providers </td>
   <td style="text-align:left;"> Show which institutions have provided data </td>
   <td style="text-align:left;font-family: monospace;"> show_all_providers(), search_providers() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> collections </td>
   <td style="text-align:left;"> Show the specific collections within those institutions </td>
   <td style="text-align:left;font-family: monospace;"> show_all_collections(), search_collections() </td>
  </tr>
  <tr>
   <td style="text-align:left;font-family: monospace;padding-left: 2em;" indentlevel="1"> datasets </td>
   <td style="text-align:left;"> Shows all the data groupings within those collections </td>
   <td style="text-align:left;font-family: monospace;"> show_all_datasets(), search_datasets() </td>
  </tr>
</tbody>
</table>


# `show_all_` subfunctions
While `show_all` is useful for a variety of cases, you can 
still call the underlying subfunctions if you prefer. Functions with 
the prefix `show_all_` do exactly that; they show all the 
possible values of the category specified. 




``` r
show_all_atlases()
```

```
## # A tibble: 10 × 4
##    region         institution                                                             acronym url                    
##    <chr>          <chr>                                                                   <chr>   <chr>                  
##  1 Australia      Atlas of Living Australia                                               ALA     https://www.ala.org.au 
##  2 Austria        Biodiversitäts-Atlas Österreich                                         BAO     https://biodiversityat…
##  3 Brazil         Sistemas de Informações sobre a Biodiversidade Brasileira               SiBBr   https://sibbr.gov.br   
##  4 France         Portail français d'accès aux données d'observation sur les espèces      OpenObs https://openobs.mnhn.fr
##  5 Global         Global Biodiversity Information Facility                                GBIF    https://gbif.org       
##  6 Guatemala      Sistema Nacional de Información sobre Diversidad Biológica de Guatemala SNIBgt  https://snib.conap.gob…
##  7 Portugal       GBIF Portugal                                                           GBIF.pt https://www.gbif.pt    
##  8 Spain          GBIF Spain                                                              GBIF.es https://gbif.es        
##  9 Sweden         Swedish Biodiversity Data Infrastructure                                SBDI    https://biodiversityda…
## 10 United Kingdom National Biodiversity Network                                           NBN     https://nbn.org.uk
```

``` r
show_all_reasons()
```

```
## # A tibble: 13 × 2
##       id name                            
##    <int> <chr>                           
##  1     1 biosecurity management/planning 
##  2    11 citizen science                 
##  3     5 collection management           
##  4     0 conservation management/planning
##  5     7 ecological research             
##  6     3 education                       
##  7     2 environmental assessment        
##  8    12 restoration/remediation         
##  9     4 scientific research             
## 10     8 systematic research/taxonomy    
## 11    13 species modelling               
## 12     6 other                           
## 13    10 testing
```

# `search_` subfunctions
You can also call subfunctions that use the `search_` prefix to lookup information. 
`search_` subfunctions differ from `show_all_` in that they require a query to work,
and they useful to search for 
detailed information that can't be summarised across the whole atlas.

`search_taxa()` is an especially useful function in galah. It let's you search 
for a single taxon or multiple taxa by name.


``` r
search_taxa("reptilia")
```

```
## # A tibble: 1 × 9
##   search_term scientific_name taxon_concept_id                               rank  match_type kingdom phylum class issues
##   <chr>       <chr>           <chr>                                          <chr> <chr>      <chr>   <chr>  <chr> <chr> 
## 1 reptilia    REPTILIA        https://biodiversity.org.au/afd/taxa/682e1228… class exactMatch Animal… Chord… Rept… noIss…
```

``` r
search_taxa("reptilia", "aves", "mammalia", "pisces")
```

```
## # A tibble: 1 × 9
##   search_term scientific_name taxon_concept_id                               rank  match_type kingdom phylum class issues
##   <chr>       <chr>           <chr>                                          <chr> <chr>      <chr>   <chr>  <chr> <chr> 
## 1 reptilia    REPTILIA        https://biodiversity.org.au/afd/taxa/682e1228… class exactMatch Animal… Chord… Rept… noIss…
```

Alternatively, `search_identifiers()` is the partner function to `search_taxa()`. 
If we already know a taxonomic identifier, we can search for 
which taxa the identifier belongs to.


``` r
search_identifiers("urn:lsid:biodiversity.org.au:afd.taxon:682e1228-5b3c-45ff-833b-550efd40c399")
```

```
## # A tibble: 1 × 15
##   search_term     success scientific_name taxon_concept_id rank  rank_id   lft   rgt match_type kingdom kingdom_id phylum
##   <chr>           <lgl>   <chr>           <chr>            <chr>   <int> <int> <int> <chr>      <chr>   <chr>      <chr> 
## 1 urn:lsid:biodi… TRUE    REPTILIA        https://biodive… class    3000 33626 36658 taxonIdMa… Animal… https://b… Chord…
## # ℹ 3 more variables: phylum_id <chr>, class <chr>, class_id <chr>
```

# `show_values()` & `search_values()`

Once a desired field is found, you can use `show_values()` to understand the 
information contained within that field. For example, we can show the values 
contained in the field `basisOfRecord`.


``` r
search_all(fields, "basisOfRecord") |> show_values()
```

```
## ! Search returned 2 matched fields.
## • Showing values for 'basisOfRecord'.
```

```
## # A tibble: 9 × 1
##   basisOfRecord      
##   <chr>              
## 1 HUMAN_OBSERVATION  
## 2 PRESERVED_SPECIMEN 
## 3 MACHINE_OBSERVATION
## 4 OCCURRENCE         
## 5 OBSERVATION        
## 6 MATERIAL_SAMPLE    
## 7 LIVING_SPECIMEN    
## 8 MATERIAL_CITATION  
## 9 FOSSIL_SPECIMEN
```

Use this information to pass meaningful queries to `galah_filter()`.


``` r
galah_call() |> 
  galah_filter(basisOfRecord == "LIVING_SPECIMEN") |> 
  atlas_counts()
```

```
## # A tibble: 1 × 1
##    count
##    <int>
## 1 126135
```

This works for other types of query, such as data profiles:


``` r
search_all(profiles, "ALA") |> 
  show_values() |> 
  head()
```

```
## • Showing values for 'ALA'.
```

```
## # A tibble: 6 × 5
##      id enabled description                                                                           filter displayOrder
##   <int> <lgl>   <chr>                                                                                 <chr>         <int>
## 1    94 TRUE    "Exclude all records where spatial validity is \"false\""                             "-spa…            1
## 2    96 TRUE    "Exclude all records with an assertion that the scientific name provided does not ma… "-ass…            1
## 3    97 TRUE    "Exclude all records with an assertion that the scientific name provided is not stru… "-ass…            2
## 4    98 TRUE    "Exclude all records with an assertion that the name and classification supplied can… "-ass…            3
## 5    99 TRUE    "Exclude all records with an assertion that kingdom provided doesn't match a known k… "-ass…            4
## 6   100 TRUE    "Exclude all records with an assertion that the scientific name provided in the reco… "-ass…            5
```