Efficient data access for data reporting

Context

This topic covers approaches to accessing GBIF data efficiently in order to generate data reports. Recent developments at GBIF include snapshots of GBIF data hosted on cloud computing infrastructures and the ability to query GBIF data using SQL, which makes it possible to produce, for example, data metrics and custom data reports without downloading individual records. In addition, emerging data cube formats will provide an opportunity to analyze biodiversity trends efficiently across space and time.
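As a rough illustration of SQL-based access, the following Python sketch builds the JSON body for a GBIF SQL download request that returns occurrence counts per country rather than individual records. The field names (`sendNotification`, `notificationAddresses`, `format`, `sql`), the `SQL_TSV_ZIP` format value, and the endpoint URL are given as recalled from the GBIF API documentation and should be checked against the current version. The helper name, the e-mail address, and the query, including its column names, are hypothetical. The sketch only prints the request body and does not contact GBIF.

```python
import json

# Assumed endpoint for occurrence download requests; verify against the GBIF API docs.
GBIF_DOWNLOAD_URL = "https://api.gbif.org/v1/occurrence/download/request"


def build_sql_download_request(sql, email):
    """Return the request body for a SQL download as a dict (hypothetical helper)."""
    return {
        "sendNotification": True,
        "notificationAddresses": [email],
        "format": "SQL_TSV_ZIP",
        # Collapse line breaks and indentation so the query is a single line.
        "sql": " ".join(sql.split()),
    }


# Illustrative query: count occurrence records per country from 2020 onwards.
# The table and column names are assumptions made for this example.
EXAMPLE_SQL = """
    SELECT countrycode, COUNT(*) AS occurrence_count
    FROM occurrence
    WHERE "year" >= 2020
    GROUP BY countrycode
"""

payload = build_sql_download_request(EXAMPLE_SQL, "user@example.org")
print(f"Body to POST to {GBIF_DOWNLOAD_URL}:")
print(json.dumps(payload, indent=2))
```

Sending this body would require an authenticated HTTP POST using GBIF.org account credentials. The result would be a small aggregated table instead of a full record-level download.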

Learning objectives

After completing this module, you should be able to perform the following:

  • Identify the data access landscape (i.e., which tools are available and how they are used)

  • Identify appropriate solutions for given data access use cases

  • Gain insight into coding and the rgbif R package

  • Explore how to script reproducible workflows

Trainers

The following trainers have developed the content for this topic:

Dag Endresen, Node Manager, Norway

Will Morris, Node Manager, Finland

Secretariat consultants: Andrew Rodrigues and John Waller

Preparation

Complete the following activities to prepare for the onsite sessions:

  1. Come with 2-3 data use cases or data reporting requirements in mind to propose as subjects for the group discussion.

  2. Create an account on GBIF.org (if you don’t already have one).

  3. Install software

The installation instructions reference earlier versions of R and RStudio Desktop. Please install the latest stable versions.