Clean the COVID19 staging data

Build a datalink that fixes some data quality issues in the COVID19 staging data.

Introduction

In this tutorial you will learn some methods to clean staging data. Two key parts of this data require cleaning:

  • The country names are regularly duplicated.
  • There are days which have missing data, and days which have duplicate data. These will be evened out using the series transform.
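To make the second point concrete, here is a minimal sketch in plain Python of what "evening out" a daily series can mean. It is not the series transform itself. The function name even_out_daily_series, the (day, value) row shape, the rule that a later row replaces an earlier one for the same day, and the choice to carry the previous value forward into missing days are all assumptions made for illustration.

```python
from datetime import date, timedelta


def even_out_daily_series(rows):
    """Return exactly one (day, value) pair per calendar day.

    rows is an iterable of (date, value) pairs. Hypothetical helper,
    not part of any datalink or transform API.
    """
    # Assumed dedupe rule: a later row overwrites an earlier row for the same day.
    by_day = {}
    for day, value in rows:
        by_day[day] = value
    if not by_day:
        return []

    result = []
    current, last_day = min(by_day), max(by_day)
    previous_value = None
    while current <= last_day:
        # Use the day's own value if present; otherwise carry the
        # previous day's value forward (assumed gap-filling rule).
        previous_value = by_day.get(current, previous_value)
        result.append((current, previous_value))
        current += timedelta(days=1)
    return result


sample = [
    (date(2020, 3, 1), 10),
    (date(2020, 3, 2), 12),
    (date(2020, 3, 2), 13),  # duplicate day
    (date(2020, 3, 4), 20),  # 3 March is missing
]
cleaned = even_out_daily_series(sample)
```

Carrying the previous value forward suits cumulative counts. For daily increments, a zero or an interpolated value may be a better fill, so treat the rule as a placeholder.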

Prerequisites

Before proceeding with the tutorial, ensure that you:

Cleaning Names

The source data has country names which have changed over the lifecycle of the data. To see this: