Data Cleansing In Power BI
- 03:45
What data cleansing or transforming is.
Transcript
Data cleansing is the process of preparing data so it's in a suitable format and ready for analysis.
Data is messy. It comes in all shapes and sizes, and it's full of mistakes and errors. Some of these are human error, where something has been typed incorrectly. Sometimes it's just down to machine output and how the data has been downloaded from another system.
We need to be able to fix these errors before we can use the data in our analysis.
There are two steps to cleansing data: the first is to identify what the issues are, and the second is to resolve those issues.
Some of the most common data cleansing tasks that we have to perform are as follows. The first is removing blank rows or columns. We'll often find that a data set contains blank columns or blank rows, and it's important that we get rid of these before we perform our analysis. Replacing values is another common data cleansing task. It's important that we have consistency in a column, so if we find typing errors or spelling errors, we change them so that we can use that column in our analysis.
Another common data cleansing task is splitting columns. Sometimes data comes down in one column when it should really be in two or three separate columns, so we use the split column tool to separate it.
It's important that we identify and remove duplicate rows of data.
And another common task is changing the data type.
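As a rough illustration of what these five tasks do to the data, here is a minimal sketch in plain Python. It is not how Power BI performs these steps; in Power BI they are done through Query Editor. The sample rows, the column names, the `CURRENCY_FIXES` mapping, and the `cleanse` function are all made up for this example.

```python
# Hypothetical sample rows; column names and values are invented for illustration.
raw_rows = [
    {"Asset": "Laptop", "Currency": "US Dollars", "Value": "1200 USD"},
    {"Asset": "", "Currency": "", "Value": ""},  # a blank row
    {"Asset": "Desk", "Currency": "usd", "Value": "300 USD"},
    {"Asset": "Laptop", "Currency": "US Dollars", "Value": "1200 USD"},  # a duplicate
]

# Assumed spelling variants, each mapped to one consistent value.
CURRENCY_FIXES = {"us dollars": "USD", "usd": "USD", "us$": "USD"}


def cleanse(rows):
    """Apply the five common cleansing tasks to a list of row dictionaries."""
    cleaned = []
    seen = set()
    for row in rows:
        # 1. Remove blank rows: skip any row where every field is empty.
        if all(not str(value).strip() for value in row.values()):
            continue

        # 2. Replace values: map spelling variants to one consistent spelling.
        currency_text = row["Currency"].strip()
        currency = CURRENCY_FIXES.get(currency_text.lower(), currency_text)

        # 3. Split columns: "1200 USD" becomes "1200" and "USD".
        amount_text, _, unit = row["Value"].strip().partition(" ")

        # 4. Change the data type: convert the amount from text to a number.
        amount = float(amount_text)

        # 5. Remove duplicate rows: keep only the first copy of each record.
        record = (row["Asset"].strip(), currency, amount, unit)
        if record in seen:
            continue
        seen.add(record)

        cleaned.append(
            {"Asset": record[0], "Currency": currency, "Amount": amount, "Unit": unit}
        )
    return cleaned


cleaned_rows = cleanse(raw_rows)
```

With the sample rows above, the blank row and the duplicate row are dropped, and both remaining rows end up with the same currency spelling and a numeric amount.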
To cleanse data in Power BI, we use the Power BI Query Editor.
It's known as transforming data in Power BI.
We can do this right at the start when we connect to a data source, or we can easily switch back at any point in time and make some more transformations.
So here we have an example of uncleansed data in Query Editor. We can see lots of blank rows, and we even have a blank column, column 3. If we look closely at column 2, we'll see that US dollars is spelled in various ways, and it's important that we only use one of those options. We would use replace values to fix that up. If we look at column 4, we can see that the value we want to calculate with is joined together with some other text. It's important that we split that column before we do our analysis.
Sometimes we can't see the mistakes in our data set so clearly, and so we have to use some tools to help us find those issues.
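To give a feel for what finding hidden issues can look like, here is a small sketch in plain Python that counts the blanks and the distinct spellings in one column. It only illustrates the idea and is not one of the Power BI tools. The column values and the `profile_column` function are hypothetical.

```python
from collections import Counter

# Hypothetical values for a single column; invented for illustration.
currency_column = ["US Dollars", "USD", "usd", "US Dollars", "", "US$"]


def profile_column(values):
    """Return the number of blank entries and a count of each non-blank value."""
    blanks = sum(1 for value in values if not value.strip())
    counts = Counter(value for value in values if value.strip())
    return blanks, counts


blank_count, value_counts = profile_column(currency_column)

# Print each distinct spelling with how often it appears, most common first.
for value, count in value_counts.most_common():
    print(f"{value!r}: {count}")
print(f"blank entries: {blank_count}")
```

A column that should hold one currency but shows several distinct spellings is a sign that values need replacing before the analysis.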
Once we have cleansed the data in Query Editor, it will look completely different and we're ready to go into Power BI and do our analysis.
Let's have a quick look at how to use Query Editor in Power BI.
When we're in Power BI, the first thing we do is connect to our data source. So I'm just going to connect to an Excel workbook here, Module 3 Lesson 1.
When we connect, we'll get a little preview of what the data looks like. I'm going to select my assets table here, and I can see at a glance that it is full of errors. So rather than load this data straight into my Power BI report and start working with it, I'm going to choose Transform Data. That will take me into Query Editor, and it's from here that I can start to make all the transformations.