Please find below a transcription of the audio portion of Ben Howard’s session, Power BI for Business Users and Project Managers – Part 2: Deep Dive into Shaping Data using Power Query, being provided by MPUG for the convenience of our members. You may wish to use this transcript for the purposes of self-paced learning, searching for specific information, and/or performing a quick review of webinar content. There may be exclusions, such as those steps included in product demonstrations. You may watch the live recording of this webinar at your convenience.
Kyle: Hello everyone, and welcome back for part two of the Power BI for Business Users and Project Managers course. This lesson is a deep dive into shaping data using Power Query. My name is Kyle and I’ll be the moderator today. And today’s session is eligible for one PDU in the Strategic category. The MPUG Activity Code for claiming that with PMI is on the screen now. Like all MPUG webinars, a recording of this session will be posted to MPUG.com shortly after the live presentation ends. All MPUG members can watch the recording at any time and still be eligible to earn the PDU credit. All the sessions you watch on demand can be submitted to your webinar history, and the live sessions you attend are automatically submitted. Within your history you can print or download your transcript and certificates of completion, including the one for today. You can access that by logging in to MPUG.com, click on the My Account button, and then click on the Webinar Reports Link.
Kyle: If you have any questions during today’s presentation, please send those over at any time using the chat question box on the Go To Webinar control panel. We do plan to answer those questions for you throughout the session today.
Kyle: All right, we’ll go ahead and get started. Very happy to welcome back Ben Howard today. At this time I’ll go ahead and hand it over to you Ben to get us started with today’s part two.
Ben: Brilliant, thank you Kyle, and welcome everybody, wherever you are in the world and whatever time it is. Morning, noon, afternoon as it is for he here, or even evening. Let me just share a screen here. Kyle, can you just pipe up and say, “Yeah, I can see a big screen-
Ben: … that’s got Power BI on it.”
Kyle: That looks great.
Ben: Brilliant, thank you. Welcome, as I said. Power BI for Business Users and Project Managers part two. Part one is on the MPUG website, and you can review that where we really just covered off what was Power BI. Today we’re going to do a bit more of a deep dive, and this will be very demo-based, into shaping data using a tool that’s part of a Power BI data set, or part of a Power BI suite of applications if you like, called Power Query.
Ben: Before we go ahead, let’s just remind ourselves who we might be in terms of who did I design this three part course for. Basically project managers or program managers or analysts, that’s the kind of role I’m thinking you might have. Of course, you might be an IT manager as well, interested in Power BI. The tools that you might use today for reporting or managing projects would be things like Microsoft Project, of course. We are, of course, a Microsoft Project user group, Project Online, Project Over Web, a new version of Microsoft Project, Roadmaps, Planner, Excel of course, SharePoint Lists and maybe a myriad of other things. JIRA, Primavera, those sorts of things as well.
Ben: The need, very much, which we identified, is this idea of reporting. This idea of reporting, we all have to report, of course. But reporting takes a long time, so what we’re trying to do is provide a standardized way of quickly reporting across multiple projects and programs, and being able to share those reports out to our execs and senior leadership team or senior management team. So that they can understand the status of our, and indeed their, programs and projects. Because ultimately those things are always, they’re just an investment, right? They’re a way to spend quickly $20,000, $100,000, $2 million, $20 million dollars. And people want to know when you’re spending money, are you doing it the right sort of way, and is it going to give the return on investment that has been identified early in the business case. So that’s really the need for the reporting, or that’s the need that reporting tries to service. So that’s who this course is designed for.
Ben: Just a quick one pager on who am I. My name’s Ben Howard, as you know. I’ve been a Microsoft MVP for 12 years, always in the project space. So that’s been great, I’ve been working with Microsoft Project since 2002. The 2002 version, so not as long as some people on this course for sure. I do write and produce courses, not just for my project user group, but also for Plural Sight, and they’re available commercially. I tweet very occasionally @benproject, and you can also grab me on my email and socials there as well. I do have a YouTube channel which covers off project and Power BI as well, so please feel free to have a look at that. And please do subscribe and post any useful comments or any feedback on those videos. You’ll find some stuff around Power BI on there as I said, and some stuff around Project, and using Power BI for Project and planning as well.
Ben: Okay, so specifically today, what are we going to cover off? These three areas are the main topics that we cover off in the courses. We are here at deep dive into shaping using Power Query. So what we’re really talking about, shaping data using this tool that’s built into the Power BI Desktop. We can also shape data using something called Data Flows today as well, so we might have a look at that if we get time.
Ben: Okay, so specifically, now we know that we’re in part two. What are we going to cover off within this next 50-odd minutes? We’re going to shape some data, so third party data that we get ahold of. Now, whether it is actually third party data, maybe somebody exports some data for us out of a project system and we have to report on it, or whether it’s our own data from Project Online or JIRA, or Planner, the data that we export has been designed by somebody else. Very rarely does any data exist in the exact format that we want it to. So, often we need to massage that data, that’s a nice term for changing it. The data might need massaging in some respect, I might go and change some data fields, so well remove some errors. I might remove columns of data or fields of data that I no longer need, because there’s no point providing any sort of analysis on data and carrying that data around if I’m not going to use it in the future, if you like.
Ben: And/or I might derive new data from an existing data set. So I might have a day of the week, or the date of a transaction or project, something like that, and I might want to derive the actual day of the week that that happened on, whether that could be a Saturday, Sunday, et cetera. So I’ll often take a data set and open it up in Power Query and start to massage that data set, so that it meets the needs of my own reporting, whatever that might be in that instance.
Ben: So, what are we going to do in this session, the third bullet point there? Well, in this session I’m going to try and teach you how to use Query Editor inside our Power Query. Query Editor, Power Query, it’s the same thing, inside the Power BI Desktop client. When I say inside the Power BI Desktop client, because that’s how you access it, it’s just another tool. Think of it a bit like being VBA inside of Excel, or Word, or Project, or whatever. It’s just there, and it’s accessed by opening up that application. So we’ll use Power Query or Query Editor inside of a Power BI Desktop client to transform the existing data that we have, so that it begins to suit our purpose.
Ben: It’s not the only way to transform the data, I can do some transformations inside the Power BI Desktop without even touching the Query Editor. And I can use that using DAX, or I can perform that using DAX data analysis expressions, that DAX language that we have. But that’s really beyond the initial capability that most people want to try and reach. Excuse me, should have grabbed some water before we did this.
Ben: I’m going to use, initially, some fairly generic examples to show you the Query Editor. And then I’ll show you some real world examples using Microsoft Planner and Project Online data sets that we’ve got. And both of those are available anyway on the web. But both of the Planner and Project Online queries are quite sophisticated, so we need to start off from something a little simpler, hence the generic reason for doing this.
Ben: Okay, so let’s dig into the Power Query Editor. What does it do? Well, the first thing it does is it allows us to cleanse data, wash it if you like, so that we can get rid of any of the bad fields, so that we can get rid of any of the data that we don’t want, so that we can derive some new data. That’s termed as cleansing. We can also combine data from different data sets, and we do do this in the Planner Power BI data set that I’ve got. And again, I’ll show you that, I’ll show you how we can combine data. But we can combine data in a couple of different ways as well, and I’ll show you an example which has been quite useful, and I’ve used in other project management solutions as well.
Ben: In terms of data cleansing, as I’ve already mentioned, it’s unlikely that any data that we’re going to bring in to our Power BI is a perfect fit for our needs. Even Project Online data, I’m always modifying that in Power Query, I’m extending it in some respect, or getting rid of other data, or just changing maybe concatenating two data sets for whatever reason. The reason no data’s a perfect fit for our needs is other people have created it to suit their needs. The data that comes out of Project Online is there to support Microsoft Project and the features in Project Online. It’s a happy coincidence that we can bring it into Power BI and do some reporting on it, but it wasn’t designed with reporting in mind, it was designed for displaying data in Microsoft Project in mind. Even the data that’s fully rich might need some cleansing modifications.
Ben: Could have inherent errors in there, so I potentially lots of spelling errors, which okay, ideally I’d fix at source, but we don’t always have the opportunity to do that. So there could be inherent errors that I want to fix. I could have too much data. A typical Project Online issue that we face all of the time is the assignment data. Assignment data and assignment time phase data, so what is Ben Howard doing today, and tomorrow, and the day after, the day after? If I put myself on a project plan for five weeks, then I’m pulling down five sevens, I’m pulling down 35 assignment records of data. And that’s just on one task. If I’m on 100 tasks, for five weeks in a single project, then I’m pulling down 3500 records of data. Multiply that by being on 20 projects, or 100 projects, you’ve got a lot of records of data that you’re bringing down. Assignment information like that is often… Well, I don’t want to bring it down for a year in the past because it’s old data, it doesn’t need to be brought down.
Ben: Quite often we’ll filter out the data that we won’t want to bring down. We might filter that in Query so we’re not bringing down as much data as we’re querying it, or we might choose to, and this is the lazy way of doing it of course, bring down all of the data, bad idea, but then just filter out anything that’s more than two years old. So there’s lots of things that Power Query will allow us to do, and lots of things we need to think about.
Ben: Here’s a data cleansing example, one of the sorts of data cleansing things that I can do when I bring down data into Power Query. There must be 100-odd things that you can do, but here’s some things that I typically will do. Typically I’ll get rid of any columns that I no longer need. Again, ideally I’ll try and reduce the number of columns from the query, but sometimes you’re not able to do that. So I’ll delete columns that I don’t need. I might filter out specific rows, so again, data that’s before a certain date would be something we would typically do. Or when we bring down data from Project Online, and I’m looking just at a count of projects, I always filter out the time sheet project, projects with ID of number seven. I don’t want those, because those are time sheet projects, so they’re always filtered out straightaway.
Ben: I might split some columns. If we have an email address, I might split the email address at the @ sign, so I’ve got the first portion of the email address. Just so I’ve got a user name, something that’s easier to work with. I might replace some values. So this is where we spelt something incorrectly, or we’ve got two departments and actually I want to see them as one department, so I can replace value one with value two, et cetera. So again, fairly easy to do.
Ben: I might sort the data. Don’t often do this, I must admit, but it could be sorted so that I can then add an index to the data. So I might sort the data by date, oldest to newest, and then add an index zero to whatever number, just so that I can do some indexing on that data, so that’s quite useful, or could be useful in that instance. And I might perform a custom calculation. Again, we’ll probably do that in the demo, so in my generic demo I might have a sales price and a number of units that I’ve sold, and therefore I can get the sales amount. Something like that. So we can perform those sorts of custom calculations.
Ben: I did some analysis of some tweets once, for Donald Trump. Not for Donald Trump, but of Donald Trump’s tweets, and I just worked out did he do tweets in the morning of afternoon, so I just had a custom calculation that created a new column, and that column was either populated with A.M. or P.M. based upon the time of the tweet. We could see when he tweeted more often very, very easily using that sort of thing. So there’s some examples, very quick examples of data cleansing.
Ben: I think this is a good time to do a demo, so let’s just pop over to this screen here, or this folder. Here you can see I’ve got a folder and it’s called Part 2, so it’s okay. I’ve got a sales.xls file in there. So again, this isn’t, at the moment, project specific, but this is a great file to understand how we can begin to use Power Query. I’ve got a couple of fields in here, a couple of values I want to point out. Let’s just expand this a little bit. This is a table, and we can see that if I click in the table, it’s just called table financial data. So that table has some columns in there, we’ve got a transaction ID, we’ve got a segment, so which country, what was the product, what was the discount band, the units sold. And then in pound shilling and pence, of course we’ve got the manufacturing price, the sales price, the gross sales, which will be the units sold, multiplied by the sales price. We’ve got a discount value, a sales value, a date that we made the sale, and a sales person.
Ben: This is all fictional data, of course. And I’ve got a few, 700-odd rows in here if I just come around we can see at the bottom there, bottom left it says 716 rows. So not a massive amount of data, but sufficient that we can bring it into Power BI.
Ben: Now, in the country column here, I’ve highlighted a couple of columns. My sales data, when I sold to Spain, Spain’s either been tracked as the word Spain, or as the word Espania. So this is one of the things that I’m going to correct and replace when we bring this into Power BI. I don’t think there’s anything else I need to mention too much in there, it’s just a standard table.
Ben: Okay. Did we mention at the end there the salesperson? I’m not sure we did. Every transaction has a salesperson ID. So that’s the ID of the salesperson, the employee that made that sale, if you like. We’ll use that later on for combining the data, but remember we’ve got that salesperson ID there. Okay, let me close that down. I’m going to open up Power BI, let’s just open up this blank Power BI sheet here. Not sheet, but report. So this is just for Power BI Desktop, which we will have covered off in potentially a demo last week, but we’ve opened that up.
Ben: Now, what I’m able to do, to transform data, I click on the transform data button here. I haven’t yet got any data, and there’s a couple of ways I can go and get data. I can either click on the Get Data button here and just say, “Hey, let’s go and get some data from an Excel file or a tech.csb file,” or I could click on Transform Data and use that button to get the data. In this case I’m going to just click on Transform Data to show you what that looks like. This opens up a new page, or a new window if you like, within the Power BI or from Power BI Desktop. I’m just going to maximize that. At the top you can see that it says untitled Power Query Editor. So there’s no excuse for me to forget exactly where I am within the tool set.
Ben: I’ve also got the ability in the Power Query Editor to go and get data. Same sort of UI as within Power BI, the Power BI Desktop. So let’s go and get data from an Excel file. Which one are we going to bring in? We’ll bring in that sales data, or sales.xls that we’ve just looked at. Okay, so Power Query does this, it’s the way that we get data anyway. Even if we got data through Power BI it would open up Power Query in the background and go through the same process. It wouldn’t necessarily take us into Power Query.
Ben: Now that I’ve navigated, or said, “Hey, I’d like to get data from that Excel file,” Power BI and Power Query goes to query that XL file, and it finds two areas that I can get data from. Firstly it says, “Hey Ben, we’ve got a table here,” which is this icon here with the blue horizontal row at the top. It says, “Hey Ben, I’ve got a table here, and it’s called TBL_Financial Data.” I can click on there and just have a quick preview of that data and the table. So that’s the table that we’ve just looked at. Or I could look at just the sheet. We had a worksheet that was just called WIV Table, and this is what that worksheet looks like. Notice the worksheet’s got column one here, which has got some information in, mainly no’s, but I did write check this in column one. And some data around column three, or some data in the first line of rows as well.
Ben: Where you can, if you bring in data from Excel, always try and format that data in a table anyway, it’s a much better experience. So I’m going to select the table, which is done by the checkbox here, and then I’m just going to click okay. And that’s going to bring that data into Power BI, and effectively into Power Query. You’ll see it refreshed the screen a couple of times there. But you can now see the first 1,000 rows of that data set presented to me within Power Query. If we look at the bottom, just cast your eyes down to the bottom here, we can see we’ve got 13 columns of data and 703 rows of data. So it always brings in the first 1,000 rows. If you need more, then you need to sort it in some sort of way to bring in the other rows.
Ben: Okay. On the left hand side here, we’ve got the query name, TBL_Financial Data, it’s picked up the table name. You can see the same name is there. I’m just going to rename this, I’ll just call this Sales Data. It’s nice to have queries which make, at least, sense. Okay. In the middle here, we effectively get the table. We get the columns as they were ordered in the table, and we get to see all the data.
Ben: Over on the right hand side, we have some applied steps. Every time, when I cleanse this data, I’m going to add a new step. And I’m able to navigate through these applied steps and see the way that the data has changed if we add a step. Now, the steps for cleansing the data typically, are all up in these, available via these buttons here. So I’ve got Home button, Transform button, Add a Column button, View, et cetera, et cetera.
Ben: One thing I will point out is on the View tab, I’ve got the formula bar checked. The formula bar displays this bar her, and this bar here is displaying is effectively what’s called M-code. This is a functional language, you can go and buy books on this, et cetera. But this is the language that Power Query uses to manipulate the data that we’re bringing in from this Excel sheet. So you don’t have to learn M, but you find you end up picking up bits of M along the way. I always have a look at the formula bar, because sometimes it’s quicker to edit something in the formula bar than it is to modify the applied step.
Ben: Okay, so the first thing I’m going to do is I’m going to get rid of the columns that I don’t want. This is the thing I always do first. We brought all the data in. I don’t need this transaction ID, so I’m just going to right click on there and I’m going to remove that. So by right clicking on the column header, I’m given a whole host of options that I have access to for this query. And you can get at these options either here, right clicking, which is my favorite way because it’s the quickest way to do it, or from the Home tab I’ve got the ability to remove columns here. Transaction ID is highlighted, so I could select that and click remove columns. Or, if you just wanted to keep one column, just highlight it and remove all of the other columns. I’m going to remove Transaction ID, and that will do me for now.
Ben: Now, notice what’s happened is the Transaction ID’s gone from here, and I’ve got a new applied step. The applied step is called remove columns. If I click on Changed Type, I go back and I’m able to have a look at the data as it existed at that time, before I remove the columns. So you can see the Transaction ID is in there. And that’s something to understand about Power Query, it’s a functional language, and these queries are performed in these applied steps, or performed in the order that they’re displayed on screen. We can move things around as well, in terms of orders, just by right clicking and moving up or after.
Ben: Okay, so that’s the first thing I’m going to do. Now, remember my Espania. I’ll see if we can see it in here. It should be in the country somewhere, if we just scroll down. I would have thought that would have been fairly near the top. I can’t see it. Okay. I’m going to do something here, what we’ll do is we’ll filter. So we’ll apply a filter to this, and I’ll just select the value of Espania. And I’ve got two records that we can see in here. Again, notice how we’ve applied a filter, so the only thing we’re now looking at is two rows, as you can see at the bottom right there. What I want to do is I want to change this word Espania to be Spain.
Ben: So I’m just going to right click on here, in the cell, and I can choose replace values. This will open up a little dialogue box for me, and it says, “Hey Ben, what do you want to replace Espania with?” I’ll replace it with Spain, we’ll click OK, and then those two fields have been replaced with the word Spain. You can see the M-code for this appear in here in the formula bar.
Ben: What I’ve done is absolutely great, but I’ve actually only got two rows in my data set now. What I need to do is I need to get rid of that filter row applied step here, remember we put that in. Next to each applied step I’ve got a little X, so if we were to delete the applied step of Espania, then what would happen is we get a little warning, are you sure you want to do that? Yes please. Okay. What happens is we go straight from removing that column, which was that Transaction ID, so we remove that column, and then we go and replace any Espania in the country field, any country with the name Espania, with the name Spain. And where you see a little icon to the right of the applied step here, the gear wheel, you can double click on there and it will open up the user interface for you. So if you need to check anything, you always can.
Ben: Okay. So that’s fairly simple. We’ve covered off getting rid of columns, changing some values, replacing some values. Let’s look at what else we’ve got. The other thing that I always do is I typically check the values, so this 1-2. That means this column contains decimal numbers. So all of my numerics are pretty much decimal numbers, except the sale prices, which is a whole number in this case. Maybe I only sell things in whole pounds. Similarly for the manufacturing price, the unit sold is a decimal, so it looks like I can change the, sell .1 of a unit, which might not be so good. Maybe the units sold I want as a whole number. So we can do that, and we can begin to change these titles.
Ben: Maybe we’d like to calculate something. So we’ve got the sales price, we’ve got the gross sales, we’ve got discount, we’ve got the sales value, we’ve got the manufacturing price and the units sold. If I was to multiply the units sold by the manufacturing price, I’d get effectively the cost of goods sold. So let’s do that. I’m going to do that by selecting here on the tab to add a new column, and this is going to be a custom column.
Ben: Okay, so again I brought up, or in this case a user interface is presented to me. For new column name, I’ll just call this Cost of Goods Sold. And this is a point and click thing, so I can say let’s take the units sold and multiply that by the manufacturing price, and that should be okay. Click OK, and all of a sudden I’ve got a new column, added custom, called Cost of Goods Sold. And again, we can see how that has been created for me in the M-code using the formula bar. This shouldn’t be ABC, 123, so I’m just going to change that so that is a fixed decimal number. Fixed decimal number is actually always going to give us two decimal places, so anything that is a monetary value, typically you want to put that as a fixed decimal number. That’s good.
Ben: Let’s just do one other transformation on this data. I want to see the day of the week that things are sold. I’ve got a date field here. So what I’m going to do, is I want to take that date field and based upon that date field, I want to work out whether things were sold on a Monday, or a Tuesday or a Wednesday, et cetera. What we’ll do is for that column, we will duplicate it. I can do that a couple of ways here. I’m looking for the duplicate column button, which I can never, can’t see. So instead, we’ll right click here and I’ll say duplicate column. I always find it’s easier just to right click here. This has called this Date Copy. I could change the name directly in here, so I will do. I’ll call this Day. Okay, so you can see it’s called it Day. Sometimes I’ll come into my M-code and I’ll try and reduce the number of queries or number of applied steps we need to do.
Ben: At the moment, this day is displayed as Day-Day, Day-Month, Year. Again, if I right click on here, I’ve got the ability to transform this. I’m going to transform this to be a day, and this always disappears. Let me just minimize that a little bit, it disappears onto my third monitor, onto the other side there, actually my fourth monitor. So let’s do it this way, transform day, and I’ll have the name of the day. So the 10th of September, ’20 was a Thursday, the 2nd of November was a Saturday, et cetera. So those are nice things that we can do.
Ben: That’s probably all I want to do for now. What I would do then is I’d close and apply that. Okay, so Power BI will then re-read that whole file, and apply the changes that we just created or made, into that whole file. And this is going to cover off a little bit of what we’re going to cover next week, but you can see in the fields list here, I’ve got a column called Day and a column called Cost of Goods Sold. Okay. If I wanted to, I could have a look at the sales value, which is a numeric value. And your total sales, about $120 million. And if I wanted to look at that by the day, okay I can see we well much more on a Friday than we do on a Saturday. So I’ve created some derived data just to show you that.
Ben: So that’s Power Query, or at least that’s part of how Power Query. Okay. Let’s go and have a look at… well, let’s go and have a look back at the slides, see where we were. So that’s a little bit about data cleansing.
Ben: Not only can we cleanse data, but we can combine data together as well. This is really, really, really useful for where we’ve got files of the same structure, either coming from different projects, this is how Planner works, or we hae temporal data. So we’ve got monthly sales figures, or incremental production figures. Planner is the best use case I can have for data combining in this case, from a project perspective, but you can see how you can use it in other areas. For incremental files, we would append data files that have the same shape. We can also get data from different data sources. In which case those files might not be of the same shape, but they contain the same sort of information, or they contain information that we want to merge together. So I might be taking customer data from an online CRM system, sales order from a sales system, public data from some government or web or public sources, and I want to bring all of those together.
Ben: There’s two reasons we might combine the data, and ultimately here’s an example. If I’m in a multi-national company, I might be doing projects in the U.S., using an ELP system, I might be doing projects, maybe using Salesforce in a CRM system, and I could just have CSV files elsewhere. I want to bring together one view of the data, I need to merge that data together. So that’s why we might combine queries. And I’ve got a simple example here. We tend to append queries together if they are of… if the query, if you like, if the data set is of the same structure. So we would append the queries together, and then we would typically cleanse them. If we’re merging data, we would tend to merge them into a single data set, using something called a matched column.
Ben: Okay, so let’s just go and have a look at an example for that. Remember our sales.xls, at the end of that I had an employee number. I’ve also got an employee table. And here’s my employee table, we can just see that it’s called Emp Table. So I have an employee ID, first name, surname, and a commission level. Okay, this becomes interesting, doesn’t it? Let’s bring that data into Power BI. We’ll close that, we’ll open up, again, in this case, the Power BI Desktop. We’ll go into Transform Data, we’ll go and get some new data, so from Excel. There’s my employee data, and this will open up and bring in a new query. It’s going to ask me where do you want to bring it in from again, Ben? Okay, there’s my employee table. The screen refreshes a couple times, and here we are back in Power Query.
Ben: Notice I’ve now got two queries. One called Sales Data, and one called Employee Table. Let’s just name that to be Employee Data. Okay. Now, if I want to find out, or if I want to combine these two data sets together. And you may want to do this, you may not want to do it, and those of you who know Power BI, you might say, “Hey Ben, we would just create a relationship in the model.” And yes, you can do that. But maybe for each sale, I want to know which employee made each sale, and I want it in this record. What we’re able to do is use this field here, the salesperson ID to effectively match which user, which employee, made that sale. So let’s do that.
Ben: Okay, on the Transform… No, add column. Let’s just have a look. Ah, here we go. On the Home tab, I’ve got a button that says either merge queries, or append queries. And under both of those, I’ve got the ability to merge queries and create a new query, or merge an existing query into one query, or merge queries as new. I’m going to just create a new query here. Actually, we’ll just merge these two queries. We’ll merge a couple queries. To merge this query, we’re going to have a query in the file. So I’ve got the sales data, and that’s the first query we’re going to merge. We’re going to merge that with the employee data, and what it looks for is select a table, matching columns to create a merged table. Well, my matching column is going to be Salesperson ID, which is going to be there. So we’ll click OK, and you see what it’s done is it’s created a new table for me, or a new column at the end, which is a table.
Ben: If I click on this double expansion here, then I can choose which columns to add in. I’ll just have the first name, the surname and the commission percentage. Okay. So now we can see, for this first sale here, Emma has made that sale, and Emma Rodriguez, and the commission was 3%. Again, if I just close and apply that, then those changes will be applied in Power BI, and we’ll see the employee first name, last name, et cetera in here. So we can see who’s made, we can do a similar sort of visualization as to this one. If I just copy and then paste that, instead of looking at that by day, those sales, we can now have a loot at that by employee first name. So we can see actually Emma makes most of our sales as well.
Ben: Okay, so that’s a little bit about merging data together. We can do something else. If I just take a new report. I’ll show you this very quickly, so Power BI is just going to open up. Whilst that’s opening up, let’s go and have a look at some new Excel data. So this is, this bar here, 20-11 Cheshire CSV, this is police data from the UK. So these were the crimes reported in Cheshire, within, what was that, 11, so November, 2020. And you can see we’ve got a lot of them, I’m not sure how many there are. Each crime has a crime type and so on. Okay.
Ben: Let’s think about this. This is an incremental set of data. So I have one for 20-11, I’ve got one for 20-12, I’m certainly going back about four years, actually. Let’s bring that data into Power BI. Okay. Well, we can do it from here. We can go and get the data from, in this case a CSV file, we’ll get Cheshire data. If we do this directly from Power BI, then we get the same interface at the beginning. So here’s the same interface, and it says, “Do you want to load this data Ben, or do you want to transform it?” I’m going to transform it. Now, transforming it effectively means, “Ben, load the data and then take me into the Power Query Editor.” So here’s all of my crimes if you like, reported in 2011.
Ben: What happens when 2012 comes out? Well, when 2020-12 data comes out, I need to re-import it again, don’t I? So I can go and select 2012, open, and now the same thing will happen. Let’s just bring that in, click OK. And now I’ll have two queries, one for 2011, and one for 2012. And I could, of course, append those queries together. Before we did a merge, this time I can append the queries as new. Well, let’s just append both of these two sets of data together because they’re the same shape, and end up with something which is called All Crimes.
Ben: Now, Power BI will do this for us, I don’t need to do this manual appending. It’s really neat. If we were to, let’s just have a look at what we’ve got in here. If we were to take the data from here, so this is all my police data, including 2012, I can bring data into Power BI and Power Query from a folder. So notice this is all the police data for Cheshire for the last two years, from 2018 through to 20-12. I’m just going to highlight that, and again open up a new Power BI file, so we’ll click new. This always opens up on a different screen for some reason, never quite worked out how or why that is.
Ben: You might be thinking, “Hey Ben, so why’s this relevant to Project?” Well, this is relevant when you export Planner files. So I want to show you the scenario, and then we’ll go and have a look at real world Planner data. What I’m able to do now, again, a new Power BI file, is get data. But if we click down in more, then I can get data from a folder. And what Power BI will do is it will go and automatically append all of that data together in one query. So here’s the folder that we selected, it’s the police data folder, and within there there’s lots of sub-folders. And within each sub-folder, there’s one or more files for police data. What this is ultimately going to do is combine all of those files together. Power BI has worked out what’s in this data set, and one of the options I’ve got is combine and transform the data. So let’s just have a look at that.
Ben: That’s going to do it’s usual thing in the background, and then we pick up the first file, yep that’s fine. And you can see, or you will see, it will navigate through each file. Let’s just have a look. Okay, let’s get back to Query Editor. You can see it’s supplied some steps. Let’s just really briefly navigate through these. But we’ve got some source data, and you can see in there it’s picked up all of the Cheshire Street files, from 2018-01 through to 20-12. It’s done a few bits of filtering and built a custom function, dah, dah, dah. This is the important bit. It’s expanded the table, and effectively appended everything together. So I’ve now got all the files in there, and if I close and apply that, you’ll see that when this query runs, if you look carefully here, you’ll see that it navigates through all of the different CSV files. I’m going to let that run because I’m finished with that.
Ben: Okay. Now we understand about appending and merging queries, then let’s go and have a look at a couple of real world examples, where we do effectively that, we append and combine. Now, the real world examples I’ve got, as I’ve already alluded to, are Planner and Microsoft Project, Project Online. Just trying to bring up the correct web browser. There we go. What I’ve done is on GitHub, I’ve published a couple of templates, Power BI templates for this. We’ve got one here called Power BI Planner, and one called Power BI Project Online, Content Pack with Currency.
Ben: Let’s just open up the Planner one. If we go back to the data here, in part two, notice I’ve downloaded that Power BI… now I have part two. I’ve got a Planner Power BI file. And I there, what I’ve done from Microsoft Planner is I’ve exported some Planner files. You can go into Microsoft Planner and you can export your Planner file. So I’ve got four Planner files. We just have a look at one of these, these are all demo files, so the data in them is sparse, in this case. But you can see I’ve got a plan name, some task names and some other things. But ultimately, all of these planner files come out in the same sort of way. So, what I’ve been able to do, and what I’ve done, is I’ve created this Planner 7.5.P-bit file. A P-bit file is a Power BI template file. That’s going to open up on my other desktop.
Ben: And what we’ll do, first of all, is we’ll go and point that at this folder here, called Planner files. So I’ll just Control C that. The Power BI template file is going to ask us for a folder to input from. Remember, we’ve got one file in there called Bobbins. This opens up on a different screen, so it says here’s a Planner 7.5, what’s the path to the Planner files? So that’s the path to the Planner files. And then we can just click… I’ll click Load, because that will open up Power BI, which we will come to next week. That’s just running bits on a different screen, we’ll bring this screen over when we can. Okay.
Ben: Here’s the Power BI file, it’s just going to load up. We’re not interested in that today. What we are interested in is the Transform Data. So Transform Data goes and gets some data, in this case from the Planner file, from the Planner folder. It invokes that hidden column, and then I do a load of other things and I end up then sorting by rows and ending up with a Planner name, a Task DI, the Task Name, the Bucket, some of this was in different languages, the progress, priority, et cetera, et cetera.
Ben: Okay, so that’s great. That’s what it does. At the moment, this only gives me information for one plan. However, what we’re able to do is, if I add more data into here, then I can append all of the data. So let’s bring up these other Planner export files. I’ll just copy those. We’ll just port them into Planner files here, I’ll just paste those in. Okay. And then, with those pasted in, I’m going to click refresh on here. And what the refresh is going to do is it’s going to go back and find all of the new Planner files, so we can see that happened really quickly, and give me all of the tasks. And that’s just doing effectively an append together, so we can see that we’ve got lots of those different files. We can go in then and when this refreshes, which it should do pretty quickly… I’ll have to just click refresh here. Then we’ll see the new plans and the new tasks left come into this view here.
Ben: So if you’re using Planner to manage multiple plans, but you can’t report on it, well now you’ve got a way of reporting onto it. It’s a bit manual, you have to export those files.
Ben: Okay, so that’s a bit about query. If you wanted to have a look at the advanced query, and you think, “Oh, this M stuff looks quite nice,” we have an Advanced Editor, and that takes you through all of the M-code. Looks a bit scary in here, but all of it is actually fairly easy to debug, and there’s lots of books and help around here.
Ben: So that’s Planner. Let’s pick up another tool. Of course we would look at Project Online. Okay, so here I’ve got a Project Online file. Again, this is available either from Microsoft or from the GitHub site that I showed you which just was about loading up. You just go to GitHub and just search for Ben-Howard, you’ll find these templates available for you. Okay, so that’s taking a little while to log on, or to load up. Now doing it on the other screen of course, we’ll bring that over. Okay, let’s just let that refresh. So this is, again, effectively a cut of Microsoft’s Power BI content pack. Again, we’re not interested in Power BI with visualization at this stage, we’re interested in Power Query.
Ben: So they bring in many, many, many different queries. Okay, so here’s the query to do with projects, here’s the one to do with resources, here’s the one to do with risks, issues, et cetera. If we have a look at the Project one, then you can see here will be applied steps for the Project one. So the source goes and gets some Project data from a URL we enter, but then there’s a filter. So here is where we remove the time sheet project. And then we can remove other columns, et cetera. So again, at the end this is what we’re left with in terms of Project data that we can then visualize. I’ve taken this risks one, and added in a probability impact matrix set of features. Again, that works quite well.
Ben: Okay, the Power Query ultimately is the tool that sits behind Power BI, that lets you take a set of data and begin to cleanse that data. So you can turn that data from something that’s maybe half useful, into something that’s very useful. Or you can take data that’s sitting from the, as we do with Planner, take separate sets of data that then aggregate those together or combine them together so that you can visualize all of that disparate data in one place.
Ben: Okay. So that’s really the end of what I’m able to cover in an hour. Please again do reach out for me on the socials. And do join us for part three, which of course is next Wednesday, where we’ll go and take what we’ve learned. We’ll create some reports using the Power BI Desktop, we’ll publish those reports so that they can be shared with our team. Kyle, what do we have on the questions? Anything or nothing, my friend?
Kyle: Thanks Ben. We do have one question that came in from Steve. He was just curious if there’s any macro functionality that’s available.
Ben: Macro functionality. In Power BI, no. There’s no macro. Just trying to work it out. The Transform, in terms of maybe recording a macro, could you record a macro to change your data? No, you can’t. That’s the only thing I can think that Steve might be asking at this stage. This, of course, is a set of codes that we can go and edit and copy, and move around, and comment, et cetera, et cetera. So that’s as far as I think I can answer Steve’s question, unless he comes back with anything else to you quickly.
Kyle: Okay. No, I think that was it. And I think that’s it for questions as well. So we are ready to close out here today. For those of your taking the full course, I’ll put that info back on the screen, as well as the PDU code here. Today’s session, eligible for one Strategic PMI PDU. And if you missed any of today’s session, or would like to go back and review Ben’s lesson today, the recording will be posted to mpug.com in just a couple hours, and you’ll receive an email with a link to that. Part one is also available on demand right now, if you happened to miss that lesson, you can catch that on mpug.com.
Kyle: And as Ben mentioned, part three will be next Wednesday at the same time. That one will cover creating reports and using the Power BI best top app. And that does it for today. So once again, thanks to you Ben for the great lesson today, and we want to thank everyone who joined us live, or is watching this on demand. We hope you have a great rest of your day, and we’ll see you back next Wednesday for our third and final lesson in this course. Thanks again.