The road to London

It’s time for me to go on the road again, and this time I’m headed to London and the UK Cloud Infrastructure User Group where I will be delivering a brand new session on self-service BI from an infrastructure perspective.

This session is not only brand new, it is also a bit of a presentation-style experiment. I will be delivering the session in no fewer than three different voices – as in, three differing points of view. The subtitle for this session is “arguing with myself” for a reason…

My goal with the 60-minute session is to provide a walk-through of what self-service BI is and what makes it so potentially awesome, how it can (and will!) affect infrastructure people, and what can be done to turn a disaster into an opportunity.
It is primarily focused on Power BI as this is my weapon of choice these days, but the information applies well beyond that particular tool. It’s more about the mindset than the specific implementation – something I find is most often the case.

The user group meeting is held at WeWork Waterhouse Square in Holborn (the closest tube station should be Chancery Lane) at 6.30 on Monday. Register here and come join!

Power BI Dataflows example – part 2

There are several more use cases for a dataflow, but one that is very useful is the ability to share a dataset between apps. Previously we had to duplicate the dataset to each and every app that needed to use it, increasing the risk that one dataset was ignored, not refreshed properly or otherwise out of sync with reality. By using dataflows we can have several apps rely on the same dataflow (via a dataset), and thus it is quite possible to have a “master dataset”.

Here is one way to do it:

1. Create an app workspace to keep all the dataflows that you plan to share with different app workspaces. This way you have things neatly ordered and you don’t have to go searching for the dataflow across several app workspaces.

2. Create a dataflow in said app workspace.

3. Create a new app workspace for your app.

4. Create a file using Power BI Desktop that connects to the dataflow from step 2 and create all the visualizations you need.

5. Publish this report to the app workspace from step 3.

6. Publish the app to your users.

 

7. Access the app from “Get Apps > My organization” as a user who has been granted access to the app in step 6.

The beauty of an app is that the dataset underneath has its own refresh cycle apart from the app. That means that the app itself doesn’t have to be refreshed or updated in order for the dataset to update. However – as there is no automatic refresh of a dataset on top of a dataflow, refresh schedules have to be set for both the dataflow AND the dataset. Failure to refresh either (or both!) can lead to some confusion. This all works in Pro as well as Premium.
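To make step 4 a bit more concrete: when Power BI Desktop connects to a dataflow, it generates a Power Query query roughly along these lines. This is a minimal sketch: the workspace and dataflow IDs are placeholders and the entity name is just an example, so the navigation steps in your own file will look slightly different.

let
  // list all dataflows the signed-in user has access to
  Source = PowerBI.Dataflows(null),
  // navigate to the workspace and dataflow from steps 1-2 (the IDs below are placeholders)
  Workspace = Source{[workspaceId = "00000000-0000-0000-0000-000000000000"]}[Data],
  Dataflow = Workspace{[dataflowId = "00000000-0000-0000-0000-000000000000"]}[Data],
  // pick the entity to build the report visualizations on
  Entity = Dataflow{[entity = "Population"]}[Data]
in
  Entity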

Power BI Dataflows example – part 1

With Power BI Dataflows out in public preview and everyone exclaiming how absolutely amazing they are, I decided to put together a bit of an example of how you might use them to “hide” some of the data prep work that goes into cleaning a dataset. To do this I will build on a blog post that Erik Svensen wrote a few years ago, where he uses data from Statistics Sweden retrieved through a JSON API. Exactly what data I use here isn’t quite as important as how I use it, but the idea is to provide a prepared, ready-to-use dataset for an analyst.

The steps are as follows:

  1. Find a datasource containing the data to be analyzed
  2. Using Power BI Desktop, ingest the datasource and apply the necessary transformations to make the data usable
  3. Create a Power BI Dataflow to provide this prepared dataset as a datasource for anyone to use

Let’s say we want to provide a dataset that shows population numbers for Sweden. This can be found at the Statistics Sweden site here. In order to get this data we could save it as a CSV or XLS file and toss it into Power BI Desktop, but as that would make things unnecessarily cumbersome whenever new data is made available, we should instead aim to use the API provided. This is where Erik’s excellent blog post comes into play and explains how to get this data into Power BI. We grab the API URL and JSON query displayed at the bottom of the Statistics Sweden site here:

and then modify the JSON query using Notepad++ like this:

Then we put the whole shebang into Power BI Desktop using a blank query and the following Power Query code (also from Erik’s blog post linked above):

let

PostContents= "{
  ""query"": [
    {
      ""code"": ""Alder"",
      ""selection"": {
        ""filter"": ""agg:Ålder10år"",
        ""values"": [
          ""-4"",
          ""5-14"",
          ""15-24"",
          ""25-34"",
          ""35-44"",
          ""45-54"",
          ""55-64"",
          ""65-74"",
          ""75-84"",
          ""85-94"",
          ""95+""
        ]
      }
    },
    {
      ""code"": ""Kon"",
      ""selection"": {
        ""filter"": ""item"",
        ""values"": [
          ""1"",
          ""2""
        ]
      }
    }
  ],
  ""response"": {
    ""format"": ""csv""
  }
}
",

  // POST the JSON query above to the Statistics Sweden API and read the response as CSV (codepage 1252)
  Source = Web.Contents("http://api.scb.se/OV0104/v1/doris/sv/ssd/START/BE/BE0101/BE0101A/BefolkningR1860", [Content = Text.ToBinary(PostContents)]),
  #"Imported CSV" = Csv.Document(Source, null, ",", null, 1252)
in
  #"Imported CSV"

This will give us the raw result set from Statistics Sweden, but in order to do any useful analysis we need to clean and format the data. We start by promoting the headers, unpivot the years to get them into one column, split some text away, replace a few values and remove and rename a few columns. Basic stuff that most often needs to be done to pretty much all datasets. The final Power Query code looks like this:

let

PostContents= "{
  ""query"": [
    {
      ""code"": ""Alder"",
      ""selection"": {
        ""filter"": ""agg:Ålder10år"",
        ""values"": [
          ""-4"",
          ""5-14"",
          ""15-24"",
          ""25-34"",
          ""35-44"",
          ""45-54"",
          ""55-64"",
          ""65-74"",
          ""75-84"",
          ""85-94"",
          ""95+""
        ]
      }
    },
    {
      ""code"": ""Kon"",
      ""selection"": {
        ""filter"": ""item"",
        ""values"": [
          ""1"",
          ""2""
        ]
      }
    }
  ],
  ""response"": {
    ""format"": ""csv""
  }
}
",

  Source = Web.Contents("http://api.scb.se/OV0104/v1/doris/sv/ssd/START/BE/BE0101/BE0101A/BefolkningR1860",[Content=Text.ToBinary(PostContents)]),
  #"Imported CSV" = Csv.Document(Source, [Delimiter = ",", Encoding = 1252]),
  #"Promoted Headers" = Table.PromoteHeaders(#"Imported CSV", [PromoteAllScalars = true]),
  #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers", {{"ålder", type text}, {"kön", type text}, {"Folkmängd 1860", Int64.Type}, {"Folkmängd 1861", Int64.Type}, {"Folkmängd 1862", Int64.Type}, {"Folkmängd 1863", Int64.Type}, {"Folkmängd 1864", Int64.Type}, {"Folkmängd 1865", Int64.Type}, {"Folkmängd 1866", Int64.Type}, {"Folkmängd 1867", Int64.Type}, {"Folkmängd 1868", Int64.Type}, {"Folkmängd 1869", Int64.Type}, {"Folkmängd 1870", Int64.Type}, {"Folkmängd 1871", Int64.Type}, {"Folkmängd 1872", Int64.Type}, {"Folkmängd 1873", Int64.Type}, {"Folkmängd 1874", Int64.Type}, {"Folkmängd 1875", Int64.Type}, {"Folkmängd 1876", Int64.Type}, {"Folkmängd 1877", Int64.Type}, {"Folkmängd 1878", Int64.Type}, {"Folkmängd 1879", Int64.Type}, {"Folkmängd 1880", Int64.Type}, {"Folkmängd 1881", Int64.Type}, {"Folkmängd 1882", Int64.Type}, {"Folkmängd 1883", Int64.Type}, {"Folkmängd 1884", Int64.Type}, {"Folkmängd 1885", Int64.Type}, {"Folkmängd 1886", Int64.Type}, {"Folkmängd 1887", Int64.Type}, {"Folkmängd 1888", Int64.Type}, {"Folkmängd 1889", Int64.Type}, {"Folkmängd 1890", Int64.Type}, {"Folkmängd 1891", Int64.Type}, {"Folkmängd 1892", Int64.Type}, {"Folkmängd 1893", Int64.Type}, {"Folkmängd 1894", Int64.Type}, {"Folkmängd 1895", Int64.Type}, {"Folkmängd 1896", Int64.Type}, {"Folkmängd 1897", Int64.Type}, {"Folkmängd 1898", Int64.Type}, {"Folkmängd 1899", Int64.Type}, {"Folkmängd 1900", Int64.Type}, {"Folkmängd 1901", Int64.Type}, {"Folkmängd 1902", Int64.Type}, {"Folkmängd 1903", Int64.Type}, {"Folkmängd 1904", Int64.Type}, {"Folkmängd 1905", Int64.Type}, {"Folkmängd 1906", Int64.Type}, {"Folkmängd 1907", Int64.Type}, {"Folkmängd 1908", Int64.Type}, {"Folkmängd 1909", Int64.Type}, {"Folkmängd 1910", Int64.Type}, {"Folkmängd 1911", Int64.Type}, {"Folkmängd 1912", Int64.Type}, {"Folkmängd 1913", Int64.Type}, {"Folkmängd 1914", Int64.Type}, {"Folkmängd 1915", Int64.Type}, {"Folkmängd 1916", Int64.Type}, {"Folkmängd 1917", Int64.Type}, {"Folkmängd 1918", Int64.Type}, {"Folkmängd 1919", Int64.Type}, {"Folkmängd 1920", Int64.Type}, {"Folkmängd 1921", Int64.Type}, {"Folkmängd 1922", Int64.Type}, {"Folkmängd 1923", Int64.Type}, {"Folkmängd 1924", Int64.Type}, {"Folkmängd 1925", Int64.Type}, {"Folkmängd 1926", Int64.Type}, {"Folkmängd 1927", Int64.Type}, {"Folkmängd 1928", Int64.Type}, {"Folkmängd 1929", Int64.Type}, {"Folkmängd 1930", Int64.Type}, {"Folkmängd 1931", Int64.Type}, {"Folkmängd 1932", Int64.Type}, {"Folkmängd 1933", Int64.Type}, {"Folkmängd 1934", Int64.Type}, {"Folkmängd 1935", Int64.Type}, {"Folkmängd 1936", Int64.Type}, {"Folkmängd 1937", Int64.Type}, {"Folkmängd 1938", Int64.Type}, {"Folkmängd 1939", Int64.Type}, {"Folkmängd 1940", Int64.Type}, {"Folkmängd 1941", Int64.Type}, {"Folkmängd 1942", Int64.Type}, {"Folkmängd 1943", Int64.Type}, {"Folkmängd 1944", Int64.Type}, {"Folkmängd 1945", Int64.Type}, {"Folkmängd 1946", Int64.Type}, {"Folkmängd 1947", Int64.Type}, {"Folkmängd 1948", Int64.Type}, {"Folkmängd 1949", Int64.Type}, {"Folkmängd 1950", Int64.Type}, {"Folkmängd 1951", Int64.Type}, {"Folkmängd 1952", Int64.Type}, {"Folkmängd 1953", Int64.Type}, {"Folkmängd 1954", Int64.Type}, {"Folkmängd 1955", Int64.Type}, {"Folkmängd 1956", Int64.Type}, {"Folkmängd 1957", Int64.Type}, {"Folkmängd 1958", Int64.Type}, {"Folkmängd 1959", Int64.Type}, {"Folkmängd 1960", Int64.Type}, {"Folkmängd 1961", Int64.Type}, {"Folkmängd 1962", Int64.Type}, {"Folkmängd 1963", Int64.Type}, {"Folkmängd 1964", Int64.Type}, {"Folkmängd 1965", Int64.Type}, {"Folkmängd 1966", Int64.Type}, {"Folkmängd 1967", 
Int64.Type}, {"Folkmängd 1968", Int64.Type}, {"Folkmängd 1969", Int64.Type}, {"Folkmängd 1970", Int64.Type}, {"Folkmängd 1971", Int64.Type}, {"Folkmängd 1972", Int64.Type}, {"Folkmängd 1973", Int64.Type}, {"Folkmängd 1974", Int64.Type}, {"Folkmängd 1975", Int64.Type}, {"Folkmängd 1976", Int64.Type}, {"Folkmängd 1977", Int64.Type}, {"Folkmängd 1978", Int64.Type}, {"Folkmängd 1979", Int64.Type}, {"Folkmängd 1980", Int64.Type}, {"Folkmängd 1981", Int64.Type}, {"Folkmängd 1982", Int64.Type}, {"Folkmängd 1983", Int64.Type}, {"Folkmängd 1984", Int64.Type}, {"Folkmängd 1985", Int64.Type}, {"Folkmängd 1986", Int64.Type}, {"Folkmängd 1987", Int64.Type}, {"Folkmängd 1988", Int64.Type}, {"Folkmängd 1989", Int64.Type}, {"Folkmängd 1990", Int64.Type}, {"Folkmängd 1991", Int64.Type}, {"Folkmängd 1992", Int64.Type}, {"Folkmängd 1993", Int64.Type}, {"Folkmängd 1994", Int64.Type}, {"Folkmängd 1995", Int64.Type}, {"Folkmängd 1996", Int64.Type}, {"Folkmängd 1997", Int64.Type}, {"Folkmängd 1998", Int64.Type}, {"Folkmängd 1999", Int64.Type}, {"Folkmängd 2000", Int64.Type}, {"Folkmängd 2001", Int64.Type}, {"Folkmängd 2002", Int64.Type}, {"Folkmängd 2003", Int64.Type}, {"Folkmängd 2004", Int64.Type}, {"Folkmängd 2005", Int64.Type}, {"Folkmängd 2006", Int64.Type}, {"Folkmängd 2007", Int64.Type}, {"Folkmängd 2008", Int64.Type}, {"Folkmängd 2009", Int64.Type}, {"Folkmängd 2010", Int64.Type}, {"Folkmängd 2011", Int64.Type}, {"Folkmängd 2012", Int64.Type}, {"Folkmängd 2013", Int64.Type}, {"Folkmängd 2014", Int64.Type}, {"Folkmängd 2015", Int64.Type}, {"Folkmängd 2016", Int64.Type}, {"Folkmängd 2017", Int64.Type}}),
  #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"ålder", "kön"}, "Attribute", "Value"),
  #"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Columns", "Attribute", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv), {"Attribute.1", "Attribute.2"}),
  #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter", {{"Attribute.1", type text}, {"Attribute.2", Int64.Type}}),
  #"Renamed Columns" = Table.RenameColumns(#"Changed Type1", {{"Attribute.2", "Year"}, {"kön", "Sex"}, {"ålder", "Age"}}),
  #"Split Column by Delimiter1" = Table.SplitColumn(#"Renamed Columns", "Age", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv), {"Age.1", "Age.2"}),
  #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1", {{"Age.1", type text}, {"Age.2", type text}}),
  #"Renamed Columns1" = Table.RenameColumns(#"Changed Type2", {{"Age.1", "Age"}}),
  #"Removed Columns" = Table.RemoveColumns(#"Renamed Columns1", {"Age.2", "Attribute.1"}),
  #"Replaced Value" = Table.ReplaceValue(#"Removed Columns", "män", "Male", Replacer.ReplaceText, {"Sex"}),
  #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value", "kvinnor", "Female", Replacer.ReplaceText, {"Sex"}),
  #"Inserted conditional column" = Table.AddColumn(#"Replaced Value1", "AgeSorter", each if [Age] = "0-4" then "1" else if [Age] = "5-14" then "2" else if [Age] = "15-24" then "3" else if [Age] = "25-34" then "4" else if [Age] = "35-44" then "5" else if [Age] = "45-54" then "6" else if [Age] = "55-64" then "7" else if [Age] = "65-74" then "8" else if [Age] = "75-84" then "9" else if [Age] = "85-94" then "10" else if [Age] = "95+" then "" else null),
  #"Changed column type" = Table.TransformColumnTypes(#"Inserted conditional column", {{"AgeSorter", Int64.Type}})
in
  #"Changed column type"

Looking at the data now, it is in a much more useful shape for further analysis.

In order to reuse the work we just went through, we create a Power BI Dataflow. It is done from the Power BI service here, and note that it can only be done from an app workspace – not your own “my workspace”.

Then it’s time to create the dataflow by choosing “add new entities”.

 

There are quite a few data sources available already, even though some that are available in Power BI Desktop have yet to make it to the service (Data Lake Storage, for example). As we already have our Power Query code done, we’ll go ahead and choose a blank query that lets us write the Power Query code ourselves.

Next it’s just a question of pasting the Power Query code from before into the window like so, and click next:

If all goes according to plan (remember that this IS a preview feature and some parts of Power Query have yet to be implemented) the result should be identical to what we saw earlier in Power BI Desktop:

 

Let’s change the name of the entity from “Query” to “Population” and click done.

There, that was the first entity of the dataflow done! We could add more entities, but for now we’ll just go ahead and save the dataflow by clicking save.

We’ll give the dataflow a useful name, a description and click save again.

Upon saving we have the opportunity to set the refresh schedule of the dataflow if we so desire. Unless the dataflow has been refreshed at least once, there will be no data available when we go to use the dataset. The dataflow refresh settings can also be accessed here:

 

With this freshly minted dataflow in hand, let’s go back to Power BI Desktop and see how we can use it. Start by clicking “Get Data – Power BI – Power BI dataflows (Beta)”

After logging in we are presented with the available dataflows:

Unless the dataflow has been refreshed at least once, we will receive an error when clicking the entity:

Provided the dataflow has been refreshed, however, it looks exactly like any other datasource and we can go ahead and load or edit the query as usual:

So far we’ve removed some of the tedious work of prepping the data for further analysis, as anyone wanting to use this dataset no longer has to go through all the preparation steps to get the data into a useful format.

 

Custom date formats in Power BI

Until the Power BI team gives us the ability to specify a custom date format on a column, one has to jump through a few hoops to get a custom date format recognized as an actual date. I finally decided to blog about it as it seems a common enough issue, with a solution that is as simple as it is convoluted. Here goes.

 

As you can see, Power BI has failed to recognize this data as a date field and has decided to go for a numerical field. However, if we just try to change the format to a date, we are rewarded with this instead:

Not very helpful, I’m afraid. Clicking on the error message brings up a dialog that says that Power BI could not convert the field to a date – as there are just a few very select date formats that Power BI actually recognizes.

In order to get the original data (20180522 in this specific case) into a format that Power BI *will* recognize, one way is to do the following.

First, let’s split the column into the individual date parts. This can be done using “split column”, located on the Transform tab:

Unfortunately we can’t do the whole thing in one go, so we’ll do it in a few steps. The first step is to split the column by number of characters, like this:

This will give us one column with the four leftmost characters (2018) and one column with the rest. See where this is going? Right, but here is where it gets a tad evil.

You see, Power BI has already automatically converted the result to a numerical data type. That’s all fine and dandy, but it will wreak havoc with the next step, as that will be splitting the second column just like we did with the first one. As the month part (05 in this case) has a leading zero, if we let Power BI convert this to numeric, the zero will be gone and the split will produce the wrong output. This is what we want to have at this stage – note the ABC icons for text instead of the 123 for numbers.

After doing the splitting jig again, we’re left with this:

…and Power BI went ahead and changed the data type to numeric AGAIN! Remove the “changed type” step and crack on – time to combine the columns into a usable field. Select the three columns we want to combine, and look for the “Merge Columns” button on the “Add Column” tab.

Power BI expects a date to behave in a certain way – one of the acceptable behaviors is having a dash between the date parts. When merging the columns we can select a custom separator and get the friggin’ dash in there, like so:

We are now treated to a new and shiny column called “MeasureDate” and Power BI thinks it’s a string. That’s OK, we’ll set Power BI straight in a second.

So can we NOW kindly have Power BI accept this as a date? Why, yes, thank you for asking! Power BI can easily interpret the above as a date, and we’ll end up with something like this (note the calendar icon of a date):

Now all that is left is housekeeping by selecting the three columns that we no longer need, right-clicking and choosing “Remove Columns”.

A quick column reorder later, we’re finally back where we started – but this time with a date field instead of a numerical one.
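For reference, the generated Power Query code for the whole dance ends up looking roughly like the sketch below. This is a minimal sketch assuming a column called MeasureDate holding yyyymmdd values; it does the split in a single step by character positions rather than two separate splits, and since Table.CombineColumns replaces the split columns, the final cleanup step isn’t needed here.

let
  // hypothetical sample data: yyyymmdd values stored as whole numbers
  Source = #table(type table [MeasureDate = Int64.Type], {{20180522}, {20181231}}),
  // work on text so the leading zero in the month part survives
  AsText = Table.TransformColumnTypes(Source, {{"MeasureDate", type text}}),
  // split into year / month / day by character position
  Split = Table.SplitColumn(AsText, "MeasureDate", Splitter.SplitTextByPositions({0, 4, 6}), {"Year", "Month", "Day"}),
  // merge the parts back together with dashes, a format Power BI recognizes as a date
  Merged = Table.CombineColumns(Split, {"Year", "Month", "Day"}, Combiner.CombineTextByDelimiter("-"), "MeasureDate"),
  // and finally change the type to an actual date
  Typed = Table.TransformColumnTypes(Merged, {{"MeasureDate", type date}})
in
  Typed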

When you get tired of this clunky approach to what shouldn’t be a problem from the beginning, check out Lukas Lötter’s elegant PowerQuery function approach to solving the same problem here: https://www.oraylis.de/blog/local-date-formats-in-power-bi

 

Combining tables in Power BI and finding two(?) bugs

A student of mine had a question about combining data from multiple Excel sheets and doing calculations on the data. I realized I didn’t have a clear-cut answer so I decided to do some digging. The scenario looked like this:

I created three Excel files, each with a column for name and one for value. The idea is to simulate sales persons and sales amounts, with one file per product. I called the first file ProductA, the second ProductB and the third ProductC.

 

pbiblog1

 

Then it was time to combine the tables. I wanted the result to be a table of all the data from the other tables, but it should also include where the data came from – i.e. name, sales amount and product. This can be done in several ways, and one is to add a step to the loading of the Excel files that adds a column with a static name:

pbiblog2
and then appending them together, like this:

pbiblog3

pbiblog4
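In Power Query terms, the static approach boils down to something like this minimal sketch, assuming the three queries are named ProductA, ProductB and ProductC (the exact column names inside them don’t matter here):

let
  // tag each table with the product it came from (the query names are assumptions from the example)
  TaggedA = Table.AddColumn(ProductA, "Product", each "ProductA"),
  TaggedB = Table.AddColumn(ProductB, "Product", each "ProductB"),
  TaggedC = Table.AddColumn(ProductC, "Product", each "ProductC"),
  // append the three tables into one
  Combined = Table.Combine({TaggedA, TaggedB, TaggedC})
in
  Combined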

 

The trouble is that this is static. There had to be a better way of doing it, and after spending a few hours trying to find one I asked Reza Rad (@Rad_Reza, radacad.com), one of the most knowledgeable (and kind!) Power BI people on the planet. He showed me the #shared construct. #shared lists all the Power Query (M) functions and queries in the current file, like this:

pbiblog5

This is very useful for many other applications, but in this case the goal was to remove the need for static table names. The first rows contain the tables we’ve been using, so filtering down to those rows and expanding the data might give us what we need:

pbiblog6

This looks exactly like what we were after! With just a slight tweak of the column names we’re good to go. Turns out there’s a slight problem, though. As soon as I click apply, the whole thing breaks:

pbiblog7

After some discussion with Reza we came to the conclusion that there is a bug in the Table.FirstN function used to filter out the rows. Using Table.Range or Table.RemoveLastN works much better. I’ve logged a bug at the PowerBI.com forums, so hopefully this will be taken care of in the future.

Here is the Power Query code that does what I just did:

pbiblog8
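As a minimal sketch, the same idea in code form could look something like this. It selects the product tables by name with Table.SelectRows instead of taking the first N rows, which also sidesteps the positional filtering that triggered the bug; the query names and expanded column names are assumptions based on the example above.

let
  // #shared is a record of every query and function in the current file; turn it into a table
  Source = Record.ToTable(#shared),
  // keep only the rows that are our product tables
  ProductTables = Table.SelectRows(Source, each ([Value] is table) and Text.StartsWith([Name], "Product")),
  // expand the nested tables (the inner column names are placeholders)
  Expanded = Table.ExpandTableColumn(ProductTables, "Value", {"Name", "Value"}, {"SalesPerson", "Amount"}),
  // the outer Name column tells us which product each row came from
  Renamed = Table.RenameColumns(Expanded, {{"Name", "Product"}})
in
  Renamed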

Reza did his own blog post on this technique as well as shared another nifty tip: using #sections. Take a look at Reza’s blog!

 

*edit* It seems there might be another bug here, as the data won’t show up back in PBI Desktop, despite everything looking as it should in the query editor. Stay tuned…

SQLHangout #38 and Microsoft TechDays

My life is … hectic, to say the least. I just finished a talk at Atea IT-arena in Karlstad last week, as well as a 60-minute recap of Microsoft Ignite in Atlanta that I held yesterday for 40 people at AddSkills in Linköping. The crowd was very attentive and lots of good questions were raised. Hopefully even more Swedes will go to Ignite next year! I’m preparing to speak at the Atea IT-arena in Norrköping in late November as well as teaching two courses in December: 10986 (Upgrading your administration skills to SQL Server 2016) and 10989A (Analyzing data with Power BI). Full speed ahead!

At 1800 CET on Monday the 17th of October I have the pleasure of joining MVP Cathrine Wilhelmsen for a chat about career transitions in IT. I’ve spent close to 20 years deep under the hood of database systems and only recently crawled up and decided to tackle something new – data visualization and the cloud. Or is it really new? That’s what we’ll cover on Monday, so be sure to tune in and don’t hesitate to tweet us if you have any questions!

SQLHangout #38 live stream

In November it’s time for Microsoft TechDays in Stockholm. I’m happy to say that I’ve received a speaking slot where I will be talking about Azure SQL Database – the cloud awakens. Or, as attendees will find out – what REALLY happened on the Death Star and what led to the downfall of the Empire. Few people know the inner workings of these events, and even fewer have realized that they had a lot to do with IT…

SQL Database: the Cloud Awakens

 

Reducing SSIS anxiety

Story time again. This Tuesday I was to create a set of import packages in SQL Server Integration Services (SSIS) to automate data loading from a couple of my customers’ storage systems. Said data comes in the shape of several text files – some delimited, some fixed length, all of varying size and shape. Anyone who’s worked with SSIS knows how “fun” it is to click-click-click your way through the incomprehensibly boring GUI – a GUI that does NOT lend itself to efficient work at all. We had already been struggling with this for a couple of days before summer, but the data format has since changed and it’s basically the same amount of work to redo the whole thing from scratch as to try to edit all the tiny details in all the gazillion boxes everywhere.

I can’t really say that I was looking forward to this work. Again. And I knew that if we made this proof-of-concept work, we’d have to do it several times over for each and every customer that wanted in on this idea. Ugh. Then something interesting happened.
This weekend I had the honour of presenting at SQL Saturday #536 in Gothenburg and I had the pleasure of meeting several amazing people. Among these was a Norwegian MVP named Cathrine Wilhelmsen, whose presentation “From accidental to efficient BI developer” I decided on a whim to attend. She showed some interesting things about SQL Server Management Studio that I had no idea about, and I felt that those tidbits alone were well worth my time listening to someone talk about a subject I know basically nothing about.

Then she whipped out BIML and everything I thought I knew about SSIS went out the window. BIML stands for Business Intelligence Markup Language and is basically an XML wrapper language for generating SSIS packages. This way I can write a bunch of XML code and generate a package, or a hundred, over and over again. I don’t have to click in forty different places to change one thing in forty packages, I just change one thing in the XML and regenerate the packages. This. Is. Awesome. She showed us some websites and her blog, and home I went to study.

Tuesday came around, and I first spent about 20 minutes outlining BIML for my colleague, and then we set off – basically from scratch. We are both old hands at programming in different languages, so that was nothing new for either of us. Using different websites, examples, blogs and the like we managed to do in about six hours what we had more or less failed to do in three days. We now have a few hundred lines of code that does exactly what we want, is well documented(!), is easy to read and above all – is extensible. When I need to add a new file or two to the input, I copy and paste some code. Need to add a derived column? Sure thing, a tad more XML. And so on and so forth. We’ve barely scratched the surface of what BIML can do, but the key takeaway is this: I went from zero knowledge of BIML and a VERY difficult relationship with SSIS to a fair grasp of the basics of BIML and a newfound appreciation of the power of SSIS in about six hours. When was the last time you had such a return on invested time? Exactly.
Now, BIML is definitely not new, but as a DBA I had not come across it earlier. Now I will sing the praises of BIML to all my DBA colleagues.

Now – I would not have attended Cathrine’s session had I not been invited to SQL Saturday in Gothenburg. I would not have been invited to Gothenburg had I not written an abstract and sent it for review to the organizers of said SQL Saturday. I would not have had the idea to send in an abstract had I not met representatives of PASS (the association for SQL Server and BI professionals) at PASS Summit in Seattle and seen first-hand what the community can do. I’ve only been a part of the PASS community for a couple of years, but I can’t praise it enough. The community is about sharing above all – knowledge, contacts, the works. The people I’ve met have all been friendly, caring, eager to help and very, very good. I rate getting involved in the community as one of the best and most pivotal choices of my professional career. My one regret is that I’m a bit late to the party. But I’m bringing cookies and 20 years’ worth of experience…

Useful links:
http://www.cathrinewilhelmsen.net/
http://bimlscript.com
http://www.sqlservercentral.com/stairway/100550/
https://bidshelper.codeplex.com/