THE SQL Server Blog Spot on the Web

Welcome to - The SQL Server blog spot on the web Sign in | |
in Search

SQLBI - Marco Russo

SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server.
You can follow me on Twitter: @marcorus

  • The new data modeling book for Power BI and Power Pivot users

    In the last few years, I and Alberto Ferrari assisted many users of Power Pivot and Power BI who wanted to create their reports using these tools and were struggling with getting the desired numbers from their data. The first approach of an Excel user is to look for a function, or a more complex DAX expression, that can implement the calculation required. In a similar way, this is also the approach of Power BI users that don’t have a database design background.

    Several times, before looking for a correct DAX expression, the problem is defining the correct data model, putting data in the right tables, and creating the proper relationships. The tools to create queries in M (Power Query in Excel, now called Get Data, and the Query Editor in Power BI) are wonderful to help the users in doing the proper massage to the data. However, these tools cannot help users that don’t know how to properly define the right data model. For this reason, when we thought to a new version of the book about Power Pivot (we previously wrote one for Excel 2013/2016 and one for Excel 2010) we considered that the DAX chapters were no longer necessary (because we have The Definitive Guide to DAX for that), so we decided to dedicate an entire book to the data modeling, targeted to business users that usually do not have such a skill, or that learned that by trial and errors and without a more structured approach.

    Because the concepts are identical for Power Pivot and Power BI, we wrote a single book that target both products: Analyzing Data with Power BI and Power Pivot for Excel. Several examples are created in Power BI Desktop (because it is free and available to anyone), but certain reports are created using pivot tables in Excel. The goal is to teach the concepts, rather than providing formulas to copy and paste. In fact, the reader should try to apply the same ideas to its own data, recognizing the data modeling patterns described in the chapters of the book.

    We tried to minimize the use of theoretical terms, trying to introduce with very practical examples and design patterns the terminology that is commonly used in data modeling (such as fact, dimensions, normalization, denormalization, star schema, snowflake schema, and so on). Thus, is this book for you?

    • If you are an advanced Excel user that adopted Power Pivot or Power BI, definitely yes.
    • If you are a BI developer, you should already know the theory, and this book could be useful to see practical examples using a model in these tools. Thus, it could be useful (well, several models are really easy to implement as design patterns), but you should not learn new concepts.
    • If you are a data analyst that moved to Power BI from other tools, then it depends on your background. You might be in the middle of the two cases described above, but in any case the book should be useful for you, too.

    At this point, you might wonder why we did not include the words “data modeling” in the book title. The simple answer could be “marketing”, but the real answer is more complex. A business user hardly recognizes the need of data modeling skills. He/she just want to obtain a result, analyzing its own data. He/she might think that data modeling is something for DBAs, and is not related to reporting. However, when you create a report in Power Pivot and Power BI, usually you are creating a data model (unless you create a live connection to an existing Analysis Services database). For this reason, the title should help these users to find this book and to take look at its description. Yes, it is a book for data modeling targeted to readers that do not know that they need such a skill. Thus, the reason for this title is “communication”: We hope that the book description is clear enough to avoid any misunderstanding!

    This is the table of contents of the book:

    • CHAPTER 1 – Introduction to data modeling
    • CHAPTER 2 – Using header/detail tables
    • CHAPTER 3 – Using multiple fact tables
    • CHAPTER 4 – Working with date and time
    • CHAPTER 5 – Tracking historical attributes
    • CHAPTER 6 – Using snapshots
    • CHAPTER 7 – Analyzing date and time intervals
    • CHAPTER 8 – Many-to-many relationships
    • CHAPTER 9 – Working with different granularity
    • CHAPTER 10 – Segmentation data models
    • CHAPTER 11 – Working with multiple currencies
    • APPENDIX A – Data modeling 101

    Remember, when the data model is correct, the DAX code you need is simpler, shorter, and more efficient. The first rule is always “keep it simple”, and a good data model is always the first step in the right direction!

  • Small benchmark for Tabular and Power BI formula engine (FE) #ssas #tabular #powerbi

    If you run SSAS Tabular or Power BI, please help me in gathering some performance data for a study I'm doing about Formula Engine performance following these steps: 

    1) Copy the PowerShell code below in a script Benchmark.ps1, modifying serverName, ssasInstanceName and databaseName so that they point to an existing database of an instance of Analysis Services Tabular. If you use a default instance, assign an empty string to ssasInstanceName. Any database that exists is valid. If you want to connect to Power BI, retrieve the address using DAX Studio and write the complete address in the serverName variable (e.g. "localhost:1234")

    2) Run the PowerShell code redirecting the output to a Benchmark.json file (eg. Test.ps1 > Benchmark.json)

    3) Send me the JSON file in a mail (even copy/paste the content is ok) to marco (at) sqlbi [dot] com

    Thanks in advance for your help!


    ## Specify the name of the server where SSAS runs
    $serverName = "." 
    ## Specify the instance name of SSAS Tabular (use empty string to use the default instance)
    $ssasInstance = "TAB16"
    #Specify any database available on SSAS Tabular (the database must exists, no data is read from there)
    $databaseName = "Contoso"
    ## End of parameters

    if ($ssasInstance.Length -ne 0) {
        $ssasInstance = "\" + $ssasInstance
    $null = [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices")
    $server = New-Object Microsoft.AnalysisServices.Server

    $null = [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices.AdomdClient")
    $connStr = "data source=$serverName$ssasInstance;Initial catalog = $databaseName"
    [Microsoft.AnalysisServices.adomdclient.adomdconnection]$cnn = new-object Microsoft.AnalysisServices.adomdclient.adomdconnection($connStr)
    $cmd = new-Object Microsoft.AnalysisServices.AdomdClient.AdomdCommand
    $cmd.Connection = $cnn
    $cmd.CommandText = @"
    ROW (
        "x", COUNTROWS (
            CROSSJOIN (
                SELECTCOLUMNS ( CALENDAR ( 1, 10000 ), "Num1", INT([Date]) ),
                SELECTCOLUMNS ( CALENDAR ( 1, 10000 ), "Num2", INT([Date]) )
    $sw1 = New-Object Diagnostics.Stopwatch
    $sw2 = New-Object Diagnostics.Stopwatch
    $sw3 = New-Object Diagnostics.Stopwatch
    $dr = $cmd.ExecuteReader()
    $dr = $cmd.ExecuteReader()
    $dr = $cmd.ExecuteReader()
    $colItems = Get-WmiObject -class "Win32_Processor" -namespace "root/CIMV2" -computername $serverName 
    $s_version = $server.Version
    $run1 = $sw1.Elapsed
    $run2 = $sw2.Elapsed
    $run3 = $sw3.Elapsed
    foreach ($objItem in $colItems) { 
        $cpuId = $objItem.DeviceID
        $cpuModel = $objItem.Name
        $cpuCores = $objItem.NumberOfCores
        $cpuMaxSpeed = $objItem.MaxClockSpeed
        $cpuCurrentSpeed = $objItem.CurrentClockSpeed
        $cpuStatus = $objItem.Status
        ## We only consider the first CPU
    Write-output "{"
    Write-output "    ""Benchmark"": ""FE CROSSJOIN 10k*10k"","
    Write-output "    ""SSAS_Version"": ""$s_version"","
    Write-output "    ""Run_1"": ""$run1"","
    Write-output "    ""Run_2"": ""$run2"","
    Write-output "    ""Run_3"": ""$run3"","
    Write-output "    ""CPU_ID"": ""$cpuId"","
    Write-output "    ""CPU_Model"": ""$cpuModel"","
    Write-output "    ""CPU_Cores"": ""$cpuCores"","
    Write-output "    ""CPU_MaxSpeed"": ""$cpuMaxSpeed"","
    Write-output "    ""CPU_CurrentSpeed"": ""$cpuCurrentSpeed"","
    Write-output "    ""CPU_Status"": ""$cpuStatus"""
    Write-output "}"

  • Avoid measure and column with the same name in #dax

    I recently wrote an article about duplicated names in DAX to highlight a possible case where you can have a measure with the same name of a column. This is not a good idea, mainly because the syntax to reference a measure and a column is identical, and this similarity is the root of a number of issues. It would be much better if the measure reference had a syntax different from the column reference.

    The syntax table[identifier] could identify either a column or a measure, and there is no secure way to understand that by just reading the code (unless you look at the color highlighting). I thought to a list of techniques to disambiguate that syntax, and the only available one is to modify the measure reference (not the column reference) in one of these ways:

    • CALCULATE ( table[identifier] ) is a syntax that certainly reference a measure, if table[identifier] is a column, the lack of a row context in CALCULATE would generate an error
    • table[identifier]() is another way to invoke a measure, which in reality corresponds to CALCULATE ( table[identifier] ); if you use DAX Formatter, a syntax like [measure]() is transformed in CALCULATE ( [measure] ); however I might consider modifying DAX Formatter if it was necessary. However, I don’t think that a syntax such as [measure]() is more readable.

    If it was possible to refactor the DAX syntax, I would use a different syntax for a measure reference:

    • (measure)
    • {measure}
    • [other ideas?]

    It’s probably too late for a similar discussion, but you never know.

    Feedbacks are welcome.

  • New edition of the Analysis Services Tabular book #ssas #tabular

    Last year I and Alberto Ferrari wrote a new edition of the Analysis Services Tabular book, which official title is Tabular Modeling in SQL Server Analysis Services (2nd Edition). Some delay in editing and production delayed the printing, but the book is finally available, so I can present it with a dedicated blog post, explaining its goals and contents, so you can evaluate whether it could be interesting for you.

    First of all, this is the second edition of the book we wrote with Chris Webb a few years ago for Analysis Services 2012. Several chapters did not require particular changes, and even if we updated many parts of these chapters to the new version, a great thank you goes to Chris, who allowed us to base our work on his content, too. For an unfortunate series of events and crossing edits, we didn’t include Chris in the acknowledgments of the book, so this blog post is just a first attempt to publicly address that – forgive me, Chris!

    Compared with the previous edition, we removed all the chapters about DAX, keeping only a small introduction of this language, because we have an entire book dedicated to it! This choice provided us additional time to cover other topics in more detail. We covered only the compatibility level 1200 in this new book, so if you have to develop a model using the model 1100 or 1103, you probably will find a better reference in the previous edition.

    This is the list of the chapters of this book, followed by a small comment of the main changes made in this edition.

    CHAPTER 1 Introducing the tabular model
    CHAPTER 2 Getting started with the tabular model
    CHAPTER 3 Loading data inside Tabular
    CHAPTER 4 Introducing calculations in DAX
    CHAPTER 5 Building hierarchies
    CHAPTER 6 Data modeling in Tabular
    CHAPTER 7 Tabular Model Scripting Language (TMSL)
    CHAPTER 8 The tabular presentation layer
    CHAPTER 9 Using DirectQuery
    CHAPTER 10 Security
    CHAPTER 11 Processing and partitioning tabular models
    CHAPTER 12 Inside VertiPaq
    CHAPTER 13 Interfacing with Tabular
    CHAPTER 14 Monitoring and tuning a Tabular service
    CHAPTER 15 Optimizing tabular models
    CHAPTER 16 Choosing hardware and virtualization

    The introduction has the same structure of the first chapter in the previous edition, and it covers the architecture of Analysis Services Tabular and its role in the Microsoft BI platform. Power BI was not even in the plans when we wrote the first edition, so you can imagine this was not just a simple search and replace of version numbers, many things changed since 2012.
    The chapter 4 is similar to the DAX Basics of the previous edition, but we updated a few details about the syntax, providing the foundation to write very simple calculations. This is the only chapter dedicated to DAX in this new book.

    The chapter 5 about hierarchies and the chapter 6 about data modeling were present also in the previous edition. The chapter 7 is brand new chapter about the Tabular Scripting Model Language. Beside the technical information, I think that this chapter is useful to clarify the role of TMSL in the new compatibility level (1200). The chapter 8 has been updated considering Power BI as the main client tool for Tabular, instead of Power View (which was the reference in 2012).

    The chapter 9 about DirectQuery has been completely rewritten, and we also wrote a separate whitepaper about DirectQuery that complements the content of this chapter in the book.

    The chapters 10-11-12 have been updated (considering also relationships with bidirectional filters) describing security, processing, partitioning, and VertiPaq internals. In particular, the VertiPaq chapter has many more details that will help you also in troubleshooting and optimization.

    The chapter 13 is another chapter that has been written from scratch, because the previous APIs are no longer available, and a new (better but different) one is now the reference library for PowerShell and C# scripts.

    The chapter 14 is an update of the chapter present also in the previous edition, whereas chapters 15 and 16 provide more space to topics (optimization of large structure and hardware/virtualization selections) that were cited in the previous edition very shortly, without the insights that you will find in this edition

    I hope you will enjoy this book!

  • Using and optimizing DirectQuery in #powerbi and #ssas #tabular

    More than one year ago, I and Alberto Ferrari started to work on DirectQuery, exploring the new implementation appeared in Analysis Services 2016 for the Tabular models, which uses the same engine of Power BI. The previous version of DirectQuery appeared in SSAS Tabular 2012 was very slow and suffered of performance issues also using small models. We have never seen it adopted in a production environment, it was very hard to optimize, and there were too many restrictions to the data model (for example, no MDX queries = no PivotTable in Excel as a client, and no time intelligence = complex DAX code to write even for trivial calculations).

    For these reasons, when we worked on Analysis Services 2016 and Power BI, we wanted to push the new version of DirectQuery, which solved many of the limitations of the first implementation, discovering its limits and where it could be used in a real production environment. The results of our efforts is now available in a white paper published by Microsoft: DirectQuery in Analysis Services 2016. The document contains a detailed step-by-step guide to implement DirectQuery in your model, and then investigate on how DirectQuery transforms a DAX or MDX query into one or more query in SQL. If you want a quick answer to the question “should we use DirectQuery?”, then the answer is…. it depends!

    You have to set your expectations to a correct level. You can use DirectQuery successfully in certain conditions (database size, frequency of refresh, capacity and performance of your RDBMS server). You certainly have to test the particular workload you want to apply to DirectQuery. The best suggestion is: try and measure it. The whitepaper will provide you many hints about what you should try and what you should expect. Specific measures we made could be different today, because there are often new release of this technology, and we can expect more improvements going forward. But you can certainly start to spend time testing it, and if you understand how it work, you can figure out what are the scenarios where you might want to adopt it.

    The white paper can be used also for Power BI: the engine is the same, we will probably see many improvements very soon, and maybe that the version of the engine you are using is already much better than the ones we used writing the whitepaper. However, the basic concepts are the same, and you can see and measure the improvements by repeating the same tests we described in the document.

    Out of the records, and in a very informal way, I would like to highlight that you should try DirectQuery with a correct attitude. Don’t expect the magic box, and don’t assume it will be bad, it could surprise you in both cases! Most important, think about why you might want to use this technology.

    The right reasons for using DirectQuery are:

    • You need real-time queries. Processing window creates latency that you cannot accept. Good reason. Just ask to yourself if your customers are ready for real-time queries in a dashboard. You might discover they are not.
    • You have a small database to query, that changes often.
    • You have a large amount of data, too big to fit in memory, and you *don’t* need queries returning a result in less than 2 seconds.

    There are also two very wrong reasons for choosing DirectQuery:

    • Saving money. I think that the in-memory Tabular model has a lower cost considering hardware+software+maintenance in order to provide the same level of performance.
    • Improving query performance. On the same hardware, it’s very hard. On a different hardware, maybe… but see the previous point about the cost. And also consider network latency (you run the RDBMS on another server, right?).

    I don’t want to scare you. I just want to set the right level of expectations. With that in mind, you can implement successful projects using DirectQuery today.

  • Unexpected side effects of table filters in #dax

    One of the side effects that you might not expect from a table filter in a CALCULATE statement is the removal of an existing filter. In reality, this is a condition that happens when you write an explicit CALCULATE in a row context, because the table filter overrides the context transition in case you filter the same table that generated the context transition.

    In practice, the following query return the same value for every color, ignoring the context transition for Product[Color]:

        VALUES ( Product[Color] ),
        "Sales", CALCULATE(
            [Average Sales Amount],
            FILTER ( Sales, Sales[Quantity] > 3 )

    If you know the context transition and you think this is strange, Alberto wrote a good article explaining this particular situation: read Context Transition and Expanded Tables on SQLBI web site!

  • New syntax in DAX: IN operator, table and row constructors #dax #powerbi

    I recently wrote an article to described in more details the syntax of the IN operator in DAX, which is available in Power BI and in the CTP of the next release of Analysis Services.

    Jeffrey Wang wrote a blog post about the IN operator a few months ago, so I waited some time to include more details about how it internally works and how to use it when there are two or more columns involved. 

    If you just need the simpler syntax for a single column, well, now instead of an infinite list of OR condition, you can write:

    table[column] IN { 1, 3, 5, 7, 11 }

    However, if you want to understand the implications of the new syntax of the IN operator and of the related table and row constructors, then I suggest you to read this article carefully!

  • Mark as Date table in Power BI #dax #powerbi

    One year ago I wrote an article describing how the time intelligence DAX functions work in Power BI (I just updated the article including the example described below). In a recent event in Sydney I observed a strange (or at least unexpected) behavior of Power BI during a demo, and I think it is interesting to share a few considerations about it.

    As you can read in the article linked above, I wrote that when you apply a filter to the date column of a table, and this column is used in a relationship, then the table is considered a Date table even if it is not marked as a Date table (you cannot do that in Power BI). During my demo, the Date table had two relationships: one with Sales using an integer column, and one with Purchases using a date column. My expectation was that when I was using a calculation involving only Date and Sales tables, a time intelligence filter would have required an ALL ( Date ) condition in the CALCULATE statement, but this was not the case. Let me show this with an example.

    Consider this formula:

    SalesYTD1 := CALCULATE ( SUM ( Sales[Amount] ), DATESYTD ( 'Date'[Date] ) )

    If the relationship between Sales and Date is made using an integer column in Power BI, I would expect the formula to not provide any YTD calculation unless I rewrite it as:

    SalesYTD2 := CALCULATE ( SUM ( Sales[Amount] ), DATESYTD ( 'Date'[Date] ), ALL ( 'Date' ) )

    However, SalesYTD1 was working during my demo, despite I had a relationship based on a DateKey column that was an integer value (such as 20170222). After some investigation, I realized that the other relationship connecting Purchases and Date, even if not used in the two measures above, was enough to consider the Date column as a primary key of the Date table, and this produces an automatic addition of the ALL ( 'Date' ) function in the CALCULATE statement when a filter is applied to 'Date'[Date].

    Thus, the lesson is that even if you cannot have Mark as Date table in Power BI, you can obtain the same result by creating an hidden empty table with a column Date in the data model, creating a 1-to-many relationship between Date[Date] and the table you created. For example, you can use MarkAsDateTable=FILTER(CALENDARAUTO(),FALSE) to obtain such a table and then create the relationship and hide the table.

    You can download a working version of this example in the updated demo of the article Time Intelligence in Power BI Desktop

  • Using SUMMARIZECOLUMNS instead of SUMMARIZE in #dax #powerbi #ssas #tabular

    One of the latest addition to DAX has been the ability to apply an external filter context to SUMMARIZECOLUMNS, which makes this function a perfect replacement for SUMMARIZE in DAX. In The Definitive Guide to DAX I introduced such a function, but at the time of writing it was used only by Power BI and it wasn’t possible to include it in a measure, because of the limitation that now has been removed. Using SUMMARIZECOLUMNS you can write:

    'Date'[Calendar Year], 
        "Sales Amount", SUMX ( Sales, Sales[Quantity] * Sales[Unit Price] )

    instead of:

    'Date'[Calendar Year], 
        "Sales Amount", SUMX ( Sales, Sales[Quantity] * Sales[Unit Price] )

    A more complete description of this new function is available in the Introducing SUMMARIZECOLUMNS article I wrote on SQLBI.

  • Year-over-year comparison using the same number of days in #dax

    When you use the time intelligence functions in DAX, it is relatively easy to filter the same dates selection in the previous year by using the SAMEPERIODLASTYEAR or DATEADD functions. However, if you follow the best practices, it is likely that you have a full date table for the current year, which includes many days in the future. If you are in the middle of March 2017, you have sales data until March 15, 2017, so you might want to compare such a month with the same number of days in 2016. And the same when you compare the Q1, or the entire year.

    A common solution is to translate this request in a month-to-date (MTD) or quarter-to-date (QTD) comparison, but depending on how you implement this, you might not obtain a reliable result. For example, you might assume that the current date on your PC is the boundary of the dates you want to consider, but you probably have a few hours if not days of latency in the data in your database, so you should constantly fix the offset between the current day and the last day available in your data.

    Thus, why not simply relying on the data you have to make an automatic decision? This is the purpose of the technique described in the article Compare equivalent periods in DAX that I wrote on SQLBI, where I show several approaches optimized for Power BI, Excel, and SSAS Tabular, which are different depending on the version you use.

    Personally, the version I prefer is the one with the variables in DAX:

    [PY Last Day Selection] :=
    VAR LastDaySelection =
        LASTNONBLANK ( 'Date'[Date], [Sales Amount] )
    VAR CurrentRange =
        DATESBETWEEN ( 'Date'[Date], MIN ( 'Date'[Date] ), LastDaySelection )
    VAR PreviousRange =
        SAMEPERIODLASTYEAR ( CurrentRange )
        IF (
            LastDaySelection > MIN ( 'Date'[Date] ),
            CALCULATE ( [Sales Amount], PreviousRange )
    It’s longer, but much more readable. More details and examples you can download for Excel and Power BI are available in the article.
  • VertiPaq Analyzer 1.7: a fundamental tool for #powerbi #powerpivot #ssas #tabular diagnostics and documentation

    If you work with Power Pivot, Power BI, or Analysis Services Tabular, you probably already know VertiPaq Analyzer. If you never used it, with the new release 1.7 you have another couple of very good reason to try it.

    VertiPaq Analyzer is a Power Pivot workbook that extracts information from data management views (DMVs) of a Tabular model. It can be connected to a SSAS Tabular or a Power BI database. If you have a Power Pivot data model, just import it in Power BI and extracts the information from there. You will quickly see what are the tables and columns that are more expensive, and you will have all the details about cardinality of columns and tables at a glance. This is of paramount importance when you have to optimize a DAX expression. However, these are the features we already had in VertiPaq Analyzer. What’s new in this release?

    • There are two versions of the VertiPaq Analyzer, which is an Excel file with VBA macros now (for the new VertiPaq Analyzer ribbon)
      • The version with suffix 1103 uses the “legacy” DMVs for Multidimensional
      • The version with suffix 1200 starts using the new DMVs for Tabular (which are available only for compatibility level 1200 – more on that in this article.
    • There is a ribbon for VertiPaq Analyzer that includes a button to format all the DAX expressions:
      DMV Size 13
    • And yes, there are three new worksheets that show all the measures, calculated columns, and calculated tables extracted from the data model, each one with its own DAX expression
    • Last but not least, the Relationships worksheet shows in a clear way the tables and columns involved for each relationship, including two measures about the maximum cardinality of the columns involved in the relationship.

    This tool is amazing to do remote troubleshooting, or even just to document what are the DAX formulas you have in a model at a given point in time. It saves me hours every week. I hope it will be useful to you, too!

  • Santa Claus brings presents to #dax, #powerbi, and #ssas #tabular users

    If you think that the end of the year would have been a quiet time because the Power BI team will skip their monthly release (but are we sure?), you might be reassured (or disappointed depending on your perspective) looking at the news I have in this blog post.

    SSAS Tabular vNext CTP 1.1
    In 2017 Microsoft will release a new version of Analysis Services. We already have a preview (CTP 1.1) that has been released last week. Christian Wade wrote a nice blog post about what’s new in this preview (and Christian Wade donated the BISM Normalizer to the community as an open source tool – yes, it’s free now!). We have a new compatibility level (1400) and I will not repeat the list of new features. But there is M integrated in SSAS Tabular. And you have full control over drillthrough. I waited for this feature since 2005, for Multidimensional. Next year we’ll have full drillthrough control in Tabular. This will make Multidimensional developers jealous, I know.

    If you want to test these features, you don’t have to install the entire SQL Server vNext: just create a virtual machine and download SQL Server Data Tools for CTP 1.1. You can test everything using the integrated workspace. (don’t use this release on your workstation or to create “real projects”, it’s unsupported and might have issues).

    DAX Studio 2.6
    If you use DAX, you probably use DAX Studio. If you don’t, you have to! Darren Gosbell just announced the 2.6 release. If you use the Server Timings pane or you analyze the DAX queries generated by Power BI, this version has a big improvement, because it manages multiple results (you can execute multiple EVALUATE statements in a single operation). In any case, you should install this version because of the many bug fixes and the improved stability. For example, if you modify the model in Power BI (or SSAS Tabular, or Power Pivot), the existing connection in DAX Studio will nicely manage the changes, showing you the updated metadata automatically. And without crashing (I know, there were some bug changing metadata before…).

    Updated Custom Visuals for Power BI from OkViz
    A few months ago, Microsoft released a new API for custom visuals in Power BI. This API will solve many stability issues, creating a shield for custom visuals that will not be impacted by changes in the underlying Power BI platform (something that happened at least once a month this year). The new API was not complete at the beginning, but in the latest version (1.3) it reached a level that allows to release a new version of the OkViz components based on the new APIs without losing existing features.

    In the meantime, we also added new features to the components and if you use one or more of these custom visuals, I strongly suggest you to consider upgrading them in your reports. The price to pay is that this time the update requires a manual activity. The automatic upgrade provided by the Power BI Gallery would have broke too many reports, because we modified (and improved) the behavior of certain properties, following the many suggestions and feedback received. Breaking compatibility is an exceptional event and we don’t want this to happen in the future, but we considered that this time it was a necessary operation. This choice is also conservative, because all the existing reports will continue to work. But you should consider that sooner or later Microsoft will remove support for “legacy” custom visuals, based on the old APIs. Thus, don’t be in a hurry trying to upgrade. Take your time, try the new features, but make a plan for an upgrade path of your reports.

    Ok, here is the list. It’s big. Every component has an article describing the new features. We will provide demo videos later, but we didn’t want to wait more before releasing them!

    DAX and SSAS Training worldwide
    As you know, I and Alberto Ferrari deliver training on DAX, Power BI, Power Pivot, and SSAS Tabular. We try to organize courses where there is a demand (we receive many suggestions for cities we should visit), and I’m happy to announce that in 2017 we will deliver several workshops in United States, and we already open registration for dates in Australia and Europe.

    Workshops in Australia
    I will be in Sydney and Melbourne between February and March 2017, delivering the workshops SSAS Tabular (Feb 20-21), Mastering DAX (Feb 22-24 and Feb 27-Mar 1), and Optimizing DAX (Mar 2-3). If you want to see more details, visit our page for training in Australia. If you are interested in some of these dates, hurry up and get the early bird discount before it’s too late!

    Workshops in United States
    The first dates are Seattle, WA (March 27-31, 2017 - Mastering DAX + Optimizing DAX) and Houston, TX (April 3-7, 2017 – SSAS Tabular + Mastering DAX). Registrations will open in January, and in the next few weeks we will announce workshops in other cities. If you are interested, vote your favorite city helping us to prioritize them. If you want to receive promptly notification about future dates, register to our newsletter.

    Workshops in Europe
    Finally, we will continue to deliver our workshops in Europe, too: London, Amsterdam, Vienna, and Copenhagen are the dates already published in the first semester (visit links for more details and for registrations). We might also add some other city in January (Paris, Helsinki, and another city are under evaluation).

  • Format measures as dates in Power BI #dax #powerbi

    If you want to format a measure as date in Power BI, you need an expression returning a date/time data type. This is different from what you can do in Power Pivot and in SSAS Tabular, where you can format any numeric expression using a date and/or time format, and the conversion is implicitly done to perform such a visualization.

    The problem might be not very common, because you have a numeric expression only when you start manipulating a date without using the dedicated DAX functions to do that. A date is a floating point number where the integer part is the distance in days since December 30, 1899, and the decimal part is the fraction of a day. In the following examples I will use NOW() to obtain a date and time, but you might have any other expression operating on your data.

    This measure in Power BI returns a date/time, and can be formatted using any “Date Time” format:

    Measure := NOW ()

    You can remove the time by truncating the number, obtaining only the day with the time corresponding to 12:00am:

    Measure := TRUNC ( NOW () )

    At this point, if you try to change the format of the measure, the “Date Time” format is disabled:


    In DAX there is no type conversion operator to date/time. However, you can sum the numeric expression that you have to a recognized date/time value. If you add the equivalent of 0 (corresponding to December 30, 1899), your final data type is a date/time. The following expression returns a date/time data type corresponding to the current day.

    Measure := TRUNC ( NOW() ) + DATE ( 1899, 12, 30 )

    The format of the measure now displays only Date Time formats.


    Please note that for the specific example used in this post, you can use the DAX function TODAY(), which returns the current date without the time. The function NOW has been used only to represent any date/time expression that you might obtain from other DAX expressions over your data.

    Thanks to Jeffrey Wang for providing me the hint used in the technique I described in this blog post.

  • DAX Editor 2.0 finally available: it is a must have for #ssas #tabular #dax developers

    The SQL Server Data Tools (SSDT) editor for Tabular models has been improved in the recent updates. For example, the Tabular Model Explorer enables you to quickly jump between the entities of your model organized in a hierarchical way. However, when it comes to editing DAX measures, you are still constrained in a user interface where you can only modify one measure at a time. Moreover, when you have hundreds of measure in the same Tabular model, every change you apply might require a few seconds of waiting time, because of the validation executed each time.

    Several years ago, a first version of DAX Editor for Visual Studio was created by Nick Medveditskov, a former developer of the Analysis Services team who released the tool to the community as an open source project. Later on, I and Teo Lachev contributed to upgrade the tool when new versions of SSDT were released (Visual Studio 2012 and 2013). However, such a version was no compatible with the new compatibility level (1200) introduced by Analysis Services 2016, which saves the model in a JSON format instead of XML. Thanks to the collaboration of a few customers who wanted to migrate this tool to the new version of Analysis Services, we (at SQLBI) hired a developer who helped us in this migration and now a new version of the DAX Editor is available in the Visual Studio Gallery.

    DAX Editor 2.0 supports both XML and JSON formats (compatibility levels 1103 and 1200), and it can be installed on Visual Studio 2012, 2013, and 2015.
    You can see a quick introduction of how to install and use it in the DAX Editor – Introduction video (just 3’).
    A more detailed explanation of how to install and start to use DAX Editor is available in the article Using DAX Editor on SQLBI.
    The source code is available on GitHub.
    Finally, a single page with all the links is available in DAX Editor tools on SQLBI.

    One warning and one request. The warning first: despite the many tests we did, there will be certainly other bugs. Create a backup of your BIM file before using DAX Editor, you don’t want to lose your hard work because of a bug in an editor!
    The request: please, report any bug in the Issues page on GitHub, write a review if you like the tool, and if you are interested in contribute to the development of new features, contact me to request access.

  • Install SQL Server 2016 SP1 for Analysis Services #ssas #tabular #sqlserver

    Microsoft recently released the Service Pack 1 for SQL Server 2016. If you use Analysis Services, installing this update is a no-brainer and you should do that as soon as possible. The reason are all related to memory management. Microsoft officially described this in a blog post on MSDN. I suggest you reading it, but let me describe in a more informal way why this update should be a priority for you.

    1. SSAS Tabular had a heap fragmentation issue in SSAS 2012 and 2014, but most of the issues were still present in SSAS 2016 because of other bugs. If you run SSAS 2012/2014 (any version/SP), or SSAS 2016 RTM (any CU), then you should mitigate the problems using the workarounds described in the article Heap Memory Settings for Analysis Services Tabular 2012 / 2014. The new allocator in SSAS 2016 SP1 should get rid of the issues, so you should reset the HeapTypeForObjects and MemoryHeapType settings to their new default values (MemoryHeapType=-1 and HeapTypeForObjects=0). I still don’t have metrics from real world usage, so I invite you to test these parameters, monitor the memory in your server after a few hours/days and report (using comments) whether the new settings work better than the previous ones.
    2. Improved performance thanks to the new memory allocator. I made some stress test on machines with one socket and multiple cores, and I measured improvements between 5% and 10% on test stressing queries over large tables using the same hardware and with the same condition. Since the improvement is for memory allocation, I would not be surprised if there are improvements also in processing time. However, getting faster performance for free is always welcome.
    3. NUMA support. The Microsoft post is very honest and describe it as “NUMA awareness” and talks about “better NUMA support”. I still don’t have numbers, but based on a number of information I collected we can say that the benefits are relevant for large data models running on 4 sockets or more. I have seen statements describing as 30GB the model size that would start to get improvements, but I think that you should also consider the model type. A single model with a huge number of tables and columns might not benefit too much, whereas if a single table has billions of rows, then the benefits should be more visible. Remember: if you are able to store your model on a single socket machine, you probably have (on average) better performance. But if you want to enter the club of 10+ billion rows in a single table, then this is an option you should consider. If you have these models and you have 4 sockets or more, please share your experience!
      At the same time, if you have many small models, it’s probably better to split your workload on different SSAS Tabular instances, each one running on a single socket. But you can try, and share your results especially if my prediction is wrong and you see performance benefits scattering data of small models on multiple sockets.
      Finally, if you have a large table (>2 billion rows) and only 2 sockets, please contact me to play with some VertiPaq non-default setting.

    Getting scalability over NUMA architecture is not easy, but this is a long awaited first step in the right direction. In any case, this update is a priority if you have SSAS Tabular 2016.

More Posts Next page »

This Blog



Privacy Statement