
SQLBI - Marco Russo

SQLBI is a blog dedicated to building Business Intelligence solutions with SQL Server.
You can follow me on Twitter: @marcorus

  • How to obtain last process date from #SSAS #Tabular

    I wrote an article about how to get the Last Process Date in SSAS Tabular. This is a common request and there are many ways to obtain this information. You can use the metadata provided by XMLA or DMVs, just as you can in SSAS Multidimensional. Or you can use a calculated column that contains the DAX expression NOW() – this might seem an expensive approach, but remember that a column that has the same value for all the rows is not expensive at all, because it is completely compressed in memory!

    I suggest taking a look at the examples in the article, because it is easy to confuse the date of the last schema update with the date of the last data update – sometimes the schema is updated in an unexpected way…
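
    As a quick sketch of the calculated-column technique (the names here are hypothetical – any table that is always processed works):

    -- Calculated column: NOW() is evaluated at process time; every row gets
    -- the same value, so the column is compressed almost to nothing
    AnyTable[LastProcessed] = NOW()

    -- Measure (hypothetical name) exposing the value to reports
    [Last Process Date] := MAX ( AnyTable[LastProcessed] )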

  • Scheduled Data Refresh in Power BI #powerbi #powerpivot

    Now that Power BI is finally available (I still use terms such as “released” for software and “in production” for a service…), I ran some tests on the new Scheduled Refresh option that you have for your models (other info here).

    First of all, the Refresh feature within the Excel workbook no longer works, at least when you are using the Data Management Gateway – you need to schedule a refresh instead, and I think this is a good idea. The refresh could be very slow, and waiting 30 seconds or more without being able to interact with the workbook wasn’t a good user experience. Well done.

    Now, when you schedule the refresh, you can set a daily or weekly frequency. No intra-day updates, unfortunately. You might work around this by creating several copies of the same workbook and defining a different daily schedule for each copy, at different hours. I think a more frequent update could be a good idea for certain businesses. Maybe with some limitations, but if the workbook is small I don’t see a real reason for such a limitation (considering that it could incentivize bad behaviors like the workaround I just described).

    You can also request a manual refresh – the strange thing is that you can do that only after you have defined a scheduled refresh (maybe I don’t need one, so I might create an unnecessary schedule just to get a single manual refresh).


    I observed that when you request a manual refresh, you actually just put a request into an internal queue somewhere in the cloud. After you request a refresh, you cannot request a second one while the first is still running.


    At this point, you see the status Running in the History tab.


    In reality, chances are that you are just waiting while nothing is running. In fact, I saw my workbook refreshed after 10 minutes, but the history log says that execution time was only 28 seconds. For this reason I assume there is a queue handling these requests in the cloud, which is why a manual refresh can take so long to finish. However, the second execution required less than 2 minutes, so I am not sure whether the first delay was because I refreshed a workbook on my tenant for the first time (maybe some deployment was involved) or I was just unlucky and the queue was particularly slow on the previous request.


    I look forward to seeing more data sources supported by scheduled data refresh – in particular, Power Query support would be an important step toward making Power BI adoption even larger.

  • Natural and Unnatural Hierarchies in #powerpivot and #tabular

    Last week I wrote an article about Natural Hierarchies in Power Pivot and Tabular, where I show why it is important to create natural hierarchies for performance reasons. I’d like to spend a few words on why this is so important, so you might want to read the article before moving forward.

    I had to investigate a performance issue in a Tabular model of a customer. Initially, I wasn’t able to reproduce the issue by creating a similar query in a test environment, because I hadn’t focused my attention on the presence of the hierarchy, and I was generating similar queries without the hierarchy at all. Once I realized that the problem was present only in MDX and not in DAX, I started analyzing the query created by Excel, and I found a number of workarounds teaching users how to avoid dangerous combinations of attributes on the rows of a pivot table. After more investigation and some help from people at MS (thanks Marius!), I realized that the problem was the use of unnatural hierarchies. I was aware of the problem (I always suggest using natural hierarchies whenever possible), but I was not aware of the order of magnitude of the problem in certain conditions, as I described in the article.

    So, I think it is interesting to highlight how to detect the problem when you just look at the Profiler. If the data model uses unnatural hierarchies, you will find MDX code in which certain members have names such as:

    [Date].[Calendar].[Year].&[2007].&[1]
    [Date].[Calendar].[Year].&[2007].&[1].&[1]
    [Date].[Calendar].[Year].&[2007].&[1].&[1].&[March]

    These names, when you use a natural hierarchy, will be:

    [Date].[Calendar].[Semester].&[H1-2007]
    [Date].[Calendar].[Quarter].&[Q1-2007]
    [Date].[Calendar].[Month].&[March 2007]

    The reason why performance might be slow and the query starts consuming memory is the Non Empty operation performed by the engine in order to eliminate tuples that are not included in the result. It is important to know that this does not happen for every query: it is very sensitive to the number of combinations resulting from the Cartesian product of all the attributes included in an axis of the MDX query. For example, if you use the Country name instead of the City in the example included in the article, even the unnatural hierarchy works in a reasonable time.

    Lesson learned: always use natural hierarchies in Power Pivot and SSAS Tabular, too!
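
    A common fix is to build level columns whose values are unique across the parent level, so that the hierarchy becomes natural. A minimal sketch with calculated columns (the column names are hypothetical; 'Date'[Calendar Year] is assumed to exist, as in the queries elsewhere on this page):

    -- Quarter values unique per year: "2007-Q1" instead of "1"
    'Date'[Year Quarter] =
        'Date'[Calendar Year] & "-Q" & ROUNDUP ( MONTH ( 'Date'[Date] ) / 3, 0 )

    -- Month values unique per year: "March 2007" instead of "March"
    'Date'[Year Month] = FORMAT ( 'Date'[Date], "mmmm yyyy" )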

  • Distinct Count in DAX Query Plans #dax #tabular

    In the last months, Alberto Ferrari and I have worked on many consulting engagements about designing Tabular models or optimizing existing ones. One of the strengths of the Tabular model compared to Multidimensional is the performance of distinct count measures. Now, being faster doesn’t necessarily mean getting an immediate answer, and several times we had to work on performance improvements for this type of measure. The matter is very complex and we are still not ready to publish a set of guidelines about how to optimize distinct count measures in Tabular.

    However, during our tests we made interesting discoveries about the different query plans you can obtain for distinct count calculations, which can be written using different techniques in DAX. First, I want to warn you that there are no silver bullets. Each technique might be faster in certain conditions but slower in others, so a big mistake is thinking that you have found a better solution than a DISTINCTCOUNT function call and applying the alternative to every measure in every model. As I said, don’t do that.
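
    For reference, here are two formulations that return the same result but can produce different query plans (a sketch with a hypothetical Sales table – always measure both on your own model before choosing):

    -- Native distinct count
    [Unique Customers] := DISTINCTCOUNT ( Sales[CustomerKey] )

    -- Alternative formulation: same result, different query plan
    [Unique Customers Alt] := COUNTROWS ( VALUES ( Sales[CustomerKey] ) )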

    Now, if you want to know more and want to discover how to read DAX query plans and how to measure the best option for your specific model, then this new paper written by Alberto Ferrari is for you:

    Understanding Distinct Count in DAX Query Plans

    We have already made many more discoveries, but we need to investigate further in order to provide common guidelines. It would be very wrong to publish best practices that only work in certain conditions and worsen performance in many others. But if you have queries running in too many seconds (or minutes), please contact me: I’d like to gather more use cases on this topic!

  • Using Added Columns in #DAX Queries

    I had an issue in a query that reminded me of a few details about naming columns created by ADDCOLUMNS.

    When you specify a simple name in ADDCOLUMNS, you can use it later, for example in a SUMMARIZE or in a FILTER expression, such as in the following example (see the Value column):

    EVALUATE
    CALCULATETABLE (
        ADDCOLUMNS (
            ADDCOLUMNS (
                VALUES ( 'Date'[Date] ),
                "Value", [Internet Total Sales]
            ),
            "Frequency", COUNTROWS (
                FILTER (
                    VALUES ( 'Date'[Date] ),
                    [Internet Total Sales] <= [Value]
                )
            )
        ),
        'Date'[Calendar Year] = 2006,
        'Date'[Month] = 7
    )

    Naming an added column in this way makes it accessible only through its name, using the “measure” syntax, which doesn’t have a table name before the name of the column (please remember that this is not a best practice – always use the table name before the column name when you reference a column, and always omit the table name when you reference a measure!).

    So, a better way to write the previous expression could be the following one:

    EVALUATE
    CALCULATETABLE (
        ADDCOLUMNS (
            ADDCOLUMNS (
                VALUES ( 'Date'[Date] ),
                "’Date’[Value]", [Internet Total Sales]
            ),
            "Frequency", COUNTROWS (
                FILTER (
                    VALUES ( 'Date'[Date] ),
                    [Internet Total Sales] <= 'Date'[Value]
                )
            )
        ),
        'Date'[Calendar Year] = 2006,
        'Date'[Month] = 7
    )

    Up to this point, there is nothing really new – just a good reminder. Now, for some reason, a few days ago I wrote this:

    EVALUATE
    CALCULATETABLE (
        ADDCOLUMNS (
            ADDCOLUMNS (
                VALUES ( 'Date'[Date] ),
                "[Value]", [Internet Total Sales]
            ),
            "Frequency", COUNTROWS (
                FILTER (
                    VALUES ( 'Date'[Date] ),
                    [Internet Total Sales] <= [Value]
                )
            )
        ),
        'Date'[Calendar Year] = 2006,
        'Date'[Month] = 7
    )

    In this case, executing the query returns an error saying that the Value column cannot be found. The problem is that I used "[Value]" instead of "Value". So the lesson is: be careful when you name a column, and don’t use square brackets unless you also include the table name – which is a good idea, so that you don’t confuse its semantics with those of a measure when you reference the column later in your expression.

  • Training for DAX, Tabular and Power Pivot in 2014 #powerpivot #dax

    I haven’t had time for new blog posts in the last two months, but for some good reasons! I’ve been busy putting DAX Formatter online and writing some other material that I will be happy to share very soon, aside from daily engagements with customers, of course. But the wait is almost over, and in the meantime I’d like to share the schedule of upcoming training about Tabular, Power Pivot and DAX.

    • Advanced DAX Workshop: this is the most advanced course about DAX available today. It is aimed at advanced Power Pivot users and Analysis Services developers who want to master the DAX language and improve their skills in performance optimization. The course includes hands-on lab sessions with exercises for creating queries, solving business problems and locating performance bottlenecks in DAX. Only Alberto Ferrari and I deliver this workshop, and at the moment we have classrooms scheduled in Europe (hopefully we’ll have news about the US later…):
    • SSAS Tabular Workshop: we extended the length of this classroom workshop to 3 days, so that it now also includes hands-on labs. This course is suggested for BI developers who don’t yet have hands-on experience with Tabular models. Previous experience in Multidimensional is not required; if you know Multidimensional, you will simply have more questions that we’ll be happy to answer! The workshop is delivered only by me or Alberto Ferrari, and at the moment we have scheduled classrooms only in Europe, but we also have an online edition optimized for American time zones:
    • Power Pivot Workshop: this workshop is delivered online or in classrooms at training centers around the world. It is aimed at Excel users who want to learn Power Pivot. Alberto Ferrari and I deliver the online workshops, which are optimized for different time zones:

    I will write another blog post in a few weeks about my sessions at conferences and community events – but the other blog posts will be about technical content, I promise! I’m just working on a couple of articles… stay tuned!

  • Power Query now imports relationships #powerquery #powerpivot

    The December release of the Power Query preview (version 2.9.3502.261) introduced an interesting feature: when you import multiple tables into the data model, relationships existing between tables in the data source are automatically detected and added to the data model as well. This is similar to the behavior you have in Power Pivot, but there is an important difference. When you import one or more tables into a data model where you already imported other tables before, relationships are also detected between the tables already imported and the new tables selected for import. As you might know, Power Pivot detects relationships only between tables imported at the same time, not between newly imported tables and other tables already in the data model, even when they come from the same data source.

    The detection in Power Query doesn’t work between tables imported into the data model directly by Power Pivot. It only works between tables that have been imported using Power Query.

    I didn’t test the performance (I can imagine there is some price to pay for using Power Query instead of a direct connection), but it’s clear that in the long term Power Query should be THE tool to import data into Power Pivot – or better, into Excel – replacing the overlap with existing features that do similar things. I really like the progress that Power Query is making, and I hope that the best is yet to come…

  • Format your #dax code with DAX Formatter

    When you write a DAX formula that is not trivial, you end up with a complex expression that is not really readable. I think the same problem affects, in part, all Excel users, but in Excel you can easily split a formula into many steps, using different cells. This is not always possible in Tabular, because moving a sub-expression into a separate measure might change the semantics, and this operation is possible only for scalar values, not for table expressions. So we need to write readable code.

    One year ago, we published the Rules for DAX Code Formatting. Today, I’m proud to announce the availability of DAX Formatter, a free service that automatically formats code following these rules. We designed the service to be easy to use. Type the DAX expression (or paste it) and press Format. That’s it.

    http://www.daxformatter.com/

    For those of you reading offline or too lazy to click on a link, here is how it works. Suppose you have this DAX query (but you can also format expressions for measures and calculated columns):

    EVALUATE ADDCOLUMNS(FILTER(CROSSJOIN(
    SUMMARIZE(Customer,Customer[Customer],Customer[Customer.Key0]),
    VALUES('Date'[Date.Key0])),NOT ISBLANK([Internet Sales Amount])),
    "Sales",[Internet Sales Amount])
    ORDER BY 'Date'[Date.Key0] DESC,Customer[Customer] ASC

    You paste it into DAX Formatter and you obtain this result:

    EVALUATE
    ADDCOLUMNS (
        FILTER (
            CROSSJOIN (
                SUMMARIZE (
                    Customer,
                    Customer[Customer],
                    Customer[Customer.Key0]
                ),
                VALUES ( 'Date'[Date.Key0] )
            ),
            NOT ISBLANK ( [Internet Sales Amount] )
        ),
        "Sales", [Internet Sales Amount]
    )
    ORDER BY 'Date'[Date.Key0] DESC, Customer[Customer] ASC

    The result is even more readable on the web site thanks to the layout available (you can choose between a dark and a light color scheme):


    And if you use a non-US locale, just select the separators that work for you.
    Please submit any bugs using the Bug Report button.
    Currently you can format any DAX query or expression in these forms:

    • EVALUATE …
    • Table[measure] := …
    • [measure] := …
    • Table[column] = …
    • = …
    • measure := …

    Feedback is welcome – enjoy!

  • Videos about #DAX on Project Botticelli

    More than one year ago, Alberto and I started recording videos for Project Botticelli, and now we have a set of videos about DAX that you can watch online. A few videos are free, and the others are available with the monthly subscription. If you are interested, use this 20% discount code before the end of December 2013: SQLBI20HOLS2013

    The modules available now are the following:

    1. DAX in Action! Free!
    2. Calculated Columns vs. Measures Free!
    3. Introduction to DAX
    4. Evaluation Context
    5. CALCULATE Function
    6. Querying with DAX
    7. What Is Time Intelligence? Free!
    8. Time Intelligence in Depth
    9. Many-to-many Relationships in DAX Free!
    10. DAX Patterns: Banding, New vs Old, Many-to-many
    11. How to Calculate with Hierarchies in DAX? Free!
    12. Hierarchies in DAX

    If you work with DAX… well, there will be a nice surprise in a few days, certainly before Christmas – stay tuned!

    On Project Botticelli you can also find other video series: BI in Excel (by Carmel Gunn), MDX (by Chris Webb) and Data Mining (by Rafal Lukawiecki).

  • Updates about Multidimensional vs Tabular #ssas #msbi

    I recently read the blog post from James Serra, Tabular model: Not ready for prime time? (read the comments too, because there are discussions about a few points raised by James), and the follow-up post from Christian Wade, Multidimensional or Tabular. In the last two years I have worked with many companies adopting Tabular in different scenarios, and I agree with some of the points expressed by James in his post (especially about features missing in Tabular compared to Multidimensional), but I strongly disagree with others.

    In general, Tabular is a good choice for a new project when:

    • the development team does not have a good knowledge of Multidimensional and MDX (DAX is faster to learn – not as easy as MS claims, but definitely easier than MDX)
    • you don’t need calculations based on hierarchies (common in certain financial applications, but not as common as it might seem)
    • there are important calculations based on distinct count measures
    • there are complex calculations based on many-to-many relationships

    Until now, I have never suggested migrating an existing Multidimensional model to a Tabular one. There should be very important reasons for that, such as performance issues with distinct counts or many-to-many relationships that cannot be easily solved by optimizing the Multidimensional model, and I have yet to encounter this scenario.

    I would say that in 80% of new projects you might use either Multidimensional or Tabular, and the real difference is the time-to-market, which depends on the skills of the development team. So it’s not strange that those who are used to Multidimensional are not moving to Tabular: they wouldn’t get a particular benefit from the new model unless specific requirements exist. The recent DAXMD feature that allows using SharePoint Power View on Multidimensional is a really important one, even if I’d also like Excel Power View enabled for this scenario (this should be just a question of time).

    Another scenario in which I’m seeing growing adoption of Tabular is companies that create models for their product/service by using XMLA or Tabular AMO 2012. I am used to calling them ISVs, even if those providing services cannot really be defined that way. These companies are facing the multitenancy challenge with Tabular, and even if this is a niche market, I see some potential here, because adopting Tabular seems a much more natural choice than Multidimensional in scenarios where an analytical engine has to be embedded to deliver one feature of a larger product/service delivered to customers.

    I’d like to see more feedback in the comments: tell the story of how you chose between Tabular and Multidimensional in a BI project you started with SQL Server 2012. Thanks!

  • PowerShell Script to Deploy Multiple VM on Azure in Parallel #azure #powershell

    This blog is usually dedicated to Business Intelligence and SQL Server, but I couldn’t easily find simple PowerShell scripts on the web to help me deploy a number of virtual machines on Azure for testing and development. Since I need to deploy, start, stop and remove many virtual machines created from a common image (you know, Tabular is not part of the standard images provided by Microsoft…), I wanted to minimize the time required to execute every operation from my Windows Azure PowerShell console (though I suggest using Windows PowerShell ISE), so I also wanted to fire the commands in parallel as soon as possible, without losing the output in the console.

    In order to execute multiple commands in parallel, I used the Start-Job cmdlet, and with Get-Job and Receive-Job I wait for job completion and display the messages generated during background command execution. This technique allows me to reduce execution time when I have to deploy, start, stop or remove virtual machines. Please note that a few operations on Azure acquire an exclusive lock and cannot really be executed in parallel, but only part of their execution time is subject to this lock, so you obtain a better response time in these scenarios too (this is the case for the provisioning of a new VM).

    Finally, when you remove the VMs, you still have to remove the disks containing the virtual machines. This cannot be done right after the VM removal, because you have to wait until the removal operation is completed on Azure. So I wrote a script that you run a few minutes after the VMs’ removal: it deletes the disks (and VHDs) no longer related to a VM. I check that each disk was associated with the original image name used to provision the VMs (so I don’t remove other disks deployed by other batches that I might want to preserve).

    These examples are specific to my scenario; if you need more complex configurations, you have to change and adapt the code. But if your need is to create multiple instances of the same VM running in a workgroup, these scripts should be good enough.

    I prepared the following PowerShell scripts:

    • ProvisionVMs: Provisions many VMs in parallel starting from the same image. It creates one service for each VM.
    • RemoveVMs: Removes all the VMs in parallel – it also removes the service created for each VM
    • StartVMs: Starts all the VMs in parallel
    • StopVMs: Stops all the VMs in parallel
    • RemoveOrphanDisks: Removes all the disks no longer used by any VM. Run this script a few minutes after the RemoveVMs script.

    ProvisionVMs

    # Name of subscription
    $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription"

    # Name of storage account (where VMs will be deployed)
    $StorageAccount = "Copy the Label property you get from Get-AzureStorageAccount"

    function Provision-VM( [string]$VmName ) {
        Start-Job -ArgumentList $VmName {
            param($VmName)
            $Location = "Copy the Location property you get from Get-AzureStorageAccount"
            $InstanceSize = "A5" # You can use any other instance size, such as Large, A6, and so on
            $AdminUsername = "UserName" # Write the name of the administrator account in the new VM
            $Password = "Password"      # Write the password of the administrator account in the new VM
            $Image = "Copy the ImageName property you get from Get-AzureVMImage"
            # You can list your own images using the following command:
            # Get-AzureVMImage | Where-Object {$_.PublisherName -eq "User" }
            New-AzureVMConfig -Name $VmName -ImageName $Image -InstanceSize $InstanceSize |
                Add-AzureProvisioningConfig -Windows -Password $Password -AdminUsername $AdminUsername |
                New-AzureVM -Location $Location -ServiceName "$VmName" -Verbose
        }
    }

    # Set the proper storage account - you might remove this line if you have only one storage account in the subscription
    Set-AzureSubscription -SubscriptionName $SubscriptionName -CurrentStorageAccount $StorageAccount

    # Select the subscription - this line is fundamental if you have access to multiple subscriptions
    # You might remove this line if you have only one subscription
    Select-AzureSubscription -SubscriptionName $SubscriptionName

    # Every line in the following list provisions one VM using the name specified in the argument
    # You can change the number of lines - use a unique name for every VM and don't reuse names
    # of VMs already deployed
    Provision-VM "test10"
    Provision-VM "test11"
    Provision-VM "test12"
    Provision-VM "test13"
    Provision-VM "test14"
    Provision-VM "test15"
    Provision-VM "test16"
    Provision-VM "test17"
    Provision-VM "test18"
    Provision-VM "test19"
    Provision-VM "test20"

    # Wait for all jobs to complete, displaying output as they finish
    While (Get-Job -State "Running") {
        Get-Job -State "Completed" | Receive-Job
        Start-Sleep 1
    }

    # Display output from all jobs
    Get-Job | Receive-Job

    # Cleanup of jobs
    Remove-Job *

    # Display batch completed
    echo "Provisioning VM Completed"

    RemoveVMs

    # Name of subscription
    $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription"

    function Remove-VM( [string]$VmName ) {
        Start-Job -ArgumentList $VmName {
            param($VmName)
            Remove-AzureService -ServiceName $VmName -Force -Verbose
        }
    }

    # Select the subscription - this line is fundamental if you have access to multiple subscriptions
    # You might remove this line if you have only one subscription
    Select-AzureSubscription -SubscriptionName $SubscriptionName

    # Every line in the following list removes one VM using the name specified in the argument
    # You can change the number of lines - use the names of the VMs you deployed
    Remove-VM "test10"
    Remove-VM "test11"
    Remove-VM "test12"
    Remove-VM "test13"
    Remove-VM "test14"
    Remove-VM "test15"
    Remove-VM "test16"
    Remove-VM "test17"
    Remove-VM "test18"
    Remove-VM "test19"
    Remove-VM "test20"

    # Wait for all jobs to complete, displaying output as they finish
    While (Get-Job -State "Running") {
        Get-Job -State "Completed" | Receive-Job
        Start-Sleep 1
    }

    # Display output from all jobs
    Get-Job | Receive-Job

    # Cleanup of jobs
    Remove-Job *

    # Display batch completed
    echo "Remove VM Completed"

    StartVMs

    # Name of subscription
    $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription"

    function Start-VM( [string]$VmName ) {
        Start-Job -ArgumentList $VmName {
            param($VmName)
            Start-AzureVM -Name $VmName -ServiceName $VmName -Verbose
        }
    }

    # Select the subscription - this line is fundamental if you have access to multiple subscriptions
    # You might remove this line if you have only one subscription
    Select-AzureSubscription -SubscriptionName $SubscriptionName

    # Every line in the following list starts one VM using the name specified in the argument
    # You can change the number of lines - use the names of the VMs you deployed
    Start-VM "test10"
    Start-VM "test11"
    Start-VM "test12"
    Start-VM "test13"
    Start-VM "test14"
    Start-VM "test15"
    Start-VM "test16"
    Start-VM "test17"
    Start-VM "test18"
    Start-VM "test19"
    Start-VM "test20"

    # Wait for all jobs to complete, displaying output as they finish
    While (Get-Job -State "Running") {
        Get-Job -State "Completed" | Receive-Job
        Start-Sleep 1
    }

    # Display output from all jobs
    Get-Job | Receive-Job

    # Cleanup of jobs
    Remove-Job *

    # Display batch completed
    echo "Start VM Completed"

     

    StopVMs

    # Name of subscription
    $SubscriptionName = "Copy the SubscriptionName property you get from Get-AzureSubscription"

    function Stop-VM( [string]$VmName ) {
        Start-Job -ArgumentList $VmName {
            param($VmName)
            Stop-AzureVM -Name $VmName -ServiceName $VmName -Verbose -Force
        }
    }

    # Select the subscription - this line is fundamental if you have access to multiple subscriptions
    # You might remove this line if you have only one subscription
    Select-AzureSubscription -SubscriptionName $SubscriptionName

    # Every line in the following list stops one VM using the name specified in the argument
    # You can change the number of lines - use the names of the VMs you deployed
    Stop-VM "test10"
    Stop-VM "test11"
    Stop-VM "test12"
    Stop-VM "test13"
    Stop-VM "test14"
    Stop-VM "test15"
    Stop-VM "test16"
    Stop-VM "test17"
    Stop-VM "test18"
    Stop-VM "test19"
    Stop-VM "test20"

    # Wait for all jobs to complete, displaying output as they finish
    While (Get-Job -State "Running") {
        Get-Job -State "Completed" | Receive-Job
        Start-Sleep 1
    }

    # Display output from all jobs
    Get-Job | Receive-Job

    # Cleanup of jobs
    Remove-Job *

    # Display batch completed
    echo "Stop VM Completed"

    RemoveOrphanDisks

    $Image = "Copy the ImageName property you get from Get-AzureVMImage"

    # You can list your own images using the following command:

    # Get-AzureVMImage | Where-Object {$_.PublisherName -eq "User" }

     

    # Remove all orphan disks coming from the image specified in $ImageName

    Get-AzureDisk |

        Where-Object {$_.attachedto -eq $null -and $_.SourceImageName -eq $ImageName} |

        Remove-AzureDisk -DeleteVHD -Verbose

     

  • Difference between LASTDATE and MAX for semi-additive measures in #DAX

    I recently wrote an article on SQLBI about semi-additive measures in DAX. I included the formulas for common calculations, and there is an interesting point that is worth a longer digression: the difference between LASTDATE and MAX (the same applies to FIRSTDATE and MIN – I describe only the former; for the latter, just replace the corresponding names).

    LASTDATE is a DAX function that receives an argument that has to be a date column, and returns the last date active in the current filter context. Apparently, this is the same value returned by MAX, which returns the maximum value of its argument in the current filter context. Of course, MAX can receive any numeric type (including date), whereas LASTDATE only accepts a column of type date. But overall, they seem identical in their result. The difference, however, is a semantic one. In fact, this expression:

    LASTDATE ( 'Date'[Date] )

    could be also rewritten as:

    FILTER ( VALUES ( 'Date'[Date] ), 'Date'[Date] = MAX ( 'Date'[Date] ) )

    LASTDATE is a function that returns a table with a single column and a single row, whereas MAX returns a scalar value. In DAX, any table expression with one row and one column can be automatically converted into the corresponding scalar value of the single cell returned. The opposite is not true. So you can use LASTDATE in any expression where a table or a scalar is required, but MAX can be used only where a scalar expression is expected.

    Since LASTDATE returns a table, you can use it in any expression that expects a table as an argument, such as COUNTROWS. In fact, you can write this expression:

    COUNTROWS ( LASTDATE ( 'Date'[Date] ) )

    which will always return 1 or BLANK (if there are no dates active in the current filter context). You cannot pass MAX as an argument of COUNTROWS.

    You can pass to LASTDATE a reference to a column or any table expression that returns a column. The following two syntaxes are semantically identical:

    LASTDATE ( 'Date'[Date] )
    LASTDATE ( VALUES ( 'Date'[Date] ) )

    The result is the same, and the use of VALUES is not required because it is implicit in the first syntax – unless you have an active row context. In that case, be careful: using LASTDATE with a direct column reference in a row context produces a context transition (the row context is transformed into a filter context) that hides the external filter context, whereas using VALUES in the argument preserves the existing filter context without applying the context transition of the row context (see the columns LastDate and Values in the following query and result).

    You can use any other table expression (including a FILTER) as the LASTDATE argument. For example, the following expression will always return the last date available in the Date table, regardless of the current filter context:

    LASTDATE ( ALL ( 'Date'[Date] ) )

    The following query recaps the results produced by the different syntaxes described.

    EVALUATE
    CALCULATETABLE (
        ADDCOLUMNS (
            VALUES ( 'Date'[Date] ),
            "LastDate", LASTDATE ( 'Date'[Date] ),
            "Values", LASTDATE ( VALUES ( 'Date'[Date] ) ),
            "Filter", LASTDATE ( FILTER ( VALUES ( 'Date'[Date] ), 'Date'[Date] = MAX ( 'Date'[Date] ) ) ),
            "All", LASTDATE ( ALL ( 'Date'[Date] ) ),
            "Max", MAX ( 'Date'[Date] )
        ),
        'Date'[Calendar Year] = 2008
    )
    ORDER BY 'Date'[Date]
    The LastDate column repeats the current date, because the context transition happens within ADDCOLUMNS. The Values column prevents the existing filter context from being replaced by the context transition, so the result corresponds to the last day in year 2008 (which is filtered in the external CALCULATETABLE). The Filter column works like the Values one, even if it uses the FILTER approach instead of LASTDATE. The All column shows the result of LASTDATE ( ALL ( 'Date'[Date] ) ), which ignores the filter on Calendar Year (in fact, the date returned is in year 2010). Finally, the Max column shows the result of the MAX formula, which is the easiest to use; its only limitation is that it doesn’t return a table when you need one (as in a filter argument of CALCULATE or CALCULATETABLE, where using LASTDATE is shorter).
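
    As an example of that last point, LASTDATE can be used directly as a filter argument of CALCULATE – the typical semi-additive pattern (a minimal sketch; the Balances table is hypothetical):

    [Closing Balance] :=
        CALCULATE ( SUM ( Balances[Balance] ), LASTDATE ( 'Date'[Date] ) )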


    I know that using LASTDATE in complex expressions might create some issues. In my experience, the fact that a context transition happens automatically in the presence of a row context is the main cause of confusion and unexpected results in DAX formulas using this function. For a reference of DAX formulas using MAX and LASTDATE, read my article about semi-additive measures in DAX.

  • Using USERELATIONSHIP function in Measures and Calculated Columns #dax #powerpivot #ssas

    You can use the USERELATIONSHIP function in DAX to apply an inactive relationship in a particular DAX calculation. This approach is usually simple in a measure (just use USERELATIONSHIP in one of the filter arguments of CALCULATE), but as soon as you try using it in a calculated column, you can run into several issues.
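
    For the measure case, here is a minimal sketch (the Sales table and the inactive relationship between Sales[ShipDateKey] and 'Date'[DateKey] are hypothetical):

    [Shipped Amount] :=
        CALCULATE (
            SUM ( Sales[SalesAmount] ),
            USERELATIONSHIP ( Sales[ShipDateKey], 'Date'[DateKey] )
        )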

    In the article USERELATIONSHIP in Calculated Columns on the SQLBI web site, I described how to correctly write USERELATIONSHIP in this scenario, considering both sides of the relationship. As you will see, using LOOKUPVALUE instead of USERELATIONSHIP can simplify your DAX code when you access a lookup table!

  • SSAS Tabular Workshop in 3 days with hands-on-labs #ssas #tabular #dax

    Alberto Ferrari and I have delivered many SSAS Tabular Workshops in the last 18 months, and based on the feedback we received, we are moving toward a renewed edition. Instead of a 2-day seminar, we will run a 3-day course including hands-on labs. We didn't change the content (after all, the product is still the same), but having time to practice with the product, creating Tabular models and solving DAX problems, is a common request.

    We will deliver the first 3-day edition in Milan on October 29-31, 2013, and that course will be in Italian, but we are working on delivering the workshop in English in other countries. You can contact me if you are interested in attending the course in a particular country or on-site at your company.

  • Handling Wrong or Missing Dates in Tabular and #powerpivot #dax

    Recently I received a request from a reader about handling missing or invalid dates in Tabular (the same applies to Power Pivot). In the Analysis Services Tabular book, we wrote that you can insert a special row in the Date table with an ID that can be used to map all the rows in the fact table that don’t have a valid date. Unfortunately, using NULL in the Date column is not compatible with the “Mark as Date Table” feature of the data model, which is necessary for Time Intelligence functions in DAX. The alternative is between using a special date (such as January 1, 1900) and removing the row entirely from the Tabular / Power Pivot data model.
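
    For the first option, a minimal sketch of the special-date mapping (a hypothetical calculated column in the fact table, assuming the Date table contains a January 1, 1900 row):

    -- Map blank dates to the special date defined in the Date table
    Sales[Sale Date Fixed] =
        IF ( ISBLANK ( Sales[Sale Date] ), DATE ( 1900, 1, 1 ), Sales[Sale Date] )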

    I described the second approach (removing the row) in my article Handling Wrong or Missing Dates in Tabular on the SQLBI web site.
    What is your favorite approach? Comments are welcome!
