Tuesday, 30 December 2014

SQL Server - Current Execution Statistics

Happy holidays! Most of the times we have heard that the queries are not working as expected. Such simple statements just puts the investigation back to the DBAs or the SQL developers. The most common problem is what exactly is happening in SQL Server. Thanks to DMVs that this question can easily be answered without having admin permissions. What's needed is - VIEW SERVER STATE permission.
Query in below link file shows the snapshot of SQL Server at that particular time including CPU Time, Memory blocks, I/O issues, exact query being executed, query plan being used and much more. This have helped me a lot in the past and hope it helps you guys as well. Shoot me the questions which you think are not answered using this query and I'll try to answer them with best efforts.

Sunday, 21 December 2014

SSAS Tabular Model - Drillthrough & Other Actions

BI dashboards, reports etc. throw some random numbers and graphs on the screen but the data analysts always wants to know that how these numbers are being cooked up. SSAS provides actions as one of the method to show that. Although actions are far more powerful than just showing drillthrough. They can add lots and lots of context to the data being shown.
With tabular model actions were sort of difficult but with the help of BIDS Helper it can be done with lots of ease. Here is the reference link showing the use.
http://bidshelper.codeplex.com/wikipage?title=Tabular%20Actions%20Editor

Finding Outliers - Simple Statistical Way

Outliers? Why do we need them?
Too big or too small value is outlier in statistics. We need to consider outliers separately for good business decisions and beautiful graphs. Business decisions are mostly taken around (a) most frequent case OR around center of the population and (b) exceptions i.e. outliers and they are far away points from the center. My take is outliers should never be rejected from data but be treated as special cases for further analysis. Outliers may open new doors of possibilities.
The most common way to find an outlier is by using quartiles or percentiles. Quartile 1(Q1) == 25th Percentile,Quartile 2(Q2) == Median == 50th Percentile & Quartile 3(Q3) == 75th Percentile. To find out the accepted data range we need to find the spread of the data and most of the times it is done using either by Standard Variation or by IQR(Interquartile Range) i.e. (Q3-Q1) in simple way. Below figure gives the simple formula to find the outlier.
In simplest way any data point which is out of the (Q1 - 1.5 * IQR) and (Q3 + 1.5 * IQR) they can be treated for outlier analysis. There are many other methods but this one makes most sense based on common business analysis. This 1.5 is statistically negotiated value i.e. constant.
Here are a couple of examples from web search :)

Sunday, 14 December 2014

Don't disable SQL Jobs like ...

How to disable a SQL Server job?
This can't be a good enough question to post a solution. But sometimes it is. In the recent past we figured out that the SQL Server jobs are disabled but they were starting at their scheduled time. Amazing! What a f**k is happening? Why is this happening? Is SQL Server is not reliable enough?
One of the curious operations support guy figured out that the jobs were disabled by updating the metadata using the query "UPDATE [dbo].[sysjobs] SET [enabled] = 0 WHERE name IN (...)". The "Job Activity Monitor" showed that all the jobs are disabled. What's wrong then?
SQL Agent uses caching and caches the job status. Simple UPDATE statement will simply change the metadata and not change the cache. So it is recommended to use the the UI from SSMS or call sp_update_job. This method updates the cache as well.

Monday, 28 July 2014

Verifying Credit Card Numbers

Recently we came across an interesting problem of credit card theft and as part of the investigation we found tons of digits which matches the pattern of the credit card. After finding that bunch of data it was difficult to report the correct numbers. Any random number of 16 digits or 15 digits and so on can be either a credit card or some identifier or some dummy number into the system. The problem was really interesting.

After some research on web we found that most of the credit cards companies rather than generating any random number uses some algorithms to generate the number. Luhn test is one of the widely accepted algorithm to distinguish the credit card number. The below mentioned steps and SQL Code can be used to validate a credit card number. We are taking 49927398716 number as an example for the below steps.

Luhn’s Test:
  1. Get the credit card number. For our example we are assuming 49927398716 as the given number.
  2. Reverse the digits of the given number. For our example it will become like 61789372994.
  3. Take the sum of all the odd digits in the reversed number. Which is as 6 + 7 + 9 + 7 + 9 + 4 = 42.
  4. Take all the even digits. E.g. 1,  8,  3,  2,  9 and multiply each one of them with two. For our example the digits will become like 2, 16, 6, 4, 18.
  5. Wherever the number is greater than nine sum the digits of the number to make it a single digit number. The number s in step 4 will become like 2, 7, 6, 4, 9.
  6. Sum the numbers in the step 5. E.g. 2 + 7 + 6 + 4 + 9 = 28.
  7. Now sum the number in the step 3 and step 6. E.g. 28 + 42 = 70
  8. If the sum ends in zero then it can be a valid credit card number. In other words if the remainder of the Step 7 divided by 10 is zero than it can be a valid credit card number. So in our case 70%10 = 0 and hence it can be a valid credit card number.


Sample SQL Code for Luhn’s Test:

Here we have created a T-SQL function to verify the above test but it can be implemented in any other manner as well.


DECLARE
    @cardNo NVARCHAR(20) = '49927398716'
    ,@reverseCardNo NVARCHAR(20)
    ,@sum SMALLINT
    ,@number SMALLINT
    ,@counter SMALLINT

SELECT
    @cardNo = LTRIM(RTRIM(@cardNo))
    ,@reverseCardNo = REVERSE(@cardNo)
    ,@sum = 0
    ,@counter = 1

WHILE(@counter <= LEN(@cardNo))
BEGIN
    SELECT
        @number = TRY_CONVERT(SMALLINT, LEFT(@reverseCardNo, 1))

    IF(@counter%2 = 0) --EVEN NUMBER
    BEGIN
        SELECT
            @number *= 2
        IF (LEN(@number) > 1)
        BEGIN
            SELECT
                @number = TRY_CONVERT(SMALLINT, LEFT(@number, 1)) + TRY_CONVERT(SMALLINT, RIGHT(@number, 1))
        END
    END
   
    SELECT
        @sum += @number
        ,@counter += 1
        ,@reverseCardNo = RIGHT(@reverseCardNo, LEN(@reverseCardNo)-1)
END

IF(@sum > 0 AND @sum%10 = 0)
    PRINT CONCAT('Test passed. ', @cardNo, ' can be a potential card number.')
ELSE
    PRINT CONCAT('Test failed. ', @cardNo, ' invalid card number.')