THE SQL Server Blog Spot on the Web


Browse by Tags

All Tags » statistics
Showing page 1 of 5 (41 total posts)
  • Data Mining Algorithms – Naive Bayes

    I am continuing with my data mining and machine learning algorithms series. Naive Bayes is a nice algorithm for classification and prediction. It calculates probabilities for each possible state of the input attribute, given each state of the predictable attribute, which can later be used to predict an outcome of the predicted attribute based on ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on September 9, 2015
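    The counting idea in the excerpt above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the post: the function names, the sample attributes (gender, region, buyer), and the add-one smoothing are all hypothetical choices made for this sketch.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, target):
    """Count class frequencies and, per class, the frequency of each input state."""
    class_counts = Counter(row[target] for row in rows)
    # state_counts[(attribute, class)][state] = number of matching training rows
    state_counts = defaultdict(Counter)
    for row in rows:
        for attr, state in row.items():
            if attr != target:
                state_counts[(attr, row[target])][state] += 1
    return class_counts, state_counts

def predict(model, case):
    """Return normalized probabilities of each target state for one case."""
    class_counts, state_counts = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n in class_counts.items():
        score = n / total  # prior P(class)
        for attr, state in case.items():
            # All states seen for this attribute, across every class
            states = {s for (a, _), c in state_counts.items() if a == attr for s in c}
            seen = state_counts.get((attr, cls), Counter())[state]
            # P(state | class) with add-one smoothing (an assumption of this sketch)
            score *= (seen + 1) / (n + len(states))
        scores[cls] = score
    norm = sum(scores.values())
    return {cls: s / norm for cls, s in scores.items()}

# Hypothetical training data: two input attributes and one predictable attribute
rows = [
    {"gender": "F", "region": "EU", "buyer": "yes"},
    {"gender": "F", "region": "EU", "buyer": "yes"},
    {"gender": "M", "region": "EU", "buyer": "yes"},
    {"gender": "M", "region": "US", "buyer": "no"},
    {"gender": "M", "region": "US", "buyer": "no"},
    {"gender": "F", "region": "US", "buyer": "no"},
]
model = train_naive_bayes(rows, target="buyer")
result = predict(model, {"gender": "F", "region": "EU"})
```

    Here `result` maps each state of the hypothetical `buyer` attribute to its estimated probability for the new case. The smoothing keeps a state that was never seen with a given class from forcing that class's probability to zero.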
  • Data Mining Algorithms – Support Vector Machines

    Support vector machines are both supervised and unsupervised learning models: supervised for classification and regression analysis, and unsupervised for anomaly detection. Given a set of training examples, each marked as belonging to one of the categories, an SVM training algorithm builds a model that assigns new examples to one category. An SVM ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on June 23, 2015
  • Data Mining Algorithms – K-Means Clustering

    Hierarchical clustering can be very useful because it is easy to see the optimal number of clusters in a dendrogram and because the dendrogram visualizes the clusters and the process of building those clusters. However, hierarchical methods don’t scale well. Just imagine how cluttered a dendrogram would be if 10,000 cases were shown on ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on April 17, 2015
  • Data Mining Algorithms – Hierarchical Clustering

    Clustering is the process of grouping the data into classes or clusters so that objects within a cluster have high similarity in comparison to one another, but are very dissimilar to objects in other clusters. Dissimilarities are assessed based on the attribute values describing the objects. There are a large number of clustering algorithms. The ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on March 28, 2015
  • T-SQL Querying

    We are close to the publication day of the T-SQL Querying book. Of course, as always in this series, the main author of the book is Itzik Ben-Gan. This time, besides me, Adam Machanic and Kevin Farlee are the coauthors. The information I want to share now is that you can get a substantial discount if you preorder the book today, Monday, February ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on February 16, 2015
  • Geek City: Creating New Tools for Statistics

    I was just starting to work on a post on column statistics, using one of my favorite metadata functions, sys.dm_db_stats_properties(), when I realized something was missing. The function requires a stats_id value, which is available from sys.stats. However, sys.stats does not show the column names that each statistics object is attached ...
    Posted to Kalen Delaney (Weblog) by Kalen Delaney on January 27, 2015
  • SQL TuneIn Zagreb 2014 – Session material

    I spent the last few days in Zagreb, Croatia, at the third edition of the SQL TuneIn conference, and I had a very good time here. Nice company, good sessions, and awesome audiences. I presented my “Understanding Execution Plans” precon to a small but interested audience on Monday. Participants have received a download link for the slide deck. On ...
  • Fake statistics, and how to get rid of them

    There are two ways to test how your queries behave on huge amounts of data. The simple option is to actually use them on huge amounts of data – but where do you get that if you have no access to the production database, and how do you store it if you happen not to have a multi-terabyte storage array sitting in your basement? So here’s the second ...
  • Skewed Data - Poor Cardinality Estimates... and Plans Gone Bad

    The session Skewed Data, Poor Cardinality Estimates, and Plans Gone Bad by Kimberly Tripp (@KimberlyLTripp) has been published on the SQLPASS TV channel. Abstract: When data distribution is heavily skewed, cardinality estimation (how many rows the query optimizer expects each operator to process) can be wildly incorrect, resulting in ...
    Posted to Sergio Govoni (Weblog) by Sergio Govoni on February 14, 2014
  • Fraud Detection with the SQL Server Suite Part 5

    This is the fifth and final part of the fraud detection whitepaper. You can find the first part, the second part, the third part, and the fourth part in my previous blog posts about this topic. The Results: In the original fraud detection whitepaper I wrote for SolidQ, I was advised by my friends to include some concrete and simple numbers to ...
    Posted to Dejan Sarka (Weblog) by Dejan Sarka on January 6, 2014