Thursday, May 28, 2015

Webinar - June 4, 2015: ANALYTICS DASHBOARD DESIGN by Senturus

The Science Behind Effective Dashboard Design

Learn how to strategically design analytics dashboards to bring actionable insights to light. During this webinar, Senturus Co-Founder John Peterson will explore the science behind creating effective dashboards, including the challenges, opportunities, and human factors that come into play. Using this science as the basis, John will give his top 13 tips to optimize dashboard visualizations.

PRESENTER
John Peterson
CEO and Co-Founder
Senturus, Inc.


John is the company's thought leader and visionary. John directs the delivery of all projects with Senturus, providing the bridge of technical and business understanding.


Thursday, June 4, 2015
10 AM PT/1 PM ET
90 minutes

Saturday, May 9, 2015

SSRS: Can Reporting Services handle more than 210 columns??? YES IT CAN!!!

Hi Everyone, this is not one of my more interesting posts but it certainly answers a question that's bothered me in recent weeks. Can SSRS handle rendering over 210 columns??? Yes, 210! 

Reluctantly, I went about doing this enormously tedious exercise. But what worried me was whether Reporting Services "could actually do it, whether it could handle it". So I began searching on the net on any such limitations or someone who had encountered such an issue. Interestingly, I couldn't find anyone who had posted a similar kind of question. Maybe, I was the first, I thought. So I was forced to knuckle down and do the did. 

Lo and behold, Reporting Services can do it!!! It can handle a report with over 210 columns. I don't know about anything greater than 211, but I can certainly tell you that it handle 211!

So to my fellow SSRS Developers out there, see if you can beat that!!! :)


Thursday, April 23, 2015

Kimball University: Three ETL Compromises to Avoid

Greetings,

It's been a few years since I've posted on this blog (certainly something I'm not proud about). But I've been quite busy since then, have grown up a little, had my first kid (his name is Quilz too) and have moved to a different city, province and to a different company all together. I guess you can say things have for changed for Quilz in a major way :). 

Talking about moving companies, a colleague of mine has been charged with implementing SCD 2(s) on one of the first dimensions in our Enterprise Data Warehouse (EDW). This is after we, and I've include myself in this even though I joined the company after the initial EDW project roll out, have completed the first phase of our EDW roll out. He describes what he's going through as a "painful" and a "nightmare". So with out wasting anymore of your time let's get into the "meat" of these compromises that could be the undoing of your EDW initiative. 

Please note, I've copied this article as it is, from the Kimball Group website, so as not to dilute what the experts say. I've added the above paragraph as some sort of testimony to what the experts say. So here goes...
© Kimball Group. All rights reserved.

Whether you are developing a new dimensional data warehouse or replacing an existing environment, the ETL (extract, transform, load) implementation effort is inevitably on the critical path. Difficult data sources, unclear requirements, data quality problems, changing scope, and other unforeseen problems often conspire to put the squeeze on the ETL development team. It simply may not be possible to fully deliver on the project team’s original commitments; compromises will need to be made. In the end, these compromises, if not carefully considered, may create long-term headaches.
In my last article on “Six Key Decisions for ETL Architectures,” I described the decisions ETL teams face when implementing a dimensional data warehouse. This article focuses on three common ETL development compromises that cause most of the long-term problems around dimensional data warehouses. Avoiding these compromises will not only improve the effectiveness of your ETL implementation, but will also increase the likelihood of overall DW/BI success.

Compromise 1: Neglecting slowly changing dimension requirements

Kimball Group has written extensively on slowly changing dimension (SCD) strategies and complementary implementation alternatives. It’s important that the ETL team embrace SCDs as an important strategy early in the initial implementation process. A common compromise is to put off to the future the effort required to properly support SCDs, especially Type 2 SCDs where dimension changes are tracked by adding new rows to the dimension table. The result is often a total rework disaster.
Deferring the implementation of proper SCD strategies does save ETL development time in the immediate phase. But as a result, the implementation embraces only Type 1 SCDs, where all history in the data warehouse is associated with current dimension values. Initially, this seems to be a reasonable compromise. However, it’s almost always more difficult to “do it right” when you have to circle back in a later phase. The unfortunate realities are that:
  • Following a successful initial implementation, the team faces pressure to roll out new capabilities and additional phases without time to revisit prior deliverables and add the required change-tracking capabilities. Thus, the rework ultimately required to support SCD requirements continues to expand.
  • Once the ETL team finally has the bandwidth to address SCD, the ugly truth becomes apparent. Adding SCD Type 2 capabilities into the historical data requires rebuilding every dimension that contains Type 2 attributes; each dimension will have to have its primary key rekeyed to reflect the new historically appropriate Type 2 rows. Rebuilding and rekeying even one core conformed dimension will unavoidably require reloading all impacted fact tables due to the new dimension key structures.
  • Facing a possible rebuild of much of the data warehouse environment, many organizations will back away from the effort. Rather than reworking the existing historical data to restate the dimension and fact tables in their correct historical context, they implement the proper SCD strategies from a point-in-time forward. By compromising the implementation of proper SCD techniques in the initial development process, the organization has lost possibly years of important historic context.
Compromise 2: Failing to Embrace a Metadata Strategy

DW/BI environments spin off copious amounts of metadata. There is business metadata, process metadata, and technical infrastructure metadata that all needs to be vetted, captured and made available. The ETL processes alone generate significant amounts of metadata.
Unfortunately, many ETL implementation teams do not embrace metadata early in the development process, putting off its capture to a future phase. This compromise typically is made because the ETL team does not “own” the overall metadata strategy. In fact, in the early stages of many new implementation efforts, it’s not uncommon for there to be no designated owner of the metadata strategy.

Lack of ownership and leadership makes it easy to defer dealing with metadata, but that’s a short-sighted mistake. Much of the critical business metadata is identified and captured, often in spreadsheet form, during the dimensional-modeling and source-to-target mapping phases. What’s more, most organizations use ETL tools to develop their environment, and these tools have capabilities to capture the most pertinent business metadata. Thus, the ETL development phase presents an opportune moment — often squandered — to capture richly described metadata. Instead, the ETL development team only captures the information required for their development purposes, leaving valuable descriptive information on the cutting room floor. Ultimately, in a later phase, much of this effort ends up being redone in order to capture the required information.
At a minimum, the ETL team should strive to capture the business metadata created during the data-modeling and source-to-target mapping processes. Most organizations find it valuable to focus initially on capturing, integrating, flowing, and, ultimately, surfacing the business metadata through their BI tool; other metadata can be integrated over time.

Compromise 3: Not Delivering a Meaningful Scope

The ETL team is often under the gun to deliver results under tight time constraints. Compromises must be made. Reducing the scope of the initial project can be an acceptable compromise. If, for example, a large number of schemas was included in the initial scope, one time-honored solution is to break that effort up into several phases. It’s a reasonable, considered compromise assuming the DW/BI project team and sponsors are all fully, if not grudgingly, on board.
But it’s a problem when the ETL team makes scope compromises without proactively communicating with the DW/BI project team and sponsors. Clearly, this is a recipe for failure and an unacceptable compromise.

This situation is often a symptom of deeper organizational challenges. It can start innocently enough, with shortcuts taken under pressure in the heat of the moment. In retrospect, however, these compromises would never have been made in the full light of day. In an effort to achieve overly ambitious deadlines, the ETL team might fail to handle data quality errors uncovered during the development process, fail to properly support late arriving data, neglect to fully test all ETL processes, or perform only cursory quality assurance checks on loaded data. These compromises lead to inconsistent reporting, an inability to tie into existing environments, and erroneous data and often lead to a total loss of confidence among business sponsors and users. The outcome can be total project chaos and failure.

Make Compromises Openly and Honestly

Compromises may be necessary. The most common concession is to scale back an overly ambitious project scope; but key stakeholders need to be included in this decision. Other, less intrusive changes can be considered, such as reducing the number of years of back history used to seed a new environment, reducing the number of dimension attributes or number of metrics required in the initial phase (while being careful about SCD Type 2 requirements), or reducing the number of source systems integrated in the initial phase. Just keep everyone informed and on the same page. The key is to compromise in areas that do not put the long-term viability of the project at risk.

Tuesday, June 12, 2012

ETL Performance Auditing - Part 1: Introduction to ETL Auditing by Frank A. Banin - SQL ServerCentral

Greetings,

I would like to share this great post by Frank A. Banin at SQL ServerCentral. It's a 3 Part series and only the first 2 parts have been posted so far. I'm yet to read the second part but if the first is anything to go by, it's definitely worth a read. 

This is not a posting of the entire article but I'm merely spreading the gospel. :) So here is the link to part 1ne of the article: ETL Performance Auditing - Part 1: An Introduction to ETL Auditing.

P.S. There's a lot of great SQL(BI) Material on SQL ServerCentral so please look around the website and read a little, you could even join in and be a member. I promise you, you won't regret it. If you do, fine. I probably could do much to brighten you mood anyways...

Anyways, that's it for now. Have a gr8 1ne!!!
 

Monday, May 21, 2012

SSRS: To Stored Procedure or To Embedded T-SQL? That is the question!

Hi Everyone,

I've always had this question and being a staunch believer in the way I do things, I thought its best that I put my views out there. I know that someone out there may disagree with me, just like my colleague who always argues a contrary view to my logic, that's fine, I was never the star in my class anyways but I reckon my reasons are legitimate and a lot of other SQL experts agree with me when it comes to this 1ne. I've taken some of the points from other bloggers and have added them to this post. I'll add their links at the end, so as to not infringe on the copyright issue. This is will be in point form so as to not bore you with long "paragraphic" reading. Anyways, enough about what people think, let's get to the business part of this blog, but before we do, let's go back to the question again; is it better to use a Stored Procedure for your SSRS Report Dataset or should 1ne stick with Embedded SQL instead?

Stored Procedures vs Embedded SQL

Stored Procedures as a Dataset Source for an SSRS Report
Pros
  • Performance:  the ability for the stored procedure to reuse a query plan (stored procedure cache)
  • DBA can tune more effectively, if needed
  • Permits the DBMS to secure the object, if needed
  • Centralization of queries into views is often preferable to DBAs because the queries are more transparent to them
  • Provides a layer of abstraction between the database tables & the report (for example: you can alias column names, create derived fields, minimize the effect of other physical object changes, or show less fields to simplify the reporting process)
  • Provides an additional layer of security since "Execute" permissions are required for either the user running the report, or an impersonation account
  • Ability to query system tables to find usage of tables & columns (which may help with change management)
  • For a minor change, permits the ability to alter the stored procedure without requiring the RDL to be redeployed
  • Refactoring the database is easier
  • Unit testing of the code is much easier. You can easily build tests to just call the stored procedures. While possible via the web service interface, it's much harder to test the reports directly and requires a different skill set
  • It allows the UI to be built by one person and the stored procedures to be built by another
  • Benefits on procedure cache efficiency
  • The same stored procedure may be used on multiple reports.

Cons
  • Harder to manage security
  • Need “Create” permissions on the source database or need another person to create the stored procedure for you
  • Slightly more knowledge is required to create a stored procedure than a simple select statement
  • Can clutter up the database with quite a few simple queries and/or redundant queries
  • Additional handling is needed to parse multi-valued parameters in SSRS
  • Two-step deployment (the stored procedure, and/or the RDL); this creates an opportunity for error if not deployed concurrently
  • Additional testing of a changed stored procedure & the effect of the change on the report (which may take slightly more time since they are separate)
  • May require additional personnel / time / coordination of efforts if the stored procedures are maintained & enhanced by staff other than the reports (for example, if a field changes, or a new parameter is requested)
  • Report may break when schema changes
  • Difficult to make changes to embedded TSQL
  • Causes procedure cache to bloat

Embedded SQL in SSRS Dataset
Pros
  • Easy (less syntax for a beginner to learn)
  • No “Create” permissions needed on the source database
  • One-step deployment (unless you are using a Shared Dataset, which is stored outside of the RDL)
Cons
  • For large datasets, may not perform as well as a stored procedure
  • Greater possibility that individual report queries are redundant or handling logic in different ways (a great way to combat this is to use Shared Datasets, a new feature in SQL Server 2008 R2)
  • SSRS Query Designer doesn’t retain formatting well
  • SSRS Query Designer removes comments (a big shortcoming in my opinion)
  • SSRS Query Designer renames aliases in some circumstances
  • The report developer may need additional permissions to database objects to construct the queries (i.e., if direct table access is being utilized instead of views)
  • Need to open the report in BIDS (or Report Builder) in order to make a change
  • Much more difficult to monitor the database objects being accessed by these queries (i.e., not as easy as querying the system tables with a stored procedure)

Please note that this is not the be all and end all to this design decision but it should provide one with valid reasons why they would choose one way over the other. I do however advise a combination of the 2, that is, were 1ne wants to populate the report's parameters, for example, they would rather use embedded T-SQL instead of a Stored Procedure and were 1ne wants to populate the report's main dataset, they would a use Stored Procedure over embedded T-SQL.

That's it for this session but before we go, here's a list of the references for this article. There are some clever people that write about the wonders of SQL and I've gained a lot from reading their blogs. So go ahead and have a look at what they do.

References:

Wednesday, April 11, 2012

Time to start blogging again!

Good Day Everyone,

I've recently been having numerous conversations with my housemate (@kulilegobingca) about blogging, or in our case not blogging. These conversations were not IT related by any means but they reminded me of the reasons for starting this blog site in the first place. In fact, more so than anything, he made me realize the selfishness that goes with NOT publishing/sharing information that may, OR may not, be relevant to the masses (even if it is just 1ne* person). In fact it is the cardinal sin many of my fellow beings fall victim to. This was further perpetuated when a stumbled upon the quote, which my African friends my relate to; which says: "Oppression can only survive through silence!"


I've therefore taken an oath not to fall victim to this sickness again, I shall blog, blog, blog; and will continue to encourage others to do the same. 

My Focus
I'll be focusing on most things BI, that is Business Intelligence (that's my field by the way), if I stray a bit off the topic, please don't blame me. I'm still a young man and would like to keep my blog interesting.

I'll also do my best to post articles I read from other sources. I PROMISE NOT TO PLAGIARIZE, BUT I WAS NEVER THAT GOOD AT ADDING REFERENCES, THAT STEMS FROM MY HIGH SCHOOL DAYS and remember I'm still a young man. 



Note
*1ne = one (This is how I Like to write my ones!!! ;)

Monday, May 31, 2010

Whats New in SQL Server 2008

Greetings guys,

Welcome to my first post. It's not really mine actually but i thought I'd share this valuable article with you.

This article will highlight some of the new features and benefits found in SQL Server 2008. Follow the links below.

What’s new in SQL 2008 Part 1
What’s new in SQL 2008 Part 2
What’s new in SQL 2008 Part 3

Till next time!!!