Hello and welcome to our latest blog post here on BIDA Brains.
I am pleased you have come back to see our latest blog post.
Thank you.
Just to be clear in this opening.
This blog post was written by Peter Jones.
Peter is the Professional Services Manager here at BIDA.
In the past we put these blog posts through voice generation software.
Now we have decided that I will read the blog posts into our videos because I am the Business Development Manager here at BIDA.
I would love to get to know you if you are interested in what we have to say.
One of the reasons Peter wrote this series of blog posts is that he was talking with people on LinkedIn about Data Vault.
He had personally known about Data Vault since about 1996 or 1997.
Something like that.
However, Data Vault has “suddenly” become “popular”.
So, he thought he would ask some people about Data Vault, and read some books about Data Vault 2.0.
Peter decided to do a number of blog posts on the topic of Data Vault.
He is creating these blog posts just to set the public record straight.
His comment is that he saw a lot of LinkedIn comments about Data Vault that are just ridiculous.
It’s almost like some people have taken on Data Vault as a religion.
By setting the public record straight, some people who are such enthusiasts about Data Vault might stop to wonder if they are barking up the right tree, as Australians say.
But in this first blog post he wanted to talk about the cost benefit to building data warehouses.
The LinkedIn discussions say woefully little about the benefits of data warehousing.
He will address some of the benefits of data warehousing in later blog posts.
In this blog post he just wanted to comment on the costs of building data warehouses.
So.
On with the subject of the blog post which is the cost of building a data warehouse.
While he was still on LinkedIn, Peter put up a post outlining the publicly known cost of building a data warehouse based on a version of data models similar to the Sybase Industry Warehouse Studio Models.
Peter invited those who are Data Vault enthusiasts to publish a publicly known case study of similar size and complexity.
There was no one willing to publish such numbers.
That should concern anyone considering a Data Vault implementation.
If your vendor is telling you.
“We really don’t know how much it costs to implement a data warehouse”.
Then you should be very concerned about buying from them.
Why?
If your vendor has experience in your industry segment, and has the appropriate suite of tools, they should be able to tell you pretty much exactly how much your data warehouse is going to cost you.
Don’t believe me?
I am aware Sean Kelly and Associates provided proposals that were priced as “variable capped price” projects.
Meaning, Sean would provide a maximum price the project would cost.
If he ran over?
He would eat the difference.
If he ran under?
He would give back the money saved.
Now, how could Sean Kelly do that more than 12 years ago?
Because Sean Kelly knew what he was doing.
That’s how.
If Sean Kelly was giving “variable capped price” projects for multi-billion dollar telcos 12 years ago and your vendor is not doing so today?
That should be a red flag to you.
People like Peter, who have been around for thirty-plus years, know how much it costs to build a data warehouse.
We can tell you exactly how much we will charge you.
And we will stick to it.
We have this down to an art form now.
Sure, we will include some risk if you want a fixed price.
But we won’t charge you one cent more than the fixed price.
Now, having said all that?
Please allow me to give you an example.
This is a public example that was released many years ago before Sean Kelly passed away.
The customer was Talk Talk in the UK.
It was owned by Carphone Warehouse at the time.
It was a landline telco with four million subscribers.
Talk Talk and Sean Kelly released a promotional video with Netezza.
The numbers for the project were presented at the Netezza Users Conference in 2009.
To set the scene, the billing system was Single View.
A widely used Telco Billing System even today.
The CRM was Chordiant.
A widely used CRM at the time.
The proposal was to move every field from Single View, Chordiant, and the Network Management System, into the Telco Data Warehouse Models that Sean Kelly was promoting.
This was reported to be about 4,000 fields.
The implementation date was set about a year out.
Interested vendors were asked to provide proposals.
The proposal from Sean Kelly and Associates came in at less than 50% of the cost of the next least expensive vendor.
The overall project took 8 months, from SKA arriving on site, to going into production.
The piece I want to isolate is the one where the data models provided by Sean Kelly were customized and the ETL to populate the models was written.
These two areas bear the best comparison between using dimensional models and using Data Vault models.
So that you can do an “apples” to “apples” comparison, I want to be clear that what is being discussed is:
- Building the Staging to SKA Data Models ETL.
- The customization of the Models as part of the development of the ETL.
In a Sean Kelly project this work is done by one person who usually has a DBA 50% of the time to make database changes.
In this case the database was Netezza, which was much easier to use, so only a small amount of DBA time was used to solve specific Netezza problems.
Netezza is much easier to use than Oracle.
According to Sean Kelly, there were 75 dimension tables and 55 fact tables delivered in the project.
The elapsed time for the customization of the models and writing of the ETL was 4 work months.
Now, Sean did point out that the guy who did the work also worked weekends and long nights.
It was the habit of Sean Kelly and Associates consultants to work very long hours.
It was part of the deal.
So, maybe, we could call it 5 work months for more reasonable hours.
Something like that.
So, let’s call that one hundred thousand pounds for the consulting time.
Twenty thousand pounds per month consulting fees sounds about right for 2009.
The data models were sold to Talk Talk for eighty thousand pounds.
So.
The cost for a very advanced dimensional data warehouse, for a three billion pounds sterling Telco with 4 million subscribers, covering the Billing System, the CRM System, and the Network Management System, for a total of about 4,000 data fields mapped, was approximately one hundred and eighty thousand British pounds.
Please remember this is not the total cost of the whole project.
There was more work done.
We are only talking about one specific portion.
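For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It uses only the figures quoted above. The monthly fee is our own estimate for 2009, not a published rate, and the cost per field is simply derived from the other numbers; it was not part of the original presentation.

```python
# Figures quoted in this post (2009, British pounds).
WORK_MONTHS = 5                   # 4 elapsed months, adjusted up for the long hours worked
MONTHLY_CONSULTING_FEE = 20_000   # our estimate of a 2009 rate, an assumption
DATA_MODEL_PRICE = 80_000         # price the data models were sold for
FIELDS_MAPPED = 4_000             # "about 4,000 fields", so treat as approximate


def portion_cost(work_months: int, monthly_fee: int, model_price: int) -> int:
    """Cost of model customization plus staging-to-model ETL, nothing else."""
    return work_months * monthly_fee + model_price


total = portion_cost(WORK_MONTHS, MONTHLY_CONSULTING_FEE, DATA_MODEL_PRICE)
cost_per_field = total / FIELDS_MAPPED  # derived figure, for comparison only

print(f"Consulting: {WORK_MONTHS * MONTHLY_CONSULTING_FEE:,} pounds")
print(f"Total for this portion: {total:,} pounds")
print(f"Approximate cost per mapped field: {cost_per_field:.2f} pounds")
```

A vendor quoting for a project of similar size could be asked to fill in the same three inputs, which makes the two proposals directly comparable.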
Now.
Those of you who have been around data warehousing for a long time know that the number quoted is stunning.
Further, that was 13 years ago.
We can do that same amount of work much faster now.
Because we are based in Romania, we can also do it much cheaper than the twenty thousand pounds per month consulting rate we have estimated.
To be clear.
We are not selling projects as described above.
We don’t generally do custom-built data warehouses any more.
But if we were to do such a project, it would be much cheaper than one hundred and eighty thousand pounds for this portion of the project.
The question that this post is putting in the public is this.
What is the quote from someone who is an “experienced Data Vault consultant”?
We don’t know.
We asked the question in public and we didn’t get an answer.
That, in and of itself, raises questions for us.
Any consulting company that knows what it is doing can tell you how much time it would need to build a data warehouse of similar size and complexity.
In fact, as long ago as the late 90s Peter and his colleagues standardized on 1 work month per 1,000 fields being mapped to the data warehouse.
They were able to hit that mark pretty closely for 20 years.
They saw no reason to go any faster, because at 1,000 fields mapped per work month, they were already much faster than everyone else and so they could win deals.
The first project, that I am aware of, that hit that magic 1,000 fields mapped per work month, happened in 1997.
Today, we have achieved rates of up to 400 fields mapped per day on a good day.
And less than 100 fields mapped per day on a tougher day.
When we are working on complex, compound measures tables, those numbers are not applicable of course.
But there are generally only a small number of very complex fact tables.
Today we are in the region of six thousand to eight thousand fields mapped per work month on a 1.0 data warehouse.
Today the bigger problem is learning the database.
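To show how these mapping rates turn into a time estimate, here is a rough sketch in Python. The function name is ours, purely for illustration. It assumes a simple linear relationship between fields and work months, which does not hold for the small number of very complex fact tables and ignores the time spent learning the database.

```python
def work_months_needed(fields: int, fields_per_work_month: int) -> float:
    """Linear estimate: fields to map divided by the mapping rate."""
    return fields / fields_per_work_month


FIELDS = 4_000  # size of the telco example in this post, approximate

late_90s = work_months_needed(FIELDS, 1_000)    # the 1,000 fields per work month standard
today_slow = work_months_needed(FIELDS, 6_000)  # lower end of today's range
today_fast = work_months_needed(FIELDS, 8_000)  # upper end of today's range

print(f"At 1,000 fields per work month: {late_90s:.1f} work months")
print(f"At 6,000 fields per work month: {today_slow:.2f} work months")
print(f"At 8,000 fields per work month: {today_fast:.2f} work months")
```

At the late 90s standard the 4,000 field example comes out at four work months, which lines up with the elapsed time reported for the 2009 project.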
So, this is what we wanted to put into the public for the consideration of whoever wants to read the blog post.
We wanted to raise the warning.
If your proposed consultants are not very sure about how long the construction of the underlying data warehouse will take?
You should be very wary.
That includes consultants who are selling you dimensional models as well as data vault models.
If the consultant selling you the project knows his stuff?
He can tell you how much it will cost, how long it will take, and he will have a large amount of collateral, in tools and technologies, to prove to you that he knows what he is talking about.
If you buy from a consulting firm that is very hesitant about how much they might charge you?
Then at least you were warned.
One last point before I close this blog post.
If you have an approved budget for a data warehouse build, or replacement, and you have a proposal that you have decided you want to go with?
Then we would be pleased to review it for you, under a Non-Disclosure Agreement, and give you our opinion.
We will do this for free because we would rather see successful projects than unsuccessful projects.
It will not be an in-depth review.
It will be a review where we will tell you if the project is doomed or likely to succeed.
As simple as that.
From our experience we can usually tell if a project will fail with less than 10 minutes spent reading the proposal.
A doomed project can usually be spotted just from the proposed project plan.
If the project plan makes sense, then it can take an hour or two to confirm the person who wrote the proposal knows what they are doing.
But it is never necessary to read a proposal for 3 hours to know if the person writing it knows what they are doing.
So, for free, we are willing to give you our opinion on whether the proposal is likely to be successful or not.
What you do with our opinion is up to you.
We can’t do much more than that.
As I said.
We are not in the business of building custom data warehouses any more.
We might do some under some circumstances.
But we are building data warehouses as a product now.
In finishing.
Thank you very much for your time and attention.
We really appreciate you dropping by to read our blogs.
Best Regards.
Mihai Neacsu.
Business Development Manager.
The BIDA Team.