From Database to Complete Data Platform

20071010 22:35:19   from: freedo


This past week, I was in Hong Kong in connection with Microsoft TechEd Hong Kong. Ron Jacobs and I gave the closing keynote where we highlighted the new wave of innovation in Microsoft’s Application Platform. I covered the Data Platform part of course, and I had the pleasure of giving the audience a quick glimpse into the many innovations coming to market in SQL Server 2008 (which is scheduled to be released next calendar year). We’ll talk more about SQL Server 2008 in a future posting, but if you are interested in learning more and playing with the bits right now, here’s the place to start.

In addition to speaking at TechEd, we also met with members of the local HK media and briefed them on SQL Server’s future direction as a product. And we met with some of our valued customers in the region – learning about the latest projects customers are undertaking on the Microsoft platform and the types of issues they need help with is always an educational experience.

But perhaps the most interesting aspect of the trip for me this time was the talks I gave at the University of Hong Kong (HKU) and at the Hong Kong University of Science and Technology (HKUST). The sessions were entitled From Database to Complete Data Platform, and the goal was to connect with students and faculty doing research in areas that we now broadly refer to as the “Data Platform” and give them an industry perspective on what we believe is a historic transformation of the field from its traditional roots in “Databases”.

The field of modern databases has been around for over 40 years. From the early work on Hierarchical and Network models, through Codd’s ground-breaking work on the Relational model, through the many innovations in the area of transactions, isolation levels, access methods, declarative query languages and query processing, cursors, APIs, and so on, the field has provided the basis for building the robust enterprise-scale mission-critical applications that drive much of today’s “Information Economy”. As such, when an average student in a university thinks about the field of “Databases” they may picture a mature, somewhat musty, field in which all the exciting problems were solved many years ago and all that remains are incremental advancements that wring the last bit of life out of a dying field.

How wrong they would be.

Over the last few years, a powerful confluence of trends – technology trends, consumer and business trends, application trends – has resulted in the broadening and redefinition of the Database field like never before, with more exciting problems to solve than at any previous time in the long history of the field. Let’s examine these trends briefly:

Technology Trends: Everyone is familiar with Moore’s law – processing capacity doubling every 18 months, first manifested as increasing MHz and now as multiple cores. The trends in hard disk storage capacity (and price) have actually been even more amazing. Just to give an example, hard disk prices per GB have dropped from about USD 40,000/GB in 1980 to about USD 0.5/GB today!!! Memory and flash memory prices are on an even steeper trajectory down, with capacity going up exponentially to match. At the same time, there is a tremendous proliferation of devices – cell-phones, PDAs, gaming devices, GPS devices and so on – all of which generate, store, process and send/receive/synchronize data at tremendous rates. And of course, the ubiquitous Internet has not only made possible new types of applications but also changed expectations around the characteristics of existing applications – more on this in a minute.

Consumer and Business Trends: Enabled by the technology trends mentioned above, there have been huge changes in how consumers and businesses relate to data and information. First, there is the sheer explosion of data – the amount of new data being generated, much of it born electronically, has been growing exponentially (ever notice how your hard disk, no matter how large, is always significantly full?). Email, documents, digital photos, music, video, streaming data from sensors, satellite imagery, all are part of this great data explosion. But it is not just about storing all this data – consumers and businesses want to derive value from it – being able to search, share, synchronize, analyze, visualize and manipulate the data so it becomes useful information – the notion of “Your Data, Any Time, Any Place”. And all this while ensuring the data is secure, privacy is protected, and all regulatory requirements – both external and internal – are met.

Application Trends: First, there was “batch processing” – basically an automation of human processes that had previously existed for centuries. Then came “OLTP” – Online Transaction Processing, which in many cases changed the way business was done – what used to take many days could now be done instantaneously. The OLTP architecture and underlying platform technology went through several iterations, but the core concept remained the same. As companies started to accumulate more and more data from these OLTP systems, they discovered an opportunity to gain a vital competitive advantage by analyzing this data and understanding their customers better – thus was born Business Intelligence, including Data Warehousing, Online Analytical Processing (OLAP), Reporting, Data Mining and so on. But today we live in a Web 2.0 world – applications surface through a variety of end-points (rich client, browser, device, laptop/desktop, …), they seamlessly bring together data from a variety of data sources, and they provide a variety of rich services including query, search, analysis (increasingly real-time analysis), reporting, visualization, and so on. And they do all this while operating at unprecedented levels of scale, reliability and security.

A Complete Data Platform

The changing trends described above are driving a fundamental transformation of our field – from just Databases to what we now call a Complete Data Platform. This Data Platform builds upon and extends the concept of a Database in three different dimensions:

All Data: It is no longer sufficient for the Data Platform to be able to store and manipulate words and numbers as databases have done for a long time now. A Complete Data Platform must be able to work with all kinds of data – including text, XML, Objects, documents, files, streaming data from sensor networks, and any kind of user-defined data. And it must be able to provide the appropriate services for each type of data – storage, indexing, query etc.

All Tiers: Gone are the days when a Database ran only on a “server”. Today, a Complete Data Platform must provide data services across the entire spectrum of hardware tiers – from phones and mobile devices, to laptops, desktops, servers, server farms and finally cloud-scale mega-service infrastructure. And it must do so with seamless interoperability for data and applications across these tiers.

All Services: The range of services on data is no longer restricted to store, query, backup, restore and a few other verbs. A Complete Data Platform must provide a broad range of services including those and others – search, cache, synchronize, analyze, mine, integrate, report, visualize, secure, audit, archive, … In short, it must service the entire data life-cycle, from birth to archival.

We can drill into a lot more details (and we likely will in future posts) but that in a nutshell is the scope of what we call a Complete Data Platform – one that handles all data, on all tiers, and provides all relevant services on that data. And does so while maintaining consistency in some vital dimensions – for example in the data model, security model, management model, data access APIs, development tools, etc. And of course it goes without saying it does all this with high performance, rapid time to solution, and low TCO. Quite simple – isn’t it? :)

Our Opportunity

I hope the discussion above gives you a sense of the unprecedented opportunity for innovation in the Data Platform space. At no other time in the history of the field of modern Databases has there been such a wide array of technical challenges, such a broad canvas on which to paint. If you are a student in university – as were some of the bright minds I met in Hong Kong this last week – this is a time of unprecedented opportunity for you. The spectrum of problems to pick from is vast and varied. And not just for “Database” majors – there are interesting problems in this space for those with an interest in virtually any aspect of computer science – computer architecture, networking, programming languages, data mining, XML, search, visualization, web-scale computing, semantic web – the list is quite long. The database field has always been one where people easily spend a long time – it is not unusual to find people who have built multi-decade careers, indeed spent their entire professional lives, in this space. The opportunity to do so, if you choose to, is even more compelling today. After all, we live in the information age – this is our time.

Until next time - cheers,

Prakash
Posted Monday, September 03, 2007 5:59 AM by prakas