The fourth shift in the evolution of seeking insights from data

The next generation of data analytics needs real-time analytics databases that combine CRUD with streams, delivering high concurrency and sub-second response times across billions of data points.

Using data to gain insights helps drive the success of nearly any organization, plain and simple. The benefits include getting the right products to the right people, leveling the playing field, understanding risks, helping people find what they prefer, and a plethora of other compelling results.

In the world of data analytics, some would say there have been three major shifts in finding insights from data, and now we’re seeing a fourth.

Create a CRUD

It all started with Codd’s creation of the relational database. For some time, hierarchical and network models had focused on automating legacy processes that were once done with pens, paper, and mechanical calculators. In 1970, Dr. Ted Codd of IBM published “A Relational Model of Data for Large Shared Data Banks”, which marked the beginning of a new era of data. Relational databases became the basis for the data revolution of the 1980s and 1990s, introducing the tables of rows and columns we still use today.


Codd’s idea inspired another group at IBM to develop SQL, which made it much easier to get data into and out of databases. SQL was adopted by many groups around the world, and a new wave of relational databases emerged.


Simply put, relational SQL is CRUD (creating, reading, updating, and deleting data), and it was revolutionary because it made large data sets practical at a time when computing and storage were very expensive. CRUD helped cut these costs by bringing together a set of tools that store data more efficiently by breaking it into many smaller tables, a process Dr. Codd called normalization.
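To make the idea concrete, here is a minimal sketch of the four CRUD operations against a normalized schema, using Python's built-in sqlite3 module. The customers and orders tables and their columns are assumptions made up for illustration; they do not come from any particular system.

```python
import sqlite3


def crud_demo():
    """Run create, update, delete, and read against two normalized tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")

    # Normalized layout (hypothetical schema): the customer name is stored
    # once, and each order points at it by id instead of repeating it.
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id),"
        " amount REAL NOT NULL)"
    )

    # Create
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Acme')")
    conn.execute("INSERT INTO orders (id, customer_id, amount) VALUES (10, 1, 25.0)")
    conn.execute("INSERT INTO orders (id, customer_id, amount) VALUES (11, 1, 40.0)")

    # Update: the name changes in exactly one row, however many orders exist.
    conn.execute("UPDATE customers SET name = 'Acme Corp' WHERE id = 1")

    # Delete
    conn.execute("DELETE FROM orders WHERE id = 11")

    # Read: a join reassembles the pieces that normalization split apart.
    rows = conn.execute(
        "SELECT c.name, o.amount"
        " FROM orders AS o JOIN customers AS c ON c.id = o.customer_id"
        " ORDER BY o.id"
    ).fetchall()
    conn.close()
    return rows
```

Calling crud_demo() returns the one remaining order joined with the updated customer name. The join in the read step is the price of normalization: storage is saved, but queries have to stitch the tables back together.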

This made data management more complex, which meant developers spent more time working with the data. But when a gigabyte of storage costs as much as five years of developer time, the complexity is worth the price.

The CRUD standard allowed for a new level of complex data questions to be asked, such as “What are my most profitable products, and how have they changed over the past few quarters?”
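As a rough illustration, the question above can be expressed as a single grouped SQL query. The sketch below uses Python's built-in sqlite3 module; the sales table, its columns, and the quarter labels are hypothetical, and a single flat table is assumed to keep the example short.

```python
import sqlite3


def profit_by_quarter(sales):
    """Return (product, quarter, profit) rows, most profitable first per quarter.

    `sales` is an iterable of (product, quarter, revenue, cost) tuples.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical table invented for this example.
    conn.execute(
        "CREATE TABLE sales (product TEXT, quarter TEXT, revenue REAL, cost REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", sales)

    # Profit per product per quarter, ordered so that each quarter lists
    # its most profitable product first.
    rows = conn.execute(
        "SELECT product, quarter, SUM(revenue - cost) AS profit"
        " FROM sales"
        " GROUP BY product, quarter"
        " ORDER BY quarter, profit DESC"
    ).fetchall()
    conn.close()
    return rows


# Made-up sample data for illustration only.
SAMPLE_SALES = [
    ("widget", "2023Q1", 100.0, 60.0),
    ("gadget", "2023Q1", 80.0, 30.0),
    ("widget", "2023Q1", 50.0, 20.0),
    ("widget", "2023Q2", 120.0, 60.0),
    ("gadget", "2023Q2", 90.0, 50.0),
]
```

Running profit_by_quarter(SAMPLE_SALES) yields one row per product and quarter, so the change in profitability from one quarter to the next can be read straight off the result.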

Analytics databases need to store data in an analytics-friendly layout, with the data partially denormalized, which means fewer, larger tables. It was soon discovered that using the same copy of the data for both transactions and analytics made both perform poorly. Developers started keeping a second copy of the data in a second installation of the database software.

Devices that use CRUD

As analytics developed, a new wave of devices appeared. These devices used relational CRUD but incorporated new classes of software to extract data from transactional systems, transform it to fit a different CRUD schema, and load it into analytical databases. In response, business intelligence tools emerged to turn data into visualizations and reports that people can use more easily.

The Internet radically transformed the data ecosystem and increased the amount of data created and used. In the 1990s, a “big app” might have meant 5,000 users and a 1 TB data warehouse. By the early 2000s, the “big apps” were social media giants supporting millions of users. It was soon discovered that pushing that much data through CRUD pipelines was both expensive and limiting.

CRUD to cloud

A new era of analytics databases arrived to deal with larger data sets. Many believed that these databases would change data warehousing by connecting the new realities of the Internet to an outdated CRUDdy infrastructure. The Internet is also what prompted the creation of the cloud, which has completely changed the way data is handled.

The cloud made nearly unlimited, cheap computing power available, along with affordable on-demand storage. Analytics was re-engineered and rebuilt around it.

On-premises applications were limited in capacity. Infrastructure and software licenses were expensive, and increasing capacity took time and money. In the cloud, computing can be added and removed on demand, and storage is durable and cheap. Suddenly, analytics became more scalable and less expensive than ever. New ecosystems of cloud data warehouses, cloud data pipelines, and cloud visualization and analytics tools redefined data management.

Cloud computing has inspired rapid app growth, giving average businesses, not just internet giants, the ability to run apps that support millions of users. But they, too, have found that pushing this massive amount of data through CRUDdy pipelines is inefficient and expensive.


As early as 2010, data engineers were struggling with this problem. They wanted to know how to have interactive conversations with large volumes of data. Data flows in from the Internet and other applications, so why not analyze the flow directly instead of loading it all into relational CRUD?

The need for a fourth shift

There is a growing need for technology that uses both streaming and historical data. That calls for a database with sub-second response times for questions that span billions of data points, drawing on data in streams as well as in historical datasets. Concurrency is also important, because hundreds of people may be asking questions at the same time. It must also be affordable: the cost has to be matched by tangible value.
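The core idea of answering one question from both historical data and a live stream can be sketched in a few lines of Python. This is a toy, in-memory illustration under stated assumptions, not how any real-time analytics database is implemented; the HybridCounter class and its method names are hypothetical.

```python
from collections import Counter


class HybridCounter:
    """Toy store: pre-aggregated historical counts plus fresh stream events."""

    def __init__(self, historical):
        # Historical data arrives already aggregated, e.g. {"page_a": 1000}.
        self.historical = Counter(historical)
        # Stream events that have not yet been folded into the aggregates.
        self.recent = []

    def ingest(self, key):
        """Accept one event from the stream; it is queryable immediately."""
        self.recent.append(key)

    def count(self, key):
        """Answer a query by combining historical and just-arrived data."""
        return self.historical[key] + sum(1 for k in self.recent if k == key)

    def compact(self):
        """Fold buffered events into the historical aggregates."""
        self.historical.update(self.recent)
        self.recent.clear()
```

For example, after creating HybridCounter({"page_a": 1000}) and ingesting one more "page_a" event, count("page_a") reflects the new event right away, without waiting for a batch load. A production system would add persistence, indexing, and concurrency control, none of which is attempted here.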

Storage and computing cost money, but in modern development this cost is much less than the developer’s time. Consider the developer’s salary, benefits, equipment, and management. It’s easy to see that developer time comes at a much greater cost.

From CRUD to modern

From the beginnings of relational databases through the data warehouse era, the need for a modern approach has kept growing. CRUD has been the foundation of data analytics until now, but modern streaming data needs a different architecture to succeed. Real-time analytics databases that combine CRUD with streams, delivering high concurrency and sub-second response times across billions of data points, help developers get ready for the next generation of data analytics. Developers need to recognize when it is time to move beyond CRUD alone and embrace a modern approach.
