Comments on: The World Has Changed – Why Haven’t Database Designs? https://www.nextplatform.com/2021/03/25/the-world-has-changed-why-havent-database-designs/ In-depth coverage of high-end computing at large enterprises, supercomputing centers, hyperscale data centers, and public clouds. Sat, 06 Aug 2022 04:10:55 +0000

By: Mark Funk https://www.nextplatform.com/2021/03/25/the-world-has-changed-why-havent-database-designs/#comment-195369 Sat, 06 Aug 2022 04:10:55 +0000 http://www.nextplatform.com/?p=138134#comment-195369 Yes, the ever-changing relative performance has a lot to do with database design. But I wonder whether the better way to attack this issue would have been to start off with the concept of a Transaction and all that that implies. And under that is the notion of ACID … Atomicity, Consistency, Isolation, and Durability. This further implies support of massive concurrency and durability over all sorts of failures. Various databases play around with each of these, but whether folks think about it or not, they are all expecting that these attributes are met. We all individually believe that we are the one and only user of a database and that whatever we do with the database, when we are done, everything that we did will remain indefinitely. We believe that, but the reality is very different from that, and all sorts of DB architecture exists to hide all that reality. We don’t want to know that there could be thousands of others just like us concurrently mucking with the same data. Sure, we can speed up where and how the data becomes persistent, but a heck of a lot of that architecture comes from the basic fact that our processors are tied to a memory that has no real persistence and the persistence is available outside of that processor complex, out in IO space. The latency involved in that distinction has a way of limiting a DBMS’ throughput and from there the response time at high throughput. If what we are after is a major change in DBMS architecture, it would seem to require starting with a way that persistence becomes very close to the processor, if not directly accessible to and addressable by the processors.
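
To make the Atomicity part of ACID concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `accounts` table, the `transfer` helper and the account names are hypothetical, chosen only for illustration; the point is that a failed transaction leaves no partial update behind.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one all-or-nothing transaction."""
    # `with conn:` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        row = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if row is None or row[0] < 0:
            # Raising here undoes the debit above.
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )
    conn.commit()

    transfer(conn, "alice", "bob", 30)       # succeeds and is committed
    try:
        transfer(conn, "alice", "bob", 500)  # fails and is rolled back
    except ValueError:
        pass
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

Calling `demo()` shows the balances after one successful and one failed transfer; the failed one leaves both rows exactly as the first transfer left them.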

As a related thought, all that a database really is is persistent data, no matter the format, which follows the normal rules of Transactions. As you likely know, that data need not be rows and columns. It could be exactly as complex as any data structure that any program can create. The only difference is that whatever addressing it uses must also be capable of being understood within the persistent storage. It’s still got to make sense no matter the number of failures or power cycling. (Addressing cannot be the usual process-local addressing that all our programs tend to use, but there are all sorts of alternatives.) That is part of the database design as well, and there are a lot of different ways that that data could be represented. So, yes, it strikes me that database designs have been changing, merely because there are so many ways to represent persistent data, data which is accessed via Transactions, all the while maintaining ACID. Still, hardware architecture continues to change as well, and database design will be following that.
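
One of the alternatives to process-local addressing is to link records by their offset inside the stored image rather than by memory pointer. The sketch below is only an illustration under assumed conventions: a 16-byte node layout of (value, offset of next node), with -1 as the end marker. It is not how any particular DBMS lays out its data.

```python
import struct

# Hypothetical on-disk node layout: two little-endian signed 64-bit integers,
# (value, offset of the next node within the image). -1 marks the end.
NODE = struct.Struct("<qq")
NIL = -1


def append_node(buf, value, next_off):
    """Append one node to the image and return the offset it was written at."""
    off = len(buf)
    buf += NODE.pack(value, next_off)  # in-place extend of the bytearray
    return off


def build_list(values):
    """Serialize `values` as a linked list addressed purely by offsets."""
    buf = bytearray()
    head = NIL
    for v in reversed(values):
        head = append_node(buf, v, head)
    return bytes(buf), head


def walk(image, head):
    """Follow the offsets; this works in any process that loads the image."""
    out = []
    off = head
    while off != NIL:
        value, off = NODE.unpack_from(image, off)
        out.append(value)
    return out
```

Because the links are offsets into the image, the same bytes can be written to a file, survive a power cycle, and be walked again by a different process at a different base address.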

By: Monalisa Nyasha Billiat https://www.nextplatform.com/2021/03/25/the-world-has-changed-why-havent-database-designs/#comment-165589 Wed, 25 Aug 2021 14:01:30 +0000 http://www.nextplatform.com/?p=138134#comment-165589 In reply to Peter Fox.

Hi, but do you think databases will still be useful until 2050?

By: DBA https://www.nextplatform.com/2021/03/25/the-world-has-changed-why-havent-database-designs/#comment-161504 Wed, 31 Mar 2021 21:00:05 +0000 http://www.nextplatform.com/?p=138134#comment-161504 In reply to Hugo Kornelis.

@hugo;

There’s so much confusion on display in the original article that it makes it difficult to respond coherently. You touched on a lot of the points I think are important, so thanks for taking the time and effort.

By: Nathan https://www.nextplatform.com/2021/03/25/the-world-has-changed-why-havent-database-designs/#comment-161388 Mon, 29 Mar 2021 17:18:50 +0000 http://www.nextplatform.com/?p=138134#comment-161388 Read the article. Read the comments. Re-read the article.

The core assertion here seems to be that ACID is no longer relevant to databases, or I missed something. You seem to be lobbying for more of a database that functions as an air traffic controller allowing data to fly in and out with most critical processes being carried out at destination points rather than the origination point by arguing that it is so much cheaper now to have tons of distributed computing (which I think is also the reason you asserted Hadoop was scrapped?).

If you scale back your argument a bit and decide to target only data consuming applications like a BI platform where data changes on a predictable time schedule (daily/hourly/etc) I could support offloading some of the calculation work onto a client machine, but that might also bring up some data security issues.

A lot of these relational databases are also moving to web platforms where you gain access through a web browser and I don’t believe that those have anywhere near the technical focus on optimizing my database level transactions that the SQL/Oracle teams do.

By: Stephen Channell https://www.nextplatform.com/2021/03/25/the-world-has-changed-why-havent-database-designs/#comment-161386 Mon, 29 Mar 2021 16:19:27 +0000 http://www.nextplatform.com/?p=138134#comment-161386 There is a kernel of truth in the article: the Relational Model has outlived many technologies that were predicted to replace it, most recently Hadoop, which is indeed a legacy technology. Hadoop still survives in the form of HDFS, the Hadoop Distributed File System that underpins many of the technologies that replaced Map/Reduce. HDFS is also legacy (it always was – GFS was always at the OS level and distributed file systems have been standard in HPC for decades). HDFS is difficult to replace because persistent storage is difficult to replace.

The relational model has evolved to include in-memory databases, column-store structures and typed blob storage (Objects, then XML, then JSON) and even Graph and block-chain. In most cases the distinction between “NoSQL” databases and RDBMS is a cost equation rather than functionality.

Globally distributed databases with eventual consistency are the current area of change, which is already seeing overlap with block-chain concepts such as the non-updatable ledger as trusted block-chain technology evolves.

The big renaissance of relational technology will come with GPGPU-optimised table/index scans that will render the current generation of NoSQL databases obsolete. Expect to see more query-engine functions move to block storage devices, when ICL’s 1970s idea of the Content Addressable File Store (CAFS) finally comes of age.

By: andrew https://www.nextplatform.com/2021/03/25/the-world-has-changed-why-havent-database-designs/#comment-161375 Mon, 29 Mar 2021 13:04:08 +0000 http://www.nextplatform.com/?p=138134#comment-161375 @hugo kornelis.
Having read your comment, I was inspired to re-read the article, looking for accuracies:

Close to the end: “Amazon created a whole new product – the Aurora database – by rethinking the core assumptions behind RDBMS storage abstraction.”

Looking at the product page, it seems Aurora is an RDBMS that is Postgres/MySQL compatible.
…and so whilst that bit of the article is also inaccurate, I learnt something and will give Aurora a closer look.

By: Hugo Kornelis https://www.nextplatform.com/2021/03/25/the-world-has-changed-why-havent-database-designs/#comment-161364 Mon, 29 Mar 2021 10:11:59 +0000 http://www.nextplatform.com/?p=138134#comment-161364 *Sigh* Where do I even begin?

I was tempted to stop reading after the first few sentences already. Did you even bother to check your story about QWERTY? The main reason for its design is in fact that it makes typing faster, because alternating hands between keys that are commonly used together is faster. There were some modifications to ensure letters often used together were not adjacent; and those were indeed made to reduce jamming. But the overall design was for typing speed.

So if you didn’t bother to check this fact, what about the rest?
Well, it didn’t take long for me to reach the next verifiably false claim. “The relational database predates the Internet”. Actually, the Internet started as ARPANET, in the late 1960s. The first relational databases were from the early 1970s. All facts that are incredibly easy to find, if you bother to check your facts.

It only gets worse when you start talking about databases. It seems to me that you decided to rant on a topic you are not really familiar with. So let’s check some of your claims, right?

“For instance, they avoid caching on the disk layer and employ ACID semantics, writing to disk immediately and holding other requests until the current request has finished”
No. Various databases use various techniques to prevent data loss. But writing to disk immediately and holding other requests until the current request has finished is not one of them. Even relatively low end database systems such as MySQL or Access support concurrency.

“The underlying assumption is that with these precautions in place, if problems crop up, the administrator can always take the disk to forensics and recover the missing data.”
BWAHAHAHA!!!
Ever heard of a thing called “backups”? Most database administrators use them, you know.
If problems crop up, we simply restore from backup. Many of the high end database products have mechanisms that effectively boil down to continuous backing up, meaning we can restore to the millisecond before you dropped your coffee on the SSDs in the SAN.
And if we know that an outage is expensive, we prepare by setting up a standby failover. Again, techniques are different between vendors and evolve over time, but functionally you could describe it as a spare system that is constantly in the state that a new system would be in after restoring the backup.
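
The "continuous backing up" idea can be sketched as a base backup plus an ordered change log that is replayed up to a chosen moment. This is a toy model under stated assumptions (a dictionary as the database, integer timestamps, a hypothetical `restore_to` helper), not the mechanism of any specific product.

```python
def restore_to(snapshot, log, until):
    """Rebuild the state as of time `until` from a base backup plus a change log.

    `log` is assumed to be sorted by timestamp. Each entry is
    (timestamp, key, value); a value of None means the key was deleted.
    """
    state = dict(snapshot)   # work on a copy, never on the backup itself
    for ts, key, value in log:
        if ts > until:       # stop just before the change we want to undo
            break
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


# Hypothetical data: a nightly backup and the changes recorded since then.
snapshot = {"orders": 10}
log = [
    (100, "orders", 11),
    (105, "customers", 3),
    (110, "orders", None),   # the accidental delete
]
```

Restoring with `until=109` replays everything up to, but not including, the accidental delete at timestamp 110. A standby failover can be thought of as a second system that keeps applying the same log as it arrives.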

“Yet RDBMS persists in putting redundancy on top of redundancy. Business and technical requirements often demand this capability even though it’s no longer needed”
Perhaps true for some. Those businesses can use other solutions (though a relational database might still bring other advantages).
Make sure to have them sign at the dotted line that they can’t sue you if your solution fails and they lose data or availability.
Most of MY customers do tend to care about not losing their business critical data. And many companies in the world lose millions if their systems are down for just an hour or so.

Your next two paragraphs are harder to debunk. Not because they are true, but because they are mere assertions without any evidence or argumentation.
In both cases, your key point appears to be that it would be better to decentralize processing (true in some cases, false in others), and then you assert that “it’s not how databases work” and “it violates the design of RDBMS.”

And that brings me to the conclusion. The overarching arguments of your post appear to be: (1) relational databases are an old technology, there’s new stuff available, so why still use the old; and (2) there are many advancements in hardware and relational databases are still using old assumptions about hardware.

As to the first argument: Kitchen knives are old technology too. Much older than relational databases. And yet we still use them. Why? Because they are still the best tool for the job of cutting foods. There have been innovations in the past decades and they have found their own use for some specific cases, but as an overall tool for generic “we need to cut food”, a knife is still the best solution. Perhaps one day a new tool will be developed that’s better than knives. But until then, we’ll always have them in our kitchen.
Same with relational databases. There are more and more innovative non-relational solutions that are a much better fit for some specific scenarios, and I would always recommend using those specific solutions for those cases. But as a general-purpose, large-scale data storage and retrieval tool, relational databases are still the best idea available so far.

And the second argument is simply bollocks. You seem to think that because the basic idea of relational databases stems from the 1970s, the technology has not changed. Well, I’ve got news for you. Major relational database users are not running their database on Oracle v4 or SQL Server 7.0. They are on Oracle 18c or higher, SQL Server 2017 or higher. And those new versions of relational databases have all been changed to make use of hardware advancements.

Your rant is poorly researched. Your knowledge of how modern databases work seems to be limited or outdated. You also seem to have little grasp on the difference between the (indeed mostly unchanged, like my kitchen knife) basic principles of the relational model on the one hand, and the actual implementations in RDBMS’s on the other hand.
It’s as if you’re criticizing modern car manufacturers and your main argument is that the original Ford Model T was designed for a world with limited paved roads and mostly horseback traffic.

Perhaps there were also some valuable observations in your article. If so, then they got lost between all the fallacies.

By: Steven Posick https://www.nextplatform.com/2021/03/25/the-world-has-changed-why-havent-database-designs/#comment-161358 Mon, 29 Mar 2021 01:43:09 +0000 http://www.nextplatform.com/?p=138134#comment-161358 There is the tremendously large assumption that RDBMS designs have not changed in 50 years, which is absolutely incorrect. Oracle Grid, RAM persistence, removal of caching, using NVMe have already been implemented. The internal structures have changed, how they are accessed, etc…

The assumptions may be true for Open Source databases that haven’t adapted, but for RDBMS technology in general these assumptions are false. Just look up Exadata or the latest Oracle database features and implementations. Another hole in the assumptions made is that RDBMS databases have always been defined to utilize the maximum amount of memory, either for caching or execution. In the end all that truly matters is whether the database focuses on Consistency or performance (eventual Consistency). The fundamental problems that drove ACID did not change with the advances in technology.

By: Tez https://www.nextplatform.com/2021/03/25/the-world-has-changed-why-havent-database-designs/#comment-161353 Sun, 28 Mar 2021 20:06:38 +0000 http://www.nextplatform.com/?p=138134#comment-161353 Try using a NoSQL solution in production and you will run into tons of problems that have long since been solved in an SQL db. I have used Neo4j, MongoDB and InterSystems Caché in production, and they all threw up horrible problems in deployment, replication, consistency and scalability.

Postgres or MySQL are so much easier to rely on. Problems rarely occur and have well-known solutions, whereas NoSQL often has weird problems that you could not have imagined would exist. I see them as useful for exploring complex data in unique ways, but to run any service you pick SQL every time or you will experience pain after a year or two.

By: Guy Rouillier https://www.nextplatform.com/2021/03/25/the-world-has-changed-why-havent-database-designs/#comment-161306 Sat, 27 Mar 2021 07:38:45 +0000 http://www.nextplatform.com/?p=138134#comment-161306 A centralized RDBMS has decades of development and theoretical underpinnings to it, and countless people-hours of experience with it. In short, it is a proven, stable solution. The author’s point is valid that since the theoretical model for the RDBMS was defined, much has changed. However, these newer options do not all have the proven reliability and transactional consistency that RDBMSs have. That’s important to many organizations that have their own and their customers’ money on the line, with legal and financial ramifications.
