Apache Cassandra
The Apache Cassandra project has released version 4.0 of Apache Cassandra, an open-source, high-performance, distributed big data management platform. Version 4.0 offers significantly improved performance and manageability.
“Cassandra 4.0 has been a long-awaited release,” said Nate McCall, Vice President of Apache Cassandra and Software Engineer at Apple. “The new version is faster and more scalable than the previous one, and it’s ready for production with unprecedented scale in cloud computing.”
Cassandra can manage unstructured data at millions of writes per second. Version 4.0 is the culmination of three years of work and includes more than 1,000 bug fixes, enhancements, and new features.
Key features
Increased speed, scalability, and flexibility
Data streams up to 5x faster during scaling operations, and read and write throughput is up to 25% higher. Together these make for a more flexible architecture, particularly in cloud and Kubernetes environments.
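To put the streaming figure in perspective, here is a rough back-of-the-envelope sketch. The node size and baseline throughput are illustrative assumptions, not numbers from the release, and the 5x factor is the quoted upper bound rather than a guaranteed gain.

```python
def streaming_hours(data_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to stream data_gb at a sustained throughput."""
    seconds = (data_gb * 1024) / throughput_mb_per_s
    return seconds / 3600


# Hypothetical figures chosen only for illustration.
NODE_DATA_GB = 1000.0      # data to move when a node joins or leaves
BASELINE_MB_PER_S = 50.0   # assumed pre-4.0 streaming throughput
SPEEDUP = 5.0              # the "up to 5x" upper bound quoted for 4.0

before = streaming_hours(NODE_DATA_GB, BASELINE_MB_PER_S)
after = streaming_hours(NODE_DATA_GB, BASELINE_MB_PER_S * SPEEDUP)
print(f"before: {before:.1f} h, after: {after:.1f} h")
```

Under these assumptions, a scaling operation that would take most of a working day finishes in a little over an hour, which matters most in environments where nodes are added and removed often.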
Increased consistency
Data replicas are kept in sync to ensure faster, more efficient operation.
Audit logging
Audit logging records users’ access and activity with minimal impact on workload performance, improving security and observability. The new capture and replay function lets operators analyze production workloads, which helps ensure compliance with regulations such as SOX and PCI.
New configuration settings
Operators have easy access to exposed system metrics and configuration settings, which they can use to optimize their deployments.
Minimized latency
Garbage collector pause times are reduced to just a few milliseconds, with minimal latency impact even as heap sizes increase.
Higher compression
Improved compression efficiency reduces the amount of data that must be stored on disk and improves read performance.
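The compression trade-off can be illustrated with a minimal sketch. It uses Python's standard `zlib` module purely as a stand-in (Cassandra has its own compressors, such as LZ4 and Zstd), and the sample rows are made up for the example.

```python
import zlib

# Hypothetical, highly repetitive rows standing in for table data.
rows = "".join(
    f"sensor-{i % 10},2021-07-27T00:00:{i % 60:02d}Z,temperature,21.5\n"
    for i in range(5000)
)
raw = rows.encode("utf-8")

# zlib is only a stand-in here; it is not what Cassandra uses.
fast = zlib.compress(raw, level=1)   # cheaper to compute
small = zlib.compress(raw, level=9)  # spends more CPU for a tighter result

print(f"raw:     {len(raw)} bytes")
print(f"level 1: {len(fast)} bytes")
print(f"level 9: {len(small)} bytes")

# Compression is lossless: the original bytes come back unchanged.
assert zlib.decompress(small) == raw
```

Fewer bytes on disk also means fewer bytes to read back, which is one reason better compression can help read performance as well as storage cost.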
Cassandra 4.0 has been tested and hardened by Apple, Amazon, and Netflix.
During the testing and QA phase, the community developed repeatable workloads that were as close to real-world use as possible. They also verified the state of the cluster against a model without pausing the workload itself.
Cassandra’s deployment examples
As a NoSQL database, Apache Cassandra can handle large amounts of data for load-intensive applications while maintaining high availability and avoiding single points of failure. Apple has 160,000+ instances storing more than 100 petabytes of data across 1,000+ clusters; Huawei has 33,000+ instances across 300+ clusters; and Netflix has 6,050+ petabytes across 100+ clusters, with over 1,000,000 requests per day. These are just a few of Cassandra’s largest production deployments.
Cassandra started as a Facebook project in 2008, moved to the Apache Incubator in January 2009, and was promoted to an Apache Top-Level Project in February 2010. Its users include Activision, Apple, Backblaze, BazaarVoice, Best Buy, Comcast, DoorDash, Monzo, and Outbrain.
“Netflix heavily uses Apache Cassandra to fulfill its ever-growing persistence requirements on its mission of entertaining the world,” said Vinay Chella, Netflix Engineering Manager and Apache Cassandra Committer. “We have been testing and partly using the 4.0 beta in some of our environments. We can reduce infrastructure costs by using Apache Cassandra 4.0 for its improved performance. 4.0’s stability, correctness, and speed allow us to concentrate on building higher-level abstractions over data store compositions. This results in improved developer velocity and optimized data storage access patterns. Apache Cassandra 4.0 runs faster, is more secure, and is enterprise-ready. I recommend it to all of my clients.”
Conclusion
Cassandra 4.0 is a major overhaul that should enable many exciting innovations built on the technology. If you liked this article, you can also check out our article on what IoT is, a field that stands to benefit greatly from Cassandra’s new release.