Mnemonic: Memory Management for Big Data

Hardware + Software Improvements

Yanping Wang and Gang “Gary” Wang worked together on how to improve memory management for Big Data, ultimately writing 6,000 lines of code.

The software gives Java programmers a way to define objects and structures to be managed by Mnemonic. In testing their work on Spark, they were able to cut pause time in half and achieve 2x to 3x gains in speed, she said. Companies such as Cisco and Cloudera have joined the project at the Apache Software Foundation (ASF), she said, while Intel’s artificial intelligence and client computing groups are furthering the work.

Yanping Wang, meanwhile, has left Intel to focus the work on financial trading use cases, where no pauses are acceptable.

Storage continues to evolve, with NAND and 3D XPoint filling the speed and capacity gap between DRAM and hard disk drives, she said.

“We have a lot of non-volatile memory these days. In the old days, there was memory, disk and cache, which left speed gaps. Then software developed a lot of algorithms for caching. If they didn’t do the caching, some of the data they wanted to use was swapped back to disk. Then it would take an extremely long time to access. Now hardware developers are starting to put fast memory to fill in the gaps between cache and disk, so a large amount of memory is available. That chunk of memory is not being used very effectively so far on Spark, Hadoop and others on the software side,” she said.

In-Place Computing

Mnemonic is a project to extend in-memory computing to in-place computing using next-generation NVM storage media.

For Big Data computation, it keeps massive data objects and their schema on storage, with no need to serialize or deserialize the objects.
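To make the idea of skipping serialization more concrete, here is a minimal sketch that uses only the standard Java NIO memory-mapped file API. It is not Mnemonic's API, and the class name `InPlaceRecords` and its fixed one-long-per-record layout are assumptions made purely for illustration. The point it shows is that data living in a mapped region can be updated where it sits, without first being decoded into heap objects and then encoded again.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Apache Mnemonic.
class InPlaceRecords {
    // Assumed layout: every record is a single 8-byte counter.
    static final int RECORD_BYTES = Long.BYTES;

    // Create a zero-filled backing file for the given number of records and map it.
    static MappedByteBuffer create(Path file, int records) throws IOException {
        Files.write(file, new byte[records * RECORD_BYTES]);
        return open(file);
    }

    // Map an existing file read-write; the mapping stays valid after the channel closes.
    static MappedByteBuffer open(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, ch.size());
        }
    }

    // Update one record where it lives: no object is built, nothing is re-encoded.
    static void increment(MappedByteBuffer buf, int index) {
        int offset = index * RECORD_BYTES;
        buf.putLong(offset, buf.getLong(offset) + 1);
    }

    static long read(MappedByteBuffer buf, int index) {
        return buf.getLong(index * RECORD_BYTES);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("records", ".bin");
        file.toFile().deleteOnExit();

        MappedByteBuffer buf = create(file, 4);
        for (int i = 0; i < 3; i++) {
            increment(buf, 2);
        }
        buf.force(); // flush the mapped pages to the backing file

        // A fresh mapping of the same file sees the updated value directly.
        MappedByteBuffer reopened = open(file);
        System.out.println("record 2 = " + read(reopened, 2));
    }
}
```

A memory-mapped file on ordinary storage is only a stand-in here. The article describes Mnemonic as targeting next-generation NVM, where the same "update it where it is" pattern would apply to durable objects rather than raw counters.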

“In-place computing means doing calculations where the data is,” she explained. “You do not need to move the data. We work with CPU — it could be GPU, on a card — memory can have a processing unit pretty much anywhere. Take the data, it can do sorting, do partition, when some critical work is being done, processing is done in the database itself, very close to the processing unit. In the old days, the computer could not do that because it was too expensive and the memory was so far away from the cache. This is based on hardware innovation as well.”

Actually, Mnemonic is a project somewhere between software and hardware, she said.

“The optimization is to ensure the best usage of hardware while the software has flexibility and a fast usage model. The software can run seamlessly, but underneath … we make sure hardware works the best, but also makes software happy,” she said.

Mnemonic provides a mechanism to communicate with native code directly through in-place object data update to avoid complex object data-type conversion and stack marshaling.

Using Apache Mnemonic, objects also can be directly accessed by other computing languages such as C/C++. The durable object model and durable computing model might also lead to new cache-less and SerDe-less (serializer and deserializer-less) architecture for high-performance applications and frameworks.
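One way to picture why a shared in-place layout removes the need for type conversion and marshaling is the sketch below. It uses only the standard `java.nio.ByteBuffer` API, not Mnemonic itself. The class name `NativeLayoutPoint`, the C struct it mirrors, and the field offsets are all assumptions for illustration, based on a typical 64-bit ABI. No native code is actually called. The sketch only shows data being kept off-heap, in native byte order, at offsets that a C/C++ routine could read directly.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of Apache Mnemonic.
class NativeLayoutPoint {
    // Offsets mirror an assumed C struct on a typical 64-bit ABI:
    //   struct point { int32_t id; /* 4 bytes padding */ double x; double y; };
    static final int ID = 0;
    static final int X = 8;
    static final int Y = 16;
    static final int SIZE = 24;

    // Off-heap buffer in the platform's byte order, as native code would expect.
    static ByteBuffer allocate() {
        return ByteBuffer.allocateDirect(SIZE).order(ByteOrder.nativeOrder());
    }

    static void set(ByteBuffer p, int id, double x, double y) {
        p.putInt(ID, id);
        p.putDouble(X, x);
        p.putDouble(Y, y);
    }

    // In-place update: only the touched fields change, nothing is copied or converted.
    static void moveBy(ByteBuffer p, double dx, double dy) {
        p.putDouble(X, p.getDouble(X) + dx);
        p.putDouble(Y, p.getDouble(Y) + dy);
    }

    public static void main(String[] args) {
        ByteBuffer p = allocate();
        set(p, 7, 1.0, 2.0);
        moveBy(p, 0.5, -1.0);
        System.out.println(p.getInt(ID) + " " + p.getDouble(X) + " " + p.getDouble(Y));
    }
}
```

Because both sides would agree on the byte layout, a native routine handed the buffer's address could work on the same bytes. That shared layout is the general property the durable object model described above relies on, though Mnemonic's own mechanism may differ in detail.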

The project is continuing to develop the library, she said, and is working on a pure Java memory service, durable object vectorization and durable query service features.

“Apache Mnemonic provides a unified interface for memory management,” said Yanhui Zhao, Apache Mnemonic committer. “It is playing a significant role in reshaping the memory management in current computer architecture along with the developments of large capacity NVMs, making a smooth transition from present mechanical-based storage to flash-based storage with the minimum cost.”

Feature Image: “The Flame of Memory” by Maxim Peremojnii, licensed under CC BY-SA 2.0.

Source: InApps.net

Phu Nguyen